Patrick J. Burns explains how to work through the many texts hosted by the site "The Latin Library" and, with a few lines of code, extract from them, for example, the most frequent palindromic words:
A playful diversion for the morning: What is the longest palindrome in the Latin language? And secondarily, what are the most common? (Before we even check, it won’t be too much of a surprise that non takes the top spot. It is the only palindrome in the Top 10 Most Frequent Latin Words list.)
As with other experiments in this series, we will use the Latin Library as a corpus and let it be our lexical playground. In this post, I will post some comments about method and report results. The code itself, using the CLTK and the CLTK Latin Library corpus with Python3, is available in this notebook.
As far as method, this experiment is fairly straightforward. First, we import the Latin Library, preprocess it in the usual ways, tokenize the text, and remove tokens of less than 3 letters. Now that we have a list of tokens, we can look for palindromes […]
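The steps described in the excerpt can be sketched roughly as follows. This is only an illustration, not the code from Burns's notebook: it uses plain Python rather than the CLTK, the sample sentence stands in for the Latin Library corpus, and the helper names `tokenize` and `find_palindromes` are hypothetical. The preprocessing (lowercasing and keeping only runs of the letters a to z) is an assumed simplification of "the usual ways".

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of the letters a-z."""
    return re.findall(r"[a-z]+", text.lower())


def find_palindromes(tokens, min_length=3):
    """Count tokens of at least min_length letters that equal their own reverse."""
    return Counter(
        token
        for token in tokens
        if len(token) >= min_length and token == token[::-1]
    )


# Illustrative sample only; the original experiment runs over the whole Latin Library.
sample = "Non omnis moriar. Esse quam videri. Non est ibi, sed ecce homo."

palindromes = find_palindromes(tokenize(sample))
most_common = palindromes.most_common()  # (word, count) pairs, most frequent first
longest = max(palindromes, key=len)      # one of the longest palindromes found

print(most_common)
print(longest)
```

Counting with a `Counter` answers both questions from the excerpt at once: `most_common()` ranks palindromes by frequency, and taking the maximum by length gives a longest one.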
Read Patrick J. Burns's article: https://disiectamembra.wordpress.com/2017/03/26/finding-palindromes-in-the-latin-library/