Those like me with a love of the byways of the English language, or with a love of cryptic crosswords, will no doubt have collected over the years a private trove of single word anagrams. One of my favourites (because both words are common) is
Clearly smaller words are likely to have more anagrams than longer ones:
OPST, POTS, POST, STOP, SPOT, TOPS
From the Anagrams FAQ comes this lovely example, of almost familiar words (at least, they are not scientific or technical terms):
INTERROGATIVES, REINVESTIGATOR, TERGIVERSATION
From an anagrams programming page:
ANGOR, ARGON, GORAN, GRANO, GROAN, NAGOR, ORANG, ORGAN, ROGAN
several of which are unfamiliar to me as English words. I’m not allowing proper names.
How can we measure the “anagrammability” of a set of letters? Some account should be given to the number of anagrams, and to the length of the word. A long word with only two anagrams should get a higher score than a small word with many. Let be the number of letters, and the number of (single word) English anagrams. My anagrammability measure (discussed with my eldest son) is:
This gives a score of zero if there are no anagrams other than the word itself, and is weighted for large words.
With the examples above:
According to the Anagrams FAQ again, the set of letters with the most one word anagrams is AEINRST with:
ANESTRI, ASTERIN, ERANIST, NASTIER, RATINES, RESIANT, RESTAIN, RETAINS, RETINAS, RETSINA, SAINTER, STAINER, STARNIE, STEARIN
for a score of 637. (I’ve never seen some of these words before, but apparently they can all be found in Merriam-Webster’s unabridged dictionary).
It’s not hard to experiment with anagrams in Sage; all you need is a wordlist, which you can read in, turn into a list and strip the trailing non-printing characters. You can download a list of 109582 words here, then save it as, say,
f = open("words.txt","r") nwords = f.readlines() words =  for i in nwords: words.append(i.strip('\r\n'))
Then we can write a simple program which given any string, checks if each permutation is in the list we’ve just created:
def find_anagrams(myword): anagrams= for i in permutations(myword): anagrams.append(join(i,"")) for i in anagrams: if i in words: print i
sage: find_anagrams('opst') opts post pots spot stop tops sage: find_anagrams('aeirnst') retains retinas retsina nastier stainer stearin sage: find_anagrams('organ') orang organ groan argon
Clearly this wordlist doesn’t include all the words listed earlier. Clearly also this method is hideously inefficient – it creates all the possible anagrams first, and then checks if each one is in the wordlist. A much better method, for words of say, 8 letters or longer, would be to first set up the wordlist into sublists of words with the same number of letters. Then for your entered word, you simply check if each word in your list is an anagram of your entered word.
And here’s how to do that. First we discover that the longest word in our list has 28 letters:
words = [ for i in range(28)] for i in nwords: li=len(i) words[li-3].append(i.strip('\r\n'))
and a little procedure to test anagrams:
def isanagram(w1,w2): l1=list(w1) l2=list(w2) l1.sort() l2.sort() return l1==l2
and finally the program to find anagrams in our New Improved List:
def find2_anagrams(myword): lm=len(myword); for i in words[lm-1]: if isanagram(i,myword): print i
A quick test of it:
sage: find2_anagrams('retains') nastier retains retinas retsina stainer stearin
This is much faster than the first method.
Running this over the entire word list, the highest scoring group of letters I found was AEIGLNRT, with
ALERTING, ALTERING, INTEGRAL, RELATING, TANGLIER, TRIANGLE
for a score of 320.
Filed under: Sage