– Ben Martens

DAWG Part 2

This post assumes that you’ve read at least the radix tree portion of the post from two days ago. Given that tree built from my dictionary of 100,000 words, what word would require the most nodes to be touched? One key is that the children of every node are sorted alphabetically.

So thinking logically about this, it’s probably going to be a long word. It’s probably going to contain a good number of letters that are towards the end of the alphabet. “aardvark”, for example, is not going to work because right away we find the string “aa” after looking at only two nodes. “zoo”, on the other hand, takes 26 nodes just to find the z.

I solved this by pumping all 100,000 words through the radix tree and counting how many nodes I touched for each word. The winner was “superstructures”. Here are the nodes I touched at each level:

a b c d e f g h i j k l m n o p q r s
a c e f g h i k l m n o p q s t u
a b c d e f g i k l m n p
a b c e f g h i l m n o p r s
a c e i m o p t
a i o r
a e o u
a e
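
For the curious, the counting can be sketched in Python roughly like this. It is a simplification and not the original code: it assumes a plain one-letter-per-node trie rather than a real radix tree, so the exact counts will differ from the ones above, and the names (`Node`, `build_trie`, `nodes_touched`, `most_expensive_word`) are made up for illustration.

```python
class Node:
    """One trie node (hypothetical structure, one letter per edge)."""

    def __init__(self):
        self.children = {}    # letter -> Node
        self.is_word = False  # True if a word ends at this node


def build_trie(words):
    """Insert every word into a fresh trie and return the root."""
    root = Node()
    for word in words:
        node = root
        for letter in word:
            node = node.children.setdefault(letter, Node())
        node.is_word = True
    return root


def nodes_touched(root, word):
    """Count the nodes examined while looking up `word`.

    At each level the children are scanned in alphabetical order,
    and every child looked at (including the match) counts as touched.
    """
    touched = 0
    node = root
    for letter in word:
        for candidate in sorted(node.children):
            touched += 1
            if candidate == letter:
                break
        else:
            # Scanned every child without finding the letter.
            raise KeyError(word)
        node = node.children[letter]
    return touched


def most_expensive_word(words):
    """Return the word that touches the most nodes during lookup."""
    root = build_trie(words)
    return max(words, key=lambda w: nodes_touched(root, w))
```

With a toy list like `["aa", "ab", "bat", "cat", "zip", "zoo"]`, `most_expensive_word` picks "zoo", which touches 7 nodes: 4 to reach the z, 2 to reach the first o, and 1 for the last o.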

Now you have learned your random piece of trivia for the day. I can’t imagine a situation where this would be useful knowledge. Also, the answer would vary depending on the dictionary you use.

PS. I haven’t had a ton to write about so you get stuck with whatever thoughts fly through my brain. Scary, eh? I wonder what normal people think about.