Like a lot of people, I've made Wordle part of my daily routine. So, of course, I became quite interested when I started seeing a few articles online claiming to have scientifically calculated the best starting word. For me, this raised interesting questions--for example, how should we define "best"?
So I thought I'd try to get my hands on the data and play around with it myself. There are a lot of instructions that show you how to save a local copy of Wordle--unfortunately, none of these worked for me. But, with a little digging using Safari's Inspect Element command, I was able to find two lists of words.
Be aware, the first list is the ordered list of words that Wordle uses to produce its daily puzzles. So, you probably don't want to look at it too hard--unless you want to spoil the surprise for yourself. The second list appears to be an alphabetical list of all the other words that Wordle considers valid 5-letter words. One thing that wasn't immediately clear to me was that none of the words in the first list appear in the second list.
So I had to combine the two. I also sorted the combined list alphabetically, so I could no longer tell which list each word came from. This gave me 12,972 words to play with.
Obviously, letter frequency is important, so I started with a simple, naive approach. First, go through and count the number of times each letter appears in the set of words. Then, for each word, add up the counts for each of its letters. This lets me sort the words based on how common their letters are.
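As a rough illustration of that naive approach, here is a minimal Python sketch. It is not the code behind the numbers in this post: the function name `naive_scores` and the five-word `sample` list are stand-ins made up for the example, and a real run would use the full 12,972-word list.

```python
from collections import Counter

def naive_scores(words):
    """Score each word by summing the overall counts of its letters."""
    # Count every occurrence of every letter across the whole list.
    letter_counts = Counter("".join(words))
    # A word's score is the sum of the counts of its five letters,
    # so a repeated letter gets full credit every time it appears.
    return {word: sum(letter_counts[ch] for ch in word) for word in words}

# Tiny stand-in list (an assumption for illustration only).
sample = ["esses", "arose", "fuzzy", "raise", "stare"]
scores = naive_scores(sample)
ranked = sorted(sample, key=scores.get, reverse=True)
print(ranked[0], ranked[-1])
```

Even on this toy list, the word with three Ses floats to the top, because every repeated S is counted in full.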
This produced "esses" as the best word, and "fuzzy" as the worst. Obviously these aren't great results. "Esses" makes sense because S and E are the most common letters in our list of words--but I would never recommend it as an initial guess.
The problem is pretty easy to spot. S is the most common letter, and "esses" has three of them. Currently, we're getting full credit for the duplicate Ses--despite the fact that very few words actually have three.
OK, round two. Let's rework the algorithm to account for duplicates. This time, the first S will match any word with an S. The second S will only match words with two Ses. And so forth. As expected, this produced better results. "Aeros," "arose," and "soare" are tied for first place, with "aesir," "arise," "raise," "reais," and "serai" tied for second.
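One way to read that rule is sketched below, assuming the k-th copy of a letter in a guess earns credit only for words that contain that letter at least k times. The name `duplicate_aware_scores` and the small `sample` list are made up for the example, and the original code may handle duplicates differently.

```python
from collections import Counter

def duplicate_aware_scores(words):
    """Score words so the k-th copy of a letter only matches words
    that contain at least k copies of that letter."""
    # at_least[(ch, k)] = number of words with at least k copies of ch
    at_least = Counter()
    for word in words:
        for ch, n in Counter(word).items():
            for k in range(1, n + 1):
                at_least[(ch, k)] += 1

    scores = {}
    for word in words:
        seen = Counter()  # copies of each letter used so far in this word
        total = 0
        for ch in word:
            seen[ch] += 1
            total += at_least[(ch, seen[ch])]
        scores[word] = total
    return scores

# Tiny stand-in list (an assumption for illustration only).
sample = ["esses", "arose", "fuzzy", "raise", "stare"]
scores = duplicate_aware_scores(sample)
print(sorted(sample, key=scores.get, reverse=True))
```

On this toy list "esses" drops below "arose," "raise," and "stare," since its second and third Ses now match only the words that really have that many.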
Oddly, the worst word was "fluffy," which felt counterintuitive to me. It seems like "pzazz" or "xylyl" would be worse. But "pzazz" has 7649 matches and "xylyl" has 5725, while "fluffy" only has 5582.
While this is good, there's still room for improvement. The frequency-with-duplicates count doesn't take green hits (having the right letter in the right space) into account. Green hits should be worth more than yellow. Doing some simple back-of-the-napkin guessing, a single green can be 4x as valuable as a single yellow--since for a yellow, there are still 4 places where it could go. Of course, the actual value is probably less--after all, I'd much rather get 4 yellows than a single green.
So, let's count the letter frequency for each of the five positions, and only score each letter based on the frequency in its position. This lets us sort the list based on the expected number of green hits. Unfortunately, in practice, this doesn't appear to be very helpful. "Sores," "sanes," "sales," and similar words make up the top of the list--which kinda drives home the importance of a starting S or ending ~ES, but doesn't really tell us much else.
OK, let's combine the last two methods. This takes into account both green and yellow matches, and gives a green match twice the weight of a yellow match. Not surprisingly, these results look promising. It gives us a lot of new words, with "tares," "lares," "rales," and other words ending in ~ES jumping to the top. Also, "xylyl" finally shows up as the worst word.
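The positional count can be sketched in a few lines of Python. This is only an illustration under the assumption that every word has exactly five letters. `positional_scores` and the `sample` list are hypothetical names and data, not the original code.

```python
from collections import Counter

def positional_scores(words):
    """Score each word by how often its letters occur in the same position."""
    # One counter per position: by_position[i][ch] is the number of words
    # that have letter ch at index i.
    by_position = [Counter(word[i] for word in words) for i in range(5)]
    return {
        word: sum(by_position[i][ch] for i, ch in enumerate(word))
        for word in words
    }

# Tiny stand-in list (an assumption for illustration only).
sample = ["sores", "sales", "arose", "raise", "fuzzy"]
scores = positional_scores(sample)
print(max(sample, key=scores.get))
```

Dividing a score by the number of words gives the expected number of green hits for that guess (counting the guess itself as one possible answer).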
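The exact way the two scores were combined isn't spelled out, so the sketch below makes an assumption: it adds the positional score to the duplicate-aware score. A letter in the right spot is then counted once as a "letter is present" hit and once more as a positional hit, which works out to twice the weight of a plain yellow. The `green_weight` parameter and all function names are hypothetical.

```python
from collections import Counter

def duplicate_aware_scores(words):
    """The k-th copy of a letter only matches words with >= k copies."""
    at_least = Counter()
    for word in words:
        for ch, n in Counter(word).items():
            for k in range(1, n + 1):
                at_least[(ch, k)] += 1
    scores = {}
    for word in words:
        seen, total = Counter(), 0
        for ch in word:
            seen[ch] += 1
            total += at_least[(ch, seen[ch])]
        scores[word] = total
    return scores

def positional_scores(words):
    """Each letter scores the number of words with that letter in that spot."""
    by_position = [Counter(word[i] for word in words) for i in range(5)]
    return {w: sum(by_position[i][ch] for i, ch in enumerate(w)) for w in words}

def combined_scores(words, green_weight=2):
    """Assumed combination: presence score plus extra credit for position."""
    yellow = duplicate_aware_scores(words)
    green = positional_scores(words)
    # A positional hit is already counted once in `yellow`, so adding
    # (green_weight - 1) more brings its total weight to green_weight.
    return {w: yellow[w] + (green_weight - 1) * green[w] for w in words}

# Tiny stand-in list (an assumption for illustration only).
sample = ["sores", "sales", "arose", "raise", "fuzzy"]
combined = combined_scores(sample)
worst = min(sample, key=combined.get)
print(worst, combined[worst])
```

Changing `green_weight` is the knob for the green-versus-yellow trade-off.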
We could continue to tweak the weight between the green and yellow hits--but that feels like we'd just be fine-tuning the list to match our expectations. Instead, let's take a step back and look at the problem again.
When you make a guess, you get a lot of information. Beyond the green and yellow matches, you also learn about all the letters that can't be used, or can't be used in certain positions. So far, all of our algorithms throw away this extra information. However, it's relatively easy to filter the word list based on the greens, yellows, and nopes.
Here's how it works. Take any guess word. Compare it to the first word in the word list (treating that word as the answer), then filter the word list based on the results. Count the number of possible words remaining. Continue to check your guess word against each word in the word list, and you can calculate the average words remaining for that guess word.
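Those steps can be sketched as follows. This is a minimal illustration, not the original implementation, and it rests on one assumption: "filter the word list based on the greens, yellows, and nopes" is implemented by keeping every word that would have produced the same color pattern for the guess, which is equivalent under the standard Wordle rules for repeated letters. The names `feedback` and `average_remaining` are made up here.

```python
from collections import Counter

def feedback(guess, answer):
    """Return a pattern string: 'g' green, 'y' yellow, '-' nope."""
    pattern = ["-"] * 5
    unmatched = Counter()  # answer letters not used up by a green
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "g"
        else:
            unmatched[a] += 1
    # A non-green letter turns yellow only while the answer still has
    # an unused copy of it, which handles repeated letters correctly.
    for i, g in enumerate(guess):
        if pattern[i] == "-" and unmatched[g] > 0:
            pattern[i] = "y"
            unmatched[g] -= 1
    return "".join(pattern)

def average_remaining(guess, words):
    """Average number of words still possible after playing `guess`."""
    total = 0
    for answer in words:
        pattern = feedback(guess, answer)
        # Keep only the words consistent with what this guess revealed.
        total += sum(1 for w in words if feedback(guess, w) == pattern)
    return total / len(words)

# Tiny stand-in list (an assumption for illustration only).
sample = ["arise", "raise", "arose", "tough", "fuzzy"]
print(feedback("arise", "raise"))          # yyggg
print(average_remaining("arise", sample))  # 1.4
```

Note that the nested loop makes one pass over the whole list for every possible answer, so the work for a single guess grows with the square of the list size.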
This, I'd argue, gets us to a very pragmatic definition of "best word." The best word is the word that eliminates the most words--helping us rapidly whittle down the search space. And this is easy enough to code for a single word. Unfortunately, my current version takes about 20 seconds to calculate the average remaining words for a single test word. If I wanted to do an exhaustive search to determine the absolute best word, I'd expect it to take about three days (12,972 words at 20 seconds each is roughly 72 hours)--assuming nothing went wrong during that time.
Yes, I'd like to look at my code and see if there's any way I can squeeze out some extra performance (or any place where I'm doing something dumb)--but the sheer number of combinations and the size of the word list mean it's going to be expensive, no matter what we shave off in the margins.
So for now, let's create a list of curated test words and check them. We can use the earlier algorithms to prescreen the search space. Let's take 100 from the combo algorithm, and 100 from the frequency-with-duplicates list. Those two algorithms produced different top words, and both seemed valuable. I also added the following words, either because I saw them mentioned online, or because I know someone who uses one of them as a starter (and they asked me to check it for them). This adds "early," "later," "arise," "irate," "reais," "blahs," and "centu." Filtering out duplicates, this gives us 168 unique words to rank. And, it should finish running while I'm eating dinner (note: it didn't--but it did finish while I was watching the Olympics after).
So, what are the results? The ~ES words still make a strong showing, with "lares," "rales," and "tares" taking the top three spots. Then we get some familiar words from the other algorithm, with "soare" and "reais" rounding out the top five.
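Building that shortlist amounts to taking the top slice of each ranking, appending the extra words, and dropping repeats. A minimal sketch follows. `build_candidates` is a hypothetical helper, and the short rankings below are placeholders rather than the real top-100 lists.

```python
def build_candidates(combo_ranked, dup_ranked, extras, top_n=100):
    """Merge the top words from two rankings plus extras, without repeats."""
    merged = combo_ranked[:top_n] + dup_ranked[:top_n] + list(extras)
    # dict.fromkeys keeps the first occurrence of each word, in order.
    return list(dict.fromkeys(merged))

# Placeholder rankings (assumptions for illustration only).
combo_ranked = ["tares", "lares", "rales", "xylyl"]
dup_ranked = ["aeros", "arose", "tares", "fluffy"]
extras = ["arise", "arose"]

candidates = build_candidates(combo_ranked, dup_ranked, extras, top_n=3)
print(candidates)
```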
"Reais" is particularly interesting to me, because it's an anagram of my preferred start word, "arise." One of the reasons I like "arise" is because, if I don't get enough useful information on the first guess, I can always fall back on "tough" which hits the rest of the vowels and some other strong letters. So far, the combo hasn't let me down.
Unfortunately, all of the top five are somewhat esoteric. If you're looking for something more common, we can find a few in the top 20: "aeros" comes in at number 7, "rates" at 8, and "aloes," "reals," and "lanes" at 13, 14, and 15. Other words that I was tracking include "arise" (75th), "later" (139th), "irate" (157th), and "early" (165th).
So, does this mean I'm going to be switching my Wordle strategy? Eh, probably not. I'm pretty happy with the way things are going--and I'm already in the top 100, so that's good enough for me.
For those who are interested, here are the top 100 words with the expected number of remaining words.
- 1: lares - 288.74 avg.
- 2: rales - 292.11 avg.
- 3: tares - 302.49 avg.
- 4: soare - 303.83 avg.
- 5: reais - 304.76 avg.
- 6: nares - 305.55 avg.
- 7: aeros - 309.73 avg.
- 8: rates - 311.36 avg.
- 9: arles - 314.73 avg.
- 10: serai - 315.13 avg.
- 11: saner - 317.74 avg.
- 12: tales - 325.59 avg.
- 13: aloes - 326.70 avg.
- 14: reals - 329.11 avg.
- 15: lanes - 330.21 avg.
- 16: lears - 331.28 avg.
- 17: salet - 333.65 avg.
- 18: laers - 334.47 avg.
- 19: seral - 335.70 avg.
- 20: teras - 337.04 avg.
- 21: lores - 339.86 avg.
- 22: earls - 342.10 avg.
- 23: saine - 342.25 avg.
- 24: raise - 343.40 avg.
- 25: roles - 345.10 avg.
- 26: reans - 347.04 avg.
- 27: tears - 349.52 avg.
- 28: aeons - 350.06 avg.
- 29: nears - 350.40 avg.
- 30: cares - 350.94 avg.
- 31: aures - 352.22 avg.
- 32: dares - 352.58 avg.
- 33: nates - 352.67 avg.
- 34: stoae - 353.46 avg.
- 35: laser - 354.04 avg.
- 36: strae - 354.16 avg.
- 37: toeas - 354.70 avg.
- 38: tores - 358.14 avg.
- 39: mares - 359.21 avg.
- 40: sared - 360.94 avg.
- 41: races - 361.31 avg.
- 42: pares - 362.07 avg.
- 43: taser - 362.45 avg.
- 44: earns - 362.55 avg.
- 45: rones - 365.40 avg.
- 46: hares - 366.06 avg.
- 47: snare - 368.44 avg.
- 48: riles - 368.79 avg.
- 49: sayer - 369.99 avg.
- 50: rotes - 370.56 avg.
- 51: teals - 371.23 avg.
- 52: stare - 371.77 avg.
- 53: rapes - 372.65 avg.
- 54: leans - 373.50 avg.
- 55: leats - 374.99 avg.
- 56: aesir - 375.09 avg.
- 57: gares - 375.63 avg.
- 58: neals - 376.75 avg.
- 59: taels - 376.87 avg.
- 60: dales - 377.69 avg.
- 61: dates - 381.90 avg.
- 62: arose - 382.05 avg.
- 63: bares - 383.07 avg.
- 64: tires - 384.26 avg.
- 65: lades - 384.73 avg.
- 66: roues - 384.76 avg.
- 67: kaies - 385.22 avg.
- 68: rages - 385.32 avg.
- 69: arets - 386.25 avg.
- 70: slate - 386.73 avg.
- 71: toles - 388.74 avg.
- 72: setal - 389.84 avg.
- 73: dears - 390.07 avg.
- 74: aides - 391.98 avg.
- 75: arise - 392.47 avg.
- 76: laces - 393.12 avg.
- 77: ayres - 394.05 avg.
- 78: slane - 394.15 avg.
- 79: antes - 394.45 avg.
- 80: stear - 394.64 avg.
- 81: yales - 395.27 avg.
- 82: acres - 395.28 avg.
- 83: reads - 395.54 avg.
- 84: males - 395.64 avg.
- 85: canes - 396.79 avg.
- 86: taces - 396.91 avg.
- 87: lotes - 397.48 avg.
- 88: cates - 397.76 avg.
- 89: neats - 399.23 avg.
- 90: rakes - 399.66 avg.
- 91: stale - 399.72 avg.
- 92: pales - 399.98 avg.
- 93: lames - 400.68 avg.
- 94: tames - 401.32 avg.
- 95: resat - 401.51 avg.
- 96: ureas - 403.65 avg.
- 97: serac - 403.96 avg.
- 98: manes - 404.22 avg.
- 99: hales - 404.35 avg.
- 100: mates - 404.41 avg.