Sunday, February 6, 2022

Another Attempt at Calculating the Best Starting Word for Wordle

Like a lot of people, Wordle has become part of my daily routine. So, of course, I became quite interested when I started seeing a few articles online claiming to have scientifically calculated the best starting word. For me, this raised interesting questions--for example, how should we define "best"?

So I thought I'd try to get my hands on the data and play around with it myself. There are a lot of instructions that show you how to save off a local copy of Wordle--unfortunately, none of these worked for me. But, with a little digging using Safari's Inspect Elements command, I was able to find two lists of words.

Be aware, the first list is the ordered list of words that Wordle uses to produce its daily puzzles. So, you probably don't want to look at it too hard--unless you want to spoil the surprise for yourself. The second list appears to be an alphabetical list of all the other words that Wordle considers valid 5-letter words. One thing that wasn't immediately clear to me was the fact that none of the words in the first list are in the second list.

So I had to combine the two. I also alphabetically sorted the list, so I could no longer tell which list a word came from. This gave me 12,972 words to play with.

Obviously, letter frequency is important. So I started with a simple, naive approach. First, go through and count the number of times each letter appears in the set of words. Then, for each word, add up the counts for each of its letters. This lets me sort the words based on how common their letters are.
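Here's a minimal sketch of that naive scoring. The five-word list is a hypothetical stand-in for the real 12,972-word list, the function name is mine, and I'm assuming "count" means total occurrences of a letter across all words (counting once per word would be an equally reasonable reading).

```python
from collections import Counter

# Hypothetical stand-in for the combined 12,972-word list.
WORDS = ["esses", "arose", "fuzzy", "raise", "tares"]

def naive_scores(words):
    """Score each word by summing the list-wide count of each of its letters."""
    # Count every occurrence of every letter across the whole list.
    letter_counts = Counter("".join(words))
    # Duplicated letters get full credit each time they appear in a word.
    return {w: sum(letter_counts[c] for c in w) for w in words}

scores = naive_scores(WORDS)
ranked = sorted(WORDS, key=scores.get, reverse=True)
print(ranked[0], ranked[-1])
```

Even on this toy list, "esses" lands on top and "fuzzy" on the bottom, because every repeated S and E is counted again.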

This produced "esses" as the best word, and "fuzzy" as the worst. Obviously these aren't great results. "Esses" makes sense because S and E are the most common letters in our list of words--but I would never recommend it as an initial guess.

The problem is pretty easy to spot. S is the most common letter, and "esses" has three of them. Currently, we're getting full credit for the duplicate Ses--despite the fact that very few words actually have three.

OK, round two. Let's rework the algorithm to account for duplicates. This time, the first S will match any word with an S. The second S will only match words with two Ses. And so forth. As expected, this produced better results. "Aeros," "arose," and "soare" are tied for first place, with "aesir," "arise," "raise," "reais," and "serai" tied for second. 
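One way to sketch the duplicate-aware version: count, for each letter and each k, how many words contain that letter at least k times, then let the k-th copy of a letter in a guess score only against that count. The word list and function names here are hypothetical, and this is my reading of the rule rather than the exact code behind the numbers above.

```python
from collections import Counter

# Hypothetical stand-in for the combined word list.
WORDS = ["esses", "arose", "fuzzy", "raise", "tares"]

def occurrence_counts(words):
    """Map (letter, k) to the number of words with at least k copies of letter."""
    counts = Counter()
    for w in words:
        for letter, n in Counter(w).items():
            for k in range(1, n + 1):
                counts[(letter, k)] += 1
    return counts

def dedup_score(word, counts):
    """The k-th copy of a letter only matches words with at least k copies."""
    seen = Counter()
    total = 0
    for c in word:
        seen[c] += 1
        total += counts[(c, seen[c])]
    return total

counts = occurrence_counts(WORDS)
ranked = sorted(WORDS, key=lambda w: dedup_score(w, counts), reverse=True)
print(ranked)
```

With this change, the second and third S in "esses" earn almost nothing, so it drops below words with five distinct common letters.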

Oddly, the worst word was "fluffy," which felt counterintuitive to me. It seems like "pzazz" or "xylyl" would be worse. But "pzazz" has 7649 matches. "Xylyl" has 5725, but "fluffy" only has 5582.

While this is good, there's still room for improvement. The frequency-with-duplicates counts don't take green hits (having the right letter in the right space) into account. Green hits should be worth more than yellow. Doing some simple back-of-the-napkin guessing, a single green can be 4x as valuable as a single yellow--since for a yellow, there are still 4 places where it could go. Of course, the actual value is probably less--after all, I'd much rather get 4 yellows than a single green.

So, let's count the letter frequency for each of the five positions, and only score each letter based on the frequency in its position. This lets us sort the list based on the expected number of green hits. Unfortunately, in practice, this doesn't appear to be very helpful. "Sores," "sanes," "sales," and similar words make up the top of the list--which kinda drives home the importance of a starting S or ending ~ES, but doesn't really tell us much else.
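The positional version can be sketched with one counter per slot. Again, the short word list and the function names are made up for illustration.

```python
from collections import Counter

# Hypothetical stand-in for the combined word list.
WORDS = ["sores", "sales", "arose", "raise", "fuzzy"]

def positional_counts(words):
    """One Counter per position: how many words have each letter in that slot."""
    return [Counter(w[i] for w in words) for i in range(5)]

def green_score(word, pos_counts):
    """Sum, over the five slots, the number of words sharing the letter in that slot."""
    return sum(pos_counts[i][c] for i, c in enumerate(word))

pos_counts = positional_counts(WORDS)
for w in sorted(WORDS, key=lambda w: green_score(w, pos_counts), reverse=True):
    print(w, green_score(w, pos_counts))
```

Dividing a score by the number of words gives the expected number of greens, assuming every word in the list is equally likely to be the answer.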

OK, let's combine the last two methods. This takes into account both green and yellow matches, and gives a green match twice the weight of a yellow match. Not surprisingly, these results look promising. It gives us a lot of new words, with "tares," "lares," "rales," and other words ending in ~ES jumping to the top. Also, "xylyl" finally shows up as the worst word.
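Here's one possible way to combine the two scores. I'm assuming the 2x weighting works like this: every letter gets one point per word that contains it (the duplicate-aware count), plus a bonus point per word that has it in the same position, so a green ends up worth twice a yellow. The `green_bonus` parameter, the function names, and the word list are all hypothetical.

```python
from collections import Counter

# Hypothetical stand-in for the combined word list.
WORDS = ["tares", "arose", "raise", "esses", "xylyl"]

def build_tables(words):
    presence = Counter()  # (letter, k) -> words with at least k copies of letter
    position = Counter()  # (index, letter) -> words with letter at that index
    for w in words:
        for letter, n in Counter(w).items():
            for k in range(1, n + 1):
                presence[(letter, k)] += 1
        for i, c in enumerate(w):
            position[(i, c)] += 1
    return presence, position

def combined_score(word, presence, position, green_bonus=1):
    """Presence credit for every letter, plus a bonus for positional matches."""
    seen = Counter()
    total = 0
    for i, c in enumerate(word):
        seen[c] += 1
        total += presence[(c, seen[c])] + green_bonus * position[(i, c)]
    return total

presence, position = build_tables(WORDS)
ranked = sorted(WORDS, key=lambda w: combined_score(w, presence, position), reverse=True)
print(ranked)
```

Raising `green_bonus` shifts the ranking toward the positional list, which is exactly the knob that invites fine-tuning the results to match expectations.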

We could continue to tweak the weight between the green and yellow hits--but that feels like we'd just be fine-tuning the list to match our expectations. Instead, let's take a step back and look at the problem again. 

When you make a guess, you get a lot of information. Not just the green and yellow matches, but you also learn about all the letters that can't be used, or can't be used in certain positions. So far, all of our algorithms throw away this extra information. However, it's relatively easy to filter the word list based on the greens, yellows, and nopes.

Here's how it works. Take any guess word. Compare it to the first word in the word list, treating that word as the answer, then filter the word list based on the results. Count the number of possible words remaining. Continue to check your guess word against each word in the word list, and you can calculate the average words remaining for that guess word.
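A minimal sketch of that loop follows. It assumes the filter keeps every word that would have produced the same green/yellow/nope pattern for the guess, which handles duplicate letters the way Wordle does. The function names and the tiny word list are hypothetical, not the code that produced the timings in this post.

```python
from collections import Counter

# Hypothetical stand-in for the combined word list.
WORDS = ["arise", "raise", "tares", "fuzzy", "esses"]

def feedback(guess, answer):
    """Return a 5-character pattern: G = green, Y = yellow, - = nope."""
    result = ["-"] * 5
    unmatched = Counter()  # answer letters not already claimed by a green
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if result[i] != "G" and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)

def remaining(guess, answer, words):
    """Words still possible after seeing the pattern for guess against answer."""
    pattern = feedback(guess, answer)
    return [w for w in words if feedback(guess, w) == pattern]

def average_remaining(guess, words):
    """Average count of surviving words, treating each word as the answer in turn."""
    return sum(len(remaining(guess, a, words)) for a in words) / len(words)

for guess in ("arise", "fuzzy"):
    print(guess, average_remaining(guess, WORDS))
```

On the toy list, "arise" separates every word (average 1.0), while "fuzzy" leaves four words lumped together. The same average can be computed by grouping words by pattern once and summing the squared group sizes, then dividing by the list size, which needs one `feedback` call per word instead of one per pair of words.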

This, I'd argue, gets us to a very pragmatic definition of "best word." The best word is the word that eliminates the most words--helping us rapidly whittle down the search space. And this is easy enough to code for a single word. Unfortunately, my current version takes about 20 seconds to calculate the average remaining words for a single test word. If I wanted to do an exhaustive search to determine the absolute best word, I'd expect it to take about three days--assuming nothing went wrong during that time.
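The three-day figure checks out as quick arithmetic, assuming the exhaustive search tries all 12,972 words as the guess at roughly 20 seconds apiece (the variable names are just for illustration):

```python
SECONDS_PER_WORD = 20   # rough time to score one test word, from above
WORD_COUNT = 12_972     # size of the combined word list

total_seconds = SECONDS_PER_WORD * WORD_COUNT
days = total_seconds / 86_400  # seconds in a day
print(f"{total_seconds:,} seconds is about {days:.2f} days")
```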

Yes, I'd like to look at my code and see if there's any way I can squeeze out some extra performance (or any place where I'm doing something dumb)--but the sheer number of combinations and the size of the word list mean it's going to be expensive, no matter what we shave off in the margins.

So for now, let's create a list of curated test words and check them. We can use the earlier algorithms to prescreen the search space. Let's take 100 from the combo algorithm, and 100 from the frequency-with-duplicates list. These produced different top words, and both seemed valuable. I also added the following words, either because I saw them mentioned online, or because I know someone who uses it as a starter (and they asked me to check it for them). This adds "early," "later," "arise," "irate," "reais," "blahs," and "centu." Filtering out duplicates, this gives us 168 unique words to rank. And, it should finish running while I'm eating dinner (note: it didn't--but it did finish while I was watching the Olympics after).

So, what are the results? The ~ES words are still making a strong showing, with "lares," "rales," and "tares" taking the top three spots. Then we get some familiar words from the other algorithm, with "soare" and "reais" rounding out the top five.

"Reais" is particularly interesting to me, because it's an anagram of my preferred start word, "arise." One of the reasons I like "arise" is because, if I don't get enough useful information on the first guess, I can always fall back on "tough" which hits the rest of the vowels and some other strong letters. So far, the combo hasn't let me down.

Unfortunately, all of the top five are somewhat esoteric. If you're looking for something more common, we can find a few in the top 20: "aeros" comes in at number 7, "rates" at 8, and "aloes," "reals," and "lanes" at 13, 14, and 15. Other words that I was tracking include "arise" (75th), "later" (139th), "irate" (157th), and "early" (165th).

So, does this mean I'm going to be switching my Wordle strategy? Eh, probably not. I'm pretty happy with the way things are going--and I'm already in the top 100, so that's good enough for me.

For those who are interested, here are the top 100 words with the expected number of remaining words. 

- 1: lares - 288.74 avg.

- 2: rales - 292.11 avg.

- 3: tares - 302.49 avg.

- 4: soare - 303.83 avg.

- 5: reais - 304.76 avg.

- 6: nares - 305.55 avg.

- 7: aeros - 309.73 avg.

- 8: rates - 311.36 avg.

- 9: arles - 314.73 avg.

- 10: serai - 315.13 avg.

- 11: saner - 317.74 avg.

- 12: tales - 325.59 avg.

- 13: aloes - 326.70 avg.

- 14: reals - 329.11 avg.

- 15: lanes - 330.21 avg.

- 16: lears - 331.28 avg.

- 17: salet - 333.65 avg.

- 18: laers - 334.47 avg.

- 19: seral - 335.70 avg.

- 20: teras - 337.04 avg.

- 21: lores - 339.86 avg.

- 22: earls - 342.10 avg.

- 23: saine - 342.25 avg.

- 24: raise - 343.40 avg.

- 25: roles - 345.10 avg.

- 26: reans - 347.04 avg.

- 27: tears - 349.52 avg.

- 28: aeons - 350.06 avg.

- 29: nears - 350.40 avg.

- 30: cares - 350.94 avg.

- 31: aures - 352.22 avg.

- 32: dares - 352.58 avg.

- 33: nates - 352.67 avg.

- 34: stoae - 353.46 avg.

- 35: laser - 354.04 avg.

- 36: strae - 354.16 avg.

- 37: toeas - 354.70 avg.

- 38: tores - 358.14 avg.

- 39: mares - 359.21 avg.

- 40: sared - 360.94 avg.

- 41: races - 361.31 avg.

- 42: pares - 362.07 avg.

- 43: taser - 362.45 avg.

- 44: earns - 362.55 avg.

- 45: rones - 365.40 avg.

- 46: hares - 366.06 avg.

- 47: snare - 368.44 avg.

- 48: riles - 368.79 avg.

- 49: sayer - 369.99 avg.

- 50: rotes - 370.56 avg.

- 51: teals - 371.23 avg.

- 52: stare - 371.77 avg.

- 53: rapes - 372.65 avg.

- 54: leans - 373.50 avg.

- 55: leats - 374.99 avg.

- 56: aesir - 375.09 avg.

- 57: gares - 375.63 avg.

- 58: neals - 376.75 avg.

- 59: taels - 376.87 avg.

- 60: dales - 377.69 avg.

- 61: dates - 381.90 avg.

- 62: arose - 382.05 avg.

- 63: bares - 383.07 avg.

- 64: tires - 384.26 avg.

- 65: lades - 384.73 avg.

- 66: roues - 384.76 avg.

- 67: kaies - 385.22 avg.

- 68: rages - 385.32 avg.

- 69: arets - 386.25 avg.

- 70: slate - 386.73 avg.

- 71: toles - 388.74 avg.

- 72: setal - 389.84 avg.

- 73: dears - 390.07 avg.

- 74: aides - 391.98 avg.

- 75: arise - 392.47 avg.

- 76: laces - 393.12 avg.

- 77: ayres - 394.05 avg.

- 78: slane - 394.15 avg.

- 79: antes - 394.45 avg.

- 80: stear - 394.64 avg.

- 81: yales - 395.27 avg.

- 82: acres - 395.28 avg.

- 83: reads - 395.54 avg.

- 84: males - 395.64 avg.

- 85: canes - 396.79 avg.

- 86: taces - 396.91 avg.

- 87: lotes - 397.48 avg.

- 88: cates - 397.76 avg.

- 89: neats - 399.23 avg.

- 90: rakes - 399.66 avg.

- 91: stale - 399.72 avg.

- 92: pales - 399.98 avg.

- 93: lames - 400.68 avg.

- 94: tames - 401.32 avg.

- 95: resat - 401.51 avg.

- 96: ureas - 403.65 avg.

- 97: serac - 403.96 avg.

- 98: manes - 404.22 avg.

- 99: hales - 404.35 avg.

- 100: mates - 404.41 avg.


Monday, April 4, 2011

Blog Moving

I've moved my blog to www.freelancemadscience.com. I hope to have new things to announce there soon.

-Rich-

Thursday, March 3, 2011

What the frick is wrong with Instructables

I'm a big DIY nut. I admit, I don't have enough time to do any big projects on my own. My writing, coding and kids keep me pretty busy. But I love reading about other people's projects, and dreaming up things that I might try to do someday (when I have enough space, money and time).

As a result, I've been following the Instructables feed for several years now. At first it was a very cool mix of interesting food and technology projects. Things that really inspired the imagination, with clear step-by-step instructions on how to do it yourself.

Recently, however, I've noticed that a growing number of Instructables posts are either inane (e.g. "How to put on pants") or they're simply someone showing off their own project without bothering to provide the step-by-step instructions. As a result, I've been growing more and more frustrated with the feed. And, I hate to admit it, but the time has come. Something must be done.

This is actually a symptom of a broader internet problem. Something similar happened with Twitter's trending topics. At first, they were a great way to keep track of geek news--new movies, new technology, new video games, whatever. If it was a trending topic, I was probably interested in it. Now, however, they're almost useless to me. The Twitter demographic has clearly shifted to a more mainstream audience, and I just can't bring myself to care about Justin Bieber or Ke$ha. Ironically, I was going to use the current trending topics to prove my point, but Blade Runner is currently #2, so maybe all hope is not lost.

Now the problem with Twitter and the problem with Instructables are not exactly the same. In Twitter's case, it's more of a lowest common denominator issue. For Instructables, that's part of the problem--more kids posting low-quality 'Ibles (not to pick on kids--I like to see them getting involved and trying new things--but if I see another "I'm 13 and this is my first 'Ible, please be nice" post that goes on to show me how to draw ligers using only blue and purple crayons, I may have to gouge out my own eyes). But there's also a ton of people posting "joke" 'Ibles (I use quotes, because they really aren't funny). This feels more like the comment troll problem, where a few individuals seem intent on entertaining themselves at everyone else's expense.

In both cases (and in the internet at large), there's still a lot of really cool stuff going on. It's just getting harder and harder to find it. For Instructables, it's become painfully clear that sucking on the main RSS fire hose is no longer the way to go. I'll give the Editor's Pick or Popular feed a try. Hopefully that will give me a more-curated experience. For Twitter, I'll carefully select the people I follow, and blissfully ignore trending topics. For the internet at large--who knows. We've been struggling with this issue for years now, and while the battlefield shifts around a bit, it really hasn't gotten any better.

We're still in the early days of this technology, where rampant growth and changes are the norm. The current hope is that social search will save us. I'm not so optimistic. Still, by the time the internet matures, we should have better tools for content discovery. But, who knows when (or if) that will occur.

-Rich-

Thursday, February 17, 2011

Thoughts on Watson

Ok, let me start by admitting that I don't watch Jeopardy. I don't even own a TV. I have, however, been following the epic man-vs-machine battle that played out this week, and I must say, I'm honestly surprised. Not that the computer won. That was inevitable. If not this year, then someday soon. No, I was surprised that most of the people who commented on Watson's victory completely missed the point.

If you look online (go ahead, you know you want to), you'll find a lot of people talking about how, of course, the computer won. They say its reaction time gave it an unfair advantage. That it could push its button faster, cutting out the human opponents. Or they talk about its database. Of course, if you load all that data into a machine, you'll be able to answer any trivia question with ease.

In both cases, the commenters are fixating on the minor details and missing the main point. Watson was able to parse natural language questions and come up with reasonable answers. That was the hard part. That was the key accomplishment. Natural language processing is unbelievably difficult (trust me, I worked in the NLP lab as an undergraduate). Button pushing and database access are trivial.

The interesting thing is, this shows our natural bias. As humans, parsing the question is easy, so easy we don't even think about it. Instead, we focus on the things that give us trouble. Do we know the answer? Can we beat our opponent to the buzzer? Those are the areas that concern us, so any perceived advantage in those areas seems grossly unfair. But, in doing this, we forget the first step. You must understand the question before you can answer it.

This just highlights the differences between humans and computers. Our brains and Watson's brain have vastly different areas of competence. And let's face it, our brains work very hard to make many tasks seem trivial (object recognition, natural language processing, etc.). Even the dumbest Wheel of Fortune contestant has more processing power inside their skull than Watson could ever dream of. But here's the thing. Computers can always add more processing power. Human brains--not so much. Once computers get as complicated as a human brain, then things really get interesting.

I will say, I am a bit more moved by Noam Chomsky's criticism of Watson, dismissing it as "a bigger steamroller". Chomsky claims that Watson doesn't really understand the questions. But, I'm not so sure. How do we measure understanding? It seems like this line of argument steps into a murky, metaphysical swamp, from which we can never escape.

Instead, I tend to agree with Kurzweil's comments (from the same article):

Kurzweil says that Chomsky’s “answers are so brief that it is difficult to understand what he is trying to say. I would say that Watson is clearly not yet ‘strong AI’, but it is an important step in that direction. It is the clearest demonstration I’ve seen of computers handling the subtleties of language including metaphors, puns and jokes, something people had said would not be possible. I don’t agree with Chomsky that Watson is not impressive in that regard. As long as AI has any flaws or limitations, people will jump on these. By the time that the set of these limitations is nil, AI will have long since surpassed unaided human intelligence.”


-Rich-

Wednesday, January 12, 2011

Jumping into Existing Projects

For the last 6 months, I've had the good fortune to work on a number of awesome projects. Things that I would love to show to other people and brag about, but I always hesitate because they aren't unconditionally awesome--and they aren't all mine.

As part of my contracting gig, I've jumped into a number of projects that were about 75% finished. My job was to polish them up and get them ready for the app store. While I'm really proud of the work I've done, there are several lingering problems that I really wish I could go back and "do right." Most of these are existing features that worked (more or less), and we just didn't have the time or money to fix.

There were also a number of management-level decisions that I didn't entirely agree with. Things like the way we handled in-app ads. I mean, I understand. My client needs to make money. Hell, I want them to make money. After all, I want them to keep paying me. But, it would have been nice if we could have toned things down a bit. Still, they asked for it; I implemented it. No shame in that.

Mostly, though, I don't want to look like I'm taking credit for someone else's work. After all, they led the horse to water--I just got it to drink. Or, in other words, I didn't make it, I just made it awesome.

The really weird thing is that this uneasy feeling cuts the other way as well. I often find myself having trouble talking to my clients, especially when things start to go wrong.

I mean, I don't want to be THAT guy. You know the one. The guy who is always whining about the last person who had his job. Always finding some way to blame them or excuse his poor performance on them. In an ideal world, I want to be the one who finds the problems, owns the problems and fixes the problems. But when my clients are pestering me for results, and I'm fighting to meet a tight deadline, and I just spent the last 15 hours cleaning up someone else's mess...well, it's hard to find a constructive way to express my concerns. I mean, let's be honest, (and I'm sure Obama would have my back here) sometimes it really is the Bush Administration's fault. Or in my case, it's the fault of whatever knuckle-typing orangutan they hired to cobble this thing together.

Don't get me wrong, I love a good orangutan. But, let's face it, they are the hippies of the simian world. If you need a sidekick to ride on your hog and watch your back at the roadhouse, then an orangutan is definitely the way to go. But, despite the convincing neck beard, they should not be allowed anywhere near code.

And I guess that's the real lesson here. Don't hire orangutans. They may work for bananas, but eventually you'll have to hire someone like me to come in and shovel out their poop. In the long run, those will become the most expensive bananas you've ever purchased.

-Rich-

Friday, November 5, 2010

From the Crazy Idea Lab

Hey,

Over the last couple of days, an idea has begun to bubble up from the bottom recesses of my brain. Preparing and putting on my presentation just reminded me how much I enjoy teaching. As a result, I'm seriously considering putting together a few tech training courses. I think I'd like to start by focusing on "Intro to iOS Programming" classes, then possibly branch out into other topic areas (Ruby on Rails comes to mind).

If anyone would be interested, please drop me a line. Also, let me know what sort of topics you would be interested in covering, as well as preferences for class length, etc.

Thanks,

-Rich-

MacTech Conference

I'm sitting in the great hall on the last day of the MacTech conference. It's been a great 3 days. I've learned a lot, and had a chance to talk to so many interesting people. My presentation went over well (though, being sandwiched between two of the best sessions all conference, I am afraid it may have been somewhat overshadowed). It was just about the right length. I'm starting to feel presentation fatigue just as things are wrapping up.

I hope they will do this again next year.

-Rich-