I too have been playing Wordle. Imagine that!
My current browser session shows a streak of 44 days, and I played in a different browser before that, so I’ll call it an even 50.
This means it has taken me 50 days to finally break down and write some JavaScript to analyze words.
I’ve seen quite a few people share “the most common letters in the English language” and while those lists are interesting and probably accurate enough, they deal with the entire English language rather than words that are 5 letters long.
Aaron did a couple levels better and grabbed the word list from the JavaScript powering Wordle to analyze letter counts. Reading his post made me take a closer look at what I’ve been pondering: common character pairs.
When solving, I often think about possible word endings and beginnings. It’s easier to think of those as pairs of characters (or trios?) rather than individual letters. There’s a similar list published each day for the NYT Spelling Bee community to help prod brains into thinking of words they aren’t seeing.
So. I wrote a hacky little script that loops through possible words, slices each into its possible pairs, counts the occurrence of each pair in the entire word list, and then calculates a score for each word based on those character pair occurrences.
Phew!
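For the curious, here is a minimal sketch of that kind of loop. It is not the actual script: the tiny `words` array is a stand-in for the real word list, the function names are made up for illustration, and the scoring formula (add up the list-wide count of each pair in the word) is an assumption about one reasonable way to do it.

```javascript
// Hypothetical stand-in for the real word list.
const words = ["inter", "liner", "alert", "chalk", "stern"];

// Slice a word into its adjacent character pairs: "inter" -> in, nt, te, er.
function pairsOf(word) {
  const pairs = [];
  for (let i = 0; i < word.length - 1; i++) {
    pairs.push(word.slice(i, i + 2));
  }
  return pairs;
}

// Count how often each pair occurs across the entire list.
function countPairs(list) {
  const counts = new Map();
  for (const word of list) {
    for (const pair of pairsOf(word)) {
      counts.set(pair, (counts.get(pair) || 0) + 1);
    }
  }
  return counts;
}

// Assumed scoring: a word's score is the sum of the counts of its pairs.
function scoreWords(list) {
  const counts = countPairs(list);
  return list
    .map((word) => ({
      word,
      score: pairsOf(word).reduce((sum, pair) => sum + counts.get(pair), 0),
    }))
    .sort((a, b) => b.score - a.score);
}

const ranked = scoreWords(words);
console.log(ranked[0]); // { word: 'inter', score: 9 }
```

In this toy list, er shows up in four of the five words, which is why anything ending in er floats to the top.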
I then took the generated list and manually filtered any words with duplicate letters or that were anagrams of previous words. That gave me: inter, feral, liner, paler, alert, baler, later, stern, steal, and miner.
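I did that filtering by hand, but it could be scripted. A rough sketch, assuming the input is already sorted best-first; the function names and the sample input are hypothetical:

```javascript
// True when no letter appears more than once in the word.
function hasUniqueLetters(word) {
  return new Set(word).size === word.length;
}

// Anagrams share the same sorted letters: "inter" and "inert" both give "einrt".
function anagramKey(word) {
  return [...word].sort().join("");
}

// Walk the ranked list in order, dropping words with repeated letters
// and words that are anagrams of something already kept.
function filterRanked(ranked) {
  const seen = new Set();
  return ranked.filter((word) => {
    if (!hasUniqueLetters(word)) return false;
    const key = anagramKey(word);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

// Hypothetical ranked input, best first.
const kept = filterRanked(["inter", "inert", "eerie", "feral", "flare"]);
console.log(kept); // [ 'inter', 'feral' ]
```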
Okay. So I should start with inter.
If no words match with those letters, or if you’re playing with “easy” mode rules and like to waste your second guess 🙈, then the next word should be one sharing no letters with inter.
I filtered the list again (this time not manually) and found the top 5 to be: chalk, aloud, shoal, scaly, and macho.
So if inter doesn’t match any characters, play chalk next.
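The "shares no letters" filter is simple enough to sketch. This is an illustration, not the code I ran; `sharesNoLetters` and the `candidates` array are made-up names with a handful of sample words:

```javascript
// True when `word` has no letter in common with `first`.
function sharesNoLetters(word, first) {
  const used = new Set(first);
  return [...word].every((letter) => !used.has(letter));
}

// Hypothetical candidates, assumed to be sorted by score already.
const candidates = ["chalk", "liner", "aloud", "stern", "macho"];
const followUps = candidates.filter((word) => sharesNoLetters(word, "inter"));
console.log(followUps); // [ 'chalk', 'aloud', 'macho' ]
```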
Now, how does it work in practice? Well. I wouldn’t know yet. Today I accidentally played the anagram inert instead. 🤦🏻
It matched the r, but in the wrong position. I then played roach, which kept the r, but introduced the ch pair. That led to a match of a, o, and r, but all in the wrong position. Things seemed to be going okay, but then there were too many _A_OR words and for the first time… I lost.
That my first loss came from an attempt to script the best word is not lost on me. I’ve been laughing all morning.
Anyhow. I’m probably going to try this a couple times just to see how it goes. After that I’ll stick to my standard “think of an interesting word and go” strategy and instead do some post analysis on how inter and chalk would have worked. While it’s fun to poke at frequency counts, it would take a little bit of the magic away for me if I used the same words every time.
If you’d like, here’s some raw-er data on character pairs:
- Top 10 character pairs (total): er, in, st, al, ra, re, ar, ch, ro, or.
- Top 10 character pairs starting words: st, sh, cr, sp, ch, gr, re, tr, fl, br.
- Top 10 character pairs ending words: er, ch, ly, se, al, ck, ty, te, el, ge.
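The starting and ending lists only look at one pair per word. A small sketch of how they could be tallied, with a hypothetical `topPairs` helper and a five-word sample rather than the real list:

```javascript
// Count one pair per word (chosen by `pick`) and return the most common ones.
function topPairs(list, pick, limit = 10) {
  const counts = new Map();
  for (const word of list) {
    const pair = pick(word);
    counts.set(pair, (counts.get(pair) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([pair]) => pair);
}

// Hypothetical sample; the real lists came from the full word list.
const sample = ["stern", "steal", "chalk", "paler", "miner"];
const topStarts = topPairs(sample, (word) => word.slice(0, 2), 1);
const topEnds = topPairs(sample, (word) => word.slice(-2), 1);
console.log(topStarts, topEnds); // [ 'st' ] [ 'er' ]
```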
Responses and reactions
Replies
I love this ... and for the record you're my WordleParent as I found out about it from you ;)
I was using WASTE as my starter word due to the A-S-T-E but that hasn't borne out much success at all, so I'm going to try INTER to see how it goes!