## How to Beat Threes (and 2048)

**Threes** is a terrifically simple, yet tremendously intriguing game from the creative team of Asher Vollmer and Greg Wohlwend, known for their work on indie games such as Puzzlejuice and Ridiculous Fishing. While each of them has created many great indie games, I don’t think it’s an exaggeration to say that *Threes* could well be their best game yet.

The rules are very simple: every time you swipe across the screen, all of the tiles try to move in the direction you swiped. Two tiles can combine if their values add up to 3, or if the tiles are equal and both integer multiples of 3. If you try to combine two tiles (by squishing one against the wall) and they can’t, then they act as a barrier, and that particular column or row doesn’t budge. Finally, there’s the part which makes it tricky: Every time you move the tiles, another tile is introduced. The goal is to reach the elusive 6144 tile, or more realistically, to last as long as you can without running out of possible moves.
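The combining rule above can be written down in a few lines. This is only a sketch of the rule as stated in this article; the function names `can_merge` and `merged_value` are made up for illustration and are not taken from the game's code.

```python
def can_merge(a: int, b: int) -> bool:
    """Return True if two Threes tiles may combine under the rule described above."""
    # A 1 and a 2 (in either order) add up to 3.
    if {a, b} == {1, 2}:
        return True
    # Otherwise the tiles must be equal and both multiples of 3 (3, 6, 12, ...).
    return a == b and a >= 3 and a % 3 == 0


def merged_value(a: int, b: int) -> int:
    """Value of the tile produced by combining a and b."""
    if not can_merge(a, b):
        raise ValueError(f"tiles {a} and {b} cannot combine")
    return a + b


print(can_merge(1, 2), can_merge(1, 1), can_merge(6, 6), can_merge(3, 6))
```

Note that two 1s or two 2s never combine, which is a large part of what makes the game awkward.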

For reasons which are yet to be fully understood, this game has attracted a stunningly large audience of players. Perhaps it’s because the game is aesthetically appealing, or perhaps it’s because it seems easy at first to get past the first few stages, and yet it is nearly impossible to reach the final goal. It’s also a game which encourages study from just about everyone who’s played it (in the same way as Chess does), but which lacks any real threat of failure: a cheerful sound plays, and you get to see how many points you wound up with without losing in any sort of public way. In any case, even though it’s been around for only about two and a third months, there’s been an enormous amount of public interest and quite a few successful attempts to work out how the game works internally. Unfortunately, quite a few clones, some of which have offered notable modifications to the game, have also been created.

One of the most notable clones is Gabriel Cirulli’s *2048*, which is almost identical to *Threes*, except for a few differences:

- The tiles are the powers of 2 (2,4,8…) instead of three times the powers of two along with 1 and 2 (1,2,3,6,12,24…)
- Only tiles reading 2 and 4 are ever inserted, as opposed to the 1,2,3, and sometimes 6 or more of *Threes*
- The tiles slide as far as possible instead of moving at most one space
- The tiles are placed randomly on the board (in *Threes*, they only ever enter from the edge you swiped from)
- The goal is to get to 2048 instead of 6144, which makes the game a bit easier, since there are two types of tiles you never have to deal with, and
- *2048* is free and open-source, and this, more than anything else, has probably led to its popularity and the number of subsequent clones.

If you’ve never played *Threes* or *2048* before, I highly recommend giving them a try, if only so that you can develop your own intuition for these games.

One of the few things that everyone who’s played *Threes* or *2048* agrees about is that these games are *really* difficult. As it turns out, people have discovered quite a few strategies for playing these games which make them a bit easier, but usually not by very much. However, there is a complicated and arcane algorithm for *2048*, known as the Corner Strategy, that will allow you to match or surpass scores you may have spent hours achieving in just a few minutes, using only a few very simple calculations.

## The Corner Strategy

**Tap Right and Down in alternating order until you can’t move. Then press Up. Repeat.** This works *ridiculously* well, given the amount of thought necessary to run this algorithm.

(This isn’t sped up or time-lapsed in any way)

Of course, the standard corner strategy rarely gets you to the 2048 tile, though it sometimes does. There are other methods, such as the so-called Mancini technique (keep the largest number in a corner, construct chains, never press Left), but almost all are versions of the Corner Strategy.
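For the curious, the basic Corner Strategy is easy to simulate. The sketch below is a minimal *2048* model written for this purpose, not Cirulli’s code: it assumes a 4×4 board, a 90%/10% split between new 2s and 4s, and (my own addition) a fallback to Left when Right, Down, and Up are all blocked. All function names are hypothetical.

```python
import random

SIZE = 4


def slide_row_left(row):
    """Slide one row to the left, merging each pair of equal neighbours once."""
    tiles = [v for v in row if v]
    out, i = [], 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            out.append(tiles[i] * 2)
            i += 2
        else:
            out.append(tiles[i])
            i += 1
    return out + [0] * (len(row) - len(out))


def move(board, direction):
    """Return a new board after moving 'left', 'right', 'up' or 'down'."""
    vertical = direction in ("up", "down")
    rows = [list(col) for col in zip(*board)] if vertical else board
    if direction in ("right", "down"):
        rows = [slide_row_left(r[::-1])[::-1] for r in rows]
    else:
        rows = [slide_row_left(r) for r in rows]
    return [list(col) for col in zip(*rows)] if vertical else rows


def add_random_tile(board, rng):
    """Place a 2 (assumed 90%) or a 4 (assumed 10%) on a random empty square."""
    empty = [(r, c) for r in range(SIZE) for c in range(SIZE) if board[r][c] == 0]
    r, c = rng.choice(empty)
    board[r][c] = 2 if rng.random() < 0.9 else 4


def play_corner_strategy(seed=0):
    """Play one game with the Corner Strategy and return the final board."""
    rng = random.Random(seed)
    board = [[0] * SIZE for _ in range(SIZE)]
    add_random_tile(board, rng)
    add_random_tile(board, rng)
    prefer = ["right", "down"]
    while True:
        for direction in prefer + ["up", "left"]:
            new = move(board, direction)
            if new != board:
                break
        else:
            return board  # no direction changes the board: game over
        board = new
        add_random_tile(board, rng)
        if direction == "right":
            prefer = ["down", "right"]  # alternate: try Down first next time
        elif direction == "down":
            prefer = ["right", "down"]


final = play_corner_strategy(seed=0)
print(max(max(row) for row in final))
```

Running this for a range of seeds gives a rough feel for how far the strategy gets on its own.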

What we’d really like to know, though, is how to play *2048* or *Threes* optimally; that is, what is an algorithm that will play a game so as to gain the highest score? While this is almost certainly extraordinarily computationally intensive, a number of programmers have developed algorithms which play *2048* extraordinarily well. There was actually a bit of a competition on StackExchange not very long ago to design the best AI for the game, and many of the submissions were able to attain the 2048 tile almost all of the time! In particular, the best AI, developed by nneonneo, uses a surprisingly simple technique known as *expectimax optimization*, which works something like this:

- Consider the game tree of *2048*: We move in one direction, then the computer places a piece randomly, then we move in another direction, the computer places another piece randomly, and so on until we can’t move anymore.
- Suppose we can assign a “score” to each state of the board, which tells us roughly how good a position is, without actually looking into the future for possible moves or anything like that. The function used to calculate the score can range from something as simple as counting the number of empty spaces on the board to complicated heuristics (such as ovolve’s combination of monotonicity, smoothness, and free tiles). It’s a bit like looking at a chessboard and guessing how much one side is winning.
- That said, we can get a better idea of how good a particular state is if we can look a few moves ahead, and measure the approximate score of each of *those* positions, assuming we played optimally up to each one.
- Now, suppose we’re at a particular state, and we want to determine how good a move such as, say, moving right is. It’s actually fairly easy to compute the expected score of the *computer’s* move: just add up the score times the probability of a particular move for each move the computer could make. For instance, if there were a probability of 0.9 that the computer would place a 2 (say) resulting in a state with a score of 5, and a probability of 0.1 that the computer would place a 4, resulting in a state with a score of 2, then the expected score would be 0.9 × 5 + 0.1 × 2 = 4.7.
- If we know how good each move we could make is, then we should just play the best move (obviously).
- Expectimax optimization starts by asking “What is the best move I can make?” at the current state. To do that, it has to compute the score of each of the moves it could make, which it does by first branching over each of the moves the computer could make, and then measuring the score of each of the resulting positions by asking “What is the score of the best move I can make?” for each of *those*. Theoretically, this could go on forever, so expectimax just uses the base heuristic once it’s sufficiently far down the game tree (that is, once it’s thinking a particular number of moves ahead). Once it has decided on an accurate score for each of the possible moves it could make, it simply plays the one with the best score.
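The recursion described in the list above fits in a few lines of code. What follows is a generic sketch, not nneonneo’s implementation: the game is passed in as three hypothetical callbacks (`player_moves`, `chance_outcomes`, and `heuristic`), and the toy game tree at the bottom simply reproduces the 0.9/0.1 example.

```python
def expectimax(state, depth, player_moves, chance_outcomes, heuristic):
    """Score `state` with the player to move, searching `depth` player moves ahead."""
    moves = player_moves(state)  # list of (move_name, state_after_move)
    if depth == 0 or not moves:
        return heuristic(state)  # deep enough (or stuck): use the base heuristic
    return max(
        expected_score(after, depth, player_moves, chance_outcomes, heuristic)
        for _, after in moves
    )


def expected_score(after_move, depth, player_moves, chance_outcomes, heuristic):
    """Average over the computer's random placements, weighted by probability."""
    return sum(
        prob * expectimax(nxt, depth - 1, player_moves, chance_outcomes, heuristic)
        for prob, nxt in chance_outcomes(after_move)  # list of (probability, state)
    )


def best_move(state, depth, player_moves, chance_outcomes, heuristic):
    """Return (move_name, expected_score) for the highest-scoring move."""
    scored = [
        (name, expected_score(after, depth, player_moves, chance_outcomes, heuristic))
        for name, after in player_moves(state)
    ]
    return max(scored, key=lambda pair: pair[1])


# A toy game tree: states are just labels, and the scores are made up.
TOY_MOVES = {"start": [("right", "R"), ("down", "D")]}
TOY_CHANCE = {"R": [(0.9, "R+2"), (0.1, "R+4")], "D": [(1.0, "D+2")]}
TOY_SCORE = {"R+2": 5, "R+4": 2, "D+2": 4}

move, score = best_move(
    "start",
    1,
    lambda s: TOY_MOVES.get(s, []),
    lambda s: TOY_CHANCE[s],
    lambda s: TOY_SCORE[s],
)
print(move, round(score, 2))  # right 4.7
```

A real player would plug in the actual move and tile-placement rules plus a stronger heuristic, and would need caching and pruning to search more than a few moves deep.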

Not only is this algorithm *very* good at playing the game when equipped with a good base heuristic (nneonneo’s implementation achieved **4096** in 97 of 100 trials, and gets to 16384 about once in every eight attempts), it’s also very fast!

(this is *also* not sped up in any way- it really is doing 30 moves per second!)

Of course, if you have an AI that plays the game, it’s not difficult to create an AI that always places the new tile in the worst possible place for the player, making the game more or less impossible. (For instance, see Sam Hughes’s *Hatetris*.) This is exactly what Zsolt Sz. Sztupák has done with *2048-Hard*, based off of Matt Overlan’s solver. Interestingly enough, the “Impossible” mode isn’t *entirely* impossible: I actually managed to get the 64 tile, with a final score of 540, while the embedded AI solver often gets to the 128 tile.

## It’s Not Just You; Threes is Difficult for Computers as Well!

Unfortunately, if you try the Corner Strategy on *Threes*, you’ll probably get the *lowest* score you’ve ever gotten. In fact, the designers of *Threes* found out about the corner strategy fairly early on, and modified the game a bit to make braindead strategies like it ineffective. This has the side effect of making the game *much* more difficult.

*Threes*, actually, is a bit less random, for two main reasons:

- Not only do you get to see what type of card will be placed next, but you can also predict future tiles by counting cards! According to TouchArcade member kamikaze28, the tiles are drawn from a shuffled deck of 12 cards (4 1s, 4 2s, and 4 3s), which is reshuffled every time the deck runs out of cards. (This means, for instance, that if you’ve just drawn 2 1s, 3 2s, and 4 3s, and the next card is a 2, then the one after that will almost certainly be a 1.) Additionally, if the highest card on the board is greater than 24, there is a 1 in 21 chance that the next card will come not from the deck of normal cards, but will be (apparently?) randomly chosen out of a set of cards from 6 to (top card)/8.
- As mentioned above, cards can come only from the side you swipe from, and even then they only enter into rows or columns that just moved. This is incredibly useful for combining 1s and 2s, although it’s still very easy to get a stray 1 or 2 in an inconvenient area on the board.
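Card counting under the 12-card deck theory is simple bookkeeping. The sketch below assumes kamikaze28’s description is right and that you have been counting since the start of a fresh deck; it ignores the occasional bonus card entirely. The class name `DeckCounter` is made up for illustration.

```python
from collections import Counter


class DeckCounter:
    """Track which cards remain in the (assumed) 12-card deck of 1s, 2s and 3s."""

    def __init__(self):
        self.remaining = self._fresh_deck()

    @staticmethod
    def _fresh_deck():
        return Counter({1: 4, 2: 4, 3: 4})

    def draw(self, card):
        """Record that `card` was just drawn; reshuffle when the deck is empty."""
        if self.remaining[card] == 0:
            raise ValueError(f"no {card}s left in the deck; the count is out of sync")
        self.remaining[card] -= 1
        if sum(self.remaining.values()) == 0:
            self.remaining = self._fresh_deck()

    def probabilities(self):
        """Chance that the next regular card is a 1, 2 or 3."""
        total = sum(self.remaining.values())
        return {card: count / total for card, count in self.remaining.items()}


# The example from the text: two 1s, three 2s and four 3s drawn, then another 2.
deck = DeckCounter()
for card in [1, 1, 2, 2, 2, 3, 3, 3, 3, 2]:
    deck.draw(card)
print(deck.probabilities())  # {1: 1.0, 2: 0.0, 3: 0.0}
```

Only 1s remain at that point, which is why the next regular card has to be a 1.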

However, *Threes* not only has two more cards than *2048* (which alone would make it comparable to playing *2048* up to 8192), but the first two of these cards, unlike 2 and 4, cannot combine with themselves! According to the designers, as of March 28th only six people had actually reached the 6144 card in *Threes*. One of these people, known on TouchArcade as y2kmp3, gave a few observations on the game after reaching the final card:

1. Once again, it was a random “high number” tile card (in this case, a 384 tile card) that made this run a success.

2. The most difficult part of the game is to learn how to get out of a potential jam, the most dangerous of which is “staggering”. This occurs when a “low number” tile card appears between two very “high number” tile cards. It is very important to remove staggering as early as possible (without replacing it with another staggering).

3. I don’t use so-called center or corner strategy. Instead, I make it a priority to keep the “high number” tile card that I want to match against the wall, preferably not in a corner. This way, when a random “high number” tile card appears on the same wall, I can get to that card quickly to match.

4. While the game undoubtedly requires skills to win, the “element” of chance plays a significant role in this game. In fact, I would argue that chance dominates over skills in the later levels. I found it simply too difficult to maintain two separate chains to create two identical “high number” card tiles to merge. Instead, my strategy is to create only one “high number” tile card of each kind, so that whatever the random “high number” tile card appears, you can make use of it to escalate.

5. The game is quite taxing to play at the later levels. Near the end, I was keeping count of the 1’s and 2’s that were appearing and would frequently change my strategy when I could count on the fact that these tiles might not appear for awhile (assuming the stack theory is correct; see #6).

6. I, too, am convinced that there is some unknown stack from which the tile cards are drawn and this stack gets renewed and reshuffled.

7. I am fairly convinced that, given a number of open rows or columns where a new tile card can appear, their probabilities are NOT equal. More often than not, the new tile card would appear in a “less” favorable row or column instead of a “more” favorable row or column with which I could do an immediate match. I am fully aware of the potential issue of “recall” bias, so I welcome other players’ impression of this theory.

I should emphasize that good games of *Threes* take a *lot* of time to play through; y2kmp3’s run, for instance, lasted “10-15 hours”, most of it spent planning.

Although AIs have been written to play *Threes*, and even though *Threes* might appear to be a more deterministic game when compared to *2048*, I know of none that have actually beaten the game on an actual (non-simulated) device. However, a few (most notably Team Colorblind’s Threesus) have gotten very close.

So far as I know, the first *Threes* AI to have been published is the simulator by Nicola Salmoria (of MAME and Nontrivial Games), which uses expectimax at a depth of 9 with the following heuristic:

+4 points for each empty square

+4 points for every pair of adjacent cards that can be merged

-1 point for each card which is between two higher cards vertically or horizontally (-2 points if both)

The reasoning behind the scoring should be clear: reward empty spaces or spaces which can be emptied later, and penalize checkerboard patterns which are harder to get rid of.
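As a rough illustration, the three rules above might be coded as follows. This is my reading of the heuristic, not Salmoria’s code: I assume a 4×4 board stored as a list of lists with 0 for an empty square, that each adjacent pair is counted once, and that walls do not count as “higher cards”.

```python
def can_merge(a, b):
    """Threes merge rule: 1+2, or two equal multiples of 3."""
    return {a, b} == {1, 2} or (a == b and a >= 3 and a % 3 == 0)


def heuristic_score(board):
    """Score a board using the three rules listed above."""
    n = len(board)
    score = 0
    for r in range(n):
        for c in range(n):
            v = board[r][c]
            if v == 0:
                score += 4  # +4 for each empty square
                continue
            # +4 per mergeable adjacent pair; looking only right and down
            # ensures every pair is counted exactly once.
            if c + 1 < n and can_merge(v, board[r][c + 1]):
                score += 4
            if r + 1 < n and can_merge(v, board[r + 1][c]):
                score += 4
            # -1 if squeezed between two higher cards horizontally...
            if 0 < c < n - 1 and board[r][c - 1] > v and board[r][c + 1] > v:
                score -= 1
            # ...and another -1 if also squeezed vertically.
            if 0 < r < n - 1 and board[r - 1][c] > v and board[r + 1][c] > v:
                score -= 1
    return score


example = [
    [0, 6, 0, 0],
    [6, 3, 6, 0],
    [0, 6, 0, 0],
    [0, 0, 0, 0],
]
print(heuristic_score(example))  # 42
```

In the example, the 11 empty squares contribute 44 points and the 3 in the middle loses 2 for being boxed in both ways.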

Although he’s apparently still tuning the AI, Salmoria’s program has some pretty good simulated scores:

[percentage of times each card was reached]

384: 100%

768: 100%

1536: 88%

3072: 34%

6144: 5%

min score = 29,553

median score = 89,235

max score = 733,119

Note, however, that while his AI searches through nearly as many positions as Deep Blue, it almost never achieves a 6144. An “oracle” version of the AI (that is, one that knew all the future cards and where each card would be placed) managed to achieve a 12288 a whopping 18% of the time, which seems to indicate that his program probably doesn’t have any major bugs; *Threes* is just *that* difficult.

Probably the most famous attempt at beating *Threes* via computer analysis is the robotic **Threesus** from Team Colorblind.

Not only is Threesus a remarkably good player, but it’s also capable of playing *Threes* on an iPad using an Arduino and two servomotors. In a particular sense, this robot with a Twitch channel has turned *Threes* into something of a spectator sport (Matthew Wegner, one of the programmers of Threesus as well as one-half of Team Colorblind, usually streams it playing the game for a few hours every night). Perhaps because of this popularity, Threesus is continually being improved based on suggestions from channel viewers, and has gotten *very* close to reaching 6144 (at one point, it had enough material on the board to reach the final card, but things were disorganized enough that the board became blocked up). However, Threesus has been playing *Threes* constantly at the Aztez (Team Colorblind’s flagship game) booth at the 2014 Penny Arcade Expo, and at some point, whether by sheer perseverance or just random chance, it finally succeeded.

Threesus got a 6144 at the @aztezgame PAX booth today! Then the tile intro screen barfed AI state, oops: pic.twitter.com/Do7dKmKaxH

— Matthew Wegner (@mwegner) April 11, 2014

…and then the AI crashed. Or started making horrible moves. (I’m not exactly sure.)

As explained by Walt Destler (the other programmer of Threesus and a prolific game designer), Threesus uses expectimax with a depth of 6, and card-counts for the first 3 of those moves. (Afterwards, it assumes the cards are randomly distributed.) Furthermore, in order to increase performance, it codes the entire board as a single 64-bit integer, using 4 bits per square to represent values from 0 to 12288. Although this is almost identical to Salmoria’s approach, Threesus somehow has a better record of reaching every tile up to 6144, despite evaluating far fewer states!

100 games completed!

Total time: 03:03:22.0262799

Low Score: 30126

Median Score: 89436

High Score: 717960

% of games with at least a 384: 100%

% of games with at least a 768: 100%

% of games with at least a 1536: 94%

% of games with at least a 3072: 41%

% of games with at least a 6144: 1%

(Don’t read too much into the decrease in % of 6144. The difference between 1 game and 3 games is statistically insignificant.)
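The 64-bit board trick mentioned earlier works because there are exactly 16 possible square contents (empty, 1, 2, and the thirteen cards from 3 to 12288), which is precisely what 4 bits can hold. Here is one way such a packing could look; the exact bit layout Threesus uses isn’t described here, so the ordering and the function names below are my own assumptions.

```python
def tile_to_code(value):
    """Map a square's value to a 4-bit code: 0, 1, 2 as-is, then 3->3, 6->4, ..."""
    if value in (0, 1, 2):
        return value
    code = (value // 3).bit_length() + 2
    if code > 15 or code_to_tile(code) != value:
        raise ValueError(f"{value} is not a valid Threes card")
    return code


def code_to_tile(code):
    """Inverse of tile_to_code: codes 3..15 stand for 3, 6, 12, ..., 12288."""
    return code if code < 3 else 3 << (code - 3)


def pack(board):
    """Pack a 4x4 board (list of rows, 0 = empty) into one integer below 2**64."""
    packed = 0
    cells = [value for row in board for value in row]
    for i, value in enumerate(cells):
        packed |= tile_to_code(value) << (4 * i)  # square i lives in bits 4i..4i+3
    return packed


def unpack(packed):
    """Recover the 4x4 board from its packed form."""
    cells = [code_to_tile((packed >> (4 * i)) & 0xF) for i in range(16)]
    return [cells[r * 4:(r + 1) * 4] for r in range(4)]


board = [
    [3, 1, 0, 0],
    [6, 2, 0, 0],
    [48, 12, 3, 0],
    [12288, 384, 24, 1],
]
packed = pack(board)
print(hex(packed), unpack(packed) == board)
```

In a language with native 64-bit integers, a board stored this way can be copied, compared, and hashed as a single machine word, which helps a lot when searching many positions.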

So why does Threesus do so well? The answer, so far as I can tell, is that it uses a better evaluation function – that is, it’s better at determining, without doing any heavy computation, how good or bad a position is. It’s not actually all that difficult to make a good *Threes*-playing AI using mediocre heuristics, but it’s nearly impossible to create a great AI without great heuristics. The original evaluation function worked a bit like this:

- Every empty space is worth 2 points.
- Every matching pair of adjacent cards is worth 2 points.
- A card next to another card twice its value is worth 1 point.
- A card trapped between two other cards of higher value, or between a wall and a card of higher value, is penalized 1 point.

but since then, it’s been modified quite a bit:

- Every empty space is worth **3** points.
- Every matching pair of adjacent cards is worth 2 points.
- A card next to another card twice its value is worth 2 points.
- A card trapped between two other cards of higher value, or between a wall and a card of higher value, is *penalized* **5** points.
- Cards of the second-largest size get a bonus of 1 point if they’re next to the largest card, and an extra point if they’re next to a wall.
- Cards of the third-largest size get a bonus of 1 point if they’re next to a wall and are next to a card of the second-largest size.
- The largest card gets a +3 bonus if it’s next to one wall, or a +6 bonus if it’s in a corner.

Notice that last +6 bonus for having the largest card in the corner: Threesus uses a Corner Strategy!

In conclusion, while the world’s best *Threes* AIs are pretty good at playing the game, and occasionally beat it, there’s still room for experimentation and improvement- from modifying evaluation functions, to reverse-engineering the deeper secrets of the game, to even trying completely new search methods.

Finally, here’s a quick puzzle: What’s the largest tile you can possibly achieve on the board of *Threes*, assuming the random number generator will give you exactly the tiles you need it to?

Nice explaination. You should talk more in detail on the ecpectimax algorithm for 2048 and inculde a diagram too.

49,152 is the highest assuming you don’t get any bonus tiles, but with bonus tiles, practically infinite.

You forgot one fact in your article:

2048 is to Threes! as Dennis Leary is to Bill. Hicks, except Leary eventually wrote original material after Hicks died. Threes! and it’s devs are still around, yet 2048 is somehow more popular, despite being coded by rejects from North Korea (“LEFT AN’ UP 2 WIN!”)

Threes! is ridiculously hard, but unlike 2048, it’s a ridiculously hard *game*, rather than a ridiculously *shameless ripoff*.