

What is the optimal algorithm for the game 2048?


Question

I have recently stumbled upon the game 2048. You merge similar tiles by moving them in any of the four directions to make "bigger" tiles. After each move, a new tile appears at a random empty position with a value of either 2 or 4. The game ends when all the boxes are filled and there are no moves that can merge tiles, or when you create a tile with a value of 2048.

To reach the goal, I need to follow a well-defined strategy. So I thought of writing a program for it.

My current algorithm:

while (!game_over) {
    for each possible move:
        count the number of merges of 2-tiles and 4-tiles
    choose the move with the largest number of merges
}

What I am doing is, at any point, I try to merge the tiles with values 2 and 4; that is, I keep as few 2 and 4 tiles as possible. Played this way, all the other tiles automatically get merged, and the strategy seems good.
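For illustration, the greedy strategy above could be sketched in Python like this (a hypothetical sketch; `slide_left`, `count_small_merges`, and `best_move` are names invented here, not the question's actual code):

```python
# Hypothetical sketch of the greedy merge-count strategy.
# Board is a 4x4 list of lists; 0 means an empty cell.

def slide_left(row):
    """Slide one row left, merging equal neighbours once, 2048-style.
    Returns the new row and the list of merged values."""
    tiles = [t for t in row if t]            # drop empty cells
    out, merges = [], []
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            out.append(tiles[i] * 2)
            merges.append(tiles[i] * 2)      # record the merged value
            i += 2
        else:
            out.append(tiles[i])
            i += 1
    return out + [0] * (len(row) - len(out)), merges

def count_small_merges(board, move):
    """Count merges of 2- and 4-tiles produced by `move` (0=left, 1=right, 2=up, 3=down)."""
    grid = [list(r) for r in board]
    if move in (2, 3):                       # transpose so up/down become left/right
        grid = [list(c) for c in zip(*grid)]
    if move in (1, 3):                       # reverse rows so right/down become left
        grid = [r[::-1] for r in grid]
    total = 0
    for row in grid:
        _, merges = slide_left(row)
        total += sum(1 for m in merges if m in (4, 8))  # a 4 or 8 came from 2s or 4s
    return total

def best_move(board):
    return max(range(4), key=lambda m: count_small_merges(board, m))
```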

But when I actually use this algorithm, I only get around 4000 points before the game ends. The maximum score, AFAIK, is slightly more than 20,000 points, which is far larger than my current score. Is there a better algorithm than the above?

2017/02/22

Accepted Answer

I developed a 2048 AI using expectimax optimization, instead of the minimax search used by @ovolve's algorithm. The AI simply performs maximization over all possible moves, followed by expectation over all possible tile spawns (weighted by the probability of the tiles, i.e. 10% for a 4 and 90% for a 2). As far as I'm aware, it is not possible to prune expectimax optimization (except to remove branches that are exceedingly unlikely), and so the algorithm used is a carefully optimized brute force search.

Performance

The AI in its default configuration (max search depth of 8) takes anywhere from 10ms to 200ms to execute a move, depending on the complexity of the board position. In testing, the AI achieves an average move rate of 5-10 moves per second over the course of an entire game. If the search depth is limited to 6 moves, the AI can easily execute 20+ moves per second, which makes for some interesting watching.

To assess the score performance of the AI, I ran the AI 100 times (connected to the browser game via remote control). For each tile, here are the proportions of games in which that tile was achieved at least once:

2048: 100%
4096: 100%
8192: 100%
16384: 94%
32768: 36%

The minimum score over all runs was 124024; the maximum score achieved was 794076. The median score is 387222. The AI never failed to obtain the 2048 tile (so it never lost the game even once in 100 games); in fact, it achieved the 8192 tile at least once in every run!

Here's the screenshot of the best run:

32768 tile, score 794076

This game took 27830 moves over 96 minutes, or an average of 4.8 moves per second.

Implementation

My approach encodes the entire board (16 entries) as a single 64-bit integer (where tiles are the nybbles, i.e. 4-bit chunks). On a 64-bit machine, this enables the entire board to be passed around in a single machine register.

Bit shift operations are used to extract individual rows and columns. A single row or column is a 16-bit quantity, so a table of size 65536 can encode transformations which operate on a single row or column. For example, moves are implemented as 4 lookups into a precomputed "move effect table" which describes how each move affects a single row or column (for example, the "move right" table contains the entry "1122 -> 0023" describing how the row [2,2,4,4] becomes the row [0,0,4,8] when moved to the right).
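The row-extraction and move-table idea can be illustrated with a Python sketch (names invented here; I'm assuming the leftmost cell lives in the lowest nybble, and the actual implementation is optimized C++):

```python
# Sketch of the bitboard idea: each of the 16 cells stores log2(tile)
# in one nybble of a 64-bit integer.

ROW_MASK = 0xFFFF

def get_row(board, r):
    """Extract row r (0..3) as a 16-bit value: four 4-bit tile exponents."""
    return (board >> (16 * r)) & ROW_MASK

def set_row(board, r, row):
    return (board & ~(ROW_MASK << (16 * r))) | (row << (16 * r))

def slide_row_left(row16):
    """Reference 'move left' on one packed row, used to fill the lookup table."""
    exps = [(row16 >> (4 * i)) & 0xF for i in range(4)]
    tiles = [e for e in exps if e]
    out, i = [], 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            out.append(tiles[i] + 1)          # merging doubles the tile: exponent + 1
            i += 2
        else:
            out.append(tiles[i])
            i += 1
    out += [0] * (4 - len(out))
    packed = 0
    for i, e in enumerate(out):
        packed |= e << (4 * i)
    return packed

# Precompute all 65536 row results once; a move is then 4 table lookups.
MOVE_LEFT = [slide_row_left(r) for r in range(65536)]

def move_left(board):
    out = 0
    for r in range(4):
        out = set_row(out, r, MOVE_LEFT[get_row(board, r)])
    return out
```

The "1122 -> 0023" entry from the text corresponds here to `slide_row_left(0x2211) == 0x32` with this nybble ordering.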

Scoring is also done using table lookup. The tables contain heuristic scores computed on all possible rows/columns, and the resultant score for a board is simply the sum of the table values across each row and column.

This board representation, along with the table lookup approach for movement and scoring, allows the AI to search a huge number of game states in a short period of time (over 10,000,000 game states per second on one core of my mid-2011 laptop).

The expectimax search itself is coded as a recursive search which alternates between "expectation" steps (testing all possible tile spawn locations and values, and weighting their optimized scores by the probability of each possibility), and "maximization" steps (testing all possible moves and selecting the one with the best score). The tree search terminates when it sees a previously-seen position (using a transposition table), when it reaches a predefined depth limit, or when it reaches a board state that is highly unlikely (e.g. it was reached by getting 6 "4" tiles in a row from the starting position). The typical search depth is 4-8 moves.
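A self-contained toy version of this max/expectation alternation looks roughly like the following (a Python sketch with invented names; the original "open squares" bonus stands in for the full heuristic, and the transposition table and unlikely-branch cutoff are omitted):

```python
# Toy expectimax on a 4x4 board of tuples; 0 means an empty cell.

def slide_left(row):
    tiles = [t for t in row if t]
    out, i = [], 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            out.append(tiles[i] * 2)           # merge a pair once
            i += 2
        else:
            out.append(tiles[i])
            i += 1
    return tuple(out + [0] * (4 - len(out)))

def apply_move(board, m):  # 0=left, 1=right, 2=up, 3=down
    g = board
    if m >= 2:
        g = tuple(zip(*g))                     # transpose: up/down become left/right
    g = tuple(slide_left(r[::-1])[::-1] if m % 2 else slide_left(r) for r in g)
    if m >= 2:
        g = tuple(zip(*g))
    return g

def empty_cells(board):
    return [(i, j) for i in range(4) for j in range(4) if not board[i][j]]

def spawn(board, cell, v):
    return tuple(tuple(v if (i, j) == cell else board[i][j]
                       for j in range(4)) for i in range(4))

def heuristic(board):                          # stand-in: the "open squares" bonus
    return len(empty_cells(board))

def expectimax_move(board, depth=3):
    def maxv(b, d):                            # maximization step: best move
        moves = [m for m in range(4) if apply_move(b, m) != b]
        if d == 0 or not moves:
            return heuristic(b)
        return max(chance(apply_move(b, m), d - 1) for m in moves)

    def chance(b, d):                          # expectation step: 90% a 2, 10% a 4
        cells = empty_cells(b)
        if not cells:
            return heuristic(b)
        e = sum(0.9 * maxv(spawn(b, c, 2), d) + 0.1 * maxv(spawn(b, c, 4), d)
                for c in cells)
        return e / len(cells)

    moves = [m for m in range(4) if apply_move(board, m) != board]
    if not moves:
        return None
    return max(moves, key=lambda m: chance(apply_move(board, m), depth - 1))
```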

Heuristics

Several heuristics are used to direct the optimization algorithm towards favorable positions. The precise choice of heuristic has a huge effect on the performance of the algorithm. The various heuristics are weighted and combined into a positional score, which determines how "good" a given board position is. The optimization search will then aim to maximize the average score of all possible board positions. The actual score, as shown by the game, is not used to calculate the board score, since it is too heavily weighted in favor of merging tiles (when delayed merging could produce a large benefit).

Initially, I used two very simple heuristics, granting "bonuses" for open squares and for having large values on the edge. These heuristics performed pretty well, frequently achieving 16384 but never getting to 32768.

Petr Morávek (@xificurk) took my AI and added two new heuristics. The first heuristic was a penalty for having non-monotonic rows and columns which increased as the ranks increased, ensuring that non-monotonic rows of small numbers would not strongly affect the score, but non-monotonic rows of large numbers hurt the score substantially. The second heuristic counted the number of potential merges (adjacent equal values) in addition to open spaces. These two heuristics served to push the algorithm towards monotonic boards (which are easier to merge), and towards board positions with lots of merges (encouraging it to align merges where possible for greater effect).
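The two heuristics could look roughly like this (an illustrative Python sketch; the quadratic rank weighting below is my own assumption, not Petr's exact formula):

```python
def monotonicity_penalty(line):
    """Penalty for a non-monotonic row or column. `line` holds tile
    exponents (log2 of the value, empty = 0); larger ranks weigh more,
    so disorder among big tiles hurts far more than among small ones."""
    inc_pen = dec_pen = 0
    for a, b in zip(line, line[1:]):
        if a > b:            # a decrease: penalized if we want increasing order
            inc_pen += a ** 2 - b ** 2
        elif b > a:          # an increase: penalized if we want decreasing order
            dec_pen += b ** 2 - a ** 2
    return min(inc_pen, dec_pen)   # score the better of the two directions

def potential_merges(grid):
    """Count adjacent equal nonzero tiles, horizontally and vertically."""
    n, count = len(grid), 0
    for i in range(n):
        for j in range(n - 1):
            if grid[i][j] and grid[i][j] == grid[i][j + 1]:
                count += 1   # horizontal pair in row i
            if grid[j][i] and grid[j][i] == grid[j + 1][i]:
                count += 1   # vertical pair in column i
    return count
```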

Furthermore, Petr also optimized the heuristic weights using a "meta-optimization" strategy (using an algorithm called CMA-ES), where the weights themselves were adjusted to obtain the highest possible average score.

The effect of these changes is extremely significant. The algorithm went from achieving the 16384 tile around 13% of the time to achieving it over 90% of the time, and the algorithm began to achieve 32768 over 1/3 of the time (whereas the old heuristics never once produced a 32768 tile).

I believe there's still room for improvement on the heuristics. This algorithm definitely isn't yet "optimal", but I feel like it's getting pretty close.


That the AI achieves the 32768 tile in over a third of its games is a huge milestone; I will be surprised to hear if any human players have achieved 32768 on the official game (i.e. without using tools like savestates or undo). I think the 65536 tile is within reach!

You can try the AI for yourself. The code is available at https://github.com/nneonneo/2048-ai.

2015/05/08


I became interested in the idea of an AI for this game containing no hard-coded intelligence (i.e. no heuristics, scoring functions, etc.). The AI should "know" only the game rules and "figure out" the game play. This is in contrast to most AIs (like the ones in this thread), where the game play is essentially brute force steered by a scoring function representing human understanding of the game.

AI Algorithm

I found a simple yet surprisingly good playing algorithm: To determine the next move for a given board, the AI plays the game in memory using random moves until the game is over. This is done several times while keeping track of the end game score. Then the average end score per starting move is calculated. The starting move with the highest average end score is chosen as the next move.
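A self-contained Python sketch of this random-rollout scheme might look like the following (the names and board representation are invented here, not the answer's code):

```python
import random

def slide(row):
    """Slide one row left; return (new_row, score_gained)."""
    t = [x for x in row if x]
    out, i, gained = [], 0, 0
    while i < len(t):
        if i + 1 < len(t) and t[i] == t[i + 1]:
            out.append(t[i] * 2); gained += t[i] * 2; i += 2
        else:
            out.append(t[i]); i += 1
    return out + [0] * (len(row) - len(out)), gained

def step(grid, m):  # m: 0=left, 1=right, 2=up, 3=down
    """Apply a move without spawning; return (new_grid, score_gained, moved?)."""
    g = [list(r) for r in grid]
    if m >= 2: g = [list(c) for c in zip(*g)]   # transpose for up/down
    total = 0
    for i, r in enumerate(g):
        if m % 2: r = r[::-1]                   # reverse for right/down
        r2, gained = slide(r)
        total += gained
        g[i] = r2[::-1] if m % 2 else r2
    if m >= 2: g = [list(c) for c in zip(*g)]
    return g, total, g != [list(r) for r in grid]

def add_tile(grid):
    empties = [(i, j) for i in range(4) for j in range(4) if not grid[i][j]]
    if empties:
        i, j = random.choice(empties)
        grid[i][j] = 2 if random.random() < 0.9 else 4

def rollout(grid, first_move, max_steps=1000):
    """Play first_move, then random moves until stuck; return total score."""
    g, score, moved = step(grid, first_move)
    if not moved:
        return None
    add_tile(g)
    for _ in range(max_steps):
        ms = list(range(4)); random.shuffle(ms)
        for m in ms:
            g2, gained, moved = step(g, m)
            if moved:
                g, score = g2, score + gained
                add_tile(g)
                break
        else:
            break  # no legal move: game over
    return score

def choose_move(grid, runs=100):
    """Average rollout end-score per starting move; pick the best."""
    best, best_avg = None, -1.0
    for m in range(4):
        scores = [s for s in (rollout(grid, m) for _ in range(runs)) if s is not None]
        if scores and sum(scores) / len(scores) > best_avg:
            best, best_avg = m, sum(scores) / len(scores)
    return best
```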

With just 100 runs (i.e., in-memory games) per move, the AI achieves the 2048 tile 80% of the time and the 4096 tile 50% of the time. Using 10000 runs gets the 2048 tile 100% of the time, the 4096 tile 70%, and the 8192 tile about 1%.

See it in action

The best achieved score is shown here:

best score

An interesting fact about this algorithm is that while the random-play games are unsurprisingly quite bad, choosing the best (or least bad) move leads to very good game play: A typical AI game can reach 70000 points and last 3000 moves, yet the in-memory random play games from any given position yield an average of 340 additional points in about 40 extra moves before dying. (You can see this for yourself by running the AI and opening the debug console.)

This graph illustrates this point: The blue line shows the board score after each move. The red line shows the algorithm's best random-run end game score from that position. In essence, the red values are "pulling" the blue values upwards towards them, as they are the algorithm's best guess. It's interesting to see that the red line is just a tiny bit above the blue line at each point, yet the blue line continues to increase more and more.

scoring graph

I find it quite surprising that the algorithm doesn't need to actually foresee good game play in order to choose the moves that produce it.

Searching later, I found that this algorithm might be classified as a pure Monte Carlo tree search algorithm.

First I created a JavaScript version, which can be seen in action here. This version can do hundreds of runs per move in decent time. Open the console for extra info. (source)

Later, in order to play around some more, I used @nneonneo's highly optimized infrastructure and implemented my version in C++. This version allows up to 100000 runs per move, and even 1000000 if you have the patience. Building instructions are provided. It runs in the console and also has a remote control to play the web version. (source)

Results

Surprisingly, increasing the number of runs does not drastically improve the game play. There seems to be a limit to this strategy at around 80000 points, with the 4096 tile and all the smaller ones, very close to achieving the 8192 tile. Increasing the number of runs from 100 to 100000 increases the odds of getting to this score limit (from 5% to 40%) but not of breaking through it.

Running 10000 runs with a temporary increase to 1000000 near critical positions managed to break this barrier less than 1% of the time, achieving a max score of 129892 and the 8192 tile.

Improvements

After implementing this algorithm I tried many improvements, including using the min or max scores, or a combination of min, max, and avg. I also tried using depth: instead of trying K runs per move, I tried K runs per move list of a given length ("up, up, left" for example) and selected the first move of the best-scoring move list.

Later I implemented a scoring tree that took into account the conditional probability of being able to play a move after a given move list.

However, none of these ideas showed any real advantage over the simple first idea. I left the code for these ideas commented out in the C++ code.

I did add a "Deep Search" mechanism that increased the run number temporarily to 1000000 when any of the runs managed to accidentally reach the next highest tile. This offered a time improvement.

I'd be interested to hear if anyone has other improvement ideas that maintain the domain-independence of the AI.

2048 Variants and Clones

Just for fun, I've also implemented the AI as a bookmarklet, hooking into the game's controls. This allows the AI to work with the original game and many of its variants.

This is possible due to the domain-independent nature of the AI. Some of the variants are quite distinct, such as the Hexagonal clone.

2018/06/03

EDIT: This is a naive algorithm modelling the human conscious thought process, and it gets very weak results compared to AIs that search all possibilities, since it only looks one tile ahead. It was submitted early in the response timeline.

I have refined the algorithm and beaten the game! It may fail due to simple bad luck close to the end (you are forced to move down, which you should never do, and a tile appears where your highest tile should be; just try to keep the top row filled, so that moving left does not break the pattern), but basically you end up having a fixed part and a mobile part to play with. This is your objective:

Ready to finish

This is the model I chose by default.

1024 512 256 128
  8   16  32  64
  4   2   x   x
  x   x   x   x

The chosen corner is arbitrary. You basically never press one key (the forbidden move), and if you do, you press the opposite again and try to fix it. For future tiles, the model always expects the next random tile to be a 2 and to appear on the opposite side of the current model (while the first row is incomplete, in the bottom-right corner; once the first row is completed, in the bottom-left corner).

Here goes the algorithm. It wins around 80% of the time (it seems it is always possible to win with more "professional" AI techniques, though I am not sure about this).

initiateModel();

while (!game_over)
{
    checkCornerChosen(); // Unimplemented, but it might be an improvement to change the reference point

    for each of the 3 possible moves:
        evaluateResult()
    execute move with best score
    if no move is available, execute forbidden move and undo, recalculateModel()
}

evaluateResult() {
    calculateBestCurrentModel()
    calculate distance to chosen model
    store result
}

calculateBestCurrentModel() {
    // (according to the current highest tile achieved and its distribution)
}

A few pointers on the missing steps. Here: model change

The model has changed, thanks to the luck of being closer to the expected model. The model the AI is trying to achieve is:

 512 256 128  x
  X   X   x   x
  X   X   x   x
  x   x   x   x

And the chain to get there has become:

 512 256  64  O
  8   16  32  O
  4   x   x   x
  x   x   x   x

The O's represent forbidden spaces.

So it will press right, then right again, then (right or up, depending on where the 4 was created), then will proceed to complete the chain until it gets:

Chain completed

So now the model and chain are back to:

 512 256 128  64
  4   8  16   32
  X   X   x   x
  x   x   x   x

Second pointer: it has had bad luck, and its main spot has been taken. It is likely that it will fail, but it can still achieve it:

Board with the main spot taken

Here the model and chain are:

  O 1024 512 256
  O   O   O  128
  8  16   32  64
  4   x   x   x

When it manages to reach the 128, a whole row is gained again:

  O 1024 512 256
  x   x  128 128
  x   x   x   x
  x   x   x   x
2015/02/04

I copy here the content of a post on my blog


The solution I propose is very simple and easy to implement. Nonetheless, it has reached a score of 131040. Several benchmarks of the algorithm's performance are presented.

Score

Algorithm

Heuristic scoring algorithm

The assumption on which my algorithm is based is rather simple: if you want to achieve a higher score, the board must be kept as tidy as possible. In particular, the optimal setup is given by a linear and monotonically decreasing order of the tile values. This intuition also gives you the upper bound for a tile value: 2^(n+1), where n is the number of tiles on the board.

(There's a possibility to reach the 131072 tile if the 4-tile is randomly generated instead of the 2-tile when needed)

Two possible ways of organizing the board are shown in the following images:

(two example board organizations)

To enforce the ordering of the tiles in a monotonically decreasing order, the score is computed as the sum of the linearized values on the board, multiplied by the values of a geometric sequence with common ratio r < 1:

score = sum_{k=0}^{n-1} v_k * r^k

where v_k is the k-th tile value along the linearized path.

Several linear paths can be evaluated at once; the final score will be the maximum score over all paths.
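This path-based scoring can be sketched as follows (a hedged Python sketch; the ratio and the particular set of snake paths are illustrative choices, not necessarily the post's exact ones):

```python
def snake_path(grid):
    """Linearize a 4x4 grid along a snake: row 0 left-to-right,
    row 1 right-to-left, and so on."""
    out = []
    for i, row in enumerate(grid):
        out.extend(row if i % 2 == 0 else row[::-1])
    return out

def path_score(grid, r=0.25):
    """Sum of tile values weighted by a geometric sequence with ratio r < 1."""
    return sum(v * r ** k for k, v in enumerate(snake_path(grid)))

def board_score(grid, r=0.25):
    """Evaluate several snake orientations; keep the best."""
    candidates = [grid, [row[::-1] for row in grid],
                  grid[::-1], [row[::-1] for row in grid[::-1]]]
    return max(path_score(g, r) for g in candidates)
```

A tidy, monotonically decreasing snake scores much higher than the same tiles scattered, which is exactly what drives the search toward ordered boards.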

Decision rule

The decision rule implemented is not particularly smart; the Python code is presented here:

@staticmethod
def nextMove(board,recursion_depth=3):
    m,s = AI.nextMoveRecur(board,recursion_depth,recursion_depth)
    return m

@staticmethod
def nextMoveRecur(board,depth,maxDepth,base=0.9):
    bestScore = -1.
    bestMove = 0
    for m in range(1,5):
        if(board.validMove(m)):
            newBoard = copy.deepcopy(board)
            newBoard.move(m,add_tile=True)

            score = AI.evaluate(newBoard)
            if depth != 0:
                my_m,my_s = AI.nextMoveRecur(newBoard,depth-1,maxDepth)
                score += my_s*pow(base,maxDepth-depth+1)

            if(score > bestScore):
                bestMove = m
                bestScore = score
    return (bestMove, bestScore)

An implementation of minimax or expectiminimax will surely improve the algorithm. Obviously a more sophisticated decision rule will slow down the algorithm and require some time to implement. I will try a minimax implementation in the near future. (Stay tuned.)

Benchmark

  • T1 - 121 tests - 8 different paths - r=0.125
  • T2 - 122 tests - 8 different paths - r=0.25
  • T3 - 132 tests - 8 different paths - r=0.5
  • T4 - 211 tests - 2 different paths - r=0.125
  • T5 - 274 tests - 2 different paths - r=0.25
  • T6 - 211 tests - 2 different paths - r=0.5


In the case of T2, four tests in ten generate the 4096 tile, with an average score of about 42000.

Code

The code can be found on GitHub at the following link: https://github.com/Nicola17/term2048-AI. It is based on term2048 and is written in Python. I will implement a more efficient version in C++ as soon as possible.

2014/03/26

My attempt uses expectimax like other solutions above, but without bitboards. Nneonneo's solution can check 10 million moves, which is approximately a depth of 4 with 6 tiles left and 4 moves possible ((2×6×4)^4). In my case, this depth takes too long to explore, so I adjust the depth of the expectimax search according to the number of free tiles left:

depth = free > 7 ? 1 : (free > 4 ? 2 : 3)

The score of a board is computed as the weighted sum of the square of the number of free tiles and the dot product of the 2D grid with this matrix:

[[10,8,7,6.5],
 [.5,.7,1,3],
 [-.5,-1.5,-1.8,-2],
 [-3.8,-3.7,-3.5,-3]]

which forces the tiles to be organized in descending order, in a sort of snake, from the top-left tile.

The code is below, or on GitHub:

var n = 4,
 M = new MatrixTransform(n);

var ai = {weights: [1, 1], depth: 1}; // depth=1 by default, but we adjust it on every prediction according to the number of free tiles

var snake= [[10,8,7,6.5],
            [.5,.7,1,3],
            [-.5,-1.5,-1.8,-2],
            [-3.8,-3.7,-3.5,-3]]
snake=snake.map(function(a){return a.map(Math.exp)})

initialize(ai)

function run(ai) {
 var p;
 while ((p = predict(ai)) != null) {
  move(p, ai);
 }
 //console.log(ai.grid , maxValue(ai.grid))
 ai.maxValue = maxValue(ai.grid)
 console.log(ai)
}

function initialize(ai) {
 ai.grid = [];
 for (var i = 0; i < n; i++) {
  ai.grid[i] = []
  for (var j = 0; j < n; j++) {
   ai.grid[i][j] = 0;
  }
 }
 rand(ai.grid)
 rand(ai.grid)
 ai.steps = 0;
}

function move(p, ai) { //0:up, 1:right, 2:down, 3:left
 var newgrid = mv(p, ai.grid);
 if (!equal(newgrid, ai.grid)) {
  //console.log(stats(newgrid, ai.grid))
  ai.grid = newgrid;
  try {
   rand(ai.grid)
   ai.steps++;
  } catch (e) {
   console.log('no room', e)
  }
 }
}

function predict(ai) {
 var free = freeCells(ai.grid);
 ai.depth = free > 7 ? 1 : (free > 4 ? 2 : 3);
 var root = {path: [],prob: 1,grid: ai.grid,children: []};
 var x = expandMove(root, ai)
 //console.log("number of leaves", x)
 //console.log("number of leaves2", countLeaves(root))
 if (!root.children.length) return null
 var values = root.children.map(expectimax);
 var mx = max(values);
 return root.children[mx[1]].path[0]

}

function countLeaves(node) {
 var x = 0;
 if (!node.children.length) return 1;
 for (var n of node.children)
  x += countLeaves(n);
 return x;
}

function expectimax(node) {
 if (!node.children.length) {
  return node.score
 } else {
  var values = node.children.map(expectimax);
  if (node.prob) { //we are at a max node
   return Math.max.apply(null, values)
  } else { // we are at a random node
   var avg = 0;
   for (var i = 0; i < values.length; i++)
    avg += node.children[i].prob * values[i]
   return avg / (values.length / 2)
  }
 }
}

function expandRandom(node, ai) {
 var x = 0;
 for (var i = 0; i < node.grid.length; i++)
  for (var j = 0; j < node.grid.length; j++)
   if (!node.grid[i][j]) {
    var grid2 = M.copy(node.grid),
     grid4 = M.copy(node.grid);
    grid2[i][j] = 2;
    grid4[i][j] = 4;
    var child2 = {grid: grid2,prob: .9,path: node.path,children: []};
    var child4 = {grid: grid4,prob: .1,path: node.path,children: []}
    node.children.push(child2)
    node.children.push(child4)
    x += expandMove(child2, ai)
    x += expandMove(child4, ai)
   }
 return x;
}

function expandMove(node, ai) { // node={grid,path,score}
 var isLeaf = true,
  x = 0;
 if (node.path.length < ai.depth) {
  for (var move of[0, 1, 2, 3]) {
   var grid = mv(move, node.grid);
   if (!equal(grid, node.grid)) {
    isLeaf = false;
    var child = {grid: grid,path: node.path.concat([move]),children: []}
    node.children.push(child)
    x += expandRandom(child, ai)
   }
  }
 }
 if (isLeaf) node.score = dot(ai.weights, stats(node.grid))
 return isLeaf ? 1 : x;
}



var cells = []
var table = document.querySelector("table");
for (var i = 0; i < n; i++) {
 var tr = document.createElement("tr");
 cells[i] = [];
 for (var j = 0; j < n; j++) {
  cells[i][j] = document.createElement("td");
  tr.appendChild(cells[i][j])
 }
 table.appendChild(tr);
}

function updateUI(ai) {
 cells.forEach(function(a, i) {
  a.forEach(function(el, j) {
   el.innerHTML = ai.grid[i][j] || ''
  })
 });
}


updateUI(ai);
updateHint(predict(ai));

function runAI() {
 var p = predict(ai);
 if (p != null && ai.running) {
  move(p, ai);
  updateUI(ai);
  updateHint(p);
  requestAnimationFrame(runAI);
 }
}
runai.onclick = function() {
 if (!ai.running) {
  this.innerHTML = 'stop AI';
  ai.running = true;
  runAI();
 } else {
  this.innerHTML = 'run AI';
  ai.running = false;
  updateHint(predict(ai));
 }
}


function updateHint(dir) {
 hintvalue.innerHTML = ['↑', '→', '↓', '←'][dir] || '';
}

document.addEventListener("keydown", function(event) {
 if (!event.target.matches('.r *')) return;
 event.preventDefault(); // avoid scrolling
 if (event.which in map) {
  move(map[event.which], ai)
  console.log(stats(ai.grid))
  updateUI(ai);
  updateHint(predict(ai));
 }
})
var map = {
 38: 0, // Up
 39: 1, // Right
 40: 2, // Down
 37: 3, // Left
};
init.onclick = function() {
 initialize(ai);
 updateUI(ai);
 updateHint(predict(ai));
}


function stats(grid, previousGrid) {

 var free = freeCells(grid);

 var c = dot2(grid, snake);

 return [c, free * free];
}

function dist2(a, b) { //squared 2D distance
 return Math.pow(a[0] - b[0], 2) + Math.pow(a[1] - b[1], 2)
}

function dot(a, b) {
 var r = 0;
 for (var i = 0; i < a.length; i++)
  r += a[i] * b[i];
 return r
}

function dot2(a, b) {
 var r = 0;
 for (var i = 0; i < a.length; i++)
  for (var j = 0; j < a[0].length; j++)
   r += a[i][j] * b[i][j]
 return r;
}

function product(a) {
 return a.reduce(function(v, x) {
  return v * x
 }, 1)
}

function maxValue(grid) {
 return Math.max.apply(null, grid.map(function(a) {
  return Math.max.apply(null, a)
 }));
}

function freeCells(grid) {
 return grid.reduce(function(v, a) {
  return v + a.reduce(function(t, x) {
   return t + (x == 0)
  }, 0)
 }, 0)
}

function max(arr) { // return [value, index] of the max
 var m = [-Infinity, null];
 for (var i = 0; i < arr.length; i++) {
  if (arr[i] > m[0]) m = [arr[i], i];
 }
 return m
}

function min(arr) { // return [value, index] of the min
 var m = [Infinity, null];
 for (var i = 0; i < arr.length; i++) {
  if (arr[i] < m[0]) m = [arr[i], i];
 }
 return m
}

function maxScore(nodes) {
 var min = {
  score: -Infinity,
  path: []
 };
 for (var node of nodes) {
  if (node.score > min.score) min = node;
 }
 return min;
}


function mv(k, grid) {
 var tgrid = M.itransform(k, grid);
 for (var i = 0; i < tgrid.length; i++) {
  var a = tgrid[i];
  for (var j = 0, jj = 0; j < a.length; j++)
   if (a[j]) a[jj++] = (j < a.length - 1 && a[j] == a[j + 1]) ? 2 * a[j++] : a[j]
  for (; jj < a.length; jj++)
   a[jj] = 0;
 }
 return M.transform(k, tgrid);
}

function rand(grid) {
 var r = Math.floor(Math.random() * freeCells(grid)),
  _r = 0;
 for (var i = 0; i < grid.length; i++) {
  for (var j = 0; j < grid.length; j++) {
   if (!grid[i][j]) {
    if (_r == r) {
     grid[i][j] = Math.random() < .9 ? 2 : 4
    }
    _r++;
   }
  }
 }
}

function equal(grid1, grid2) {
 for (var i = 0; i < grid1.length; i++)
  for (var j = 0; j < grid1.length; j++)
   if (grid1[i][j] != grid2[i][j]) return false;
 return true;
}

function conv44valid(a, b) {
 var r = 0;
 for (var i = 0; i < 4; i++)
  for (var j = 0; j < 4; j++)
   r += a[i][j] * b[3 - i][3 - j]
 return r
}

function MatrixTransform(n) {
 var g = [],
  ig = [];
 for (var i = 0; i < n; i++) {
  g[i] = [];
  ig[i] = [];
  for (var j = 0; j < n; j++) {
   g[i][j] = [[j, i],[i, n-1-j],[j, n-1-i],[i, j]]; // transformation matrix in the 4 directions g[i][j] = [up, right, down, left]
   ig[i][j] = [[j, i],[i, n-1-j],[n-1-j, i],[i, j]]; // the inverse tranformations
  }
 }
 this.transform = function(k, grid) {
  return this.transformer(k, grid, g)
 }
 this.itransform = function(k, grid) { // inverse transform
  return this.transformer(k, grid, ig)
 }
 this.transformer = function(k, grid, mat) {
  var newgrid = [];
  for (var i = 0; i < grid.length; i++) {
   newgrid[i] = [];
   for (var j = 0; j < grid.length; j++)
    newgrid[i][j] = grid[mat[i][j][k][0]][mat[i][j][k][1]];
  }
  return newgrid;
 }
 this.copy = function(grid) {
  return this.transform(3, grid)
 }
}
body {
 font-family: Arial;
}
table, th, td {
 border: 1px solid black;
 margin: 0 auto;
 border-collapse: collapse;
}
td {
 width: 35px;
 height: 35px;
 text-align: center;
}
button {
 margin: 2px;
 padding: 3px 15px;
 color: rgba(0,0,0,.9);
}
.r {
 display: flex;
 align-items: center;
 justify-content: center;
 margin: .2em;
 position: relative;
}
#hintvalue {
 font-size: 1.4em;
 padding: 2px 8px;
 display: inline-flex;
 justify-content: center;
 width: 30px;
}
<table title="press arrow keys"></table>
<div class="r">
    <button id=init>init</button>
    <button id=runai>run AI</button>
    <span id="hintvalue" title="Best predicted move to do, use your arrow keys" tabindex="-1"></span>
</div>

2019/01/01

I am the author of a 2048 controller that scores better than any other program mentioned in this thread. An efficient implementation of the controller is available on github. In a separate repo there is also the code used for training the controller's state evaluation function. The training method is described in the paper.

The controller uses expectimax search with a state evaluation function learned from scratch (without human 2048 expertise) by a variant of temporal difference learning (a reinforcement learning technique). The state-value function uses an n-tuple network, which is basically a weighted linear function of patterns observed on the board. It involves more than 1 billion weights in total.
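A minimal sketch of how an n-tuple network evaluates a position and is nudged by a TD-style update (illustrative Python only; the patterns, learning rate, and update rule here are drastic simplifications of the controller's actual training):

```python
from collections import defaultdict

# Each "tuple" is a fixed set of board cells; its lookup key is the sequence
# of tile exponents at those cells. The value estimate is the sum of the
# learned weights for all patterns. These three patterns are illustrative.
PATTERNS = [
    [(0, 0), (0, 1), (0, 2), (0, 3)],   # top row
    [(1, 0), (1, 1), (1, 2), (1, 3)],   # second row
    [(0, 0), (1, 0), (2, 0), (3, 0)],   # left column
]

def evaluate(board, weights):
    """Sum the weights of the observed patterns. `board` holds tile exponents."""
    v = 0.0
    for p, table in zip(PATTERNS, weights):
        key = tuple(board[i][j] for (i, j) in p)
        v += table[key]
    return v

def td_update(board, target, weights, alpha=0.01):
    """One TD-style update: nudge each pattern weight toward the target value."""
    err = target - evaluate(board, weights)
    for p, table in zip(PATTERNS, weights):
        key = tuple(board[i][j] for (i, j) in p)
        table[key] += alpha * err / len(PATTERNS)
```

The appeal of the n-tuple form is that evaluation is just a handful of table lookups, which is what makes it fast enough to sit inside an expectimax search.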

Performance

At 1 move/s: 609104 (100 games average)

At 10 moves/s: 589355 (300 games average)

At 3-ply (ca. 1500 moves/s): 511759 (1000 games average)

The tile statistics for 10 moves/s are as follows:

2048: 100%
4096: 100%
8192: 100%
16384: 97%
32768: 64%
32768,16384,8192,4096: 10%

(The last line means having the given tiles at the same time on the board).

For 3-ply:

2048: 100%
4096: 100%
8192: 100%
16384: 96%
32768: 54%
32768,16384,8192,4096: 8%

However, I have never observed it obtaining the 65536 tile.

2016/12/23