Tic Tac Toe AI Bugs

Submitted by 北慕城南 on 2021-02-10 22:27:42

Question


I'm trying to implement an AI for Tic Tac Toe that is smart enough to never lose. I've tried two different algorithms but the AI still makes mistakes.

I started with this minimax alpha-beta pruning algorithm. Here's a live demo: http://iioengine.com/ttt/minimax.htm

It runs without error, but if you take the bottom left corner first, and then either of the other two squares on the bottom row, the AI doesn't see that coming. I'm sure this isn't a flaw in the minimax algorithm itself - can anyone see an error in my source? You can inspect the demo page to see everything, but here is the primary AI function:

function bestMove(board, depth, low, high, opponent) {
  var best = new Move(null, -iio.maxInt);
  var p;
  for (var c = 0; c < grid.C; c++)
    for (var r = 0; r < grid.R; r++) {
      if (board[c][r] == '_') {
        var nuBoard = board.clone();
        nuBoard[c][r] = getTypeChar(opponent);
        if (checkWin(nuBoard, getTypeChar(opponent)))
          p = new Move([c, r], -evaluateBoard(board, getTypeChar(opponent)) * 10000);
        else if (checkScratch(nuBoard))
          p = new Move([c, r], 0);
        else if (depth == 0)
          p = new Move([c, r], -evaluateBoard(board, getTypeChar(opponent)));
        else {
          p = bestMove(nuBoard, depth - 1, -high, -low, !opponent);
        }
        if (p.score > best.score) {
          best = p;
          if (best.score > low)
            low = best.score;
          if (best.score >= high) return best;
        }
      }
    }
  return best;
}

If you are more familiar with negamax, I tried that one too. I lifted the logic straight from this page. Here is a live demo: http://iioengine.com/ttt/negamax.htm

That one freezes up once you reach a win state, but you can already see that the AI is pretty stupid. Is something wrong with the code integration?

Please let me know if you find a flaw in my code that prevents these algorithms from running properly. Thanks.

Update with code:

// Returns true if `type` has a completed line anywhere: checks the three rows,
// the three columns, and both diagonals via evaluateRow.
function checkWin(board, type) {
  for (var i = 0; i < 3; i++)
    if (evaluateRow(board, [i,0, i,1, i,2], type) >= WIN_SCORE
     || evaluateRow(board, [0,i, 1,i, 2,i], type) >= WIN_SCORE)
      return true;
  if (evaluateRow(board, [0,0, 1,1, 2,2], type) >= WIN_SCORE
   || evaluateRow(board, [2,0, 1,1, 0,2], type) >= WIN_SCORE)
    return true;
  return false;
}

// Static evaluation: sums evaluateRow over every row, column, and diagonal
// for `type` and returns the total.
function evaluateBoard(board, type) {
  var moveTotal = 0;
  for (var i = 0; i < 3; i++) {
    moveTotal += evaluateRow(board, [i,0, i,1, i,2], type);
    moveTotal += evaluateRow(board, [0,i, 1,i, 2,i], type);
  }
  moveTotal += evaluateRow(board, [0,0, 1,1, 2,2], type);
  moveTotal += evaluateRow(board, [2,0, 1,1, 0,2], type);
  return moveTotal;
}

Answer 1:


The problem lies in your evaluateBoard() function. The evaluation function is the heart of a minimax/negamax algorithm. If your AI is behaving poorly, the cause usually lies in how the board is evaluated at each move.

For the evaluation of the board, you need to take into consideration three things: winning moves, blocking moves, and moves that result in a fork.

Winning Moves

The static evaluation function needs to know if a move results in a win or a loss for the current player. If the move results in a loss for the current player, it needs to return a very low negative number (lower than any regular move). If the move results in a win for the current player, it needs to return a very high positive number (larger than any regular move).

What is important to remember is that this evaluation has to be relative to the player whose move the AI is currently considering. If the AI is predicting where the Human player will move, then the evaluation must look at the board from the Human player's point of view. When it's the AI's own turn to move, the evaluation must look at the board from the Computer player's point of view.
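
As a rough illustration of that idea, here is a minimal sketch of a terminal-state score that is always expressed from the point of view of the side to move. It reuses checkWin and getTypeChar from the question (with the same boolean side flag as bestMove's `opponent` argument); the names terminalScore and WIN are purely illustrative:

var WIN = 1000000;  // assumed to dominate any positional score

// Score a position from the point of view of `player` (a boolean side flag,
// as in the question's bestMove): a huge positive number if `player` has won,
// a huge negative number if the other side has won, 0 otherwise.
function terminalScore(board, player) {
  if (checkWin(board, getTypeChar(player)))  return  WIN;
  if (checkWin(board, getTypeChar(!player))) return -WIN;
  return 0;  // draw or non-terminal: fall through to the positional evaluation
}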

Blocking Moves

When you run your evaluation function, the AI doesn't actually consider blocking the Human player to be beneficial. Your evaluation function appears to just count the number of available moves and return the result. Instead, you need to return a higher positive number for moves that will help the AI win.

To account for blocking, you need to figure out whether a player has two of their tokens in an open row, column, or diagonal, and then score the blocking square higher than any other square. So if it is the Computer's turn to move and the Human player has two tokens in an open row, the third square in that row needs to get a high positive number (though not as high as a winning square). This will cause the Computer to favor that square over any other.
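
For example, a threat check along these lines would let the evaluation reward the square that blocks an opponent's open two-in-a-row. This is only a sketch: ALL_LINES and blockingBonus are hypothetical names, and it assumes the question's board layout (board[c][r] with '_' marking an empty square):

// All eight winning lines as [col,row] pairs (hypothetical helper constant).
var ALL_LINES = [
  [[0,0],[0,1],[0,2]], [[1,0],[1,1],[1,2]], [[2,0],[2,1],[2,2]],  // columns
  [[0,0],[1,0],[2,0]], [[0,1],[1,1],[2,1]], [[0,2],[1,2],[2,2]],  // rows
  [[0,0],[1,1],[2,2]], [[2,0],[1,1],[0,2]]                        // diagonals
];

// Bonus for playing at [c,r]: large if that square completes a block of a
// line in which the opponent already holds the other two squares.
function blockingBonus(board, c, r, opponentChar) {
  var bonus = 0;
  ALL_LINES.forEach(function (line) {
    var theirs = 0, coversMove = false;
    line.forEach(function (sq) {
      if (sq[0] === c && sq[1] === r) coversMove = true;
      else if (board[sq[0]][sq[1]] === opponentChar) theirs++;
    });
    if (coversMove && theirs === 2)
      bonus += 500;  // below a winning score, above any ordinary positional score
  });
  return bonus;
}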

By just accounting for Winning moves and Blocking moves, you will have a Computer that plays fairly well.

Forking Moves

Forking moves cause problems for the Computer. The main problem is that the Computer is 'too smart' for its own good. Since it assumes that the Human player will always make the best move, it will find situations where every move it could make eventually ends in a loss for it, so it will just pick the first available square on the board, since nothing else matters.

If we go through your example, we can see this happen: Human player plays bottom left, Computer plays top middle.

   | O |
---+---+---
   |   |
---+---+---
 X |   |   

When the Human player moves to the bottom right corner, the Computer sees that if it blocks that move, the best reply for the Human player is to take the middle square, resulting in a fork and a win for the Human (this won't happen every time, since Humans are fallible, but the Computer doesn't know that).

   | O |
---+---+---
   | X |
---+---+---
 X | O | X  

Because the computer will lose whether it blocks or doesn't block the Human from winning, blocking the Human will actually bubble up the lowest possible score (since it results in a loss for the Computer). This means that the Computer will take the best score it can - the middle square.

You'll have to figure out the best way to handle such situations, since everyone would play them differently. It's just something to be aware of.
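
If you do decide to score fork threats explicitly, one possible check (purely illustrative, reusing the hypothetical ALL_LINES helper sketched above and the question's '_' empty marker) is to count how many open two-in-a-row threats a candidate move would create:

// True if placing `myChar` at [c,r] would create two or more lines that each
// hold two of `myChar` and one empty square - i.e. a fork.
function createsFork(board, c, r, myChar) {
  var threats = 0;
  ALL_LINES.forEach(function (line) {
    var mine = 0, empty = 0;
    line.forEach(function (sq) {
      var v = (sq[0] === c && sq[1] === r) ? myChar : board[sq[0]][sq[1]];
      if (v === myChar) mine++;
      else if (v === '_') empty++;
    });
    if (mine === 2 && empty === 1) threats++;
  });
  return threats >= 2;
}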




Answer 2:


With a pure Minimax implementation for Tic-Tac-Toe, the A.I. should never lose. At worst, it should end in a draw.

By pure Minimax, I mean an implementation that explores each and every possible move (more precisely, every transition from one position to the next) and creates a tree for those moves and transitions (starting with an empty board at the top of the tree, branching off into all possible first moves, then all possible second moves, and so on).

(There's also heuristic Minimax, in which you do not expand all positions of the tree from the start, but only search to a certain depth.)

The tree should have as leaves only board positions that end the game (X wins, O wins, or draw). Such a tree for classic Tic-Tac-Toe (a 3x3 board) contains 5477 nodes (not counting the all-empty board at the top).

Once such a tree is created, the leaves are scored directly by simply evaluating how the game ended: top score for a leaf node containing a board state where A.I. wins, 0 score for draw, and lowest score for nodes with board state where the human player has won.

(In heuristic Minimax, you'll have to create a "guesstimation" function that evaluates the leaves of the partial tree and assigns a min/0/max score accordingly - in this implementation, there's a chance that the A.I. might lose in the end, and that chance is inversely proportional to how good your "guesstimator" function is at assessing partial game states.)

Next, all intermediate, non-leaf nodes of the tree are scored based on their children. Obviously, you'd do this bottom-up, since initially only the lowest non-leaf nodes have scored children (the leaf nodes) from which to derive their own score.

(In the context of Tic-Tac-Toe there's no point in making a heuristic implementation of Minimax, as it's fairly cheap to build a tree of 5477 + 1 nodes and then score them all. That kind of implementation is useful for games with a lot of branching (many possible moves for a given game state), where a full tree would be slow to build and a memory hog - chess, for example.)

In the end, you'll have a data structure containing all possible Tic-Tac-Toe games, and an exact idea of the best move to play in response to any move the human player makes. As such, because of how Tic-Tac-Toe works, a Minimax A.I. will only win (if the human player makes at least one crucial mistake) or draw (if the human player always makes the best possible move). This holds no matter who moves first.
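
To make this concrete, here is a minimal, self-contained sketch of full-depth minimax for a 3x3 board. This is not the question's code: rather than materialising the whole tree as a data structure, it recurses over the same game tree and scores positions on the fly, the board is a flat 9-element array of 'X', 'O', or null, and all names (LINES, winner, minimax) are illustrative:

// All eight winning lines as index triples into a flat 9-square board.
var LINES = [[0,1,2],[3,4,5],[6,7,8],[0,3,6],[1,4,7],[2,5,8],[0,4,8],[2,4,6]];

// Returns 'X' or 'O' if that side has three in a row, otherwise null.
function winner(board) {
  for (var i = 0; i < LINES.length; i++) {
    var a = LINES[i][0], b = LINES[i][1], c = LINES[i][2];
    if (board[a] && board[a] === board[b] && board[a] === board[c]) return board[a];
  }
  return null;
}

// Full-depth minimax: returns {score, move} for the side `me`, where it is
// currently `turn`'s move. Scores are +1 (me wins), -1 (me loses), 0 (draw).
function minimax(board, turn, me) {
  var w = winner(board);
  if (w) return { score: w === me ? 1 : -1, move: null };
  if (board.every(function (s) { return s; })) return { score: 0, move: null };  // board full: draw

  var best = null;
  for (var i = 0; i < 9; i++) {
    if (board[i]) continue;                            // square already taken
    board[i] = turn;                                   // try the move
    var result = minimax(board, turn === 'X' ? 'O' : 'X', me);
    board[i] = null;                                   // undo it
    var better = (turn === me)
      ? (best === null || result.score > best.score)   // maximise on our own turn
      : (best === null || result.score < best.score);  // minimise on the opponent's turn
    if (better) best = { score: result.score, move: i };
  }
  return best;
}

// Usage: the computer ('O') picks a reply after X has taken a corner.
// var reply = minimax(['X',null,null, null,null,null, null,null,null], 'O', 'O').move;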

I've implemented this myself, and it works as expected.

Here are some of the finer points (with which I've struggled a bit):

  • Make sure the function you use to evaluate the board works well, i.e. that it correctly spots a win/draw situation for either X or O. This function will be used on almost every node of your Minimax tree as you build it, and a bug in it will result in seemingly working but in fact flawed code. Test this part extensively (see the short spot-check sketch after this list).

  • Make sure you navigate your tree properly, especially when you're scoring intermediate nodes (but also when you're searching for the next move to make). A trivial solution is to build, alongside the tree, a hash table containing each intermediary (non-leaf) node per level of tree depth. This way you'll be sure to get all nodes at the right time when you do the bottom-up scoring.
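
For the first point, a handful of quick spot-checks along these lines (a sketch only, assuming a winner(board) helper and the flat 9-element board representation used in the sketch above) can catch most bugs in the win/draw detection before they poison the whole tree:

// Minimal spot-checks for the win/draw detector (illustrative only).
console.assert(winner(['X','X','X', null,null,null, null,null,null]) === 'X', 'top row win');
console.assert(winner(['O',null,null, null,'O',null, null,null,'O']) === 'O', 'diagonal win');
console.assert(winner(['X','O','X', 'X','O','O', 'O','X','X']) === null, 'full board is a draw');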



Source: https://stackoverflow.com/questions/18777294/tic-tac-toe-ai-bugs
