Programmer Puzzle: Encoding a chess board state throughout a game

前端 未结 30 2054
闹比i
闹比i 2021-01-29 17:16

Not strictly a question, more of a puzzle...

Over the years, I\'ve been involved in a few technical interviews of new employees. Other than asking the standard \"do you

30条回答
  •  你的背包
    2021-01-29 17:48

    Great puzzle!

    I see that most people are storing the position of each piece. How about taking a more simple-minded approach, and storing the contents of each square? That takes care of promotion and captured pieces automatically.

    And it allows for Huffman encoding. Actually, the initial frequency of pieces on the board is nearly perfect for this: half of the squares are empty, half of the remaining squares are pawns, etcetera.

    Considering the frequency of each piece, I constructed a Huffman tree on paper, which I won't repeat here. The result, where c stands for the colour (white = 0, black = 1):

    • 0 for empty squares
    • 1c0 for pawn
    • 1c100 for rook
    • 1c101 for knight
    • 1c110 for bishop
    • 1c1110 for queen
    • 1c1111 for king

    For the entire board in its initial situation, we have

    • empty squares: 32 * 1 bit = 32 bits
    • pawns: 16 * 3 bits = 48 bits
    • rooks/knights/bishops: 12 * 5 bits = 60 bits
    • queens/kings: 4 * 6 bits = 24 bits

    Total: 164 bits for the initial board state. Significantly less than the 235 bits of the currently highest voted answer. And it's only going to get smaller as the game progresses (except after a promotion).

    I looked only at the position of the pieces on the board; additional state (whose turn, who has castled, en passant, repeating moves, etc.) will have to be encoded separately. Maybe another 16 bits at most, so 180 bits for the entire game state. Possible optimizations:

    • Leaving out the less frequent pieces, and storing their position separately. But that won't help... replacing king and queen by an empty square saves 5 bits, which are exactly the 5 bits you need to encode their position in another way.
    • "No pawns on the back row" could easily be encoded by using a different Huffman table for the back rows, but I doubt it helps much. You'd probably still end up with the same Huffman tree.
    • "One white, one black bishop" can be encoded by introducing extra symbols that don't have the c bit, which can then be deduced from the square that the bishop is on. (Pawns promoted to bishops disrupt this scheme...)
    • Repetitions of empty squares could be run-length encoded by introducing extra symbols for, say, "2 empty squares in a row" and "4 empty squares in a row". But it is not so easy to estimate the frequency of those, and if you get it wrong, it's going to hurt rather than help.

提交回复
热议问题