Not strictly a question, more of a puzzle...
Over the years, I\'ve been involved in a few technical interviews of new employees. Other than asking the standard \"do you
I'd try to use Huffman encoding. The theory behind this is - in every chess game there will be some pieces that will move around a lot, and some that don't move much or get eliminated early. If the starting position has some pieces already removed - all the better. The same goes for squares - some squares get to see all action, while some don't get touched much.
Thus I'd have two Huffman tables - one for pieces, other for squares. They will be generated by looking at the actual game. I could have one large table for every piece-square pair, but I think that would be pretty inefficient because there aren't many instances of the same piece moving on the same square again.
Every piece would have an assigned ID. Since there are 32 different pieces, I would need just 5 bits for the piece ID. The piece IDs are not changing from game to game. The same goes for square IDs, for which I would need 6 bits.
The Huffman trees would be encoded by writing each node down as they are traversed in inorder (that is, first the node is output, then its children from left to right). For every node there will be one bit specifiying whether it's a leaf node or a branch node. If it's a leaf node, it will be followed by the bits giving the ID.
The starting position will simply be given by a series of piece-location pairs. After that there will be one piece-location pair for every move. You can find the end of the starting position descriptor (and the start of the moves descriptor) simply by finding the first piece that is mentioned twice. In case a pawn gets promoted there will be 2 extra bits specifying what it becomes, but the piece ID won't change.
To account for the possibility that a pawn is promoted at the start of the game there will also be a "promotion table" between the huffman trees and the data. At first there will be 4 bits specifying how many pawns are upgraded. Then for each pawn there will be its huffman-encoded ID and 2 bits specifying what it has become.
The huffman trees will be generated by taking into account all the data (both the starting position and the moves) and the promotion table. Although normally the promotion table will be empty or have just a few entries.
To sum it up in graphical terms:
:= ( | <1 bit for next move - see Added 2 below>)
:= ...
:= "0" | "1" <5 bits with piece ID>
:= ...
:= "0" | "1" <6 bits with square ID>
:= <4 bits with count of promotions> ...
:= <2 bits with what it becomes>
:= ...
:= ...
:= (<2 bits specifying the upgrade - optional>)
Added: This could still be optimized. Every piece only has a few legal moves. Instead of simply encoding the target square one could give 0-based IDs for the possible moves of every piece. The same IDs would get reused for every piece, so in total there would be no more than 21 different IDs (the queen can have at most 21 diffferent possible move options). Put this in a huffman table instead of the fields.
This would however present a difficulty in representing the original state. One might generate a series of moves to put each piece in its place. In this case it would be necessary to somehow mark the end of the initial state and start of the moves.
Alternatively they could be placed by using uncompressed 6-bit square IDs.
Whether this would present an overall decrease in size - I don't know. Probably, but should experiment a bit.
Added 2: One more special case. If the game state has NO moves it becomes important to distinguish who moves next. Add one more bit at the end for that. :)