The ultimate goal is comparing 2 binaries built from exact same source in exact same environment and being able to tell that they indeed are functionally equivalent.
<
I solved this to an extent.
Currently we have build system that makes sure all new builds are on the path of constant length (builds/001, builds/002, etc), thus avoiding shifts in the PE layout. After build a tool compares old and new binaries ignoring relevant PE fields and other locations with known superficial changes. It also runs some simple heuristics to detect dynamic ignorable changes. Here is full list of things to ignore:
Once in a while linker would make some PE sections bigger without throwing anything else out of alignment. Looks like it moves section boundary inside the padding -- it is zeros all around anyway, but because of it I'll get binaries with 1 byte difference.
UPDATE: we recently opensourced the tool on GitHub. See Compare section in documentation.