I want to compute an md5 hash not of a string, but of an entire data structure. I understand the mechanics of a way to do this (dispatch on the type of the value, canonical
While it does require a dependency on joblib, I've found that joblib.hashing.hash(object) works very well and is designed for use with joblib's disk caching mechanism. Empirically it seems to be producing consistent results from run to run, even on data that pickle mixes up on different runs.
Alternatively, you might be interested in artemis-ml's compute_fixed_hash function, which theoretically hashes objects in a way that is consistent across runs. However, I've not tested it myself.
Sorry for posting millions of years after the original question