The pickle module documentation says right at the beginning:
Warning: The pickle module is not intended to be secure against erron
I'd go so far as saying that there is no safe way to use pickle to handle untrusted data.
Even with restricted globals, the dynamic nature of Python is such that a determined hacker still has a chance of finding a way back to the __builtins__ mapping and from there to the Crown Jewels.
See Ned Batchelder's blog posts on circumventing restrictions on eval() that apply in equal measure to pickle.
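For context, "restricted globals" usually means overriding `Unpickler.find_class()` with an allow-list, as the pickle documentation describes. The sketch below is only an illustration: the `SAFE_BUILTINS` set and the `Payload` class are made-up names for the demo. It shows the obvious payload being blocked, which is not the same as the approach being safe.

```python
import builtins
import io
import pickle

# Assumed allow-list, chosen for illustration only.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only resolve globals that are explicitly allowed.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads(), but with the allow-list applied."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


class Payload:
    """Hypothetical stand-in for a hostile object: unpickling it calls eval()."""

    def __reduce__(self):
        return (eval, ("6 * 7",))


malicious = pickle.dumps(Payload())

# Plain pickle.loads() happily runs the embedded call.
print(pickle.loads(malicious))

# The restricted loader refuses to look up builtins.eval.
try:
    restricted_loads(malicious)
except pickle.UnpicklingError as exc:
    print("blocked:", exc)
```

Blocking `eval` by name is the easy part. The concern is every callable that remains on the allow-list and what an attacker can build by chaining calls to it.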
Remember that pickle is still a stack language, and you cannot foresee all possible objects produced from allowing arbitrary calls, even to a limited set of globals. The pickle documentation also doesn't mention the EXT* opcodes that allow calling copyreg-installed extensions; you'll have to account for anything installed in that registry too. All it takes is one vector allowing an object call to be turned into a getattr equivalent for your defences to crumble.
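To see the stack language at work, you can list the opcodes of a pickle with the standard `pickletools` module. This is a small sketch using a harmless `range` object; the exact opcode sequence depends on the pickle protocol in use, so no output is shown here.

```python
import pickle
import pickletools

# Even an innocent built-in object is stored as "look up a global, then call it".
data = pickle.dumps(range(3))

# genops() yields (opcode, argument, position) triples without executing anything.
opcodes = [op.name for op, arg, pos in pickletools.genops(data)]
print(opcodes)

# For a human-readable disassembly, use:
# pickletools.dis(data)
```

The listing should contain a global lookup opcode (`GLOBAL` or `STACK_GLOBAL`, depending on protocol) followed later by `REDUCE`, which is the instruction that calls whatever was looked up with arguments taken from the stack.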
At the very least, apply a cryptographic signature to your data so you can validate its integrity. That limits the risk, but if an attacker ever manages to steal your signing secrets (keys), they could again slip you a hacked pickle.
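A minimal sketch of signing, assuming a shared secret and HMAC-SHA256 from the standard library. The names `SECRET_KEY`, `sign_dumps` and `verify_loads` are made up for this example; how you store and rotate the key is up to you.

```python
import hashlib
import hmac
import pickle

# Hypothetical secret; in practice load it from secure configuration.
SECRET_KEY = b"replace-with-a-real-secret"
DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_dumps(obj, key=SECRET_KEY):
    """Pickle obj and prepend an HMAC-SHA256 tag computed over the pickle bytes."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def verify_loads(blob, key=SECRET_KEY):
    """Verify the tag first; only unpickle when it matches."""
    tag, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest() avoids leaking the match position through timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)
```

The important detail is the order: the signature is checked before `pickle.loads()` ever sees the bytes, so tampered data is rejected without being executed.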
I would instead use an existing innocuous format like JSON and add type annotations; e.g. store data in dictionaries with a type key and convert when loading the data.
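One way this could look, as a rough sketch: the `"__type__"` key name, the `DECODERS` table and the two supported types (`date` and `set`) are assumptions for illustration, not a fixed convention.

```python
import json
from datetime import date


def encode(obj):
    """json.dumps() 'default' hook: wrap unsupported types in a tagged dict."""
    if isinstance(obj, date):
        return {"__type__": "date", "value": obj.isoformat()}
    if isinstance(obj, set):
        return {"__type__": "set", "items": list(obj)}
    raise TypeError(f"cannot serialize {type(obj).__name__}")


# Explicit allow-list of converters; nothing else can ever be constructed.
DECODERS = {
    "date": lambda d: date.fromisoformat(d["value"]),
    "set": lambda d: set(d["items"]),
}


def decode(d):
    """json.loads() 'object_hook': convert tagged dicts back to objects."""
    type_name = d.get("__type__")
    if type_name is None:
        return d  # an ordinary dictionary
    if type_name not in DECODERS:
        raise ValueError(f"unknown type tag: {type_name!r}")
    return DECODERS[type_name](d)


def dumps(obj):
    return json.dumps(obj, default=encode)


def loads(text):
    return json.loads(text, object_hook=decode)


record = {"when": date(2020, 1, 2), "tags": {"a", "b"}}
assert loads(dumps(record)) == record
```

Unlike pickle, the data never names a callable. The loader decides which types exist, and an unknown tag is an error rather than an import.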