- Assuming you are using numpy for your numerical experiments, I would suggest using numpy.savez instead of pickle.
- Keep it simple and make optimizations only if you feel that the script runs too long.
- Opening and closing files does affect the run time, but having a backup is better anyway.
I would also use collections.defaultdict(list) instead of a plain dict with setdefault.
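
A minimal sketch of how these two suggestions might fit together. The configuration names, the placeholder computation, and the output file name are all made up for illustration, so adapt them to your own experiment:

```python
import os
import tempfile
from collections import defaultdict

import numpy as np

# Hypothetical experiment: the labels and the "measurement" are placeholders.
configs = {"small": 0.1, "large": 0.5}

# Missing keys start out as an empty list, so no setdefault call is needed.
results = defaultdict(list)

for label, param in configs.items():
    for trial in range(3):
        value = param * trial  # stand-in for the real computation
        results[label].append(value)

# Store each list as a named array in a single .npz archive.
out_path = os.path.join(tempfile.gettempdir(), "experiment_results.npz")
np.savez(out_path, **{label: np.asarray(vals) for label, vals in results.items()})

# Read the archive back; np.load returns a dict-like object keyed by array name.
with np.load(out_path) as data:
    loaded = {label: data[label] for label in data.files}
```

Calling np.savez inside the outer loop instead of once at the end would give you the backup after every configuration, at the cost of rewriting the file each time.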