*large* Python dictionary with persistent storage for quick look-ups

心在旅途 2020-12-08 21:05

I have 400 million lines of unique key-value data that I would like to have available for quick look-ups in a script. What would be a slick way of doing this?

6 answers
  •  無奈伤痛
    2020-12-08 21:39

    In principle, the shelve module does exactly what you want. It provides a persistent dictionary backed by a database file. Keys must be strings, but shelve takes care of pickling and unpickling the values. The type of database file can vary, but it can be a Berkeley DB hash, which is an excellent lightweight key-value database.

    Your data set sounds huge, so you should do some testing, but shelve/BDB is probably up to it.

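    Note: The bsddb module was deprecated in Python 2.6 and is no longer part of the standard library in Python 3, where shelve relies on the dbm package instead. Do not count on shelve producing BDB hashes on a current interpreter.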
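
A minimal sketch of the shelve approach described in this answer, assuming string keys and picklable values. The helper names `build_store` and `lookup` are hypothetical, chosen for illustration only. Which backend shelve ends up using depends on the Python installation, and nothing here has been measured at 400 million entries, so treat it as a starting point for the testing the answer recommends.

```python
import os
import shelve
import tempfile


def build_store(path, pairs):
    """Create a fresh shelf at `path` and write every (key, value) pair into it.

    Keys must be str; values are pickled by shelve.
    """
    # flag="n" always creates a new, empty database, replacing any old one.
    with shelve.open(path, flag="n") as db:
        for key, value in pairs:
            db[key] = value


def lookup(path, key, default=None):
    """Open the shelf read-only and return the value for `key`, or `default`."""
    with shelve.open(path, flag="r") as db:
        return db.get(key, default)


if __name__ == "__main__":
    # Tiny demonstration in a temporary directory (hypothetical sample data).
    with tempfile.TemporaryDirectory() as tmp:
        store_path = os.path.join(tmp, "kv_store")
        build_store(store_path, [("alpha", 1), ("beta", [2, 3])])
        print(lookup(store_path, "alpha"))
        print(lookup(store_path, "gamma", default="not found"))
```

In a real script the shelf would normally be opened once and kept open for all look-ups, rather than reopened per key as `lookup` does here for brevity.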
