Java: large persistent hash structure?

旧巷老猫 提交于 2019-11-30 04:02:01

问题


I'm looking for a persistent hash structure in java, a simple key-value store, where key is a unique string and value is an int. The value of a key is to be incremented each time an existing key is added to the store.

I need this to be quite large - possibly 500m - 1bn keys. I've been evaluating tokyo-cabinet http://fallabs.com/tokyocabinet/javadoc/ but not sure how well it will scale - insert times seem to be getting longer as the hash grows.

Any ideas on what might be appropriate?

Thanks

Edit: In order to reduce disk I/O I'm going to be caching data in an in-memory HashMap, then updating the persistent hash in one go when the cache grows to a certain size.

Edit2: One of the reasons for the persistence is that I have limited RAM, 4GB, so I can't fit a big struture into memory.


回答1:


I thing Megamap is what you are looking for: http://megamap.sourceforge.net/. Here is a short description of Megamap from its homepage:

MegaMap is a Java implementation of a map (or hashtable) that can store an unbounded amount of data, limited only by the amount of disk space available. Objects stored in the map are persisted to disk. Good performance is achieved by an in-memory cache. The MegaMap can, for all practical reasons, be thought of as a map implementation with unlimited storage space.




回答2:


Use a database not a hash. Even for a database 500M rows is getting quite large. How many updates are you expecting per second?




回答3:


Have you checked out Berkeley BD Java Edition? They have a Collections-compatible API (see also the Javadoc for StoredMap).




回答4:


So, if I understand correctly, Redis might be an option. You can issue INCR [key] commands to atomically increment the value associated with that key. If the key does not exist, its set to zero and then incremented (resulting in one). According to the docs, INCR is a constant-time operation. Speed is a primary design goal for Redis.

Redis is able to persist itself to file, and you can control the parameters on how that happens.




回答5:


I think Memcached is good option for your case along with a suitable database in the backend.



来源:https://stackoverflow.com/questions/3832622/java-large-persistent-hash-structure

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!