Need memory-efficient way to store tons of strings (was: HAT-Trie implementation in Java)

野的像风 2020-12-24 02:15

I am working with a large set (5-20 million) of String keys (average length 10 chars), which I need to store in an in-memory data structure that supports th

4 answers
  •  余生分开走
    2020-12-24 02:48

    For space efficiency, O(log n) lookup, and simple code, try binary search over a single array of characters holding all the keys in sorted order. 20 million keys of average length 10 come to 200 million characters: 400 MB if you need 2 bytes per char, 200 MB if you can get away with 1. On top of this you need some way to represent the boundaries between the keys in the array. If you can reserve a separator character, that's one way; otherwise you can use a parallel array of int offsets, which at 4 bytes per key adds about 80 MB for 20 million keys.
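
    A minimal sketch of the packed layout described above, using the parallel-offsets variant. The class name `PackedStringSet` is hypothetical, and the code assumes every key fits in ISO-8859-1 (the 1 byte per char case); characters outside that range would be encoded lossily and need a different encoding. The constructor also holds a temporary `byte[][]` while sorting, so peak memory during the build is higher than the final footprint.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical name: a read-only set of strings packed into one byte array.
final class PackedStringSet {
    private final byte[] data;    // all keys, concatenated in sorted order
    private final int[] offsets;  // offsets[i] = start of key i; last entry = data.length

    PackedStringSet(String[] keys) {
        // Assumption: keys contain only ISO-8859-1 characters (1 byte per char).
        byte[][] encoded = new byte[keys.length][];
        int total = 0;
        for (int i = 0; i < keys.length; i++) {
            encoded[i] = keys[i].getBytes(StandardCharsets.ISO_8859_1);
            total += encoded[i].length;
        }
        Arrays.sort(encoded, PackedStringSet::compareBytes);

        data = new byte[total];
        offsets = new int[keys.length + 1];
        int pos = 0;
        for (int i = 0; i < encoded.length; i++) {
            offsets[i] = pos;
            System.arraycopy(encoded[i], 0, data, pos, encoded[i].length);
            pos += encoded[i].length;
        }
        offsets[keys.length] = pos;
    }

    // Binary search over the key slices; O(log n) comparisons.
    boolean contains(String key) {
        byte[] k = key.getBytes(StandardCharsets.ISO_8859_1);
        int lo = 0;
        int hi = offsets.length - 2; // index of the last key, or -1 if empty
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int c = compareToSlice(k, offsets[mid], offsets[mid + 1]);
            if (c == 0) return true;
            if (c < 0) hi = mid - 1; else lo = mid + 1;
        }
        return false;
    }

    // Compares k with data[from, to) using the same order as compareBytes.
    private int compareToSlice(byte[] k, int from, int to) {
        int len = to - from;
        int n = Math.min(k.length, len);
        for (int i = 0; i < n; i++) {
            int d = (k[i] & 0xFF) - (data[from + i] & 0xFF);
            if (d != 0) return d;
        }
        return k.length - len;
    }

    // Unsigned lexicographic order, shorter key first on a shared prefix.
    private static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }
}
```

    For example, after `new PackedStringSet(new String[] {"pear", "apple", "app"})`, `contains("app")` is true and `contains("ap")` is false, because the offsets array keeps each key's exact length.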

    The simplest variant would use a sorted array of Strings, at a high space cost from per-object overhead. It ought to still beat a hash table in space efficiency, though not as impressively.
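
    The simplest variant can be sketched with the standard library alone. The class name `SortedStringArraySet` is hypothetical; `Arrays.sort` and `Arrays.binarySearch` do the work, and each key remains a separate `String` object, which is where the per-object overhead comes from.

```java
import java.util.Arrays;

// Hypothetical name: a read-only set backed by a sorted String[].
final class SortedStringArraySet {
    private final String[] keys;

    SortedStringArraySet(String[] input) {
        keys = input.clone(); // copy so the caller's array is left untouched
        Arrays.sort(keys);    // natural String ordering
    }

    // Arrays.binarySearch returns a non-negative index only when the key is found.
    boolean contains(String key) {
        return Arrays.binarySearch(keys, key) >= 0;
    }
}
```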
