How to find a word in a large word list (vocabulary) with decent memory consumption and look-up time?

小蘑菇 2021-02-06 02:03

Problem

[Here follows a description of what the app should do under which constraints]

I want a data-structure that searches if a string e

2 Answers
  •  难免孤独
    2021-02-06 02:44

    I had the same issue and ended up going with an "on-disk" trie. That is, I encode the data structure into a single file using byte offsets instead of pointers, packing the nodes in reverse order so that the "root" node is the last one written.

    It is fast to load: the file is simply read into a byte array, and trie traversal follows offset values the same way it would follow pointers.

    My 200K-word set fits in 1.7 MB (uncompressed), with a 4-byte value in each word-terminating node.
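
A minimal sketch of the idea in Python, to make the offset-based layout concrete. The node layout here (a 1-byte terminal flag, a 2-byte child count, an optional 4-byte value, then 5-byte child entries), the 4-byte footer holding the root offset, and the names `build_trie`, `pack`, `serialize`, and `lookup` are all assumptions made for illustration; the answer does not describe its actual file format.

```python
import struct

def build_trie(words):
    """Build an in-memory trie keyed by UTF-8 bytes (hypothetical helper)."""
    root = {"children": {}, "value": None}
    for word, value in words.items():
        node = root
        for b in word.encode("utf-8"):
            node = node["children"].setdefault(b, {"children": {}, "value": None})
        node["value"] = value  # 4-byte unsigned value stored at the terminating node
    return root

def pack(node, out):
    """Write children first, then the node itself; return the node's byte offset."""
    child_offsets = [(b, pack(child, out))
                     for b, child in sorted(node["children"].items())]
    offset = len(out)
    terminal = node["value"] is not None
    # Assumed layout: flag (1 byte), child count (2 bytes), optional value (4 bytes)
    out += struct.pack("<BH", int(terminal), len(child_offsets))
    if terminal:
        out += struct.pack("<I", node["value"])
    for b, child_offset in child_offsets:
        out += struct.pack("<BI", b, child_offset)  # key byte + offset instead of a pointer
    return offset

def serialize(words):
    """Pack the whole trie into one bytes object; the root node is written last."""
    out = bytearray()
    root_offset = pack(build_trie(words), out)
    out += struct.pack("<I", root_offset)  # assumed footer: where the root starts
    return bytes(out)

def lookup(data, word):
    """Return the stored value for word, or None if the word is absent."""
    (offset,) = struct.unpack_from("<I", data, len(data) - 4)
    for b in word.encode("utf-8"):
        terminal, count = struct.unpack_from("<BH", data, offset)
        pos = offset + 3 + (4 if terminal else 0)  # skip header and optional value
        for _ in range(count):
            key, child = struct.unpack_from("<BI", data, pos)
            if key == b:
                offset = child  # follow the offset like a pointer
                break
            pos += 5
        else:
            return None  # no child for this byte
    terminal, _ = struct.unpack_from("<BH", data, offset)
    if not terminal:
        return None
    return struct.unpack_from("<I", data, offset + 3)[0]

data = serialize({"car": 1, "cart": 2, "dog": 3})
print(lookup(data, "cart"), lookup(data, "ca"))
```

Writing children before their parent means every child offset is already known when the parent is emitted, which is why the root ends up last. The same `lookup` works unchanged on bytes read straight from a file. Children are scanned linearly here; since they are written in sorted order, a binary search over the fixed-size entries would also be possible.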
