Persisting a trie to a file - C

こ雲淡風輕ζ 提交于 2019-12-29 06:19:46

问题


I have a trie which I am using to do some string processing. I have a simple compiler which generates trie from some data. Once generated, my trie won't change at run time.

I am looking for an approach where I can persist the trie in a file and load it effectively. I have looked at sqllite to understand how they are persisting b-treebut their file format looks bit advanced and I may not need all of those.

It'd be helpful if someone can provide some ideas to persist and read the trie. I am programming using C.


回答1:


I did some research and found the following little gems online:

  1. trie.h
  2. trie.c

A working trie with serialization and deserialization. It was originally written for use in Python (there's a corresponding triemodule.c for tying it to Python), but it's pure C; you could mine it for ideas or use it as you wish.

Update:

It appears the links are no longer working. I'll keep the originals up, but here are the links in the wayback machine:

  1. trie.h
  2. trie.c



回答2:


Assuming your entire data structure fits in memory, a recursive serialization approach is simplest. Sqllite works with data structures that won't fit in memory, so it is probably overkill to try copying their methods.

Here is example pseudocode for reading/writing a node. It works by recursively reading/writing the child nodes. It has nothing trie-specific, and should work for other tree data structures as well.

void writeNode(Node *node)
    write node data to file
    write node.numOfChildren to file
    for each child:
        writeNode(child)

Node *readNode()
    Node *node = allocateNewNode()
    read node data from file
    read node.numOfChildren from file
    for (i=0; i<node.numOfChildren; i++)
        Node *child = readNode()
        node.addChild(child)



回答3:


If all of your nodes are the same size then you can just enumerate your nodes (root = 0) and write each of them to a file at their index. While writing them you will have to change their references to other nodes to those nodes' indexes, though. You will probably also need a NULL value. You could use -1 or you could use (root = 1) and (NULL = 0).

You will probably also be able to compress these nodes somewhat by changing their pointer fields to be smaller types.

If your nodes are different sizes then it's more complicated.



来源:https://stackoverflow.com/questions/2572078/persisting-a-trie-to-a-file-c

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!