Store/retrieve a data structure

轮回少年 2020-12-31 11:58

I have implemented a suffix tree in Python to do full-text searches, and it's working really well. But there's a problem: the indexed text can be very bi

5 answers
  •  死守一世寂寞
    2020-12-31 12:26

    Try a compressed suffix tree instead.

    The main idea is that instead of having long chains of nodes holding 1 character each, you collapse every chain of single-child nodes into 1 node labelled with multiple characters, which eliminates the intermediate nodes.
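
    To illustrate the idea, here is a minimal sketch rather than a production implementation. It assumes a naive dict-based suffix trie (not necessarily how the asker's tree is built), and the names build_suffix_trie, compress, count_nodes and contains are made up for this example. It builds the one-character-per-node trie, then merges each single-child chain into one multi-character edge.

```python
def build_suffix_trie(text):
    """Naive suffix trie: one node (a dict) per character. Illustration only."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def compress(node):
    """Merge every chain of single-child nodes into one multi-character edge."""
    compact = {}
    for ch, child in node.items():
        label = ch
        # Keep walking while the child has exactly one outgoing edge.
        while len(child) == 1:
            (next_ch, next_child), = child.items()
            label += next_ch
            child = next_child
        compact[label] = compress(child)
    return compact


def count_nodes(node):
    """Count this node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.values())


def contains(tree, pattern):
    """Substring search on the compressed tree."""
    node = tree
    while pattern:
        # At most one edge can start with the next pattern character.
        for label, child in node.items():
            if label[0] == pattern[0]:
                break
        else:
            return False
        common = min(len(label), len(pattern))
        if label[:common] != pattern[:common]:
            return False
        pattern = pattern[common:]
        node = child
    return True


trie = build_suffix_trie("banana$")  # "$" is a terminator assumed absent from the text
tree = compress(trie)
print(count_nodes(trie), count_nodes(tree))
print(contains(tree, "nan"), contains(tree, "nab"))
```

    Building the full trie first defeats the purpose for a big text. A real implementation would construct the compressed tree directly and store (start, end) offsets into the text instead of copied label strings.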

    This link (http://www.cs.sunysb.edu/~algorith/implement/suds/implement.shtml) says you can turn a 160MB suffix tree into a 33MB compressed suffix tree. Quite a gain.

    These compressed trees are used for genetic substring matching on huge strings. I used to run out of memory with a plain suffix tree, but after I compressed it, the out-of-memory error disappeared.

    I wish I could find a free article that explains the implementation better than this paywalled one: http://dl.acm.org/citation.cfm?id=1768593
