Store/retrieve a data structure

轮回少年 2020-12-31 11:58

I have implemented a suffix tree in Python to perform full-text searches, and it's working really well. But there's a problem: the indexed text can be very bi

5 answers

死守一世寂寞 (original poster)

2020-12-31 12:26

Try a compressed suffix tree instead.

The main idea is that instead of having long chains of nodes that each hold a single character, you merge every such chain into one node holding multiple characters, which eliminates the extra nodes.
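To make the idea concrete, here is a minimal sketch of that merging step. It assumes a naive dict-based suffix trie, not the asker's actual implementation; the function names (`build_suffix_trie`, `compress`, `count_nodes`, `contains`) and the `$` terminator are illustrative choices only. It is recursive and quadratic in the text length, so it is suitable for small inputs rather than a large index.

```python
def build_suffix_trie(text):
    """Naive suffix trie: nested dicts, one node per character."""
    root = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            node = node.setdefault(ch, {})
    return root


def compress(node):
    """Merge each chain of single-child nodes into one multi-character edge."""
    compact = {}
    for label, child in node.items():
        # Follow the chain while the child has exactly one outgoing edge.
        while len(child) == 1:
            (next_label, next_child), = child.items()
            label += next_label
            child = next_child
        compact[label] = compress(child)
    return compact


def count_nodes(node):
    """Count this node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.values())


def contains(node, pattern):
    """Substring lookup on the compressed structure."""
    if not pattern:
        return True
    for label, child in node.items():
        if label.startswith(pattern):
            return True  # pattern ends inside this edge
        if pattern.startswith(label):
            return contains(child, pattern[len(label):])
    return False


# "$" is an assumed end-of-text marker that does not occur in the text itself.
trie = build_suffix_trie("banana$")
compact = compress(trie)
print(count_nodes(trie), "nodes before,", count_nodes(compact), "nodes after")
print(contains(compact, "nan"), contains(compact, "nab"))
```

Searches still work on the merged structure because edges leaving the same node always start with different characters, so at most one edge can match the next part of the pattern.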

This page (http://www.cs.sunysb.edu/~algorith/implement/suds/implement.shtml) says you can transform a 160 MB suffix tree into a 33 MB compressed suffix tree. Quite a gain.

These compressed trees are used for genetic substring matching on huge strings. I used to run out of memory with a plain suffix tree, but after I compressed it, the out-of-memory errors disappeared.

I wish I could find a freely available article that explains the implementation better than this one: http://dl.acm.org/citation.cfm?id=1768593
