I have implemented a suffix tree in Python to do full-text searches, and it's working really well. But there's a problem: the indexed text can be very bi
Try a compressed suffix tree instead.
The main idea is that instead of keeping long chains of single-character nodes, you merge each chain into one node labelled with multiple characters, which saves all the extra nodes.
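To make the compaction idea concrete, here is a minimal sketch, not a production implementation. It assumes the uncompressed structure is a naive suffix trie stored as nested dicts (one dict per character), and the function names (`build_suffix_trie`, `compress`, `count_nodes`, `contains`) are made up for illustration. It builds the trie the slow way, merges every chain of single-child nodes into one multi-character edge, and shows that substring search still works afterwards.

```python
def build_suffix_trie(text):
    """Naive suffix trie: one node (a dict) per character of every suffix."""
    root = {}
    for i in range(len(text)):
        node = root
        for ch in text[i:]:
            node = node.setdefault(ch, {})
    return root


def compress(node):
    """Merge each chain of single-child nodes into one multi-character edge."""
    compressed = {}
    for label, child in node.items():
        # Keep absorbing the next node while it has exactly one outgoing edge.
        while len(child) == 1:
            (next_label, next_child), = child.items()
            label += next_label
            child = next_child
        compressed[label] = compress(child)
    return compressed


def count_nodes(node):
    """Count this node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.values())


def contains(node, pattern):
    """Return True if pattern occurs as a substring of the indexed text."""
    if not pattern:
        return True
    for label, child in node.items():
        if label[0] == pattern[0]:
            # Edge labels leaving one node start with distinct characters,
            # so at most one edge can match.
            n = min(len(label), len(pattern))
            if label[:n] != pattern[:n]:
                return False
            return contains(child, pattern[n:])
    return False


trie = build_suffix_trie("banana$")   # "$" marks the end of the text
tree = compress(trie)
print(count_nodes(trie), count_nodes(tree))   # 23 11
print(contains(tree, "nan"), contains(tree, "nab"))   # True False
```

Note that this sketch copies the edge labels as strings, and building the full trie first takes quadratic time and space. Real implementations usually store each label as a (start, end) pair of indices into the original text and build the compacted tree directly, which is where most of the memory saving comes from.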
According to this page (http://www.cs.sunysb.edu/~algorith/implement/suds/implement.shtml), a 160MB suffix tree can be reduced to a 33MB compressed suffix tree. Quite a gain.
These compressed trees are used for genetic substring matching on huge strings. I used to run out of memory with a plain suffix tree, but after I compressed it, the out-of-memory error disappeared.
I wish I could find a freely available article that explains the implementation better. (http://dl.acm.org/citation.cfm?id=1768593)