Algorithms to detect phrases and keywords from text

不思量自难忘° 2020-12-12 09:33

I have around 100 megabytes of text, without any markup, divided into approximately 10,000 entries. I would like to automatically generate a 'tag' list. The problem is that

5 answers
  •  暗喜 (original poster)
    2020-12-12 09:41

    One way would be to build yourself an automaton, most likely a nondeterministic finite automaton (NFA).
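
To make the automaton idea a bit more concrete, here is a minimal sketch in Python. It assumes the phrases to detect are already known, treats each word as one input symbol, and builds a trie-shaped NFA that is simulated by tracking a set of active states. The names `build_phrase_nfa` and `find_phrases`, and the `\w+` tokenization, are illustrative assumptions and not something the answer specifies.

```python
import re
from collections import defaultdict


def build_phrase_nfa(phrases):
    """Build a trie-shaped NFA over word tokens (hypothetical helper).

    State 0 is the start state. transitions[state][token] gives the next
    state, and accepting[state] gives the phrase recognised at that state.
    """
    transitions = defaultdict(dict)
    accepting = {}
    next_state = 1
    for phrase in phrases:
        tokens = re.findall(r"\w+", phrase.lower())
        if not tokens:
            continue  # skip empty phrases so the start state never accepts
        state = 0
        for token in tokens:
            if token not in transitions[state]:
                transitions[state][token] = next_state
                next_state += 1
            state = transitions[state][token]
        accepting[state] = phrase
    return transitions, accepting


def find_phrases(text, transitions, accepting):
    """Simulate the NFA and return (token_index, phrase) for each match.

    token_index is the index of the last token of the matched phrase.
    """
    matches = []
    active = set()
    for position, token in enumerate(re.findall(r"\w+", text.lower())):
        # Nondeterminism: a phrase may begin at any token, so the start
        # state is active at every step alongside partially matched states.
        active.add(0)
        next_active = set()
        for state in active:
            target = transitions.get(state, {}).get(token)
            if target is not None:
                next_active.add(target)
                if target in accepting:
                    matches.append((position, accepting[target]))
        active = next_active
    return matches


transitions, accepting = build_phrase_nfa(
    ["finite automaton", "finite state machine", "tag"]
)
found = find_phrases(
    "A finite automaton is a finite state machine; tag it.",
    transitions,
    accepting,
)
```

This only finds phrases that were supplied up front; it does not discover new tags by itself, so it would need to be paired with some way of proposing candidate phrases.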

    Another, simpler way would be to create a file that contains the words and/or word groups that you want to ignore, find, compare, etc. Load them into memory when the program starts, then compare the text you are parsing against those words and word groups.
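
The word-list approach might look roughly like the following Python sketch. It assumes a plain-text file with one word or word group per line; the function names, the file layout, and the `max_words` limit are assumptions made for illustration.

```python
import re
from collections import Counter


def load_word_list(path):
    """Read one word or word group per line into a set, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip().lower() for line in handle if line.strip()}


def count_listed_terms(text, terms, max_words=3):
    """Count how often each listed term occurs in the text.

    Every run of 1..max_words consecutive words is looked up in `terms`,
    so word groups are matched as well as single words.
    """
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for size in range(1, max_words + 1):
        for start in range(len(tokens) - size + 1):
            candidate = " ".join(tokens[start:start + size])
            if candidate in terms:
                counts[candidate] += 1
    return counts


# In practice the set would come from load_word_list("terms.txt"),
# where "terms.txt" is a hypothetical file name.
terms = {"finite automaton", "tag", "the"}
counts = count_listed_terms(
    "The tag list: a finite automaton, another tag.", terms
)
```

Keeping the terms in a set makes each lookup constant time on average, which matters when scanning roughly 100 megabytes of text. The same loader can serve for an ignore list: load it into a second set and skip any candidate found there.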
