Algorithms to detect phrases and keywords from text

不思量自难忘° 2020-12-12 09:33

I have around 100 megabytes of text, without any markup, divided into approximately 10,000 entries. I would like to automatically generate a 'tag' list. The problem is that

5 Answers
  •  死守一世寂寞
    2020-12-12 09:57

    I'd start with a wonderful chapter by Peter Norvig in the O'Reilly book Beautiful Data. On his personal website he provides the n-gram data you'll need, along with beautiful Python code, which may solve your problem as-is or with some modification.
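
To give a rough idea of what working with n-gram counts looks like, here is a minimal sketch that ranks two-word phrases by pointwise mutual information (PMI). This is not Norvig's code and does not use his data: it counts unigrams and bigrams directly from your own entries, and the names `tokenize`, `candidate_phrases`, `min_count`, and `top_k` are made up for illustration. It assumes English text and a very naive tokenizer.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def candidate_phrases(entries, min_count=2, top_k=10):
    """Return up to top_k bigrams ranked by PMI, ignoring rare bigrams."""
    unigrams = Counter()
    bigrams = Counter()
    for entry in entries:
        tokens = tokenize(entry)
        unigrams.update(tokens)
        # Bigrams are counted within an entry, never across entries.
        bigrams.update(zip(tokens, tokens[1:]))

    total = sum(unigrams.values())
    scored = []
    for (w1, w2), n in bigrams.items():
        if n < min_count:
            continue  # too rare to trust the score
        # PMI up to a constant factor: how much more often the pair occurs
        # together than the individual word counts would suggest.
        pmi = math.log((n * total) / (unigrams[w1] * unigrams[w2]))
        scored.append((pmi, n, f"{w1} {w2}"))

    scored.sort(reverse=True)
    return [phrase for _, _, phrase in scored[:top_k]]


entries = [
    "machine learning is fun",
    "I study machine learning",
    "the cat is on the mat",
    "the dog is here",
]
print(candidate_phrases(entries))
```

On a real corpus you would probably want a higher `min_count`, a stop-word filter, and possibly longer n-grams; those choices depend on the data and are left open here.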
