How to auto-tag content, algorithms and suggestions needed

我在风中等你 2020-12-22 18:51

I am working with some really large collections of newspaper articles. I have them in a MySQL database, and I can query them all.

I am now searching for ways to help m

8 answers
  •  感情败类
    2020-12-22 19:14

    Your approach seems sensible and there are two ways you can improve the tagging.

    1. Use a known list of keywords/phrases for your tagging: if the number of times a word/phrase occurs in an article exceeds a threshold (probably based on the length of the article), include the tag.
    2. Use a part-of-speech tagging algorithm to reduce each article to a set of meaningful phrases (noun phrases, for example), then extract tags from those. Once the articles have been reduced this way, you can pick out good candidate words/phrases to add to your keyword/phrase list for method 1.
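Method 1 could be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: the `KEYWORDS` vocabulary, the `suggest_tags` function, and the `per_1000_words` and `min_count` parameters are all made-up names and values, and the article text is assumed to have already been fetched from the database as a plain string.

```python
import re
from collections import Counter

# Hypothetical tag vocabulary: tag -> phrases that count as evidence for it.
KEYWORDS = {
    "economy": ["inflation", "interest rate", "central bank"],
    "football": ["goal", "penalty", "league"],
}


def suggest_tags(text, keywords=KEYWORDS, per_1000_words=5.0, min_count=2):
    """Return tags whose phrases occur often enough for the article's length."""
    lowered = text.lower()
    n_words = len(re.findall(r"\w+", lowered))
    # The threshold grows with article length but never drops below min_count
    # (both numbers are assumptions and would need tuning on real articles).
    threshold = max(min_count, per_1000_words * n_words / 1000)

    counts = Counter()
    for tag, phrases in keywords.items():
        for phrase in phrases:
            # Whole-word match, so "goal" does not match "goalkeeper".
            pattern = r"\b" + re.escape(phrase) + r"\b"
            counts[tag] += len(re.findall(pattern, lowered))

    return sorted(tag for tag, count in counts.items() if count >= threshold)


article = (
    "Inflation rose again. The central bank raised the interest rate. "
    "Inflation worries persist."
)
print(suggest_tags(article))
```

Summing the counts of all phrases per tag is one possible design choice; counting each phrase separately, or weighting phrases, would work just as well. The candidate phrases found with method 2 would simply be added to the vocabulary.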
