How to detect features of a product in an English sentence - nlp

Question

I am trying to detect the features (e.g. screen, processing speed) of a product (e.g. a mobile phone) in an English sentence. My approach works on a paragraph of several sentences that talks about the product. I assume that the words that appear most frequently are the features of that product, after excluding pronouns and sentiment words such as "good" and "bad", which I store in a file. I rank the remaining words by their frequency and by their distance from the sentiment words, and take the top n of them.

However, this is not very effective. Can anyone suggest a better approach for detecting the words that are features of a product?

Answer 1:

There has been a massive amount of research in this area. Start by reading Bing Liu's seminal work (Liu 2004, Liu 2005).

One popular technique is to use the dependency graph produced by Stanford CoreNLP. You can write rules such as "a noun (NN) connected to an adjective (JJ) by an nsubj dependency". Five to ten rules of this kind would be sufficient for a basic system.
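To make the rule idea concrete, here is a minimal sketch in plain Python. It assumes the sentence has already been run through a dependency parser such as CoreNLP, and that the output has been copied by hand into the lists below. The `Dep` type, the function name and the second rule (`amod`) are illustrative assumptions, not part of any parser's API.

```python
from typing import List, NamedTuple, Tuple


class Dep(NamedTuple):
    """One dependency edge; indices are 1-based token positions."""
    relation: str
    head: int
    dependent: int


def extract_feature_opinion_pairs(
    tagged: List[Tuple[str, str]], deps: List[Dep]
) -> List[Tuple[str, str]]:
    """Return (feature, opinion) pairs matched by two hand-written rules."""
    pairs = []
    for dep in deps:
        head_word, head_tag = tagged[dep.head - 1]
        dep_word, dep_tag = tagged[dep.dependent - 1]
        # Rule 1: noun is the subject of an adjective ("screen is great").
        if (dep.relation == "nsubj"
                and head_tag.startswith("JJ") and dep_tag.startswith("NN")):
            pairs.append((dep_word, head_word))
        # Rule 2 (assumed extra rule): adjective modifies a noun ("slow processor").
        elif (dep.relation == "amod"
                and head_tag.startswith("NN") and dep_tag.startswith("JJ")):
            pairs.append((head_word, dep_word))
    return pairs


# Hand-written parse of: "The screen is great but it has a slow processor"
TAGGED = [("The", "DT"), ("screen", "NN"), ("is", "VBZ"), ("great", "JJ"),
          ("but", "CC"), ("it", "PRP"), ("has", "VBZ"), ("a", "DT"),
          ("slow", "JJ"), ("processor", "NN")]
DEPS = [Dep("det", 2, 1), Dep("nsubj", 4, 2), Dep("cop", 4, 3),
        Dep("cc", 4, 5), Dep("conj", 4, 7), Dep("nsubj", 7, 6),
        Dep("det", 10, 8), Dep("amod", 10, 9), Dep("dobj", 7, 10)]

if __name__ == "__main__":
    print(extract_feature_opinion_pairs(TAGGED, DEPS))
```

Each further rule is one more branch on the relation name and the two POS tags, so a basic system with a handful of rules stays small. The pronoun subject ("it has") is ignored because its head is a verb, not an adjective.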

The state of the art in this area uses a sequence tagging approach (CRF/HMM) that tags each word as a feature term or not. However, you need a good amount of labelled data for it. Check recent work on Aspect Based Sentiment Analysis.
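As a rough illustration of what the labelled data for such a tagger might look like, the sketch below encodes feature terms with B/I/O labels and decodes a label sequence back into terms. The BIO scheme, the `ASP` label name and the per-token features are assumptions chosen for illustration. The CRF or HMM itself is not shown; it would be trained on these label sequences with a separate library.

```python
from typing import Dict, List, Tuple


def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int]]) -> List[str]:
    """Turn annotated feature-term spans (start, end-exclusive) into BIO labels."""
    labels = ["O"] * len(tokens)
    for start, end in spans:
        labels[start] = "B-ASP"
        for i in range(start + 1, end):
            labels[i] = "I-ASP"
    return labels


def bio_to_terms(tokens: List[str], labels: List[str]) -> List[str]:
    """Collect the feature terms described by a BIO label sequence."""
    terms, current = [], []
    for token, label in zip(tokens, labels):
        if label == "B-ASP":
            if current:
                terms.append(" ".join(current))
            current = [token]
        elif label == "I-ASP" and current:
            current.append(token)
        else:
            if current:
                terms.append(" ".join(current))
            current = []
    if current:
        terms.append(" ".join(current))
    return terms


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Simple per-token features of the kind a CRF tagger could consume."""
    word = tokens[i]
    return {
        "lower": word.lower(),
        "suffix3": word[-3:].lower(),
        "is_title": word.istitle(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


TOKENS = ["The", "processing", "speed", "is", "slow",
          "but", "the", "screen", "is", "sharp"]
SPANS = [(1, 3), (7, 8)]  # hand-annotated: "processing speed", "screen"

if __name__ == "__main__":
    labels = spans_to_bio(TOKENS, SPANS)
    print(list(zip(TOKENS, labels)))
    print(bio_to_terms(TOKENS, labels))
```

A trained tagger would predict the label list for an unseen sentence, and `bio_to_terms` would then turn those predictions into multi-word feature terms such as "processing speed".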

Some helpful resources:

http://alt.qcri.org/semeval2015/task12/
http://www.aueb.gr/users/ion/docs/pavlopoulos_phd_thesis.pdf
http://www.aclweb.org/anthology/S14-2004

Source: https://stackoverflow.com/questions/30585228/how-to-detect-features-of-a-product-in-an-english-sentence-nlp

Tags

java

nlp

artificial-intelligence