Associating free text statements with pre-defined attributes

Submitted by 懵懂的女人 on 2019-12-13 19:07:39

Question


I have a list of several dozen product attributes that people are concerned with, like

  • Financing
  • Manufacturing quality
  • Durability
  • Sales experience

and several million free-text statements from customers about the product, e.g.

"The financing was easy but the housing is flimsy."

I would like to score each free text statement in terms of how strongly it relates to each of the attributes, and whether that is a positive or negative association.

In the given example, there would be a strong positive association to Financing and a strong negative association to Manufacturing quality.

It feels like this type of problem is probably the realm of Natural Language Processing (NLP). However, I spent several hours reading up on things like OpenNLP and NLTK, and there is so much domain-specific terminology that I cannot figure out where to focus to solve this specific problem.

So my three-part question:

  • Is NLP the correct route to solve this class of problem?
  • What aspect of NLP should I focus on learning for this specific problem?
  • Are there alternatives I have not considered?

Answer 1:


Yes, this is an NLP problem known as sentiment analysis. Sentiment analysis is an active research area with many different approaches, and it is a task in which several other NLP methods have to work together, so it is certainly not the easiest field to get started with in NLP.

A more or less recent survey of the academic research in the field can be found in Pang & Lee (2008).




Answer 2:


A resource you might find handy is SentiWordNet (http://sentiwordnet.isti.cnr.it/), which is like a dictionary that gives a sentiment grade for words. It tells you to what degree it considers a word positive, negative, or objective.

You can then combine that with some NLTK code that looks through your sentences for the words you want to associate the sentiment with. You would write a script that extracts meaningful chunks of text surrounding the words you are looking at, perhaps at the sentence or clause level. A second step then runs through the surrounding words and collects their sentiment scores from SentiWordNet.

I have some old code that did this and can put it on GitHub if you'd like, but you'd still need to make your own request for SentiWordNet.
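The approach described in this answer could be sketched roughly as follows. This is only an illustration, not the answerer's code: `TOY_LEXICON` is a hypothetical stand-in for SentiWordNet scores, `ATTRIBUTE_KEYWORDS` is a made-up keyword list for a few of the attributes from the question, and the clause splitting is a crude regular expression rather than a real parser.

```python
import re

# Hypothetical stand-in for SentiWordNet: word -> (positive, negative) score.
# Real SentiWordNet scores are attached to WordNet synsets, not bare words.
TOY_LEXICON = {
    "easy": (0.6, 0.0),
    "flimsy": (0.0, 0.7),
    "great": (0.75, 0.0),
    "rude": (0.0, 0.6),
}

# Hypothetical keywords that tie surface words to the predefined attributes.
ATTRIBUTE_KEYWORDS = {
    "Financing": {"financing", "loan", "payment"},
    "Manufacturing quality": {"housing", "build", "assembly"},
    "Sales experience": {"salesperson", "dealer"},
}

# Crude clause boundaries: punctuation plus a few contrastive conjunctions.
CLAUSE_SPLIT = re.compile(r"[.;!?,]|\b(?:but|however|although)\b", re.IGNORECASE)


def score_statement(text):
    """Return {attribute: summed polarity} for attributes mentioned in text."""
    scores = {}
    for clause in CLAUSE_SPLIT.split(text):
        words = re.findall(r"[a-z']+", clause.lower())
        if not words:
            continue
        # Polarity of the clause: positive minus negative score of each word.
        polarity = 0.0
        for word in words:
            positive, negative = TOY_LEXICON.get(word, (0.0, 0.0))
            polarity += positive - negative
        # Credit the clause polarity to every attribute it mentions.
        for attribute, keywords in ATTRIBUTE_KEYWORDS.items():
            if keywords.intersection(words):
                scores[attribute] = scores.get(attribute, 0.0) + polarity
    return scores


result = score_statement("The financing was easy but the housing is flimsy.")
print(result)
```

On the example sentence from the question, this yields a positive score for Financing and a negative score for Manufacturing quality. The sketch ignores negation ("not easy"), word sense, and attributes that are implied rather than named, all of which a real solution would have to handle. With NLTK installed and its data downloaded, the toy lexicon could presumably be replaced by lookups through the `nltk.corpus.sentiwordnet` reader.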




Answer 3:


I think your problem is more about association than plain classification. Working from that assumption:

Is NLP the correct route to solve this class of problem?

Yes.

What aspect of NLP should I focus on learning for this specific problem?

  • Part of speech tagging
  • Sentiment analysis
  • Maximum entropy

Are there alternatives I have not considered?

An in-depth study of automata theory as it applies to NLP will help you a lot; it helped me a great deal in grasping implementations like OpenNLP.



Source: https://stackoverflow.com/questions/8541179/associating-free-text-statements-with-pre-defined-attributes
