Extract relevant sentences to entity

為{幸葍}努か submitted on 2019-12-03 21:54:00

What you want is a Named Entity Recognizer (NER). Given an input sentence, an NER identifies the entities in it and labels them as persons, organizations, products, and so on. You can then check whether a sentence contains an entity recognized as a product, and keep or discard the sentence accordingly. One very simple possibility is the named entity recognizer of NLTK in Python. Here is an example:

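import nltk

# The tokenizer, tagger and chunker models may need a one-time nltk.download(...);
# the exact resource names vary between NLTK versions.
sent = "Albert Einstein spent many years at Princeton University in New Jersey"
sent1 = nltk.word_tokenize(sent)  # split the sentence into word tokens
sent2 = nltk.pos_tag(sent1)       # attach a part-of-speech tag to each token
sent3 = nltk.ne_chunk(sent2)      # group tagged tokens into named-entity chunks
print(sent3)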

The output will be:

(S
  (PERSON Albert/NNP)
  (PERSON Einstein/NNP)
  spent/VBD
  many/JJ 
  years/NNS
  at/IN
  (ORGANIZATION Princeton/NNP University/NNP)
  in/IN
  (GPE New/NNP Jersey/NNP))

NLTK works well for this simple example, but to be honest I'm not sure how accurate it is in general, or whether it can be customized to fit your purpose (identifying products). I do know that the Stanford NER is both customizable and accurate, so you might want to have a look at it.
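To make the keep-or-discard step concrete, here is a minimal sketch of the filtering logic on its own. The recognizer is passed in as a function, so any NER can be plugged in. The names `keep_sentences` and `gazetteer_recognizer`, the `PRODUCT` label, and the small lookup table are all assumptions made for illustration; the lookup table is only a toy stand-in for a real recognizer, not something NLTK or Stanford NER provides.

```python
from typing import Callable, Iterable, List, Set, Tuple

Entity = Tuple[str, str]  # (entity text, entity label)


def gazetteer_recognizer(sentence: str) -> List[Entity]:
    """Toy stand-in for a real NER: plain substring lookup in a hand-made list."""
    known = {
        "iPhone": "PRODUCT",
        "Kindle": "PRODUCT",
        "Princeton University": "ORGANIZATION",
    }
    return [(name, label) for name, label in known.items() if name in sentence]


def keep_sentences(
    sentences: Iterable[str],
    recognize: Callable[[str], List[Entity]],
    wanted_labels: Set[str],
) -> List[str]:
    """Keep the sentences in which the recognizer finds at least one wanted label."""
    kept = []
    for sentence in sentences:
        labels = {label for _, label in recognize(sentence)}
        if labels & wanted_labels:
            kept.append(sentence)
    return kept


sentences = [
    "The iPhone battery lasts about a day.",
    "Albert Einstein spent many years at Princeton University.",
    "I read on my Kindle every night.",
]

product_sentences = keep_sentences(sentences, gazetteer_recognizer, {"PRODUCT"})
print(product_sentences)
```

With NLTK, the recognizer could instead walk the tree returned by `nltk.ne_chunk` and collect each subtree's `label()` and `leaves()`. Which labels are available depends on the model, so it is worth checking whether a product label is among them before relying on this.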

This paper might be a solution for your problem: https://www.aclweb.org/anthology/W12-4702

The approach to this type of problem is complex. Sentences that talk about an entity can be of any type (descriptive, comparative, question, etc.), and on top of that there are cases where the entity may or may not be mentioned explicitly.

Some approaches that can be tried: entity transition, co-reference resolution, discourse relation extraction, etc.
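To illustrate why implicit mentions matter, here is a deliberately crude sketch. It keeps a sentence if it names the entity, or if it contains a pronoun and directly follows a sentence that was kept. This is not real co-reference resolution, only a pronoun carry-over heuristic; the function name `sentences_about`, the pronoun list, and the sample review are assumptions made up for the example.

```python
import re
from typing import List

# Small, hand-picked pronoun list; an assumption, not a linguistic resource.
PRONOUNS = {"it", "its", "they", "them", "their"}


def sentences_about(entity: str, sentences: List[str]) -> List[str]:
    """Keep explicit mentions, plus pronoun sentences right after a kept sentence."""
    kept = []
    previous_kept = False
    for sentence in sentences:
        lowered = sentence.lower()
        words = set(re.findall(r"[a-z']+", lowered))
        explicit = entity.lower() in lowered
        # A pronoun only counts when the previous sentence was about the entity.
        implicit = previous_kept and bool(words & PRONOUNS)
        previous_kept = explicit or implicit
        if previous_kept:
            kept.append(sentence)
    return kept


review = [
    "I bought the Kindle last month.",
    "It is light and the screen is sharp.",
    "My old tablet was much heavier.",
    "It also needed charging every day.",
]

about_kindle = sentences_about("Kindle", review)
print(about_kindle)
```

The second sentence is kept although it never names the Kindle, and the last one is dropped because the sentence before it changed the topic. A real co-reference resolver would make this decision from the text itself rather than from sentence position, which is why the approaches listed above are worth trying.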

Thanks.
