Question
How can I extract person names from the text?
I have tried an NLP toolkit for this: specifically, I used the Stanford NER toolkit to extract names from text. It extracts person names from the text, but when I want the program to extract words like 'programmer', 'lecturer', or 'engineer', it cannot extract those. Is there any way to extract these from the text?
Answer 1:
Since "programmer", "lecturer", and "engineer" are not named entities, you may have to maintain a list of those words. I think you can obtain them from the derivational relationships in WordNet, such as "sing" (verb) and "singer" (noun), or "lecture" (verb) and "lecturer" (noun).
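As a rough illustration of the "maintain a list" idea, here is a minimal Python sketch that matches text against a hand-written word list. The `OCCUPATIONS` set and the `extract_occupations` function are hypothetical names made up for this example, and the plural handling is deliberately naive; in practice the list would probably be built from WordNet or another lexicon rather than typed by hand.

```python
import re

# Hypothetical, hand-maintained list of occupation words (an assumption for
# this sketch; it does not come from any library).
OCCUPATIONS = {"programmer", "lecturer", "engineer"}


def extract_occupations(text, occupations=OCCUPATIONS):
    """Return the occupation words found in text, in order of appearance."""
    found = []
    for token in re.findall(r"[A-Za-z]+", text):
        word = token.lower()
        # Naive plural handling: "engineers" -> "engineer".
        if word not in occupations and word.endswith("s"):
            word = word[:-1]
        if word in occupations:
            found.append(word)
    return found


print(extract_occupations("Alice is a programmer; Bob and Carol are engineers."))
# prints ['programmer', 'engineer']
```

The person names themselves ("Alice", "Bob", "Carol") would still come from the NER step; this lookup only covers the common nouns that NER skips.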
A SuperSense tagger can also serve as an NER tool; I think it can tag the words you mentioned as "noun.person", which is what you need. ArkRef (Java) is a coreference tool that uses one (through a bundled Java port of the SuperSense tagger), and it has an online demo, so you can check whether your target words are tagged in square brackets.
Source: https://stackoverflow.com/questions/9561370/how-can-i-differentiate-between-a-persons-name-and-other-names-that-are-derived