I used NLTK's ne_chunk to extract named entities from a text:
my_sent = "WASHINGTON -- In the wake of a string of abuses by New York police of
You can also extract the label of each named entity in the text using this code:
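import nltk

# Requires the NLTK tokenizer, POS tagger and NE chunker data (via nltk.download).
for sent in nltk.sent_tokenize(my_sent):
    # Tokenize, POS-tag, then chunk each sentence into a tree.
    for chunk in nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(sent))):
        # Named-entity subtrees have a label(); plain (word, tag) tuples do not.
        if hasattr(chunk, 'label'):
            print(chunk.label(), ' '.join(c[0] for c in chunk))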
Output:
GPE WASHINGTON
GPE New York
PERSON Loretta E. Lynch
GPE Brooklyn
You can see that Washington, New York and Brooklyn are labeled GPE, which means geo-political entity, and Loretta E. Lynch is labeled PERSON.
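
If you would rather collect the entities by label than print them, something like the sketch below might work. Note that group_entities is a hypothetical helper name of mine, not part of NLTK, and FakeChunk is only a stand-in that mimics the shape of the subtrees returned by ne_chunk so the example runs without the NLTK models installed.

```python
from collections import defaultdict


def group_entities(chunks):
    """Group named-entity chunks by label (hypothetical helper, not part of NLTK)."""
    grouped = defaultdict(list)
    for chunk in chunks:
        # Named-entity subtrees expose label(); plain (word, tag) tuples do not.
        if hasattr(chunk, 'label'):
            grouped[chunk.label()].append(' '.join(c[0] for c in chunk))
    return dict(grouped)


class FakeChunk(list):
    """Stand-in for an NLTK subtree so the sketch runs without NLTK data."""

    def __init__(self, label, leaves):
        super().__init__(leaves)
        self._label = label

    def label(self):
        return self._label


# Hand-built input mimicking the shape of ne_chunk output (an assumption).
chunks = [
    FakeChunk('GPE', [('New', 'NNP'), ('York', 'NNP')]),
    ('police', 'NN'),
    FakeChunk('PERSON', [('Loretta', 'NNP'), ('E.', 'NNP'), ('Lynch', 'NNP')]),
    FakeChunk('GPE', [('Brooklyn', 'NNP')]),
]
result = group_entities(chunks)
print(result)
```

With real data you would pass nltk.ne_chunk(nltk.pos_tag(nltk.word_tokenize(sent))) for each sentence instead of the hand-built list.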