I am trying to extract a list of persons and organizations using the Stanford Named Entity Recognizer (NER) in Python NLTK. When I run:
from nltk.tag.stanford impo
Try using the built-in "enumerate" function.
When you apply NER to the list of words, you get back a list of (word, type) tuples. Pass that list to enumerate(list) to assign an index to every tuple.
Later, when you extract the PERSON/ORGANIZATION/LOCATION entries from the list, each one has its index attached:
1 Hussein
2 Obama
3 II
6 James
7 Naismith
21 Naismith
19 Tony
20 Hinkle
0 Frank
1 Mahan
14 Naismith
0 Naismith
0 Mahan
0 Mahan
0 Naismith
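The indexing step described above could be sketched roughly as follows. This is only an illustration: the `tagged` list is a hypothetical hand-written sample in the (word, type) shape that the tagger returns, and `index_entities` and `WANTED` are names made up for this example, not part of NLTK.

```python
# Hypothetical sample of (word, type) tuples, standing in for real tagger output.
tagged = [
    ("Barack", "PERSON"), ("Hussein", "PERSON"), ("Obama", "PERSON"), ("II", "PERSON"),
    ("was", "O"), ("born", "O"), ("in", "O"), ("Hawaii", "LOCATION"), (".", "O"),
]

# Entity types to keep; "O" marks tokens that are not part of any entity.
WANTED = {"PERSON", "ORGANIZATION", "LOCATION"}

def index_entities(tagged_tokens, wanted=WANTED):
    """Return (index, word, type) for every token whose type is in `wanted`."""
    return [
        (i, word, tag)
        for i, (word, tag) in enumerate(tagged_tokens)
        if tag in wanted
    ]

indexed = index_entities(tagged)
for i, word, tag in indexed:
    print(i, word, tag)
```

Keeping the type next to the index makes it possible to avoid gluing a PERSON token onto an adjacent ORGANIZATION token later on.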
Now tokens with consecutive indices can be merged into a single name:
Hussein Obama II, James Naismith, Tony Hinkle, Frank Mahan
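One possible way to do the merging is sketched below. It assumes the entities are already available as (index, word, type) triples sorted by index; `merge_consecutive` is a hypothetical helper name, and the `sample` data reuses a few indices from the listing above with the PERSON type assumed.

```python
def merge_consecutive(indexed_tokens):
    """Join tokens whose indices are consecutive and whose types match."""
    names = []
    current = []          # words of the name being built
    prev_index = prev_tag = None
    for index, word, tag in indexed_tokens:
        if current and index == prev_index + 1 and tag == prev_tag:
            # Directly follows the previous token: same name continues.
            current.append(word)
        else:
            # Gap in the indices (or a type change): close the previous name.
            if current:
                names.append((" ".join(current), prev_tag))
            current = [word]
        prev_index, prev_tag = index, tag
    if current:
        names.append((" ".join(current), prev_tag))
    return names

# Assumed sample: indices taken from the listing above, types assumed to be PERSON.
sample = [
    (1, "Hussein", "PERSON"), (2, "Obama", "PERSON"), (3, "II", "PERSON"),
    (6, "James", "PERSON"), (7, "Naismith", "PERSON"),
    (21, "Naismith", "PERSON"),
]
merged = merge_consecutive(sample)
print(merged)
```

A lone token such as the "Naismith" at index 21 comes out as its own entry, so duplicates of a surname may still need to be filtered afterwards, for example by dropping single-word names that already appear inside a longer one.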