Efficient Named Entity Recognition in R

Submitted by 天涯浪子 on 2021-01-29 12:58:24

Question


I have the following R code for extracting people and locations from text:

library(rvest)
library(NLP)
library(openNLP)

page = pdf_text("C:/Users/u214738/Documents/NER_Data.pdf")

text = as.String(page)

sent_annot = Maxent_Sent_Token_Annotator()
word_annot = Maxent_Word_Token_Annotator()

install.packages("openNLPmodels", repos = "http://datacube.wu.ac.at/src/contrib/", type = "source")
install.packages("openNLPmodels.en", repos = "http://datacube.wu.ac.at/", type = "source")
install.packages("openNLPmodels.en", repos = "http://datacube.wu.ac.at/", type = "source",kind="person")
install.packages("openNLPmodels.en",repos ="http://datacube.wu.ac.at/", type = "source",kind="location")
install.packages("openNLPmodels.de", repos = "http://datacube.wu.ac.at/", type = "source")

library(openNLPmodels.de)
library(openNLPmodels.en)

loc_annot = Maxent_Entity_Annotator(kind = "location") #annotate location
people_annot = Maxent_Entity_Annotator(kind = "person") #annotate person

annot.l1 = NLP::annotate(text, list(sent_annot,word_annot))

k <- sapply(annot.l1$features,`[[`,"kind")
Locations = text[annot.l1[k=="location"]]
People = text[annot.l1[k == "person"]]

unique(Locations)
print(Locations)

unique(People)
print(People)

But the results I get are as follows:

unique(Locations)

character(0)

print(Locations)

character(0)

unique(People)

character(0)

print(People)

character(0)

NER_Data.pdf contains ordinary text with people's names and locations, for example information about Bill Gates and Warren Buffett.

Any quick guidance on why no entities are returned would be appreciated.
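
One likely cause, offered as a sketch rather than a confirmed diagnosis: loc_annot and people_annot are created but never passed to NLP::annotate(), so the result holds only sentence and word annotations and nothing has kind "person" or "location". Separately, pdf_text() is provided by the pdftools package, not rvest. The version below assumes Java, NLP, openNLP and openNLPmodels.en are already installed, and it uses a made-up sample string in place of the PDF contents.

```r
# Sketch only: assumes Java, NLP, openNLP and openNLPmodels.en are installed.
library(NLP)
library(openNLP)

# Hypothetical sample text standing in for the contents of NER_Data.pdf.
text <- as.String("Bill Gates lives in Seattle. Warren Buffett lives in Omaha.")

sent_annot   <- Maxent_Sent_Token_Annotator()
word_annot   <- Maxent_Word_Token_Annotator()
people_annot <- Maxent_Entity_Annotator(kind = "person")
loc_annot    <- Maxent_Entity_Annotator(kind = "location")

# Entity annotators rely on sentence and word annotations, so they run last
# in the same pipeline.
annots <- NLP::annotate(text, list(sent_annot, word_annot, people_annot, loc_annot))

# Keep only the entity annotations, then read each one's "kind" feature.
entities <- annots[annots$type == "entity"]
kinds <- vapply(entities$features, function(f) f$kind, character(1))

People    <- unique(text[entities[kinds == "person"]])
Locations <- unique(text[entities[kinds == "location"]])

print(People)
print(Locations)
```

To run this on the PDF instead, replace the sample string with as.String(pdftools::pdf_text(path)), where path is the file location. Which names the models actually pick up depends on the text, so the result may still miss some entities.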

Source: https://stackoverflow.com/questions/58169707/efficient-named-entity-recognition-in-r
