Question
I have a set of text files and am using Stanford's CoreNLP Named Entity Recognizer (NER) to extract details from the lines that mention a patient name. When I run NER on a single sentence, the results print correctly. When I run it over the whole set of files, the results print along with the error below, and because of the error I cannot write the results to a text file:
500 Server Error: Internal Server Error for url: http://localhost:9000/?properties=%7B%22outputFormat%22%3A+%22json%22%2C+%22annotators%22%3A+%22tokenize%2Cssplit%2Cner%22%2C+%22ssplit.isOneSentence%22%3A+%22true%22%7D
Here is the code I am using:
import re
import os
from nltk.parse import CoreNLPParser

# Client for a CoreNLP server that is already running locally on port 9000
tagger = CoreNLPParser(url='http://localhost:9000', tagtype='ner')

def name_detail_extracter():
    data_location = r"D:\Data"  # folder containing all the data (raw string keeps the backslash literal)
    for root, dirs, files in os.walk(data_location):
        for filename in files:
            with open(os.path.join(root, filename), encoding="utf8", mode="r") as f:
                patient_name_check = re.compile(r".*\s+(patient name)\s*:*\s*(.*)", re.I)
                for line_number, line in enumerate(f, 1):
                    patient_name_matches = patient_name_check.findall(line)
                    for match in patient_name_matches:
                        name_details = match[1]  # text that follows "patient name:"
                        tokens = name_details.split()
                        result = tagger.tag(tokens)
                        for m in result:
                            print(m)

name_detail_extracter()
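Answer 1:
The issue is resolved. The token list passed to NER was sometimes empty (this happens when nothing follows "patient name" on a line, so name_details.split() returns an empty list), and the server responded to those requests with the 500 error. I have now added a check that skips empty token lists: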
for match in patient_name_matches:
    name_details = match[1]
    tokens = name_details.split()
    if tokens:  # this is the check which I put
        result = tagger.tag(tokens)
        for m in result:
            print(m)
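To see the guard in isolation, here is a minimal sketch that can be run without a CoreNLP server. The helper names `extract_name_tokens`, `tag_lines`, and `fake_tag` are hypothetical and not part of NLTK or CoreNLP; `fake_tag` only stands in for `tagger.tag` so that the filtering logic can be checked on its own. The regular expression is the one from the question.

```python
import re

# Same pattern as in the question; group 2 captures the text after "patient name"
PATIENT_NAME_RE = re.compile(r".*\s+(patient name)\s*:*\s*(.*)", re.I)


def extract_name_tokens(line):
    """Return one token list per match in the line, skipping empty lists."""
    token_lists = []
    for match in PATIENT_NAME_RE.findall(line):
        tokens = match[1].split()
        if tokens:  # the guard from the answer: never pass an empty list on
            token_lists.append(tokens)
    return token_lists


def tag_lines(lines, tag_fn):
    """Apply tag_fn to every non-empty token list; return (line_number, tags) pairs."""
    results = []
    for line_number, line in enumerate(lines, 1):
        for tokens in extract_name_tokens(line):
            results.append((line_number, tag_fn(tokens)))
    return results


def fake_tag(tokens):
    """Hypothetical stand-in for tagger.tag, used only for this demonstration."""
    return [(token, "PERSON") for token in tokens]


sample_lines = [
    "Record for patient name: John Smith\n",
    "Record for patient name:   \n",  # nothing after the label, so it is skipped
    "a line without the label\n",
]
tagged = tag_lines(sample_lines, fake_tag)
for line_number, tags in tagged:
    print(line_number, tags)
```

With a real server running, the same function could presumably be called as `tag_lines(f, tagger.tag)` for each open file `f`. Because the results are collected in a list rather than printed inside the loop, writing them to a text file afterwards becomes a separate step.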
Source: https://stackoverflow.com/questions/52031337/stanfords-corenlp-name-entity-recogniser-throwing-error-500-server-error-inter