spacy

conda-forge::tqdm-4.19.4-py_0 - CondaError: Cannot link a source that does not exist

天涯浪子 submitted on 2021-02-07 12:48:27
Question: I'm trying to install SpaCy on Windows 7 using Conda and getting the following error: conda install -c conda-forge spacy tqdm-4.19.4-py 100% |###############################| Time: 0:00:00 804.27 kB/s ERROR conda.core.link:_execute_actions(337): An error occurred while installing package 'conda-forge::tqdm-4.19.4-py_0'. CondaError: Cannot link a source that does not exist. C:\Users\xxxxx\AppData\Local\Continuum\Anaconda3\Scripts\conda.exe Attempting to roll back. Has anyone else got this and

“has no attribute '__reduce_cython__'” error when using PyInstaller exe

自古美人都是妖i submitted on 2021-02-05 08:00:35
Question: I used PyInstaller to convert my Python file to an exe. While executing it I got the error below: AttributeError: type object 'neuralcoref.neuralcoref.array' has no attribute '__reduce_cython__' I'm using Python 3.6.7, PyInstaller 4.0, NeuralCoref 4, spaCy 2.1.0, Cython 0.27.3. Any suggestion to solve this, or a better way to convert .py to exe? I've tried py2exe and cx_Freeze, but they don't work. A minimal version of my code: import neuralcoref def ApplyCorefResolutionToPreProcessedMail(text, nlp): # load
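Errors like this often indicate that PyInstaller's static analysis missed a Cython-compiled submodule. One frequently suggested direction (an assumption here, not a verified fix for this exact setup) is to declare the module named in the traceback as a hidden import in the spec file; the entry-point name below is hypothetical.

```
# PyInstaller spec-file fragment (a sketch; 'your_script.py' is a
# placeholder). Declaring a Cython-compiled module as a hidden import
# makes PyInstaller bundle it even though its static analysis cannot
# detect the import.
a = Analysis(
    ['your_script.py'],
    hiddenimports=[
        'neuralcoref.neuralcoref',  # the module named in the AttributeError
    ],
)
```

The same can be tried from the command line with `pyinstaller --hidden-import neuralcoref.neuralcoref your_script.py`. If that does not help, mismatched Cython versions between the build that produced neuralcoref and your environment are another commonly reported cause.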

Differentiate between countries and cities in spacy NER

被刻印的时光 ゝ submitted on 2021-02-04 16:22:37
Question: I'm trying to extract countries from organisation addresses using spaCy NER; however, it labels countries and cities with the same tag, GPE. Is there any way I can differentiate them? For instance: nlp = en_core_web_sm.load() doc= nlp('Resilience Engineering Institute, Tempe, AZ, United States; Naval Postgraduate School, Department of Operations Research, Monterey, CA, United States; Arizona State University, School of Sustainable Engineering and the Built Environment, Tempe, AZ, United
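One workaround is to post-filter the GPE entities against a known country list. The sketch below assumes the entities have already been extracted as (text, label) pairs; the `COUNTRIES` set is a tiny illustrative stand-in (a package such as pycountry could supply a complete list).

```python
# Post-filter GPE entities: split them into countries and non-countries
# by membership in a country-name set. COUNTRIES is illustrative only.
COUNTRIES = {"United States", "France", "Germany", "India"}

def split_gpe(entities):
    """Split (text, label) pairs: GPE entities become countries or cities."""
    countries, cities = [], []
    for text, label in entities:
        if label != "GPE":
            continue  # ignore ORG, PERSON, etc.
        (countries if text in COUNTRIES else cities).append(text)
    return countries, cities

# Entities as spaCy might return them for part of the question's address:
ents = [("Resilience Engineering Institute", "ORG"),
        ("Tempe", "GPE"), ("AZ", "GPE"), ("United States", "GPE")]
print(split_gpe(ents))  # -> (['United States'], ['Tempe', 'AZ'])
```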

How to train a custom NER in spaCy with a single-word data set?

故事扮演 submitted on 2021-01-29 13:22:26
Question: I am trying to train a custom NER in spaCy with the new entity 'ANIMAL', but I have a data set of single words, such as: TRAIN_DATA = [("Whale_ Blue", {"entities": [(0,11,LABEL)]}), ("Shark_ whale", {"entities": [(0,12,LABEL)]}), ("Elephant_ African", {"entities": [(0,17,LABEL)]}), ("Elephant_ Indian", {"entities": [(0,16,LABEL)]}), ("Giraffe_ male", {"entities": [(0,13,LABEL)]}), ("Mule", {"entities": [(0,4,LABEL)]}), ("Camel", {"entities": [(0,5,LABEL)]}), ("Horse", {"entities": [(0,5,LABEL)]})
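When every training text is itself the entity, the hand-computed offsets above can be generated instead of typed, which avoids off-by-one mistakes. A minimal sketch (the `names` list is taken from the question):

```python
LABEL = "ANIMAL"
names = ["Whale_ Blue", "Shark_ whale", "Elephant_ African",
         "Elephant_ Indian", "Giraffe_ male", "Mule", "Camel", "Horse"]

# Each text is exactly one entity, so the span is always (0, len(text)).
TRAIN_DATA = [(n, {"entities": [(0, len(n), LABEL)]}) for n in names]

print(TRAIN_DATA[0])  # -> ('Whale_ Blue', {'entities': [(0, 11, 'ANIMAL')]})
```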

What is a good way to speed up test runs utilizing larger spacy models?

|▌冷眼眸甩不掉的悲伤 submitted on 2021-01-29 13:22:18
Question: I have constructed some tests relying on the en_core_web_md model. The model takes ~15 seconds to load into memory on my computer, making the tests a pain to run. Is there a smart way to speed it up? Answer 1: The v2.2.[0-5] md models have a minor bug that makes them particularly slow to load (see https://github.com/explosion/spaCy/pull/4990). You can reformat one file in the model package to improve the load time. In the vocab directory for the model package (e.g., lib/python3.7/site-packages/en_core
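Independent of that packaging fix, a load that still takes seconds can be paid once per test session rather than once per test by caching the loader. A minimal standard-library sketch; the pytest/spaCy lines in the comment are assumptions about a typical setup, not part of the runnable code:

```python
from functools import lru_cache

def cached(load_fn):
    """Wrap a model-loading function so each model name is loaded only once."""
    return lru_cache(maxsize=None)(load_fn)

# In a pytest conftest.py you might then write (assumed setup):
#
#   import spacy, pytest
#   get_nlp = cached(spacy.load)
#
#   @pytest.fixture(scope="session")
#   def nlp():
#       return get_nlp("en_core_web_md")
```

A session-scoped fixture alone achieves much of this within one pytest run; the cached loader additionally helps when several fixtures or helper modules request the same model by name.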

NER training using Spacy

做~自己de王妃 submitted on 2021-01-29 11:02:12
Question: When training an empty NER model, should I include only labeled data (data that necessarily contains at least one entity), or should I also include data that does not contain any label at all (in this case, teaching the model that in some circumstances these words do not have any label)? Answer 1: If you look at the commonly used training data for NER (you can find links at http://nlpprogress.com/english/named_entity_recognition.html ), you'll see that most/every example has at least one
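Concretely, "data that do not contain any label" uses the same training format as labeled data, just with an empty entity list. A minimal sketch of a mixed data set (the sentences and label are illustrative):

```python
LABEL = "ANIMAL"  # illustrative entity type

TRAIN_DATA = [
    # Positive example: characters 0-5 cover "Horse".
    ("Horse racing is popular.", {"entities": [(0, 5, LABEL)]}),
    # Negative example: an empty entity list tells the trainer that
    # nothing in this sentence should receive a label.
    ("The meeting starts at noon.", {"entities": []}),
]
```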

Azure Python deployment - spaCy ModuleNotFoundError exception

别等时光非礼了梦想. submitted on 2021-01-29 10:09:07
Question: I'm using a Linux App Service and trying to deploy a Python 3.6 Flask application through the Azure DevOps pipeline. It worked fine for a basic app, but when I added more code (the spacy module), it started to throw 2019-12-24T18:07:33.079953940Z __import__(module) 2019-12-24T18:07:33.079961840Z File "/home/site/wwwroot/application.py", line 3, in <module> 2019-12-24T18:07:33.079970340Z from Data_Cleanup_utility.clear_content_utility import ClearContent 2019-12-24T18:07:33.079978440Z File "

Merging tags into my file using named entity annotation

﹥>﹥吖頭↗ submitted on 2021-01-29 07:42:12
Question: While learning the basics of text mining I ran into the following problem: I must use named entity annotation to find and locate named entities, and when found, the tag must be included in the document. So, for example, "Hello I am Koen" must result in "Hello I am <PERSON>Koen</PERSON>". I figured out how to find and label the named entities, but I am stuck on getting them into the file in the right way. I've tried checking whether the ent.orth_ is in the file and then replacing it with the tag
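String replacement on the entity text breaks when the same string occurs more than once. A safer approach is to rebuild the text from character offsets, inserting tags from the end so earlier offsets stay valid. The sketch below assumes the spans have already been collected as (start, end, label) tuples, mirroring spaCy's ent.start_char / ent.end_char / ent.label_:

```python
def tag_entities(text, spans):
    """Wrap entity spans in <LABEL>...</LABEL> tags using char offsets."""
    # Process spans right-to-left so inserting tags does not shift the
    # offsets of spans that have not been handled yet.
    for start, end, label in sorted(spans, reverse=True):
        text = (text[:start] + f"<{label}>" + text[start:end]
                + f"</{label}>" + text[end:])
    return text

print(tag_entities("Hello I am Koen", [(11, 15, "PERSON")]))
# -> Hello I am <PERSON>Koen</PERSON>
```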

Export inception output to spacy's training input format

こ雲淡風輕ζ submitted on 2021-01-28 21:18:42
Question: I am using INCEpTION 0.11.0 (https://inception-project.github.io/) to annotate my training data, and I would like to use this training data with Python spaCy. I can see a couple of formats in INCEpTION that I can export to, but I am not sure which one is best suited for spaCy, and I could not find any documentation about converting these exported files to spaCy's format. I could write a new script to do this conversion, but before doing that I was wondering whether someone has already solved this and can give some
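If the data is exported in a CoNLL-style token/BIO-tag layout, the conversion to spaCy's character-offset training format is a short script. This is a sketch under the assumptions that tokens are space-joined and tags use the B-/I-/O scheme; it is not a ready-made INCEpTION reader:

```python
def bio_to_spacy(tokens, tags):
    """Convert parallel token/BIO-tag lists to spaCy's (text, annotations)."""
    # Character offsets of each token in the space-joined text.
    offsets, pos = [], 0
    for tok in tokens:
        offsets.append((pos, pos + len(tok)))
        pos += len(tok) + 1  # +1 for the separating space

    entities, current = [], None  # current = [start, end, label]
    for (s, e), tag in zip(offsets, tags):
        if tag.startswith("B-"):
            if current:
                entities.append(tuple(current))
            current = [s, e, tag[2:]]
        elif tag.startswith("I-") and current and tag[2:] == current[2]:
            current[1] = e  # extend the open entity
        else:
            if current:
                entities.append(tuple(current))
            current = None
    if current:
        entities.append(tuple(current))
    return " ".join(tokens), {"entities": entities}

print(bio_to_spacy(["Hello", "I", "am", "Koen"], ["O", "O", "O", "B-PERSON"]))
# -> ('Hello I am Koen', {'entities': [(11, 15, 'PERSON')]})
```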

Spacy custom sentence splitting

五迷三道 submitted on 2021-01-28 17:50:54
Question: I'm using spaCy for custom sentence splitting and I need to parameterize the custom delimiter/word for sentence splitting, but I didn't find how to pass it as an argument. Here is the function: # Manual or Custom Based def mycustom_boundary(docx): for token in docx[:-1]: if token.text == '...': docx[token.i+1].is_sent_start = True return docx # Adding the rule before parsing nlp.add_pipe(mycustom_boundary,before='parser') Please let me know how I can pass a custom splitter as a list to
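One way to parameterize the component is a factory function: the delimiter list is captured in a closure, so the pipeline still receives a one-argument callable as spaCy v2's add_pipe expects. A sketch (pure Python over any doc-like token sequence; the add_pipe line in the comment assumes the question's v2-style pipeline):

```python
def make_boundary(delimiters):
    """Return a sentence-boundary component with `delimiters` baked in.

    `delimiters` is a list of token texts (e.g. ['...', '---']) after
    which the next token should start a new sentence.
    """
    def set_boundaries(doc):
        for token in doc[:-1]:
            if token.text in delimiters:
                doc[token.i + 1].is_sent_start = True
        return doc
    return set_boundaries

# With spaCy v2, as in the question (assumed setup):
#   nlp.add_pipe(make_boundary(['...']), before='parser')
```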