spacy | 易学教程

Extract entity from dataframe using spacy

Read more about Extract entity from dataframe using spacy

Question: I read the contents of an Excel file using pandas: import pandas as pd df = pd.read_excel("FAM_template_Update 1911274_JS.xlsx") df While trying to extract entities using spaCy: import spacy nlp = spacy.load("en_core_web_sm") doc = nlp(df) for entity in doc.ents: print(entity.text) I got this error: TypeError: Argument 'string' has incorrect type (expected str, got DataFrame) on line 3: doc = nlp(df)
Answer 1: This is expected, as spaCy is not prepared to deal with a DataFrame as-is. You need to
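The answer above is cut off, so the following is only a minimal sketch of one common approach, not necessarily the one the answerer had in mind: pass the strings of a single text column to the pipeline instead of the whole DataFrame. It assumes spaCy v3 and pandas. The column name `text`, the helper `extract_entities`, and the sample rows are hypothetical. To keep the example runnable without downloading a model, it uses a blank pipeline with a rule-based `entity_ruler`; with a trained model you would use `spacy.load("en_core_web_sm")` instead.

```python
import pandas as pd
import spacy


def extract_entities(texts, nlp):
    """Return one list of (entity text, label) pairs per input string."""
    # nlp.pipe expects an iterable of strings, not a DataFrame
    return [[(ent.text, ent.label_) for ent in doc.ents] for doc in nlp.pipe(texts)]


# Assumption: a blank pipeline plus an entity ruler stands in for a trained model
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([{"label": "ORG", "pattern": "Nokia"}])

# Hypothetical data; in the question this would come from pd.read_excel(...)
df = pd.DataFrame({"text": ["Nokia opened an office.", "Nothing to tag here."]})

# Cast to str so that non-text cells do not raise the same TypeError
df["entities"] = extract_entities(df["text"].astype(str), nlp)
print(df)
```

Processing one column at a time keeps each row's entities aligned with the row they came from.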

Spacy Entity Rule doesn't work for cardinal (Social Security number)

Read more about Spacy Entity Rule doesn't work for cardinal (Social Security number)

Question: I used an EntityRuler to add a new label for Social Security numbers. I even set overwrite_ents=True, but it still doesn't recognize the number. I verified that the regular expression is correct and am not sure what else I need to do. I also tried before="ner", with the same result. text = "My name is yuyyvb and I leave on 605 W Clinton Street. My social security 690-96-4032" nlp = spacy.load("en_core_web_sm") ruler = EntityRuler(nlp, overwrite_ents=True) ruler.add_patterns([{"label": "SSN", "pattern": [{"TEXT": {"REGEX": r"\d{3}[^\w]

Kernel Died when running Neuralcoref

Read more about Kernel Died when running Neuralcoref

Question: I am trying to install neuralcoref by following the instructions given here. I created a Jupyter notebook and tried to run the following code. # Load your usual SpaCy model (one of SpaCy English models) import spacy nlp = spacy.load('en') # Add neural coref to SpaCy's pipe import neuralcoref neuralcoref.add_to_pipe(nlp) # You're done. You can now use NeuralCoref as you usually manipulate a SpaCy document annotations. doc = nlp(u'My sister has a dog. She loves him.') doc._.has_coref doc._.coref

Get position of word in sentence with spacy

Read more about Get position of word in sentence with spacy

Question: I'm aware of the basic spaCy workflow for getting various attributes from a document, but I can't find a built-in function that returns the position (start/end) of a word within a sentence. Does anyone know whether this is possible with spaCy?
Answer 1: These are available as attributes of the tokens in the sentences. The docs say: idx (int): The character offset of the token within the parent document. i (int): The index of the token within the parent document. >>> import spacy >>> nlp = spacy.load(
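The two attributes quoted in the answer can be illustrated with a short sketch. It assumes spaCy v3 and uses a blank English pipeline, since `Token.i` and `Token.idx` come from the tokenizer and need no trained model. The sample sentence and the `positions` list are hypothetical. The end offset is not an attribute named in the answer; it is computed here as `idx + len(token.text)`.

```python
import spacy

# A blank pipeline only tokenizes, which is enough to get token positions
nlp = spacy.blank("en")
doc = nlp("My sister has a dog.")

# (text, token index, start character offset, end character offset)
positions = [
    (token.text, token.i, token.idx, token.idx + len(token.text))
    for token in doc
]

for text, index, start, end in positions:
    # Slicing the document text with the offsets gives back the token text
    print(text, index, start, end, repr(doc.text[start:end]))
```

For positions relative to a sentence rather than the whole document, one option is to subtract the sentence span's `start` (token index) or `start_char` (character offset) from these values; that requires a pipeline that sets sentence boundaries.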

I am getting an InvalidArchiveError in anaconda prompt when I am trying to install spacy. How to solve it?

Read more about I am getting an InvalidArchiveError in anaconda prompt when I am trying to install spacy. How to solve it?

Question: InvalidArchiveError('Error with archive C:\Users\Sahaja Reddy\Anaconda3\pkgs\openssl-1.1.1g-he774522_0.conda. You probably need to delete and re-download or re-create this file. Message from libarchive was:\n\nCould not unlink (errno=22, retcode=-25, archive_p=1873471744752)')
Answer 1: I had the same problem. An open IPython instance was holding a handle on the OpenSSL files, so I wasn't able to delete the OpenSSL folder as Prayson suggested above. After closing all of my IPython &

spaCy and spaCy models in setup.py

Read more about spaCy and spaCy models in setup.py

Question: In my project I have spaCy as a dependency in my setup.py, but I also want to add a default model. My attempt so far has been: install_requires=['spacy', 'en_core_web_sm'], dependency_links=['https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm'], inside my setup.py, but both a regular pip install of my package and a pip install --process-dependency-links return: pip._internal.exceptions.DistributionNotFound: No

Removing named entities from a document using spacy

Read more about Removing named entities from a document using spacy

Question: I have tried to remove the words that spaCy considers named entities from a document, that is, to remove "Sweden" and "Nokia" from the example string. I could not find a way around the problem that entities are stored as spans, so comparing them with single tokens from a spaCy doc raises an error. In a later step, this process is supposed to become a function applied to several text documents stored in a pandas data frame. I would appreciate any kind of help and
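One way to avoid the span-versus-token comparison described above is to check each token's own `ent_type_` attribute, which is empty for tokens outside any entity. The sketch below assumes spaCy v3. The helper name `remove_entities` and the sample sentence are hypothetical (the question's original string is not shown). To stay runnable without a model download, a blank pipeline with an `entity_ruler` stands in for a trained NER model such as `en_core_web_sm`.

```python
import spacy


def remove_entities(doc):
    """Rebuild the text from the tokens that are not part of a named entity."""
    # text_with_ws keeps each token's trailing whitespace, so spacing is preserved
    return "".join(token.text_with_ws for token in doc if not token.ent_type_)


# Assumption: rule-based entities replace a trained NER component here
nlp = spacy.blank("en")
ruler = nlp.add_pipe("entity_ruler")
ruler.add_patterns([
    {"label": "GPE", "pattern": "Sweden"},
    {"label": "ORG", "pattern": "Nokia"},
])

doc = nlp("Sweden is where Nokia opened an office.")
cleaned = remove_entities(doc)
print(cleaned)
```

Because the helper takes a `Doc` and returns a string, it could later be applied to a pandas text column, for example by mapping it over the results of `nlp.pipe`.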
