spacy

Extract entity from dataframe using spacy

依然范特西╮ submitted on 2020-07-23 09:14:11
Question: I read the contents of an Excel file using pandas:

    import pandas as pd
    df = pd.read_excel("FAM_template_Update 1911274_JS.xlsx")
    df

While trying to extract entities using spaCy:

    import spacy
    nlp = spacy.load("en_core_web_sm")
    doc = nlp(df)
    for entity in doc.ents:
        print(entity.text)

I got this error on line 3 (doc = nlp(df)):

    TypeError: Argument 'string' has incorrect type (expected str, got DataFrame)

Answer 1: This is expected, as spaCy is not prepared to deal with a DataFrame as-is. You need to
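The answer above is cut off, but the error message itself says that nlp expects a str. A minimal sketch of one way to feed text cells to spaCy one string at a time follows. The helper name extract_entities is hypothetical, and treating every non-string cell as empty text is an assumption, not something the source states.

```python
def extract_entities(nlp, values):
    """Return one list of (entity text, entity label) pairs per input value.

    `nlp` is a loaded spaCy pipeline; `values` is any iterable of cells,
    for example a single DataFrame column.
    """
    # spaCy only accepts str input, so non-string cells (numbers, NaN from
    # empty spreadsheet cells) are replaced by an empty string (assumption).
    texts = [value if isinstance(value, str) else "" for value in values]
    # nlp.pipe yields one Doc per text, in the same order as the input.
    return [[(ent.text, ent.label_) for ent in doc.ents] for doc in nlp.pipe(texts)]
```

With the objects from the question this might be called as extract_entities(nlp, df["Description"]), where "Description" is a placeholder for whichever column actually holds the text.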

Spacy Entity Rule doesn't work for cardinal (Social Security number)

时光怂恿深爱的人放手 submitted on 2020-07-22 18:36:11
Question: I used the EntityRuler to add a new label for Social Security numbers. I even set overwrite_ents=True, but it still doesn't recognize the number. I verified that the regular expression is correct, so I'm not sure what else I need to do. I also tried before="ner", with the same result.

    text = "My name is yuyyvb and I leave on 605 W Clinton Street. My social security 690-96-4032"
    nlp = spacy.load("en_core_web_sm")
    ruler = EntityRuler(nlp, overwrite_ents=True)
    ruler.add_patterns([{"label": "SSN", "pattern": [{"TEXT": {"REGEX": r"\d{3}[^\w]
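The question says the regular expression was verified. One way to do such a check independently of spaCy is with Python's re module, as sketched below. The pattern here is an assumed SSN shape (3-2-4 digits separated by hyphens), not the truncated pattern from the question, and the sketch says nothing about how spaCy tokenizes the text.

```python
import re

# Assumed SSN shape: three digits, two digits, four digits, hyphen-separated.
# This is an illustrative pattern, not the one from the question.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

text = ("My name is yuyyvb and I leave on 605 W Clinton Street. "
        "My social security 690-96-4032")

# Collect each match together with its character span in the text.
matches = [(m.group(), m.start(), m.end()) for m in SSN_PATTERN.finditer(text)]
print(matches)
```

A check like this only confirms that the pattern matches the raw string. A token-level REGEX in an EntityRuler pattern is applied to individual tokens, so it can be worth inspecting how the pipeline splits the number as well.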

Kernel Died when running Neuralcoref

冷暖自知 submitted on 2020-07-22 07:22:05
Question: I am trying to install neuralcoref, following the instructions given here. I created a Jupyter notebook and tried to run the following code.

    # Load your usual SpaCy model (one of SpaCy English models)
    import spacy
    nlp = spacy.load('en')

    # Add neural coref to SpaCy's pipe
    import neuralcoref
    neuralcoref.add_to_pipe(nlp)

    # You're done. You can now use NeuralCoref as you usually manipulate a SpaCy document annotations.
    doc = nlp(u'My sister has a dog. She loves him.')
    doc._.has_coref
    doc._.coref

Get position of word in sentence with spacy

荒凉一梦 submitted on 2020-07-18 08:55:11
Question: I'm aware of the basic spaCy workflow for getting various attributes from a document; however, I can't find a built-in function that returns the position (start/end) of a word within a sentence. Would anyone know if this is possible with spaCy?

Answer 1: These are available as attributes of the tokens in the sentences. The documentation says:

    idx  int  The character offset of the token within the parent document.
    i    int  The index of the token within the parent document.

    >>> import spacy
    >>> nlp = spacy.load(
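The two attributes quoted in the answer can be combined into start and end positions. The sketch below is one possible helper (the name token_positions is hypothetical); it derives the end offset as idx plus the length of the token text, which is an assumption about what "end" should mean here.

```python
def token_positions(doc):
    """Return (text, token index, start char, end char) for each token in a spaCy Doc."""
    # token.i is the token's index in the document, token.idx its character
    # offset; the end offset is computed from the length of the token text.
    return [(token.text, token.i, token.idx, token.idx + len(token.text))
            for token in doc]
```

It would be called on a processed document, for example token_positions(nlp("Some sentence here.")), with nlp being whatever pipeline is already loaded.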

I am getting an InvalidArchiveError in anaconda prompt when I am trying to install spacy. How to solve it?

☆樱花仙子☆ submitted on 2020-07-18 06:13:43
Question:

    InvalidArchiveError('Error with archive C:\Users\Sahaja Reddy\Anaconda3\pkgs\openssl-1.1.1g-he774522_0.conda. You probably need to delete and re-download or re-create this file. Message from libarchive was:\n\nCould not unlink (errno=22, retcode=-25, archive_p=1873471744752)')

Answer 1: I had the same problem. An open IPython instance was holding a handle on OpenSSL, so I wasn't able to delete the OpenSSL folder as mentioned above by Prayson. After closing all of my IPython &

spaCy and spaCy models in setup.py

六眼飞鱼酱① submitted on 2020-07-17 07:50:10
Question: In my project I have spaCy as a dependency in my setup.py, but I also want to add a default model. My attempt so far has been:

    install_requires=['spacy', 'en_core_web_sm'],
    dependency_links=['https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz#egg=en_core_web_sm'],

inside my setup.py, but both a regular pip install of my package and a pip install --process-dependency-links return:

    pip._internal.exceptions.DistributionNotFound: No
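No answer survives in this excerpt. As a hedged sketch only: a commonly used alternative to dependency_links is a PEP 508 direct reference ("name @ URL") inside install_requires, which reasonably recent pip versions understand. The names MODEL_URL and INSTALL_REQUIRES are hypothetical, the URL is the one from the question without its #egg fragment, and whether this suits the asker's project is not confirmed by the source.

```python
# setup.py fragment (sketch, not the asker's confirmed solution).
# URL taken from the question, with the "#egg=..." fragment dropped.
MODEL_URL = (
    "https://github.com/explosion/spacy-models/releases/download/"
    "en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz"
)

# PEP 508 direct reference: "<package name> @ <URL>".
INSTALL_REQUIRES = [
    "spacy",
    f"en_core_web_sm @ {MODEL_URL}",
]

# In the real setup.py the list would be handed to setuptools, e.g.
# setup(..., install_requires=INSTALL_REQUIRES)
```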

Removing named entities from a document using spacy

自闭症网瘾萝莉.ら submitted on 2020-07-08 20:36:15
Question: I have tried to remove the words that spaCy considers named entities from a document, i.e. removing "Sweden" and "Nokia" from the example string. I could not find a way around the problem that entities are stored as spans, so comparing them with single tokens from a spaCy Doc raises an error. In a later step, this process is supposed to become a function applied to several text documents stored in a pandas DataFrame. I would appreciate any kind of help and
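The excerpt contains no answer, so the following is only a sketch of one possible approach: instead of comparing entity spans with tokens, it checks each token's ent_type_ attribute, which is an empty string for tokens outside any entity. The function name remove_entities is hypothetical.

```python
def remove_entities(doc):
    """Rebuild the text of a spaCy Doc without tokens that belong to a named entity."""
    # token.ent_type_ is "" for tokens that are not part of an entity;
    # token.text_with_ws keeps the original trailing whitespace of each token.
    kept = [token.text_with_ws for token in doc if not token.ent_type_]
    return "".join(kept).strip()
```

For the later DataFrame step, a function like this could be applied to each processed document, for example remove_entities(nlp(text)) for every text in a column.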
