spacy

spaCy similarity method doesn't work correctly

落爺英雄遲暮 submitted on 2019-12-24 04:52:11
Question: I always get a lot of help from Stack Overflow. Thank you, as always. I am doing simple natural language processing with spaCy, filtering out words by measuring the similarity between them. I wrote and ran the following simple code from the spaCy documentation, but the result does not look like the one in the documentation.

    import spacy

    nlp = spacy.load('en_core_web_lg')
    tokens = nlp('dog cat banana')

    for token1 in tokens:
        for token2 in tokens:
            sim = token1.similarity(token2)
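For context, when tokens have word vectors, `similarity` is the cosine similarity of those vectors. The following is a minimal, dependency-free sketch of that computation. The three-dimensional vectors are made up for illustration and are not taken from `en_core_web_lg`; the helper name `cosine_similarity` is likewise illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # zero vector: treat similarity as 0
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, not taken from any spaCy model.
vectors = {
    "dog": [1.0, 0.9, 0.1],
    "cat": [0.9, 1.0, 0.2],
    "banana": [0.1, 0.2, 1.0],
}

# Same nested loop as in the question, over the toy vectors.
for w1, v1 in vectors.items():
    for w2, v2 in vectors.items():
        print(w1, w2, round(cosine_similarity(v1, v2), 3))
```

Each word compared with itself gives 1.0 and the result is symmetric in its two arguments, so output that violates either property would point to a problem outside the similarity measure itself.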

Matcher is returning some duplicate entries

喜你入骨 submitted on 2019-12-24 03:46:04
Question: I want the output to be ["good customer service","great ambience"], but I am getting ["good customer","good customer service","great ambience"], because the pattern also matches "good customer", and that phrase doesn't make sense on its own. How can I remove these kinds of duplicates?

    import spacy
    from spacy.matcher import Matcher

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("good customer service and great ambience")
    matcher = Matcher(nlp.vocab)
    # Create a pattern matching two tokens: adjective followed by
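One possible way to drop the shorter, overlapping matches is to keep only the longest match wherever matches overlap. The sketch below works on plain `(match_id, start, end)` tuples, the shape a `Matcher` call returns, so it runs without spaCy. The function name `keep_longest` and the sample tuples are hypothetical; the token indices are assumed from the sentence in the question.

```python
def keep_longest(matches):
    """Drop matches that overlap a longer match.

    `matches` is a list of (match_id, start, end) tuples with `end`
    exclusive, the shape spaCy's Matcher returns.
    """
    # Longest first; ties are broken by the earliest start.
    ordered = sorted(matches, key=lambda m: (-(m[2] - m[1]), m[1]))
    kept = []
    taken = set()  # token indices already covered by a kept match
    for match_id, start, end in ordered:
        if any(i in taken for i in range(start, end)):
            continue  # overlaps a longer match that was already kept
        kept.append((match_id, start, end))
        taken.update(range(start, end))
    return sorted(kept, key=lambda m: m[1])

# Hypothetical matches for "good customer service and great ambience":
# tokens 0-1 "good customer", 0-2 "good customer service", 4-5 "great ambience".
matches = [(1, 0, 2), (1, 0, 3), (1, 4, 6)]
print(keep_longest(matches))
```

With real matcher output, each kept tuple can be turned back into text with `doc[start:end].text`. Later spaCy releases also ship a helper with similar behaviour, `spacy.util.filter_spans`, which works on `Span` objects.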

Can't install spaCy on WinPython: “ModuleNotFoundError: No module named 'semver'”

梦想与她 submitted on 2019-12-24 03:01:47
Question: I'm trying to use a portable Python interpreter, so I installed WinPython and plan to deploy my application to other machines someday. My application needs the NLP module spaCy. I tried to install spaCy on WinPython (pip install -U spacy), but the installation fails. While pip installs the module's dependencies, the module "semver" cannot be installed:

    Collecting semver (from sputnik<0.10.0,>=0.9.2->spacy)
      Using cached semver-2.7.6.tar.gz
      Complete output from command

Need an approach for building a custom NER to extract the keywords below from any payslip format

痞子三分冷 submitted on 2019-12-24 00:38:21
Question: I am trying to build a generic extractor for the parameters below from any format of payslip: Name, His PostCode, Pay Date, Net Pay. The challenge is the variety of formats that may come in. I want to apply NER (spaCy) to learn these under the following entities:

    Name - PERSON
    His PostCode
    Pay Date - DATE
    Net Pay - MONEY

But I have been unsuccessful so far. I even tried to build a custom EntityMatcher for Postcode and Date, without success. I am looking for guidelines and an approach that will put me on the right path in

Load up previously saved NER models in SpaCy v1.1.2

筅森魡賤 submitted on 2019-12-23 23:16:58
Question: Whenever I try to load a previously saved model for spaCy NER, I get a core dump.

    if os.path.isfile( model_path ):
        ner.model.load( model_path )
    for itn in range( 5 ):
        random.shuffle( TRAIN_DATA )
        for raw_text, entity_offsets in TRAIN_DATA:
            doc = nlp.make_doc( raw_text )
            gold = GoldParse( doc, entities=entity_offsets )
            ner.update( doc, gold ) # <- Core dump occurs here

Dump report:

    7fb1b7459000-7fb1b7499000 rw-p 00000000 00:00 0
    [1] 23967 abort (core dumped)

Am I doing/loading it wrong?

Spacy: get position of word with entity tag

冷暖自知 submitted on 2019-12-23 21:14:38
Question: I'm trying to get the position of a word and its entity tag by iterating over a sentence, as per the spaCy docs:

    import spacy

    nlp = spacy.load('en')
    doc = nlp(u'London is a big city in the United Kingdom.')

    for ent in doc.ents:
        print(ent.label_, ent.text)
        # GPE London
        # GPE United Kingdom

I've tried to get the position of the word with ent.i and ent.idx, but neither works; both give the following error:

    AttributeError: 'spacy.tokens.span.Span' object has no attribute 'i'

Answer 1:
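The items in `doc.ents` are `Span` objects, while `i` and `idx` are `Token` attributes. A `Span` exposes `start` and `end` (token indices, `end` exclusive) and `start_char` and `end_char` (character offsets) instead. The sketch below reads those attributes off each entity. So that it runs without a model installed, it uses a hand-built stand-in for the parsed document; the helper name `entity_positions` and the stand-in values are illustrative, with offsets worked out by hand for the sentence in the question.

```python
from types import SimpleNamespace

def entity_positions(doc):
    """Return (text, label, start, end, start_char, end_char) per entity."""
    return [
        (ent.text, ent.label_, ent.start, ent.end, ent.start_char, ent.end_char)
        for ent in doc.ents
    ]

# Stand-in for a parsed doc; with spaCy this would be nlp(text).
text = "London is a big city in the United Kingdom."
fake_doc = SimpleNamespace(ents=[
    SimpleNamespace(text="London", label_="GPE",
                    start=0, end=1, start_char=0, end_char=6),
    SimpleNamespace(text="United Kingdom", label_="GPE",
                    start=7, end=9, start_char=28, end_char=42),
])

for row in entity_positions(fake_doc):
    print(row)
```

With a loaded pipeline, `entity_positions(nlp(text))` reads the same attributes from real entity spans. For per-token positions inside an entity, iterating over the span yields `Token` objects, which do have `i` and `idx`.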

spacy module install in conda

回眸只為那壹抹淺笑 submitted on 2019-12-23 05:04:10
Question: After installing spaCy with conda on a Windows 7 machine, I ran the following code:

    import spacy
    nlp = spacy.load('en')

The error I received is the following:

    Warning: no model found for 'en'
    Only loading the 'en' tokenizer.

After some searching, I ran the following on the command line (cmd):

    python -m spacy download en

The error I receive is:

    Traceback (most recent call last):
      File "C:\Users\vranjan2\AppData\Local\Continuum\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
        "__main__",

spacy module install in conda

时间秒杀一切 submitted on 2019-12-23 05:03:21
Question: After installing spaCy with conda on a Windows 7 machine, I ran the following code:

    import spacy
    nlp = spacy.load('en')

The error I received is the following:

    Warning: no model found for 'en'
    Only loading the 'en' tokenizer.

After some searching, I ran the following on the command line (cmd):

    python -m spacy download en

The error I receive is:

    Traceback (most recent call last):
      File "C:\Users\vranjan2\AppData\Local\Continuum\Anaconda3\lib\runpy.py", line 193, in _run_module_as_main
        "__main__",

Multi-Threaded NLP with Spacy pipe

随声附和 submitted on 2019-12-22 08:35:23
Question: I'm trying to apply the spaCy NLP (natural language processing) pipeline to a big text file such as a Wikipedia dump. Here is my code, based on the example in spaCy's documentation:

    from spacy.en import English

    input = open("big_file.txt")
    big_text= input.read()
    input.close()
    nlp= English()
    out = nlp.pipe([unicode(big_text, errors='ignore')], n_threads=-1)
    doc = out.next()

spaCy applies all NLP operations, such as POS tagging and lemmatizing, at once. It is like a pipeline for NLP that takes care of
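`nlp.pipe` consumes an iterable of texts and yields one `Doc` per text, so wrapping the whole file in a one-element list gives it a single item to work on. The sketch below, which assumes the file can reasonably be split on line boundaries (the question does not say how the dump is structured), groups lines into smaller chunks that could be streamed into `nlp.pipe`. The helper name `iter_chunks` and its `max_chars` parameter are illustrative, and an in-memory buffer stands in for the real file.

```python
import io

def iter_chunks(lines, max_chars=10000):
    """Group consecutive lines into chunks of roughly max_chars characters.

    A single line longer than max_chars is emitted as its own chunk.
    """
    buffer, size = [], 0
    for line in lines:
        if buffer and size + len(line) > max_chars:
            yield "".join(buffer)
            buffer, size = [], 0
        buffer.append(line)
        size += len(line)
    if buffer:
        yield "".join(buffer)

# Stand-in for open("big_file.txt"): any iterable of lines works.
fake_file = io.StringIO("first line\nsecond line\nthird line\n")
chunks = list(iter_chunks(fake_file, max_chars=25))
print(chunks)
```

With a loaded pipeline, something like `for doc in nlp.pipe(iter_chunks(f)): ...` would then process the chunks as a stream instead of reading the whole file into one string.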

How to write a spaCy matcher for a POS regex

帅比萌擦擦* submitted on 2019-12-22 08:19:02
Question: spaCy has two features I'd like to combine: part-of-speech (POS) tagging and rule-based matching. How can I combine them in a neat way? For example, say the input is a single sentence and I'd like to verify that it meets some POS ordering condition, such as the verb coming after the noun (something like a noun**verb regex). The result should be true or false. Is that doable? Or is the matcher as specific as in the example? Can rule-based matching have POS rules? If not, here is my current plan: gather
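One way to sketch the "regex over POS tags" idea is to join each token's coarse tag (`token.pos_`) into a string and run an ordinary regular expression over it. The example below uses stand-in token objects so that it runs without a model; the function name `pos_sequence_matches` and the pattern are illustrative, not part of spaCy.

```python
import re
from types import SimpleNamespace

def pos_sequence_matches(tokens, pattern):
    """Return True if `pattern` occurs in the space-joined POS tags."""
    pos_string = " ".join(token.pos_ for token in tokens)
    return re.search(pattern, pos_string) is not None

# A NOUN, then any number of other tags, then a VERB.
NOUN_THEN_VERB = r"\bNOUN\b(?: \w+)* VERB\b"

# Stand-in tokens; with spaCy these would come from nlp("...").
sentence = [SimpleNamespace(pos_=p) for p in ["DET", "NOUN", "ADV", "VERB"]]
print(pos_sequence_matches(sentence, NOUN_THEN_VERB))
```

spaCy's `Matcher` patterns also accept a `POS` key (for example `{"POS": "NOUN"}`) together with `OP` quantifiers, so a similar ordering check can be written as a token pattern rather than a string regex.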