spacy

Is it possible to install SpaCy on a Raspberry Pi 4 running Raspbian Buster?

蹲街弑〆低调 submitted on 2020-02-25 03:39:28

Question: I have been stuck on installing SpaCy the entire day.

    sudo pip install -U spacy
    Looking in indexes: https://pypi.org/simple, https://www.piwheels.org/simple
    Collecting spacy
      Using cached https://files.pythonhosted...
    Installing build dependencies ... done
    Complete output from command python setup.py egg_info:
    Failed building wheel for blis
    ERROR: Failed to build one or more wheels
    Traceback (most recent call last):
      File "/tmp/pip-build-env-e4fo917j/lib/python3.7/site-packages/setuptools
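
The root cause is that blis has no prebuilt wheel for this ARM platform, so pip falls back to compiling it from source and fails. A commonly suggested workaround (a sketch, not verified on every Pi image) is to install the build prerequisites and force blis to compile with its architecture-generic kernels via the BLIS_ARCH environment variable:

    # Install the compiler toolchain and Python headers first.
    sudo apt-get update
    sudo apt-get install -y build-essential python3-dev

    # Force a from-source blis build with generic (non-x86) kernels.
    # "sudo -E" keeps BLIS_ARCH visible inside the root environment.
    BLIS_ARCH="generic" sudo -E pip3 install -U spacy --no-binary blis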

read corpus of text files in spacy

荒凉一梦 submitted on 2020-02-24 08:45:09

Question: All the examples that I see for using spacy just read in a single text file (that is small in size). How does one load a corpus of text files into spacy? I can do this with textacy by pickling all the text in the corpus:

    docs = textacy.io.spacy.read_spacy_docs('E:/spacy/DICKENS/dick.pkl', lang='en')
    for doc in docs:
        print(doc)

But I am not clear on how to use this generator object (docs) for further analysis. Also, I would rather use spacy, not textacy. spacy also fails to read in a single
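
spaCy itself does not ship a corpus loader; the usual pattern is to stream the file contents through nlp.pipe, which lazily yields ordinary Doc objects you can analyse directly instead of just printing. A minimal sketch (the directory path and the .txt glob are assumptions for illustration):

    import pathlib
    import spacy

    nlp = spacy.load('en')

    def iter_texts(corpus_dir):
        # Yield the raw text of every .txt file in the corpus directory.
        for path in sorted(pathlib.Path(corpus_dir).glob('*.txt')):
            yield path.read_text(encoding='utf-8')

    # nlp.pipe streams the texts and yields processed Doc objects lazily,
    # so the whole corpus never has to sit in memory at once.
    for doc in nlp.pipe(iter_texts('E:/spacy/DICKENS'), batch_size=50):
        print(len(doc), [ent.text for ent in doc.ents][:5])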

How to speed up spaCy lemmatization?

瘦欲@ submitted on 2020-02-24 05:10:46

Question: I'm using spaCy (version 2.0.11) for lemmatization in the first step of my NLP pipeline, but unfortunately it's taking a very long time. It is clearly the slowest part of my processing pipeline and I want to know if there are improvements I could be making. I am using a pipeline as:

    nlp.pipe(docs_generator, batch_size=200, n_threads=6, disable=['ner'])

on an 8-core machine, and I have verified that the machine is using all the cores. On a corpus of about 3 million short texts totaling almost
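
If only lemmas are needed, the parser is usually the dominant cost. In spaCy 2.x the English lemmatizer relies on POS tags from the tagger, so a common approach (a sketch, not a guaranteed fix for this corpus) is to disable the parser as well as ner and keep only the tagger; note also that in many spaCy 2.0.x releases the n_threads argument had little or no effect:

    import spacy

    # Keep the tagger (needed for POS-based lemmas); drop parser and ner.
    nlp = spacy.load('en', disable=['parser', 'ner'])

    def lemmatize_all(texts):
        # Stream texts through the reduced pipeline and yield lemma lists.
        for doc in nlp.pipe(texts, batch_size=1000):
            yield [token.lemma_ for token in doc]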

Spacy to extract specific noun phrase

邮差的信 submitted on 2020-02-20 07:44:32

Question: Can I use spacy in Python to find NPs with specific neighbors? I want noun phrases from my text that have a verb before and after them.

Answer 1: You can merge the noun phrases (so that they do not get tokenized separately), analyse the dependency parse tree, and check the POS of the neighbouring tokens.

    >>> import spacy
    >>> nlp = spacy.load('en')
    >>> sent = u'run python program run, to make this work'
    >>> parsed = nlp(sent)
    >>> list(parsed.noun_chunks)
    [python program]
    >>> for noun_phrase in list(parsed
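
A fuller sketch of that idea, using the retokenizer API added in spaCy 2.1 (the sample sentence is illustrative; on spaCy 2.0 you would call span.merge() instead):

    import spacy

    nlp = spacy.load('en')
    doc = nlp(u'run python program run, to make this work')

    # Merge each noun chunk into a single token so a whole phrase
    # can be compared against its immediate neighbours.
    with doc.retokenize() as retokenizer:
        for chunk in list(doc.noun_chunks):
            retokenizer.merge(chunk)

    # Keep merged phrases whose immediate neighbours are both verbs.
    for i, token in enumerate(doc):
        if token.pos_ not in ('NOUN', 'PROPN'):
            continue
        prev_is_verb = i > 0 and doc[i - 1].pos_ == 'VERB'
        next_is_verb = i + 1 < len(doc) and doc[i + 1].pos_ == 'VERB'
        if prev_is_verb and next_is_verb:
            print(token.text)  # e.g. "python program" between the two "run"s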

Spacy, Strange similarity between two sentences

大兔子大兔子 submitted on 2020-02-20 07:22:27

Question: I have downloaded the en_core_web_lg model and am trying to find the similarity between two sentences:

    nlp = spacy.load('en_core_web_lg')
    search_doc = nlp("This was very strange argument between american and british person")
    main_doc = nlp("He was from Japan, but a true English gentleman in my eyes, and another one of the reasons as to why I liked going to school.")
    print(main_doc.similarity(search_doc))

This returns a very strange value: 0.9066019751888448. These two sentences should not be 90% similar
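
Doc.similarity here is the cosine similarity of the two documents' averaged word vectors, and averaging over many common words (was, and, the, to...) pulls almost any pair of ordinary sentences close together. One hedged way to get a more discriminative score (a sketch, not an official spaCy recipe) is to compare only the content words:

    import spacy

    nlp = spacy.load('en_core_web_lg')

    def content_doc(text):
        # Drop stop words and punctuation, keep tokens that have vectors,
        # and re-run the pipeline on what remains.
        doc = nlp(text)
        kept = [t.text for t in doc
                if not t.is_stop and not t.is_punct and t.has_vector]
        return nlp(' '.join(kept))

    a = content_doc("This was very strange argument between american and british person")
    b = content_doc("He was from Japan, but a true English gentleman in my eyes, and "
                    "another one of the reasons as to why I liked going to school.")
    print(a.similarity(b))  # usually noticeably lower than the raw 0.90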

Converting Spacy Training Data format to Spacy CLI Format (for blank NER)

别来无恙 submitted on 2020-02-10 19:59:41

Question: This is the classic training format:

    TRAIN_DATA = [
        ("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]}),
        ("I like London and Berlin.", {"entities": [(7, 13, "LOC"), (18, 24, "LOC")]}),
    ]

I used to train with code, but as I understand it, training works better with the CLI train method. However, my data is in the format above. I have found code snippets for this type of conversion, but every one of them performs spacy.load('en') rather than going with blank, which made me think: are they training
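
For the conversion itself, a blank pipeline is enough, because only the tokenizer is needed to align the character offsets to tokens. A sketch using docs_to_json (available from spaCy 2.1; the output file name is an assumption):

    import json
    import spacy
    from spacy.gold import docs_to_json  # spaCy 2.x location

    TRAIN_DATA = [
        ("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]}),
        ("I like London and Berlin.", {"entities": [(7, 13, "LOC"), (18, 24, "LOC")]}),
    ]

    # spacy.blank('en') gives tokenization only -- no pretrained weights
    # are involved, so nothing pretrained leaks into the converted data.
    nlp = spacy.blank('en')

    docs = []
    for text, annot in TRAIN_DATA:
        doc = nlp(text)
        # char_span returns None if the offsets do not line up with
        # token boundaries; real data should check for that.
        doc.ents = [doc.char_span(start, end, label=label)
                    for start, end, label in annot['entities']]
        docs.append(doc)

    # The CLI's JSON training format is a list of "document" dicts.
    with open('train.json', 'w', encoding='utf-8') as f:
        json.dump([docs_to_json(docs)], f)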

Package spacy model

≯℡__Kan透↙ submitted on 2020-01-28 11:25:25

Question: I want to include the spacy model de_core_news_sm in a Python package. Here is my project: https://github.com/michaelhochleitner/package_de_core_news_sm . I package and install the project with the following commands:

    python setup.py sdist bdist_wheel
    pip install dist/example-pkg-mh-0.0.1.tar.gz

I want to import the module example_pkg.import-model.py :

    $ python
    >>> import example_pkg.import_model
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/mh
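
The traceback is cut off, but the usual sticking point is that a spaCy model is itself a pip package, so it can be declared as a dependency rather than bundled into your own package. A setup.py sketch using a PEP 508 direct reference (needs a reasonably recent pip; the 2.2.5 model version is an assumption and should be pinned to match your spaCy version; note that PyPI rejects direct references, so this suits locally-built or privately-hosted packages):

    # setup.py
    from setuptools import setup, find_packages

    MODEL_URL = (
        "https://github.com/explosion/spacy-models/releases/download/"
        "de_core_news_sm-2.2.5/de_core_news_sm-2.2.5.tar.gz"
    )

    setup(
        name="example-pkg-mh",
        version="0.0.1",
        packages=find_packages(),
        install_requires=[
            "spacy>=2.2",
            # PEP 508 direct reference: pip installs the model like
            # any other dependency.
            "de_core_news_sm @ " + MODEL_URL,
        ],
    )

Once installed this way, the model loads with spacy.load('de_core_news_sm'), or directly via import de_core_news_sm; nlp = de_core_news_sm.load().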