spacy

spacy Can't find model 'en_core_web_sm' on Windows 10 and Python 3.5.3 :: Anaconda custom (64-bit)

萝らか妹 submitted on 2019-12-01 04:09:26
What is the difference between spacy.load('en_core_web_sm') and spacy.load('en')? This link explains the different model sizes, but I am still not clear on how spacy.load('en_core_web_sm') and spacy.load('en') differ. spacy.load('en') runs fine for me, but spacy.load('en_core_web_sm') throws an error. I have installed spaCy as below. When I go to a Jupyter notebook and run nlp = spacy.load('en_core_web_sm'), I get the error below: OSError Traceback (most recent call last) <ipython-input-4-b472bef03043> in <module>() 1 #
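In spaCy 2.x the two calls resolve the same kind of installed model package: 'en' is only a shortcut link that the download command creates, pointing at a package such as en_core_web_sm, while spacy.load('en_core_web_sm') needs that package itself to be installed under that full name. A minimal sketch, assuming spaCy 2.x and that the model is fetched with the spacy download CLI:

```python
# Minimal sketch (spaCy 2.x). In a shell / Anaconda prompt:
#   python -m spacy download en_core_web_sm   # installs the model package itself
#   python -m spacy download en               # installs it AND creates the "en" shortcut link
import spacy

nlp_by_package = spacy.load("en_core_web_sm")  # resolves the installed package by its full name
nlp_by_link = spacy.load("en")                 # resolves whatever the "en" shortcut link points to
print(nlp_by_package.meta["name"], nlp_by_link.meta["name"])
```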

NLP: A Collection of Chinese and English Word Segmentation Tools and an Introduction to Their Basic Use

只谈情不闲聊 submitted on 2019-11-30 22:21:21
I. Chinese word segmentation tools: (1) Jieba (2) SnowNLP (3) THULAC (4) PyNLPIR (5) StanfordCoreNLP, e.g. from stanfordcorenlp import StanfordCoreNLP; with StanfordCoreNLP(r'E:\Users\Eternal Sun\PycharmProjects\1\venv\Lib\stanford-corenlp-full-2018-10-05', lang='zh') as nlp: print("StanfordCoreNLP tokenization:\n", nlp.word_tokenize(Chinese)) (6) HanLP. The segmentation results are shown below. II. English tokenization tools: 1. NLTK: the difference between the two approaches is that splitting into sentences first and then tokenizing preserves each sentence's independence, i.e. the result is a two-dimensional list, whereas tokenizing the text directly produces a flat one-dimensional list; results shown below. 2. spaCy 3. StanfordCoreNLP: segmentation results. Source: oschina Link: https://my.oschina.net/u/3793864/blog/3056365
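To make the NLTK remark above concrete, here is a small sketch (hypothetical example text; NLTK's punkt data may need to be downloaded first) showing that sentence-splitting before tokenizing yields a list of lists, while tokenizing the whole text at once yields a flat list:

```python
# Sentence-split-then-tokenize vs. direct tokenization with NLTK.
# nltk.download('punkt') may be required on first use.
from nltk.tokenize import sent_tokenize, word_tokenize

text = "The door is brown. The horse is running."

per_sentence = [word_tokenize(sent) for sent in sent_tokenize(text)]  # list of lists
flat = word_tokenize(text)                                            # one flat list

print(per_sentence)  # [['The', 'door', 'is', 'brown', '.'], ['The', 'horse', 'is', 'running', '.']]
print(flat)          # ['The', 'door', 'is', 'brown', '.', 'The', 'horse', 'is', 'running', '.']
```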

Spacy language model installation in Python returns ImportError from _mklinit (ImportError: DLL load failed: The specified module could not be found.)

…衆ロ難τιáo~ submitted on 2019-11-30 21:15:38
Question: I am currently trying to set up spaCy on my system. When I downloaded the module, no errors were shown. However, upon downloading a language model (specifically, the English one), I got an error. The output is as follows: Traceback (most recent call last): File "C:\ProgramData\Anaconda3\lib\runpy.py", line 183, in _run_module_as_main mod_name, mod_spec, code = _get_module_details(mod_name, _Error) File "C:\ProgramData\Anaconda3\lib\runpy.py", line 142, in _get_module_details return _get
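The _mklinit symbol in that traceback belongs to NumPy's MKL-linked build rather than to spaCy, so a reasonable first check (a hedged diagnostic sketch, not a guaranteed fix) is whether NumPy itself imports cleanly in the same Anaconda environment:

```python
# If this import raises the same "DLL load failed" error, the problem is the
# NumPy/MKL installation in the environment, not the spaCy language model;
# reinstalling numpy (and mkl) in that environment is the usual next step.
import numpy as np

print(np.__version__)
np.show_config()  # shows which BLAS/LAPACK/MKL libraries NumPy was built against
```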

Unable to load the spacy model 'en_core_web_lg' on Google colab

自古美人都是妖i submitted on 2019-11-30 15:57:53
Question: I am using spaCy in Google Colab to build an NER model, for which I downloaded the spaCy 'en_core_web_lg' model using import spacy.cli spacy.cli.download("en_core_web_lg") and I get a message saying ✔ Download and installation successful You can now load the model via spacy.load('en_core_web_lg') However, when I then try to load the model with nlp = spacy.load('en_core_web_lg') the following error is printed: OSError: [E050] Can't find model 'en_core_web_lg'. It doesn't seem to be a shortcut
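A workaround often reported for Colab (sketched below, not guaranteed): after spacy.cli.download finishes, either restart the runtime so spacy.load() can see the newly installed package, or import the model package directly in the same session:

```python
# Sketch of a common Colab workaround: load the freshly downloaded model by
# importing its package directly instead of going through spacy.load().
import importlib

import spacy.cli

spacy.cli.download("en_core_web_lg")
importlib.invalidate_caches()     # make the just-installed package importable

import en_core_web_lg             # the downloaded model is a regular Python package
nlp = en_core_web_lg.load()
print(nlp.meta["name"], nlp.meta["version"])
```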

How to generate bi/tri-grams using spacy/nltk

狂风中的少年 submitted on 2019-11-30 13:26:53
Question: The input text is always a list of dish names, each with 1-3 adjectives and a noun. Inputs: thai iced tea, spicy fried chicken, sweet chili pork, thai chicken curry. Outputs: thai tea, iced tea; spicy chicken, fried chicken; sweet pork, chili pork; thai chicken, chicken curry, thai curry. Basically, I am looking to parse the sentence tree and generate bi-grams by pairing an adjective with the noun, and I would like to achieve this with spaCy or NLTK. Answer 1: I used spaCy 2.0 with the English model.
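A minimal sketch of one way to do this with spaCy (assuming each dish name ends in its head noun, and using the small English model): pair every earlier token with that final noun. This reproduces the modifier+noun pairs listed above; pairs of two modifiers such as "thai chicken" would need an extra combination step.

```python
# Pair each leading token of a dish name with the final (head) noun.
import spacy

nlp = spacy.load("en_core_web_sm")

def dish_bigrams(dish):
    tokens = [t for t in nlp(dish) if not t.is_punct]
    head = tokens[-1]                                    # "tea", "chicken", "pork", "curry"
    return [t.text + " " + head.text for t in tokens[:-1]]

for dish in ["thai iced tea", "spicy fried chicken", "sweet chili pork", "thai chicken curry"]:
    print(dish, "->", dish_bigrams(dish))
# thai iced tea -> ['thai tea', 'iced tea']
# spicy fried chicken -> ['spicy chicken', 'fried chicken']
# ...
```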

How do I create gold data for TextCategorizer training?

老子叫甜甜 submitted on 2019-11-30 04:50:49
Question: I want to train a TextCategorizer model with the following (text, label) pairs. Label COLOR: The door is brown. The barn is red. The flower is yellow. Label ANIMAL: The horse is running. The fish is jumping. The chicken is asleep. I am copying the example code in the documentation for TextCategorizer. textcat = TextCategorizer(nlp.vocab) losses = {} optimizer = nlp.begin_training() textcat.update([doc1, doc2], [gold1, gold2], losses=losses, sgd=optimizer) The doc variables will presumably
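Against the spaCy 2.x API shown in the question, the gold objects are GoldParse instances whose cats dict scores every label with 1.0 (positive) or 0.0. A hedged sketch, assuming the labels are added to the TextCategorizer before training:

```python
# Sketch: build (doc, gold) pairs for TextCategorizer training in spaCy 2.x.
import spacy
from spacy.gold import GoldParse
from spacy.pipeline import TextCategorizer

nlp = spacy.blank("en")
textcat = TextCategorizer(nlp.vocab)
textcat.add_label("COLOR")
textcat.add_label("ANIMAL")
nlp.add_pipe(textcat)

doc1 = nlp.make_doc("The door is brown.")
doc2 = nlp.make_doc("The horse is running.")
gold1 = GoldParse(doc1, cats={"COLOR": 1.0, "ANIMAL": 0.0})
gold2 = GoldParse(doc2, cats={"COLOR": 0.0, "ANIMAL": 1.0})

optimizer = nlp.begin_training()
losses = {}
textcat.update([doc1, doc2], [gold1, gold2], losses=losses, sgd=optimizer)
print(losses)
```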

Remove a word in a span from SpaCy?

ぃ、小莉子 submitted on 2019-11-29 19:28:27
Question: I am parsing a sentence with spaCy as follows: import spacy nlp = spacy.load("en") span = nlp("This is some text.") I am wondering if there is a way to delete a word from the span while keeping the remaining words formatted like a sentence, such as del span[3], which would yield a sentence like "This is some." If some other method without spaCy could achieve the same effect, that would be great too. Answer 1: There is a workaround for that. The idea is that you create a numpy array from the doc,
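One workaround that avoids touching spaCy internals (a sketch, assuming spaCy 2.x): rebuild a new Doc from the tokens you want to keep. Note that the whitespace left by the removed token may need manual adjustment.

```python
# Rebuild the sentence without token index 3 ("text").
import spacy
from spacy.tokens import Doc

nlp = spacy.load("en")
doc = nlp("This is some text.")

drop = 3
words = [t.text for i, t in enumerate(doc) if i != drop]
spaces = [bool(t.whitespace_) for i, t in enumerate(doc) if i != drop]
new_doc = Doc(doc.vocab, words=words, spaces=spaces)

print(new_doc.text)  # "This is some ." -- the space before "." is left over from "some"
```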

Why does spaCy not preserve intra-word hyphens during tokenization like Stanford CoreNLP does?

扶醉桌前 submitted on 2019-11-29 16:40:09
spaCy version: 2.0.11. Python version: 3.6.5. OS: Ubuntu 16.04. My sample sentences: Marketing-Representative- won't die in car accident. or Out-of-box implementation Expected tokens: ["Marketing-Representative", "-", "wo", "n't", "die", "in", "car", "accident", "."] ["Out-of-box", "implementation"] spaCy tokens (default tokenizer): ["Marketing", "-", "Representative-", "wo", "n't", "die", "in", "car", "accident", "."] ["Out", "-", "of", "-", "box", "implementation"] I tried creating a custom tokenizer, but it doesn't handle all the edge cases that spaCy handles via tokenizer_exceptions (code below):
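One direction that keeps spaCy's tokenizer_exceptions (so "won't" still becomes "wo" + "n't") is to rebuild the tokenizer with the default rules but with the hyphen rule removed from the infix patterns. This is a hedged sketch: the exact text of the default infix pattern differs between spaCy versions, so the filter below may need adjusting.

```python
import spacy
from spacy.tokenizer import Tokenizer
from spacy.util import compile_prefix_regex, compile_suffix_regex, compile_infix_regex

nlp = spacy.load("en_core_web_sm")

# Keep every default infix pattern except the one that splits on hyphens
# between letters (its expanded form contains the hyphen alternation).
infixes = [p for p in nlp.Defaults.infixes if "-|–|—" not in p]

nlp.tokenizer = Tokenizer(
    nlp.vocab,
    rules=nlp.Defaults.tokenizer_exceptions,          # keeps "won't" -> "wo", "n't"
    prefix_search=compile_prefix_regex(nlp.Defaults.prefixes).search,
    suffix_search=compile_suffix_regex(nlp.Defaults.suffixes).search,
    infix_finditer=compile_infix_regex(infixes).finditer,
    token_match=nlp.tokenizer.token_match,
)

print([t.text for t in nlp("Out-of-box implementation won't die in car accident.")])
```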

Import error with spacy: “No module named en”

冷暖自知 submitted on 2019-11-29 14:17:22
Question: I'm having trouble using the Python spaCy library. It seems to be installed correctly, but at from spacy.en import English I get the following import error: Traceback (most recent call last): File "spacy.py", line 1, in <module> from spacy.en import English File "/home/user/CmdData/spacy.py", line 1, in <module> from spacy.en import English ImportError: No module named en I'm not very familiar with Python, but that's the standard import I saw online, and the library is installed: $ pip list |
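The traceback itself hints at the cause: the import is being resolved from /home/user/CmdData/spacy.py, i.e. a local file named spacy.py is shadowing the installed library. A quick check (note that from spacy.en import English is the spaCy 1.x style import):

```python
# If this prints a path inside your own project instead of site-packages,
# rename your local spacy.py (and delete any stale spacy.pyc), then retry.
import spacy
print(spacy.__file__)
```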