nltk

Machine learning with nltk: download fails with "Error connecting to server..."

拜拜、爱过 posted on 2020-04-10 18:20:13
NLTK, Python's natural language processing framework, is a commonly used package in machine learning, and you will run into quite a few problems while using it. I will share some of my experience with them. Today I'll talk about installation, specifically a download error that shows up during it.

    >>> import nltk
    >>> nltk.download()
    NLTK Downloader
    ---------------------------------------------------------------------------
        d) Download   l) List    c) Config   h) Help   q) Quit
    ---------------------------------------------------------------------------
    Downloader> l
    Packages:
    Error connecting to server: [Errno -2] Name or service not known

My guess is that this machine cannot reach the address of the download server. Check the nltk download configuration:

    Downloader> c
    Data Server:
      - URL: <http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml>
      - 3 Package Collections
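"[Errno -2] Name or service not known" is a name-resolution failure, so a quick first check is whether the host in the configured index URL resolves at all. The following is a minimal sketch using only the standard library. The helper names (`index_host`, `can_resolve`, `report`) are illustrative and not part of NLTK, and `NEW_INDEX` is assumed to be the GitHub-hosted index that newer NLTK releases report.

```python
import socket
from urllib.parse import urlparse

# URL shown by the downloader's config screen in the session above.
OLD_INDEX = "http://nltk.googlecode.com/svn/trunk/nltk_data/index.xml"
# Assumption: the GitHub-hosted index used by newer NLTK releases.
NEW_INDEX = "https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml"


def index_host(url):
    """Return the host name part of an index URL."""
    return urlparse(url).hostname


def can_resolve(host):
    """Return True if DNS resolution for the host succeeds."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        # This is the failure class behind "Name or service not known".
        return False


def report(urls=(OLD_INDEX, NEW_INDEX)):
    """Print whether each index host resolves (requires network access)."""
    for url in urls:
        host = index_host(url)
        status = "resolves" if can_resolve(host) else "does NOT resolve"
        print(f"{host}: {status}")
```

Calling `report()` on the affected machine shows which host fails. If only the old host is the problem, upgrading nltk or constructing `nltk.downloader.Downloader(server_index_url=NEW_INDEX)` is one way to point the downloader at a reachable index.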

Installing NLTK and nltk_data under Python 3 (Ubuntu)

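萝らか妹 posted on 2020-04-10 16:38:21
While using nltk, Python's powerful third-party library, for some natural language processing work, I ran into a few difficulties and finally solved them after some tinkering. I'm recording the process here as a memo.

There are plenty of nltk installation tutorials online, but in my testing they all seem to target Python 2 and fall short for Python 3. The main problem is that the nltk_data packages shared by earlier authors are not compatible with Python 3.

So my solution is: download the gh-pages branch from https://github.com/nltk/nltk_data; the packages directory inside it holds the resources we need. (Note: this approach still worked as of March 24, 2016.)

The details are as follows:

1. Install nltk. As of today the installed version is nltk 3.2. With tools like pip available, installing these libraries is very simple:

    pip install nltk

The official installation instructions also mention numpy, "an open-source numerical computing extension for Python that rivals Matlab", which may come in handy later:

    pip install numpy

Source: oschina
Link: https://my.oschina.net/u/3611008/blog/2980365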
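The manual route described above can be sketched as follows. It assumes the gh-pages branch has already been downloaded and extracted, and that the target is `~/nltk_data`, one of the locations NLTK searches by default. `install_packages` is a hypothetical helper name, and `dirs_exist_ok` needs Python 3.8 or later.

```python
import os
import shutil


def install_packages(checkout_dir, data_dir=None):
    """Copy each category folder under <checkout_dir>/packages into data_dir.

    Returns the sorted list of category names that were copied.
    """
    src = os.path.join(checkout_dir, "packages")
    if data_dir is None:
        # Assumption: the per-user default search location.
        data_dir = os.path.expanduser("~/nltk_data")
    if not os.path.isdir(src):
        raise FileNotFoundError(f"no packages directory under {checkout_dir}")

    copied = []
    for name in sorted(os.listdir(src)):
        category = os.path.join(src, name)
        if os.path.isdir(category):  # e.g. corpora, tokenizers, taggers
            shutil.copytree(category, os.path.join(data_dir, name),
                            dirs_exist_ok=True)
            copied.append(name)
    return copied
```

The resources in that branch are shipped as zip archives; depending on the resource and the NLTK version, some may need to be unzipped in place before they load.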

Installing nltk, a Python NLP library

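て烟熏妆下的殇ゞ posted on 2020-04-10 15:11:19
Several third-party libraries are available for natural language processing in Python:

· NLTK (the Natural Language Toolkit for Python) handles tasks such as tokenization, lemmatization, stemming, parsing, and POS tagging. It has tools for almost every NLP task.
· spaCy is NLTK's main competitor. The two libraries can be used for the same tasks.
· Scikit-learn provides a large library for machine learning and also includes tools for text preprocessing.
· Gensim is a toolkit for topic and vector space modeling and for document-collection similarity.
· Pattern's general purpose is to serve as a web mining module, so it supports natural language processing (NLP) only as a secondary task.
· Polyglot is another Python toolkit for NLP. It is not very popular, but it can also be used for a variety of NLP tasks.

Let's start with nltk.

1. Installing NLTK

In short, it installs like any other third-party Python library. Just run this on the command line: pip install nltk

2. It won't run?

Once installation is complete, you might try the following code to do a simple tokenization of a piece of English text:

    import nltk
    text=nltk.word_tokenize("PierreVinken , 59 years old , will join as a nonexecutive director on Nov. 29 .")
    print(text)

Running it produces the following error:

    ...
    raise LookupError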
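A LookupError at this point usually means the tokenizer's data files are not installed yet. One common pattern is to catch the error, download the missing resource, and retry once. This is a sketch of that pattern only: `tokenize_with_retry` and `demo` are hypothetical names, and the resource name `"punkt"` is an assumption (newer NLTK releases may ask for `"punkt_tab"` instead; the LookupError message names the resource it wants).

```python
def tokenize_with_retry(text, tokenize, download, resource="punkt"):
    """Tokenize text; on LookupError, download `resource` and retry once.

    `tokenize` and `download` are passed in so the pattern is visible
    without depending on a particular NLTK version.
    """
    try:
        return tokenize(text)
    except LookupError:
        # Assumption: the missing data is the tokenizer model named here.
        download(resource)
        return tokenize(text)


def demo():
    """Wire the helper to NLTK (needs nltk installed and network access)."""
    import nltk

    sentence = "will join as a nonexecutive director on Nov. 29 ."
    tokens = tokenize_with_retry(sentence, nltk.word_tokenize, nltk.download)
    print(tokens)
```

Retrying only once keeps a genuinely broken setup, such as no network, from looping; the second LookupError simply propagates.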

How to use spacy to do Name Entity recognition on CSV file

夙愿已清 posted on 2020-04-07 08:08:18
Question

I have tried many ways to do named entity recognition on a column in my CSV file. I tried ne_chunk, but I am unable to get the result of ne_chunk into columns like so:

    ID  STORY                                  PERSON  NE  NP  NN  VB  GE
    1   Washington, a police officer James...  1       0   0   0   0   1

Instead, after using this code,

    news=pd.read_csv("news.csv")
    news['tokenize'] = news.apply(lambda row: nltk.word_tokenize(row['STORY']), axis=1)
    news['pos_tags'] = news.apply(lambda row: nltk.pos_tag(row['tokenize']), axis=1)
    news['entityrecog'
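One possible way to get per-row counts into columns is to count the labelled subtrees that ne_chunk returns and then spread those counts over new columns. This is a sketch under assumptions, not the accepted answer to the question: `count_entity_labels` and `add_entity_columns` are hypothetical helpers, and the label set (PERSON, GPE, ORGANIZATION) is an illustrative choice, not the asker's exact column list.

```python
from collections import Counter


def count_entity_labels(chunked):
    """Count entity labels among the top-level children of an ne_chunk result.

    Entity chunks are subtrees that expose .label(); ordinary tokens are
    plain (word, tag) tuples and are skipped.
    """
    counts = Counter()
    for node in chunked:
        if hasattr(node, "label"):
            counts[node.label()] += 1
    return counts


def add_entity_columns(news, labels=("PERSON", "GPE", "ORGANIZATION")):
    """Add one count column per label to a DataFrame with a STORY column.

    Needs nltk with its tokenizer, tagger, and chunker data installed.
    """
    import nltk

    def row_counts(story):
        tagged = nltk.pos_tag(nltk.word_tokenize(story))
        return count_entity_labels(nltk.ne_chunk(tagged))

    counts = news["STORY"].apply(row_counts)
    for label in labels:
        # A Counter returns 0 for labels that never occurred in the row.
        news[label] = counts.apply(lambda c, label=label: c[label])
    return news
```

Keeping the counting step separate from the pandas step makes it easy to check on a single sentence before applying it to the whole file.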

How to use spacy to do Name Entity recognition on CSV file

可紊 posted on 2020-04-07 08:06:14
Question

I have tried many ways to do named entity recognition on a column in my CSV file. I tried ne_chunk, but I am unable to get the result of ne_chunk into columns like so:

    ID  STORY                                  PERSON  NE  NP  NN  VB  GE
    1   Washington, a police officer James...  1       0   0   0   0   1

Instead, after using this code,

    news=pd.read_csv("news.csv")
    news['tokenize'] = news.apply(lambda row: nltk.word_tokenize(row['STORY']), axis=1)
    news['pos_tags'] = news.apply(lambda row: nltk.pos_tag(row['tokenize']), axis=1)
    news['entityrecog'

How to use spacy to do Name Entity recognition on CSV file

流过昼夜 posted on 2020-04-07 08:06:02
Question

I have tried many ways to do named entity recognition on a column in my CSV file. I tried ne_chunk, but I am unable to get the result of ne_chunk into columns like so:

    ID  STORY                                  PERSON  NE  NP  NN  VB  GE
    1   Washington, a police officer James...  1       0   0   0   0   1

Instead, after using this code,

    news=pd.read_csv("news.csv")
    news['tokenize'] = news.apply(lambda row: nltk.word_tokenize(row['STORY']), axis=1)
    news['pos_tags'] = news.apply(lambda row: nltk.pos_tag(row['tokenize']), axis=1)
    news['entityrecog'

How to use spacy to do Name Entity recognition on CSV file

纵然是瞬间 posted on 2020-04-07 08:05:06
Question

I have tried many ways to do named entity recognition on a column in my CSV file. I tried ne_chunk, but I am unable to get the result of ne_chunk into columns like so:

    ID  STORY                                  PERSON  NE  NP  NN  VB  GE
    1   Washington, a police officer James...  1       0   0   0   0   1

Instead, after using this code,

    news=pd.read_csv("news.csv")
    news['tokenize'] = news.apply(lambda row: nltk.word_tokenize(row['STORY']), axis=1)
    news['pos_tags'] = news.apply(lambda row: nltk.pos_tag(row['tokenize']), axis=1)
    news['entityrecog'

nltk download error13 Permission denied mac

天涯浪子 posted on 2020-03-26 10:18:19
Question

I ran the command nltk.download() after typing python3 in the terminal on Mac OS X. Then I got this error:

    PermissionError: [Errno 13] Permission denied: '/Users/shreya/nltk_data/corpora/panlex_swadesh.zip'

This is what I got in the terminal:

    >>> nltk.download()
    showing info https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/index.xml
    Exception in thread Thread-1:
    Traceback (most recent call last):
      File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line
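One workaround for a download directory that the current user cannot write to is to pick a directory that is writable and pass it to the downloader explicitly. The sketch below assumes that approach. `pick_writable_dir` and `download_to_writable` are hypothetical helper names, and the check is at directory level only, so files inside `~/nltk_data` that were created with sudo can still fail; fixing their ownership is the more direct repair in that case.

```python
import os
import tempfile


def pick_writable_dir(candidates):
    """Return the first candidate that can be created and written to, else None."""
    for path in candidates:
        try:
            os.makedirs(path, exist_ok=True)
        except OSError:
            # Covers PermissionError and paths blocked by an existing file.
            continue
        if os.access(path, os.W_OK | os.X_OK):
            return path
    return None


def download_to_writable(resource):
    """Download `resource` into a writable directory (needs nltk and network)."""
    import nltk

    target = pick_writable_dir([
        os.path.expanduser("~/nltk_data"),
        os.path.join(tempfile.gettempdir(), "nltk_data"),  # fallback location
    ])
    if target is None:
        raise RuntimeError("no writable directory found for nltk data")

    nltk.download(resource, download_dir=target)
    # Make sure NLTK also searches the directory that was actually used.
    if target not in nltk.data.path:
        nltk.data.path.append(target)
    return target
```

Downloading named resources this way also avoids the GUI thread seen in the traceback, since passing a resource name to `nltk.download` skips the interactive window.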

malt parser gives assertion error when using it with nltk

佐手、 posted on 2020-03-16 06:35:34
Question

I am using malt parser with Python nltk. I have successfully downloaded the training data and updated to the latest nltk. When I call the malt parser it gives me an assertion error. Below is the Python code, including the traceback.

    mp = MaltParser("C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1","C:/Users/mustufain/Desktop/Python Files/maltparser-1.7.2",additional_java_args=['-Xmx512m'])
    Traceback (most recent call last):
      File "<pyshell#10>", line 1, in <module>
        mp =

malt parser gives assertion error when using it with nltk

人盡茶涼 posted on 2020-03-16 06:35:09
Question

I am using malt parser with Python nltk. I have successfully downloaded the training data and updated to the latest nltk. When I call the malt parser it gives me an assertion error. Below is the Python code, including the traceback.

    mp = MaltParser("C:/Users/mustufain/Desktop/Python Files/maltparser-1.8.1","C:/Users/mustufain/Desktop/Python Files/maltparser-1.7.2",additional_java_args=['-Xmx512m'])
    Traceback (most recent call last):
      File "<pyshell#10>", line 1, in <module>
        mp =