Question
I have the following code, and I made sure the file's name and extension are correct. However, I still get the error shown below.
I saw that someone asked a similar question here on Stack Overflow and read the answer, but it did not help me:
Failed to load a .bin.gz pre trained words2vecx
Any suggestions how to fix this?
Input:
import gensim
word2vec_path = "GoogleNews-vectors-negative300.bin.gz"
word2vec = gensim.models.KeyedVectors.load_word2vec_format(word2vec_path, binary=True)
Output:
OSError: Not a gzipped file (b've')
Answer 1:
The problem is that the file you downloaded is not actually a gzip file. If you check its size, it may be only a few KB (that is what happened to me when I downloaded it from this GitHub link, because the file there requires git-lfs).
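A quick way to confirm this diagnosis is to look at the file's size and first bytes before handing it to gensim. The sketch below is only an illustration: the helper names `peek_header`, `looks_like_gzip`, and `describe_download` are made up for this example and are not part of gensim. It relies on the fact that every gzip stream starts with the two bytes `1f 8b`. A Git LFS pointer file, by contrast, is a small text file that begins with `version https://git-lfs.github.com/spec/v1`, which would be consistent with the `b've'` in the error message.

```python
import os

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def peek_header(path, n=64):
    """Return the first n bytes of the file at path."""
    with open(path, "rb") as f:
        return f.read(n)


def looks_like_gzip(path):
    """Return True if the file starts with the gzip magic number."""
    return peek_header(path, 2) == GZIP_MAGIC


def describe_download(path):
    """Summarize size and header so a bad download is easy to spot."""
    size = os.path.getsize(path)
    if looks_like_gzip(path):
        return f"{path}: {size} bytes, gzip header found"
    # Show the first bytes so a text placeholder (e.g. an LFS pointer) is visible.
    return f"{path}: {size} bytes, NOT gzip, starts with {peek_header(path, 20)!r}"
```

Calling `print(describe_download("GoogleNews-vectors-negative300.bin.gz"))` before loading should make the situation obvious: a genuine archive reports a gzip header and a size far larger than a few KB, while a placeholder file reports its leading text instead.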
Here is an alternate solution to resolve this issue:
Download the model by running the command below in your terminal:
wget -c "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz"
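Then load the model with gensim as usual. gensim can read the gzipped file directly, so the file name should match what wget saved (if you decompress it first, pass the resulting .bin file instead):
from gensim import models
# binary=True because the GoogleNews vectors are stored in the binary word2vec format
w = models.KeyedVectors.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin.gz', binary=True)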
Hope this helps you!!
Source: https://stackoverflow.com/questions/49410113/oserror-not-a-gzipped-file-bve-python