Question
Is there a pre-trained doc2vec model with a large data set, like Wikipedia or similar?
Answer 1:
I don't know of any good one. There's one linked from this project, but:
- it's based on a custom fork of an older gensim, so it won't load in recent code
- it's not clear what parameters or data it was trained with, and the associated paper may have made uninformed choices about the effects of parameters
- it doesn't appear to be the right size to include actual doc-vectors for either Wikipedia articles (4-million-plus) or article paragraphs (tens-of-millions), or a significant number of word-vectors, so it's unclear what's been discarded
While it takes a long time and a significant amount of RAM, there is a Jupyter notebook included in gensim that demonstrates creating a Doc2Vec model from Wikipedia:
https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/doc2vec-wikipedia.ipynb
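For reference, here is a minimal sketch in the spirit of that notebook, assuming gensim 4.x and a locally downloaded dump named `enwiki-latest-pages-articles.xml.bz2` (the filename and all hyperparameters below are illustrative placeholders, not values taken from the notebook):

```python
# Sketch only: streams Wikipedia articles into a Doc2Vec model.
# Assumes gensim 4.x, where WikiCorpus tokens are str (older versions yield bytes).
import multiprocessing

from gensim.corpora.wikicorpus import WikiCorpus
from gensim.models.doc2vec import Doc2Vec, TaggedDocument


class TaggedWikiCorpus:
    """Stream Wikipedia articles as TaggedDocuments, tagged by article title."""

    def __init__(self, dump_path):
        self.wiki = WikiCorpus(dump_path)
        self.wiki.metadata = True  # get_texts() now yields (tokens, (page_id, title))

    def __iter__(self):
        for tokens, (page_id, title) in self.wiki.get_texts():
            yield TaggedDocument(words=tokens, tags=[title])


documents = TaggedWikiCorpus("enwiki-latest-pages-articles.xml.bz2")

# PV-DBOW with concurrent word-vector training; parameter values are placeholders.
model = Doc2Vec(dm=0, dbow_words=1, vector_size=200, window=8,
                min_count=19, epochs=10,
                workers=multiprocessing.cpu_count())
model.build_vocab(documents)
model.train(documents, total_examples=model.corpus_count, epochs=model.epochs)
model.save("doc2vec_wikipedia.model")
```

The corpus is wrapped in an iterable class (rather than a list) because the model must stream over tens of gigabytes of articles several times; only the vocabulary and vectors are held in RAM.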
So, I would recommend fixing the mistakes in your attempt. (And, if you succeed in creating a model, and want to document it for others, you could upload it somewhere for others to re-use.)
Answer 2:
Yes! I found two pre-trained doc2vec models at this link, but I still could not find any pre-trained doc2vec model trained on tweets.
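If a downloaded model was saved with a compatible gensim version, using it is straightforward; a hedged sketch, where `doc2vec.bin` is a placeholder for whatever filename the download actually provides:

```python
# Sketch only: load a pre-trained Doc2Vec model and infer a vector for new text.
from gensim.models.doc2vec import Doc2Vec

model = Doc2Vec.load("doc2vec.bin")  # placeholder path

# Tokens should be preprocessed the same way as the model's training data
# (typically lowercased word tokens).
vector = model.infer_vector(["machine", "learning", "with", "python"])
print(vector.shape)
```

Note that models saved from heavily modified or very old gensim forks (as described in the first answer) may fail to load this way.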
Source: https://stackoverflow.com/questions/51132848/is-there-pre-trained-doc2vec-model