How to fine-tune word2vec when training our CNN for text classification?

你离开我真会死。 Submitted on 2020-01-04 05:38:17

Question


I have 3 questions about fine-tuning word vectors. Please help me out; I will really appreciate it! Many thanks in advance!

  1. When I train my own CNN for text classification, I use word2vec to initialize the words and then simply feed these pre-trained vectors as input features to the CNN. Since I never had an embedding layer, no fine-tuning can happen through back-propagation. My question is: if I want to do fine-tuning, does that mean I have to create an Embedding layer, and how do I create it?

  2. When we train word2vec, we use unsupervised training, right? In my case, I use the skip-gram model to get my pre-trained word2vec. But when I take the resulting vec.bin and use it in the text-classification model (CNN) as my word initializer, if I want to fine-tune the word-to-vector map in vec.bin, does that mean my CNN network structure has to be exactly the same as the one used when training word2vec? And does the fine-tuning change vec.bin itself, or does it only happen in memory?

  3. Are the skip-gram and CBOW models only used for unsupervised word2vec training, or can they also be applied to other general text-classification tasks? And what is the difference in the network between unsupervised word2vec training and supervised fine-tuning?

@Franck Dernoncourt thank you for reminding me. I'm new here and hope to learn something from this great community. Please have a look at my questions when you have time, thank you again!


Answer 1:


1) What you need is just a good example of using a pretrained word embedding with a trainable/fixed embedding layer, with the following change in code. In Keras this layer is updated during training by default; to exclude it from training, set trainable to False.

from keras.layers import Embedding  # import needed for the layer below

embedding_layer = Embedding(nb_words + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],        # pretrained word2vec vectors
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=True)                    # True = fine-tune during training

2) Your word2vec vectors are only used to initialize the embedding layer; they have no further relation to whatever CNN structure you use. Fine-tuning will only update the weights in memory.
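
For illustration, here is a minimal sketch of wiring the embedding layer above into a small 1D-CNN text classifier in Keras. The filter count, kernel size, and the NUM_CLASSES variable are assumptions, not part of the original answer.

from keras.models import Sequential
from keras.layers import Conv1D, GlobalMaxPooling1D, Dense

model = Sequential([
    embedding_layer,                            # the pretrained, trainable layer from above
    Conv1D(128, 5, activation='relu'),          # assumed number of filters / kernel size
    GlobalMaxPooling1D(),
    Dense(NUM_CLASSES, activation='softmax'),   # NUM_CLASSES: number of labels (assumed)
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])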




Answer 2:


Answer to your 1st question:

When you set trainable=True in the Embedding constructor, your pretrained embeddings are set as the weights of that embedding layer. Any fine-tuning that then happens on those weights has nothing to do with word2vec (CBOW or SG). If you want to fine-tune the word2vec model itself, you will have to continue training it with one of those techniques. Refer to these answers.
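
As a rough sketch of what continuing to train the word2vec model looks like in gensim (the file name, the new_sentences list, and the epoch count are hypothetical):

from gensim.models import Word2Vec

model = Word2Vec.load("my_word2vec.model")       # hypothetical saved model
model.build_vocab(new_sentences, update=True)    # extend the existing vocabulary in place
model.train(new_sentences,
            total_examples=len(new_sentences),   # new_sentences: list of tokenized texts
            epochs=5)                            # assumed epoch count
model.save("my_word2vec_finetuned.model")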

Answer 2:

Any fine-tuning of the embedding layer's weights does not affect your vec.bin. The updated weights are saved along with the model, though, so you can extract them afterwards.
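
A small sketch of pulling the fine-tuned embedding weights back out of a trained Keras model, assuming the embedding layer is the model's first layer (the output file name is hypothetical):

import numpy as np

updated_embeddings = model.layers[0].get_weights()[0]    # shape: (nb_words + 1, EMBEDDING_DIM)
np.save("finetuned_embeddings.npy", updated_embeddings)  # vec.bin itself stays unchanged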

Answer 3:

gensim implements only these two methods (SG and CBOW). However, there are several newer methods for training word vectors, such as MLM (masked language modeling), and GloVe models the probabilities of word co-occurrences.
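
In gensim the choice between the two is just the sg flag; sentences here is a hypothetical iterable of tokenized texts:

from gensim.models import Word2Vec

sg_model   = Word2Vec(sentences, sg=1)  # skip-gram
cbow_model = Word2Vec(sentences, sg=0)  # CBOW (the default)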

If you want to fine-tune using your own custom method, you just have to specify a task (such as text classification) and then save your updated embedding-layer weights. Take proper care with indexing so that each word is assigned its corresponding vector.
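
A rough sketch of that indexing step, assuming word_index comes from a Keras Tokenizer and vec.bin is a binary word2vec file; the idea is that row idx of embedding_matrix corresponds to token id idx:

import numpy as np
from gensim.models import KeyedVectors

w2v = KeyedVectors.load_word2vec_format("vec.bin", binary=True)
embedding_matrix = np.zeros((nb_words + 1, EMBEDDING_DIM))
for word, idx in word_index.items():
    if idx <= nb_words and word in w2v:
        embedding_matrix[idx] = w2v[word]   # row idx <-> token id idx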



Source: https://stackoverflow.com/questions/40143405/how-to-fine-tune-word2vec-when-training-our-cnn-for-text-classification
