问题
I am trying to use 2D CNN to do text classification on Chinese Article and have trouble on setting arguments of keras Convolution2D
. I know the basic flow of Convolution2D
to cope with image, but stuck by using my dataset with keras.
Input data
My data is 9800 Chinese Article, max sentence length is 6810,with 200 word2vec size.
So the input shape is `(9800, 1, 6810, 200)`
Code for building model
MAX_FEATURES = 6810
# I just randomly pick one filter, seems this is the problem?
nb_filter = 128
input_shape = (1, 6810, 200)
# each word is 200 (word2vec size)
embedding_size = 200
# 3 word length
n_gram = 3
# so stride here is embedding_size*n_gram
model = Sequential()
model.add(Convolution2D(nb_filter, n_gram, embedding_size, border_mode='valid', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(100, 1), border_mode='valid'))
model.add(Dropout(0.5))
model.add(Activation('relu'))
model.add(Flatten())
model.add(Dense(hidden_dims))
model.add(Dropout(0.5))
model.add(Activation('relu'))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['accuracy'])
# X is (9800, 1, 6810, 200)
model.fit(X, y, batch_size=32,
nb_epoch=5,
validation_split=0.1)
Question 1. I have problem to set Convolution2D
arguments. My reseach is below,
The official docs do not contain an exmaple for 2D CNN text classifacation(though has 1D CNN).
Convolution2D
defination is here https://keras.io/layers/convolutional/:
keras.layers.convolutional.Convolution2D(nb_filter, nb_row, nb_col, init='glorot_uniform', activation=None, weights=None, border_mode='valid', subsample=(1, 1), dim_ordering='default', W_regularizer=None, b_regularizer=None, activity_regularizer=None, W_constraint=None, b_constraint=None, bias=True)
nb_filter: Number of convolution filters to use.
nb_row: Number of rows in the convolution kernel. nb_col: Number of columns in the convolution kernel. border_mode: 'valid', 'same' or 'full'. ('full' requires the Theano backend.)
Some research about the arguments:
This issue https://github.com/fchollet/keras/issues/233 is about 2D CNN for text classification, I read all comments and pick:
(1) https://github.com/fchollet/keras/issues/233#issuecomment-117427013
model.add(Convolution2D(nb_filter=N_FILTERS, stack_size=1, nb_row=FIELD_SIZE, nb_col=1, subsample=(STRIDE, 1)))
(2) https://github.com/fchollet/keras/issues/233#issuecomment-117700913
sequential.add(Convolution2D(nb_feature_maps, 1, n_gram, embedding_size))
But it seems has some diference to current keras version, also the arguments naming by different people are in a mess (I hope keras has an easy understandable argument expanation).
Another comment I see about current api:
https://github.com/fchollet/keras/issues/1665#issuecomment-181181000
The current API is as below:
keras.layers.convolutional.Convolution2D(nb_filter, nb_row, nb_col, init='glorot_uniform', activation='linear', weights=None, border_mode='valid', subsample=(1, 1), dim_ordering='th', W_regularizer=None, b_regularizer=None, activity_regularizer=None, W_constraint=None, b_constraint=None)
So (36,1,7,7) seems the reason, the correct arguments would be (36,7,7,...).
By above research, on my understanding of convolution, Convolution2D
create a (nb_filter, nb_row, nb_col) filter , by sliding a stride to get one filter result, recurse sliding, finally combine the result into array with shape (1, one_sample_article_length[6810] / nb_filter), and go to the next layer, is that right? Is my code below set nb_row
and nb_col
correct ?
Question 2. What is the proper MaxPooling2D arguments? (for my dateset or for commonm, either is OK)
I refer this issue https://github.com/fchollet/keras/issues/233#issuecomment-117427013 to set the argument, there are two kinds:
MaxPooling2D(poolsize=(((nb_features - FIELD_SIZE) / STRIDE) + 1, 1))
MaxPooling2D(poolsize=(maxlen - n_gram + 1, 1))
I have no idea why they calculate MaxPooling2D
argument like that.
Question 3. Any recommendation for batch_size
and nb_epoch
to do such text classification? I have no idea at all.
来源:https://stackoverflow.com/questions/41798359/how-to-set-proper-arguments-to-build-keras-convolution2d-nn-model-text-classifi