How to set proper arguments to build a Keras Convolution2D NN model [Text Classification]?

Question

I am trying to use a 2D CNN for text classification of Chinese articles, and I am having trouble setting the arguments of Keras Convolution2D. I understand the basic flow of Convolution2D for images, but I am stuck applying it to my dataset in Keras.

Input data

My data is 9800 Chinese articles. The maximum article length is 6810, and the word2vec size is 200.
So the input shape is `(9800, 1, 6810, 200)`.
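As a side note on that shape, here is a rough sketch of how much memory a dense array of this size would need. It assumes 4-byte float32 values, which the question does not state, so treat the result as an estimate only.

```python
# Shape from the question: (articles, channels, sequence length, embedding size)
shape = (9800, 1, 6810, 200)

# Assumption (not stated in the question): values are stored as 4-byte float32
bytes_per_value = 4

# Multiply the dimensions together to get the number of stored values
n_values = 1
for dim in shape:
    n_values *= dim

total_bytes = n_values * bytes_per_value

print(n_values)           # number of values in the dense array
print(total_bytes / 1e9)  # size in gigabytes (10**9 bytes)
```

Under that assumption the array is roughly 53 GB, so feeding the data in batches may be worth considering.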

Code for building model

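from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D

MAX_FEATURES = 6810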

# I picked the number of filters arbitrarily; could this be the problem?
nb_filter = 128

input_shape = (1, 6810, 200)

# each word is a vector of size 200 (word2vec size)
embedding_size = 200

# window of 3 words
n_gram = 3

# so stride here is embedding_size*n_gram

model = Sequential()

model.add(Convolution2D(nb_filter, n_gram, embedding_size, border_mode='valid', input_shape=input_shape))

model.add(MaxPooling2D(pool_size=(100, 1), border_mode='valid'))

model.add(Dropout(0.5))
model.add(Activation('relu'))

model.add(Flatten())

# hidden_dims is defined elsewhere (its value is not shown here)
model.add(Dense(hidden_dims))
model.add(Dropout(0.5))
model.add(Activation('relu'))

model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])


# X is (9800, 1, 6810, 200)

model.fit(X, y, batch_size=32,
          nb_epoch=5,
          validation_split=0.1)

Question 1. I have trouble setting the Convolution2D arguments. My research so far is below.

The official docs do not contain an example of a 2D CNN for text classification (though they have one for a 1D CNN).

The Convolution2D definition is at https://keras.io/layers/convolutional/:

keras.layers.convolutional.Convolution2D(nb_filter, nb_row, nb_col, init='glorot_uniform', activation=None, weights=None, border_mode='valid', subsample=(1, 1), dim_ordering='default', W_regularizer=None, b_regularizer=None, activity_regularizer=None, W_constraint=None, b_constraint=None, bias=True)

nb_filter: Number of convolution filters to use.
nb_row: Number of rows in the convolution kernel.
nb_col: Number of columns in the convolution kernel.
border_mode: 'valid', 'same' or 'full'. ('full' requires the Theano backend.)

Some research about the arguments:

This issue, https://github.com/fchollet/keras/issues/233, is about 2D CNNs for text classification. I read all the comments and picked out two:

(1) https://github.com/fchollet/keras/issues/233#issuecomment-117427013
```
model.add(Convolution2D(nb_filter=N_FILTERS, stack_size=1, nb_row=FIELD_SIZE,
                    nb_col=1, subsample=(STRIDE, 1)))
```
(2) https://github.com/fchollet/keras/issues/233#issuecomment-117700913
```
sequential.add(Convolution2D(nb_feature_maps, 1, n_gram, embedding_size))
```
But these seem to differ from the current Keras version, and the argument names used by different people are inconsistent (I wish Keras had an easy-to-understand explanation of the arguments).

Another comment I found about the current API:

https://github.com/fchollet/keras/issues/1665#issuecomment-181181000

The current API is as below:

    keras.layers.convolutional.Convolution2D(nb_filter, nb_row, nb_col, init='glorot_uniform', activation='linear', weights=None, border_mode='valid', subsample=(1, 1), dim_ordering='th', W_regularizer=None, b_regularizer=None, activity_regularizer=None, W_constraint=None, b_constraint=None)

So (36,1,7,7) seems the reason, the correct arguments would be (36,7,7,...).

Based on the research above, my understanding of convolution is this: Convolution2D creates a filter of shape (nb_filter, nb_row, nb_col), slides it over the input one stride at a time to get one result per position, and finally combines the results into an array of shape (1, one_sample_article_length[6810] / nb_filter), which goes to the next layer. Is that right? Are nb_row and nb_col set correctly in my code above?
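To make the shape question concrete, here is a minimal sketch of the standard output-size arithmetic for a 'valid' convolution. It assumes channels-first ordering, as the input shape `(1, 6810, 200)` suggests; the helper name `conv2d_valid_output_shape` is made up for illustration and is not part of Keras.

```python
def conv2d_valid_output_shape(input_shape, nb_filter, nb_row, nb_col, subsample=(1, 1)):
    """Shape (channels, rows, cols) after a 'valid' 2D convolution, batch axis omitted.

    Hypothetical helper for illustration only; assumes channels-first input.
    """
    _, rows, cols = input_shape
    # A kernel of height nb_row fits in (rows - nb_row) // stride + 1 positions
    out_rows = (rows - nb_row) // subsample[0] + 1
    out_cols = (cols - nb_col) // subsample[1] + 1
    # Each filter produces its own feature map, so nb_filter becomes the channel count
    return (nb_filter, out_rows, out_cols)


# Values from the question: 128 filters, a kernel of 3 words by 200 embedding values
shape = conv2d_valid_output_shape((1, 6810, 200), nb_filter=128, nb_row=3, nb_col=200)
print(shape)
```

By this arithmetic the sequence length is not divided by nb_filter. Each of the 128 filters yields one column of 6808 values, one per 3-word window.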

Question 2. What are the proper MaxPooling2D arguments? (For my dataset or in general; either is OK.)

I referred to this issue, https://github.com/fchollet/keras/issues/233#issuecomment-117427013, to set the argument. There are two variants:

MaxPooling2D(poolsize=(((nb_features - FIELD_SIZE) / STRIDE) + 1, 1))
MaxPooling2D(poolsize=(maxlen - n_gram + 1, 1))

I do not understand why they calculate the MaxPooling2D argument like that.
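One possible reading of the two formulas is sketched below. It assumes that both expressions equal the number of rows left after the convolution, and that the pooling stride defaults to the pool size. The helper `valid_output_length` and the variable names are illustrative, not taken from the linked comments.

```python
def valid_output_length(length, window, stride=1):
    """Number of window positions along one axis with 'valid' borders (illustrative helper)."""
    return (length - window) // stride + 1


maxlen, n_gram = 6810, 3

# Rows left after a 'valid' convolution with a 3-word kernel and stride 1.
# This matches maxlen - n_gram + 1, and ((nb_features - FIELD_SIZE) / STRIDE) + 1 with STRIDE = 1.
conv_len = valid_output_length(maxlen, n_gram)

# Pooling over all conv_len rows leaves a single value per filter
pooled_full = valid_output_length(conv_len, conv_len, stride=conv_len)

# For comparison, pool_size=(100, 1) as in the code above
pooled_100 = valid_output_length(conv_len, 100, stride=100)

print(conv_len, pooled_full, pooled_100)
```

Under those assumptions, a pool size equal to the convolution output length keeps only the maximum response of each filter across the whole article, whereas a pool size of 100 keeps 68 values per filter.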

Question 3. Are there any recommendations for batch_size and nb_epoch for this kind of text classification? I have no idea at all.

Source: https://stackoverflow.com/questions/41798359/how-to-set-proper-arguments-to-build-keras-convolution2d-nn-model-text-classifi

Tags

neural-network

classification

deep-learning

keras

conv-neural-network