Configuration of CNN model for recognition of sequential data - Architecture of the top of the CNN - Parallel Layers

Submitted by 。_饼干妹妹 on 2020-08-10 06:16:24

Question


I am trying to configure a network for character recognition of sequential data such as license plates. I would like to use the architecture given in Table 3 of "Deep Automatic Licence Plate Recognition system" (link: http://www.ee.iisc.ac.in/people/faculty/soma.biswas/Papers/jain_icgvip2016_alpr.pdf).

The architecture the authors presented is this one:

The first layers are standard; where I am stuck is the top of the architecture (the part in the red frame). The authors mention 11 parallel layers, and I am unsure how to implement this in Python. I coded the architecture as follows, but it does not look right to me:

model = Sequential()
model.add(Conv2D(64, kernel_size=(5, 5), input_shape = (32, 96, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(256, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(1024, activation = "relu"))
model.add(Dense(11*37, activation="Softmax"))
model.add(keras.layers.Reshape((11, 37)))

Could someone help? How do I have to code the top of the network so that it matches the authors' architecture?


Answer 1:


The code below can build the architecture described in the image.

from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2D, Flatten, MaxPooling2D, Dense, Input, Reshape, Concatenate, Dropout

def create_model(input_shape=(32, 96, 1)):
    input_img = Input(shape=input_shape)
    # Add the ST Layer here.

    # Convolutional feature extractor (the input shape is already fixed by Input above)
    x = Conv2D(64, kernel_size=(5, 5), activation="relu")(input_img)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Dropout(0.25)(x)

    x = Conv2D(128, kernel_size=(3, 3), activation="relu")(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Dropout(0.25)(x)

    x = Conv2D(256, kernel_size=(3, 3), activation="relu")(x)
    x = MaxPooling2D(pool_size=(2, 2))(x)
    x = Dropout(0.25)(x)

    x = Flatten()(x)
    backbone = Dense(1024, activation="relu")(x)

    # 11 parallel classifiers, each reading the same 1024-dimensional feature vector
    branches = [
        Dense(37, activation="softmax", name="branch_" + str(i))(backbone)
        for i in range(11)
    ]

    # (batch, 11 * 37) -> (batch, 11, 37): one row of 37 probabilities per character
    output = Concatenate(axis=1)(branches)
    output = Reshape((11, 37))(output)
    return Model(input_img, output)
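
A model of this shape returns an (11, 37) array per image, one row of class probabilities per character position. The sketch below shows one way such an output could be turned back into a plate string. It is an illustration only: the alphabet, its ordering, the padding symbol, and the helper names decode_plate and one_hot are assumptions, not something given in the question or the paper. It is written in plain Python so it runs without TensorFlow.

```python
import string

# Assumed alphabet: 10 digits + 26 uppercase letters + 1 padding symbol = 37 classes.
# The class order is hypothetical; use the same order as the training labels.
PAD = "_"
ALPHABET = string.digits + string.ascii_uppercase + PAD


def decode_plate(probs):
    """Turn one (11, 37) prediction into a string: argmax per position, trailing padding dropped."""
    chars = []
    for row in probs:
        best = max(range(len(row)), key=lambda k: row[k])  # index of the largest probability
        chars.append(ALPHABET[best])
    return "".join(chars).rstrip(PAD)


def one_hot(ch):
    """Hypothetical helper: a 37-element row with all probability mass on one character."""
    return [1.0 if c == ch else 0.0 for c in ALPHABET]


# Hand-made prediction for illustration: "KA01" followed by 7 padding positions.
fake_probs = [one_hot(ch) for ch in "KA01" + PAD * 7]
print(decode_plate(fake_probs))  # KA01
```

With a real model, a single element of the array returned by model.predict(...) would take the place of fake_probs.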




Answer 2:


From my understanding, your implementation is almost correct. The authors train 11 individual classifiers, each taking the output of the fully connected layer as its input. Here, you can read "parallel" as "independent".

However, you cannot apply the softmax activation directly to the Dense(11*37) layer: that normalizes all 11*37 values together, so they sum to 1 across the whole plate. Since the classifiers are independent, each of them should output a probability for every possible character. Put differently, the outputs of each classifier should sum to 1. Hence, the softmax has to be applied after the reshape, along the axis of the 37 classes:

...
model.add(Dense(1024, activation = "relu"))
# One linear layer producing all 11 * 37 scores (no activation yet)
model.add(Dense(11*37))
# One row of 37 scores per character position
model.add(keras.layers.Reshape((11, 37)))
# Softmax over the last axis, so each of the 11 rows sums to 1
model.add(keras.layers.Softmax(axis=-1))
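
The difference between normalizing all 11*37 outputs at once and normalizing each group of 37 separately can be checked numerically. The following is a minimal sketch in plain Python, not Keras code; the logits are arbitrary made-up values and the softmax helper is written out by hand for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


positions, classes = 11, 37
# Arbitrary made-up scores standing in for the output of Dense(11 * 37).
logits = [math.sin(i) for i in range(positions * classes)]

# Softmax over all 407 values, then reshape: the whole grid sums to 1,
# so each position only gets a fraction of the probability mass.
flat = softmax(logits)
flat_rows = [flat[p * classes:(p + 1) * classes] for p in range(positions)]

# Reshape first, then softmax over each row of 37: every row sums to 1.
rows = [softmax(logits[p * classes:(p + 1) * classes]) for p in range(positions)]

print("flat softmax, mass per position:", [round(sum(r), 3) for r in flat_rows])
print("row-wise softmax, mass per position:", [round(sum(r), 3) for r in rows])
```

Only the row-wise version gives each of the 11 positions a proper probability distribution over the 37 characters.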


Source: https://stackoverflow.com/questions/62695857/configuration-of-cnn-model-for-recognition-of-sequential-data-architecture-of
