CNN Image Recognition with Regression Output on TensorFlow


Question


I want to predict an estimated wait time based on images using a CNN. I would imagine this means using a CNN that produces a regression-type output with an RMSE loss function, which is what I am using right now, but it is not working properly.

Can someone point out examples that use CNN image recognition to produce a scalar/regression output (instead of a class output), similar to wait time, so that I can use their techniques? I haven't been able to find a suitable example.

All of the CNN examples that I found are for the MNIST data or for distinguishing between cats and dogs, and they output a class label, not a number/scalar output like wait time.

Can someone give me an example, using TensorFlow, of a CNN that gives a scalar or regression output based on image recognition?

Thanks so much! I am honestly super stuck, making no progress, and I have been working on this same problem for over two weeks.


Answer 1:


Check out the Udacity self-driving-car models, which take an input image from a dash cam and predict a steering angle (i.e. a continuous scalar) to stay on the road...usually using a regression output after one or more fully connected layers on top of the CNN layers.

https://github.com/udacity/self-driving-car/tree/master/steering-models/community-models

Here is a typical model:

https://github.com/udacity/self-driving-car/tree/master/steering-models/community-models/autumn

...it uses tf.atan(), but you could use tf.tanh() or just a linear output to get your final output y.

Use MSE for your loss function.
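As a rough sketch of that idea in TensorFlow 1.x-style code (the layer sizes, image shape, and variable names here are placeholders, not the architecture of the model linked above):

import tensorflow as tf

# x: batch of input images, y_true: scalar wait-time labels
x = tf.placeholder(tf.float32, [None, 64, 64, 3])
y_true = tf.placeholder(tf.float32, [None, 1])

# a couple of conv/pool blocks, then a fully connected layer
net = tf.layers.conv2d(x, 16, 3, activation=tf.nn.relu)
net = tf.layers.max_pooling2d(net, 2, 2)
net = tf.layers.conv2d(net, 32, 3, activation=tf.nn.relu)
net = tf.layers.max_pooling2d(net, 2, 2)
net = tf.layers.flatten(net)
net = tf.layers.dense(net, 128, activation=tf.nn.relu)

# regression head: a single unit, linear (or wrap it in tf.atan / tf.tanh)
y_pred = tf.layers.dense(net, 1, activation=None)

loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)
train_op = tf.train.AdamOptimizer(1e-4).minimize(loss)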

Here is another example in Keras:

# assumes an older (Keras 1.x-style) API, matching the Convolution2D(16, 3, 3) calls below
from keras import models, optimizers
from keras.layers import convolutional, core, pooling

model = models.Sequential()

# three conv + max-pool blocks to extract image features
model.add(convolutional.Convolution2D(16, 3, 3, input_shape=(32, 128, 3), activation='relu'))
model.add(pooling.MaxPooling2D(pool_size=(2, 2)))
model.add(convolutional.Convolution2D(32, 3, 3, activation='relu'))
model.add(pooling.MaxPooling2D(pool_size=(2, 2)))
model.add(convolutional.Convolution2D(64, 3, 3, activation='relu'))
model.add(pooling.MaxPooling2D(pool_size=(2, 2)))

# fully connected layers funnel down to a single linear output unit
model.add(core.Flatten())
model.add(core.Dense(500, activation='relu'))
model.add(core.Dropout(.5))
model.add(core.Dense(100, activation='relu'))
model.add(core.Dropout(.25))
model.add(core.Dense(20, activation='relu'))
model.add(core.Dense(1))

# regression: mean squared error loss instead of categorical cross-entropy
model.compile(optimizer=optimizers.Adam(lr=1e-04), loss='mean_squared_error')

The key difference from the MNIST examples is that instead of funneling down to an N-dimensional vector of logits fed into softmax with cross-entropy loss, you take it down to a 1-dimensional output with MSE loss.
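Training is then an ordinary Keras fit call with scalar targets. For example (the arrays below are random placeholders standing in for your images and wait-time labels, and nb_epoch matches the older Keras API used above):

import numpy as np

# X: images shaped (num_samples, 32, 128, 3), y: wait times shaped (num_samples,)
X = np.random.rand(100, 32, 128, 3).astype('float32')
y = np.random.rand(100).astype('float32')

model.fit(X, y, batch_size=32, nb_epoch=10, validation_split=0.2)

# predict a scalar wait time for new images
predictions = model.predict(X[:5])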




Answer 2:


The key is to have NO activation function in your last Fully Connected (output) layer. Note that you must have at least 1 FC layer beforehand.
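In Keras terms, a minimal sketch of that point (the input_dim of 512 is just a placeholder for whatever your flattened feature size is):

from keras import models
from keras.layers import core

model = models.Sequential()
model.add(core.Dense(100, activation='relu', input_dim=512))  # at least one FC layer with an activation
model.add(core.Dense(1))  # output layer: one unit, NO activation, so it produces an unbounded scalar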



Source: https://stackoverflow.com/questions/45528285/cnn-image-recognition-with-regression-output-on-tensorflow
