What is the appropriate input image size for the Faster RCNN Caffe model?

Question

I am trying to train Faster RCNN with Caffe on a custom dataset. I understand that the Faster RCNN Caffe model was built with an input image size of 600×1000 in mind. Many images in my dataset are 300×400. Do I need to zero-pad them up to 600×1000, or should I upscale them? If neither, how should I modify the images before feeding them to the network? Please advise.

Thank you.

Answer 1:

Faster RCNN was trained on Pascal VOC images, whose sizes (about 500×375) are quite close to yours. You don't need to zero-pad or upscale your images yourself: if you use the original Python code, rescaling is already part of the pipeline. I think you can just use your images as they are.
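To make the rescaling step concrete, here is a minimal sketch of the rule as I understand it from the original Python code: the shorter side is scaled to a target of 600 pixels, unless that would push the longer side past 1000 pixels, in which case the longer side is capped instead. The function names and the default values are my own assumptions for illustration, not the library's API.

```python
def rescale_factor(height, width, target_size=600, max_size=1000):
    """Scale factor that brings the shorter side to target_size,
    capped so that the longer side does not exceed max_size.
    (Hypothetical helper; defaults are assumed, not read from a config.)"""
    short_side = min(height, width)
    long_side = max(height, width)
    scale = float(target_size) / short_side
    # If the longer side would overshoot, cap it at max_size instead.
    if round(scale * long_side) > max_size:
        scale = float(max_size) / long_side
    return scale


def rescaled_shape(height, width, target_size=600, max_size=1000):
    """Image shape (height, width) after applying the factor above."""
    scale = rescale_factor(height, width, target_size, max_size)
    return int(round(height * scale)), int(round(width * scale))


if __name__ == "__main__":
    for h, w in [(300, 400), (375, 500), (3000, 4000)]:
        print((h, w), rescale_factor(h, w), rescaled_shape(h, w))
```

Under these assumptions, a 300×400 image and a 375×500 Pascal VOC image both end up at the same network input size, which is why no manual padding or upscaling should be needed.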

In my opinion, you should only resize your input images yourself if the images are large and the objects are small.

For example, I had 3000×4000 images with 100×100 objects to detect. After resizing to 600×1000, my objects were close to 25×25. But the receptive field is fixed by the network architecture (171 and 228 pixels for ZF and VGG, respectively), so in this case the objects would be very small relative to the receptive field. This means that the features describing a positive example would actually contain more background information than foreground.
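A quick back-of-the-envelope check of that claim, using only the numbers above (a 25-pixel object, receptive fields of 171 and 228 pixels). The helper name is hypothetical, and it assumes a square object sitting fully inside a square receptive field.

```python
def foreground_fraction(object_side, receptive_field):
    """Fraction of a square receptive field's area covered by a
    square object lying fully inside it (illustrative assumption)."""
    side = min(object_side, receptive_field)
    return (float(side) / receptive_field) ** 2


if __name__ == "__main__":
    for name, rf in [("ZF", 171), ("VGG", 228)]:
        frac = foreground_fraction(25, rf)
        print("%s: %.1f%% foreground" % (name, 100 * frac))
```

With these numbers the object covers roughly 2% of the receptive field for ZF and roughly 1% for VGG, so almost everything the feature sees is background.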

In that case, I think the best approach is to crop the images for the training phase (you can use different scaling for training and testing).
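One way to do the cropping is to cut each large image into overlapping tiles. The sketch below only computes the crop windows. The tile size of 600×1000 and the 100-pixel overlap are my own assumptions, with the overlap chosen to be at least the object size so that every object falls completely inside some tile. Shifting the ground-truth boxes into each tile's coordinates is left out.

```python
def tile_windows(height, width, tile_h=600, tile_w=1000, overlap=100):
    """Return (y0, x0, y1, x1) crop windows that cover the whole image.
    Tile size and overlap are illustrative assumptions."""

    def starts(size, tile):
        # Image smaller than one tile: a single window starting at 0.
        if size <= tile:
            return [0]
        step = tile - overlap
        positions = list(range(0, size - tile, step))
        # Last window sits flush against the image border.
        positions.append(size - tile)
        return positions

    return [
        (y, x, min(y + tile_h, height), min(x + tile_w, width))
        for y in starts(height, tile_h)
        for x in starts(width, tile_w)
    ]


if __name__ == "__main__":
    windows = tile_windows(3000, 4000)
    print(len(windows), windows[0], windows[-1])
```

Each window can then be used to slice the image array, for example `image[y0:y1, x0:x1]`, before feeding the crop to the training pipeline.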

Source: https://stackoverflow.com/questions/39334226/what-should-be-appropriate-image-size-input-to-faster-rcnn-caffe-model

Tags

caffe
