Question
I am trying to train Faster RCNN using Caffe on a custom dataset. I have learned that the Faster RCNN Caffe model was built assuming an input image size of 600*1000. Many images in my custom dataset are 300*400. Do I need to zero-pad the images up to 600*1000, or upscale them? If neither, what modification should I apply to the images before giving them as input to the network? Please suggest.
Thank you.
Answer 1:
Faster RCNN was trained on Pascal VOC images, whose sizes are quite close to yours (~500×375 for Pascal VOC). You don't need to zero-pad or upscale your images yourself: rescaling is part of the overall process if you use the original Python code. I think you can just use your images as they are.
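To make the rescaling step concrete, here is a minimal sketch of the rule as I understand it from the original Python code: the shorter side is scaled to a target size, unless that would push the longer side past a maximum. The function name `rescale_factor` is hypothetical, and the defaults of 600 and 1000 are assumptions taken from the 600*1000 size mentioned in the question (in py-faster-rcnn they should correspond to the `TRAIN.SCALES` and `TRAIN.MAX_SIZE` config entries, but check your own config).

```python
def rescale_factor(height, width, target_size=600, max_size=1000):
    """Return the factor that brings the shorter side to target_size,
    reduced if needed so the longer side does not exceed max_size.
    (Hypothetical helper; defaults are assumptions, see text.)"""
    short_side = min(height, width)
    long_side = max(height, width)
    scale = float(target_size) / short_side
    # Cap the scale when the longer side would overshoot max_size.
    if round(scale * long_side) > max_size:
        scale = float(max_size) / long_side
    return scale


# A 300x400 image from the question.
scale = rescale_factor(300, 400)
resized = (round(300 * scale), round(400 * scale))
print(scale, resized)
```

Under these assumptions a 300*400 image is simply upscaled by the pipeline, so no manual padding or resizing is needed beforehand.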
In my opinion, you should only resize your input images yourself if your images are large and your objects are small.
For example, I had 3000x4000 images with 100x100 objects to detect. After resizing to 600x1000, my objects are close to 25x25. But the receptive field is fixed by the network architecture (171 and 228 pixels for ZF and VGG, respectively). So in this case, my objects would be very small with respect to this receptive field. This means that the features describing a positive sample would actually contain more background information than foreground.
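The arithmetic behind this example can be sketched as follows. The helper `object_side_after_resize` is hypothetical, and it assumes the shorter-side-600 / longer-side-1000 rescaling rule with rounding ignored; under that assumption a 3000x4000 image is scaled by 0.2, so a 100-pixel object ends up about 20 pixels wide, in the same range as the ~25 pixels quoted above. The receptive field values are the ones given in the answer.

```python
# Receptive field sizes in pixels, as quoted in the answer above.
RECEPTIVE_FIELD = {"ZF": 171, "VGG": 228}


def object_side_after_resize(object_side, height, width,
                             target_size=600, max_size=1000):
    """Side length of an object after the image is rescaled.
    (Hypothetical helper; the rescaling rule is an assumption.)"""
    scale = min(target_size / min(height, width),
                max_size / max(height, width))
    return object_side * scale


side = object_side_after_resize(100, 3000, 4000)
for name, rf in RECEPTIVE_FIELD.items():
    # Fraction of the square receptive field covered by the object.
    fraction = (side / rf) ** 2
    print(f"{name}: object side {side:.0f} px, "
          f"{fraction:.1%} of the receptive field area")
```

With these numbers the object covers on the order of 1% of the receptive field area, which is why most of what the features see is background.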
In that case, I think the best approach is to crop the images into smaller pieces for the training phase (you can use different scaling for training and testing).
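One way to do the cropping is to cover each large image with overlapping windows, so that an object cut by one window border appears whole in a neighboring window. The sketch below only computes the window coordinates; the function names, the 750x1000 window size, and the 100-pixel overlap are all illustrative assumptions, not something taken from the Faster RCNN code. Ground-truth boxes would still have to be shifted and clipped to each crop, which is not shown here.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of windows of size `tile` that cover [0, length)."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    if tile >= length:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last window sits flush with the border
    return starts


def crop_windows(height, width, tile_h, tile_w, overlap):
    """List of (top, left, bottom, right) crop windows for one image."""
    return [(top, left, top + tile_h, left + tile_w)
            for top in tile_starts(height, tile_h, overlap)
            for left in tile_starts(width, tile_w, overlap)]


# Example: 3000x4000 images, 100x100 objects, so overlap by one object size.
windows = crop_windows(3000, 4000, tile_h=750, tile_w=1000, overlap=100)
print(len(windows), windows[0], windows[-1])
```

With crops of this size the network's own rescaling shrinks the objects far less than it would on the full image, so they stay larger relative to the receptive field.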
Source: https://stackoverflow.com/questions/39334226/what-should-be-appropriate-image-size-input-to-faster-rcnn-caffe-model