Question:
I am trying to train a custom object classifier in Darknet YOLO v2: https://pjreddie.com/darknet/yolo/
I gathered a dataset of images; most of them are 6000 x 4000 px, and some have lower resolutions.
Do I need to resize the images to be square before training?
I found that the config uses:
[net]
batch=64
subdivisions=8
height=416
width=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
That's why I was wondering how to use it with data sets of different image sizes.
Answer 1:
You don't have to resize them, because Darknet will do it for you!
This means you can use images of different sizes during training. What you posted above is just the network configuration; there should be a full network definition as well. The height and width give the network resolution. Darknet also keeps the aspect ratio; check e.g. this.
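To get a feel for what an aspect-ratio-preserving resize does to a 6000 x 4000 image at a 416 x 416 network resolution, here is a minimal arithmetic sketch. The function name `letterbox_size` is hypothetical. The "scale to fit, then pad the rest" scheme is an assumption about how such a resize typically works, not a copy of Darknet's code.

```python
def letterbox_size(img_w, img_h, net_w=416, net_h=416):
    """Return (new_w, new_h, pad_x, pad_y) for a fit-inside resize.

    Hypothetical helper: scales the image so it fits inside the network
    input while keeping its aspect ratio, then centres it with padding.
    """
    # Compare net_w/img_w with net_h/img_h without floating point.
    if net_w * img_h < net_h * img_w:
        # Width is the limiting side.
        new_w = net_w
        new_h = img_h * net_w // img_w
    else:
        # Height is the limiting side (or the ratios are equal).
        new_h = net_h
        new_w = img_w * net_h // img_h
    pad_x = (net_w - new_w) // 2
    pad_y = (net_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y


print(letterbox_size(6000, 4000))  # (416, 277, 0, 69)
```

So under these assumptions a 6000 x 4000 image would occupy a 416 x 277 region of the network input, with roughly 69 rows of padding above and below.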
Answer 2:
It is very common to resize images before training. 416x416 is slightly larger than usual; most ImageNet models, for example, resize and square the images to 256x256, so I would expect the same here. Training on 6000x4000 images would require a farm of GPUs. The standard process is to square the image to its largest dimension (height or width), padding the shorter side with 0's, and then resize it with a standard image tool such as PIL.
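The pad-then-resize process described above could look roughly like this with Pillow (the maintained fork of PIL). This is a sketch, not a definitive preprocessing script: the function name `pad_to_square_and_resize` is hypothetical, and centring the image on the black canvas is an assumption, since the answer only says to pad the shorter side.

```python
from PIL import Image


def pad_to_square_and_resize(img, size=416):
    """Pad the shorter side with zeros (black), then resize to size x size."""
    side = max(img.size)  # img.size is (width, height)
    canvas = Image.new("RGB", (side, side), (0, 0, 0))
    # Assumption: centre the original image on the square canvas.
    left = (side - img.width) // 2
    top = (side - img.height) // 2
    canvas.paste(img.convert("RGB"), (left, top))
    return canvas.resize((size, size))


# Usage with a synthetic white 600 x 400 image standing in for a photo.
example = Image.new("RGB", (600, 400), (255, 255, 255))
result = pad_to_square_and_resize(example)
print(result.size)
```

Note that padding changes where objects sit relative to the image borders, so any bounding-box labels stored as fractions of the image size would have to be recomputed for the padded image.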
Answer 3:
You do not need to resize the images; you can change the values directly in the darknet.cfg file.
- When you open the darknet.cfg (yolo-darknet.cfg) file, you can see all the hyper-parameters and their values.
- As shown in your cfg file, the image dimensions are (416, 416) -> (width, height). You can change these values, and Darknet will automatically resize the images before training.
- Since the images have high dimensions, you can adjust the batch and subdivisions values (try lower values such as 32, 16 or 8; they have to be multiples of 2) so that Darknet does not crash with a memory allocation error.
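As a rough aid for choosing these two values: to my understanding, Darknet splits each batch into `subdivisions` chunks and pushes batch / subdivisions images through the GPU at a time, so that quotient is what drives memory use. The helper below is a hypothetical sketch of that arithmetic, not part of Darknet.

```python
def images_per_forward_pass(batch, subdivisions):
    """Images processed at once, assuming Darknet uses batch / subdivisions."""
    if batch % subdivisions != 0:
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


# The values from the question's cfg: batch=64, subdivisions=8.
print(images_per_forward_pass(64, 8))   # 8
# A lower batch with the same subdivisions halves the images per pass.
print(images_per_forward_pass(32, 8))   # 4
```

Under this assumption, lowering batch or raising subdivisions both reduce the number of images held in memory per pass.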