How to format a data set for fully convolutional networks?

Posted by 魔方 西西 on 2020-01-06 08:07:51

Question


I am trying to prepare my data set for a fully convolutional network (FCN). I've looked through some data sets and I'm having a really hard time figuring out how to format mine. For instance, in the KITTI data set, there are these 2 images and this text file in the training folder:

image 1

image 2

text

P0: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 0.000000000000e+00 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P1: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 -3.875744000000e+02 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P2: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 4.485728000000e+01 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 2.163791000000e-01 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 2.745884000000e-03
P3: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 -3.395242000000e+02 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 2.199936000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 2.729905000000e-03
R0_rect: 9.999239000000e-01 9.837760000000e-03 -7.445048000000e-03 -9.869795000000e-03 9.999421000000e-01 -4.278459000000e-03 7.402527000000e-03 4.351614000000e-03 9.999631000000e-01
Tr_velo_to_cam: 7.533745000000e-03 -9.999714000000e-01 -6.166020000000e-04 -4.069766000000e-03 1.480249000000e-02 7.280733000000e-04 -9.998902000000e-01 -7.631618000000e-02 9.998621000000e-01 7.523790000000e-03 1.480755000000e-02 -2.717806000000e-01
Tr_imu_to_velo: 9.999976000000e-01 7.553071000000e-04 -2.035826000000e-03 -8.086759000000e-01 -7.854027000000e-04 9.998898000000e-01 -1.482298000000e-02 3.195559000000e-01 2.024406000000e-03 1.482454000000e-02 9.998881000000e-01 -7.997231000000e-01
Tr_cam_to_road: 9.999570839814e-01 -5.508724949246e-03 -7.452906591504e-03 9.610489538319e-03 5.425697507328e-03 9.999234779341e-01 -1.111504746388e-02 -1.597134401910e+00 7.513565886504e-03 1.107413060494e-02 9.999104059534e-01 2.788606298060e-01

This data set is very different from the regular data sets I've seen used for CNNs, so I have the following questions:

  1. What is happening in the text file?
  2. How do I generate the 2nd image with solid-colored pixels?
  3. One of the proposed advantages of FCNs is the ability to feed in input images of arbitrary size. How small can I make the input images? Is 50x50 too small? I looked for literature on this but couldn't find much.

Essentially, I'm trying to create a data set to use with this network from this GitHub repository, which has only 2 folders for training: training_img_lmdb and training_label_lmdb. So I'm not sure whether the text file or the pixelated image goes in the label folder. Any help would be greatly appreciated!


Answer 1:


  1. Looks like sensor calibration data rather than labels: names such as Tr_cam_to_road and Tr_velo_to_cam suggest coordinate transforms between the sensors and the road. The data set will usually come with documentation that explains the file.

  2. Please clarify. You posted the image. Surely you know how to load an image?

  3. You are correct. However, any purely convolutional network has a minimum input size, equal to the receptive field of a single output pixel (the input neighborhood that contributes to it).
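To make point 1 concrete, here is a minimal sketch of reading such a text file into matrices. It assumes, as the key names and value counts suggest but the thread does not confirm, that each 12-value entry is a 3x4 matrix and the 9-value entry is a 3x3 matrix. The function name `parse_calib` and the sample 3D point are hypothetical, chosen only for illustration.

```python
import numpy as np

# Two entries copied from the text file quoted in the question.
CALIB_TEXT = """\
P2: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 4.485728000000e+01 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 2.163791000000e-01 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 2.745884000000e-03
R0_rect: 9.999239000000e-01 9.837760000000e-03 -7.445048000000e-03 -9.869795000000e-03 9.999421000000e-01 -4.278459000000e-03 7.402527000000e-03 4.351614000000e-03 9.999631000000e-01
"""


def parse_calib(text):
    """Parse 'KEY: v1 v2 ...' lines into a dict of NumPy arrays.

    Assumption: 12 values form a 3x4 matrix and 9 values a 3x3 matrix;
    anything else is kept as a flat vector.
    """
    matrices = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, values = line.split(":", 1)
        numbers = np.array([float(v) for v in values.split()])
        if numbers.size == 12:
            numbers = numbers.reshape(3, 4)
        elif numbers.size == 9:
            numbers = numbers.reshape(3, 3)
        matrices[key.strip()] = numbers
    return matrices


calib = parse_calib(CALIB_TEXT)
P2 = calib["P2"]

# Hypothetical 3D point in homogeneous coordinates (x, y, z, 1),
# assumed to be expressed in the camera frame.
point = np.array([1.0, 1.5, 10.0, 1.0])
u, v, w = P2 @ point

print(P2.shape, calib["R0_rect"].shape)
print("pixel coordinates:", u / w, v / w)
```

Read this way, the file describes camera and sensor geometry, not per-pixel classes, so it is probably not what a segmentation label folder expects.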
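The minimum input size in point 3 can be estimated with a short calculation. This is a sketch for a plain stack of unpadded convolution and pooling layers. The layer list below is hypothetical and is not the architecture from the linked repository. Each layer is given as (kernel size, stride), and the receptive field grows by (kernel - 1) times the product of all earlier strides.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one output pixel.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    """
    rf = 1    # a single output pixel with no layers sees one input pixel
    jump = 1  # distance in input pixels between neighbouring units so far
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


# Hypothetical stack: two 3x3 convs, a 2x2 pool with stride 2, repeated twice.
layers = [(3, 1), (3, 1), (2, 2), (3, 1), (3, 1), (2, 2)]
min_input = receptive_field(layers)
print("receptive field / minimum unpadded input size:", min_input)
print("is 50x50 large enough for this stack?", 50 >= min_input)
```

For this made-up stack the result is 16, so a 50x50 input would be large enough. A deeper network with more pooling has a larger receptive field, and padding changes the picture, so the same calculation should be repeated with the real layer parameters.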



Source: https://stackoverflow.com/questions/47964716/how-to-format-a-data-set-for-fully-convolutional-networks
