I am using fast-rcnn and trying to train the system on a new class (label). I followed this guide: https://github.com/EdisonResearch/fast-rcnn/tree/master/help/train
Placed the images
Placed the annotations
Prepared the ImageSet file with all the image name prefixes
Prepared selective search output: train.mat
Running train_net.py then failed with the following error:
./tools/train_net.py --gpu 0 --solver models/VGG_1024_pascal2007/solver.prototxt --imdb voc_2007_train_top_5000
Called with args:
Namespace(cfg_file=None, gpu_id=0, imdb_name='voc_2007_train_top_5000', max_iters=40000, pretrained_model=None, randomize=False, solver='models/VGG_1024_pascal2007/solver.prototxt')
Using config:
{'DEDUP_BOXES': 0.0625,
'EPS': 1e-14,
'EXP_DIR': 'default',
'PIXEL_MEANS': array([[[ 102.9801, 115.9465, 122.7717]]]),
'RNG_SEED': 3,
'ROOT_DIR': '/home/hagay/fast-rcnn',
'TEST': {'BBOX_REG': True,
'MAX_SIZE': 1000,
'NMS': 0.3,
'SCALES': [600],
'SVM': False},
'TRAIN': {'BATCH_SIZE': 128,
'BBOX_REG': True,
'BBOX_THRESH': 0.5,
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.1,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'IMS_PER_BATCH': 2,
'MAX_SIZE': 1000,
'SCALES': [600],
'SNAPSHOT_INFIX': '',
'SNAPSHOT_ITERS': 10000,
'USE_FLIPPED': True,
'USE_PREFETCH': False}}
Loaded dataset `voc_2007_train` for training
Appending horizontally-flipped training examples...
voc_2007_train gt roidb loaded from /home/hagay/fast-rcnn/data/cache/voc_2007_train_gt_roidb.pkl
/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py:2507: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`.
  VisibleDeprecationWarning)
wrote ss roidb to /home/hagay/fast-rcnn/data/cache/voc_2007_train_selective_search_IJCV_top_5000_roidb.pkl
Traceback (most recent call last):
  File "./tools/train_net.py", line 80, in <module>
    roidb = get_training_roidb(imdb)
  File "/home/hagay/fast-rcnn/tools/../lib/fast_rcnn/train.py", line 107, in get_training_roidb
    imdb.append_flipped_images()
  File "/home/hagay/fast-rcnn/tools/../lib/datasets/imdb.py", line 104, in append_flipped_images
    assert (boxes[:, 2] >= boxes[:, 0]).all()
AssertionError
My questions are:
- Why am I getting this error?
- Do I need to rescale the images to a fixed size of 256x256 before training?
- Do I need to prepare anything in order to set the __background__ class?
- The assertion fails because there is at least one box with boxes[:, 2] < boxes[:, 0], where boxes[:, 2] is the x-max of the bounding box and boxes[:, 0] is the x-min. So the problem is related to the region proposals. I ran into this problem too and found that it was caused by integer overflow. I recall the dtype for boxes being np.uint8 (this needs checking); if the image is too big, the coordinates overflow and you get this error.
- Rescaling is one solution, but it may affect performance. You can change the dtype from uint8 to float instead.
- As far as I know, there is no need for that.
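To illustrate the overflow explanation above, here is a minimal sketch using NumPy. The box coordinates are made up, and uint8 is used only because the answer mentions it (the answer itself is unsure about the exact dtype). The point is just that an unsigned integer type that is too narrow wraps around and can leave x-max smaller than x-min, while a float dtype does not.

```python
import numpy as np

# Hypothetical box (x1, y1, x2, y2) on an image wider than 255 pixels.
raw = np.array([[250, 10, 300, 60]], dtype=np.int64)

# Casting to a narrow unsigned type wraps values modulo 256,
# so x2 = 300 silently becomes 44.
narrow = raw.astype(np.uint8)
overflowed = not (narrow[:, 2] >= narrow[:, 0]).all()

# A floating-point dtype keeps the coordinates intact.
wide = raw.astype(np.float32)
intact = (wide[:, 2] >= wide[:, 0]).all()

print(narrow, overflowed, intact)
```

With the narrow dtype the same check that fails in imdb.py fails here too; with float32 it passes.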
I'm late to the party, but when I was editing the code this was my yakkity-hack of a solution:
    for b in range(len(boxes)):
        if boxes[b][2] < boxes[b][0]:
            boxes[b][0] = 0
    assert (boxes[:, 2] >= boxes[:, 0]).all()
There are smarter ways to do this, as every single grad student seems to point out, but this works fine.
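For what it's worth, one of those "smarter ways" might be a boolean mask, which does the same clamping without a Python loop. This is only a sketch: `clamp_inverted_boxes` is a made-up helper name, not part of fast-rcnn, and like the loop above it hides the bad boxes rather than fixing whatever produced them.

```python
import numpy as np

def clamp_inverted_boxes(boxes):
    """Set x1 to 0 wherever x2 < x1, mirroring the loop-based hack."""
    boxes = boxes.copy()                   # leave the caller's array untouched
    inverted = boxes[:, 2] < boxes[:, 0]   # one boolean per box
    boxes[inverted, 0] = 0
    return boxes

# Made-up example: the second box has x2 < x1.
boxes = np.array([[10, 5, 40, 30],
                  [90, 5, 20, 30]], dtype=np.float32)
fixed = clamp_inverted_boxes(boxes)
assert (fixed[:, 2] >= fixed[:, 0]).all()
```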
Check out the solution described in the following blog post, Part 4, Issue #4. The fix goes in the code that flips the x1 and x2 coordinate values: any box that ends up with x2 < x1 has its x1 reset to 0.
https://huangying-zhan.github.io/2016/09/22/detection-faster-rcnn.html
The following is copied from the link:
Problem: box[:, 0] > box[:, 2]
Solution: add the following code block in imdb.py
    def append_flipped_images(self):
        num_images = self.num_images
        widths = self._get_widths()
        for i in xrange(num_images):
            boxes = self.roidb[i]['boxes'].copy()
            oldx1 = boxes[:, 0].copy()
            oldx2 = boxes[:, 2].copy()
            # Mirror the x coordinates around the image width.
            boxes[:, 0] = widths[i] - oldx2
            boxes[:, 2] = widths[i] - oldx1
            # Added block: reset x1 to 0 for any box where x2 < x1.
            for b in range(len(boxes)):
                if boxes[b][2] < boxes[b][0]:
                    boxes[b][0] = 0
            assert (boxes[:, 2] >= boxes[:, 0]).all()
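Since the patch above only masks the symptom, it may help to find out which boxes go wrong in the first place. The following is a rough sketch, not code from fast-rcnn or the blog post: `find_bad_boxes` is a hypothetical helper, and it assumes boxes are stored as (x1, y1, x2, y2) rows with 0-based pixel coordinates.

```python
import numpy as np

def find_bad_boxes(boxes, width):
    """Return row indices of boxes that are inverted or outside [0, width - 1] in x."""
    # Work in float so the check itself cannot wrap around.
    boxes = np.asarray(boxes, dtype=np.float64)
    inverted = boxes[:, 2] < boxes[:, 0]
    out_of_range = (boxes[:, 0] < 0) | (boxes[:, 2] > width - 1)
    return np.flatnonzero(inverted | out_of_range)

# Made-up boxes for an image assumed to be 100 pixels wide.
boxes = [[10, 5, 40, 30],    # fine
         [90, 5, 20, 30],    # x2 < x1
         [50, 5, 120, 30]]   # x2 beyond the image width
bad = find_bad_boxes(boxes, width=100)
print(bad)
```

Running something like this over each image's boxes before flipping should point at the annotations or proposals that need fixing. A coordinate that has already wrapped around in an unsigned array would typically show up here as an implausibly large value.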
Source: https://stackoverflow.com/questions/31005463/how-to-train-new-fast-rcnn-imageset