Question
I'm trying to train a LeNet on my own data (37 by 37 grayscale images belonging to 1024 categories).
I created the LMDB files and changed the size of the output layer to 1024. When I ran caffe train
with my solver file, the program got stuck after printing
...
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "score"
bottom: "label"
top: "loss"
}
I0713 17:11:13.334890 9595 layer_factory.hpp:77] Creating layer data
I0713 17:11:13.334939 9595 net.cpp:91] Creating Layer data
I0713 17:11:13.334950 9595 net.cpp:399] data -> data
I0713 17:11:13.334961 9595 net.cpp:399] data -> label
What could be the problem?
I'm new to Caffe; any help would be appreciated.
solver.prototxt
net: "lenet_auto_train.prototxt"
test_iter: 100
test_interval: 500
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "lenet"
lenet.prototxt
layer {
name: "data"
type: "Data"
top: "data"
top: "label"
transform_param {
scale: 0.00392156862745
}
data_param {
source: "dir/dat/1024_37*37_gray_lmdb"
batch_size: 64
backend: LMDB
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 20
kernel_size: 5
weight_filler {
type: "xavier"
}
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
convolution_param {
num_output: 50
kernel_size: 5
weight_filler {
type: "xavier"
}
}
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "fc1"
type: "InnerProduct"
bottom: "pool2"
top: "fc1"
inner_product_param {
num_output: 500
weight_filler {
type: "xavier"
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "fc1"
top: "fc1"
}
layer {
name: "score"
type: "InnerProduct"
bottom: "fc1"
top: "score"
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
}
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "score"
bottom: "label"
top: "loss"
}
Answer 1:
In my case, this happened when the same LMDB was used for both training and testing.
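A minimal sketch of what separate TRAIN/TEST data layers could look like, assuming two distinct LMDBs (the "train_lmdb" and "test_lmdb" paths below are placeholders, not the asker's actual directories):
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TRAIN }
  transform_param { scale: 0.00392156862745 }
  data_param {
    source: "dir/dat/train_lmdb"  # placeholder path for the training LMDB
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include { phase: TEST }
  transform_param { scale: 0.00392156862745 }
  data_param {
    source: "dir/dat/test_lmdb"   # placeholder path for a separate test LMDB
    batch_size: 100
    backend: LMDB
  }
}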
Answer 2:
It seems like Caffe is trying to read the LMDB and then encounters a problem.
My guess is that your db name "dir/dat/1024_37*37_gray_lmdb"
is causing the problem: having a "*"
character in a file name is not good practice.
Change the db name to something like "dir/dat/1024_37x37_gray_lmdb"
and try again (don't forget to change the prototxt as well).
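With the directory renamed on disk, the data layer's data_param would then point at the new name (this is just the asker's original layer with the "*" removed from the path):
data_param {
  source: "dir/dat/1024_37x37_gray_lmdb"  # renamed: no "*" in the path
  batch_size: 64
  backend: LMDB
}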
Answer 3:
The problem was that I put the options test_iter: 100
and test_interval: 500
in the solver file, but did not specify a test network or test data layer in the network file.
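One way to make the solver and net consistent is simply to drop the test settings from the solver; below is a sketch of the asker's solver with the two test lines removed (the alternative is to add a TEST-phase data layer, as in Answer 1):
net: "lenet_auto_train.prototxt"
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "inv"
gamma: 0.0001
power: 0.75
display: 100
max_iter: 10000
snapshot: 5000
snapshot_prefix: "lenet"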
Source: https://stackoverflow.com/questions/38348801/caffe-hangs-after-printing-data-label