Question
I am using a convolutional neural network.
My data is quite imbalanced; I have two classes.
My first class contains: 551,462 image files
My second class contains: 52,377 image files
I want to use weighted_cross_entropy_with_logits, but I'm not sure I'm calculating the pos_weight argument correctly.
Right now I'm using:
classes_weights = tf.constant([0.0949784, 1.0])
cross_entropy = tf.reduce_mean(
    tf.nn.weighted_cross_entropy_with_logits(
        logits=logits, targets=y_, pos_weight=classes_weights))
train_step = tf.train.AdamOptimizer(LEARNING_RATE, epsilon=1e-03).minimize(
    cross_entropy, global_step=global_step)
Or should I use
classes_weights = 10.5287
Answer 1:
From the documentation:
pos_weight: A coefficient to use on the positive examples.
and
The argument pos_weight is used as a multiplier for the positive targets:
So if your first class is the positive one, then pos_weight = 52,377 / 551,462 ≈ 0.0950; otherwise pos_weight = 551,462 / 52,377 ≈ 10.5287.
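The arithmetic above can be checked with a few lines of plain Python. This is only a sketch: the helper name `pos_weight_for` is made up for illustration, and the class counts are the ones given in the question.

```python
# Class sizes taken from the question.
n_first = 551_462
n_second = 52_377

def pos_weight_for(n_positive, n_negative):
    """Hypothetical helper: negatives / positives, the multiplier for positive targets."""
    return n_negative / n_positive

# Case A: the first (large) class is labelled positive.
w_if_first_positive = pos_weight_for(n_first, n_second)    # roughly 0.0950

# Case B: the second (small) class is labelled positive.
w_if_second_positive = pos_weight_for(n_second, n_first)   # roughly 10.5287

print(w_if_first_positive, w_if_second_positive)
```

The two values are reciprocals of each other, so which one applies depends only on which class is encoded as target 1.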
Answer 2:
As @Salvador Dali said, the best source is the source code https://github.com/tensorflow/tensorflow/blob/5b10b3474bea72e29875264bb34be476e187039c/tensorflow/python/ops/nn_impl.py#L183
We have
log_weight = 1 + (pos_weight - 1) * targets
so it only applies if targets==1.
If targets==0 then log_weight = 1
If targets==1 then log_weight = pos_weight
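To see the effect of log_weight on a single example, here is a plain-Python sketch (not TensorFlow's own code). It assumes the loss has the form given in the TensorFlow documentation, pos_weight * z * -log(sigmoid(x)) + (1 - z) * -log(1 - sigmoid(x)), and compares it with the rearranged version built around log_weight. The function names are made up for illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def weighted_loss_naive(logit, target, pos_weight):
    """Direct form: pos_weight scales only the positive-target term."""
    p = sigmoid(logit)
    return pos_weight * target * -math.log(p) + (1.0 - target) * -math.log(1.0 - p)

def weighted_loss_stable(logit, target, pos_weight):
    """Rearranged form using log_weight = 1 + (pos_weight - 1) * target."""
    log_weight = 1.0 + (pos_weight - 1.0) * target
    # log(1 + exp(-x)), written so that exp never overflows.
    softplus_neg = math.log1p(math.exp(-abs(logit))) + max(-logit, 0.0)
    return (1.0 - target) * logit + log_weight * softplus_neg

pos_weight = 10.5287
for logit in (-2.0, 0.0, 1.5):
    for target in (0.0, 1.0):
        a = weighted_loss_naive(logit, target, pos_weight)
        b = weighted_loss_stable(logit, target, pos_weight)
        print(logit, target, round(a, 6), round(b, 6))
```

For target 0 the loss does not depend on pos_weight at all; for target 1 it is the unweighted loss multiplied by pos_weight.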
So if the ratio of positives to negatives is x/y, pos_weight needs to be y/x so that positives and negatives contribute equally to the total loss.
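A quick sanity check of that balancing argument, as a sketch that assumes the smaller class from the question (52,377 images) is the positive one:

```python
n_pos = 52_377     # assumed positive class (the smaller one)
n_neg = 551_462    # negative class

pos_weight = n_neg / n_pos               # y / x

total_pos_weight = n_pos * pos_weight    # every positive is weighted by pos_weight
total_neg_weight = n_neg * 1.0           # negatives keep weight 1

print(total_pos_weight, total_neg_weight)
```

The two totals match, which means both classes carry the same total weight. Their actual loss contributions are equal only to the extent that the average per-example loss is similar for both classes.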
Please note that each scalar in the targets tensor corresponds to one category, so each element of pos_weight corresponds to one category as well (not to the positive and negative probabilities of a single category).
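The per-category reading can be illustrated with a small plain-Python sketch. It assumes a vector pos_weight is applied element-wise along the last axis of targets (one weight per column); the batch below is hypothetical, and the helper name `log_weights` is made up for illustration.

```python
def log_weights(targets, pos_weight):
    """Element-wise 1 + (pos_weight - 1) * target, one pos_weight per column."""
    return [
        [1.0 + (w - 1.0) * t for t, w in zip(row, pos_weight)]
        for row in targets
    ]

# Hypothetical batch: each row is one example, each column one category.
targets = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]
pos_weight = [0.0949784, 1.0]  # the vector from the question

weights = log_weights(targets, pos_weight)
for row in weights:
    print(row)
```

Under this assumption, the vector from the question down-weights positive entries in the first column and leaves everything else at weight 1. It does not assign one weight to "negative" and another to "positive" for a single output.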
Source: https://stackoverflow.com/questions/43564490/how-correctly-calculate-tf-nn-weighted-cross-entropy-with-logits-pos-weight-vari