问题
This is the linear classifier that I am using to perform binary classification, here is code snippet:
my_optimizer = tf.train.AdagradOptimizer(learning_rate = learning_rate)
my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer,5.0)
# Create a linear classifier object
linear_classifier = tf.estimator.LinearClassifier(
feature_columns = feature_columns,
optimizer = my_optimizer
)
linear_classifier.train(input_fn = training_input_fn, steps = steps)
The dataset is imbalanced, there are only two classes yes/no. The number of NO class examples are 36548 while number of YES class examples are 4640.
How can I apply balancing to this data? I have been searching around and I could find stuff related to class weights etc but I couldn't find how can I create class weights and how to apply to the train method of tensor flow.
Here is how I am calculating losses:
training_probabilities = linear_classifier.predict(input_fn = training_predict_input_fn)
training_probabilities = np.array([item['probabilities'] for item in training_probabilities])
validation_probabilities = linear_classifier.predict(input_fn=validation_predict_input_fn)
validation_probabilities = np.array([item['probabilities'] for item in validation_probabilities])
training_log_loss = metrics.log_loss(training_targets, training_probabilities)
validation_log_loss = metrics.log_loss(validation_targets, validation_probabilities)
回答1:
I assume that you are using the log_loss
function from sklearn for computing your loss. If that is the case you can add class weights by using the argument sample_weight
and pass on an array containing the weight to be given for each data point. sample_weight
is an rolled out version of class_weights
. You can compute sample_weight
array by passing on the sample weights as given here.
Add the following lines to your code:
sample_wts = compute_sample_weight("balanced", training_targets)
training_log_loss = metrics.log_loss(training_targets, training_probabilities, sample_weight= sample_wts)
Hope this helps!
来源:https://stackoverflow.com/questions/57375168/how-to-apply-class-weights-in-linear-classifier-for-binary-classification