I need to compute Information Gain scores for >100k features in >10k documents for text classification. Code below works fine but f
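The line feature_not_set_indices = [i for i in feature_range if i not in feature_set_indices] takes 90% of the time. If feature_set_indices is a list, every "not in" test scans the whole list, so the cost grows with the number of features times the number of set indices. Try changing it to a set operation: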
feature_not_set_indices = [i for i in feature_range if i not in feature_set_indices]
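One way the set-based version might look is sketched below. It assumes feature_range is an iterable of integer indices and feature_set_indices is an iterable of hashable integers; the function names are hypothetical and not part of the original code. The first variant keeps the original comprehension and only changes the membership test to a set lookup; the second uses a set difference and sorts the result to restore ascending order.

```python
def not_set_indices(feature_range, feature_set_indices):
    """Return the indices in feature_range that are absent from feature_set_indices.

    Hypothetical helper: converts the lookup collection to a set once, so each
    membership test is O(1) on average instead of a linear scan of a list.
    Preserves the order of feature_range.
    """
    set_indices = set(feature_set_indices)
    return [i for i in feature_range if i not in set_indices]


def not_set_indices_by_difference(feature_range, feature_set_indices):
    """Same result via a set difference.

    Sets are unordered, so the result is sorted to get ascending indices back.
    Assumes feature_range contains no duplicates.
    """
    return sorted(set(feature_range) - set(feature_set_indices))
```

Either function could replace the comprehension directly, for example feature_not_set_indices = not_set_indices(feature_range, feature_set_indices). Whether this removes the whole bottleneck depends on the rest of the loop, so it is worth profiling again after the change.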