auc

F1 Score vs ROC AUC

旧街凉风, posted on 2019-11-27 09:20:09

Question:

I have the following F1 and AUC scores for two different models:

Model 1: Precision: 85.11, Recall: 99.04, F1: 91.55, AUC: 69.94
Model 2: Precision: 85.1, Recall: 98.73, F1: 91.41, AUC: 71.69

My main goal is to predict the positive cases correctly, i.e., to reduce the number of false negatives (FN). Should I use the F1 score and choose Model 1, or use AUC and choose Model 2? Thanks.

Answer 1:

Introduction

As a rule of thumb, every time you want to compare ROC AUC vs F1 Score, think about it as if you are
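
As a quick sanity check on the numbers in the question: F1 is the harmonic mean of precision and recall, so it can be recomputed from the reported values. The sketch below is illustrative only; the function name `f1_from_pr` is an assumption and does not come from the original post.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (works for fractions or percentages)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (in percent) as reported in the question.
model_1_f1 = f1_from_pr(85.11, 99.04)
model_2_f1 = f1_from_pr(85.10, 98.73)

print(round(model_1_f1, 2))  # 91.55
print(round(model_2_f1, 2))  # 91.41
```

Both recomputed values agree with the F1 scores quoted in the question. Note that F1 is tied to one decision threshold, whereas ROC AUC summarizes how well the scores rank positives above negatives across all thresholds, so the two metrics can disagree about which model is better.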

Getting a low ROC AUC score but a high accuracy

被刻印的时光 ゝ, posted on 2019-11-27 04:34:29

I am using the LogisticRegression class in scikit-learn on a version of the flight delay dataset.

I use pandas to select some columns:

    df = df[["MONTH", "DAY_OF_MONTH", "DAY_OF_WEEK", "ORIGIN", "DEST", "CRS_DEP_TIME", "ARR_DEL15"]]

I fill in NaN values with 0:

    df = df.fillna({'ARR_DEL15': 0})

Make sure the categorical columns are marked with the 'category' data type:

    df["ORIGIN"] = df["ORIGIN"].astype('category')
    df["DEST"] = df["DEST"].astype('category')

Then call get_dummies() from pandas:

    df = pd.get_dummies(df)

Now I train and test my data set:

    from sklearn.linear_model import
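
One common way to end up with high accuracy and a low ROC AUC, which may or may not be what happens in this question, is class imbalance: a model that rarely predicts the minority class can still be right most of the time. The sketch below uses made-up toy labels (90 on-time flights, 10 delayed) and a small pure-Python AUC helper; the data and the helper names `accuracy` and `roc_auc` are assumptions for illustration, not part of the original post.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 90 on-time flights (0) and 10 delayed flights (1).
y_true = [0] * 90 + [1] * 10
# A model that assigns every flight the same low delay probability.
scores = [0.1] * 100
# Hard predictions at the usual 0.5 threshold: everything is "on time".
y_pred = [int(s >= 0.5) for s in scores]

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(acc, auc)  # 0.9 0.5
```

Accuracy is 0.9 simply because 90% of the labels are 0, while the AUC of 0.5 shows that the scores do not separate the classes at all.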

High AUC but bad predictions with imbalanced data

試著忘記壹切, posted on 2019-11-27 03:26:11

Question:

I am trying to build a classifier with LightGBM on a very imbalanced dataset. The imbalance ratio is 97:3, i.e.:

Class
0    0.970691
1    0.029309

The parameters I used and the training code are shown below.

lgb_params = {
    'boosting_type': 'gbdt',
    'objective': 'binary',
    'metric': 'auc',
    'learning_rate': 0.1,
    'is_unbalance': 'true',  # because training data is unbalanced (replaced with scale_pos_weight)
    'num_leaves': 31,  # we should let it be smaller than 2^(max_depth)
    'max_depth': 6,  # -1 means no
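
The comment in the parameters mentions replacing `is_unbalance` with `scale_pos_weight`. A widely used heuristic, assumed here rather than taken from the original post, is to set that weight to the ratio of negative to positive examples. The sketch below derives it from the class proportions quoted in the question; the dictionary name `imbalance_params` is hypothetical, and it is a plain dict that is not passed to LightGBM here.

```python
# Class proportions as reported in the question.
neg_fraction = 0.970691
pos_fraction = 0.029309

# Heuristic (an assumption, not from the original post): weight positives by neg / pos.
scale_pos_weight = neg_fraction / pos_fraction

# Hypothetical parameter fragment; 'is_unbalance' is left out on purpose because
# LightGBM documents it and 'scale_pos_weight' as alternatives, not to be combined.
imbalance_params = {
    "objective": "binary",
    "metric": "auc",
    "scale_pos_weight": scale_pos_weight,
}

print(round(scale_pos_weight, 2))  # 33.12
```

With a 97:3 split the weight comes out at roughly 33, meaning each positive example counts about 33 times as much as a negative one in the loss. Reweighting like this also shifts the predicted probabilities, so the decision threshold may need to be tuned separately.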

Getting a low ROC AUC score but a high accuracy

五迷三道, posted on 2019-11-26 11:15:24

Question:

I am using the LogisticRegression class in scikit-learn on a version of the flight delay dataset.

I use pandas to select some columns:

    df = df[["MONTH", "DAY_OF_MONTH", "DAY_OF_WEEK", "ORIGIN", "DEST", "CRS_DEP_TIME", "ARR_DEL15"]]

I fill in NaN values with 0:

    df = df.fillna({'ARR_DEL15': 0})

Make sure the categorical columns are marked with the 'category' data type:

    df["ORIGIN"] = df["ORIGIN"].astype('category')
    df["DEST"] = df["DEST"].astype('category')

Then call get