Show feature names after feature selection

感情败类 2020-12-08 08:22

I need to build a text classifier, and I'm currently using TfidfVectorizer and SelectKBest to select the features, as follows:

vectorizer = TfidfVectoriz         


        
2 answers
  • 2020-12-08 08:46

    To expand on @ogrisel's answer: the returned list of features is in the same order as when they were vectorized. The code below gives you the 1000 top-ranked features, sorted by their chi-squared scores in descending order, along with the corresponding p-values:

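    import numpy as np
    # (index, score) pairs for the 1000 highest chi-squared scores, best first
    top_ranked_features = sorted(enumerate(ch2.scores_), key=lambda x: x[1], reverse=True)[:1000]
    top_ranked_features_indices = [index for index, _ in top_ranked_features]
    # get_feature_names_out() needs scikit-learn >= 1.0; older releases use get_feature_names()
    feature_names = np.asarray(train_vectorizer.get_feature_names_out())
    for feature_pvalue in zip(feature_names[top_ranked_features_indices], ch2.pvalues_[top_ranked_features_indices]):
        print(feature_pvalue)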
    
  • 2020-12-08 09:09

    The following should work:

    # get_feature_names_out() needs scikit-learn >= 1.0; older releases use get_feature_names()
    np.asarray(vectorizer.get_feature_names_out())[ch2.get_support()]
    