Retain feature names after Scikit Feature Selection

感情败类 2021-02-07 13:16

After I run VarianceThreshold from Scikit-Learn on a set of data, it removes a couple of features. I feel I'm doing something simple yet stupid, but I'd like to retain th

5 answers
  •  没有蜡笔的小新
    2021-02-07 13:59

    There are probably better ways to do this, but for those interested, here's how I did it:

    import pandas as pd
    from sklearn.feature_selection import VarianceThreshold

    def VarianceThreshold_selector(data):
        # Select the model. threshold=0.0 is the default, i.e. only remove
        # features that have the same value in all samples
        selector = VarianceThreshold(threshold=0.0)

        # Fit the model
        selector.fit(data)
        # Integer positions of the features that were not removed
        kept = selector.get_support(indices=True)
        # Look the names up via data.columns; data[kept] would treat the
        # integers as column labels and fail when the columns are named
        features = data.columns[kept]

        # Format and return, keeping the original row index
        return pd.DataFrame(selector.transform(data), columns=features, index=data.index)
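
A shorter variant of the same idea, offered only as a sketch: `get_support()` without arguments returns a boolean mask over the input columns, so the original DataFrame can be sliced directly and the column names come along for free. The toy DataFrame `df` and its column names are made up for illustration and are not from the question.

```python
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Hypothetical toy data: column "b" is constant, so it has zero variance
df = pd.DataFrame({"a": [1, 2, 3], "b": [7, 7, 7], "c": [0.5, 0.1, 0.9]})

selector = VarianceThreshold(threshold=0.0)
selector.fit(df)

# Boolean mask, True for every column the selector keeps
mask = selector.get_support()

# Slice the original frame; names, dtypes and the row index are preserved
reduced = df.loc[:, mask]
kept_names = list(reduced.columns)
print(kept_names)
```

Slicing the original frame avoids rebuilding a DataFrame from the NumPy array that `transform` returns, which is where the column names get lost in the first place.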
    
