After running a Variance Threshold from Scikit-Learn on a set of data, it removes a couple of features. I feel I\'m doing something simple yet stupid, but I\'d like to retain th
There's probably better ways to do this, but for those interested here's how I did:
def VarianceThreshold_selector(data):
#Select Model
selector = VarianceThreshold(0) #Defaults to 0.0, e.g. only remove features with the same value in all samples
#Fit the Model
selector.fit(data)
features = selector.get_support(indices = True) #returns an array of integers corresponding to nonremoved features
features = [column for column in data[features]] #Array of all nonremoved features' names
#Format and Return
selector = pd.DataFrame(selector.transform(data))
selector.columns = features
return selector