Python sklearn - how to calculate p-values

再見小時候 2020-12-28 08:56

This is probably a simple question, but I am trying to calculate p-values for my features, using either classifiers for a classification problem or regressors for regressi

2 Answers
  •  借酒劲吻你
    2020-12-28 09:40

    You can run the significance test on X and y directly. For example, using the 20 newsgroups dataset and the chi-squared test (chi2):

    >>> from sklearn.datasets import fetch_20newsgroups_vectorized
    >>> from sklearn.feature_selection import chi2
    >>> data = fetch_20newsgroups_vectorized()
    >>> X, y = data.data, data.target
    >>> scores, pvalues = chi2(X, y)
    >>> pvalues
    array([  4.10171798e-17,   4.34003018e-01,   9.99999996e-01, ...,
             9.99999995e-01,   9.99999869e-01,   9.99981414e-01])
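
The same pattern should carry over to the regression case mentioned in the question. The sketch below uses f_regression, which also returns one score and one p-value per feature. The synthetic dataset from make_regression and all of its parameter values are assumptions chosen for illustration; they are not from the original answer. Note that these are univariate tests (each feature is tested against y on its own), so the p-values are not the same thing as coefficient p-values from a fitted model.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import f_regression

# Synthetic data (illustrative assumption): 5 features, of which 2 carry signal.
# With shuffle=False the informative features are the first columns.
X, y = make_regression(
    n_samples=200,
    n_features=5,
    n_informative=2,
    noise=1.0,
    shuffle=False,
    random_state=0,
)

# f_regression runs a univariate F-test for each feature against y
# and returns a tuple of (F-statistics, p-values), one entry per feature.
f_stats, pvalues = f_regression(X, y)

for i, (f, p) in enumerate(zip(f_stats, pvalues)):
    print(f"feature {i}: F = {f:.2f}, p = {p:.3g}")
```

For a classification problem with features that may be negative (chi2 requires non-negative values), f_classif from the same module has the same (scores, p-values) return shape.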
    
