Difference between PCA (Principal Component Analysis) and Feature Selection

轻奢々 2021-01-31 10:57

What is the difference between Principal Component Analysis (PCA) and Feature Selection in Machine Learning? Is PCA a means of feature selection?

4 Answers
  •  名媛妹妹
    2021-01-31 11:43

    You can do feature selection with PCA.

    Principal component analysis (PCA) is a technique that

    "uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of uncorrelated variables called principal components."

    The fundamental question that PCA helps us answer is this: which of these M parameters explain a significant share of the variation contained within the data set? PCA essentially helps to apply an 80-20 rule: can a small subset of parameters (say 20%) explain 80% or more of the variation in the data?

    (see here)
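
    As a rough illustration of the 80-20 question above, the following sketch computes how much of the total variance each principal component explains and how many components are needed to reach 80%. The function name `explained_variance_ratio`, the 80% threshold, and the synthetic data (10 observed features driven by 2 hidden factors) are assumptions made for this example only. Note that each component is a linear combination of all the original features, so what is counted here is components, not a subset of the original columns.

```python
import numpy as np

def explained_variance_ratio(X):
    """Fraction of the total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)                  # PCA works on centered data
    # Squared singular values of the centered data are proportional to the
    # eigenvalues of its covariance matrix (largest first).
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s**2 / (X.shape[0] - 1)
    return var / var.sum()

# Synthetic data (an assumption for illustration): 10 observed features
# that are noisy mixtures of only 2 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 10))

ratio = explained_variance_ratio(X)
cumulative = np.cumsum(ratio)
# Smallest number of components whose cumulative share reaches 80%.
k = int(np.searchsorted(cumulative, 0.80) + 1)
print("components needed for 80% of the variance:", k)
```

    On data like this, a couple of components carry almost all of the variance, which is the situation the 80-20 rule describes.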

    But PCA has some shortcomings: it is sensitive to scale and gives more weight to features whose values have a higher order of magnitude. Data normalization is not always a solution, as explained here:

    http://www.simafore.com/blog/bid/105347/Feature-selection-with-mutual-information-Part-2-PCA-disadvantages
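
    The scale sensitivity can be seen in a small sketch. The two features below (a height in metres and an income in currency units) are hypothetical and generated independently of each other; the helper `first_pc` is likewise only an illustrative name. Standardizing is shown as one common remedy, keeping in mind the caveat above that normalization does not always solve the problem.

```python
import numpy as np

def first_pc(X):
    """Return the variance share and absolute loadings of the first component."""
    Xc = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    return float(ratio[0]), np.abs(vt[0])

# Hypothetical, independent features on very different scales.
rng = np.random.default_rng(0)
height_m = rng.normal(1.7, 0.1, size=500)       # values around 1.7
income = rng.normal(50_000, 10_000, size=500)   # values around 50,000
X = np.column_stack([height_m, income])

# On the raw data, the large-magnitude feature dominates the first component.
raw_ratio, raw_loadings = first_pc(X)

# After standardizing each column to zero mean and unit variance,
# neither feature dominates merely because of its units.
Z = (X - X.mean(axis=0)) / X.std(axis=0)
std_ratio, std_loadings = first_pc(Z)

print("raw:          share =", round(raw_ratio, 4), "loadings =", raw_loadings.round(4))
print("standardized: share =", round(std_ratio, 4), "loadings =", std_loadings.round(4))
```

    On the raw data the first component is essentially the income column alone, even though income carries no more information than height in this setup.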

    There are other ways to do feature selection:

    A feature selection algorithm can be seen as the combination of a search technique for proposing new feature subsets, along with an evaluation measure which scores the different feature subsets. The simplest algorithm is to test each possible subset of features finding the one which minimises the error rate. This is an exhaustive search of the space, and is computationally intractable for all but the smallest of feature sets. The choice of evaluation metric heavily influences the algorithm, and it is these evaluation metrics which distinguish between the three main categories of feature selection algorithms: wrappers, filters and embedded methods.

    (see here)
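
    The "test each possible subset" idea quoted above can be sketched as a tiny wrapper method. Everything here is an assumption chosen for illustration: the evaluation measure is the hold-out mean squared error of an ordinary least-squares fit, the data is synthetic (only features 0 and 2 actually influence the target), and the names `holdout_mse` and `best_subset` are hypothetical.

```python
from itertools import combinations
import numpy as np

def holdout_mse(X_train, y_train, X_test, y_test, cols):
    """Evaluation measure: test-set MSE of a least-squares fit on `cols`."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, cols]])
    A_test = np.column_stack([np.ones(len(X_test)), X_test[:, cols]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((A_test @ coef - y_test) ** 2))

def best_subset(X_train, y_train, X_test, y_test):
    """Search technique: exhaustively score every non-empty feature subset."""
    n_features = X_train.shape[1]
    best_cols, best_err, n_evaluated = (), float("inf"), 0
    for size in range(1, n_features + 1):
        for cols in combinations(range(n_features), size):
            err = holdout_mse(X_train, y_train, X_test, y_test, list(cols))
            n_evaluated += 1
            if err < best_err:
                best_cols, best_err = cols, err
    return best_cols, best_err, n_evaluated

# Synthetic data (an assumption): 5 features, only 0 and 2 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.5 * rng.normal(size=200)

best_cols, best_err, n_evaluated = best_subset(X[:150], y[:150], X[150:], y[150:])
print("best subset:", best_cols, "subsets evaluated:", n_evaluated)
```

    With M features the loop scores 2^M - 1 subsets (31 here), which is why exhaustive search is only practical for very small feature sets. Unlike PCA, the result is a subset of the original columns rather than new combined variables.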

    In some fields, feature extraction can suggest specific goals: in image processing, you may want to perform blob, edge or ridge detection.
