In my understanding, PCA can only be performed on continuous features. But while trying to understand the difference between one-hot encoding and label encoding, I came across the idea of applying PCA to one-hot encoded (categorical) data. Is that valid?
PCA is a dimensionality reduction method that can be applied to any set of numeric features, including one-hot encoded (i.e. categorical) data. Here is an example:
from sklearn.preprocessing import OneHotEncoder
# One-hot encode three integer-valued categorical columns
enc = OneHotEncoder()
# .toarray() densifies the sparse output so it can be passed to PCA below
X = enc.fit_transform([[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]]).toarray()
X
> array([[ 1.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  1.],
         [ 0.,  1.,  0.,  1.,  0.,  1.,  0.,  0.,  0.],
         [ 1.,  0.,  0.,  0.,  1.,  0.,  1.,  0.,  0.],
         [ 0.,  1.,  1.,  0.,  0.,  0.,  0.,  1.,  0.]])
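Each original column is expanded into one indicator column per distinct value (2 + 3 + 4 = 9 columns here). If you want to verify that, the fitted encoder exposes the learned categories (a small sketch, assuming scikit-learn >= 0.20 where categories_ is available):

# The values each of the 3 original columns was mapped to;
# the per-column category counts (2, 3 and 4) add up to the 9 indicator columns above
enc.categories_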
from sklearn.decomposition import PCA
# Reduce the 9 one-hot indicator columns to 3 principal components
pca = PCA(n_components=3)
X_pca = pca.fit_transform(X)
X_pca
> array([[-0.70710678,  0.79056942,  0.70710678],
         [ 1.14412281, -0.79056942,  0.43701602],
         [-1.14412281, -0.79056942, -0.43701602],
         [ 0.70710678,  0.79056942, -0.70710678]])
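To see how much of the structure of the one-hot matrix the three components actually retain, the fitted PCA object can report it. A short follow-up sketch, using only the pca object fitted above:

# Fraction of the total variance captured by each of the 3 components,
# and the total retained when going from 9 one-hot columns down to 3
print(pca.explained_variance_ratio_)
print(pca.explained_variance_ratio_.sum())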