Linear Discriminant Analysis inverse transform

忘了有多久 2020-12-21 12:10

I am trying to use Linear Discriminant Analysis from the scikit-learn library in order to perform dimensionality reduction on my data, which has more than 200 features. But I could not …

2 Answers
  •  自闭症患者
    2020-12-21 12:38

    There is no inverse transform because, in general, you cannot return from the lower-dimensional feature space to your original coordinate space.

    Think of it like looking at your 2-dimensional shadow projected on a wall. You can't get back to your 3-dimensional geometry from a single shadow because information is lost during the projection.
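
    In scikit-learn terms, this is visible directly on the estimator objects. The minimal check below is not part of the original session and only assumes the standard import paths for PCA and LinearDiscriminantAnalysis:

    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    # PCA keeps its projection around, so it can map back (approximately)
    print(hasattr(PCA(), "inverse_transform"))                         # True
    # LDA offers no such method, which is exactly what the question ran into
    print(hasattr(LinearDiscriminantAnalysis(), "inverse_transform"))  # False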

    To address your comment regarding PCA, consider a data set of 10 random 3-dimensional vectors:

    In [1]: import numpy as np
    
    In [2]: from sklearn.decomposition import PCA
    
    In [3]: X = np.random.rand(30).reshape(10, 3)
    

    Now, what happens if we apply PCA, reduce the dimensionality by keeping only the first 2 (out of 3) principal components, and then apply the inverse transform?

    In [4]: pca = PCA(n_components=2)
    
    In [5]: pca.fit(X)
    Out[5]: 
    PCA(copy=True, iterated_power='auto', n_components=2, random_state=None,
      svd_solver='auto', tol=0.0, whiten=False)
    
    In [6]: Y = pca.transform(X)
    
    In [7]: X.shape
    Out[7]: (10, 3)
    
    In [8]: Y.shape
    Out[8]: (10, 2)
    
    In [9]: XX = pca.inverse_transform(Y)
    
    In [10]: X[0]
    Out[10]: array([ 0.95780971,  0.23739785,  0.06678655])
    
    In [11]: XX[0]
    Out[11]: array([ 0.87931369,  0.34958407, -0.01145125])
    

    Obviously, the inverse transform did not reconstruct the original data. The reason is that by dropping the lowest-variance PC, we lost information. Next, let's see what happens if we retain all PCs (i.e., we do not apply any dimensionality reduction):

    In [12]: pca2 = PCA(n_components=3)
    
    In [13]: pca2.fit(X)
    Out[13]: 
    PCA(copy=True, iterated_power='auto', n_components=3, random_state=None,
      svd_solver='auto', tol=0.0, whiten=False)
    
    In [14]: Y = pca2.transform(X)
    
    In [15]: XX = pca2.inverse_transform(Y)
    
    In [16]: X[0]
    Out[16]: array([ 0.95780971,  0.23739785,  0.06678655])
    
    In [17]: XX[0]
    Out[17]: array([ 0.95780971,  0.23739785,  0.06678655])
    

    In this case, we were able to reconstruct the original data because we didn't throw away any information (since we retained all the PCs).
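
    If you want a quick sanity check of that claim in code, the following sketch (not part of the transcript above) uses fresh random data of the same shape and verifies the round trip numerically:

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.random.rand(10, 3)                      # toy data, same shape as above
    pca_full = PCA(n_components=3)                 # keep all 3 PCs
    XX = pca_full.inverse_transform(pca_full.fit_transform(X))
    print(np.allclose(X, XX))                      # True: nothing was discarded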

    The situation with LDA is even worse: the maximum number of components that can be retained is not 200 (the number of features in your input data); rather, it is n_classes - 1. So if, for example, you were doing a binary classification problem (2 classes), the LDA transform would go from 200 input dimensions down to just a single dimension.
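
    As a hedged illustration of that last point, here is a small sketch with made-up random data (500 samples, 200 features, 2 classes); the output shape is the only thing it is meant to show:

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.RandomState(0)
    X = rng.rand(500, 200)                  # 500 samples, 200 features
    y = rng.randint(0, 2, size=500)         # 2 classes

    lda = LinearDiscriminantAnalysis(n_components=1)   # at most n_classes - 1 = 1
    Z = lda.fit_transform(X, y)
    print(Z.shape)                          # (500, 1): a single discriminant axis
    # and there is no lda.inverse_transform to take you back to 200 dimensions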
