keras autoencoder vs PCA

忘掉有多难 2021-01-03 09:07

I am playing with a toy example to understand PCA vs keras autoencoder

I have the following code for understanding PCA:

import numpy as np
import mat         


        
2 answers
  •  庸人自扰
    2021-01-03 09:24

    The earlier answer covers the whole thing; however, I am doing the analysis on the Iris data. My code is a slight modification of the code from this post, which dives further into the topic. As requested, let's load the data:

    from sklearn.datasets import load_iris
    from sklearn.preprocessing import MinMaxScaler
    
    iris = load_iris()
    X = iris.data
    y = iris.target
    target_names = iris.target_names
    
    scaler = MinMaxScaler()
    scaler.fit(X)
    X_scaled = scaler.transform(X)
    

    Let's do a regular PCA

    from sklearn import decomposition
    pca = decomposition.PCA()
    pca_transformed = pca.fit_transform(X_scaled)
    plot3clusters(pca_transformed[:,:2], 'PCA', 'PC') 
    

    A very simple AE model with linear layers. As the earlier answer pointed out with ... the first reference: if one linear hidden layer and the mean squared error criterion are used to train the network, then the k hidden units learn to project the input onto the span of the first k principal components of the data. Note that only the span is shared; the individual AE units need not coincide with the principal components or be orthogonal.

    from keras.layers import Input, Dense
    from keras.models import Model
    import matplotlib.pyplot as plt
    
    # create an AE with 2 neurons in the dense (bottleneck) layer, using Keras' functional API, and fit it to our data
    input_dim = X_scaled.shape[1]
    encoding_dim = 2  
    input_img = Input(shape=(input_dim,))
    encoded = Dense(encoding_dim, activation='linear')(input_img)
    decoded = Dense(input_dim, activation='linear')(encoded)
    autoencoder = Model(input_img, decoded)
    autoencoder.compile(optimizer='adam', loss='mse')
    autoencoder.summary()  # summary() prints the table itself and returns None
    
    history = autoencoder.fit(X_scaled, X_scaled,
                    epochs=1000,
                    batch_size=16,
                    shuffle=True,
                    validation_split=0.1,
                    verbose = 0)
    
    # use our encoded layer to encode the training input
    encoder = Model(input_img, encoded)
    encoded_input = Input(shape=(encoding_dim,))
    decoder_layer = autoencoder.layers[-1]
    decoder = Model(encoded_input, decoder_layer(encoded_input))
    encoded_data = encoder.predict(X_scaled)
    
    plot3clusters(encoded_data[:,:2], 'Linear AE', 'AE')
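
One way to check the claim about the span, as a minimal sketch rather than a definitive test: compute the principal angles between the subspace spanned by the first k principal components and the subspace spanned by the decoder weights. The helper `principal_angles_deg` and the synthetic `X_demo` are my own illustrative names, not part of the code above. The "AE-like" basis is faked here by mixing the PCA basis with an arbitrary invertible matrix, which imitates a linear AE that found the same span in a different basis.

```python
import numpy as np

def principal_angles_deg(A, B):
    """Principal angles (in degrees) between the column spans of A and B."""
    Qa, _ = np.linalg.qr(A)  # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)  # orthonormal basis for span(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.degrees(np.arccos(np.clip(cosines, -1.0, 1.0)))

# Synthetic stand-in for the scaled data (assumption: 150 samples, 4 features).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(150, 4)) @ rng.normal(size=(4, 4))
Xc = X_demo - X_demo.mean(axis=0)

# PCA via SVD: the rows of Vt are the principal directions.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_basis = Vt[:2].T  # shape (4, 2)

# Same span, different (non-orthogonal) basis, like a linear AE might learn.
mix = np.array([[2.0, 1.0], [0.5, -1.5]])  # arbitrary invertible matrix
ae_like_basis = pca_basis @ mix

angles = principal_angles_deg(pca_basis, ae_like_basis)
print(angles)  # both angles should be numerically close to zero
```

With the Keras model above, the comparison would be between `pca.components_[:2].T` and `autoencoder.layers[-1].get_weights()[0].T` (both of shape (4, 2)). Angles close to zero would mean the two spans agree, though how close they get depends on how well training converged.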
    

    You can look into the loss if you want

    #plot our loss 
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model train vs validation loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper right')
    plt.show()
    

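    The function to plot the data (define it before running the snippets above, since they call it; it reads the globals y and target_names):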

    def plot3clusters(X, title, vtitle):
        import matplotlib.pyplot as plt
        plt.figure()
        colors = ['navy', 'turquoise', 'darkorange']
        lw = 2
    
        for color, i, target_name in zip(colors, [0, 1, 2], target_names):
            plt.scatter(X[y == i, 0], X[y == i, 1], color=color, alpha=1., lw=lw, label=target_name)
    
        plt.legend(loc='best', shadow=False, scatterpoints=1)
        plt.title(title)  
        plt.xlabel(vtitle + "1")
        plt.ylabel(vtitle + "2")
        plt.show()
    

    Regarding explaining the variability: using a non-linear hidden activation leads to other approximations, similar to ICA, t-SNE and others, where the idea of explained variance per component no longer applies. One can still look into the convergence of the loss.
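
Even without a per-component variance ordering, the reconstruction as a whole can still be scored. A rough sketch, assuming "variance explained" is taken to mean one minus the residual sum of squares over the total centered sum of squares; `variance_explained` and `X_demo` are hypothetical names of mine, and the PCA reconstruction below only serves as a sanity check for the formula.

```python
import numpy as np

def variance_explained(X, X_hat):
    """Fraction of the total variance of X captured by the reconstruction X_hat."""
    total = np.sum((X - X.mean(axis=0)) ** 2)  # total centered sum of squares
    residual = np.sum((X - X_hat) ** 2)        # reconstruction error
    return 1.0 - residual / total

# Synthetic stand-in for the scaled data (assumption: 150 samples, 4 features).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(150, 4)) @ rng.normal(size=(4, 4))
mean = X_demo.mean(axis=0)
Xc = X_demo - mean

# Rank-k PCA reconstruction via SVD.
_, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
X_hat = Xc @ Vt[:k].T @ Vt[:k] + mean

score = variance_explained(X_demo, X_hat)
expected = np.sum(S[:k] ** 2) / np.sum(S ** 2)  # sum of the top-k explained variance ratios
print(score, expected)
```

For the autoencoder above, `X_hat` would be `autoencoder.predict(X_scaled)`, scored against `X_scaled`; the same function works for a non-linear model, since it only looks at the reconstruction.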
