pca

The relationship between PCA on parts of the data and PCA on the whole

泄露秘密 submitted on 2020-01-12 01:34:40
Hello, it's Saturday today. I wonder how many girls are thinking of me, haha. Back to the topic. This question comes from a beginner's line of thought: if PCA is run on a dataset X of shape (1000, 64) as a whole, versus splitting it into two or more parts, are the results the same? If not, why, and how large is the difference? As usual, the MNIST dataset is used for the exploration: 1 - PCA on the whole set with results, 64 -> 20. Singular values: [567.0065665 542.25185421 504.63059421 426.11767608 353.33503278 325.82036568 305.26157987 281.16033046 269.06977886 257.8239478 226.3187942 221.51478853 198.33066914 195.70009822 177.97288431 174.46075724 168.72640164 164.15235888 148.22422881 139.8223383] Explained-variance ratio: 0.8942989517025197. 2 - Split into two parts with PCA on each, both reduced to 20 dimensions: [494.16894798 449.86043083 361.65617118 248.75442367 238.56963133 219.44600126 190.14159973 182.20945787 174.00574178 156.07163314 148.54921821 138
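A minimal sketch of the experiment described above, assuming scikit-learn's digits data (also 64 features per image) as a stand-in for the post's (1000, 64) matrix; the split into halves is by rows and purely illustrative.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

X = load_digits().data[:1000]          # shape (1000, 64), a stand-in for the post's dataset

# 1 - PCA on the whole matrix, 64 -> 20
pca_all = PCA(n_components=20).fit(X)
print(pca_all.singular_values_)
print(pca_all.explained_variance_ratio_.sum())   # cumulative ratio; the post reports ~0.894 for its data

# 2 - split the rows into two halves and fit PCA on each half separately
half = X.shape[0] // 2
for part in (X[:half], X[half:]):
    pca_part = PCA(n_components=20).fit(part)
    print(pca_part.singular_values_[:5])         # leading singular values differ from the whole-data fit
```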

ggbiplot - change the group color and marker

微笑、不失礼 submitted on 2020-01-11 02:58:06
Question: In the example ggbiplot script plot there are 3 groups; how can I change the marker colors and shapes? library(ggbiplot) data(wine) wine.pca <- prcomp(wine, scale. = TRUE) ggbiplot(wine.pca, obs.scale = 1, var.scale = 1, group=wine.class, varname.size = 3, labels.size=3, ellipse = TRUE, circle = TRUE) + scale_color_discrete(name = '') + geom_point(aes(colour=wine.class), size = 3) + theme(legend.direction ='horizontal', legend.position = 'top') Answer 1: The following works for me. ggbiplot(wine
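For comparison, a minimal matplotlib/scikit-learn sketch of the same idea in Python (one color and one marker per group on the first two principal components); this is only an analogue of the ggbiplot call above, not the answer's R code, and it uses sklearn's built-in wine data.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

wine = load_wine()
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(wine.data))

# One color and marker per class, analogous to changing group aesthetics in ggbiplot
styles = {0: ("tab:red", "o"), 1: ("tab:green", "s"), 2: ("tab:blue", "^")}
fig, ax = plt.subplots()
for cls, (color, marker) in styles.items():
    mask = wine.target == cls
    ax.scatter(scores[mask, 0], scores[mask, 1], color=color, marker=marker,
               label=wine.target_names[cls], s=30)
ax.legend(loc="upper center", ncol=3)
ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
plt.show()
```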

[Matlab] Face recognition via PCA dimensionality reduction (with study materials, annotated code, and results)

北城余情 submitted on 2020-01-11 00:01:46
Matlab implementation of PCA face recognition. Winter break is here, and Mr. Wang has summarized some of what he learned this semester to share with everyone.
I. Theoretical background
1. Experience shared by others (not limited to these): (1) PCA face recognition explained in detail - a must-read for beginners. (2) Understanding Principal Component Analysis (PCA). (3) The LLE algorithm. (4) The method of Lagrange multipliers.
2. Mr. Wang's notes and the materials used: the original materials explain the theory well, and Mr. Wang only added some annotations - please excuse his limited level! ^-^ (1) 05 - Super-resolution reconstruction of face images. (2) 6.5 - Feature extraction based on the K-L transform. (3) Matlab PCA notes on image dimensionality reduction and face matching. Main references: 人脸识别与人体动作识别技术及应用 [monograph] / by Cao Lin. Beijing: Publishing House of Electronics Industry, 2015.8, ISBN 978-7-121-26660-7; 模式识别及MATLAB实现 [monograph] / edited by Yang Jie. Beijing: Publishing House of Electronics Industry, 2017.8, ISBN 978-7-121-32127-6.
II. Annotated code
1. Reshaping the training data - T(): function T = CreateDatabase(TrainDatabasePath) % This function reshapes all 2D images in the training database into 1D column vectors. % The 1D column vectors are then placed side by side to build the 2D matrix "T", which contains all the 1D image vectors. % All P images in the training database are assumed to share the same size MxN.
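A rough Python/NumPy sketch of the reshaping step that CreateDatabase performs, for readers not using Matlab; the directory layout, file extensions, and function name are assumptions, not the post's code.

```python
import os
import numpy as np
from PIL import Image

def create_database(train_path):
    """Reshape every 2D training image into a 1D column and stack the columns into matrix T.

    Assumes all P images in train_path share the same size M x N, as in the post.
    """
    columns = []
    for name in sorted(os.listdir(train_path)):
        if not name.lower().endswith((".png", ".jpg", ".pgm")):
            continue
        img = np.asarray(Image.open(os.path.join(train_path, name)).convert("L"), dtype=np.float64)
        columns.append(img.reshape(-1))      # flatten the M x N image into a length-M*N vector
    return np.column_stack(columns)          # T has shape (M*N, P)
```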

Data dimensionality reduction (PCA, KPCA, PPCA) and a C++ implementation

流过昼夜 submitted on 2020-01-10 22:13:31
1. What is dimensionality reduction
1.1 The curse of dimensionality: the number of samples needed to satisfy sampling requirements tends to be enormous, the samples are sparse, and distance computation becomes difficult.
1.2 Dimensionality reduction: a mathematical transformation maps the original high-dimensional attribute space to a low-dimensional "subspace", i.e. it extracts from the high-dimensional samples the features that can represent the original data.
1.3 Advantages: the dataset becomes easier to understand, use, and visualize; the computational cost of downstream algorithms drops; noise is removed.
2. Some dimensionality reduction algorithms: Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Locally Linear Embedding (LLE), Laplacian Eigenmaps. This post focuses on the following three:
2.1 PCA: a linear projection technique that preserves as much information as possible by maximizing the variance of the data after projection;
2.2 KPCA: PCA only uses second-order statistics of the data, makes no use of higher-order statistics, and ignores nonlinear correlations; KPCA maps the data to a high-dimensional space via a nonlinear transformation and extracts features there, giving better feature-extraction performance;
2.3 PPCA: PCA does not model the probability distribution of the data; PPCA gives PCA a probabilistic interpretation and thereby extends it.
In short, PPCA and KPCA both address shortcomings of PCA, in different directions.
3. Algorithm steps for PCA, KPCA, and PPCA
3.1 PCA: an orthogonal projection of the data onto a low-dimensional linear space, called the principal subspace, chosen so that the variance of the projected data is maximized.
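A minimal scikit-learn sketch of the PCA vs. KPCA contrast drawn above (the post itself goes on to give C++ code); the dataset and the RBF kernel parameters are illustrative assumptions, and PPCA is omitted because scikit-learn has no direct PPCA estimator.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

# Two concentric circles: linearly inseparable, so plain PCA cannot "unfold" them
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

X_pca = PCA(n_components=2).fit_transform(X)       # linear projection, variance maximized
X_kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X)  # nonlinear map, then PCA in feature space

# After KPCA the two circles separate along the leading components; after linear PCA they do not
print(X_pca[:3])
print(X_kpca[:3])
```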

Principal component analysis in R with prcomp and by myself: different results

…衆ロ難τιáo~ submitted on 2020-01-10 10:44:01
Question: Where am I going wrong? I am trying to perform PCA with prcomp and by myself, and I get different results; can you please help me? DOING IT BY MYSELF: >database <- read.csv("E:/R/database.csv", sep=";", dec=",") #it's 105 rows x 8 columns, each column is a variable >matrix.cor<-cor(database) >standardize<-function(x) {(x-mean(x))/sd(x)} >values.standard<-apply(database, MARGIN=2, FUN=standardize) >my.eigen<-eigen(matrix.cor) >loadings<-my.eigen$vectors >scores<-values.standard %*% loadings
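A minimal sketch of the same comparison in Python with NumPy and scikit-learn rather than R's prcomp; the data here is random and purely illustrative. One frequent reason the two routes look "different" is that eigenvectors are only defined up to sign, so whole columns of the scores may be flipped.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(105, 8))                      # stand-in for the 105 x 8 table in the question

# "By myself": standardize, eigen-decompose the correlation matrix, project
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigval, eigvec = np.linalg.eigh(np.corrcoef(X, rowvar=False))
order = np.argsort(eigval)[::-1]                   # eigh returns eigenvalues in ascending order
scores_manual = Z @ eigvec[:, order]

# Library PCA on the standardized data
scores_lib = PCA(n_components=8).fit_transform(Z)

# Identical up to per-column sign flips
print(np.allclose(np.abs(scores_manual), np.abs(scores_lib), atol=1e-8))
```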

Plot PCA loadings and loading in biplot in sklearn (like R's autoplot)

守給你的承諾、 submitted on 2020-01-09 13:09:29
Question: I saw this tutorial in R w/ autoplot. They plotted the loadings and loading labels: autoplot(prcomp(df), data = iris, colour = 'Species', loadings = TRUE, loadings.colour = 'blue', loadings.label = TRUE, loadings.label.size = 3) https://cran.r-project.org/web/packages/ggfortify/vignettes/plot_pca.html I prefer Python 3 w/ matplotlib, scikit-learn, and pandas for my data analysis. However, I don't know how to add these on. How can you plot these vectors w/ matplotlib? I've been reading
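A minimal scikit-learn/matplotlib sketch of a biplot in the spirit of what the question asks for (scores colored by species plus loading arrows and labels); the arrow scaling factor is an arbitrary illustrative choice, not part of any library API.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X = StandardScaler().fit_transform(iris.data)
pca = PCA(n_components=2)
scores = pca.fit_transform(X)                      # points of the biplot
loadings = pca.components_.T                       # one row per original feature

fig, ax = plt.subplots()
ax.scatter(scores[:, 0], scores[:, 1], c=iris.target, cmap="viridis", s=20)

scale = 3.0                                        # arbitrary stretch so the arrows are visible
for name, (lx, ly) in zip(iris.feature_names, loadings):
    ax.arrow(0, 0, scale * lx, scale * ly, color="blue", head_width=0.05)
    ax.annotate(name, (scale * lx, scale * ly), color="blue", fontsize=8)

ax.set_xlabel("PC1")
ax.set_ylabel("PC2")
plt.show()
```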

Extracting Principal Components in FactoMineR (R)

安稳与你 submitted on 2020-01-06 19:58:19
Question: I am trying to extract the principal components for a covariance matrix using PCA in FactoMineR. However, for some reason, I only see n-1 components in the var$coord variable: library(FactoMineR) x = matrix(rnorm(10000), nrow = 100, ncol = 100) y = PCA(x, ncp = 100, graph = FALSE) dim(y$var$coord) This gives an output of 100 99. I am new to this package and hope to get more insight. Answer 1: The maximum number of dimensions in a principal component analysis performed on p variables and n
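A minimal sketch of the same effect in Python with scikit-learn rather than FactoMineR: after centering, n samples span at most n-1 directions, so at most n-1 components carry nonzero variance even when p >= n. The matrix size mirrors the question; the variance threshold is an arbitrary numerical tolerance.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 100))        # n = 100 samples, p = 100 variables, as in the question

pca = PCA().fit(X)                     # n_components defaults to min(n_samples, n_features)
print(pca.components_.shape)           # (100, 100): all requested components are reported...
print(np.sum(pca.explained_variance_ > 1e-10))   # ...but only 99 carry variance (rank of the centered matrix)
```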

How to extract Principal Components using FactoMineR package

梦想的初衷 submitted on 2020-01-06 05:29:09
Question: I have a dataset with a mixture of categorical and numeric features. I have used the FAMD function from the FactoMineR package to perform Principal Component Analysis. However, I am unable to figure out a way to extract the components into another dataframe so that I can perform Principal Component Regression. I have attached an image of a subset of my data. #Performing Principal Component Analysis on Mixed Data library(FactoMineR) pca = FAMD(train[3:378], ncp=10) #Displaying the eigenvalue matrix pca
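In FactoMineR itself the individual scores are typically found under the result's ind$coord slot. As a rough Python analogue, here is a sketch that one-hot encodes the categoricals and runs PCA as a crude stand-in for FAMD (scikit-learn has no FAMD); the toy column names and the regression step are assumptions.

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Toy mixed-type frame standing in for the question's `train` data (column names are made up)
df = pd.DataFrame({
    "x1": [1.2, 3.4, 2.2, 0.5, 4.1, 2.9],
    "x2": [10, 20, 15, 12, 30, 25],
    "cat": ["a", "b", "a", "c", "b", "c"],
    "y":  [1.0, 2.1, 1.5, 0.7, 3.0, 2.4],
})

# One-hot encode the categoricals, scale everything, then run PCA
X = pd.get_dummies(df.drop(columns="y"))
scores = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(X))

# Put the component scores into a dataframe and regress y on them (principal component regression)
scores_df = pd.DataFrame(scores, columns=[f"PC{i+1}" for i in range(scores.shape[1])], index=df.index)
print(LinearRegression().fit(scores_df, df["y"]).coef_)
```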

Normalize PCA with scikit-learn when data is split

余生长醉 submitted on 2020-01-04 06:39:08
Question: I have a follow-up question on: How to normalize with PCA and scikit-learn. I'm creating an emotion detection system, and what I do now is: (1) split the data by emotion (distributing the data over multiple subsets); (2) add all the data together (the multiple subsets into one set); (3) get the PCA parameters of the combined data (self.pca = RandomizedPCA(n_components=self.n_components, whiten=True).fit(self.data)); (4) per emotion (per subset), apply PCA to the data of that emotion (subset). I should do the normalization at: step
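A minimal sketch of the flow described above with current scikit-learn (RandomizedPCA has since been removed; PCA(svd_solver="randomized") is its replacement); the per-emotion arrays and their sizes are illustrative placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
subsets = {                                   # placeholder per-emotion feature arrays
    "happy": rng.normal(size=(50, 64)),
    "sad": rng.normal(size=(40, 64)),
    "angry": rng.normal(size=(45, 64)),
}

# Combine all emotions, fit the scaler and PCA once on the combined data...
combined = np.vstack(list(subsets.values()))
scaler = StandardScaler().fit(combined)
pca = PCA(n_components=20, whiten=True, svd_solver="randomized").fit(scaler.transform(combined))

# ...then apply the same fitted transforms to each per-emotion subset
reduced = {name: pca.transform(scaler.transform(data)) for name, data in subsets.items()}
print({name: arr.shape for name, arr in reduced.items()})
```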