PCA

Comparing svd and princomp in R

给你一囗甜甜゛ submitted on 2019-12-31 02:03:46
Question: I want to compute the singular values of a matrix in R to obtain the principal components, and then run princomp(x) as well to compare the results. I know princomp() gives the principal components. How do I get the principal components from $d, $u, and $v (the result of s = svd(x))?
Answer 1: One way or another, you should probably look into prcomp, which calculates PCA using svd instead of eigen (as princomp does). That way, if all you want is the PCA output, but calculated using svd, you're golden. Also, if
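To make the correspondence concrete, here is a minimal numpy sketch (my own illustration, not the R answer above) of the same relationship: the centred data matrix factors as X = U diag(d) Vᵀ, the principal component scores are U diag(d) (equivalently X V), and the columns of V are the loadings, matching svd(x)$u, $d, and $v in R.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data
Xc = X - X.mean(axis=0)                              # PCA works on column-centred data

U, d, Vt = np.linalg.svd(Xc, full_matrices=False)    # Xc = U @ diag(d) @ Vt

scores_svd = U * d            # principal component scores (equivalently Xc @ Vt.T)
loadings_svd = Vt.T           # principal axes (loadings), one per column

pca = PCA().fit(Xc)
scores_pca = pca.transform(Xc)

# The two agree up to an arbitrary sign flip per component
print(np.allclose(np.abs(scores_svd), np.abs(scores_pca)))

The singular values also give the component variances, var_k = d_k^2 / (n - 1), which is what prcomp reports as sdev^2.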

keras autoencoder vs PCA

纵然是瞬间 submitted on 2019-12-30 10:18:11
Question: I am playing with a toy example to understand PCA vs. a keras autoencoder. I have the following code for understanding PCA:

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn import decomposition
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
pca = decomposition.PCA(n_components=3)
pca.fit(X)

pca.explained_variance_ratio_
array([ 0.92461621, 0.05301557, 0.01718514])
pca.components_
array([[ 0.36158968, -0.08226889, 0
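For comparison, here is a hedged sketch (my own code, not part of the question) of the autoencoder side of the experiment: a linear autoencoder trained with mean-squared error on the same centred iris data. Its 2-D bottleneck spans the same subspace as the top two PCA components.

import numpy as np
from sklearn import datasets
from tensorflow import keras

X = datasets.load_iris().data.astype("float32")
X = X - X.mean(axis=0)                       # centre the data, as PCA does

inputs = keras.Input(shape=(4,))
code = keras.layers.Dense(2, activation="linear", name="bottleneck")(inputs)
outputs = keras.layers.Dense(4, activation="linear")(code)

autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=500, batch_size=16, verbose=0)

encoder = keras.Model(inputs, code)
Z = encoder.predict(X, verbose=0)            # 2-D codes, comparable to PCA scores

With purely linear activations the autoencoder cannot beat PCA; the two approaches only start to differ once nonlinear activations are used.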

Why did PCA reduce the performance of Logistic Regression?

牧云@^-^@ submitted on 2019-12-30 07:18:08
Question: I ran logistic regression on a binary classification problem with data of dimensions 50000 x 370 and got an accuracy of about 90%. But when I applied PCA + logistic regression to the same data, the accuracy dropped to 10%, and I was shocked by this result. Can anybody explain what could have gone wrong?
Answer 1: There is no guarantee that PCA will ever help, or not harm, the learning process. In particular, if you use PCA to reduce the number of dimensions, you are removing information from your data, thus everything
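The effect the answer describes is easy to reproduce. Below is a hedged sketch on synthetic data (not the asker's 50000 x 370 set): keeping too few components after PCA discards class-relevant directions, and the cross-validated accuracy typically drops.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=5000, n_features=370,
                           n_informative=30, random_state=0)

plain = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
with_pca = make_pipeline(StandardScaler(), PCA(n_components=10),
                         LogisticRegression(max_iter=1000))

print("no PCA :", cross_val_score(plain, X, y, cv=3).mean())
print("PCA(10):", cross_val_score(with_pca, X, y, cv=3).mean())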

PCA first or normalization first?

只谈情不闲聊 submitted on 2019-12-29 03:36:08
Question: When doing regression or classification, what is the correct (or better) way to preprocess the data?
1. Normalize the data -> PCA -> training
2. PCA -> normalize PCA output -> training
3. Normalize the data -> PCA -> normalize PCA output -> training
Which of the above is more correct, or is there a "standardized" way to preprocess the data? By "normalize" I mean standardization, linear scaling, or some other technique.
Answer 1: You should normalize the data before doing PCA. For example, consider the
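A small sketch (my own example, not the one the answer goes on to give) of why normalising first matters: without scaling, a feature that happens to be measured in large units dominates the first principal component regardless of how informative it is.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
height_m = rng.normal(1.7, 0.1, 500)        # metres, small numeric variance
weight_g = rng.normal(70000, 10000, 500)    # grams, huge numeric variance
X = np.column_stack([height_m, weight_g])

raw = PCA().fit(X)
scaled = PCA().fit(StandardScaler().fit_transform(X))

print("unscaled:", raw.explained_variance_ratio_)     # ~[1, 0]: weight dominates
print("scaled  :", scaled.explained_variance_ratio_)  # roughly balanced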

Adding ellipses to a principal component analysis (PCA) plot

纵饮孤独 submitted on 2019-12-28 05:21:23
Question: I am having trouble adding grouping-variable ellipses on top of an individual-site PCA factor plot that also includes PCA variable factor arrows. My code:

prin_comp <- rda(data[, 2:9], scale=TRUE)
pca_scores <- scores(prin_comp)
# sites = individual site PC1 & PC2 scores, Waterbody = row grouping variable.
# Site scores in the PCA plot are stratified by Waterbody type.
plot(pca_scores$sites[,1], pca_scores$sites[,2], pch=21,
     bg=point_colors[data$Waterbody], xlim=c(-2,2), ylim=c(-2,2),
     xlab=x_axis_text,
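The question concerns the R/vegan workflow, but the underlying idea is toolkit-independent. Here is a hedged matplotlib sketch (my own code, with iris species as a stand-in for the Waterbody grouping) that plots PCA scores and draws a 2-sigma covariance ellipse around each group.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
from sklearn import datasets
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(iris.data))

fig, ax = plt.subplots()
for g in np.unique(iris.target):
    pts = scores[iris.target == g]
    ax.scatter(pts[:, 0], pts[:, 1], label=iris.target_names[g])
    # 2-sigma ellipse from the group covariance matrix
    cov = np.cov(pts.T)
    vals, vecs = np.linalg.eigh(cov)                       # ascending eigenvalues
    angle = np.degrees(np.arctan2(vecs[1, -1], vecs[0, -1]))
    width, height = 2 * 2 * np.sqrt(vals[::-1])            # full axes, largest first
    ax.add_patch(Ellipse(pts.mean(axis=0), width, height, angle=angle,
                         fill=False, edgecolor="gray"))
ax.legend()
plt.show()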

PCA (Principal Component Analysis)

我是研究僧i submitted on 2019-12-28 00:23:34
The PCA workflow. Code reference: https://www.cnblogs.com/clnchanpin/p/7199713.html ; computing the covariance matrix: https://docs.scipy.org/doc/numpy/reference/generated/numpy.cov.html ; the underlying idea: https://www.cnblogs.com/clnchanpin/p/7199713.html . Solve for the eigenvalues and eigenvectors of the covariance matrix. Why is the first step of PCA to subtract the mean from the data? Because the covariance can only be computed after each column has had its column mean subtracted. Sorting by eigenvalue magnitude uses numpy's argsort function: https://blog.csdn.net/maoersong/article/details/21875705 . This post gives a good summary of numpy's matrix class: https://www.cnblogs.com/sumuncle/p/5760458.html
3. Applications of eigenvalues and eigenvectors
1. Principal Component Analysis (PCA)
(1) Variance, covariance, correlation coefficient, and the covariance matrix. Variance: var(X) = (1/(n-1)) * Σ (x_i - x̄)². Covariance: cov(X, Y) = (1/(n-1)) * Σ (x_i - x̄)(y_i - ȳ). Variance measures the spread of a single variable; covariance measures how related two variables are: a larger covariance means the two variables are more similar, while a smaller covariance means they are closer to being mutually independent. Correlation coefficient:
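Putting the steps above together, here is a minimal numpy sketch (my own code, not taken from the linked posts) of eigendecomposition-based PCA: remove the column means, form the covariance matrix, eigendecompose it, and sort the components with argsort.

import numpy as np

def pca_eig(X, k):
    Xc = X - X.mean(axis=0)                 # step 1: remove the column means
    C = np.cov(Xc, rowvar=False)            # step 2: covariance matrix of the features
    eigvals, eigvecs = np.linalg.eigh(C)    # step 3: eigenvalues / eigenvectors
    order = np.argsort(eigvals)[::-1][:k]   # step 4: sort by eigenvalue, keep top k
    components = eigvecs[:, order]
    return Xc @ components, eigvals[order]  # scores and their variances

X = np.random.default_rng(0).normal(size=(100, 5))
scores, variances = pca_eig(X, k=2)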

Dimensionality Reduction Methods in Machine Learning

≯℡__Kan透↙ submitted on 2019-12-26 19:06:05
The purpose of dimensionality reduction: the immediate benefit is that fewer dimensions make computation and visualization easier; the deeper value lies in extracting and consolidating the useful information while discarding the useless information. The benefits: reduction facilitates data visualization, data analysis, data compression, data extraction, and so on.
Dimensionality reduction methods:
- Attribute (feature) selection: filter methods; wrapper methods; embedded methods.
- Mapping methods:
  - Linear mappings: PCA, LDA, SVD decomposition, etc.
  - Nonlinear mappings:
    - Kernel methods: KPCA, KFDA, etc.
    - 2-D embedding methods
    - Manifold learning: ISOMap, LLE, LPP, etc.
  - Other methods: neural networks and clustering.
A brief introduction to PCA: the idea of principal component analysis is the K-L (Karhunen-Loève) transform from linear algebra, the transform with the smallest distortion under the mean-squared-error criterion. It maps the original space into the space of eigenvectors, expressed mathematically as Ax = λx.
PCA pros and cons. Pros: 1) minimal error; 2) it extracts the main information. Cons: 1) computing the covariance matrix is computationally expensive.
A brief introduction to LDA: (1) The core idea of LDA is to project onto the normal vector of the linear discriminant hyperplane so that class separation is maximised (high cohesion within classes, low coupling between them). (2) LDA pros and cons. Pros: 1) simple and easy to understand. Cons: 1) the computation is relatively involved. (3) A problem: whether it is the PCA or the ICA we discussed earlier, with respect to the sample data
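As a concrete illustration of the taxonomy above, a hedged sklearn sketch (my own code, not from the post) applying one method from each family of mappings: unsupervised linear PCA, supervised linear LDA, and nonlinear kernel PCA.

from sklearn import datasets
from sklearn.decomposition import PCA, KernelPCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = datasets.load_iris(return_X_y=True)

X_pca = PCA(n_components=2).fit_transform(X)                             # linear, ignores labels
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)   # linear, uses labels
X_kpca = KernelPCA(n_components=2, kernel="rbf").fit_transform(X)        # nonlinear kernel variant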

A Detailed Explanation of SVD

时光总嘲笑我的痴心妄想 submitted on 2019-12-25 15:58:11
Copyright notice: this article was published by LeftNotEasy at http://leftnoteasy.cnblogs.com . It may be reposted in full or used in part, but please cite the source; for questions, contact wheeleast@gmail.com
Preface: last time I wrote about PCA and LDA. PCA is generally implemented in one of two ways, either via eigenvalue decomposition or via singular value decomposition, and the previous article gave an explanation based on eigenvalue decomposition. For most people, eigenvalues and singular values remain confined to pure mathematical calculation, and linear algebra or matrix theory courses rarely discuss any applications related to them. Yet singular value decomposition is a method with a very clear physical meaning: it expresses a fairly complicated matrix as the product of a few smaller, simpler matrices, and these small matrices describe the matrix's important characteristics. It is like describing a person: you tell someone that the person has thick eyebrows and big eyes, a square face, a full beard, and black-framed glasses, and those few features already give the listener a reasonably clear picture. In reality a face has countless features; we can describe it so briefly because humans are naturally very good at extracting the important ones. Teaching machines to extract important features is where SVD is an important method. In machine learning, a great many applications are connected with singular values, such as PCA for feature reduction, data compression algorithms (image compression being the representative case), and the semantic-level retrieval used by search engines, LSI (Latent Semantic
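A hedged numpy sketch (my own code, not from the article) of the idea described above: approximate a matrix with only its largest singular triplets, which is the mechanism behind SVD-based image compression and LSI.

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 80))              # stand-in for an image or term-document matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 10                                      # keep only the 10 largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Relative reconstruction error of the rank-k approximation
print(np.linalg.norm(A - A_k) / np.linalg.norm(A))

Storing U[:, :k], s[:k], and Vt[:k, :] takes k * (100 + 1 + 80) numbers instead of 100 * 80, which is where the compression comes from.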