pca

Robust PCA Optimization

你说的曾经没有我的故事 submitted on 2019-12-04 19:51:59
Robust PCA Optimization, 2017-09-03 13:08:08, by YongqiangGao. Column: background modeling. Licensed CC 4.0 BY-SA; original link: https://blog.csdn.net/u010510350/article/details/77803572

I have recently been reading papers that use Robust PCA for background modeling, and took the opportunity to summarize Robust PCA. An earlier post covered the difference between PCA and Robust PCA; this post summarizes the common optimization methods for Robust PCA. Comments and discussion are welcome. I also strongly recommend Rachel Zhang's blog post, "Robust PCA study notes."

1. Preliminaries
2. Problem statement

In many practical applications the observed data matrix D is low-rank or approximately low-rank, but its low-rank structure is corrupted by errors that are sparsely distributed and of arbitrarily large magnitude. To recover the low-rank structure of D, the matrix is decomposed as the sum of two unknown matrices, D = A + E, where A is low-rank and E is sparse. When the entries of E follow an i.i.d. Gaussian distribution, classical PCA recovers the optimal A by solving

    min_A ||D - A||_F   subject to   rank(A) <= r

where r is the target rank. When E is instead a sparse matrix of large errors, introducing a trade-off factor λ turns the recovery into

    min_{A,E} rank(A) + λ||E||_0   subject to   D = A + E.

The rank function and the ℓ0 norm are both non-convex, which makes this problem NP-hard, so it must be relaxed before it can be optimized.
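The standard convex relaxation (Principal Component Pursuit) replaces rank(A) with the nuclear norm ||A||_* and ||E||_0 with ||E||_1, and is commonly solved with an augmented Lagrangian method. Below is a minimal numpy sketch of the inexact ALM iteration, written for this summary rather than taken from the original blog; the defaults (λ = 1/sqrt(max(m, n)), the μ schedule) follow common choices from the literature.

    import numpy as np

    def robust_pca(D, lam=None, tol=1e-7, max_iter=500):
        # Principal Component Pursuit via inexact ALM:
        #   min ||A||_* + lam * ||E||_1   subject to   D = A + E
        m, n = D.shape
        if lam is None:
            lam = 1.0 / np.sqrt(max(m, n))   # common default trade-off factor
        norm_D = np.linalg.norm(D, 'fro')
        Y = np.zeros_like(D)                 # Lagrange multiplier
        E = np.zeros_like(D)                 # sparse component
        mu = 1.25 / np.linalg.norm(D, 2)     # penalty parameter
        rho = 1.6                            # penalty growth rate
        for _ in range(max_iter):
            # Low-rank step: singular value thresholding of D - E + Y/mu
            U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
            A = (U * np.maximum(s - 1.0 / mu, 0)) @ Vt
            # Sparse step: elementwise soft-thresholding (shrinkage)
            T = D - A + Y / mu
            E = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0)
            # Dual update and penalty increase
            R = D - A - E
            Y = Y + mu * R
            mu = rho * mu
            if np.linalg.norm(R, 'fro') / norm_D < tol:
                break
        return A, E

Each iteration shrinks singular values to update the low-rank part A and soft-thresholds entries to update the sparse part E, until the residual D - A - E is negligible.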

What's wrong with my PCA?

女生的网名这么多〃 submitted on 2019-12-04 19:06:07
Question: My code:

    from numpy import *

    def pca(orig_data):
        data = array(orig_data)
        data = (data - data.mean(axis=0)) / data.std(axis=0)
        u, s, v = linalg.svd(data)
        print s  # should be s**2 instead!
        print v

    def load_iris(path):
        lines = []
        with open(path) as input_file:
            lines = input_file.readlines()
        data = []
        for line in lines:
            cur_line = line.rstrip().split(',')
            cur_line = cur_line[:-1]
            cur_line = [float(elem) for elem in cur_line]
            data.append(array(cur_line))
        return array(data)

    if __name__ == '__main__'
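For reference, a sketch of the usual fix (not taken from the thread): the singular values s of the standardized data give the PCA eigenvalues as s**2/(n-1), and the rows of the V^T factor from the SVD are the principal directions.

    import numpy as np

    def pca(orig_data):
        data = np.asarray(orig_data, dtype=float)
        data = (data - data.mean(axis=0)) / data.std(axis=0)
        u, s, vt = np.linalg.svd(data, full_matrices=False)
        eigvals = s ** 2 / (data.shape[0] - 1)  # variance along each component
        return eigvals, vt                      # rows of vt = principal directions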

Omit NA and data imputation before doing PCA analysis using R

北城余情 submitted on 2019-12-04 15:57:53
I am trying to do PCA analysis using the princomp function in R. The following is the example code:

    mydf <- data.frame(
        A = c("NA", rnorm(10, 4, 5)),
        B = c("NA", rnorm(9, 4, 5), "NA"),
        C = c("NA", "NA", rnorm(8, 4, 5), "NA")
    )
    out <- princomp(mydf, cor = TRUE, na.action=na.exclude)
    Error in cov.wt(z) : 'x' must contain finite values only

I tried to remove the NA from the dataset, but it does not work.

    ndnew <- mydf[complete.cases(mydf),]
                      A                B                C
    1                NA               NA               NA
    2  1.67558617743171 1.28714736288378               NA
    3 -1.03388645096478  9.8370942023751 10.9522215389562
    4  7.10494481721949 14.7686678743866 4.06560213642725
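The underlying problem is that the quoted "NA" strings are character data, not R's missing value NA, so the columns stop being numeric and complete.cases() finds nothing to drop. As a sketch of the same clean-then-PCA workflow in Python/pandas (an analogue, since this page's other examples use Python; not the R answer):

    import numpy as np
    import pandas as pd
    from sklearn.decomposition import PCA

    # Use real missing values (np.nan), not the string "NA"
    mydf = pd.DataFrame({
        "A": np.r_[np.nan, np.random.normal(4, 5, 10)],
        "B": np.r_[np.nan, np.random.normal(4, 5, 9), np.nan],
        "C": np.r_[np.nan, np.nan, np.random.normal(4, 5, 8), np.nan],
    })
    clean = mydf.dropna()                     # or mydf.fillna(mydf.mean()) to impute
    z = (clean - clean.mean()) / clean.std()  # cor=TRUE ~ PCA on standardized data
    scores = PCA().fit_transform(z)           # princomp-style component scores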

Extracting PCA components with sklearn

别来无恙 submitted on 2019-12-04 15:44:45
Question: I am using sklearn's PCA for dimensionality reduction on a large set of images. Once the PCA is fitted, I would like to see what the components look like. One can do so by looking at the components_ attribute. Not realizing that was available, I did something else instead:

    each_component = np.eye(total_components)
    component_im_array = pca.inverse_transform(each_component)
    for i in range(num_components):
        component_im = component_im_array[i, :].reshape(height, width)
        # do something with
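For comparison, a sketch that reads the fitted components directly (pca, height, and width as in the question). Note that inverse_transform adds the fitted mean back, so the question's approach differs from components_ by pca.mean_:

    import numpy as np

    # Each row of components_ is one principal direction in image space.
    for i, component in enumerate(pca.components_):
        component_im = component.reshape(height, width)
        # visualize or save component_im here

    # For an unwhitened PCA, inverse_transform(X) computes
    # X @ pca.components_ + pca.mean_, so inverse_transform(np.eye(n))
    # recovers the components shifted by the mean image.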

scikits-learn pca dimension reduction issue

社会主义新天地 submitted on 2019-12-04 14:53:51
I have a problem with dimensionality reduction using scikit-learn and PCA. I have two numpy matrices: one of size (1050, 4096) and another of size (50, 4096). I tried to reduce the dimensions of both to yield (1050, 399) and (50, 399), but after doing the PCA I got (1050, 399) and (50, 50) matrices. One matrix is for kNN training and the other for kNN testing. What's wrong with my code below?

    pca = decomposition.PCA()
    pca.fit(train)
    pca.n_components = 399
    train_reduced = pca.fit_transform(train)

    pca.n_components = 399
    pca.fit(test)
    test_reduced = pca.fit_transform(test)

Call fit_transform() on train,
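The truncated answer is pointing at the standard pattern: PCA yields at most min(n_samples, n_features) components, so fitting on the 50-row test matrix caps its output at 50 dimensions. A sketch of the fix, with train and test as in the question:

    from sklearn.decomposition import PCA

    # Fit on the training data only, then project the test data
    # onto the same fitted basis instead of refitting.
    pca = PCA(n_components=399)
    train_reduced = pca.fit_transform(train)  # (1050, 399)
    test_reduced = pca.transform(test)        # (50, 399)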

PCA 众享车时代 (Zhongxiang Auto Era) scheme customization

倾然丶 夕夏残阳落幕 submitted on 2019-12-04 13:38:34
PCA 众享车时代 (Zhongxiang Auto Era) scheme customization ▋苏生191微5743电0729▋ PCA Zhongxiang Auto Era system development, software development, ready-made software source code, app development, source-code development, blockchain reservation-game development, app source-code development, scheme development, reservation-revenue software development, custom scheme development.

In the mobile internet era, the scenarios in which users obtain information, and their purposes in doing so, have become increasingly diverse and fine-grained. The PC-era text-and-image style of search no longer suits mobile, and mining useful information out of massive data and returning it to users quickly has become ever more important. It has also become correspondingly difficult: it requires the search engine to "understand" what the user needs.

PCA Zhongxiang Auto Era system development — LOOM token payout and team rewards:
- Level-1 referral reward: 10%
- Level-2 referral reward: 8%
- Level-3 referral reward: 5%
- Level-4 referral reward: 3%
- Level-5 referral reward: 1%
- Members who reach level 5 earn a 1% reward across unlimited generations
- Directly refer 10+ members who each validly hire a Zhongxiang car 3+ times: 1,000 referral reward
- Directly refer 20+ such members: 2,500 referral reward
- Directly refer 30+ such members: 4,000 referral reward
- Build a group of 200+ people (50% active): the platform awards 2,000 gasoline plus a 1,000 referral reward
- Build a group of 300+ people (50% active): the platform awards 4,500 gasoline plus a 1,500 referral reward
- Build a group of 400+ people

PCA 众享车时代 (Zhongxiang Auto Era) ready-made app development

随声附和 submitted on 2019-12-04 13:38:09
PCA 众享车时代 (Zhongxiang Auto Era) ready-made app development ▋苏生191微5743电0729▋ PCA Zhongxiang Auto Era system development, software development, ready-made software source code, app development, source-code development, blockchain reservation-game development, app source-code development, scheme development, reservation-revenue software development, custom scheme development.

As traditional enterprises seek deep integration with the mobile internet, the marketing short-sightedness and technical bottlenecks they exhibit are the foremost problem every traditional enterprise faces. In the traditional-internet era, a traditional business that wanted to graft internet marketing onto its operations had to pay an extremely heavy price.

PCA Zhongxiang Auto Era system development — LOOM token payout and team rewards: identical to the schedule in the previous post.

Principal Component Analysis on Weka

ⅰ亾dé卋堺 submitted on 2019-12-04 10:01:15
I have just computed PCA on a training set, and Weka returned the new attributes along with the way in which they were selected and computed. Now I want to build a model using these data and then apply the model to a test set. Do you know if there is a way to automatically modify the test set according to the new attributes?

Answer: Do you need the principal components for analysis, or just to feed into the classifier? If the latter, just use the Meta->FilteredClassifier classifier. Set the filter to PrincipalComponents and the classifier to whatever classifier you want to use. Train it on the un
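The FilteredClassifier approach is Weka-specific; the same "filter and classifier trained as one unit" idea in Python's scikit-learn (an analogue sketch with placeholder names, not part of the Weka answer) looks like this:

    from sklearn.pipeline import make_pipeline
    from sklearn.decomposition import PCA
    from sklearn.linear_model import LogisticRegression

    # The PCA transform is fitted on the training set and then applied
    # automatically to any test data, like Weka's FilteredClassifier.
    model = make_pipeline(PCA(n_components=10), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)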

PCA Dimensionality Reduction

我是研究僧i submitted on 2019-12-04 09:39:48
Question: I am trying to perform PCA, reducing 900 dimensions to 10. So far I have:

    covariancex = cov(labels);
    [V, d] = eigs(covariancex, 40);
    pcatrain = (trainingData - repmat(mean(traingData), 699, 1)) * V;
    pcatest = (test - repmat(mean(trainingData), 225, 1)) * V;

Where labels is a 1x699 vector of labels for chars (1-26), trainingData is 699x900 (900-dimensional data for the images of 699 chars), and test is 225x900 (225 chars, 900-dimensional). Basically I want to reduce this down to 225x10, i.e. 10 dimensions, but am
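The question is cut off, but the visible bug is that the covariance is computed from the label vector instead of the 900-dimensional training data. A numpy sketch of the intended computation (trainingData and test shaped as described above; not the thread's answer):

    import numpy as np

    mean = trainingData.mean(axis=0)                  # trainingData: (699, 900)
    cov = np.cov(trainingData - mean, rowvar=False)   # (900, 900) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
    V = eigvecs[:, np.argsort(eigvals)[::-1][:10]]    # top-10 eigenvectors
    pcatrain = (trainingData - mean) @ V              # (699, 10)
    pcatest = (test - mean) @ V                       # (225, 10)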

How to programmatically determine the column indices of principal components using FactoMineR package?

こ雲淡風輕ζ submitted on 2019-12-04 07:29:49
Given a data frame containing mixed variables (i.e. both categorical and continuous) like:

    digits = 0:9
    # set seed for reproducibility
    set.seed(17)
    # function to create random string
    createRandString <- function(n = 5000) {
      a <- do.call(paste0, replicate(5, sample(LETTERS, n, TRUE), FALSE))
      paste0(a, sprintf("%04d", sample(9999, n, TRUE)), sample(LETTERS, n, TRUE))
    }
    df <- data.frame(ID=c(1:10),
                     name=sample(letters[1:10]),
                     studLoc=sample(createRandString(10)),
                     finalmark=sample(c(0:100), 10),
                     subj1mark=sample(c(0:100), 10),
                     subj2mark=sample(c(0:100), 10))

I perform unsupervised feature selection
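The question breaks off here, but its title asks how to find, programmatically, which original columns drive each principal component. A sketch of reading that off the loadings with scikit-learn on the numeric columns (an analogue of inspecting FactoMineR's output, not a FactoMineR answer; X is a placeholder for the mark columns):

    import numpy as np
    from sklearn.decomposition import PCA

    # X: array of the numeric columns (finalmark, subj1mark, subj2mark)
    pca = PCA().fit(X)
    loadings = np.abs(pca.components_)        # (n_components, n_columns)
    top_col_per_pc = loadings.argmax(axis=1)  # dominant original column per PC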