PCA

Face recognition - Python

Submitted by 坚强是说给别人听的谎言 on 2019-12-14 00:31:42
Question: I am trying to implement face recognition by Principal Component Analysis (PCA) using Python. I am now able to get the minimum Euclidean distance between the training images and the input image input_image. Here is my code:

```python
import os
from PIL import Image
import numpy as np
import glob
import numpy.linalg as linalg

# Step 1: put database images into a 2D array
filenames = glob.glob('C:\\Users\\me\\Downloads\\/*.pgm')
filenames.sort()
img = [Image.open(fn).convert('L').resize((90, 90)) for fn
```
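The excerpt cuts off early; for orientation, here is a minimal sketch of the eigenfaces pipeline it is building, under my own assumptions (placeholder paths, an arbitrary choice of 20 retained components, same-size grayscale images). It is not the asker's full code:

```python
import glob
import numpy as np
from PIL import Image

# Load database images as flattened grayscale vectors (placeholder path).
filenames = sorted(glob.glob('faces/*.pgm'))
X = np.array([np.asarray(Image.open(fn).convert('L').resize((90, 90)), dtype=float).ravel()
              for fn in filenames])               # shape: (n_images, 8100)

# PCA via SVD on the mean-centered data.
mean_face = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - mean_face, full_matrices=False)
k = 20                                            # number of eigenfaces kept (arbitrary)
eigenfaces = Vt[:k]                               # shape: (k, 8100)
train_weights = (X - mean_face) @ eigenfaces.T    # coordinates of each image in face space

# Project an input image and pick the training image at minimum Euclidean distance.
probe = np.asarray(Image.open('input.pgm').convert('L').resize((90, 90)), dtype=float).ravel()
w = (probe - mean_face) @ eigenfaces.T
distances = np.linalg.norm(train_weights - w, axis=1)
print('closest match:', filenames[int(np.argmin(distances))])
```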

R - 'princomp' can only be used with more units than variables

Submitted by 爷,独闯天下 on 2019-12-14 00:22:20
Question: I am using R (R Commander) to cluster my data. I have a smaller subset of my data containing 200 rows and about 800 columns. I get the following error when trying to k-means cluster the data and plot it on a graph: "'princomp' can only be used with more units than variables". I then created a test set of 10 rows and 10 columns, which plots fine, but when I add an extra column I get the error again. Why is this? I need to be able to plot my clusters. When I view my data set after performing kmeans on
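The error is mathematical rather than an R quirk: with n = 200 observations and p ≈ 800 variables, the p × p covariance matrix has rank at most n − 1, and princomp, which eigendecomposes that covariance, refuses to run. SVD-based PCA (what R's prcomp uses) has no such restriction. A quick numpy illustration with the question's dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 800))          # 200 rows, 800 columns, as in the question
C = np.cov(X, rowvar=False)              # 800 x 800 covariance matrix
print(np.linalg.matrix_rank(C))          # at most 199: the covariance is rank-deficient

# SVD-based PCA works fine on wide data:
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
print(Vt.shape)                          # (200, 800): at most n principal directions exist
```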

PCA decomposition with Python: feature relevance

Submitted by 大憨熊 on 2019-12-13 16:30:04
Question: I am following this topic: How can I use PCA/SVD in Python for feature selection AND identification? We decompose our data set in Python with PCA, using sklearn.decomposition.PCA, and via the components_ attribute we get all the components. Now we have a very similar goal: we want to take only the first several components (this part is not a problem) and see what proportion of each input feature every PCA component carries (to know which features are most important for us). How
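One common answer pattern: each row of components_ is a unit vector of loadings over the original features, so normalized absolute loadings can serve as per-component feature "proportions". A sketch on synthetic data (the feature names and the absolute-loading convention are my choices, not from the question):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
feature_names = ['f0', 'f1', 'f2', 'f3', 'f4']    # hypothetical names

pca = PCA(n_components=2).fit(X)
# components_ has shape (n_components, n_features); each row is a unit vector
# whose entries are the loadings of the original features on that component.
for i, comp in enumerate(pca.components_):
    share = np.abs(comp) / np.abs(comp).sum()     # relative contribution per feature
    order = np.argsort(share)[::-1]
    print(f'PC{i + 1}:', [(feature_names[j], round(float(share[j]), 3)) for j in order])
```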

How to color points in 17 colors based on principal component? [closed]

Submitted by 做~自己de王妃 on 2019-12-13 09:47:33
Question: I am doing PCA in R on a data frame (df_f):

```r
pc_gtex <- prcomp(df_f)
plot(pc_gtex$x[,1], pc_gtex$x[,2], col=gtex_group, main = "PCA", xlab = "PC1", ylab = "PC2")
legend("topleft", col=1:17, legend = paste(unique(gtex_pm$tissue), 1:17), pch = 20, bty='n', cex=1.5)
```

Below is my group table for the PCA
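The likely catch in the truncated excerpt: base R's default palette has only 8 colors, so col = gtex_group with 17 groups recycles colors; in R the usual fix is a 17-color palette, e.g. col = rainbow(17)[as.numeric(factor(group))]. For comparison, a matplotlib version of the same plot on stand-in data (all names and sizes below are placeholders):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.normal(size=(340, 10))                    # stand-in data
groups = rng.integers(0, 17, size=len(X))         # hypothetical 17 tissue groups

pcs = PCA(n_components=2).fit_transform(X)
# 'tab20' has 20 visually distinct colors, enough for 17 groups.
sc = plt.scatter(pcs[:, 0], pcs[:, 1], c=groups, cmap='tab20', s=15)
plt.colorbar(sc, ticks=range(17), label='group')
plt.xlabel('PC1'); plt.ylabel('PC2'); plt.title('PCA')
plt.show()
```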

Colouring a PCA plot by clusters in R

Submitted by ♀尐吖头ヾ on 2019-12-13 09:28:15
Question: I have some biological data that looks like this, with two different types of clusters (A and B):

```
   Cluster_ID                          A1        A2        A3        B1         B2        B3
5  chr5:100947454..100947489,+    3.31322   7.52365   3.67255  21.15730   8.732710  17.42640
12 chr5:101227760..101227782,+    1.48223   3.76182   5.11534  15.71680   4.426170  13.43560
29 chr5:102236093..102236457,+   15.60700  10.38260  12.46040   6.85094  15.551400   7.18341
```

I clean up the data:

```r
CAGE <- read.table("CAGE_expression_matrix.txt", header=T)
CAGE_data <- as.data.frame(CAGE)
#Remove
```
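The general pattern being asked for, cluster labels reused as point colors on a PCA scatter, sketched in Python on stand-in data rather than the asker's CAGE matrix (two clusters, as in the question; everything else is a placeholder):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
X = rng.normal(size=(60, 6))                     # stand-in for the expression matrix

# Cluster first, then color the PCA projection by the cluster labels.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
pcs = PCA(n_components=2).fit_transform(X)
plt.scatter(pcs[:, 0], pcs[:, 1], c=labels, cmap='coolwarm', s=20)
plt.xlabel('PC1'); plt.ylabel('PC2')
plt.show()
```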

Using memmap files for batch processing

Submitted by 喜欢而已 on 2019-12-13 07:24:29
Question: I have a huge dataset on which I wish to run PCA. I am limited by RAM and by the computational efficiency of PCA, so I switched to Incremental PCA. Dataset size: (140000, 3504). The documentation states: "This algorithm has constant memory complexity, on the order of batch_size, enabling use of np.memmap files without loading the entire file into memory." This is really good, but I am unsure how to take advantage of it. I tried loading one memmap, hoping it would be accessed in chunks, but my RAM blew up.
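The documented constant-memory behavior comes from processing the array in batch-sized slices; what blows up RAM is any step that copies the whole memmap at once (for example, scaling or dtype-converting the entire array up front). A sketch of chunk-wise fitting via partial_fit, which only ever touches one slice at a time; the filename, dtype, component count, and batch size are placeholders:

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

n_rows, n_cols, batch = 140000, 3504, 1000
# Open the on-disk array without loading it; shape/dtype must match how it was written.
X = np.memmap('data.dat', dtype='float64', mode='r', shape=(n_rows, n_cols))

ipca = IncrementalPCA(n_components=50)
for start in range(0, n_rows, batch):
    ipca.partial_fit(X[start:start + batch])   # only one chunk resides in RAM at a time

# Transform chunk-wise too; the result (140000 x 50) is small enough to hold in memory.
parts = [ipca.transform(X[start:start + batch]) for start in range(0, n_rows, batch)]
X_reduced = np.vstack(parts)
```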

Why are there differences between GPArotation::Varimax and stats::varimax?

Submitted by 给你一囗甜甜゛ on 2019-12-13 06:06:38
Question: There are (at least) two different ways to varimax-rotate a loadings matrix in R: GPArotation::Varimax and stats::varimax. Oddly, even if Kaiser normalization is enabled for both, they yield subtly different results. That is a bit of a pain for testing.

```r
library(GPArotation)
library(psych)
data("Thurstone")
# find unrotated PCs first
principal.unrotated <- principal(r = Thurstone, nfactors = 4, rotate = "none")
loa <- unclass(principal.unrotated$loadings)
varimax.stats <- stats::varimax(x
```
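For intuition about what both functions compute, here is a plain numpy version of the classic varimax iteration; this is a textbook sketch, not the code of either R implementation. Note that stats::varimax with normalize = TRUE additionally rescales the rows of the loadings to unit length before rotating and undoes this afterwards, and the two packages use different convergence tolerances, which is one place their results can drift apart:

```python
import numpy as np

def varimax(L, gamma=1.0, max_iter=100, tol=1e-8):
    """Rotate a loadings matrix L (p x k) toward the varimax criterion."""
    p, k = L.shape
    R = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        Lam = L @ R
        # Gradient of the varimax criterion, projected back onto rotations via SVD.
        u, s, vt = np.linalg.svd(
            L.T @ (Lam ** 3 - (gamma / p) * Lam @ np.diag((Lam ** 2).sum(axis=0))))
        R = u @ vt
        var_new = s.sum()
        if var_new < var * (1 + tol):   # stop when the criterion plateaus
            break
        var = var_new
    return L @ R, R
```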

Package “fdapace” (R) - create a functional plot of the first principal component

Submitted by 拟墨画扇 on 2019-12-13 03:46:22
Question: My question is about functional principal component analysis in R. I am working with a multi-dimensional time series looking something like this: [figure omitted]. My goal is to reduce the dimensions by applying functional PCA and then plot the first principal component like this: [figure omitted]. I have already used the FPCA function of the fdapace package on the dataset. Unfortunately, I don't understand how to interpret the resulting matrix of FPCA estimates (xiEst). In my understanding, the values of the Principal
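For orientation: in fdapace's output, xiEst holds the estimated FPC scores ξ_ik, where entry (i, k) is the coordinate of curve i on eigenfunction k, so curve i is approximated as μ(t) + Σ_k ξ_ik φ_k(t) (μ and the φ_k appear in the FPCA output as mu and phi). A discretized Python sketch of that reconstruction idea, on synthetic curves over a common grid; fdapace handles sparse and irregular designs and normalizes eigenfunctions in L2, which this toy version does not:

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 50)                         # common time grid
curves = np.array([np.sin(2 * np.pi * t) * rng.normal(1, 0.3)
                   + rng.normal(0, 0.05, t.size) for _ in range(40)])

mu = curves.mean(axis=0)                          # mean function
U, s, Vt = np.linalg.svd(curves - mu, full_matrices=False)
phi = Vt                                          # rows: discretized eigenfunctions phi_k(t)
xi = (curves - mu) @ phi.T                        # scores: the analogue of fdapace's xiEst
# (fdapace's L2 normalization rescales phi and xi by sqrt of the grid spacing.)

# First-component reconstruction of curve 0: mu(t) + xi[0, 0] * phi_1(t)
recon = mu + xi[0, 0] * phi[0]
```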

[Study Notes] L1-PCA

Submitted by 半世苍凉 on 2019-12-13 02:27:59
Intro

The essence of PCA is projection from a high-dimensional space onto a low-dimensional one, and the projection itself amounts to left- (or right-) multiplying by a vector that carries the weights mapping the original feature space onto the projected one; after this linear weighting, the data is represented in the low-dimensional space. If the vector is replaced by a matrix (say one composed of m vectors), the original high-dimensional space is reduced to an m-dimensional one.

The problem L1-PCA solves is the outlier problem: ordinary PCA assumes the noise is Gaussian, and when anomalies violate that assumption its results degrade badly. Ordinary PCA is sensitive to outliers; L1-PCA is not.

PCA recap

It is worth understanding PCA from a mathematical angle. As said above, PCA is essentially a transformation, and a transformation can usually be expressed as a matrix product:

\[ \text{Define a vector } a \in R^{p \times 1} \text{ and a feature matrix } X \in R^{n \times p}. \text{ Reducing } X \text{ to one dimension is then simple: } X' = Xa, \quad X' \in R^{n \times 1} \]

From signal processing we know that the signal typically has large variance while the noise has small variance, so it is natural to require the projected data to have maximal variance. The next step is therefore variance maximization:

\[ \sigma^2_a = (Xa)^T(Xa) = a^T X^T X a = a^T V a \]

The task now becomes solving for the a that maximizes this variance. We introduce the constraint \( a^T a = 1 \) and use the Lagrangian method:

\[ z = a^T V a - \lambda (a^T a - 1) \]

Setting the derivative with respect to a to zero gives \( Va = \lambda a \), so the optimal a is an eigenvector of V, and the achieved variance equals the corresponding eigenvalue \( \lambda \).
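For completeness (my addition, not part of the note): a common L1-PCA formulation, L1-norm maximization (e.g. Kwak, 2008), replaces the squared-error variance above with an L1 criterion, which is what removes the sensitivity to outliers:

\[ a^{*} = \arg\max_{\|a\|_2 = 1} \|Xa\|_1 = \arg\max_{\|a\|_2 = 1} \sum_{i=1}^{n} \left| x_i^T a \right| \]

Because no residual is squared, a single far-away point can no longer dominate the objective.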

Singular Value Decomposition (SVD)

Submitted by 可紊 on 2019-12-13 00:45:46
Before introducing the singular value decomposition (SVD), let us review some basics about matrices.

Matrix basics

Square matrix: Given an $n \times m$ matrix $A$, if n and m are equal, i.e. the matrix has as many rows as columns, then $A$ is a square matrix.

Identity matrix: In linear algebra, the identity matrix of order n is an $n \times n$ square matrix whose main-diagonal entries are 1 and whose remaining entries are 0. The identity matrix is written $\mathbf{I}_n$.

Properties of the identity matrix:

$$ \text{1. } I_n B_{n \times m} = B_{n \times m} $$

$$ \text{2. } B_{n \times m} I_m = B_{n \times m} $$

$$ \text{3. } A_n I_n = I_n A_n = A_n $$

$$ \text{4. } I_n I_n = I_n $$

Transpose: The transpose is the simplest matrix transformation. Briefly, if the transpose of an $n \times m$ matrix $A$ is $A^{\mathrm{T}}$, then $A^{\mathrm{T}}$ is an $m \times n$ matrix with $\mathbf{A}^{\mathrm{T}}_{ij} = \mathbf{A}_{ji}$.
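The excerpt ends here, before reaching the decomposition itself; for reference (my addition, anticipating where the post is headed), the SVD factors any $n \times m$ matrix $A$ as

$$ A = U \Sigma V^{\mathrm{T}} $$

where $U$ ($n \times n$) and $V$ ($m \times m$) are orthogonal and $\Sigma$ ($n \times m$) is zero everywhere except for the non-negative singular values on its main diagonal. This ties back to PCA: the principal directions of a mean-centered data matrix are exactly its right singular vectors, the columns of $V$.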