PCA

Face recognition - Python

Submitted by 坚强是说给别人听的谎言 on 2019-12-14 00:31:42
Question: I am trying to implement face recognition by Principal Component Analysis (PCA) using Python. I am now able to get the minimum Euclidean distance between the training images and the input image input_image. Here is my code:

```python
import os
from PIL import Image
import numpy as np
import glob
import numpy.linalg as linalg

# Step 1: put database images into a 2D array
filenames = glob.glob('C:\\Users\\me\\Downloads\\/*.pgm')
filenames.sort()
img = [Image.open(fn).convert('L').resize((90, 90)) for fn
```
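The excerpt cuts off early; for orientation, here is a minimal sketch of the eigenfaces pipeline it is building, under my own assumptions (placeholder paths, an arbitrary choice of 20 retained components, same-size grayscale images). It is not the asker's full code:

```python
import glob
import numpy as np
from PIL import Image

# Load database images as flattened grayscale vectors (placeholder path).
filenames = sorted(glob.glob('faces/*.pgm'))
X = np.array([np.asarray(Image.open(fn).convert('L').resize((90, 90)), dtype=float).ravel()
              for fn in filenames])               # shape: (n_images, 8100)

# PCA via SVD on the mean-centered data.
mean_face = X.mean(axis=0)
U, s, Vt = np.linalg.svd(X - mean_face, full_matrices=False)
k = 20                                            # number of eigenfaces kept (arbitrary)
eigenfaces = Vt[:k]                               # shape: (k, 8100)
train_weights = (X - mean_face) @ eigenfaces.T    # coordinates of each image in face space

# Project an input image and pick the training image at minimum Euclidean distance.
probe = np.asarray(Image.open('input.pgm').convert('L').resize((90, 90)), dtype=float).ravel()
w = (probe - mean_face) @ eigenfaces.T
distances = np.linalg.norm(train_weights - w, axis=1)
print('closest match:', filenames[int(np.argmin(distances))])
```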

R - 'princomp' can only be used with more units than variables

Submitted by 爷,独闯天下 on 2019-12-14 00:22:20
Question: I am using R (R Commander) to cluster my data. I have a smaller subset of my data containing 200 rows and about 800 columns. I get the following error when trying to k-means cluster the data and plot it on a graph: "'princomp' can only be used with more units than variables". I then created a test set of 10 rows and 10 columns, which plots fine, but when I add an extra column I get the error again. Why is this? I need to be able to plot my clusters. When I view my data set after performing kmeans on
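The error is mathematical rather than an R quirk: with n = 200 observations and p ≈ 800 variables, the p × p covariance matrix has rank at most n − 1, and princomp, which eigendecomposes that covariance, refuses to run. SVD-based PCA (what R's prcomp uses) has no such restriction. A quick numpy illustration with the question's dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 800))          # 200 rows, 800 columns, as in the question
C = np.cov(X, rowvar=False)              # 800 x 800 covariance matrix
print(np.linalg.matrix_rank(C))          # at most 199: the covariance is rank-deficient

# SVD-based PCA works fine on wide data:
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
print(Vt.shape)                          # (200, 800): at most n principal directions exist
```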

PCA decomposition with Python: feature relevance

Submitted by 大憨熊 on 2019-12-13 16:30:04
Question: I am following this topic: How can I use PCA/SVD in Python for feature selection AND identification? We decompose our data set in Python with PCA, using sklearn.decomposition.PCA, and via the components_ attribute we get all the components. Now we have a very similar goal: we want to take only the first several components (this part is not a problem) and see what proportion of each input feature every PCA component carries (to know which features are most important for us). How
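One common answer pattern: each row of components_ is a unit vector of loadings over the original features, so normalized absolute loadings can serve as per-component feature "proportions". A sketch on synthetic data (the feature names and the absolute-loading convention are my choices, not from the question):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5))
feature_names = ['f0', 'f1', 'f2', 'f3', 'f4']    # hypothetical names

pca = PCA(n_components=2).fit(X)
# components_ has shape (n_components, n_features); each row is a unit vector
# whose entries are the loadings of the original features on that component.
for i, comp in enumerate(pca.components_):
    share = np.abs(comp) / np.abs(comp).sum()     # relative contribution per feature
    order = np.argsort(share)[::-1]
    print(f'PC{i + 1}:', [(feature_names[j], round(float(share[j]), 3)) for j in order])
```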

How to color points in 17 colors based on principal component? [closed]

Submitted by 做~自己de王妃 on 2019-12-13 09:47:33
Question: I am doing PCA in R on a data frame (df_f):

```r
pc_gtex <- prcomp(df_f)
plot(pc_gtex$x[,1], pc_gtex$x[,2], col=gtex_group, main = "PCA", xlab = "PC1", ylab = "PC2")
legend("topleft", col=1:17, legend = paste(unique(gtex_pm$tissue), 1:17), pch = 20, bty='n', cex=1.5)
```

Below is my group table for the PCA
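The likely catch in the truncated excerpt: base R's default palette has only 8 colors, so col = gtex_group with 17 groups recycles colors; in R the usual fix is a 17-color palette, e.g. col = rainbow(17)[as.numeric(factor(group))]. For comparison, a matplotlib version of the same plot on stand-in data (all names and sizes below are placeholders):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.normal(size=(340, 10))                    # stand-in data
groups = rng.integers(0, 17, size=len(X))         # hypothetical 17 tissue groups

pcs = PCA(n_components=2).fit_transform(X)
# 'tab20' has 20 visually distinct colors, enough for 17 groups.
sc = plt.scatter(pcs[:, 0], pcs[:, 1], c=groups, cmap='tab20', s=15)
plt.colorbar(sc, ticks=range(17), label='group')
plt.xlabel('PC1'); plt.ylabel('PC2'); plt.title('PCA')
plt.show()
```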

Colouring a PCA plot by clusters in R

Submitted by ♀尐吖头ヾ on 2019-12-13 09:28:15
Question: I have some biological data that looks like this, with two different types of clusters (A and B):

```
   Cluster_ID                          A1        A2        A3        B1         B2        B3
5  chr5:100947454..100947489,+    3.31322   7.52365   3.67255  21.15730   8.732710  17.42640
12 chr5:101227760..101227782,+    1.48223   3.76182   5.11534  15.71680   4.426170  13.43560
29 chr5:102236093..102236457,+   15.60700  10.38260  12.46040   6.85094  15.551400   7.18341
```

I clean up the data:

```r
CAGE <- read.table("CAGE_expression_matrix.txt", header=T)
CAGE_data <- as.data.frame(CAGE)
#Remove
```
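The general pattern being asked for, cluster labels reused as point colors on a PCA scatter, sketched in Python on stand-in data rather than the asker's CAGE matrix (two clusters, as in the question; everything else is a placeholder):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
X = rng.normal(size=(60, 6))                     # stand-in for the expression matrix

# Cluster first, then color the PCA projection by the cluster labels.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
pcs = PCA(n_components=2).fit_transform(X)
plt.scatter(pcs[:, 0], pcs[:, 1], c=labels, cmap='coolwarm', s=20)
plt.xlabel('PC1'); plt.ylabel('PC2')
plt.show()
```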

Using memmap files for batch processing

Submitted by 喜欢而已 on 2019-12-13 07:24:29
Question: I have a huge dataset on which I wish to run PCA. I am limited by RAM and by the computational efficiency of PCA, so I switched to Incremental PCA. Dataset size: (140000, 3504). The documentation states: "This algorithm has constant memory complexity, on the order of batch_size, enabling use of np.memmap files without loading the entire file into memory." This is really good, but I am unsure how to take advantage of it. I tried loading one memmap, hoping it would be accessed in chunks, but my RAM blew up.
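The documented constant-memory behavior comes from processing the array in batch-sized slices; what blows up RAM is any step that copies the whole memmap at once (for example, scaling or dtype-converting the entire array up front). A sketch of chunk-wise fitting via partial_fit, which only ever touches one slice at a time; the filename, dtype, component count, and batch size are placeholders:

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

n_rows, n_cols, batch = 140000, 3504, 1000
# Open the on-disk array without loading it; shape/dtype must match how it was written.
X = np.memmap('data.dat', dtype='float64', mode='r', shape=(n_rows, n_cols))

ipca = IncrementalPCA(n_components=50)
for start in range(0, n_rows, batch):
    ipca.partial_fit(X[start:start + batch])   # only one chunk resides in RAM at a time

# Transform chunk-wise too; the result (140000 x 50) is small enough to hold in memory.
parts = [ipca.transform(X[start:start + batch]) for start in range(0, n_rows, batch)]
X_reduced = np.vstack(parts)
```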

Why are there differences between GPArotation::Varimax and stats::varimax?

Submitted by 给你一囗甜甜゛ on 2019-12-13 06:06:38
Question: There are (at least) two different ways to varimax-rotate a loadings matrix in R: GPArotation::Varimax and stats::varimax. Oddly, even if Kaiser normalization is enabled for both, they yield subtly different results. That is a bit of a pain for testing.

```r
library(GPArotation)
library(psych)
data("Thurstone")
# find unrotated PCs first
principal.unrotated <- principal(r = Thurstone, nfactors = 4, rotate = "none")
loa <- unclass(principal.unrotated$loadings)
varimax.stats <- stats::varimax(x
```
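For intuition about what both functions compute, here is a plain numpy version of the classic varimax iteration; this is a textbook sketch, not the code of either R implementation. Note that stats::varimax with normalize = TRUE additionally rescales the rows of the loadings to unit length before rotating and undoes this afterwards, and the two packages use different convergence tolerances, which is one place their results can drift apart:

```python
import numpy as np

def varimax(L, gamma=1.0, max_iter=100, tol=1e-8):
    """Rotate a loadings matrix L (p x k) toward the varimax criterion."""
    p, k = L.shape
    R = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        Lam = L @ R
        # Gradient of the varimax criterion, projected back onto rotations via SVD.
        u, s, vt = np.linalg.svd(
            L.T @ (Lam ** 3 - (gamma / p) * Lam @ np.diag((Lam ** 2).sum(axis=0))))
        R = u @ vt
        var_new = s.sum()
        if var_new < var * (1 + tol):   # stop when the criterion plateaus
            break
        var = var_new
    return L @ R, R
```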

Package “fdapace” (R) - create a functional plot of the first principal component

Submitted by 拟墨画扇 on 2019-12-13 03:46:22
Question: My question is about functional principal component analysis in R. I am working with a multi-dimensional time series looking something like this: [figure omitted]. My goal is to reduce the dimensions by applying functional PCA and then plot the first principal component like this: [figure omitted]. I have already used the FPCA function of the fdapace package on the dataset. Unfortunately, I don't understand how to interpret the resulting matrix of FPCA estimates (xiEst). In my understanding, the values of the Principal
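For orientation: in fdapace's output, xiEst holds the estimated FPC scores ξ_ik, where entry (i, k) is the coordinate of curve i on eigenfunction k, so curve i is approximated as μ(t) + Σ_k ξ_ik φ_k(t) (μ and the φ_k appear in the FPCA output as mu and phi). A discretized Python sketch of that reconstruction idea, on synthetic curves over a common grid; fdapace handles sparse and irregular designs and normalizes eigenfunctions in L2, which this toy version does not:

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 50)                         # common time grid
curves = np.array([np.sin(2 * np.pi * t) * rng.normal(1, 0.3)
                   + rng.normal(0, 0.05, t.size) for _ in range(40)])

mu = curves.mean(axis=0)                          # mean function
U, s, Vt = np.linalg.svd(curves - mu, full_matrices=False)
phi = Vt                                          # rows: discretized eigenfunctions phi_k(t)
xi = (curves - mu) @ phi.T                        # scores: the analogue of fdapace's xiEst
# (fdapace's L2 normalization rescales phi and xi by sqrt of the grid spacing.)

# First-component reconstruction of curve 0: mu(t) + xi[0, 0] * phi_1(t)
recon = mu + xi[0, 0] * phi[0]
```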

[Study Notes] L1-PCA

Submitted by 半世苍凉 on 2019-12-13 02:27:59
Intro

The essence of PCA is projection from a high-dimensional space onto a low-dimensional one, and the projection itself amounts to left- (or right-) multiplying by a vector that carries the weights mapping the original feature space onto the projected one; after this linear weighting, the data is represented in the low-dimensional space. If the vector is replaced by a matrix (say one composed of m vectors), the original high-dimensional space is reduced to an m-dimensional one.

The problem L1-PCA solves is the outlier problem: ordinary PCA assumes the noise is Gaussian, and when anomalies violate that assumption its results degrade badly. Ordinary PCA is sensitive to outliers; L1-PCA is not.

PCA recap

It is worth understanding PCA from a mathematical angle. As said above, PCA is essentially a transformation, and a transformation can usually be expressed as a matrix product:

\[ \text{Define a vector } a \in R^{p \times 1} \text{ and a feature matrix } X \in R^{n \times p}. \text{ Reducing } X \text{ to one dimension is then simple: } X' = Xa, \quad X' \in R^{n \times 1} \]

From signal processing we know that the signal typically has large variance while the noise has small variance, so it is natural to require the projected data to have maximal variance. The next step is therefore variance maximization:

\[ \sigma^2_a = (Xa)^T(Xa) = a^T X^T X a = a^T V a \]

The task now becomes solving for the a that maximizes this variance. We introduce the constraint \( a^T a = 1 \) and use the Lagrangian method:

\[ z = a^T V a - \lambda (a^T a - 1) \]

Setting the derivative with respect to a to zero gives \( Va = \lambda a \), so the optimal a is an eigenvector of V, and the achieved variance equals the corresponding eigenvalue \( \lambda \).
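For completeness (my addition, not part of the note): a common L1-PCA formulation, L1-norm maximization (e.g. Kwak, 2008), replaces the squared-error variance above with an L1 criterion, which is what removes the sensitivity to outliers:

\[ a^{*} = \arg\max_{\|a\|_2 = 1} \|Xa\|_1 = \arg\max_{\|a\|_2 = 1} \sum_{i=1}^{n} \left| x_i^T a \right| \]

Because no residual is squared, a single far-away point can no longer dominate the objective.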

Singular Value Decomposition (SVD)

Submitted by 可紊 on 2019-12-13 00:45:46
Before introducing the singular value decomposition (SVD), let us review some basics about matrices.

Matrix basics

Square matrix: Given an $n \times m$ matrix $A$, if n and m are equal, i.e. the matrix has as many rows as columns, then $A$ is a square matrix.

Identity matrix: In linear algebra, the identity matrix of order n is an $n \times n$ square matrix whose main-diagonal entries are 1 and whose remaining entries are 0. The identity matrix is written $\mathbf{I}_n$.

Properties of the identity matrix:

$$ \text{1. } I_n B_{n \times m} = B_{n \times m} $$

$$ \text{2. } B_{n \times m} I_m = B_{n \times m} $$

$$ \text{3. } A_n I_n = I_n A_n = A_n $$

$$ \text{4. } I_n I_n = I_n $$

Transpose: The transpose is the simplest matrix transformation. Briefly, if the transpose of an $n \times m$ matrix $A$ is $A^{\mathrm{T}}$, then $A^{\mathrm{T}}$ is an $m \times n$ matrix with $\mathbf{A}^{\mathrm{T}}_{ij} = \mathbf{A}_{ji}$.
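The excerpt ends here, before reaching the decomposition itself; for reference (my addition, anticipating where the post is headed), the SVD factors any $n \times m$ matrix $A$ as

$$ A = U \Sigma V^{\mathrm{T}} $$

where $U$ ($n \times n$) and $V$ ($m \times m$) are orthogonal and $\Sigma$ ($n \times m$) is zero everywhere except for the non-negative singular values on its main diagonal. This ties back to PCA: the principal directions of a mean-centered data matrix are exactly its right singular vectors, the columns of $V$.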