Different PCA plots


Question


I was trying to learn PCA (using the iris dataset) with Python, and I got some results, so I wanted to reproduce them in R to make sure they were correct. When I checked, R gave me a diagram that is a mirror image of the Python one (flipped along the y axis), and some values have the opposite sign (Python: [140,1] = 0.1826089, R: [141,2] = -0.1826089; Python counts from zero).

The Python code:

import numpy as np
import matplotlib.pyplot as plt
import sklearn.decomposition as p

# Load the four numeric iris columns (the species column is not read)
data = np.loadtxt("sample_data/iris.txt", delimiter=';', usecols=(0, 1, 2, 3))

# Fit PCA and project the data onto the principal components
pca = p.PCA().fit(data)
pcaData = pca.transform(data)

plt.scatter(pcaData[:, 0], pcaData[:, 1])
print(pcaData[140, 1])

My Python diagram

The R code:

# Read the data; the "NULL" colClass drops the fifth (species) column
data = read.csv("C:\\Users\\George\\Desktop\\iris.csv", sep=";",
                colClasses=c(NA, NA, NA, NA, "NULL"))
data = data[-151, ]  # remove row 151

pca = prcomp(data)
plot(pca$x[, 1], pca$x[, 2])
print(pca$x[141, 2])

My R diagram

Searching the internet, I found that the same thing happens elsewhere.

The R diagram on the internet (source)

The Python diagram on the internet (source).

I was expecting them to be the same. Is there something I do not understand well?

Thank you.


Answer 1:


scikit-learn uses a pseudo-randomized method to compute an approximation of the singular value decomposition.

See https://scikit-learn.org/stable/modules/generated/sklearn.utils.extmath.randomized_svd.html

Therefore, unless you can guarantee that the methods are the same and use the same random seed, you will not obtain exactly the same values for the principal components.
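As a minimal sketch of how one might check this in practice (not part of the original answer), the snippet below forces scikit-learn to use either the exact LAPACK SVD (svd_solver='full') or a seeded randomized solver, and then compares two score matrices while allowing a per-component sign flip, since principal-component scores are only defined up to a sign. The file path is taken from the question; scores_from_r is a hypothetical name for R's pca$x exported to a file.

import numpy as np
from sklearn.decomposition import PCA

# Path assumed from the question's code
data = np.loadtxt("sample_data/iris.txt", delimiter=';', usecols=(0, 1, 2, 3))

# Deterministic variants of the fit
scores_full = PCA(svd_solver='full').fit_transform(data)                        # exact SVD, no randomness
scores_rand = PCA(svd_solver='randomized', random_state=0).fit_transform(data)  # seeded randomized SVD

def match_up_to_sign(a, b, tol=1e-8):
    """Return True if each column of `a` equals the corresponding column of `b`
    up to an overall sign flip (the usual PCA sign indeterminacy)."""
    for j in range(a.shape[1]):
        if not (np.allclose(a[:, j], b[:, j], atol=tol) or
                np.allclose(a[:, j], -b[:, j], atol=tol)):
            return False
    return True

print(match_up_to_sign(scores_full, scores_rand))
# `scores_from_r` would be the pca$x matrix exported from R (e.g. via write.csv);
# the name is hypothetical, shown only to illustrate the intended comparison:
# print(match_up_to_sign(scores_full, scores_from_r))

Comparing up to a sign flip per column is what makes the Python and R outputs agree here: the mirrored y axis in the question corresponds to the second component being multiplied by -1.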



Source: https://stackoverflow.com/questions/56253444/diffrent-pca-plots
