pearson

Pearson Correlation without using zero element in Matlab

我是研究僧i submitted on 2019-12-10 11:48:03
Question: I have two example vectors in Matlab:
    A = [5,3,3,0,4,1,5,0,2,5,5,0,5,3,4,0,1,4,4,0,4,2];
    B = [1,0,0,0,1,0,4,0,0,0,0,4,4,0,1,0,0,0,0,0,0,0];
When I calculate the Pearson correlation by hand and in Excel, I get the same result (0.667):
    1      0.667
    0.667  1
But when I try it in Matlab with this simple code:
    pearson = corr(A',B');
it returns a different value (0.2139):
    1       0.2139
    0.2139  1
Maybe this happens because the zero scores (0) are included in the calculation. In happen
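Both figures in the question can be reproduced with a short sketch. It is written in Python rather than Matlab, the helper names `pearson` and `pearson_nonzero` are hypothetical, and it assumes the manual/Excel calculation dropped every position where either vector is 0.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # No guard for zero variance here; this is only a sketch.
    return sxy / math.sqrt(sxx * syy)

def pearson_nonzero(x, y):
    """Pearson correlation over positions where both values are non-zero
    (the exclusion rule is an assumption, not stated in the question)."""
    kept = [(a, b) for a, b in zip(x, y) if a != 0 and b != 0]
    xs = [a for a, _ in kept]
    ys = [b for _, b in kept]
    return pearson(xs, ys)

A = [5, 3, 3, 0, 4, 1, 5, 0, 2, 5, 5, 0, 5, 3, 4, 0, 1, 4, 4, 0, 4, 2]
B = [1, 0, 0, 0, 1, 0, 4, 0, 0, 0, 0, 4, 4, 0, 1, 0, 0, 0, 0, 0, 0, 0]

r_all = pearson(A, B)              # all 22 pairs, zeros included
r_nonzero = pearson_nonzero(A, B)  # only the 5 pairs without a zero
```

With these vectors, `r_all` comes out near 0.2139 and `r_nonzero` near 0.667, matching the two numbers in the question. In Matlab the same filtering could probably be done with a logical mask such as `mask = A~=0 & B~=0;` before calling `corr`.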

ValueError: shape mismatch: objects cannot be broadcast to a single shape

无人久伴 submitted on 2019-12-09 14:03:41
Question: I am using SciPy's pearsonr(x,y) function and I cannot figure out why the following error occurs:
    ValueError: shape mismatch: objects cannot be broadcast to a single shape
It computes the first two results (I am running several thousand of these tests in a loop) and then dies. Does anyone have any idea what the problem might be? The error occurs on this line of pearsonr:
    r_num = n*(np.add.reduce(xm*ym))
Any help would be much appreciated.
Answer 1: This particular
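The answer is cut off in this excerpt. One common way to hit this error, assumed here rather than taken from the answer, is that x and y have different lengths in one iteration of the loop, so the element-wise product `xm*ym` cannot be formed. A minimal sketch of a pre-check (the helper name `find_length_mismatches` and the sample data are hypothetical):

```python
def find_length_mismatches(pairs):
    """Return (index, len_x, len_y) for every pair whose lengths differ."""
    return [
        (i, len(x), len(y))
        for i, (x, y) in enumerate(pairs)
        if len(x) != len(y)
    ]

# Hypothetical data: the third pair is one element short.
pairs = [
    ([1, 2, 3], [2, 4, 6]),
    ([1, 2, 3], [3, 2, 1]),
    ([1, 2, 3], [5, 6]),
]
bad = find_length_mismatches(pairs)
```

Running a check like this before the loop shows which iteration would fail, which fits the symptom of the first two tests succeeding.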

k means clustering algorithm

白昼怎懂夜的黑 submitted on 2019-12-09 13:51:13
Question: I want to perform a k-means clustering analysis on a set of 10 data points, each of which has an array of 4 numeric values associated with it. I'm using the Pearson correlation coefficient as the distance metric. I did the first two steps of the k-means clustering algorithm, which were:
1) Select a set of initial centres of k clusters. [I selected two initial centres at random]
2) Assign each object to the cluster with the closest centre. [I used the Pearson correlation coefficient as the
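Step 2 above can be sketched as follows. The question does not say how the correlation is turned into a distance, so this assumes the common choice d = 1 - r; the function names and the sample points are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)

def assign_to_centres(points, centres):
    """Assign each point to the centre with the smallest 1 - r distance."""
    labels = []
    for p in points:
        distances = [1.0 - pearson(p, c) for c in centres]
        labels.append(distances.index(min(distances)))
    return labels

# Hypothetical points with 4 values each, and two initial centres.
points = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1], [9, 5, 4, 1]]
centres = [[1, 2, 3, 4], [4, 3, 2, 1]]
labels = assign_to_centres(points, centres)
```

Note that with a correlation-based distance, points are grouped by the shape of their profile rather than by magnitude, so `[2, 4, 6, 9]` lands with `[1, 2, 3, 4]` even though its values are larger.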

How to loop subset of lists in R?

你离开我真会死。 submitted on 2019-12-08 05:48:31
Question: I have a list of 9 lists (see the code below) and I want to loop over only three of them, p, r and t, for the Pearson, Spearman and Kendall correlations respectively, instead of all 9. The current pseudocode is below, where the test function is corrplot(M.cor, ...):
    for (i in p.mat.all) { ... }
Code with mtcars test data:
    library("psych")
    library("corrplot")
    M <- mtcars
    M.cor <- cor(M)
    p.mat.all <- psych::corr.test(M.cor, method = c("pearson", "kendall

What is wrong with the pearson algorithm from “Programming Collective Intelligence”?

对着背影说爱祢 submitted on 2019-12-05 11:08:58
This function is from the book “Programming Collective Intelligence” and is supposed to calculate the Pearson correlation coefficient for p1 and p2, which should be a number between -1 and 1. If two critics rate items very similarly, the function should return 1, or close to 1. With real user data I sometimes get weird results. In the following example the dataset critics2 should return 1; instead it returns 0. Does anyone spot a mistake? (This is not a duplicate of "What is wrong with this python function from “Programming Collective Intelligence”?")
    from __future__ import division

Pearson's Coefficient and Covariance calculation in Matlab

风格不统一 submitted on 2019-12-05 05:10:49
I want to calculate Pearson's correlation coefficient in Matlab (without using Matlab's corr function). I have two vectors A and B (each 1x100) and I am trying to calculate the coefficient like this:
    P = cov(x, y) / (std(x, 1) * std(y, 1))
I am using Matlab's cov and std functions. What I don't get is that cov returns a square matrix like this:
    corrAB =
        0.8000    0.2000
        0.2000    4.8000
But I expect a single number as the covariance, so that I can come up with a single P (Pearson's coefficient). What am I missing?
I think you're just confused with
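A 2x2 covariance matrix of two vectors holds the two variances on its diagonal and the covariance in the off-diagonal entries, so the single number the asker expects is one of the off-diagonal elements. The Python sketch below illustrates this with hypothetical helper names (`cov_matrix`, `pearson_from_cov`); it is not the truncated answer's code.

```python
import math

def cov_matrix(x, y):
    """2x2 sample covariance matrix (N-1 normalisation) of two sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    syy = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]

def pearson_from_cov(c):
    """Pearson r from a 2x2 covariance matrix: cov / sqrt(var_x * var_y)."""
    return c[0][1] / math.sqrt(c[0][0] * c[1][1])

# The matrix printed in the question.
corrAB = [[0.8, 0.2], [0.2, 4.8]]
P = pearson_from_cov(corrAB)  # 0.2 / sqrt(0.8 * 4.8)
```

Taking the variances from the same matrix also keeps the normalisation consistent. Matlab's `cov` divides by N-1 by default while `std(x, 1)` divides by N, so mixing the two as in the question's formula would scale the result by N/(N-1).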

Extracting and formatting results of cor.test on multiple pairs of columns

Anonymous (unverified) submitted on 2019-12-03 09:14:57
Question: I am trying to generate a table output of a correlation matrix. Specifically, I am using a for loop to compute the correlation between each of columns 4:40 and column 1. While the resulting table is decent, it does not identify what is being compared to what. Checking the attributes of cor.test, I find that data.name is given as x[1] and y[1], which is not good enough to trace back which columns are being compared. Here is my code:
    input <- read.delim(file = "InputData.txt", header = TRUE)
    x <-
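The labelling problem is independent of R: carrying the column names alongside each result keeps every row traceable. A minimal Python sketch of that idea follows, with a hypothetical helper `correlate_with_reference` and made-up column names and data. It computes only the coefficient, not the p-value that cor.test also reports.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlate_with_reference(columns, reference):
    """Return (reference, other, r) for every column other than reference."""
    ref_values = columns[reference]
    return [
        (reference, name, pearson(ref_values, values))
        for name, values in columns.items()
        if name != reference
    ]

# Hypothetical table: column name -> values.
columns = {
    "col1": [1, 2, 3, 4],
    "col4": [2, 4, 6, 8],
    "col5": [4, 3, 2, 1],
}
rows = correlate_with_reference(columns, "col1")
```

Each row then names both columns explicitly, so the output table no longer depends on labels like x[1] and y[1].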

Constructing correlated variables

浪子不回头ぞ submitted on 2019-12-01 11:00:21
I have a variable with a given distribution (normal in my example below).
    set.seed(32)
    var1 = rnorm(100,mean=0,sd=1)
I want to create a variable (var2) that is correlated with var1, with a linear correlation coefficient (roughly or exactly) equal to "Corr". The slope of the regression between var1 and var2 should (roughly or exactly) equal 1.
    Corr = 0.3
How can I achieve this? I wanted to do something like this:
    decorelation = rnorm(100,mean=0,sd=1-Corr)
    var2 = var1 + decorelation
But of course, when running
    cor(var1,var2)
the result is not close to Corr!
I did something similar a while ago. I am
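One way to meet both conditions exactly is sketched below in Python; it is not the truncated answer's method. If var2 = var1 + e with e uncorrelated with var1, the regression slope of var2 on var1 is 1 and the correlation is sd(var1) / sqrt(var(var1) + var(e)), so var(e) must equal var(var1) * (1/Corr^2 - 1). For Corr = 0.3 and sd(var1) = 1 that means a noise sd of about 3.18, not 1 - Corr. The function name `correlated_with_unit_slope` is hypothetical.

```python
import math
import random

def correlated_with_unit_slope(var1, rho, rng):
    """Build var2 = var1 + e with sample correlation exactly rho and a
    least-squares slope of var2 on var1 of exactly 1 (requires 0 < rho <= 1)."""
    if not 0 < rho <= 1:
        raise ValueError("rho must be in (0, 1]")
    n = len(var1)
    mean1 = sum(var1) / n
    centred1 = [a - mean1 for a in var1]
    noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean_noise = sum(noise) / n
    centred_noise = [e - mean_noise for e in noise]
    s11 = sum(a * a for a in centred1)
    # Remove the part of the noise that is correlated with var1.
    beta = sum(a * e for a, e in zip(centred1, centred_noise)) / s11
    resid = [e - beta * a for a, e in zip(centred1, centred_noise)]
    srr = sum(r * r for r in resid)
    # Rescale so that var(e) = var(var1) * (1 / rho^2 - 1).
    scale = math.sqrt(s11 * (1.0 / rho ** 2 - 1.0) / srr)
    return [a + scale * r for a, r in zip(var1, resid)]

rng = random.Random(32)  # not the same random stream as R's set.seed(32)
var1 = [rng.gauss(0.0, 1.0) for _ in range(100)]
var2 = correlated_with_unit_slope(var1, 0.3, rng)
```

Skipping the residual step and simply adding independent noise with the sd given above would satisfy both conditions only approximately, which may be enough for the "roughly" case in the question.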

What is wrong with this python function from “Programming Collective Intelligence”?

只谈情不闲聊 submitted on 2019-11-30 21:52:00
This is the function in question. It calculates the Pearson correlation coefficient for p1 and p2, which is supposed to be a number between -1 and 1. When I use this with real user data, it sometimes returns a number greater than 1, like in this example:
    def sim_pearson(prefs,p1,p2):
        si={}
        for item in prefs[p1]:
            if item in prefs[p2]: si[item]=1
        if len(si)==0: return 0
        n=len(si)
        sum1=sum([prefs[p1][it] for it in si])
        sum2=sum([prefs[p2][it] for it in si])
        sum1Sq=sum([pow(prefs[p1][it],2) for it in si])
        sum2Sq=sum([pow(prefs[p2][it],2) for it in si])
        pSum=sum([prefs[p1][it]*prefs[p2][it] for it
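The excerpt cuts off before the division, so the cause cannot be confirmed from it. Two plausible culprits, both assumptions here, are integer division under Python 2 when the ratings are integers, and rounding error in the sums-of-squares form. The sketch below is a Python 3 variant, not the book's function: it uses the mean-centred form with float division and clamps the result to [-1, 1]. The name `sim_pearson_centred`, the sample `prefs`, and the choice to return 0.0 for zero variance are all hypothetical.

```python
import math

def sim_pearson_centred(prefs, p1, p2):
    """Pearson correlation over the items rated by both p1 and p2."""
    shared = [item for item in prefs[p1] if item in prefs[p2]]
    n = len(shared)
    if n == 0:
        return 0.0  # mirrors the original's "return 0" for no overlap
    x = [float(prefs[p1][it]) for it in shared]
    y = [float(prefs[p2][it]) for it in shared]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: treat constant ratings as "no correlation"
    r = sxy / math.sqrt(sxx * syy)
    # Clamp to absorb any remaining floating-point overshoot.
    return max(-1.0, min(1.0, r))

# Hypothetical ratings: b rates the shared items exactly twice as high as a.
prefs = {
    "a": {"x": 1, "y": 2, "z": 3},
    "b": {"x": 2, "y": 4, "z": 6, "w": 1},
    "c": {"q": 5},
}
r_ab = sim_pearson_centred(prefs, "a", "b")
r_ac = sim_pearson_centred(prefs, "a", "c")
```

Here `r_ab` is 1.0 because the shared ratings are perfectly linearly related, and `r_ac` is 0.0 because the two critics share no items.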

What is wrong with this python function from “Programming Collective Intelligence”?

五迷三道 submitted on 2019-11-30 17:19:33
Question: This is the function in question. It calculates the Pearson correlation coefficient for p1 and p2, which is supposed to be a number between -1 and 1. When I use this with real user data, it sometimes returns a number greater than 1, like in this example:
    def sim_pearson(prefs,p1,p2):
        si={}
        for item in prefs[p1]:
            if item in prefs[p2]: si[item]=1
        if len(si)==0: return 0
        n=len(si)
        sum1=sum([prefs[p1][it] for it in si])
        sum2=sum([prefs[p2][it] for it in si])
        sum1Sq=sum([pow(prefs[p1][it],2) for