correlation

How to compute P-value and standard error from correlation analysis of R's cor()

不羁的心 submitted on 2021-02-06 15:27:14
Question: I have data containing 54 samples for each condition (x and y). I computed the correlation as follows:

    > dat <- read.table("http://dpaste.com/1064360/plain/",header=TRUE)
    > cor(dat$x,dat$y)
    [1] 0.2870823

Is there a native way to get the standard error of the correlation from R's cor() function above, along with the p-value from a t-test, as explained on this web page (page 14.6)?

Answer 1: I think that what you're looking for is simply the cor.test() function, which will return everything you're looking for except for
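A minimal Python sketch of the quantities the question asks about, assuming the usual t-test for a Pearson coefficient: SE = sqrt((1 - r^2) / (n - 2)) and t = r / SE with n - 2 degrees of freedom. This is not the R cor.test() call from the answer; the function name correlation_se_and_p is hypothetical, and only r and n are taken from the question.

```python
import math

from scipy import stats


def correlation_se_and_p(r, n):
    """Return (standard error, t statistic, two-sided p-value) for Pearson's r."""
    df = n - 2                              # degrees of freedom of the t-test
    se = math.sqrt((1.0 - r * r) / df)      # standard error used by the t-test
    t = r / se                              # t statistic for H0: no correlation
    p = 2.0 * stats.t.sf(abs(t), df)        # two-sided p-value
    return se, t, p


# r and n as reported in the question above
se, t, p = correlation_se_and_p(0.2870823, 54)
print(f"SE = {se:.4f}, t = {t:.3f}, p = {p:.4f}")
```

With the raw x and y values at hand, scipy.stats.pearsonr(x, y) should give the coefficient and an equivalent p-value directly.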

Correlation between two dataframes by row

瘦欲@ submitted on 2021-02-04 10:22:13
Question: I have two data frames with 5 columns and 100 rows each.

    id price1 price2 price3 price4 price5
    1  11.22  25.33  66.47  53.76  77.42
    2  33.56  33.77  44.77  34.55  57.42
    ...

I would like to get the correlation of the corresponding rows, basically

    for(i in 1:100){ cor(df1[i, 1:5], df2[i, 1:5]) }

but without using a for-loop. I'm assuming there's some way to use plyr to do it, but I can't seem to get it right. Any suggestions?

Answer 1: Depending on whether you want a cool or fast solution you can use either diag(cor
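The same row-by-row idea can be sketched in Python with NumPy instead of R; this is a swapped-in illustration, not the truncated answer above. The helper rowwise_corr and the random stand-ins for df1 and df2 are assumptions made for the example.

```python
import numpy as np


def rowwise_corr(a, b):
    """Pearson correlation between matching rows of two equally shaped 2-D arrays."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_c = a - a.mean(axis=1, keepdims=True)   # center each row of a
    b_c = b - b.mean(axis=1, keepdims=True)   # center each row of b
    num = (a_c * b_c).sum(axis=1)             # row-wise cross products
    den = np.sqrt((a_c ** 2).sum(axis=1) * (b_c ** 2).sum(axis=1))
    return num / den


# Invented stand-ins for df1 and df2: 100 rows, 5 price columns each
rng = np.random.default_rng(0)
prices1 = rng.uniform(10, 80, size=(100, 5))
prices2 = rng.uniform(10, 80, size=(100, 5))
r = rowwise_corr(prices1, prices2)
print(r.shape)  # (100,)
```

Everything is computed in one vectorized pass, so no explicit loop over the 100 rows is needed. For pandas data frames with matching column labels, df1.corrwith(df2, axis=1) is an alternative worth checking.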

Extract p values and r values for all pairwise variables

孤街浪徒 submitted on 2021-01-29 17:22:12
Question: I have multiple variables for multiple countries over multiple years. I would like to generate a data frame containing both an R^2 value and a p-value for each pair of variables. I'm somewhat close: I have a minimal working example and an idea of what the end product should look like, but I am having some difficulty actually implementing it. If anyone could help, that would be most appreciated. Please note, I would like to do this more manually than using packages like Hmisc as that has created
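One way to picture the requested output is a loop over every pair of variables that records r^2 and the p-value. The sketch below uses Python with scipy.stats.pearsonr rather than R; the function pairwise_r2_and_p, the variable names, and the data are all invented for illustration.

```python
from itertools import combinations

import numpy as np
from scipy import stats


def pairwise_r2_and_p(columns):
    """columns: dict mapping a variable name to a 1-D sequence of values."""
    rows = []
    for name_a, name_b in combinations(columns, 2):   # each unordered pair once
        r, p = stats.pearsonr(columns[name_a], columns[name_b])
        rows.append({"var1": name_a, "var2": name_b, "r2": r ** 2, "p": p})
    return rows


# Invented example data; the variable names are placeholders
rng = np.random.default_rng(1)
gdp = rng.normal(size=30)
data = {
    "gdp": gdp,
    "emissions": 2.0 * gdp + rng.normal(scale=0.5, size=30),
    "rainfall": rng.normal(size=30),
}
for row in pairwise_r2_and_p(data):
    print(row)
```

The list of dictionaries can be passed to a data frame constructor to get one row per pair.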

Is it possible to do running correlation with one fixed series in Python?

人走茶凉 submitted on 2021-01-29 00:44:28
Question: I'm wondering if there is a fast way to do a running correlation in Python with one fixed series. I've tried pandas, for example: df1.rolling(4).corr(df2). However, it requires the two DataFrames to have the same length. Is there a way to do something similar to the pandas example above, but with one DataFrame being fixed? To clarify, I want to calculate the correlation coefficient between df2 below and the values in df1. Example: First correlation between df2 and df1.loc[0:3] Second
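A possible NumPy sketch of a running correlation against one fixed series, assuming the window length equals the length of the fixed series (4 in the question). The helper rolling_corr_fixed and the sample values are hypothetical; sliding_window_view requires NumPy 1.20 or later.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_corr_fixed(series, fixed):
    """Correlate every window of `series` (same length as `fixed`) with `fixed`."""
    series = np.asarray(series, dtype=float)
    fixed = np.asarray(fixed, dtype=float)
    windows = sliding_window_view(series, fixed.size)     # shape (n - w + 1, w)
    w_c = windows - windows.mean(axis=1, keepdims=True)   # center each window
    f_c = fixed - fixed.mean()                            # center the fixed series
    num = w_c @ f_c
    den = np.sqrt((w_c ** 2).sum(axis=1) * (f_c ** 2).sum())
    return num / den


# Invented data: a long series (df1) and a fixed 4-value series (df2)
long_series = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 8.0, 7.0])
fixed_series = np.array([1.0, 2.0, 3.0, 5.0])
print(rolling_corr_fixed(long_series, fixed_series))
```

The first output value corresponds to rows 0 to 3 of the long series, the second to rows 1 to 4, and so on. A window with constant values has zero variance and yields NaN.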

Plotting binned correlation of two variables using common axis

Deadly submitted on 2021-01-28 11:40:21
Question: I have three lists that I have loaded into a pandas DataFrame.

    import pandas as pd
    df = pd.DataFrame({'x': location})
    df = df.assign(y1 = variable1)
    df = df.assign(y2 = variable2)

I would like to plot the correlation of y1 with y2, with x as the common x-axis. That is, I would like to bin the y1 and y2 values according to x location, find the correlation of y1 with y2 within each bin, and then plot a line of the correlations across the whole x domain. So my final plot will have
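The binning step described in the question might look like the following pandas sketch, assuming equal-width bins over x. The function binned_corr, the bin count, and the random data are assumptions; the plotting call is left as a comment because it needs matplotlib.

```python
import numpy as np
import pandas as pd


def binned_corr(df, n_bins=5):
    """Correlation of y1 with y2 inside equal-width bins of x."""
    bins = pd.cut(df["x"], bins=n_bins)           # assign each row to an x interval
    rows = []
    for interval, group in df.groupby(bins, observed=True):
        rows.append({"x_mid": interval.mid, "corr": group["y1"].corr(group["y2"])})
    return pd.DataFrame(rows)


# Invented stand-in for the three lists in the question
rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=500)
y1 = rng.normal(size=500)
y2 = y1 + rng.normal(size=500)
df = pd.DataFrame({"x": x}).assign(y1=y1, y2=y2)

result = binned_corr(df)
print(result)
# result.plot(x="x_mid", y="corr") would draw the line across the x domain
```

Each row of the result holds a bin midpoint and the correlation within that bin, which is the shape a line plot expects.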

Calculate pearson correlation in python

社会主义新天地 submitted on 2021-01-28 09:22:34
Question: I have 4 columns, "Country, year, GDP, CO2 emissions". I want to measure the Pearson correlation between GDP and CO2 emissions for each country. The country column has all the countries in the world, and the year column has the values "1990, 1991, ...., 2018".

Answer 1: You should use groupby with corr() as your aggregation function:

    country = ['India','India','India','India','India','China','China','China','China','China']
    Year = [2018,2017,2016,2015,2014,2018,2017,2016,2015,2014]
    GDP = [100,98,94
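Since the answer's code is cut off, here is a separate, self-contained pandas sketch of the groupby-plus-corr idea. The GDP and CO2 figures are invented for illustration and are not the values from the answer; the column name CO2 is also an assumption.

```python
import pandas as pd

# Invented figures for illustration only; not the values from the answer above
df = pd.DataFrame({
    "Country": ["India"] * 5 + ["China"] * 5,
    "Year": [2018, 2017, 2016, 2015, 2014] * 2,
    "GDP": [100, 98, 95, 90, 88, 200, 190, 185, 170, 160],
    "CO2": [50, 49, 47, 45, 44, 120, 118, 112, 100, 99],
})

# One Pearson coefficient per country: group, then correlate the two columns
per_country = (
    df.groupby("Country")[["GDP", "CO2"]]
      .apply(lambda g: g["GDP"].corr(g["CO2"]))
)
print(per_country)
```

Series.corr uses the Pearson method by default, so no extra argument is needed for this case.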

Ready implementation of multivariate Spearman rank correlation

十年热恋 submitted on 2021-01-28 00:31:19
Question: I'm looking for a way to calculate a multivariate version of the Spearman rank correlation $\rho$. Is there a ready-to-use Python implementation?

Answer 1: There is one in scipy.

Answer 2: If, now or in the future, you want access to some advanced statistical packages, also consider calling R libraries from Python when needed via RPy2. You can then compute Spearman using a package such as this.

Source: https://stackoverflow.com/questions/2264609/ready-implementation-of-multivariate
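As a hedged illustration of the scipy route from Answer 1: scipy.stats.spearmanr accepts a 2-D array with one column per variable and returns a matrix of pairwise rank correlations. Whether a pairwise matrix is the "multivariate" measure the asker had in mind is an assumption; the data below is invented.

```python
import numpy as np
from scipy import stats

# Invented data: 50 observations of 3 variables (one column per variable)
rng = np.random.default_rng(3)
a = rng.normal(size=50)
data = np.column_stack([a, np.exp(a), rng.normal(size=50)])

# With a 2-D input of more than two columns, spearmanr returns matrices
rho, pvalue = stats.spearmanr(data)
print(np.round(rho, 3))
```

The second column is a monotonic transform of the first, so their rank correlation is 1 even though the relationship is not linear.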

Calculate correlation coefficient between words?

孤街浪徒 submitted on 2021-01-27 06:32:09
Question: For a text analysis program, I would like to analyze the co-occurrence of certain words in a text. For example, I would like to see that the words "Barack" and "Obama" appear together more often (i.e. have a positive correlation) than others. This does not seem to be that difficult. However, to be honest, I only know how to calculate the correlation between two numbers, not between two words in a text. How can I best approach this problem? How can I calculate the correlation between
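One possible approach, offered as an assumption since no answer is shown here: turn each word into a 0/1 indicator per sentence and take the Pearson correlation of the two indicator vectors (the phi coefficient). The function cooccurrence_corr, the sentence splitting rule, and the sample text are all invented for this sketch.

```python
import math
import re


def cooccurrence_corr(text, word_a, word_b):
    """Phi coefficient of two words' presence across sentences."""
    sentences = [s for s in re.split(r"[.!?]+", text.lower()) if s.strip()]
    # 1 if the word occurs in the sentence, else 0
    a = [int(word_a.lower() in re.findall(r"\w+", s)) for s in sentences]
    b = [int(word_b.lower() in re.findall(r"\w+", s)) for s in sentences]
    n = len(sentences)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")   # word present in every sentence or in none
    return cov / math.sqrt(var_a * var_b)


sample = (
    "Barack Obama gave a speech. The weather was cold. "
    "Reporters asked Barack Obama about the economy. The economy grew."
)
print(cooccurrence_corr(sample, "Barack", "Obama"))    # 1.0
print(cooccurrence_corr(sample, "Obama", "weather"))   # negative
```

Sentences are only one choice of unit; fixed-size word windows or paragraphs would work the same way with a different splitting step.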