correlation | 易学教程

How to define 'marginal distribution' in GenOrd package?

Question: I'm trying to generate correlated ordinal variables with the GenOrd package, but I've run into some problems. I'm following this page, written by Wesley: https://www.r-bloggers.com/simulating-random-multivariate-correlated-data-categorical-variables/ On this page, and in most other explanations of the GenOrd package, the marginal distribution must be set. On this page, the marginal distribution is #The values are cumulative so for the first variable the first marginal will be .1,
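To make the word "cumulative" in the quoted comment concrete, here is a minimal Python sketch (it does not call GenOrd or R). It assumes each variable's marginal is a vector of cumulative probabilities with the final value of 1 left implicit. The function name and the example values after the leading .1 are hypothetical and are not taken from the blog post.

```python
def cumulative_to_category_probs(cumulative):
    """Convert cumulative probabilities (final 1.0 omitted) into per-category probabilities."""
    if not cumulative:
        raise ValueError("need at least one cumulative probability")
    if cumulative[0] <= 0 or cumulative[-1] >= 1:
        raise ValueError("cumulative probabilities must lie strictly between 0 and 1")
    if any(later <= earlier for earlier, later in zip(cumulative, cumulative[1:])):
        raise ValueError("cumulative probabilities must be strictly increasing")
    # Pad with 0 and 1, then take successive differences.
    bounds = [0.0] + list(cumulative) + [1.0]
    return [round(upper - lower, 10) for lower, upper in zip(bounds, bounds[1:])]


# Hypothetical marginal for one four-category variable: P(X<=1), P(X<=2), P(X<=3).
marginal = [0.1, 0.3, 0.6]
probs = cumulative_to_category_probs(marginal)
print(probs)
```

Under these assumptions, three cumulative values describe a variable with four categories, because the last category takes whatever probability remains.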

R: How to write a for loop that reads every two lines in a matrix?

Question: I want to calculate correlation statistics using cor.test(). I have a data matrix where the two members of each pair to be tested are on consecutive rows (I have more than a thousand pairs, so I also need to correct for that later). I was thinking that I could loop through the matrix two rows at a time and perform the test (i.e., first test the correlation between row1 and row2, then row3 and row4, row5 and row6, etc.), but I don't know how to write this kind of loop. This is how I do the test on a single
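The loop pattern described above (rows 1 and 2, then 3 and 4, and so on) can be sketched in Python by stepping the row index by two. This is only an illustration of the iteration: the matrix values and the `pearson_r` helper are hypothetical, and the helper returns only the coefficient, not the p-value that cor.test() reports in R.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


# Hypothetical data: each pair to be tested sits on two consecutive rows.
matrix = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [1.0, 2.0, 3.0, 4.0],
    [8.0, 6.0, 4.0, 2.0],
]

results = []
# Step by 2 so that i is the first row of each pair and i + 1 the second.
for i in range(0, len(matrix) - 1, 2):
    results.append((i, i + 1, pearson_r(matrix[i], matrix[i + 1])))

for first, second, r in results:
    print(f"rows {first} and {second}: r = {r:.3f}")
```

Stopping the range at `len(matrix) - 1` means a trailing unpaired row is skipped rather than raising an index error.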

Python: Rank order correlation for categorical data

Question: I am somewhat new to programming and statistics, so please help me improve this question if it is not formally correct. I have a lot of parameters and a couple of result vectors that I produced in a Monte Carlo simulation. Now I want to test the influence of each parameter on the result. I already have a script working with Kendall's tau. Now I would like to compare it with the Spearman and Pearson correlations. An example: from scipy.stats import spearmanr, kendalltau, pearsonr result = [106, 86, 100, 101, 99,

correlation for two lists of data

Question: These two lists contain data something like this: a = [1 2 1 3 1 2 1 1 1 2 1 1 2 1 4 1 ] b = [ 3480. 7080. 10440. 13200. 16800. 20400. 23880. 27480. 30840. 38040. 41520. 44880. 48480. 52080. 55680. 59280.] How can I find the correlation in Python by importing rpy2, that is, with the cor function? The output has to lie between -1 and +1. Answer 1: from rpy2.robjects.vectors import FloatVector from rpy2.robjects.packages import importr stats = importr('stats') a=[1, 2, 1, 3, 1, 2, 1, 1, 1, 2, 1, 1, 2, 1, 4, 1 ]

Split a numeric dataframe into all possible combinations of 2 columns in R

Question: I am trying to split the columns of a data frame to find the PMCC for every possible combination of two columns in a data frame containing n columns, e.g., in this case, with 3 columns:

Length Diameter Height
0.455  0.365    0.095
0.350  0.265    0.090
0.530  0.420    0.135
0.440  0.365    0.125
0.330  0.255    0.22

Here I have to find the PMCC for all combinations, e.g., (length, diameter), (diameter, height), etc. Any help is appreciated. Thanks. Answer 1: data.frame(z = rnorm(100, 2), y = rnorm(100, 4), x = rnorm(100, 6)) -> frame combn

Is it possible that numpy.correlate does not follow the given formula?

Question: The documentation of the numpy.correlate command says that the cross-correlation of two arrays is computed, following the general signal-processing definition, as: z[k] = sum_n a[n] * conj(v[n+k]) This does not seem to be the case. It looks like the correlation is flipped. This would mean that either the sign in the last term of the formula is switched, z[k] = sum_n a[n] * conj(v[n-k]), or the two input vectors are in the wrong order. A simple implementation of the given formula would
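The two index conventions in the question, v[n+k] and v[n-k], can be compared directly with a small pure-Python sketch. It does not call NumPy and makes no claim about which convention numpy.correlate uses. The function name, the restriction to equal-length inputs, and the choice of lags from -(N-1) to N-1 are assumptions made for illustration.

```python
def cross_correlate(a, v, sign=+1):
    """Return z[k] = sum_n a[n] * conj(v[n + sign*k]) for k = -(N-1) .. N-1.

    Terms whose index into v falls outside the sequence are treated as zero.
    """
    length = len(a)
    if len(v) != length:
        raise ValueError("this sketch assumes equal-length inputs")
    result = []
    for k in range(-(length - 1), length):
        total = 0
        for n in range(length):
            m = n + sign * k
            if 0 <= m < length:
                total += a[n] * v[m].conjugate()
        result.append(total)
    return result


a = [1, 2, 3]
v = [0, 1, 0.5]
plus = cross_correlate(a, v, sign=+1)   # uses v[n + k]
minus = cross_correlate(a, v, sign=-1)  # uses v[n - k]
print(plus)
print(minus)
```

With these definitions, switching the sign maps lag k to lag -k, so the two outputs are mirror images of each other. A result that looks "flipped" is therefore what a change of sign convention alone would produce.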

How to ignore cor.test:“not enough finite observations” and continue, when using tidyverse and ggplot2 (ggpmisc)

Question: I have the following working toy example: trunctiris <- iris [1:102,] analysis <- trunctiris %>% group_by(Species) %>% nest() %>% mutate(model = map(data, ~lm(Sepal.Length ~ Sepal.Width, data = .)), cor = map(data, ~tidy(cor.test(.x$Sepal.Length, .x$Sepal.Width), 3))) stats <- analysis %>% unnest(cor) ggplot(trunctiris, aes(x = Sepal.Length, y = Sepal.Width)) + geom_point(shape = 21) + geom_text(data = stats, aes(label = sprintf("r = %s", round(estimate, 3)), x = 7, y = 4)) + geom_text(data =

How to find the correlation between continuous and categorical variables in R

Question: Sorry, I edited my question. In R, you can use the cor() function to find the correlation between continuous variables, using only the Pearson and Spearman methods. Which function should I use to get the correlation between two categorical variables? And which function should I use to get the correlation between a categorical variable and a continuous variable? Thank you in advance. Source: https://stackoverflow.com/questions/41053431/how-to-find-the-correlation-between-continuous-and

Pairwise correlation-R code

Question: In R, I have a matrix of time series (a zoo object) in which each column shows the historical prices of a different equity (within my universe); I have 20 columns (20 different stocks). I transform this first matrix into a matrix of weekly returns. My goal is to find quick R code that helps me calculate the mean of the pairwise "rolling 52-week weekly return correlation". This implies that every week, the code will compute, one by one, the 1-year correlation of stock (1)'s weekly return with

comparing/extracting data from matrices using python (2.6.1)

Question: I have two .csv files containing correlation matrices exported from R. One file contains the P-values and the other contains the r-values. The row and column headers match exactly between the two files. I am trying to extract the r-values and the corresponding row and column headers only for pairs where the P-value < 0.05. Here is a sample of what the data in the r-value input file looks like (I have 1700+ correlated items, rather than only the two shown):

         Species1 Species2
Species1 1        0.9
Species2 0.9      1
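One way to approach the extraction is to read both matrices into nested dictionaries keyed by the shared headers and keep only the pairs whose P-value is below the threshold. The sketch below is written for current Python 3 rather than the 2.6.1 named in the title. It assumes comma-separated files whose header row starts with an empty cell. The P-values shown are hypothetical (the question gives only the r-values), and the in-memory strings stand in for the real files.

```python
import csv
import io

# Stand-ins for the two files; the P-values are made up for illustration.
R_CSV = """,Species1,Species2
Species1,1,0.9
Species2,0.9,1
"""
P_CSV = """,Species1,Species2
Species1,0,0.01
Species2,0.01,0
"""


def read_matrix(text):
    """Parse a labelled square matrix into {row_name: {column_name: value}}."""
    rows = list(csv.reader(io.StringIO(text)))
    column_names = rows[0][1:]
    return {row[0]: dict(zip(column_names, map(float, row[1:]))) for row in rows[1:]}


def significant_pairs(r_values, p_values, alpha=0.05):
    """Return (row, column, r) for each pair above the diagonal with p < alpha."""
    names = list(r_values)
    pairs = []
    for i, row_name in enumerate(names):
        for column_name in names[i + 1:]:
            if p_values[row_name][column_name] < alpha:
                pairs.append((row_name, column_name, r_values[row_name][column_name]))
    return pairs


pairs = significant_pairs(read_matrix(R_CSV), read_matrix(P_CSV))
print(pairs)
```

Only the upper triangle is visited, so each pair is reported once and the diagonal (an item correlated with itself) is skipped. That is a design choice of this sketch, not a requirement stated in the question.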