correlation

Delete weak correlations from network in igraph (vertices and edges)

送分小仙女 posted on 2019-12-05 18:19:35
I need to plot a network from a correlation matrix. A small subset of my data:

    Taxon                                   CD1          CD2
    Actinomycetaceae;g__Actinomyces         0.072998825  0.031399459
    Coriobacteriaceae;g__Atopobium          0.040946468  0.002703265
    Corynebacteriaceae;g__Corynebacterium   0.002517201  0.006446247
    Micrococcaceae;g__Rothia                0.001174694  0.002703265
    Porphyromonadaceae;g__Porphyromonas     0.023326061  0.114368892
    Prevotellaceae;g__Prevotella            0.252894781  0.102308172
    Flavobacteriaceae;g__Capnocytophaga     0.001174694  0.029320025
    Aerococcaceae;g__Abiotrophia            0.002013761  0.003327095
    Carnobacteriaceae;g__Granulicatella     0.042960228  0.049490539
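As a rough illustration of what "deleting weak correlations" could mean, the sketch below keeps only the pairs whose absolute correlation reaches a threshold and then drops the vertices left without any edge. The correlation values, the shortened taxon names, the 0.6 threshold, and the helper names `strong_edges` and `connected_vertices` are all assumptions made for the example; none of them come from the question.

```python
# Hypothetical correlation matrix between four taxa (values invented for illustration).
taxa = ["Actinomyces", "Atopobium", "Rothia", "Prevotella"]
corr = [
    [1.00, 0.82, 0.10, -0.65],
    [0.82, 1.00, 0.05, -0.20],
    [0.10, 0.05, 1.00, 0.15],
    [-0.65, -0.20, 0.15, 1.00],
]


def strong_edges(names, matrix, threshold):
    """Return (a, b, r) for every pair with |r| >= threshold (upper triangle only)."""
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = matrix[i][j]
            if abs(r) >= threshold:
                edges.append((names[i], names[j], r))
    return edges


def connected_vertices(edges):
    """Vertices that still have at least one edge after filtering."""
    return sorted({v for a, b, _ in edges for v in (a, b)})


edges = strong_edges(taxa, corr, threshold=0.6)   # threshold is an arbitrary choice
vertices = connected_vertices(edges)
print(edges)
print(vertices)
```

The filtered edge list and vertex list could then be handed to a graph library for plotting; negative correlations are kept here because the filter uses the absolute value.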

partial correlation coefficient in pandas dataframe python

扶醉桌前 posted on 2019-12-05 18:00:57
I have data in a pandas DataFrame like:

    df =
       X1  X2  X3       Y
    0   1   2  10   5.077
    1   2   2   9  32.330
    2   3   3   5  65.140
    3   4   4   4  47.270
    4   5   2   9  80.570

and I want to do a multiple regression analysis. Here Y is the dependent variable and X1, X2 and X3 are the independent variables. The correlation between each independent variable and the dependent variable is:

    df.corr():
              X1        X2        X3         Y
    X1  1.000000  0.353553 -0.409644  0.896626
    X2  0.353553  1.000000 -0.951747  0.204882
    X3 -0.409644 -0.951747  1.000000 -0.389641
    Y   0.896626  0.204882 -0.389641  1.000000

As we can see here, Y has the highest correlation with X1, so I have selected X1 as first
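For orientation, a first-order partial correlation can be computed directly from three pairwise correlations. The sketch below applies the standard formula to the values in the `df.corr()` output above to get the correlation between Y and X2 with X1 held fixed; the function name `partial_corr` is made up for the example, and this may not be the approach the (truncated) question goes on to describe.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Pairwise correlations copied from the df.corr() output above.
r_y_x2 = 0.204882    # corr(Y, X2)
r_y_x1 = 0.896626    # corr(Y, X1)
r_x1_x2 = 0.353553   # corr(X1, X2)

# Correlation between Y and X2 once the linear effect of X1 is removed from both.
r_y_x2_given_x1 = partial_corr(r_y_x2, r_y_x1, r_x1_x2)
print(round(r_y_x2_given_x1, 4))
```

With only five rows, any such coefficient is very uncertain, so the number is best read as an illustration of the formula rather than a finding.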

Plot networks with igraph

血红的双手。 posted on 2019-12-05 17:29:51
I want to create a network from a correlation matrix and plot it. I'm trying to use igraph for this. This is a subset of my data.

    mydata
    Taxon                                   CD1          CD2
    Actinomycetaceae;g__Actinomyces         0.072998825  0.031399459
    Coriobacteriaceae;g__Atopobium          0.040946468  0.002703265
    Corynebacteriaceae;g__Corynebacterium   0.002517201  0.006446247
    Micrococcaceae;g__Rothia                0.001174694  0.002703265
    Porphyromonadaceae;g__Porphyromonas     0.023326061  0.114368892
    Prevotellaceae;g__Prevotella            0.252894781  0.102308172
    Flavobacteriaceae;g__Capnocytophaga     0.001174694  0.029320025
    Aerococcaceae;g__Abiotrophia            0.002013761  0.003327095

Intraclass Correlation in Python Module?

风流意气都作罢 posted on 2019-12-05 15:23:55
Question

I'm looking to calculate intraclass correlation (ICC) in Python. I haven't been able to find an existing module that has this feature. Is there an alternate name, or should I do it myself? I'm aware this question was asked a year ago on Cross Validated by another user, but there were no replies. I am looking to compare the continuous scores between two raters.

Answer 1:

You can find an implementation at ICC or Brain_Data.icc

Answer 2:

There are several implementations of the ICC in R. These can be used
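If doing it by hand is acceptable, the single-rater ICCs can be derived from a two-way ANOVA decomposition. The sketch below computes ICC(2,1) (absolute agreement, raters treated as random) and ICC(3,1) (consistency, raters treated as fixed); which variant suits a given study is a modelling decision the question does not settle. The function name `icc` and the example scores are invented for illustration and are not taken from the packages mentioned in the answers.

```python
def icc(scores):
    """Return (ICC(2,1), ICC(3,1)) for rows = subjects, columns = raters."""
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    icc2 = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    icc3 = (msr - mse) / (msr + (k - 1) * mse)
    return icc2, icc3


# Made-up continuous scores from two raters on five subjects.
ratings = [[7.0, 8.0], [5.0, 5.5], [9.0, 8.5], [3.0, 4.0], [6.0, 6.5]]
icc2, icc3 = icc(ratings)
print(round(icc2, 3), round(icc3, 3))
```

The function assumes a complete table with no missing ratings; incomplete designs need a different estimator.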

Python Scipy spearman correlation for matrix does not match two-array correlation nor pandas.DataFrame.corr()

人走茶凉 posted on 2019-12-05 10:28:38
I was computing Spearman correlations for a matrix. I found that the matrix input and the two-array input gave different results when using scipy.stats.spearmanr. The results also differ from pandas.DataFrame.corr.

    from scipy.stats import spearmanr  # scipy 1.0.1
    import pandas as pd  # 0.22.0
    import numpy as np

    # Data
    X = pd.DataFrame({"A": [-0.4, 1, 12, 78, 84, 26, 0, 0],
                      "B": [-0.4, 3.3, 54, 87, 25, np.nan, 0, 1.2],
                      "C": [np.nan, 56, 78, 0, np.nan, 143, 11, np.nan],
                      "D": [0, -9.3, 23, 72, np.nan, -2, -0.3, -0.4],
                      "E": [78, np.nan, np.nan, 0, -1, -11, 1, 323]})

    matrix_rho_scipy = spearmanr(X, nan_policy='omit', axis=0)[0]
    matrix_rho
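One possible source of such mismatches is how missing values are removed before ranking; this is an assumption, since the truncated snippet does not show the cause. pandas' `DataFrame.corr` drops missing values pairwise, so each pair of columns is ranked only over the rows where both are present. The standard-library sketch below reproduces that pairwise rule for columns A and B from the data above; the helper names are made up for the example.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_pairwise(x, y):
    """Spearman rho using only the rows where both x and y are present."""
    pairs = [(a, b) for a, b in zip(x, y) if not (math.isnan(a) or math.isnan(b))]
    xs, ys = zip(*pairs)
    return pearson(average_ranks(xs), average_ranks(ys))


nan = float("nan")
A = [-0.4, 1, 12, 78, 84, 26, 0, 0]          # column A from the question
B = [-0.4, 3.3, 54, 87, 25, nan, 0, 1.2]     # column B from the question
rho_ab = spearman_pairwise(A, B)
print(round(rho_ab, 4))
```

If rows were instead dropped whenever any column is missing, or ranks were computed before the missing rows were removed, the same pair of columns would generally give a different coefficient.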

R - Warning message: “In cor(…): the standard deviation is zero”

时光怂恿深爱的人放手 posted on 2019-12-05 09:41:24
Question

I have a single vector of flow data (29 values) and a 3D array (360*180*29). I want to find the correlation between the vector and each grid cell of the 3D array, so the correlation matrix will have a size of 360*180.

    > str(ScottsCk_flow_1981_2010_JJA)
     num [1:29] 0.151 0.644 0.996 0.658 1.702 ...
    > str(ssta_winter)
     num [1:360, 1:180, 1:29] NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ...
    > summary(ssta_winter)
        Min.  1st Qu.   Median     Mean  3rd Qu.     Max.     NA's
        -2.8     -0.2      0.1      0.2      0.6      6.0 596849.0

This above is the structure
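A correlation is undefined when one of the two series is constant, or when a grid cell holds nothing but missing values, which is the situation the warning in the title points to. The sketch below shows one way to handle that explicitly, returning NaN for such cells instead of attempting the division. It is written in Python for illustration rather than R, and the tiny 2 x 2 x 4 grid, the flow values, and the function names are all invented.

```python
import math


def safe_pearson(x, y):
    """Pearson r over complete pairs; NaN if too few pairs or a series is constant."""
    pairs = [(a, b) for a, b in zip(x, y) if not (math.isnan(a) or math.isnan(b))]
    if len(pairs) < 2:
        return float("nan")
    xs, ys = zip(*pairs)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    if sxx == 0 or syy == 0:          # zero standard deviation: r is undefined
        return float("nan")
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def correlation_map(vector, cube):
    """cube[i][j] is the time series at grid cell (i, j); returns a 2-D list of r."""
    return [[safe_pearson(vector, series) for series in row] for row in cube]


nan = float("nan")
flow = [1.0, 2.0, 3.0, 4.0]                       # hypothetical flow series
grid = [
    [[2.0, 4.0, 6.0, 8.0], [nan, nan, nan, nan]],  # second cell: all missing
    [[5.0, 5.0, 5.0, 5.0], [4.0, 3.0, 2.0, 1.0]],  # first cell: constant
]
r_map = correlation_map(flow, grid)
print(r_map)
```

The output has the same shape as the spatial grid, so cells that are all-NaN or constant can be masked later without special cases.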

Scipy: distance correlation is higher than 1

断了今生、忘了曾经 posted on 2019-12-05 08:18:44
I'm trying to find the distance correlation between columns; see the code below. Most of the time it returns a result higher than 1, which should not be possible, because distance correlation is between 0 and 1. You can read about scipy's distance correlation here.

    import numpy as np
    from scipy.spatial import distance

    x = np.random.uniform(-1, 1, 10000)
    print(distance.correlation(x, x**2))
    # 1.00210811815

What is wrong here, or how can I measure it?

upd1: Link to issue on github

I don't see why this is a problem according to the documentation. From the documentation: The correlation distance between u and v,
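The two names are easy to confuse. The correlation distance quoted from the documentation is 1 minus the Pearson correlation, so it ranges from 0 to 2, and a value near 1 just means the Pearson correlation is near 0. Distance correlation in the sense of Székely is a different statistic that does lie between 0 and 1. The sketch below computes both on a small deterministic example; the function names are made up, and the O(n²) pure-Python distance correlation is meant only for tiny inputs.

```python
import math


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_distance(x, y):
    """1 - Pearson r, a value in [0, 2]."""
    return 1 - pearson(x, y)


def _centered_distances(v):
    """Pairwise |vi - vj| matrix with row, column and grand means removed."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]          # equals the column means (d is symmetric)
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation, a value in [0, 1]."""
    a, b = _centered_distances(x), _centered_distances(y)
    n = len(x)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(a[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    dvar_y = sum(b[i][j] ** 2 for i in range(n) for j in range(n)) / n ** 2
    if dvar_x * dvar_y == 0:
        return 0.0
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_y))


x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v ** 2 for v in x]            # nonlinear dependence, zero Pearson correlation
print(correlation_distance(x, y))
print(distance_correlation(x, y))
```

On this example the Pearson correlation is zero, so the correlation distance is 1 even though y is a deterministic function of x, while the distance correlation is strictly positive. The value 1.0021 reported in the question is consistent with a Pearson correlation of about -0.002.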

Similarity between two data sets or arrays

自闭症网瘾萝莉.ら posted on 2019-12-05 08:01:14
Let's say I have a dataset that looks like this:

    {A:1, B:3, C:6, D:6}

I also have a list of other sets to compare my specific set against:

    {A:1, B:3, C:6, D:6},
    {A:2, B:3, C:6, D:6},
    {A:99, B:3, C:6, D:6},
    {A:5, B:1, C:6, D:9},
    {A:4, B:2, C:2, D:6}

My entries could be visualized as a table (with four columns: A, B, C, and D). How can I find the set with the most similarity? For this example, row 1 is a perfect match and row 2 is a close second, while row 3 is quite far away. I am thinking of calculating a simple delta, for example: Abs(a1 - a2) + Abs(b1 - b2) + etc and perhaps get a correlation
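The "simple delta" described above is the Manhattan (L1) distance. A minimal sketch using the sets from the question follows; the function name `manhattan` is made up for the example, and whether L1 is the right notion of similarity depends on how the columns are scaled.

```python
target = {"A": 1, "B": 3, "C": 6, "D": 6}
candidates = [
    {"A": 1, "B": 3, "C": 6, "D": 6},
    {"A": 2, "B": 3, "C": 6, "D": 6},
    {"A": 99, "B": 3, "C": 6, "D": 6},
    {"A": 5, "B": 1, "C": 6, "D": 9},
    {"A": 4, "B": 2, "C": 2, "D": 6},
]


def manhattan(a, b):
    """Sum of absolute differences over the keys of a."""
    return sum(abs(a[k] - b[k]) for k in a)


distances = [manhattan(target, c) for c in candidates]
# Candidate indices ordered from most to least similar.
ranking = sorted(range(len(candidates)), key=lambda i: distances[i])
print(distances)  # [0, 1, 98, 9, 8]
print(ranking)    # [0, 1, 4, 3, 2]
```

The distances match the intuition in the question: the first row is identical, the second differs by 1, and the third is far away because of the single large difference in A.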

Pearson's Coefficient and Covariance calculation in Matlab

风格不统一 posted on 2019-12-05 05:10:49
I want to calculate Pearson's correlation coefficient in Matlab (without using Matlab's corr function). Simply, I have two vectors A and B (each of them is 1x100) and I am trying to calculate Pearson's coefficient like this:

    P = cov(x, y) / (std(x, 1) * std(y, 1))

I am using Matlab's cov and std functions. What I don't get is that the cov function returns a square matrix like this:

    corrAB =
        0.8000    0.2000
        0.2000    4.8000

But I expect a single number as the covariance, so that I can come up with a single P (Pearson's coefficient) number. What is the point I'm missing?

I think you're just confused with
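For two vectors, the 2x2 result is the covariance matrix: the diagonal holds the two variances and the off-diagonal entry is the single covariance number being looked for. The sketch below, in Python for illustration rather than Matlab, builds such a matrix and reads Pearson's coefficient from it; the function names are made up. Note that Matlab's cov normalizes by N - 1 by default while std(x, 1) normalizes by N, so mixing them as in the formula above would skew the result slightly; the sketch uses N - 1 throughout.

```python
import math


def cov_matrix(x, y):
    """2x2 sample covariance matrix [[var(x), cov], [cov, var(y)]], normalized by N - 1."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return [[sxx, sxy], [sxy, syy]]


def pearson_from_cov(c):
    """Off-diagonal covariance divided by the product of the standard deviations."""
    return c[0][1] / math.sqrt(c[0][0] * c[1][1])


# The 2x2 matrix quoted in the question.
c = [[0.8, 0.2], [0.2, 4.8]]
r = pearson_from_cov(c)
print(round(r, 4))

# Same computation starting from raw (made-up) vectors.
r_demo = pearson_from_cov(cov_matrix([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Because the same normalization appears in the numerator and the denominator, the choice of N or N - 1 cancels as long as it is applied consistently.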

Defining a function that calculates the covariance-matrix of a correlation-matrix

流过昼夜 posted on 2019-12-05 02:44:44
Question

I have some problems with the transformation of a matrix and the names of the rows and columns. My problem is as follows: As input matrix I have a (symmetric) correlation matrix like this one: The correlation vector is given by the values of the lower triangular matrix: Now, I want to compute the variance-covariance matrix of these correlations, which are approximately normally distributed with the variance-covariance matrix: The variances can be approximated by -> N is the sample size