correlation

How does number of points change a FFT in MATLAB

Question: When taking fft(signal, nfft) of a signal, how does nfft change the outcome, and why? Can I use a fixed value for nfft, say 2^18, or do I need to use 2^nextpow2(2*length(signal)-1)? I am computing the power spectral density (PSD) of two signals by taking the FFT of the autocorrelation, and I want to compare the results. Since the signals are of different lengths, I am worried that if I don't fix nfft, the comparison will be really hard!

Answer 1: There is no inherent reason to use a power of two (it just might make the processing more efficient in some circumstances). However, to make the FFTs of the two signals directly comparable, use the same nfft for both.
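
The question is about MATLAB, but the idea is easy to sketch in Python/NumPy (an illustrative analogue; the helper name is mine, not from the thread): zero-pad both autocorrelations to the same fixed nfft so the resulting PSDs share one frequency grid regardless of signal length.

    import numpy as np

    def psd_fixed_nfft(signal, nfft=2**18, fs=1.0):
        """PSD via the FFT of the (biased) autocorrelation, zero-padded to nfft."""
        x = signal - signal.mean()
        # full autocorrelation, normalized by the signal length (biased estimate)
        acf = np.correlate(x, x, mode="full") / len(x)
        psd = np.abs(np.fft.rfft(acf, n=nfft))    # same grid for any input length
        freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
        return freqs, psd

    # Two signals of different lengths become directly comparable:
    f1, p1 = psd_fixed_nfft(np.random.randn(1000))
    f2, p2 = psd_fixed_nfft(np.random.randn(2500))
    assert f1.shape == f2.shape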

R: Efficiently locating time series segments with maximal cross-correlation to input segment?

Question: I have a long numerical time series of approximately 200,000 rows (let's call it Z). In a loop, I subset x (about 30) consecutive rows from Z at a time and treat them as the query point q. I want to locate within Z the y (~300) time-series segments of length x that are most correlated with q. What is an efficient way to accomplish this?

Answer 1: The code below finds the 300 segments you are looking for and runs in 8 seconds on my none-too-powerful Windows laptop, so it should be fast enough for your purposes. First, it constructs a 30-by-199971 matrix (Zmat) whose columns contain all 199,971 length-30 windows of Z.
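
The answer's R code is cut off above; here is a compact NumPy sketch of the same approach (an analogue, not the answerer's code): build the window matrix once, then rank every window by its Pearson correlation with the query.

    import numpy as np
    from numpy.lib.stride_tricks import sliding_window_view

    def top_correlated_segments(Z, q, k=300):
        """Start indices of the k windows of Z most correlated with q."""
        W = sliding_window_view(Z, len(q))    # shape: (len(Z)-len(q)+1, len(q))
        Wc = W - W.mean(axis=1, keepdims=True)
        qc = q - q.mean()
        # Pearson correlation of every window with q, computed in one shot
        r = (Wc @ qc) / (np.linalg.norm(Wc, axis=1) * np.linalg.norm(qc))
        top = np.argpartition(r, -k)[-k:]     # k best, then sort them
        return top[np.argsort(r[top])[::-1]]

    Z = np.random.randn(200_000)
    q = Z[1000:1030]
    idx = top_correlated_segments(Z, q, k=300)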

How to move larger values close to matrix diagonal in a correlation matrix

Question: I have a correlation matrix X of five elements (C1, C2, C3, C4, C5):

        C1 C2 C3 C4 C5
    C1   *  1  0  1  0
    C2   1  *  0  0  1
    C3   0  0  *  1  1
    C4   1  0  1  *  0
    C5   0  1  1  0  *

I want to use MATLAB to move as many non-zero cells as possible close to the diagonal, while keeping the diagonal cells as "*". For example, you may notice that the columns and rows are shifted in the following matrix, while the diagonal cells remain "*":

        C1 C4 C2 C5 C3
    C1   *  1  1  0  0
    C4   1  *  0  0  1
    C2   1  0  *  1  0
    C5   0  0  1  *  1
    C3   0  1  0  1  *

Because I want to do clustering, I want as many non-zero cells as possible close to the diagonal after shifting. It's an NP-hard problem. Anyone have suggestions?
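
Minimizing the bandwidth exactly is NP-hard, but the reverse Cuthill-McKee heuristic does precisely this kind of reordering (MATLAB exposes it as symrcm). A SciPy sketch of the same idea, using the 0/1 matrix from the question with the "*" diagonal treated as 1:

    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import reverse_cuthill_mckee

    X = np.array([[1, 1, 0, 1, 0],
                  [1, 1, 0, 0, 1],
                  [0, 0, 1, 1, 1],
                  [1, 0, 1, 1, 0],
                  [0, 1, 1, 0, 1]])

    perm = reverse_cuthill_mckee(csr_matrix(X), symmetric_mode=True)
    X_reordered = X[np.ix_(perm, perm)]   # same order applied to rows and columns
    labels = np.array(["C1", "C2", "C3", "C4", "C5"])[perm]
    print(labels)
    print(X_reordered)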

Correlation between groups in R data.table

Question: Is there a way of elegantly calculating the correlations between values if those values are stored by group in a single column of a data.table (other than converting the data.table to a matrix)?

    library(data.table)
    set.seed(1)  # reproducibility
    dt <- data.table(id=1:4, group=rep(letters[1:2], c(4,4)), value=rnorm(8))
    setkey(dt, group)
    #    id group      value
    # 1:  1     a -0.6264538
    # 2:  2     a  0.1836433
    # 3:  3     a -0.8356286
    # 4:  4     a  1.5952808
    # 5:  1     b  0.3295078
    # 6:  2     b -0.8204684
    # 7:  3     b  0.4874291
    # 8:  4     b  0.7383247

Something that works, but requires the group names as input:

    cor(dt["a"]$value, dt["b"]$value)
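
For comparison, a pandas sketch of the same idea (an analogue, not from the original thread): pivot the long table so each group becomes a column, then let .corr() handle every pair of groups at once.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(1)
    df = pd.DataFrame({"id": list(range(1, 5)) * 2,
                       "group": ["a"] * 4 + ["b"] * 4,
                       "value": rng.normal(size=8)})

    wide = df.pivot(index="id", columns="group", values="value")
    print(wide.corr())   # correlation between every pair of groups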

Generate correlated data in Python (3.3)

Question: In R there is a function (cm.rnorm.cor, from package CreditMetrics) that takes the number of samples, the number of variables, and a correlation matrix in order to create correlated data. Is there an equivalent in Python?

Answer 1: numpy.random.multivariate_normal is the function that you want. Example:

    import numpy as np
    import matplotlib.pyplot as plt

    num_samples = 400

    # The desired mean values of the sample.
    mu = np.array([5.0, 0.0, 10.0])

    # The desired covariance matrix.
    r = np.array([ [ 3
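
The answer's covariance matrix is cut off above. A complete, minimal version of the same recipe, with a made-up (hypothetical) covariance matrix standing in for the original one:

    import numpy as np

    rng = np.random.default_rng(0)
    num_samples = 400
    mu = np.array([5.0, 0.0, 10.0])

    # Hypothetical covariance matrix: must be symmetric positive semi-definite.
    r = np.array([[ 3.0, -2.0,  0.5],
                  [-2.0,  5.0,  1.0],
                  [ 0.5,  1.0,  2.0]])

    y = rng.multivariate_normal(mu, r, size=num_samples)   # shape (400, 3)
    print(np.cov(y, rowvar=False))   # sample covariance, close to r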

Finding lag at which cross correlation is maximum ccf( )

Question: I have two time series and I am using ccf to find the cross-correlation between them. ccf(ts1, ts2) lists the cross-correlations for all time lags. How can I find the lag which results in the maximum correlation without manually looking at the data?

Answer 1: Posting the answer from http://r.789695.n4.nabble.com/ccf-function-td2288257.html

    Find_Max_CCF <- function(a, b) {
      d <- ccf(a, b, plot = FALSE)
      cor = d$acf[,,1]
      lag = d$lag[,,1]
      res = data.frame(cor, lag)
      res_max = res[which.max(res$cor),]
      return(res_max)
    }
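
A NumPy/SciPy analogue of the same idea (illustrative, not from the thread): cross-correlate the demeaned series and read off the lag at the peak.

    import numpy as np
    from scipy import signal

    def find_max_ccf(a, b):
        """Lag at which the cross-correlation of a and b is maximal."""
        a = np.asarray(a, dtype=float) - np.mean(a)
        b = np.asarray(b, dtype=float) - np.mean(b)
        corr = signal.correlate(a, b, mode="full")
        lags = signal.correlation_lags(len(a), len(b), mode="full")
        best = np.argmax(corr)
        return lags[best], corr[best]

    # b is a shifted by 5 samples; the peak lag recovers that shift
    # (its sign depends on SciPy's lag convention)
    a = np.random.randn(500)
    b = np.roll(a, 5)
    print(find_max_ccf(a, b))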

Polychoric correlation matrix with significance in R

Question: I have been desperately looking for a way to compute a polychoric correlation matrix with significance in R. If that is very hard, then a polychoric correlation between two variables with significance would be sufficient. What I have tried so far:

    library(polychor)
    poly <- polychor(var1, var2)
    poly <- polychor(DatM)  # where DatM is a DF converted to matrix

    library(polycor)
    hetcor(Dat2)  # I am however uncertain hetcor is something I would want
                  # if I am looking for polychoric correlation.

    library(psych)
    polychoric(Dat$for2a, smooth=TRUE, global=TRUE, polycor=FALSE, ML=FALSE, std.err=TRUE)

None of these gave me the significance levels I need.
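
One route to significance, given an estimate and its standard error (e.g., what psych::polychoric reports with std.err=TRUE), is a Wald z-test. A sketch of the statistics in Python/SciPy (an illustration with made-up numbers, not an R solution from the thread):

    import numpy as np
    from scipy import stats

    def wald_pvalue(rho, se):
        """Two-sided p-value for H0: rho == 0, given the estimate and its SE."""
        z = rho / se
        return 2 * stats.norm.sf(abs(z))

    # Hypothetical values standing in for a polychoric estimate and its SE
    print(wald_pvalue(rho=0.42, se=0.11))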

numpy corrcoef - compute correlation matrix while ignoring missing data

Question: I am trying to compute a correlation matrix of several values. These values include some NaN values. I'm using numpy.corrcoef. For element (i, j) of the output correlation matrix I'd like to have the correlation calculated using all values that exist for both variable i and variable j. This is what I have now:

    In [20]: df_counties = pd.read_sql("SELECT Median_Age, Rpercent_2008, overall_LS, population_density FROM countyVotingSM2", db_eng)

    In [21]: np.corrcoef(df_counties, rowvar=False)
    Out[21]:
    array([[ 1.        ,         nan,         nan, -0.10998411],
           [        nan,         nan,         nan,         nan],
           [        nan,         nan,         nan,         nan],
           [-0.10998411,
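
pandas computes exactly this pairwise-complete correlation: DataFrame.corr() drops NaNs pair by pair instead of propagating them the way np.corrcoef does. A minimal sketch (illustrative data, not the poster's):

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0, 5.0],
                       "b": [2.0, np.nan, 6.0, 8.0, 10.0],
                       "c": [5.0, 4.0, np.nan, 2.0, 1.0]})

    # Each (i, j) entry uses only the rows where both columns are non-NaN
    print(df.corr())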

Correlation coefficients and p values for all pairs of rows of a matrix

Question: I have a matrix data with m rows and n columns. I used to compute the correlation coefficients between all pairs of rows using np.corrcoef:

    import numpy as np
    data = np.array([[0, 1, -1], [0, -1, 1]])
    np.corrcoef(data)

Now I would also like to have a look at the p-values of these coefficients. np.corrcoef doesn't provide these; scipy.stats.pearsonr does. However, scipy.stats.pearsonr does not accept a matrix as input. Is there a quick way to compute both the coefficient and the p-value for all pairs of rows (arriving e.g. at two m-by-m matrices, one with the correlation coefficients, the other with the corresponding p-values)?
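
A minimal NumPy/SciPy sketch of one common approach (an illustration, not an authoritative answer): take r from np.corrcoef, then convert each coefficient to a two-sided p-value through the usual t-statistic with n - 2 degrees of freedom.

    import numpy as np
    from scipy import stats

    def corrcoef_with_pvalues(data):
        """Pearson r and two-sided p-values for all pairs of rows of data."""
        r = np.corrcoef(data)
        n = data.shape[1]                     # observations per row
        with np.errstate(divide="ignore", invalid="ignore"):
            t = r * np.sqrt((n - 2) / (1.0 - r ** 2))   # t-statistic per pair
        p = 2 * stats.t.sf(np.abs(t), df=n - 2)
        np.fill_diagonal(p, 0.0)              # r is exactly 1 on the diagonal
        return r, p

    data = np.array([[0, 1, -1], [0, -1, 1]])
    r, p = corrcoef_with_pvalues(data)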