statistics | 易学教程

How to calculate percentage change from different rows over different spans

Question: I am trying to calculate the percentage change in price for quarterly data of companies identified by a gvkey (1001, 1384, etc.), together with the corresponding quarterly stock price, PRCCQ.

       gvkey  PRCCQ
    1   1004 23.750
    2   1004 13.875
    3   1004 11.250
    4   1004 10.375
    5   1004 13.600
    6   1004 14.000
    7   1004 17.060
    8   1004  8.150
    9   1004  7.400
    10  1004 11.440
    11  1004  6.200
    12  1004  5.500
    13  1004  4.450
    14  1004  4.500
    15  1004  8.010

What I am trying to do is add 8 columns showing the 1-quarter return, 2-quarter return, etc.
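The calculation described in the question can be sketched in plain Python. The excerpt does not show which language the asker uses, so this is only an illustration under stated assumptions: rows are already in chronological order within each gvkey, and the k-quarter return is taken to be the percentage change relative to the price k rows earlier for the same company. The function name `span_returns` and its parameters are hypothetical.

```python
from collections import defaultdict


def span_returns(rows, max_span=8):
    """Compute k-quarter percentage changes for k = 1..max_span.

    rows: iterable of (gvkey, price) pairs, assumed to be in
    chronological order within each gvkey (an assumption, not
    something the question states explicitly).
    Returns a list of (gvkey, price, returns), where returns[k-1] is
    the percentage change versus the price k quarters earlier, or
    None when that company has fewer than k earlier rows.
    """
    history = defaultdict(list)  # prices seen so far, per gvkey
    out = []
    for gvkey, price in rows:
        past = history[gvkey]
        returns = []
        for k in range(1, max_span + 1):
            if len(past) >= k:
                base = past[-k]  # price k quarters back, same company
                returns.append((price - base) / base * 100.0)
            else:
                returns.append(None)  # not enough history for this span
        out.append((gvkey, price, returns))
        past.append(price)
    return out


# First four rows of the question's data, with spans 1..3 for brevity.
rows = [(1004, 23.750), (1004, 13.875), (1004, 11.250), (1004, 10.375)]
result = span_returns(rows, max_span=3)
for gvkey, price, returns in result:
    print(gvkey, price, returns)
```

Keeping a separate history per gvkey prevents a return from being computed across two different companies. If the data lives in a pandas DataFrame, `groupby("gvkey")["PRCCQ"].pct_change(periods=k)` expresses the same idea, returned as a fraction rather than a percentage.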

R cluster analysis and dendrogram with correlation matrix

Question: I have to perform a cluster analysis on a large amount of data. Since I have a lot of missing values, I computed a correlation matrix:

    corloads = cor(df1[,2:185], use = "pairwise.complete.obs")

Now I am unsure how to proceed. I have read a lot of articles and examples, but nothing really works for me. How can I find out how many clusters are appropriate? I already tried this:

    dissimilarity = 1 - corloads
    distance = as.dist(dissimilarity)
    plot(hclust(distance), main="Dissimilarity = 1 - Correlation",

How to get both MSE and R2 from a sklearn GridSearchCV?

Question: I can use GridSearchCV on a pipeline and set scoring to either 'MSE' or 'R2'. I can then access gridsearchcv.best_score_ to recover the score I specified. How do I also get the other score for the solution found by GridSearchCV? If I run GridSearchCV again with the other scoring parameter, it might not find the same solution, so the score it reports might not correspond to the same model as the one that produced the first value. Maybe I can extract the parameters and supply

Z3 real arithmetic and statistics

Question: Given a problem that is encoded using Z3's reals, which of the statistics that Z3 /smt2 /st produces might help to judge whether the reals engine "has problems/does lots of work"? In my case, I have two mostly equivalent encodings of the problem, both using reals. The "small" difference in the encoding, however, makes a big difference in runtime: encoding A takes 2:30 min and encoding B takes 13 min. The Z3 statistics show that conflicts and quant-instantiations are mostly

one way ANOVA with repeated measurements - Not within-subjects

Question: I'm trying to conduct a one-way ANOVA with repeated measurements. However, the repeated measurements are independent: they do not represent measurements of a subject under different conditions, but simply replications under the same conditions. This means that if I obtain two measurements for one subject and they differ, the difference is due only to randomness. I looked around and there seems to be a within-subjects ANOVA, but that assumes that the measurements per subject are correlated,

one way ANOVA and TUKEY in R with conditions

Question: I am trying to find the mean differences between the levels of my variable stim_ending_t, which has the following 6 levels: 1, 1.5, 2, 2.5, 3, 3.5. You can access the df here.

      stim_ending_t visbility soundvolume Opening_text               m    sd coefVar
              <dbl>     <dbl>       <dbl> <chr>                  <dbl> <dbl>   <dbl>
    1             1         0           0 Now focus on the Image  1.70 1.14    0.670
    2             1         0           0 Now focus on the Sound  1.57 0.794   0.504
    3             1         0           1 Now focus on the Image  1.55 1.09    0.701
    4             1         0           1 Now focus on the Sound  1.77 0.953   0.540
    5             1         1           0 Now focus on the Image  1.38 0

Computing autocorrelation of vectors with numpy

Question: I'm struggling to come up with a non-obfuscating, efficient way of using numpy to compute a self-correlation function for a set of 3D vectors. I have a set of vectors in 3D space, saved in an array

    a = array([[ 0.24463039,  0.58350592,  0.77438803],
               [ 0.30475903,  0.73007075,  0.61165238],
               [ 0.17605543,  0.70955876,  0.68229821],
               [ 0.32425896,  0.57572195,  0.7506    ],
               [ 0.24341381,  0.50183697,  0.83000565],
               [ 0.38364726,  0.62338687,  0.68132488]])

Their self-correlation function is defined as in case

Loop through a .csv file in R, computing relative frequencies?

Question: I'm new to R and I'm trying to create a .R script that will open a .csv file of mine and compute some frequencies. The file has headers, and the values under them are either 1, 0, NA, or -4. What I want to do is go through each column and compute the frequencies of its values. I'm sure this is an easy script, but I'm not sure how the syntax of R works yet. Can anyone get me started on this, please?

Answer 1: The exact script is going to vary based on your input and what
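As an illustration of the per-column computation the question describes, here is a minimal sketch in Python rather than R, using only the standard library. It is not the truncated answer's method. The function name `relative_frequencies`, the column names `q1` and `q2`, and the sample data are hypothetical. The sketch treats every cell as a string, so `NA` and `-4` are counted as ordinary categories.

```python
import csv
import io
from collections import Counter


def relative_frequencies(csv_file):
    """Return {column: {value: share of rows}} for a CSV with a header row.

    csv_file: any open text file object (or io.StringIO).
    Values are compared as strings, so "NA" and "-4" are simply
    counted as categories of their own.
    """
    reader = csv.DictReader(csv_file)
    counts = {name: Counter() for name in (reader.fieldnames or [])}
    for record in reader:
        for name in counts:
            counts[name][record[name]] += 1  # tally this column's value
    freqs = {}
    for name, counter in counts.items():
        total = sum(counter.values())
        freqs[name] = {value: n / total for value, n in counter.items()}
    return freqs


# Hypothetical sample standing in for the asker's file.
sample = io.StringIO("q1,q2\n1,0\n0,NA\n1,-4\n1,0\n")
freqs = relative_frequencies(sample)
for column, shares in freqs.items():
    print(column, shares)
```

To run it on a real file, pass `open("data.csv", newline="")` in place of the `io.StringIO` object; the file name here is a placeholder.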