statistics

Extracting and formatting results of cor.test on multiple pairs of columns

柔情痞子 submitted on 2019-12-23 01:28:10
Question: I am trying to generate a table output of a correlation matrix. Specifically, I am using a for loop to compute the correlation between each of columns 4:40 and column 1. While the resulting table is decent, it does not identify what is being compared to what. Checking the attributes of cor.test, I find that data.name is given as x[1] and y[1], which is not enough to trace back which column is being compared to which. Here is my code: input <- read.delim(file=
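
One way to keep track of which pair each result belongs to is to store the column names next to each coefficient. The following is a minimal sketch in Python rather than R, and it computes only Pearson's r (no p-value, unlike cor.test). The table, its column names, and both helper functions are hypothetical stand-ins, since the question's input file is not shown.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlate_against(table, target, others):
    """Return (target name, other name, r) rows so every result is labelled."""
    return [(target, name, pearson_r(table[target], table[name]))
            for name in others]

# Hypothetical data standing in for the question's input file.
table = {
    "age": [1.0, 2.0, 3.0, 4.0, 5.0],
    "height": [2.0, 4.0, 6.0, 8.0, 10.0],
    "score": [5.0, 4.0, 3.0, 2.0, 1.0],
}
results = correlate_against(table, "age", ["height", "score"])
for target, name, r in results:
    print(f"{target} vs {name}: r = {r:.3f}")
```

Carrying the names in each row avoids relying on a label generated inside the test call.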

Homography computation using Levenberg Marquardt algorithm [closed]

帅比萌擦擦* submitted on 2019-12-22 18:33:12
Question: (Closed as off-topic 3 years ago.) The findHomography() function in OpenCV finds a perspective transformation between two planes. The computed homography matrix is refined further (using inliers only in case of a robust method) with the Levenberg-Marquardt method to reduce the re-projection error even more. Can anyone provide any links to C/C++

How many times update maximum gets called when trying to find the running maximum in a randomized array of numbers?

為{幸葍}努か submitted on 2019-12-22 18:11:48
Question: Suppose we have an array of size 2N + 1 containing the integers -N to N. We first shuffle the elements in the array, then try to find the maximum integer by iterating through the array from the first element to the last element: (code example is in Java) int called = 0; int max = Integer.MIN_VALUE; for (int i : array) { if (i > max) { called++; max = i; } } What is the expected value (the average over many runs) of called after iterating through the array? Edit: How I found it to be close to
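
For distinct values in uniformly random order, the i-th element is a new maximum exactly when it is the largest of the first i elements, which happens with probability 1/i. By linearity of expectation, the expected update count for n = 2N + 1 elements is the harmonic number H_n = 1 + 1/2 + ... + 1/n. A small sketch that checks this by simulation follows; the function names, the seed, and the choice N = 50 are illustrative assumptions.

```python
import random
from fractions import Fraction

def count_max_updates(values):
    """Count how often the running maximum is replaced during one pass."""
    called = 0
    current_max = None
    for v in values:
        if current_max is None or v > current_max:
            called += 1
            current_max = v
    return called

def harmonic(n):
    """Exact n-th harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def simulate(N, runs, seed=0):
    """Average update count over `runs` shuffles of the integers -N..N."""
    rng = random.Random(seed)
    array = list(range(-N, N + 1))
    total = 0
    for _ in range(runs):
        rng.shuffle(array)
        total += count_max_updates(array)
    return total / runs

N = 50
expected = float(harmonic(2 * N + 1))
observed = simulate(N, runs=20000)
print(f"H_{2 * N + 1} = {expected:.4f}, simulated average = {observed:.4f}")
```

Since H_n grows like ln(n) plus a constant, the update count stays small even for large arrays.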

how to use a log scale for y-axis of histogram in R?

痞子三分冷 submitted on 2019-12-22 17:07:18
Question: I have a large dataset with the lifespan of threads on a discussion board. I want a histogram that shows the distribution of lifespan, so I did this: dall <- read.csv("lifespan.csv") colnames(dall) <- c("thread.id", "seconds.alive", "start.time") hist(dall$seconds.alive) which generated this hard-to-read image: My questions are a) is changing the y-axis to a log scale a good way to make it more readable? Apparently some people think it is a bad idea to change the y-axis to log. b) how do I do that? Answer 1

How to write quadratic equation on Label in C# WinForms?

别说谁变了你拦得住时间么 submitted on 2019-12-22 14:58:09
Question: We are making statistical software. In many places we need to display formulas such as ax2+bx+c. How can we make the 2 in ax2 appear as a superscript, that is, x squared? I want to display the 2 raised above the x. Likewise for πc, where I want to display the c as a subscript. Answer 1: Do you have a fixed list of formulas that users can choose but cannot edit? Then generate an image for each formula, store them in your application, and display them in a PictureBox. If you expect users to be able to type in arbitrary formulas and render them interactively, you will

Can my standard deviation calculation be made more efficient?

喜欢而已 submitted on 2019-12-22 10:34:23
Question: I'm curious if my standard deviation method can be made more efficient. By efficient I mean fast, and by fast I mean latency from method call to method return. Here's the code: public double stdDev(ArrayList<Double> inputs) { double Nrecip = ( 1.0 / ( inputs.size()) ); double sum = 0.0; double average = 0.0; for (Double input : inputs) { average += input; } average *= Nrecip; for (Double input : inputs) { sum += ( (input - average)*(input - average) ); } sum *= Nrecip; return Math.sqrt(sum); }
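
The method above makes two passes over the data and divides by N, so it computes the population standard deviation. One commonly used alternative is Welford's one-pass update, sketched here in Python rather than Java. The function name is hypothetical, and whether a single pass actually lowers latency for a given list would need to be measured.

```python
import math

def population_std_dev(values):
    """One-pass (Welford) population standard deviation."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count == 0:
        raise ValueError("need at least one value")
    # Divide by count (not count - 1) to match the question's formula.
    return math.sqrt(m2 / count)

print(population_std_dev([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

The update also works on streams, because it never needs to revisit earlier values.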

Picking fair teams - and the math to prove it

拜拜、爱过 submitted on 2019-12-22 10:29:49
Question: Application: similar to picking playground teams. I must divide a collection of n sequentially ranked elements into two teams of n/2. The teams must be as "even" as possible. Think of "even" in terms of playground teams, as described above. The rankings indicate relative "skill" or value levels. Element #1 is worth 1 "point", element #2 is worth 2, etc. No other constraints. So if I had a collection [1,2,3,4], I would need two teams of two elements. The possibilities are [1,2] & [3,4] [1,3] &
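
For small n, the possibilities can simply be enumerated. The sketch below assumes that "even" means the two teams' point totals are as close as possible, which is one reading of the question and not something it states outright; the function name is hypothetical. It fixes rank 1 on the first team so that mirror-image splits are not counted twice.

```python
from itertools import combinations

def fairest_split(n):
    """Brute-force the split of ranks 1..n into two teams of n/2
    whose point totals are closest. Feasible only for small n."""
    if n % 2 != 0:
        raise ValueError("n must be even")
    ranks = range(1, n + 1)
    total = n * (n + 1) // 2
    best_team, best_gap = None, None
    for team in combinations(ranks, n // 2):
        if 1 not in team:  # rank 1 always goes on the first team
            continue
        gap = abs(total - 2 * sum(team))  # difference between team totals
        if best_gap is None or gap < best_gap:
            best_team, best_gap = team, gap
    other = tuple(r for r in ranks if r not in best_team)
    return best_team, other, best_gap

for n in (4, 6, 8):
    print(n, fairest_split(n))
```

When n(n+1)/2 is odd, as for n = 6, the totals cannot be equal, so the smallest achievable gap is 1.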

Simple approximation of Inverse Incomplete gamma function

醉酒当歌 submitted on 2019-12-22 10:12:29
Question: How could one approximate the inverse of the incomplete gamma function Γ(s,x) by some simple analytical function f(s,Γ)? That is, write something like x = f(s,Γ) = 12*log(123.45*Γ) + Γ + 123.4^s. (I need at least ideas or references.) Answer 1: You can look at the code in Boost: http://www.boost.org/doc/libs/1_35_0/libs/math/doc/sf_and_dist/html/math_toolkit/special/sf_gamma/igamma.html and see what they're using. EDIT: They also have inverses: http://www.boost.org/doc/libs/1_35_0/libs/math/doc/sf_and
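
If a closed-form approximation is not strictly required, the inverse can also be obtained numerically. The sketch below is not the Boost algorithm and not an analytical formula: it evaluates the regularized lower incomplete gamma function P(s, x) with its standard power series and inverts it by bisection. The function names are hypothetical, and the fixed series length makes it suitable only for moderate x.

```python
import math

def regularized_lower_gamma(s, x, terms=500):
    """P(s, x) = gamma(s, x) / Gamma(s) from the power series
    gamma(s, x) = x^s e^-x * sum_k x^k / (s (s+1) ... (s+k))."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / s
    total = term
    for k in range(1, terms):
        term *= x / (s + k)
        total += term
        if term < total * 1e-16:
            break
    return total * math.exp(-x + s * math.log(x) - math.lgamma(s))

def inverse_lower_gamma(s, p, tol=1e-12):
    """Solve P(s, x) = p for x by bisection, for 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    lo, hi = 0.0, 1.0
    while regularized_lower_gamma(s, hi) < p:  # grow the bracket
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if regularized_lower_gamma(s, mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inverse_upper_gamma(s, value):
    """Solve Gamma(s, x) = value for x (upper incomplete gamma),
    using Gamma(s, x) = Gamma(s) * (1 - P(s, x))."""
    return inverse_lower_gamma(s, 1.0 - value / math.gamma(s))

# For s = 1, Gamma(1, x) = exp(-x), so the result can be compared with log(2).
print(inverse_upper_gamma(1.0, 0.5), math.log(2.0))
```

A numerical inverse like this can also serve as a reference when judging the error of any candidate closed-form f(s,Γ).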

Fitting data with a custom distribution using scipy.stats

喜夏-厌秋 submitted on 2019-12-22 08:28:31
Question: So I noticed that there is no implementation of the skewed generalized t distribution in scipy. It would be useful for me to fit this distribution to some data I have. Unfortunately, fit doesn't seem to be working for me in this case. To explain further, I have implemented it like so: import numpy as np import pandas as pd import scipy.stats as st from scipy.special import beta class sgt(st.rv_continuous): def _pdf(self, x, mu, sigma, lam, p, q): v = q ** (-1 / p) * \ ((3 * lam ** 2 + 1) * (

Python PCA plot using Hotelling's T2 for a confidence interval

我们两清 submitted on 2019-12-22 08:14:32
Question: I am trying to apply PCA for multivariate analysis and plot the score plot for the first two components with a Hotelling's T2 confidence ellipse in Python. I was able to get the scatter plot, and I want to add a 95% confidence ellipse to it. It would be great if anyone knows how this can be done in Python. Sample picture of expected output: Answer 1: This was bugging me, so I adapted an answer from PCA and Hotelling's T^2 for confidence intervall in R to Python (and using some source code from