statistics

Fitting a linear combination of distributions

余生颓废 posted on 2021-01-28 14:03:42
Question: I have 5 arrays (columns of a pandas data frame) and I want to calculate the best fit of a linear combination of the distributions to an exponential distribution. For example: a*(d1)+b*(d2)+c*(d3)+d*(d4)+e*(d5)=Y, where Y has an exponential distribution (which I know) and a, b, c, d, e are the coefficients to fit. I tried using curve_fit and the lmfit Python library but couldn't work out how to do it effectively. Answer 1: What you're describing is a linear model. Use the package scikit-learn: from sklearn.linear
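The linear-model reading in the answer can be sketched as an ordinary least-squares fit. This is a minimal illustration, not the answer's own code (which is cut off above): it uses NumPy's `np.linalg.lstsq` rather than scikit-learn, it assumes each row of the five columns lines up with one observed value of Y, and the data and "true" coefficients are synthetic placeholders.

```python
import numpy as np

# Synthetic placeholder data: 5 columns standing in for d1..d5.
rng = np.random.default_rng(0)
D = rng.random((200, 5))

# Hypothetical coefficients a..e, used only to build an example target Y.
true_coef = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = D @ true_coef + rng.normal(scale=0.01, size=200)

# Least-squares solution of D @ coef ≈ Y, with no intercept term.
coef, residuals, rank, singular_values = np.linalg.lstsq(D, Y, rcond=None)
print(coef)
```

With scikit-learn installed, `sklearn.linear_model.LinearRegression(fit_intercept=False).fit(D, Y).coef_` should solve the same least-squares problem.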

“Import Statistics” Fails To Run

冷暖自知 posted on 2021-01-28 09:26:33
Question: When I use IDLE, the code "import statistics" runs. However, when I use Sublime Text, other packages such as matplotlib can be imported, but I cannot import the statistics module. This is the code:
    import math
    import matplotlib
    import statistics
I expect nothing to appear on the screen, but the command line prints the error below. Note that the first two lines of code did work.
    ImportError: No module named statistics
    [Finished in 1.2s with exit code 1]
    [shell_cmd: python -u "/Users/Ivan
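One possible cause, offered as an assumption since the answer is not shown here: the `statistics` module only exists in the standard library from Python 3.4 onward, and the `python` command in the build output may point at an older interpreter than the one IDLE uses. A small diagnostic sketch that can be run from both editors to compare:

```python
import sys

# Which interpreter is actually running this script?
print(sys.executable)
print(sys.version)

# Try the import and record the outcome instead of crashing.
try:
    import statistics
    has_statistics = True
except ImportError:
    has_statistics = False

print("statistics available:", has_statistics)
```

If the two editors print different interpreter paths or versions, pointing the Sublime Text build command at the interpreter IDLE uses would be the natural next step.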

Why doesn't Johnson-SU distribution give positive skewness in scipy.stats?

北战南征 posted on 2021-01-28 08:10:15
Question: The code below maps the statistical moments (mean, variance, skewness, excess kurtosis) generated by the corresponding parameters (a, b, loc, scale) of the Johnson-SU distribution (johnsonsu). For the range of loop values specified in my code below, no parameter configuration results in positive skewness, only negative skewness, even though it should be possible to parameterize the Johnson-SU distribution to be positively skewed.
    import numpy as np
    import pandas as pd
    from scipy.stats
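The question's loop is cut off, so the parameter ranges it used are unknown. As a rough sketch of how the sign of `a` affects skewness, the code below simulates a standard Johnson-SU variate without SciPy, assuming the relation X = sinh((Z - a) / b) with Z standard normal, which is how I read the `johnsonsu` density in the SciPy documentation. The helper names are hypothetical.

```python
import math
import random


def johnson_su_sample(a, b, n, seed=0):
    """Draw n standard Johnson-SU variates via X = sinh((Z - a) / b), Z ~ N(0, 1)."""
    rng = random.Random(seed)
    return [math.sinh((rng.gauss(0.0, 1.0) - a) / b) for _ in range(n)]


def sample_skewness(xs):
    """Biased sample skewness: third central moment divided by variance ** 1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Same shape parameter b, opposite signs of a.
skew_pos_a = sample_skewness(johnson_su_sample(a=1.0, b=2.0, n=50_000))
skew_neg_a = sample_skewness(johnson_su_sample(a=-1.0, b=2.0, n=50_000))
print(skew_pos_a, skew_neg_a)
```

Under that assumption, a positive `a` gives negative skewness and a negative `a` gives positive skewness, so a loop that only covers positive values of `a` would never show positive skew.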

d3.quantile seems to be calculating Q1 incorrectly

左心房为你撑大大i posted on 2021-01-28 08:06:40
Question: I'm giving a sorted array of 24 numbers to d3.quantile and asking it to calculate the first quartile value. Since the array can be split evenly into four groups of 6 values, my assumption was that the result would be the mean of arr[5] and arr[6], but that's not what I got.
    var arr = [89.7, 93.2, 94, 94.3, 94.5, 95.4, 95.9, 96.1, 96.4, 96.5, 96.9, 96.9, 97.3, 97.6, 97.6, 97.6, 97.8, 98.3, 98.3, 98.4, 98.5, 98.5, 98.6, 98.6];
    var myAssumption = (arr[5] + arr[6]) / 2; // 95.65
    var d3Result = d3
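The d3-array documentation describes `d3.quantile` as using the R-7 method, which interpolates linearly at position (n - 1) * p instead of averaging the two values on either side of a group boundary. The sketch below reproduces that rule in Python on the array from the question; `quantile_r7` is a hypothetical helper name, and the standard library's `statistics.quantiles` with `method="inclusive"` is shown as a cross-check.

```python
import math
import statistics


def quantile_r7(sorted_values, p):
    """Linear-interpolation quantile (R-7) of an ascending-sorted list."""
    h = (len(sorted_values) - 1) * p           # fractional index into the list
    lo = math.floor(h)
    hi = min(lo + 1, len(sorted_values) - 1)   # guard the upper end for p = 1
    return sorted_values[lo] + (h - lo) * (sorted_values[hi] - sorted_values[lo])


arr = [89.7, 93.2, 94, 94.3, 94.5, 95.4, 95.9, 96.1, 96.4, 96.5, 96.9, 96.9,
       97.3, 97.6, 97.6, 97.6, 97.8, 98.3, 98.3, 98.4, 98.5, 98.5, 98.6, 98.6]

q1 = quantile_r7(arr, 0.25)
q1_stdlib = statistics.quantiles(arr, n=4, method="inclusive")[0]
print(q1, q1_stdlib)
```

For 24 values the fractional index is 23 * 0.25 = 5.75, so the result sits three quarters of the way from arr[5] to arr[6], roughly 95.775 rather than the 95.65 midpoint.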

Sorting the “Coefficients” table of a StepAIC

ぃ、小莉子 posted on 2021-01-28 07:24:09
Question: Is there any way to sort the output of the "Coefficients" part of a StepAIC (by estimate or t-value)?
    > summary(sta9);
    Call:
    lm(formula = rating ~ `b/d` + `v/w` + `korte klank` + `lange klank` + `-eer, -oor, -eur` + `-aai, -ooi, -oei` + `-eeuw, -ieuw, -uw` + `-ng` + `-nk` + schwa + `eind d` + `eind b` + open + gesloten + `-ee` + `mv klinker 's` + tussenletter + hoofdletter + trema + regelverkleinwoorden + `-ch, -cht` + `ei/ij` + `ou, au, ouw, auw` + `-sch, -schr` + `i/ie` + `be-, ge-, ver-` +

Problems with using plotCalibration() from the PredictABEL package in R

烈酒焚心 posted on 2021-01-28 06:33:15
Question: I’ve been having some trouble with the plotCalibration() function. I have managed to get it to work before, but recently, whilst working with another dataset (here is a link to the .Rda data file), I have been unable to shake off an error message which keeps cropping up:
    > plotCalibration(data = data, cOutcome = 2, predRisk = data$sortmort)
    Error in plotCalibration(data = data, cOutcome = 2, predRisk = data$sortmort) : The specified outcome is not a binary variable.
When I’ve tried to set the

How can I reformat a table in R?

浪子不回头ぞ posted on 2021-01-28 02:30:50
Question: I loaded a table like this:
    V1   V2 V3
    pat1 1  2
    pat1 3  1
    pat1 4  2
    pat2 3  3
    pat3 1  4
    pat3 2  3
and I need to format it into something like the following, with V1 indicating the row, V2 indicating the column, and the values in V3:
         1 2 3 4
    pat1 2 0 1 2
    pat2 0 0 3 0
    pat3 4 3 0 0
Please note that pat1, pat2 and pat3 have different numbers of observations and that missing values must be filled with 0.
Answer 1: Using dcast from reshape2:
    library(reshape2)
    dcast(dat,V1~V2,fill=0)
        V1 1 2 3 4
    1 pat1 2 0

Scipy poisson distribution with an upper limit

谁说胖子不能爱 posted on 2021-01-28 01:54:36
Question: I am generating random numbers using scipy.stats with the Poisson distribution. Below is an example:
    import scipy.stats as sct
    A = 2.5
    Pos = sct.poisson.rvs(A, size=20)
When I print Pos, I get the following numbers:
    array([1, 3, 2, 3, 1, 2, 1, 2, 2, 3, 6, 0, 0, 4, 0, 1, 1, 3, 1, 5])
You can see from the array that some numbers, such as 6, are generated. What I want to do is to limit the biggest number (let's say 5), i.e. any random number generated using sct.poisson.rvs should be equal
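One way to read the request is as a truncated Poisson: keep the Poisson weights for 0 up to the limit and renormalize them, so values above the limit are never produced. The sketch below does this with the standard library only instead of SciPy; `truncated_poisson` and its parameters are hypothetical names, and clipping or redrawing out-of-range values are alternative interpretations that give different distributions.

```python
import math
import random


def truncated_poisson(mu, upper, size, seed=None):
    """Sample from a Poisson(mu) distribution restricted to 0..upper."""
    rng = random.Random(seed)
    support = list(range(upper + 1))
    # Unnormalized Poisson probabilities for k = 0..upper;
    # random.choices normalizes the weights internally.
    weights = [math.exp(-mu) * mu ** k / math.factorial(k) for k in support]
    return rng.choices(support, weights=weights, k=size)


pos = truncated_poisson(2.5, upper=5, size=20, seed=0)
print(pos)
```

Because the probabilities are renormalized over 0..5, the mean of the truncated draws is slightly lower than the original rate of 2.5.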

Ready implementation of multivariate Spearman rank correlation

十年热恋 posted on 2021-01-28 00:31:19
Question: I'm looking for a way to calculate a multivariate version of the Spearman rank correlation $\rho$. Is there any ready-to-use Python implementation I can use? Answer 1: There is one in scipy. Answer 2: If now or in the future you want access to some advanced statistical packages, also consider calling R libraries from Python when needed via RPy2. You can then compute Spearman using a package such as this. Source: https://stackoverflow.com/questions/2264609/ready-implementation-of-multivariate
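Answer 1 presumably refers to `scipy.stats.spearmanr`, which accepts a 2-D array and returns a matrix of pairwise correlations. To show what that computes, here is a dependency-free sketch under the assumption that "multivariate" means the pairwise rank-correlation matrix: rank each column (averaging ties), then take the Pearson correlation of the ranks. All function names and the example columns are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_matrix(columns):
    """Pairwise Spearman correlations between the given columns."""
    ranked = [average_ranks(c) for c in columns]
    return [[pearson(r1, r2) for r2 in ranked] for r1 in ranked]


cols = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50], [5, 4, 3, 2, 1]]
rho = spearman_matrix(cols)
print(rho)
```

In the example, the first two columns rise together and the third falls, so the matrix shows a correlation of 1 between the first pair and -1 against the third.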

How can I reformat a table in R?

社会主义新天地 posted on 2021-01-27 23:11:21
Question: I loaded a table like this:
    V1   V2 V3
    pat1 1  2
    pat1 3  1
    pat1 4  2
    pat2 3  3
    pat3 1  4
    pat3 2  3
and I need to format it into something like the following, with V1 indicating the row, V2 indicating the column, and the values in V3:
         1 2 3 4
    pat1 2 0 1 2
    pat2 0 0 3 0
    pat3 4 3 0 0
Please note that pat1, pat2 and pat3 have different numbers of observations and that missing values must be filled with 0.
Answer 1: Using dcast from reshape2:
    library(reshape2)
    dcast(dat,V1~V2,fill=0)
        V1 1 2 3 4
    1 pat1 2 0