statistics | 易学教程

Simple Dynamical Model in PyMC3


Question: I'm trying to put together a model of a dynamical system in PyMC3 to infer two parameters. The model is the basic SIR model commonly used in epidemiology:

dS/dt = -r0 * g * S * I
dI/dt = g * I * (r0 * S - 1)

where r0 and g are the parameters to be inferred. So far I haven't been able to get very far at all. The only examples I've seen of putting together a Markov chain like this yield errors about recursion being too deep. Here's my example code.

# Time
t = np.linspace(0, 8, 200)
# Simulated observation
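The excerpt stops before the simulated observation is built. As a rough illustration of the forward model only (not the PyMC3 inference), the SIR equations can be integrated on the same 200-point grid over [0, 8]. This is a minimal plain-Python sketch using a fixed-step fourth-order Runge-Kutta scheme; the function names, the parameter values `r0=4.0` and `g=1.0`, and the initial conditions are assumptions made for illustration and do not come from the question.

```python
def sir_rhs(s, i, r0, g):
    """Right-hand side of the SIR system from the question."""
    ds = -r0 * g * s * i
    di = g * i * (r0 * s - 1.0)
    return ds, di


def simulate_sir(r0, g, s0, i0, t_end=8.0, n_points=200):
    """Integrate the SIR system with classic RK4 on an evenly spaced grid."""
    dt = t_end / (n_points - 1)
    s, i = s0, i0
    times = [0.0]
    trajectory = [(s, i)]
    for k in range(1, n_points):
        k1s, k1i = sir_rhs(s, i, r0, g)
        k2s, k2i = sir_rhs(s + 0.5 * dt * k1s, i + 0.5 * dt * k1i, r0, g)
        k3s, k3i = sir_rhs(s + 0.5 * dt * k2s, i + 0.5 * dt * k2i, r0, g)
        k4s, k4i = sir_rhs(s + dt * k3s, i + dt * k3i, r0, g)
        s += dt * (k1s + 2.0 * k2s + 2.0 * k3s + k4s) / 6.0
        i += dt * (k1i + 2.0 * k2i + 2.0 * k3i + k4i) / 6.0
        times.append(k * dt)
        trajectory.append((s, i))
    return times, trajectory


# Hypothetical parameter values and initial conditions, chosen only for the demo.
times, trajectory = simulate_sir(r0=4.0, g=1.0, s0=0.99, i0=0.01)
```

A simulated observation could then be formed by adding noise to the `I` column of `trajectory`. How that forward model is wired into a sampler is a separate question that the excerpt does not reach.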

Matlab - transform continuous data to discrete data


Question: Are there any techniques for transforming continuous data into discrete data? By continuous data I mean output values generated by various functions, for example the entropy value computed for different sets of data points. If so, are there implementations available in Matlab or on the MathWorks File Exchange?

Answer 1: A more precise answer is that you need to bin your data. This can be done with arbitrary splits or splits based on quantiles of the data itself. The base
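The two binning strategies mentioned in the answer (arbitrary splits and quantile-based splits) can be sketched as follows. The sketch is in Python rather than Matlab, and the sample values, the fixed edges, and the helper name `discretize` are assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import quantiles


def discretize(values, edges):
    """Map each value to the index of the bin it falls in.

    `edges` must be sorted; with k edges the bin indices run from 0 to k.
    """
    return [bisect_right(edges, v) for v in values]


# Made-up continuous values standing in for, e.g., entropy scores.
entropy_values = [0.12, 0.95, 0.33, 0.58, 0.71, 0.05, 0.44, 0.86]

# Strategy 1: arbitrary, hand-picked split points.
fixed_edges = [0.25, 0.5, 0.75]
by_fixed_edges = discretize(entropy_values, fixed_edges)

# Strategy 2: split points taken from the data's own quartiles.
quartile_edges = quantiles(entropy_values, n=4)  # three cut points
by_quantile = discretize(entropy_values, quartile_edges)
```

Fixed edges keep the bins comparable across data sets, while quantile edges adapt to each data set and tend to give bins of similar size.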


Fitting a normal distribution in R


Question: I'm using the following code to fit a normal distribution. The link to the dataset for "b" (too large to post directly) is: link for b

setwd("xxxxxx")
library(fitdistrplus)
require(MASS)
tazur <- read.csv("b", header = TRUE, sep = ",")
claims <- tazur$b
a <- log(claims)
plot(hist(a))

After plotting the histogram, it seems a normal distribution should fit well.

f1n <- fitdistr(claims, "normal")
summary(f1n)
# Length Class Mode
# estimate 2 -none- numeric
# sd 2 -none- numeric
# vcov 4 -none- numeric
# n
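For reference, the maximum-likelihood fit of a normal distribution has a closed form: the sample mean and the standard deviation computed with a denominator of n rather than n - 1. The sketch below shows that calculation in plain Python on log-transformed values. The `claims` numbers are made up, and `fit_normal_mle` is a hypothetical helper, not part of any R package. Note that the histogram in the question is drawn from the log-transformed vector `a`, while the fit is applied to `claims`.

```python
import math
from statistics import fmean


def fit_normal_mle(values):
    """Return the maximum-likelihood (mean, sd) of a normal distribution."""
    mu = fmean(values)
    # The MLE of the standard deviation divides by n, not n - 1.
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return mu, sigma


# Made-up claim amounts, for illustration only.
claims = [120.0, 340.0, 95.0, 410.0, 230.0]
log_claims = [math.log(c) for c in claims]

mu_hat, sigma_hat = fit_normal_mle(log_claims)
```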

Getting “NA” when I run a standard deviation


Question: Quick question. I read my csv file into the variable data. It has a column labeled var, which holds numerical values. When I run the command sd(data$var) I get [1] NA instead of my standard deviation. Could you please help me figure out what I am doing wrong?

Answer 1: Try sd(data$var, na.rm=TRUE), and any NAs in the column var will be ignored. It also pays to check your data to make sure the NAs really should be NAs and were not caused by read-in errors; commands like head(data), tail(data),
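The behaviour described in the answer (a single missing value makes the result missing unless it is explicitly dropped) can be mimicked in a few lines of Python. This is only an analogy to R's `sd(..., na.rm=TRUE)`: the helper `sd`, its `na_rm` flag, and the sample column are assumptions for illustration, with NaN standing in for NA.

```python
import math
from statistics import stdev


def sd(values, na_rm=False):
    """Sample standard deviation that propagates NaN unless na_rm is True."""
    if na_rm:
        values = [v for v in values if not math.isnan(v)]
    elif any(math.isnan(v) for v in values):
        return math.nan
    return stdev(values)  # n - 1 denominator, like R's sd()


# Made-up column with one missing entry.
column = [1.0, 2.0, math.nan, 3.0]

with_missing = sd(column)                   # NaN: the missing value propagates
without_missing = sd(column, na_rm=True)    # computed from 1.0, 2.0, 3.0
```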


Using Scipy's stats.kstest module for goodness-of-fit testing


Question: I've read through existing posts about this module (and the Scipy docs), but it's still not clear to me how to use Scipy's kstest module to do a goodness-of-fit test when you have a data set and a callable function. The PDF I want to test my data against isn't one of the standard scipy.stats distributions, so I can't just call it using something like: kstest(mydata,'norm') where mydata is a Numpy array. Instead, I want to do something like: kstest(mydata,myfunc) where 'myfunc' is the callable
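One point worth making explicit: the Kolmogorov-Smirnov statistic compares the empirical distribution of the data with a cumulative distribution function, so the callable has to be a CDF rather than a PDF. The sketch below computes the one-sample statistic by hand in plain Python to show what is being measured. The exponential `my_cdf` and the sample data are assumptions for illustration, and this is not SciPy's implementation (it returns no p-value).

```python
import math


def ks_statistic(data, cdf):
    """One-sample Kolmogorov-Smirnov statistic against a callable CDF."""
    xs = sorted(data)
    n = len(xs)
    # Largest gap where the empirical CDF sits above the reference CDF.
    d_plus = max((k + 1) / n - cdf(x) for k, x in enumerate(xs))
    # Largest gap where the empirical CDF sits below the reference CDF.
    d_minus = max(cdf(x) - k / n for k, x in enumerate(xs))
    return max(d_plus, d_minus)


def my_cdf(x):
    """Hypothetical custom CDF: exponential distribution with rate 2."""
    return 1.0 - math.exp(-2.0 * x) if x > 0 else 0.0


# Made-up sample, for illustration only.
mydata = [0.05, 0.21, 0.33, 0.48, 0.90, 1.35]
d_stat = ks_statistic(mydata, my_cdf)
```

With SciPy, `kstest` accepts a callable as its second argument in the same role, provided the callable is a CDF and can handle NumPy arrays.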


How to detect significant change / trend in a time series data? [closed]


Question (closed as off-topic): So I have an array of, say, 25 samples, and I want to be able to tell whether the trend over that 25-sample interval is decreasing or increasing (basically, the 25-sample array is my buffer, which is filled at a rate of, say, one sample every 1 ms). Note that it is general trend that I am looking for, not the
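One simple way to summarize the general trend of a fixed-size buffer is the slope of a least-squares line through the samples. The sketch below is one possible approach under that assumption, not an answer taken from the thread. The function names and the `threshold` parameter are hypothetical, and the threshold would need tuning against the noise level of the real signal.

```python
def trend_slope(samples):
    """Least-squares slope of the samples against their index (0, 1, 2, ...)."""
    n = len(samples)
    mean_t = (n - 1) / 2.0
    mean_y = sum(samples) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(samples))
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    return sxy / sxx


def classify_trend(samples, threshold=0.0):
    """Label the buffer by comparing the fitted slope with a threshold."""
    slope = trend_slope(samples)
    if slope > threshold:
        return "increasing"
    if slope < -threshold:
        return "decreasing"
    return "flat"


# Made-up 25-sample buffers, for illustration only.
rising = [2.0 * t + 1.0 for t in range(25)]
falling = [50.0 - 0.5 * t for t in range(25)]
constant = [3.0] * 25
```

Because the slope uses every sample in the buffer, a single outlier moves it far less than a comparison of the first and last samples would.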

Different model performance evaluations by statsmodels and scikit-learn


Question: I am trying to fit a multivariable linear regression on a dataset to find out how well the model explains the data. My predictors have 120 dimensions and I have 177 samples: X.shape=(177,120), y.shape=(177,). Using statsmodels, I get a very good R-squared of 0.76 with a Prob(F-statistic) of 0.06, which trends towards significance and indicates a good model for the data. When I use scikit-learn's linear regression and try to compute a 5-fold cross-validation r2 score, I get an average r2 score of
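The excerpt is cut off before the cross-validated score, but the two numbers being compared measure different things: an R-squared reported on the fitting data is an in-sample figure, while a cross-validated r2 score is computed on held-out data. The toy sketch below uses a single made-up predictor and plain Python (not statsmodels or scikit-learn) to show how far the two can diverge. All data and helper names are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up training and held-out data.
x_train, y_train = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 2.0, 4.0]
x_test, y_test = [4.0, 5.0], [3.0, 3.5]

intercept, slope = fit_line(x_train, y_train)
r2_in_sample = r2_score(y_train, [intercept + slope * x for x in x_train])
r2_held_out = r2_score(y_test, [intercept + slope * x for x in x_test])
```

In-sample R-squared cannot drop below zero for a least-squares fit with an intercept, whereas the held-out score can be strongly negative when the model generalizes poorly. That risk grows when the number of predictors approaches the number of samples, as with 120 predictors and 177 samples.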