statistics

Difference in GLM results between iPython and R

∥☆過路亽.° submitted on 2019-12-21 18:08:10
Question: I'm trying to get to grips with performing regression analyses in R. Below is some random dummy data that I generated in R and used to fit a logistic GLM in R. I saved the data into a test file, read that into Python with IPython (the IPython notebook is awesome, by the way; I only just started using it!), and then tried to run the same analysis in Python. The results are very similar, but they are different. I would have expected them to be the same. Have I done something wrong, is there a

Difference in GLM results between iPython and R

强颜欢笑 submitted on 2019-12-21 18:07:46
Question: I'm trying to get to grips with performing regression analyses in R. Below is some random dummy data that I generated in R and used to fit a logistic GLM in R. I saved the data into a test file, read that into Python with IPython (the IPython notebook is awesome, by the way; I only just started using it!), and then tried to run the same analysis in Python. The results are very similar, but they are different. I would have expected them to be the same. Have I done something wrong, is there a

R: How to remove outliers from a smoother in ggplot2?

折月煮酒 submitted on 2019-12-21 14:11:12
Question: I have the following data set that I am trying to plot with ggplot2. It is a time series of three experiments, A1, B1 and C1, and each experiment had three replicates. I am trying to add a stat which detects and removes outliers before returning a smoother (mean and variance?). I have written my own outlier function (not shown), but I expect there is already a function to do this; I just have not found it. I've looked at stat_sum_df("median_hilow", geom = "smooth") from some examples in the

How do you visualize logfiles in realtime?

爱⌒轻易说出口 submitted on 2019-12-21 11:55:38
Question: Sometimes it might be useful, but mostly it just looks cool or impressive, to visualize log files (anything from HTTP requests to bandwidth usage to cups of coffee drunk per day). I know about Visitorville, which I think looks a bit silly, and then there's gltail. How do you "visualize" your log files in real time?
Answer 1: You may take a look at Apache Chainsaw. This nifty tool accepts logs from nearly everywhere and has live filtering and coloring. If you have an already written log, I'm

Python package that supports weighted covariance computation

家住魔仙堡 submitted on 2019-12-21 09:31:03
Question: Is there a Python statistical package that supports the computation of weighted covariance (i.e., each observation has a weight)? Unfortunately, numpy.cov does not support weights. Preferably it should work within the numpy/scipy framework (i.e., be able to use numpy arrays to speed up the computation). Thanks a lot!
Answer 1: statsmodels has a weighted covariance calculation in stats. But we can also calculate it directly:
    # -*- coding: utf-8 -*-
    """descriptive statistic with case weights
    Author: Josef
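
The direct calculation mentioned in the answer can be sketched as follows. This is a minimal illustration, not the truncated statsmodels code: the function name `weighted_cov` and the choice to normalise by the sum of the weights (no small-sample correction) are assumptions made here. It uses plain Python lists so it runs without extra packages; the same arithmetic carries over to numpy arrays. Note also that newer NumPy releases accept `fweights` and `aweights` arguments in `numpy.cov`, which may make a hand-written version unnecessary.

```python
def weighted_cov(x, y, w):
    """Weighted covariance of x and y with non-negative case weights w.

    Assumption: normalises by the sum of the weights (population-style),
    with no degrees-of-freedom correction.
    """
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Weighted means of each variable.
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total

    # Weighted average of the products of deviations from those means.
    cross = sum(wi * (xi - mean_x) * (yi - mean_y)
                for wi, xi, yi in zip(w, x, y))
    return cross / total


if __name__ == "__main__":
    # A zero weight drops the middle observation entirely.
    print(weighted_cov([1, 2, 3], [2, 4, 6], [1, 0, 1]))
```

With equal weights this reduces to the ordinary covariance divided by n rather than n - 1, so results will differ slightly from `numpy.cov` defaults.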

How to get a weighted average for reviews in Excel?

假如想象 submitted on 2019-12-21 06:28:22
Question: So here's my challenge. I have a spreadsheet that looks like this:

prod_id | pack | value | durable | feat | ease | grade | # of ratings
1       | 75   | 85    | 99      | 90   | 90   | 88    | 1
2       | 90   | 95    | 81      | 86   | 87   | 88    | 9
3       | 87   | 86    | 80      | 85   | 82   | 84    | 37
4       | 92   | 80    | 68      | 67   | 45   | 70    | 5
5       | 93   | 81    | 94      | 93   | 90   | 90    | 4
6       | 93   | 70    | 60      | 60   | 70   | 70    | 1

Each product has individual grade criteria (packaging through ease of use), an overall average grade, and the number of ratings the product received. In the entire data set I have, 68% of the products fall within the 80-89 grade range. I
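
One common way to let the number of ratings influence a grade is a shrinkage ("Bayesian") average, which pulls grades backed by few ratings toward the overall mean. The sketch below is only an illustration of that idea, written in Python rather than Excel; the question is cut off, so this may not be what the asker ultimately wanted. The function name `shrunk_grade` and the `prior_weight` of 5 are assumptions chosen for the example, and the data are the six rows from the table above.

```python
def shrunk_grade(grade, n_ratings, prior_mean, prior_weight):
    """Pull a product's grade toward prior_mean; more ratings means less pull.

    prior_weight acts like a number of "phantom" ratings at prior_mean
    (a hypothetical tuning parameter, not taken from the question).
    """
    return (prior_weight * prior_mean + n_ratings * grade) / (prior_weight + n_ratings)


# (grade, number of ratings) per prod_id, copied from the table.
products = {1: (88, 1), 2: (88, 9), 3: (84, 37), 4: (70, 5), 5: (90, 4), 6: (70, 1)}

# Use the ratings-weighted mean grade as the prior.
total_ratings = sum(n for _, n in products.values())
overall_mean = sum(g * n for g, n in products.values()) / total_ratings

adjusted = {
    pid: shrunk_grade(g, n, overall_mean, prior_weight=5)
    for pid, (g, n) in products.items()
}

if __name__ == "__main__":
    for pid, value in sorted(adjusted.items()):
        print(pid, round(value, 2))
```

Products 1 and 2 share the same raw grade, but product 2 keeps more of it after adjustment because nine ratings outweigh the prior more than one rating does. In a spreadsheet the same formula would be a single cell expression over the grade and ratings columns.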

Manager game: How to calculate market values?

你。 submitted on 2019-12-21 06:17:20
Question: Usually players in a soccer manager game have market values. The managers sell their players in accordance with these market values. They think: "Oh, the player is worth 3,000,000, so I'll try to sell him for 3,500,000". All players have four basic qualities: strength value (1-99), maximal strength they can ever attain (1-99), motivation (1-5), and current age (16-40). Based on these values, I calculate the market values at the moment. But I would like to calculate the market values dynamically

How to calculate (statistical) power function vs. sample size in python?

不羁岁月 submitted on 2019-12-21 06:12:49
Question: How can this be done in Python? Calculate the sample size for a given power and alpha? Calculate the power for a given sample size and alpha? Note: I am totally confused :( by the functions that Python offers for (statistical) power calculation. Can someone help me sort them out? There are two functions under statsmodels:
    from statsmodels.stats.power import ttest_power, tt_ind_solve_power
We have:
    tt_ind_solve_power(effect_size=effect_size, alpha=alpha, power=0.8, ratio=1,
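
To make the relationship between power, sample size, and alpha concrete, here is a rough sketch based on the textbook normal approximation for a two-sided, two-sample comparison with equal group sizes. It is not the statsmodels implementation named in the question; the function names `approx_power` and `approx_n_per_group` are made up for this example, and t-based calculations typically give slightly larger sample sizes.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power for a given sample size and alpha (normal approximation)."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative.
    shift = abs(effect_size) * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def approx_n_per_group(effect_size, power=0.8, alpha=0.05):
    """Approximate per-group sample size for a given power and alpha."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    n = approx_n_per_group(effect_size=0.5, power=0.8, alpha=0.05)
    print(n, approx_power(0.5, n))
```

The two functions are inverses of each other in the sense the question asks about: fix any two of effect size, alpha, and power or sample size, and solve for the remaining one.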

PyTorch - parameters not changing

橙三吉。 submitted on 2019-12-21 05:56:44
Question: In an effort to learn how PyTorch works, I am trying to do maximum likelihood estimation of some of the parameters in a multivariate normal distribution. However, it does not seem to work for any of the covariance-related parameters. So my question is: why does this code not work?
    import torch

    def make_covariance_matrix(sigma, rho):
        return torch.tensor([[sigma[0]**2, rho * torch.prod(sigma)],
                             [rho * torch.prod(sigma), sigma[1]**2]])

    mu_true = torch.randn(2)
    rho_true = torch.rand(1)
    sigma_true

Chi-Squared Probability Function in C++

半世苍凉 submitted on 2019-12-21 05:18:12
Question: The following code of mine computes the confidence interval using the chi-squared 'quantile' and probability functions from Boost. I am trying to implement this function myself so as to avoid a dependency on Boost. Is there any resource where I can find such an implementation?
    #include <boost/math/distributions/chi_squared.hpp>
    #include <boost/cstdint.hpp>
    using namespace std;
    using boost::math::chi_squared;
    using boost::math::quantile;

    vector <double> ConfidenceInterval(double x) {
        vector <double> ConfInts;
        //