poisson

XGBoost - Poisson distribution with varying exposure / offset

我们两清 posted on 2019-12-18 16:45:31
Question: I am trying to use XGBoost to model claims frequency for data generated from exposure periods of unequal length, but have been unable to get the model to treat the exposure correctly. I would normally do this by setting log(exposure) as an offset - is this possible in XGBoost? (A similar question was posted here: xgboost, offset exposure?) To illustrate the issue, the R code below generates some data with the fields: x1, x2 - factors (either 0 or 1); exposure - length of policy period on
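For reference, the role of log(exposure) as an offset can be written out as below. The notation is an assumption made for illustration and does not come from the question: y_i is the claim count, e_i the exposure, x_i the feature vector, and β the coefficients of a log-linear model.

```latex
\begin{aligned}
y_i &\sim \mathrm{Poisson}(\mu_i), \\
\mu_i &= e_i \, \exp\!\left(x_i^\top \beta\right), \\
\log \mu_i &= \log e_i + x_i^\top \beta .
\end{aligned}
```

The term log e_i enters the linear predictor with a fixed coefficient of 1, which is what distinguishes an offset from an ordinary feature. For a boosted model the same idea would apply with the tree ensemble's raw score in place of x_i^T β; whether a given XGBoost setting achieves this should be checked against the XGBoost documentation.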

Algorithm to generate Poisson and binomial random numbers?

*爱你&永不变心* posted on 2019-12-17 08:53:22
Question: I've been looking around, but I'm not sure how to do it. I've found this page which, in the last paragraph, says: A simple generator for random numbers taken from a Poisson distribution is obtained using this simple recipe: if x_1, x_2, ... is a sequence of random numbers with uniform distribution between zero and one, k is the first integer for which the product x_1 · x_2 · ... · x_(k+1) < e^(-λ). I've found another page describing how to generate binomial numbers, but I think it is using an
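The recipe quoted in the question (multiply uniform random numbers until the product drops below e^(-λ)) can be sketched in Python as follows. This is a minimal illustration, not code from the question or its answers; the function name `poisson_sample` is hypothetical, and the approach is only practical for small λ, because the number of uniforms consumed grows with λ and e^(-λ) underflows to zero for very large λ.

```python
import math
import random


def poisson_sample(lam, rng=random):
    """Draw one Poisson(lam) variate by multiplying uniforms (hypothetical helper)."""
    threshold = math.exp(-lam)   # e^(-lambda)
    k = 0
    product = rng.random()       # x_1
    # Multiply in further uniforms until the running product falls below
    # e^(-lambda); k counts how many extra factors were needed.
    while product >= threshold:
        k += 1
        product *= rng.random()  # x_(k+1)
    return k


if __name__ == "__main__":
    random.seed(0)
    draws = [poisson_sample(3.0) for _ in range(10000)]
    # The sample mean should land close to lambda = 3.0.
    print(sum(draws) / len(draws))
```

The returned k is the first integer for which the product of k+1 uniforms is below the threshold, matching the wording of the quoted recipe.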

Checking interpretation of GLM summary in R

假如想象 posted on 2019-12-13 03:55:36
Question: Just want to check that what I'm doing is all correct! I have bird counts in several sites categorised into two habitats - farmland and wetland. I simply want to see which habitat has higher counts. I'm using a GLM with a Poisson function (as they are count data): > mod <- glm(count ~ habitat, family = "poisson") > summary(mod) Call: glm(formula = count ~ habitat, family = poisson) Deviance Residuals: Min 1Q Median 3Q Max -0.5868 -0.4603 -0.2496 -0.2141 2.8464 Coefficients: Estimate Std.

fftw3 for poisson with dirichlet boundary condition for all side of computational domain

廉价感情. posted on 2019-12-13 01:14:14
Question: I am trying to solve the Poisson equation with Dirichlet boundary conditions on all four sides of the computational domain. As I understand it, I should use FFTW_RODFT00 to satisfy this condition. However, the result is not correct. Could you please help me? #include <stdio.h> #include <math.h> #include <cmath> #include <fftw3.h> #include <iostream> #include <vector> using namespace std; int main() { int N1=100; int N2=100; double pi = 3.141592653589793; double L1 = 2.0; double dx = L1/(double)(N1-1); double L2=

Statsmodels Poisson glm different than R

北战南征 posted on 2019-12-12 15:07:02
Question: I am trying to fit some models (spatial interaction models) following some code which is provided in R. I have been able to get some of the code to work using statsmodels in Python, but some of the results do not match at all. I believe that the code I have for R and Python should give identical results. Does anyone see any differences? Or are there some fundamental differences which might be throwing things off? The R code is the original code which matches the numbers given in a

how to generate integer inter arrival times using random.expovariate() in python

眉间皱痕 posted on 2019-12-12 12:27:09
Question: In the Python random module, the expovariate() function generates floating point numbers which can be used to model inter-arrival times of a Poisson process. How do I make use of this to generate integer times between arrivals instead of floating point numbers? Answer 1: jonrsharpe already kind of mentioned it: you can just let the function generate floating point numbers, and convert the output to integers yourself using int(). This >>> import random >>> [random.expovariate(0.2) for i in range(10)] [7
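The int() conversion suggested in the answer can be sketched as below, next to a round-up variant. Both helper names are hypothetical and not from the answer. Truncating with int() can produce gaps of 0 (two arrivals in the same time step), so rounding up is shown as one possible alternative, on the assumption that every gap should be at least one time unit.

```python
import math
import random


def truncated_gaps(rate, n, rng=random):
    """Integer gaps via int(), as the answer suggests; zeros are possible."""
    return [int(rng.expovariate(rate)) for _ in range(n)]


def rounded_up_gaps(rate, n, rng=random):
    """Integer gaps rounded up, so every gap is at least 1 (assumed requirement)."""
    # max(1, ...) guards against the rare draw of exactly 0.0.
    return [max(1, math.ceil(rng.expovariate(rate))) for _ in range(n)]


if __name__ == "__main__":
    random.seed(0)
    print(truncated_gaps(0.2, 10))
    print(rounded_up_gaps(0.2, 10))
```

Either conversion changes the distribution of the gaps, so the resulting arrival process is only a discrete-time approximation of the original Poisson process.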

Trying to fit Poisson Distribution in R using fitdistr to Erdos.Reyni random graph constructed in Igraph

寵の児 posted on 2019-12-12 12:16:37
Question: Using igraph in R, I was trying to confirm that the Erdős-Rényi method of constructing random networks does indeed produce networks whose degree distributions are well fit by a Poisson distribution. So, I ran the R code: library(igraph) library(MASS) #for fitdistr function samples<-5000 # first n is the number of nodes, the n/samples is the probability of making # an edge between any two nodes g <- erdos.renyi.game(samples, 20/ samples) d <- degree(g) #finds the degree of each

Interpreting the output of glm for Poisson regression [closed]

风流意气都作罢 posted on 2019-12-12 10:14:21
Question: Consider the following: foo = 1:10 bar = 2 * foo glm(bar ~ foo, family=poisson) I get the results Coefficients: (Intercept) foo 1.1878 0.1929 Degrees of Freedom: 9 Total (i.e. Null); 8 Residual Null Deviance: 33.29 Residual Deviance: 2.399 AIC: 47.06 From the explanation on this page, it seems like the coefficient
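As a hedged aside on reading these numbers: R's poisson family uses a log link by default, so the coefficients act on the log of the mean rather than on the mean itself. The sketch below plugs the two reported coefficients into that relationship; the variable and function names are illustrative only and the code is not part of the question.

```python
import math

# Coefficients as reported in the question's glm output.
intercept, slope = 1.1878, 0.1929


def fitted_mean(foo):
    """Fitted mean under a log link: exp(intercept + slope * foo)."""
    return math.exp(intercept + slope * foo)


# A one-unit increase in foo multiplies the fitted mean by exp(slope).
rate_ratio = math.exp(slope)

if __name__ == "__main__":
    print("multiplicative effect per unit of foo:", rate_ratio)
    for foo in range(1, 11):
        # Compare the observed bar = 2 * foo with the fitted mean.
        print(foo, 2 * foo, fitted_mean(foo))
```

Because the model is multiplicative in foo while bar = 2 * foo is linear, the fitted means cannot reproduce the data exactly, which is consistent with the nonzero residual deviance in the output.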

Calculate poisson probability percentage

给你一囗甜甜゛ posted on 2019-12-12 07:30:42
Question: When you use the POISSON function in Excel (or in OpenOffice Calc), it takes two arguments, an integer and an 'average' number, and returns a float. In Python (I tried RandomArray and NumPy) it returns an array of random Poisson numbers. What I really want is the probability that this event will occur (it is a constant number, whereas the array has different numbers every time - so is it an average?). For example: print poisson(2.6,6) returns [1 3 3 0 1 3] (and every time I run it, it's different). The
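The constant number the asker is after is a Poisson probability rather than a random draw. A minimal sketch using only the standard library is given below; the function names `poisson_pmf` and `poisson_cdf` are hypothetical, and which of the two corresponds to the spreadsheet result depends on whether the cumulative variant is wanted, which the excerpt does not say.

```python
import math


def poisson_pmf(k, mean):
    """P(X = k) for X ~ Poisson(mean): exp(-mean) * mean**k / k!"""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def poisson_cdf(k, mean):
    """P(X <= k), obtained by summing the probability mass from 0 to k."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))


if __name__ == "__main__":
    # Probability of exactly 6 events, and of at most 6 events, with mean 2.6.
    print(poisson_pmf(6, 2.6))
    print(poisson_cdf(6, 2.6))
```

Unlike the random sampler in the question, these functions return the same value on every call for the same arguments.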

calculating confidence interval of coefficient in poisson regression

不羁的心 posted on 2019-12-11 18:13:08
Question: The Poisson regression looks as follows in my R code: poissmod <- glm(aerobics$y ~ factor(aerobics$x1) + factor(aerobics$x2) + aerobics$x3 + aerobics$x4, family = poisson) poissmod Now I have to compute a confidence interval for the factor aerobics$x1 (in a model without aerobics$x1 since this is not significant). This might look very easy, but I am not familiar with R and I can't find the answer anywhere... Can anyone help me? Thanks a lot in advance! Answer 1: See e.g. the confint function