Getting percentile values from gamlss centile curves

会有一股神秘感。 提交于 2019-12-07 18:51:32

问题


This question is related to: Selecting Percentile curves using gamlss::lms in R

I can get centile curve from following data and code:

age = sample(5:15, 500, replace=T) 
yvar = rnorm(500, age, 20)
mydata = data.frame(age, yvar)
head(mydata)
  age      yvar
1  12  13.12974
2  14 -18.97290
3  10  42.11045
4  12  27.89088
5  11  48.03861
6   5  24.68591

h = lms(yvar, age , data=mydata, n.cyc=30)
centiles(h,xvar=mydata$age, cent=c(90), points=FALSE)

How can I now get yvar on the curve for each of x value (5:15) which would represent 90th percentiles for data after smoothening?

I tried to read help pages and found fitted(h) and fv(h) to get fitted values for entire data. But how to get values for each age level at 90th centile curve level? Thanks for your help.

Edit: Following figure show what I need:

I tried following but it is correct since value are incorrect:

mydata$fitted = fitted(h)
aggregate(fitted~age, mydata, function(x) quantile(x,.9))
   age    fitted
1    5  6.459680
2    6  6.280579
3    7  6.290599
4    8  6.556999
5    9  7.048602
6   10  7.817276
7   11  8.931219
8   12 10.388048
9   13 12.138104
10  14 14.106250
11  15 16.125688

The values are very different from 90th quantile directly from data:

> aggregate(yvar~age, mydata, function(x) quantile(x,.9))
   age     yvar
1    5 39.22938
2    6 35.69294
3    7 25.40390
4    8 26.20388
5    9 29.07670
6   10 32.43151
7   11 24.96861
8   12 37.98292
9   13 28.28686
10  14 43.33678
11  15 44.46269

回答1:


See if this makes sense. The 90th percentile of a normal distribution with mean and sd of 'smn' and 'ssd' is qnorm(.9, smn, ssd): So this seems to deliver (somewhat) sensible results, albeit not the full hack of centiles that I suggested:

 plot(h$xvar, qnorm(.9, fitted(h), h$sigma.fv))

(Note the massive overplotting from only a few distinct xvars but 500 points. Ande you may want to set the ylim so that the full range can be appreciated.)

The caveat here is that you need to check the other parts of the model to see if it is really just an ordinary Normal model. In this case it seems to be:

> h$mu.formula
y ~ pb(x)
<environment: 0x10275cfb8>
> h$sigma.formula
~1
<environment: 0x10275cfb8>
> h$nu.formula
NULL
> h$tau.formula
NULL

So the model is just mean-estimate with a fixed-variance (the ~1) across the range of the xvar, and there are no complications from higher order parameters like a Box-Cox model. (And I'm unable to explain why this is not the same as the plotted centiles. For that you probably need to correspond with the package authors.)



来源:https://stackoverflow.com/questions/27507448/getting-percentile-values-from-gamlss-centile-curves

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!