Gompertz Aging analysis in R

僤鯓⒐⒋嵵緔 提交于 2019-11-30 15:49:07

问题


I have survival data from an experiment in flies which examines rates of aging in various genotypes. The data is available to me in several layouts so the choice of which is up to you, whichever suits the answer best.

One dataframe (wide.df) looks like this, where each genotype (Exp, of which there is ~640) has a row, and the days run in sequence horizontally from day 4 to day 98 with counts of new deaths every two days.

Exp      Day4   Day6    Day8    Day10   Day12   Day14    ...
A        0      0       0       2       3       1        ...

I make the example using this:

wide.df2<-data.frame("A",0,0,0,2,3,1,3,4,5,3,4,7,8,2,10,1,2)
colnames(wide.df2)<-c("Exp","Day4","Day6","Day8","Day10","Day12","Day14","Day16","Day18","Day20","Day22","Day24","Day26","Day28","Day30","Day32","Day34","Day36")

Another version is like this, where each day has a row for each 'Exp' and the number of deaths on that day are recorded.

Exp     Deaths  Day     
A       0       4    
A       0       6
A       0       8
A       2       10
A       3       12
..      ..      ..

To make this example:

df2<-data.frame(c("A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A","A"),c(0,0,0,2,3,1,3,4,5,3,4,7,8,2,10,1,2),c(4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36))
    colnames(df2)<-c("Exp","Deaths","Day")

What I would like to do is perform a Gompertz Analysis (See second paragraph of "the life table" here). The equation is:

μx = α*e β*x

Where μx is probability of death at a given time, α is initial mortality rate, and β is the rate of aging.

I would like to be able to get a dataframe which has α and β estimates for each of my ~640 genotypes for further analysis later.

I need help going from the above dataframes to an output of these values for each of my genotypes in R.

I have looked through the package flexsurv which may house the answer but I have failed in attempts to find and implement it.


回答1:


This should get you started...

Firstly, for the flexsurvreg function to work, you need to specify your input data as a Surv object (from package:survival). This means one row per observation.

The first thing is to re-create the 'raw' data from the summary tables you provide. (I know rbind is not efficient, but you can always switch to data.table for large sets).

### get rows with >1 death
df3 <- df2[df2$Deaths>1, 2:3]
### expand to give one row per death per time
df3 <- sapply(df3, FUN=function(x) rep(df3[, 2], df3[, 1]))
### each death is 1 (occurs once)
df3[, 1] <- 1
### add this to the rows with <=1 death
df3 <- rbind(df3, df2[!df2$Deaths>1, 2:3])
### convert to Surv object
library(survival)
s1 <- with(df3, Surv(Day, Deaths))
### get parameters for Gompertz distribution
library(flexsurv) 
f1 <- flexsurvreg(s1 ~ 1, dist="gompertz")

giving

> f1$res
              est         L95%        U95%
shape 0.165351912 0.1281016481 0.202602176
rate  0.001767956 0.0006902161 0.004528537

Note that this is an intercept-only model as all your genotypes are A. You can loop this over multiple survival objects once you have re-created the per-observation data as above.

From the flexsurv docs:

Gompertz distribution with shape parameter a and rate parameter b has hazard function

H(x: a, b) = b.e^{ax}

So it appears your alpha is b, the rate, and beta is a, the shape.



来源:https://stackoverflow.com/questions/17718566/gompertz-aging-analysis-in-r

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!