Question
I want to perform survival analysis (Kaplan-Meier and Cox PH modelling) on data that are both left and right censored. I'm looking at the time to occurrence of a heart arrhythmia (AF) in the presence versus the absence of a particular gene (Gene 0 or 1). However, some subjects are found to already have the arrhythmia at recruitment and so should be left censored. I've read the survival package documentation but can't work out how to account for the left censoring.
Some made-up example data are below. Subjects 1 and 3 had AF at baseline and so should be left censored. Subject 2 did not experience the event by the end of follow-up and so is right censored. Subjects 4 and 5 both experienced the event (at 8 and 3 months respectively).
Gene <- c(0, 0, 1, 1, 0)
AF_at_baseline <- c(1, 0, 1, 0, 0)
Followup_time <- c(11, 3, 8, 15, 7)
AF_time <- c(NA, NA, NA, 8, 3)
AF_data <- data.frame(Gene, AF_at_baseline, Followup_time, AF_time)
Answer 1:
I had a similar problem and solved it like this:
As stated in the survival help file, you need to specify time and time2.
You can think of left-censored data as running from -infinity until the time you measured, and of right-censored data as running from the time you measured (probably the last follow-up) until +infinity. Infinity is best coded as NA, which is how Surv reads an open-ended bound when type = 'interval2'.
What solved my problem was creating two vectors: a start vector time and a stop vector time2.
For time, set every left-censored value to NA. Right-censored observations are filled in with the time of measurement, just like the events.
For time2 it is the other way around: right-censored values are NA, and left-censored observations and events keep the time of measurement.
I don't really get your data, however. Why would you follow up on subjects who already had the event? That is what your data imply for subjects 4 and 5, where AF_time is 8 and 3 but Followup_time is 15 and 7.
To help, I assume the following:
You have 5 patients with
AF_at_baseline <- c(1, 0, 1, 0, 0)  # 1 indicates left censoring
Follow-up times are event times (or the time of last follow-up for left- and right-censored subjects).
So the start vector, with NA for the left-censored subjects, looks like this:
Followup_time <- c(NA, 3, NA, 15, 7)
And the stop vector, with NA for the right-censored subject:
Followup_time2 <- c(11, NA, 8, 15, 7)  # you indicated that only subject 2 didn't experience the event
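The two vectors above can also be derived from the original columns rather than typed by hand. A minimal sketch in base R, under the same assumption that Followup_time holds the event time for subjects who had the event; the names left_censored, right_censored, lower and upper are illustrative and not from the question:

```r
# Example data from the question
AF_at_baseline <- c(1, 0, 1, 0, 0)
Followup_time  <- c(11, 3, 8, 15, 7)
AF_time        <- c(NA, NA, NA, 8, 3)

# Assumption: AF at baseline means left censored; no AF at baseline and
# no recorded AF_time means right censored; everyone else had the event.
left_censored  <- AF_at_baseline == 1
right_censored <- !left_censored & is.na(AF_time)

# interval2 coding: NA marks the open end of the interval
lower <- ifelse(left_censored,  NA, Followup_time)
upper <- ifelse(right_censored, NA, Followup_time)

print(data.frame(lower, upper))
```

lower and upper can then be passed to Surv in place of Followup_time and Followup_time2.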
Now you can call Surv:
library(survival)
Surv.Obj <- Surv(Followup_time, Followup_time2, type = 'interval2')
Surv.Obj
[1] 11- 3+ 8- 15 7 # with '-' indicating left censoring and '+' right censoring
Then you can call survfit and plot the Kaplan-Meier curve:
km <- survfit(Surv.Obj ~ 1, conf.type = "none")
km
Call: survfit(formula = Surv.Obj ~ 1, conf.type = "none")
n events median 0.95LCL 0.95UCL
5      4      7       7      NA
summary(km)
Call: survfit(formula = Surv.Obj ~ 1, conf.type = "none")
 time n.risk  n.event survival std.err lower 95% CI upper 95% CI
  7.0      4 3.00e+00     0.25   0.217       0.0458            1
  7.5      1 4.44e-16     0.25   0.217       0.0458            1
 15.0      1 1.00e+00     0.00     NaN           NA           NA
plot(km, conf.int = FALSE, mark.time = TRUE)
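The question is about the two gene groups rather than the pooled sample. As a sketch only, the same interval2 response can be put on the left of a formula with Gene on the right, which should give one curve per group. The name km_by_gene is illustrative, and with five made-up subjects the estimates themselves mean nothing:

```r
library(survival)

Gene           <- c(0, 0, 1, 1, 0)
Followup_time  <- c(NA, 3, NA, 15, 7)   # NA = left censored
Followup_time2 <- c(11, NA, 8, 15, 7)   # NA = right censored

# One curve per level of Gene, using the same interval2 coding as above
km_by_gene <- survfit(
  Surv(Followup_time, Followup_time2, type = "interval2") ~ Gene,
  conf.type = "none"
)
print(km_by_gene)

plot(km_by_gene, lty = 1:2, mark.time = TRUE)
legend("topright", legend = c("Gene 0", "Gene 1"), lty = 1:2)
```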
So far, I haven't found out how to fit a Cox PH model to interval-censored data. See my question here.
Answer 2:
If you have both left-censored and right-censored data, you can consider this to be a special case of interval censoring, which is the case when you know the event time only up to an interval. With left censoring this interval is (-Inf, t); with right censoring it is (t, Inf).
As such, you can use my R package icenReg to model your data. For the Cox-PH model, this can be fit as
library(icenReg)
fit <- ic_sp(cbind(left, right) ~ covars,
             data = myData, model = 'ph',
             bs_samples = 500)
where left and right are the left and right sides of the interval in which the event occurred for an individual. If an event is uncensored, just set left equal to right for that subject.
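For the example data in the question, the left and right columns could be built as below. This is a sketch under two assumptions: Followup_time is the event time for subjects who had the event (as in Answer 1), and a left-censored subject gets a lower bound of 0 rather than -Inf, since an event time cannot be negative. The names ic_data, left_censored and right_censored are illustrative. The model call is only attempted if icenReg is installed, and with five made-up rows the fit is not meaningful:

```r
# Example data from the question
Gene           <- c(0, 0, 1, 1, 0)
AF_at_baseline <- c(1, 0, 1, 0, 0)
Followup_time  <- c(11, 3, 8, 15, 7)
AF_time        <- c(NA, NA, NA, 8, 3)

left_censored  <- AF_at_baseline == 1
right_censored <- !left_censored & is.na(AF_time)

# Interval bounds: (0, t) left censored, (t, Inf) right censored,
# left == right for an observed event
ic_data <- data.frame(
  Gene  = Gene,
  left  = ifelse(left_censored, 0, Followup_time),
  right = ifelse(right_censored, Inf, Followup_time)
)
print(ic_data)

# Attempt the fit only when icenReg is available; try() keeps the script
# running if the tiny sample makes the fit fail
if (requireNamespace("icenReg", quietly = TRUE)) {
  fit <- try(icenReg::ic_sp(cbind(left, right) ~ Gene,
                            data = ic_data, model = "ph"))
  print(fit)
}
```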
Source: https://stackoverflow.com/questions/41968606/left-censoring-for-survival-data-in-r