Question
I am trying to follow a tutorial in R (https://rviews.rstudio.com/2017/09/25/survival-analysis-with-r/). The computer I use for work has no USB port or internet connection; it only has R with a few libraries installed: survival, ranger, ggplot2 and dplyr. It does not have ggfortify. I am trying to figure out how to plot the graphs from the tutorial without ggfortify. Here is the code I am using:
#load libraries
library(survival)
library(ranger)
library(ggplot2)
library(dplyr)
#load data
data(veteran)
head(veteran)
#Part 1 : works
# Kaplan Meier Survival Curve
km <- with(veteran, Surv(time, status))
km_fit <- survfit(Surv(time, status) ~ 1, data=veteran)
#plot(km_fit, xlab="Days", main = 'Kaplan Meier Plot') #base graphics is always ready
tibble(time = km_fit$time, surv = km_fit$surv,
       min = km_fit$lower, max = km_fit$upper) %>%
  ggplot(aes(x = time)) +
  geom_line(aes(y = surv)) +
  geom_ribbon(aes(ymin = min, ymax = max), alpha = 0.3)
However, I can't get this to work:
#Part 2: does not work
km_trt_fit <- survfit(Surv(time, status) ~ trt, data=veteran)
tibble(time = km_trt_fit$time, surv = km_trt_fit$surv,
       min = km_trt_fit$lower, max = km_trt_fit$upper) %>%
  ggplot(aes(x = time, group = factor(veteran$trt), colour = factor(veteran$trt), fill = factor(veteran$trt))) +
  geom_line(aes(y = surv)) +
  geom_ribbon(aes(ymin = min, ymax = max), alpha = 0.3)
Error: Aesthetics must be either length 1 or the same as the data (114): group, colour and fill
Nor can I get this to work:
#Part 3: does not work
vet <- mutate(veteran, AG = ifelse((age < 60), "LT60", "OV60"),
              AG = factor(AG),
              trt = factor(trt,labels=c("standard","test")),
              prior = factor(prior,labels=c("N0","Yes")))
aa_fit <- aareg(Surv(time, status) ~ trt + celltype +
                  karno + diagtime + age + prior ,
                data = vet)
tibble(time = aa_fit$time, surv = aa_fit$surv,
       min = aa_fit$lower, max = aa_fit$upper) %>%
  ggplot(aes(x = time)) +
  geom_line(aes(y = surv)) +
  geom_ribbon(aes(ymin = min, ymax = max), alpha = 0.3)
Error: geom_line requires the following missing aesthetics: y
Can someone please help me correct these?
Thanks. (Previous post: R: plotting graphs (ggplot vs autoplot))
Answer 1:
You are going to have to do some detective work!
I have time for part #2 today. It turns out that the information about the strata is contained in the element km_trt_fit$strata. It looks like this:
km_trt_fit <- survfit(Surv(time, status) ~ trt, data=veteran)
km_trt_fit$strata
#> trt=1 trt=2
#> 61 53
This tells you that the fitted curve has 61 rows for trt=1 and 53 rows for trt=2. These don't add up to 137 (the number of rows in veteran) because survfit() returns one row per distinct time point within each stratum, so patients who share a time collapse into a single row. That is also the reason for the error: the fitted object has a different number of rows (114) than the original data frame, which you are trying to bring in through veteran$trt.
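If you want to check the row counts yourself, here is a minimal sketch that needs only the survival package. The variable names n_patients, n_curve_points and distinct_times are mine, chosen for illustration. It compares the number of rows in veteran with the length of the fitted curve, and prints the number of distinct times per treatment arm next to the strata counts so you can compare them by eye.

```r
library(survival)

km_trt_fit <- survfit(Surv(time, status) ~ trt, data = veteran)

# Rows in the original data: one per patient
n_patients <- nrow(veteran)

# Rows in the fitted curve: all strata stacked into one vector
n_curve_points <- length(km_trt_fit$time)

# The strata counts partition the curve points, not the patients
stopifnot(sum(km_trt_fit$strata) == n_curve_points)

# Distinct times per treatment arm in the raw data; these should line up
# with the strata counts if tied times are what shrinks the row count
distinct_times <- tapply(veteran$time, veteran$trt,
                         function(t) length(unique(t)))
print(distinct_times)
print(km_trt_fit$strata)
print(c(patients = n_patients, curve_points = n_curve_points))
```

Any column you add to the plotting data therefore has to have n_curve_points elements, not n_patients.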
My solution: create a vector strata that repeats trt=1 61 times and trt=2 53 times:
strata = km_trt_fit$strata
strata = rep(names(strata), times = strata)
Include that in your input data:
tibble(time = km_trt_fit$time,
       surv = km_trt_fit$surv,
       min = km_trt_fit$lower,
       max = km_trt_fit$upper,
       trt = factor(strata)) %>%
  ggplot(aes(x = time, colour = trt, fill = trt)) +
  geom_line(aes(y = surv)) +
  geom_ribbon(aes(ymin = min, ymax = max), alpha = 0.3)
The result is pretty close to what the tutorial has.
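If you need this for more than one model, the same extraction can be wrapped in a small function. This is only a sketch: survfit_to_df is a hypothetical helper name of mine, not part of survival or ggplot2. It also handles the unstratified fit from Part 1, where $strata is NULL. I have used geom_step() here because a Kaplan-Meier estimate is a step function, and drawn the confidence limits as dashed steps instead of a ribbon; both are presentation choices you can swap back.

```r
library(survival)
library(ggplot2)

# Hypothetical helper: flatten a survfit object into a plain data frame
survfit_to_df <- function(fit) {
  # Unstratified fits have no $strata, so fall back to a single group
  strata <- if (is.null(fit$strata)) {
    rep("all", length(fit$time))
  } else {
    rep(names(fit$strata), times = fit$strata)
  }
  data.frame(
    time   = fit$time,
    surv   = fit$surv,
    lower  = fit$lower,
    upper  = fit$upper,
    strata = factor(strata)
  )
}

km_trt_fit <- survfit(Surv(time, status) ~ trt, data = veteran)
km_df <- survfit_to_df(km_trt_fit)

# Step curves per stratum, with dashed steps for the confidence limits
p <- ggplot(km_df, aes(x = time, colour = strata)) +
  geom_step(aes(y = surv)) +
  geom_step(aes(y = lower), linetype = "dashed") +
  geom_step(aes(y = upper), linetype = "dashed")
print(p)
```

Calling survfit_to_df(km_fit) on the Part 1 model gives a data frame with a single "all" stratum, so the same plotting code works for both cases.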
I am not overly familiar with ggfortify but its job is probably to do something similar for you automagically. In its absence, you will have to investigate the structures produced by the model functions and extract the data manually like I did above.
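For Part 3, the same kind of digging suggests a starting point. An aareg object has no surv, lower or upper components (those lookups return NULL), so there is nothing to map to y. What it does document are $times and $coefficient, the latter with one row per event time and one column per model term. The sketch below plots the cumulative sum of each coefficient column, which is the quantity an Aalen model plot normally shows. The confidence band is an assumption on my part: it uses the square root of the cumulative squared increments as a rough standard error and 1.96 as the multiplier, so treat it as an approximation, not a reproduction of the tutorial's figure. The names cum, se and aa_df are mine.

```r
library(survival)
library(ggplot2)

vet <- transform(veteran,
                 trt   = factor(trt, labels = c("standard", "test")),
                 prior = factor(prior, labels = c("No", "Yes")))

aa_fit <- aareg(Surv(time, status) ~ trt + celltype + karno +
                  diagtime + age + prior,
                data = vet)

coefs <- aa_fit$coefficient            # one row per event time, one column per term
cum   <- apply(coefs, 2, cumsum)       # cumulative regression coefficients
se    <- sqrt(apply(coefs^2, 2, cumsum))  # ASSUMPTION: rough standard error

# Long format built by hand (tidyr is not available on the work machine);
# as.vector() unrolls the matrices column by column, matching rep(..., each =)
aa_df <- data.frame(
  time  = rep(aa_fit$times, times = ncol(coefs)),
  term  = rep(colnames(coefs), each = nrow(coefs)),
  value = as.vector(cum),
  se    = as.vector(se)
)
aa_df$lower <- aa_df$value - 1.96 * aa_df$se
aa_df$upper <- aa_df$value + 1.96 * aa_df$se

p <- ggplot(aa_df, aes(x = time, colour = term, fill = term)) +
  geom_line(aes(y = value)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.3, colour = NA) +
  facet_wrap(~ term, scales = "free_y")
print(p)
```

Running names(aa_fit) or str(aa_fit) first is a quick way to see which components a model object actually provides before you build the data frame.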
Source: https://stackoverflow.com/questions/65042472/r-tibble-vs-ggplot2-plotting-graphs