Question
I'm an R newbie working with an annual time series dataset (named "timeseries"). The dataset has one column for the year and another 600 columns holding the yearly values for different locations ("L1", "L2", etc.), similar to the following:
Year L1 L2 L3 L4
1963 0.63 0.23 1.33 1.41
1964 1.15 0.68 0.21 0.4
1965 1.08 1.06 1.14 0.83
1966 1.69 1.85 1.3 0.76
1967 0.77 0.62 0.44 0.96
I'd like to do a linear regression for each site and can use the following for a single site:
timeL1<-lm(L1~Year, data=timeseries)
summary(timeL1)
But I think there must be a way to automatically repeat this for all the locations. Ideally, I'd like to end up with two vectors of results: one with the coefficients for all the locations and one with the p-values for all the locations. From some searching, I thought the plyr package might work, but I can't figure it out. I'm still learning the basics of R, so any suggestions would be appreciated.
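Answer 1:
You can do this with one line of code, where df is your data frame (timeseries in the question). Note that row 1 of the coefficient table is the intercept, so the output below shows each location's intercept and its p-value; index row 2 instead (i.e. [2,c(1,4)]) to get the Year slope and its p-value.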
apply(df[-1], 2, function(x) summary(lm(x ~ df$Year))$coef[1,c(1,4)])
                   L1           L2          L3          L4
Estimate -160.0660000 -382.2870000 136.4690000 106.9820000
Pr(>|t|)    0.6069965    0.3886881   0.7340981   0.7030296
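If what you are after is the trend over time (the Year slope) as two separate vectors, a minimal sketch along the same lines might look like the following. It assumes the data frame is called timeseries as in the question (rebuilt here from the sample rows), and fit_one is a made-up helper name used only for illustration.

```r
# Sample data from the question
timeseries <- read.table(text = "Year L1 L2 L3 L4
1963 0.63 0.23 1.33 1.41
1964 1.15 0.68 0.21 0.4
1965 1.08 1.06 1.14 0.83
1966 1.69 1.85 1.3 0.76
1967 0.77 0.62 0.44 0.96", header = TRUE)

# fit_one is a hypothetical helper: regress one location on Year and
# return the slope estimate together with its p-value
fit_one <- function(y) {
  tab <- coef(summary(lm(y ~ timeseries$Year)))
  tab[2, c("Estimate", "Pr(>|t|)")]  # row 2 is the Year slope
}

# One column per location; rows are "Estimate" and "Pr(>|t|)"
res <- sapply(timeseries[-1], fit_one)

slopes <- res["Estimate", ]  # named vector of slopes
pvals  <- res["Pr(>|t|)", ]  # named vector of slope p-values
```

Both vectors are named by location, so slopes["L1"] or pvals["L3"] pick out a single site.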
Answer 2:
A combination of apply and lapply can accomplish this.
d <- read.table(text="Year L1 L2 L3 L4
1963 0.63 0.23 1.33 1.41
1964 1.15 0.68 0.21 0.4
1965 1.08 1.06 1.14 0.83
1966 1.69 1.85 1.3 0.76
1967 0.77 0.62 0.44 0.96", header=TRUE)
year <- d$Year
d <- d[,-1]
models<-apply(d, 2, function(x) lm(x ~ year))
summaries <- lapply(models, summary)
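# Pull the slope row ("year") by name; positional x[1] and x[4] would return
# the intercept estimate and the slope's standard error instead
coefs <- sapply(summaries, function(s) coefficients(s)["year", "Estimate"])
pvals <- sapply(summaries, function(s) coefficients(s)["year", "Pr(>|t|)"])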
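As a possible alternative, lm also accepts a matrix response and then fits one regression per column in a single call. The sketch below is only an illustration under that assumption, using the first two locations of the sample data; the names dat, Y and fit are made up here and are not part of the original answer.

```r
# First two locations from the question's sample data
dat <- data.frame(
  Year = 1963:1967,
  L1 = c(0.63, 1.15, 1.08, 1.69, 0.77),
  L2 = c(0.23, 0.68, 1.06, 1.85, 0.62)
)

Y <- as.matrix(dat[-1])          # one response column per location
fit <- lm(Y ~ Year, data = dat)  # a single multi-response ("mlm") fit

# coef() on an mlm fit is a matrix: rows are terms, columns are locations
slopes <- coef(fit)["Year", ]

# summary() on an mlm fit returns one summary.lm per response column
pvals <- sapply(summary(fit), function(s) coef(s)["Year", "Pr(>|t|)"])
names(pvals) <- colnames(Y)      # replace the "Response L1" style names
```

With 600 locations this avoids building 600 separate model objects by hand, although the per-location p-values still come from one summary per response.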
Source: https://stackoverflow.com/questions/28972652/r-repeating-linear-regression-in-a-large-dataset