Applying yearwise segmented regression in R

随声附和 submitted on 2020-07-10 10:26:33

Question


I have daily rainfall data, which I have converted to year-wise cumulative values using the following code:

library(seas)
library(data.table)
library(ggplot2)

#Loading data
data(mscdata)
dat <- (mksub(mscdata, id=1108447))
dat$julian.date <- as.numeric(format(dat$date, "%j"))
DT <- data.table(dat)
DT[, Cum.Sum := cumsum(rain), by=list(year)]

df <- cbind.data.frame(day=dat$julian.date,cumulative=DT$Cum.Sum)

Next, I want to apply segmented regression year-wise to get breakpoints for each year. I was able to do it for a single year like this:

library("segmented")
x <- subset(dat,year=="1984")$julian.date
y <- subset(DT,year=="1984")$Cum.Sum
fit.lm<-lm(y~x)
segmented(fit.lm, seg.Z = ~ x, npsi=3)

I used npsi = 3 to get 3 breakpoints. How can I apply the segmented regression dynamically, year by year, and get the estimated breakpoints for each year?


Answer 1:


Here's a short script that defines a custom function, so that you can run the segmented regression for any given year.

## using tidyverse processes instead of mixing and matching with other data manipulation packages 
library(tidyverse); library(segmented); library(seas)

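## get mscdata from the "seas" package
data(mscdata)
dat <- (mksub(mscdata, id=1108447))
## day of year, computed as in the question (the model below uses it)
dat$julian.date <- as.numeric(format(dat$date, "%j"))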

## generate cumulative sum of rain by year
d2 <- dat %>% group_by(year) %>% mutate(rain_cs = cumsum(rain)) %>% ungroup

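## write a custom function
## (the argument is named yr, not year: inside filter(), year == year would
## compare the column with itself and keep every row)
segmentedlm <- function(data, yr){
  subset.df <- data %>% filter(year == yr)
  fit.lm <- lm(rain_cs ~ julian.date, subset.df)
  segmented(fit.lm, seg.Z = ~ julian.date, npsi=3)
}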

# run the customised function for 1975 data
segmentedlm(d2, "1975") %>% plot(., main="1975")

segmentedlm(d2, "1984") %>% plot(., main = "1984")
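A general pitfall when wrapping a filter in a function: if an argument has the same name as a column, the condition `year == year` resolves both sides to the column, so every row passes. The sketch below shows this with base R's `subset()` on made-up data; `toy`, `pick_shadowed`, `pick_year` and the argument name `yr` are illustrative names, not part of the thread. dplyr's `filter()` looks up names the same way, and there `.env$year` is another way to refer to the argument.

```r
# Toy data: two years, two rows each
toy <- data.frame(year = c(1975, 1975, 1984, 1984), value = 1:4)

# Argument named like the column: `year == year` compares the column with itself
pick_shadowed <- function(data, year) subset(data, year == year)

# Distinct argument name: only the rows for the requested year are kept
pick_year <- function(data, yr) subset(data, year == yr)

nrow(pick_shadowed(toy, 1975))  # 4: every row passes the test
nrow(pick_year(toy, 1975))      # 2: only the 1975 rows
```

If the plots for different years look identical, this is the first thing to check.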

To write the fitted segmented models for multiple years to a text file:

sink("output.txt")
lapply(c("1975", "1984"), function(x) segmentedlm(d2, x))
sink()

To cover all years, pass the full vector of years to lapply instead of c("1975", "1984").
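If the goal is a table of estimated breakpoints rather than printed models, one possible approach is to split the data by year and collect the `psi` matrix of each fit. This is a minimal sketch, not the answerer's code: `breakpoints_for` is a hypothetical helper, the cumulative sum is rebuilt with base R's `ave()`, and the `tryCatch()` is there on the assumption that the fit may fail or find no breakpoint for some years.

```r
library(seas)
library(segmented)

# Rebuild the yearly cumulative rainfall used in the thread
data(mscdata)
dat <- mksub(mscdata, id = 1108447)
dat$julian.date <- as.numeric(format(dat$date, "%j"))
dat$rain_cs <- ave(dat$rain, dat$year, FUN = cumsum)

# Hypothetical helper: breakpoints for one year's rows, or NULL if the fit fails
breakpoints_for <- function(df, npsi = 3) {
  fit.lm <- lm(rain_cs ~ julian.date, data = df)
  fit.seg <- tryCatch(
    segmented(fit.lm, seg.Z = ~ julian.date, npsi = npsi),
    error = function(e) NULL
  )
  # No usable breakpoints for this year: skip it
  if (is.null(fit.seg) || is.null(fit.seg$psi)) return(NULL)
  # psi has the columns Initial, Est. and St.Err, one row per breakpoint
  data.frame(year = df$year[1], fit.seg$psi,
             row.names = NULL, check.names = FALSE)
}

# One data frame per year, then stack the per-year results
by_year <- split(dat, dat$year)
bp <- do.call(rbind, lapply(by_year, breakpoints_for))
head(bp)
```

Years that return NULL are dropped by `rbind`, so comparing `unique(bp$year)` with `names(by_year)` shows which years had no usable fit.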




Answer 2:


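You can store the lm objects in a list and apply segmented to each year's model. This uses DT from the question. Note that the rows shown in the output below have the same estimates for every year, so it is worth checking the result against a single-year fit.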

library(tidyverse)

data <- DT %>%
         group_by(year) %>%
         summarise(fit.lm = list(lm(Cum.Sum~julian.date)), 
                   julian.date1 = list(julian.date)) %>%
         mutate(out = map2(fit.lm, julian.date1, function(x, julian.date) 
                       data.frame(segmented::segmented(x, 
                                  seg.Z = ~julian.date, npsi=3)$psi))) %>%
         unnest_wider(out) %>%
         unnest(cols = c(Initial, Est., St.Err)) %>%
         dplyr::select(-fit.lm, -julian.date1)

# A tibble: 90 x 4
#    year Initial  Est. St.Err
#   <int>   <dbl> <dbl>  <dbl>
# 1  1975    84.8  68.3  1.44 
# 2  1975   168.  167.   9.31 
# 3  1975   282.  281.   0.917
# 4  1976    84.8  68.3  1.44 
# 5  1976   168.  167.   9.33 
# 6  1976   282.  281.   0.913
# 7  1977    84.8  68.3  1.44 
# 8  1977   168.  167.   9.32 
# 9  1977   282.  281.   0.913
#10  1978    84.8  68.3  1.44 
# … with 80 more rows
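A quick way to confirm that a grouped fit really uses each year's own rows is to run the same pattern on simulated data whose breakpoints are known. The sketch below is not the code above: it assumes made-up data with one breakpoint per year (`sim`, `make_year` and `fit_year` are illustrative names) and passes each year's rows explicitly through `data =`.

```r
library(segmented)

set.seed(1)
# Simulated stand-in for the rainfall data: each "year" has one known breakpoint
make_year <- function(year, bp) {
  day <- 1:365
  # slope 0.5 before the breakpoint, 2.5 after it, plus a little noise
  y <- 0.5 * day + 2 * pmax(day - bp, 0) + rnorm(365, sd = 2)
  data.frame(year = year, julian.date = day, Cum.Sum = y)
}
sim <- rbind(make_year(2001, 100), make_year(2002, 250))

# Fit one year on its own rows and return the breakpoint estimate
fit_year <- function(d) {
  fit.lm <- lm(Cum.Sum ~ julian.date, data = d)
  fit.seg <- segmented(fit.lm, seg.Z = ~ julian.date, npsi = 1)
  data.frame(year = d$year[1], fit.seg$psi,
             row.names = NULL, check.names = FALSE)
}

check <- do.call(rbind, lapply(split(sim, sim$year), fit_year))
print(check)
```

The `Est.` column should land near day 100 for 2001 and near day 250 for 2002. Identical estimates across years would suggest the groups are not being fitted separately.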


Source: https://stackoverflow.com/questions/62574268/applying-yearwise-segmented-regression-in-r
