Piecewise regression with R: plotting the segments

-上瘾入骨i 2020-12-23 14:48

I have 54 points. They represent offer and demand for products. I would like to show there is a break point in the offer.

First, I sort the x-axis (offer) and remove

3 Answers
  • 2020-12-23 15:17

    Vincent has you on the right track. The only thing "weird" about the lines in your resulting plot is that lines() draws a line between each successive pair of points, so the "jump" you see is simply lines() connecting the end of one segment to the start of the other.

    If you don't want that connector, you have to split the lines call into two separate pieces.

    Also, I feel like you can simplify your regression a bit. Here's what I did:

    #After reading your data into dat
    Break <- 22.4
    dat$grp <- dat$offer < Break
    
    #Note the addition of the grp variable makes this a bit easier to read
    m <- lm(demand~offer*grp,data = dat)
    dat$pred <- predict(m)
    
    plot(dat$offer,dat$demand)
    dat <- dat[order(dat$offer),]
    with(subset(dat,offer < Break),lines(offer,pred))
    with(subset(dat,offer >= Break),lines(offer,pred))
    

    which produces a plot with a separate fitted line on each side of the break.
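As a quick check on the simplified model, here is a minimal sketch using simulated data. The original 54 points are not shown, so the values, seed, and slopes below are made up for illustration. It suggests that the single `offer*grp` fit gives the same fitted values as two separate regressions, one on each side of the break.

```r
# Simulated stand-in for the question's data (all values are hypothetical)
set.seed(1)
Break <- 22.4
dat <- data.frame(offer = sort(runif(54, 15, 30)))
dat$demand <- ifelse(dat$offer < Break,
                     50 + 9 * dat$offer,
                     300 - 2 * dat$offer) + rnorm(54, sd = 5)
dat$grp <- dat$offer < Break

# One model with a separate intercept and slope for each group
m <- lm(demand ~ offer * grp, data = dat)

# Two independent fits, one on each side of the break
m_lo <- lm(demand ~ offer, data = subset(dat, grp))
m_hi <- lm(demand ~ offer, data = subset(dat, !grp))

# dat is sorted by offer, so the low-side rows come first
same_fit <- isTRUE(all.equal(
  unname(fitted(m)),
  unname(c(fitted(m_lo), fitted(m_hi)))
))
same_fit
```

Because the two fits are independent, nothing forces the segments to meet at the break, which is why they are drawn with two separate lines() calls.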

  • 2020-12-23 15:30

    Here is an easier approach using ggplot2.

    library(ggplot2)
    # qplot() is deprecated in recent ggplot2 releases; this is the ggplot() equivalent
    ggplot(dat, aes(offer, demand, group = offer > 22.4)) +
      geom_point() +
      geom_smooth(method = 'lm', se = FALSE)
    

    EDIT. I would also recommend taking a look at the segmented package, which supports automatic detection and estimation of breakpoints in segmented regression models.


    UPDATE:

    Here is an example that uses the R package segmented to detect the breaks automatically:

    library(segmented)
    set.seed(12)
    xx <- 1:100
    zz <- runif(100)
    yy <- 2 + 1.5*pmax(xx - 35, 0) - 1.5*pmax(xx - 70, 0) + 15*pmax(zz - .5, 0) + 
      rnorm(100,0,2)
    dati <- data.frame(x = xx, y = yy, z = zz)
    out.lm <- lm(y ~ x, data = dati)
    o <- segmented(out.lm, seg.Z = ~x, psi = list(x = c(30, 60)),
                   control = seg.control(display = FALSE))
    dat2 <- data.frame(x = xx, y = broken.line(o)$fit)
    
    library(ggplot2)
    ggplot(dati, aes(x = x, y = y)) +
      geom_point() +
      geom_line(data = dat2, color = 'blue')
    

  • 2020-12-23 15:39

    The weird lines are simply due to the order in which the points are plotted. The following should look better:

    i <- order(offer)
    # name the newdata column so predict() matches it to the model's variable
    lines(offer[i], predict(model, list(offer = offer))[i])
    

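    The warning comes from the fact that lm interprets the * character as a formula operator (main effects plus their interaction), not as arithmetic multiplication. The two indicator terms are then redundant, which is why some coefficients come back as NA: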

    > lm(demand~(offer<22.4)*offer + (offer>=22.4)*offer)
    Call:
    lm(formula = demand ~ (offer < 22.4) * offer + (offer >= 22.4) * offer)
    Coefficients:
                (Intercept)         offer < 22.4TRUE                    offer  
                    -309.46                   356.08                    29.86  
          offer >= 22.4TRUE   offer < 22.4TRUE:offer  offer:offer >= 22.4TRUE  
                         NA                   -20.79                       NA  
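To see how the formula parser treats `*`, the term labels can be inspected directly. This is a small sketch; `y`, `a`, and `b` are placeholder names that do not need to exist as variables.

```r
# Term labels show how the formula parser expands each operator
star_terms <- attr(terms(y ~ a * b), "term.labels")      # main effects plus interaction
arith_terms <- attr(terms(y ~ I(a * b)), "term.labels")  # I() keeps * as arithmetic

star_terms
arith_terms

# The formula from the question expands into overlapping indicator terms
q_terms <- attr(
  terms(demand ~ (offer < 22.4) * offer + (offer >= 22.4) * offer),
  "term.labels"
)
q_terms
```

Since `offer < 22.4` and `offer >= 22.4` are exact complements, the second indicator and its interaction add no new information once the first is in the model, which is consistent with the NA coefficients above.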
    

    In addition, (offer<22.4)*offer is a discontinuous function: this is where the discontinuity comes from.
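The jump can be checked numerically by evaluating the term as plain arithmetic on either side of the break. The helper names `f` and `g` are hypothetical and used only for this illustration.

```r
# The term evaluated as ordinary arithmetic
f <- function(offer) (offer < 22.4) * offer

f(22.39)  # just below the break: the indicator is TRUE, so this is offer itself
f(22.4)   # at the break: the indicator is FALSE, so the term drops to 0

# Centering at the break removes the jump
g <- function(offer) ifelse(offer < 22.4, offer - 22.4, 0)

g(22.39)  # about -0.01
g(22.4)   # 0
```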

    The following should be closer to what you want.

    model <- lm(
      demand ~ ifelse(offer<22.4,offer-22.4,0) 
               + ifelse(offer>=22.4,offer-22.4,0) )
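A minimal sketch of this continuous fit on simulated data follows. The question's data are not shown, so the seed, slopes, and noise level are assumptions chosen for illustration. It checks that the predictions just below and just above the break nearly coincide.

```r
# Simulated data with a kink at the break (all values are hypothetical)
set.seed(2)
Break <- 22.4
offer <- sort(runif(54, 15, 30))
demand <- 250 + 9 * pmin(offer - Break, 0) - 2 * pmax(offer - Break, 0) +
  rnorm(54, sd = 5)

# Both terms are 0 at the break, so the two segments share the intercept there
model <- lm(
  demand ~ ifelse(offer < Break, offer - Break, 0)
         + ifelse(offer >= Break, offer - Break, 0))

# Predict just below and just above the break
eps <- 1e-6
p <- predict(model, data.frame(offer = c(Break - eps, Break + eps)))
gap <- unname(abs(diff(p)))
gap
```

Since `offer` is sorted here, calling `plot(offer, demand)` followed by `lines(offer, fitted(model))` should draw a single connected line that bends at the break.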
    