How to find and plot the local maxima of a polynomial regression curve in R?

你说的曾经没有我的故事 提交于 2019-12-24 16:55:04

问题


I have yield data for a farm (independent variable) and various nutrients which serve as predictors. I have performed univariate (cubic) linear regression using lm(y ~ ploy(x,3)). Then I plotted the predictor variable (P) against the yield, and added a best fit curve (Figure 1). How do I then find the local maxima of this curve, and add a point to my plot which includes the value of this fitted yield (Figure 2)? I have looked into the findPeaks() function from the quantmod package, but have not been able to implement.

Figure 1 - Plot I have created:

Figure 2 - Plot I would like to create:

My code:

head(farm)
P <- farm$P
Yield <- farm$Yield
fit <- lm(Yield ~ poly(P,3), data=farm)
plot(P, Yield, col="lightblue", xlab = "P", ylab = "Yield", main="Farm - polynomial fit")
lines(sort(P), fitted(fit)[order(P)], col='darkred', type='l', cex=10) 

My data:

farm <- structure(list(Yield = c(28.3818, 27.0422555556, 31.5444454545, 
32.2084818182, 25.983, 43.49634, 53.3981333333, 59.19274, 61.23185, 
63.83512, 64.5388, 63.08576, 63.83954, 62.7838333333, 64.10366, 
67.5600666667, 67.6325, 68.0023, 63.02148, 64.0025666667, 64.19196, 
60.4893, 65.2803333333, 62.28096, 59.1914, 59.37304, 58.6482333333, 
58.08474, 58.98306, 61.52755, 61.9972, 62.0188833333, 61.09072, 
58.44715, 61.06646, 59.6103833333, 60.57035, 67.5646, 67.85488, 
71.5719571429, 50.30092, 22.80535, 62.26542, 62.0754333333, 58.46464, 
62.7326833333, 65.53482, 63.3064666667, 63.68004, 63.5212166667, 
64.65068, 66.19655, 66.0726, 66.6404666667, 65.02208, 62.1035666667, 
61.78824, 61.0844166667, 60.1723, 61.6899333333, 57.8784166667, 
56.8886666667, 59.13944, 57.26695, 69.5792666667, 61.9865166667, 
55.7342, 53.7012857143, 56.9748166667, 56.8706333333, 61.3384166667, 
52.87725, 29.2331888889, 15.5046, 42.943, 44.590325, 50.09525, 
52.68065, 53.0983714286, 19.43875, 38.06708, 59.9217666667, 62.0287166667, 
66.59496, 64.3986333333, 64.4089333333, 64.6951, 63.8205, 63.6122, 
62.51384, 63.2565666667, 62.47745, 61.42234, 62.6233166667, 62.0730333333, 
60.81996, 60.6490833333, 58.4331333333, 59.94638, 61.4119333333
), P = c(90L, 93L, 97L, 97L, 100L, 106L, 107L, 114L, 118L, 120L, 
121L, 121L, 120L, 120L, 121L, 115L, 116L, 113L, 101L, 90L, 85L, 
84L, 85L, 85L, 83L, 82L, 82L, 82L, 82L, 84L, 84L, 85L, 84L, 87L, 
87L, 88L, 88L, 92L, 93L, 95L, 80L, 67L, 85L, 90L, 88L, 90L, 91L, 
91L, 91L, 91L, 91L, 92L, 88L, 78L, 73L, 71L, 71L, 71L, 71L, 71L, 
69L, 69L, 69L, 71L, 75L, 71L, 71L, 78L, 82L, 84L, 84L, 77L, 64L, 
51L, 99L, 104L, 107L, 109L, 107L, 99L, 102L, 115L, 120L, 121L, 
121L, 120L, 121L, 112L, 111L, 101L, 101L, 89L, 89L, 85L, 84L, 
83L, 83L, 82L, 83L, 83L), Mg = c(666L, 667L, 668L, 668L, 668L, 
670L, 670L, 671L, 672L, 672L, 673L, 673L, 673L, 673L, 673L, 645L, 
645L, 636L, 594L, 553L, 535L, 534L, 534L, 534L, 534L, 534L, 534L, 
534L, 534L, 534L, 534L, 543L, 540L, 570L, 568L, 576L, 576L, 577L, 
577L, 577L, 574L, 572L, 575L, 576L, 576L, 576L, 577L, 577L, 577L, 
577L, 577L, 577L, 574L, 567L, 565L, 564L, 564L, 564L, 564L, 564L, 
564L, 564L, 564L, 564L, 565L, 564L, 553L, 519L, 509L, 509L, 509L, 
508L, 505L, 502L, 668L, 669L, 670L, 670L, 670L, 668L, 669L, 672L, 
672L, 673L, 673L, 673L, 673L, 636L, 636L, 594L, 594L, 553L, 553L, 
534L, 534L, 534L, 534L, 534L, 534L, 534L), S = c(29L, 30L, 31L, 
31L, 31L, 33L, 33L, 35L, 36L, 36L, 37L, 37L, 36L, 36L, 37L, 37L, 
37L, 37L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 
38L, 38L, 38L, 38L, 36L, 36L, 36L, 36L, 37L, 37L, 37L, 34L, 31L, 
35L, 36L, 36L, 36L, 37L, 36L, 37L, 37L, 37L, 37L, 37L, 38L, 39L, 
38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 38L, 39L, 38L, 37L, 37L, 
37L, 38L, 38L, 36L, 33L, 30L, 31L, 32L, 33L, 34L, 33L, 31L, 32L, 
35L, 36L, 37L, 37L, 36L, 37L, 37L, 37L, 38L, 38L, 38L, 38L, 38L, 
38L, 38L, 38L, 38L, 38L, 38L), K = c(537L, 542L, 549L, 549L, 
554L, 563L, 565L, 576L, 584L, 586L, 589L, 588L, 587L, 587L, 588L, 
565L, 566L, 557L, 516L, 477L, 460L, 459L, 460L, 459L, 456L, 455L, 
455L, 454L, 455L, 457L, 458L, 463L, 461L, 478L, 476L, 482L, 482L, 
489L, 491L, 493L, 470L, 447L, 477L, 485L, 482L, 485L, 487L, 486L, 
487L, 487L, 487L, 488L, 495L, 512L, 517L, 514L, 513L, 513L, 512L, 
512L, 510L, 509L, 509L, 513L, 519L, 513L, 495L, 454L, 443L, 446L, 
446L, 435L, 414L, 392L, 552L, 561L, 565L, 569L, 565L, 551L, 557L, 
579L, 586L, 589L, 589L, 587L, 589L, 555L, 554L, 515L, 516L, 476L, 
475L, 459L, 458L, 457L, 456L, 455L, 456L, 457L), Ca = c(4795L, 
4796L, 4797L, 4797L, 4797L, 4799L, 4799L, 4800L, 4801L, 4801L, 
4802L, 4802L, 4802L, 4802L, 4802L, 4779L, 4779L, 4771L, 4736L, 
4701L, 4686L, 4685L, 4685L, 4685L, 4685L, 4685L, 4685L, 4685L, 
4685L, 4685L, 4685L, 4789L, 4754L, 5135L, 5100L, 5204L, 5204L, 
5205L, 5205L, 5205L, 5202L, 5200L, 5203L, 5204L, 5204L, 5204L, 
5205L, 5205L, 5205L, 5205L, 5205L, 5205L, 5125L, 4886L, 4807L, 
4806L, 4806L, 4806L, 4806L, 4806L, 4806L, 4806L, 4806L, 4806L, 
4807L, 4806L, 4687L, 4329L, 4211L, 4211L, 4211L, 4210L, 4207L, 
4204L, 4797L, 4798L, 4799L, 4799L, 4799L, 4797L, 4798L, 4801L, 
4801L, 4802L, 4802L, 4802L, 4802L, 4771L, 4771L, 4736L, 4736L, 
4701L, 4701L, 4685L, 4685L, 4685L, 4685L, 4685L, 4685L, 4685L
)), .Names = c("Yield", "P", "Mg", "S", "K", "Ca"), row.names = c(NA, 
100L), class = "data.frame")

回答1:


The coefficients of fit give you an explicit polynomial equation for the regression line. You can therefore find the maximum analytically by taking the first and second derivatives of this polynomial:

fit <- lm(Yield ~ poly(P,3, raw=TRUE), data=farm)  # Note use of raw=TRUE, otherwise poly returns orthogonal polynomials

# Plot data points
with(farm, plot(P, Yield, col="lightblue", ylim=c(0, max(Yield)), 
                xlab = "P", ylab = "Yield", main="Farm - polynomial fit"))

# Add model fit
P = seq(min(farm$P), max(farm$P), length=1000)
pred = data.frame(P,
                  Yield=predict(fit, newdata=data.frame(P)))

with(pred, lines(P, Yield, col='darkred', type='l', cex=10))

# Vector of model coefficients
cf = coef(fit)

# First derivative of fit. This is just for Illustration; we won't plot this
#  equation directly, but we'll find its roots to get the locations of 
#  local maxima and minima.
D1 = cf[2] + 2*cf[3]*pred$P + 3*cf[4]*pred$P^2

# Roots of first derivative (these are locations where first derivative = 0).
#  Use quadratic formula to find roots.
D1roots = (-2*cf[3] + c(-1,1)*sqrt((2*cf[3])^2 - 4*3*cf[4]*cf[2]))/(2*3*cf[4])

# Second derivative at the two roots. 
D2atD1roots =  2*cf[3] + 6*cf[4]*D1roots

# Local maxima are where the second derivative is negative
max_x = D1roots[which(D2atD1roots < 0)]
max_y = cf %*% max_x^(0:3)

# Plot local maxima
points(max_x, max_y, pch=16, col="red")

# Add values of Yield at local maxima
text(max_x, max_y, label=round(max_y,2), adj=c(0.5,-1), cex=0.8)




回答2:


as you don't give a minimal reproducible example it is a bit hard to give a code. But an advice would be to look for the first closest to zero points of the derivative of your fit curve. You can simply calculate the derivative

curve <- fitted(fit)[order(P)]
derivative <- diff(curve)/(diff(order(P)))

and then look for the zeros of your function. Here an approch could be to find the min values of abs(derivative)



来源:https://stackoverflow.com/questions/48505900/how-to-find-and-plot-the-local-maxima-of-a-polynomial-regression-curve-in-r

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!