Using ggplot2 in R creating multiple smoothed/fitted lines

Submitted by 空扰寡人 on 2021-02-20 02:57:28

Question


I am having trouble producing a figure in R using ggplot2. No statistics are needed; I just need a visual representation of my data. I have 7 participants, and I want to plot a line for each participant through a scatterplot. The slope and shape of the line differ between participants, but on average the relationship is roughly exponential.

I have used the code below, but I only get linear models. When I change the method to loess, the lines are too wiggly. Can someone please help me make this more presentable? Essentially I'm after a line of best fit for each participant, while still being able to use the argument fullrange = FALSE.

Furthermore, should I be using stat_smooth or geom_smooth? Is there a difference?

ggplot(data, aes(x=x, y=y, group = athlete)) +
  geom_point() + 
  stat_smooth(method = "lm", se=FALSE, fullrange = FALSE)

[Figure: ggplot for data]

Thanks in advance for any help!


Answer 1:


I don't have your data, so I'll just do this with the mpg dataset.

As you've noted, you can use geom_smooth() and specify a method such as "loess". Note that you can pass arguments through to the fitting method, just as you would when calling the underlying function directly.

With loess, the smoothing parameter is span. You can play around with this until you're happy with the results.

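library(ggplot2)  # provides ggplot() and the mpg dataset

data(mpg)
g <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) + geom_point()

# A larger span gives a smoother (less wiggly) loess curve
g + geom_smooth(se = FALSE, method = 'loess', span = .8) + ggtitle("span 0.8")
g + geom_smooth(se = FALSE, method = 'loess', span = 1) + ggtitle("span 1")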
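To relate this to the question's setup, here is a minimal sketch with one loess line per participant. The data frame dat and its columns are made up for illustration (the asker's data is not available), and span = 1 is only a starting point to tune. As far as the ggplot2 documentation goes, geom_smooth() and stat_smooth() take the same arguments and are effectively interchangeable here, so the choice between them should not change the result.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: 7 athletes, roughly exponential y
set.seed(1)
dat <- data.frame(
  athlete = factor(rep(1:7, each = 15)),
  x = rep(seq(1, 8, length.out = 15), times = 7)
)
dat$y <- exp(0.2 * dat$x + 0.05 * as.integer(dat$athlete)) +
  rnorm(nrow(dat), sd = 0.2)

# One smoothed line per athlete; a large span keeps the loess curves from wiggling
p_loess <- ggplot(dat, aes(x = x, y = y, group = athlete)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, span = 1,
              se = FALSE, fullrange = FALSE)
p_loess
```

Because group = athlete is set in aes(), the smoother is fitted separately within each group, and fullrange = FALSE keeps each line within that athlete's own x range.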




Answer 2:


There is, to my knowledge, no built-in method for achieving this, but you can do it with some manual plotting. First, since you expect an exponential relationship, it might make sense to run a linear regression using log(y) as the response (I'll be using u and v, in order not to confuse them with the x and y aesthetics in the graph):

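library(tibble)   # tibble()
library(dplyr)    # %>% and mutate()
library(ggplot2)  # used for the plots further down

# Simulated data: three groups with exponential rates -0.5, -1 and -2
tb1 = tibble(
  u = rep(runif(100, 0, 5), 3),
  a = c(rep(-.5, 100), rep(-1, 100), rep(-2, 100)),
  v = exp(a*u + rnorm(3*100, 0, .1))
) %>% mutate(a = as.factor(a))
# Shared intercept, one slope in u per level of a, on the log scale
lm1 = lm(log(v) ~ a:u, tb1)
summary(lm1)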

gives you:

Call:
lm(formula = log(v) ~ a:u, data = tb1)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.263057 -0.069510 -0.001262  0.062407  0.301033 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.013696   0.012234   -1.12    0.264    
a-2:u       -1.996670   0.004979 -401.04   <2e-16 ***
a-1:u       -1.001412   0.004979 -201.14   <2e-16 ***
a-0.5:u     -0.495636   0.004979  -99.55   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1002 on 296 degrees of freedom
Multiple R-squared:  0.9984,    Adjusted R-squared:  0.9983 
F-statistic: 6.025e+04 on 3 and 296 DF,  p-value: < 2.2e-16

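Under "Coefficients" you can find the shared intercept and one "slope" per level of a. Because the model is fitted to log(v), these slopes are the rate constants inside the exponential, so the fitted curve for each group is exp(intercept + slope * u). You can see that they closely match the values of a used to generate the data.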
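As an illustration of reading those coefficients programmatically, the following sketch refits the same kind of model on freshly simulated data and back-transforms the intercept. The names sim, fit, rates and scale0 are introduced here for the example only, and the seed is arbitrary, so the numbers will not match the summary printed above.

```r
# Simulate three groups with known rates, as in the answer's example
set.seed(123)
u <- rep(runif(100, 0, 5), 3)
a <- rep(c(-0.5, -1, -2), each = 100)
sim <- data.frame(u = u, a = factor(a),
                  v = exp(a * u + rnorm(300, 0, 0.1)))

# Same model form: shared intercept, one slope in u per level of a
fit <- lm(log(v) ~ a:u, data = sim)
b <- coef(fit)

# Coefficients whose names end in ":u" are the per-group exponential rates
rates <- b[grepl(":u$", names(b))]

# exp(intercept) is the common multiplier in v = scale0 * exp(rate * u)
scale0 <- exp(b[["(Intercept)"]])

print(rates)
print(scale0)
```

The estimated rates should land close to -2, -1 and -0.5, and scale0 close to 1, since the simulated curves all start at exp(0).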

To plot the fitting curves, you can use the "predicted" values, produced from your linear model using predict:

ggplot(tb1, aes(u, v, colour=a)) +
  geom_point() +
  geom_line(data=tb1 %>% mutate(v = exp(predict(lm1))))
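A possible alternative, not part of the original answer, is to let geom_smooth() fit an exponential-shaped curve per group by passing a log-link Gaussian GLM through method.args. This is a sketch under assumptions: it requires strictly positive y values, it fits each colour group separately (so each group gets its own intercept, unlike lm1), and it minimises error on the original scale rather than the log scale, so the curves can differ slightly from the log-linear fit. The data frame demo is simulated for the example.

```r
library(ggplot2)

# Simulated data in the same spirit as the answer's example
set.seed(42)
demo <- data.frame(
  u = rep(runif(100, 0, 5), 3),
  a = rep(c(-0.5, -1, -2), each = 100)
)
demo$v <- exp(demo$a * demo$u + rnorm(300, 0, 0.1))
demo$a <- factor(demo$a)

# glm with a log link models E[v] = exp(b0 + b1 * u) separately for each group
p_glm <- ggplot(demo, aes(u, v, colour = a)) +
  geom_point() +
  geom_smooth(method = "glm", formula = y ~ x,
              method.args = list(family = gaussian(link = "log")),
              se = FALSE, fullrange = FALSE)
p_glm
```

If the GLM fails to find starting values on real data (for example when some y values are zero or negative), the manual predict() approach above is the safer route.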

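If you want ribbons showing the uncertainty of the fit (here, the fit plus or minus 1.96 standard errors on the log scale, back-transformed with exp), it's a little more work, but still possible: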

# Fitted values and their standard errors on the log scale
p1 = predict(lm1, se.fit=TRUE)
tb2 = tibble(
  u = tb1$u,
  a = tb1$a,
  v = exp(p1$fit),
  vmin = exp(p1$fit - 1.96*p1$se.fit),
  vmax = exp(p1$fit + 1.96*p1$se.fit)
)
ggplot(tb2, aes(u, v, colour=a)) +
  geom_ribbon(aes(fill=a, ymin=vmin, ymax=vmax), colour=NA, alpha=.25) +
  geom_line(linewidth=.5) +  # use size=.5 on ggplot2 versions before 3.4.0
  geom_point(data=tb1)

This produces the scatterplot with a fitted curve and a ribbon for each level of a.
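To carry this over to the question's setting (7 participants, each with their own line, drawn only over their own data range), one could fit separate intercepts and rates per participant and predict on a per-participant grid. The sketch below uses made-up data; dat, grid and the column names athlete, x, y are assumptions mirroring the question's code, not the asker's real data.

```r
library(ggplot2)

set.seed(7)
# Hypothetical data: 7 athletes, each observed over a different x range
dat <- do.call(rbind, lapply(1:7, function(i) {
  x <- sort(runif(12, min = i / 2, max = 5 + i / 2))
  data.frame(athlete = i, x = x,
             y = exp(0.1 * i + 0.3 * x + rnorm(12, 0, 0.1)))
}))
dat$athlete <- factor(dat$athlete)

# Separate intercept and rate per athlete on the log scale
fit <- lm(log(y) ~ athlete * x, data = dat)

# Prediction grid limited to each athlete's observed x range,
# which mimics fullrange = FALSE
grid <- do.call(rbind, lapply(split(dat, dat$athlete), function(d) {
  data.frame(athlete = d$athlete[1],
             x = seq(min(d$x), max(d$x), length.out = 50))
}))
grid$y <- exp(predict(fit, newdata = grid))

p_fit <- ggplot(dat, aes(x, y, colour = athlete)) +
  geom_point() +
  geom_line(data = grid)
p_fit
```

Predicting on an evenly spaced grid rather than at the observed points gives smooth curves even when a participant has only a few observations.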



Source: https://stackoverflow.com/questions/58873060/using-ggplot2-in-r-creating-multiple-smoothed-fitted-lines
