ggplot2: issues with dual y-axes and Loess smoothing

大憨熊 提交于 2019-12-24 11:05:10

问题


I'm a novice with R and ggplot. I recognize the power of R and elegance of ggplot and am trying to learn. Normally, I can find a solution online but have had no luck this time.

I am trying to generate a chart in ggplot comparing Economic Freedom scores with Life Expectancy and Infant mortality using World Bank data (the csv data is included at the bottom of the post). I have had some success using this code (using the example at https://rpubs.com/MarkusLoew/226759):

p <- ggplot(mydata, aes(x = Score))
  p <- p + geom_point(aes(y = Longevity, colour = "Life Expectancy")) 
  p <- p + geom_point(aes(y = Infant/1, colour = "Infant mortality (per 
capita)")) 
   p <- p + scale_y_continuous(sec.axis = sec_axis(~.*1, name = "Infant 
mortality (per capita)")) 
  p <- p + scale_colour_manual(values = c("blue", "red"))
  p <- p + labs(y = "Life Expectancy (years)",
                x = "Score",
                colour = " ")
p

This has produced the following: my messed up chart

I can't manage to properly scale the primary y-axis. Scaling the graphs as in the example (link above) doesn't work: I just expand out or squash the Longevity data. I tried loading the Longevity data on the secondary y but it still didn't work.

The other issue is that I would like to add LOESS smooth trendlines to each set of data. I have tried following various examples but nothing works.

If anyone has a solution it will be much appreciated!

Thanks

Data:

Country Name,Score,GDP,Infant,Longevity,,,,,,,,,
Afghanistan,48.9,585.850064,53.2,63.673,,,,,,,,,
Albania,64.4,4537.86249,8.1,78.345,,,,,,,,,
Algeria,46.5,4.12E+03,21,76.078,,,,,,,,,
Angola,48.5,4.17E+03,55.8,61.547,,,,,,,,,
Argentina,50.4,1.44E+04,9.7,76.577,,,,,,,,,
Armenia,70.3,3936.79832,11.9,74.618,,,,,,,,,
Australia,81,5.38E+04,3.1,82.5,,,,,,,,,
Austria,72.3,4.73E+04,3,80.8902439,,,,,,,,,
Azerbaijan,63.6,4131.61831,21.9,72.026,,,,,,,,,
Bahrain,68.5,23655.0356,6.4,76.9,,,,,,,,,
Bangladesh,55,1.52E+03,28.3,72.489,,,,,,,,,
Barbados,54.5,16788.6839,11.9,75.906,,,,,,,,,
Belarus,58.6,5726.02967,2.9,73.82682927,,,,,,,,,
Belgium,67.8,4.33E+04,3.1,80.99268293,,,,,,,,,
Belize,58.6,4905.50628,12.8,70.384,,,,,,,,,
Benin,59.2,829.797231,65.1,60.907,,,,,,,,,
Bhutan,58.4,3110.23011,26.5,70.197,,,,,,,,,
Bolivia,47.7,3393.95582,29,69.125,,,,,,,,,
Bosnia and Herzegovina,60.2,5180.6363,5.1,76.911,,,,,,,,,
Botswana,70.1,7595.59585,32.3,66.797,,,,,,,,,
Brazil,52.9,9.82E+03,14.6,75.509,,,,,,,,,
Brunei Darussalam,69.8,28290.5852,9,77.203,,,,,,,,,
Bulgaria,67.9,8031.59844,6.7,74.61463415,,,,,,,,,
Burkina Faso,59.6,670.705913,52.6,60.361,,,,,,,,,
Burundi,53.2,320.08687,44.1,57.481,,,,,,,,,
Cabo Verde,56.9,3209.69112,15.9,72.798,,,,,,,,,
Cambodia,59.5,1384.42319,26.3,68.981,,,,,,,,,
Cameroon,51.8,1446.70289,56.6,58.073,,,,,,,,,
Canada,78.5,4.50E+04,4.6,82.3005122,,,,,,,,,
Central African Republic,51.8,418.411287,89.2,52.171,,,,,,,,,
Chad,49,669.886426,75,52.903,,,,,,,,,
Chile,76.5,1.53E+04,6.6,79.522,,,,,,,,,
China,57.4,8.83E+03,8.6,76.252,,,,,,,,,
Colombia,69.7,6.30E+03,13.1,74.381,,,,,,,,,
Comoros,55.8,797.286368,53.6,63.701,,,,,,,,,
Costa Rica,65,11630.6684,8,79.831,,,,,,,,,
Cote d'Ivoire,63,1662.44247,66,53.582,,,,,,,,,
Croatia,59.4,13294.5149,4,78.02195122,,,,,,,,,
Cyprus,67.9,25233.571,2.2,80.508,,,,,,,,,
Czech Republic,73.3,2.04E+04,2.6,78.33170732,,,,,,,,,
Denmark,75.1,5.63E+04,3.7,80.70487805,,,,,,,,,
Djibouti,46.7,1927.58971,53,62.465,,,,,,,,,
Dominica,63.7,7609.61435,30.4,,,,,,,,,,
Dominican Republic,62.9,7052.25884,25.6,73.861,,,,,,,,,
Ecuador,49.3,6.20E+03,12.7,76.327,,,,,,,,,
"Egypt, Arab Rep.",52.6,2.41E+03,19.4,71.484,,,,,,,,,
El Salvador,64.1,3889.30877,12.9,73.512,,,,,,,,,
Equatorial Guinea,45,9850.01358,67.4,57.681,,,,,,,,,
Estonia,79.1,19704.655,2.3,77.73658537,,,,,,,,,
Ethiopia,52.7,767.563478,42.5,65.475,,,,,,,,,
Fiji,63.4,5589.38883,21.1,70.269,,,,,,,,,
Finland,74,4.57E+04,1.9,81.7804878,,,,,,,,,
France,63.3,3.85E+04,3.5,82.27317073,,,,,,,,,
Gabon,58.6,7220.68724,36.1,66.105,,,,,,,,,
Georgia,76,4078.25488,10.2,73.261,,,,,,,,,
Germany,73.8,4.45E+04,3.2,80.64146341,,,,,,,,,
Ghana,56.2,1641.48662,37.2,62.742,,,,,,,,,
Greece,55,1.86E+04,4.2,81.03658537,,,,,,,,,
Guatemala,63,4470.98957,23.9,73.409,,,,,,,,,
Guinea,47.6,825.34493,58.1,60.015,,,,,,,,,
Guinea-Bissau,56.1,723.658622,57.4,57.403,,,,,,,,,
Guyana,58.5,4725.31906,26.7,66.65,,,,,,,,,
Haiti,49.6,765.683925,55,63.33,,,,,,,,,
Honduras,58.8,2480.12593,16.2,73.575,,,,,,,,,
"Hong Kong SAR, China",88.6,4.62E+04,,84.22682927,,,,,,,,,
Hungary,65.8,1.42E+04,4.1,75.56829268,,,,,,,,,
Iceland,74.4,70056.8734,1.7,82.46829268,,,,,,,,,

回答1:


This should give you a good start. You can play around with scale_ratio & dif if you want to

library(tidyverse)

mydata <- read_csv(text, col_types = paste0(c("c", rep("d", 4), rep("_", 9)), collapse = ""))
mydata
#> # A tibble: 67 x 5
#>    `Country Name` Score    GDP Infant Longevity
#>    <chr>          <dbl>  <dbl>  <dbl>     <dbl>
#>  1 Afghanistan     48.9   586.   53.2      63.7
#>  2 Albania         64.4  4538.    8.1      78.3
#>  3 Algeria         46.5  4120    21        76.1
#>  4 Angola          48.5  4170    55.8      61.5
#>  5 Argentina       50.4 14400     9.7      76.6
#>  6 Armenia         70.3  3937.   11.9      74.6
#>  7 Australia       81   53800     3.1      82.5
#>  8 Austria         72.3 47300     3        80.9
#>  9 Azerbaijan      63.6  4132.   21.9      72.0
#> 10 Bahrain         68.5 23655.    6.4      76.9
#> # ... with 57 more rows

Calculate ratios needed to scale the two y-axes

scale_ratio <- (max(mydata$Infant, na.rm = TRUE) - min(mydata$Infant, na.rm = TRUE)) /
  (max(mydata$Longevity, na.rm = TRUE) - min(mydata$Longevity, na.rm = TRUE))

dif <-  min(mydata$Longevity, na.rm = TRUE) - min(mydata$Infant, na.rm = TRUE)

myColor <- c("#d95f02", "#1b9e77")

p <- ggplot(mydata, aes(x = Score, y = Longevity)) + 
  geom_point(aes(colour = "Life Expectancy"), 
             shape = "triangle",
             alpha = 0.7, size = 2) +
  geom_point(aes(y = Infant/scale_ratio + dif, 
                 colour = "Infant mortality (per capita)"), 
             alpha = 0.7, size = 2) +
  scale_y_continuous(sec.axis = sec_axis(~ (. - dif) * scale_ratio, 
                                         name = "Infant mortality (per capita)")) +
  scale_colour_manual(values = myColor) +
  theme_bw(base_size = 14) +
  labs(y = "Life Expectancy (years)", 
       x = "Score",
       colour = " ") +
  guides(colour = guide_legend(title = "",
                              override.aes = list(shape = c("circle", "triangle")))) +
  theme(legend.position = 'bottom') +
  NULL
p

Add fitted lines and their corresponding equations/R2

### https://docs.r4photobiology.info/ggpmisc/articles/user-guide.html
library(ggpmisc)

formula <- y ~ poly(x, 2, raw = TRUE)

p + 
  stat_smooth(aes(y = Longevity), 
              method = "lm", formula = formula, se = FALSE, size = 1, color = myColor[2]) +
  stat_smooth(aes(y = Infant/scale_ratio + dif), 
              method = "lm", formula = formula, se = FALSE, size = 1, color = myColor[1]) +
  stat_poly_eq(aes(y = Longevity,
                   label =  paste(..eq.label.., ..adj.rr.label.., 
                                  sep = "~~italic(\"with\")~~")),
               geom = "text", alpha = 0.7,
               formula = formula, parse = TRUE, 
               color = myColor[2],
               label.x.npc = 0.5,
               label.y.npc = 0.95) +
  stat_poly_eq(aes(y = Infant/scale_ratio + dif,
                   label =  paste(..eq.label.., ..adj.rr.label.., 
                                  sep = "~~italic(\"with\")~~")),
               geom = "text", alpha = 0.7,
               color = myColor[1],
               formula = formula, parse = TRUE,                
               label.x.npc = 0.75,
               label.y.npc = 0.15) +
  NULL

Created on 2018-10-07 by the reprex package (v0.2.1.9000)



来源:https://stackoverflow.com/questions/52681895/ggplot2-issues-with-dual-y-axes-and-loess-smoothing

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!