how to avoid overlapping labels with identical data points in scatterplot / ggplot?

最后都变了- 提交于 2019-12-21 04:58:11

问题


Is there any function etc which avoids overlapping data labels for identical data points in a scatter plot? I have checked the various questions/responses to textxy, direct.label, and geom_text(), but I haven't been successful. Maybe it's simply not possible.

Here's a sample of the relevant data:

structure(list(cowc = structure(c(5L, 7L, 24L, 24L, 23L, 36L, 
34L, 38L, 23L, 6L, 8L, 38L, 38L, 23L, 5L, 7L, 24L, 24L, 23L, 
36L, 34L, 38L, 23L, 6L, 8L, 38L, 38L, 23L), .Label = c("AFG", 
"ANG", "AZE", "BNG", "BOS", "BUI", "CAM", "CDI", "CHA", "COL", 
"CRO", "DOM", "DRC", "ETH", "GNB", "GRG", "GUA", "IND", "INS", 
"IRQ", "KEN", "LAO", "LBR", "LEB", "MAL", "MLD", "MZM", "NEP", 
"NIC", "PHI", "PNG", "RUS", "RWA", "SAF", "SAL", "SIE", "SOM", 
"SUD", "TAJ", "UKG", "YAR", "ZIM"), class = "factor"), conflict = c("Bosnia 92-95", 
"Cambodia 70-91", "Lebanon 58-58", "Lebanon 75-89", "Liberia 89-93", 
"SieLeo 91-96", "Stafrica 83-91", "Sudan 63-72", "Liberia 94-96", 
"Burundi 1993-2005", "Cote d'Ivoire 2002-2007", "Darfur, Sudan 2003-2010", 
"Sudan 83-05", "Liberia 1999-2003", "Bosnia 92-95", "Cambodia 70-91", 
"Lebanon 58-58", "Lebanon 75-89", "Liberia 89-93", "SieLeo 91-96", 
"Stafrica 83-91", "Sudan 63-72", "Liberia 94-96", "Burundi 1993-2005", 
"Cote d'Ivoire 2002-2007", "Darfur, Sudan 2003-2010", "Sudan 83-05", 
"Liberia 1999-2003"), totalps = c(3L, 2L, 2L, 2L, 1L, 3L, 4L, 
3L, 1L, 3L, 3L, 4L, 3L, 3L, 3L, 2L, 2L, 2L, 1L, 3L, 4L, 3L, 1L, 
3L, 3L, 4L, 3L, 3L), vetotype = structure(c(1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("strictvetos", "lenientvetos"
), class = "factor"), intensity = c(3L, 4L, 2L, 5L, 2L, 2L, 2L, 
2L, 2L, 3L, 2L, 2L, 2L, 2L, 3L, 4L, 2L, 6L, 2L, 2L, 4L, 2L, 2L, 
3L, 3L, 2L, 2L, 2L)), .Names = c("cowc", "conflict", "totalps", 
"vetotype", "intensity"), class = "data.frame", row.names = c(NA, 
-28L))

Here's my code:

vetotype.plot <- ggplot(vetotype.x, aes(x=totalps, y=intensity, color=conflict))+
      geom_point() + 
      labs(x="number of power-sharing arenas", y="intensity") +
      ggtitle("Number of Power-Sharing areas and Veto intensity") +
      geom_text(aes(label=conflict),hjust=0, vjust=0, size=4)+
      scale_x_continuous(limits=c(1, 5))+
      theme(legend.position="none")+
      facet_wrap(~vetotype, nrow=2)

plot(vetotype.plot)

And below is my graph. I manually highlighted those data points which are overlapping.

What I am looking for is an 'automatic' way to get the labels of the overlapping data points displayed in way so that they don't overlap. Is there any function for this purpose? Many thanks!


回答1:


This is not a completely general solution, but it does seem to work in your case.

library(ggplot2)
# identify duplicated points
dupes <- aggregate(conflict~totalps+intensity+vetotype,vetotype.x,length)
colnames(dupes)[4] = "dupe"
df <- merge(vetotype.x,dupes)   # add dupe column
df$vjust <- 0                   # default vertical offset is 0
# calculate vertical offsets based on number of dupes
for (i in 2:max(df$dupe)) df[df$dupe==i,]$vjust<-seq(-trunc(i/2),-trunc(i/2)+i-1)
# render the plot
vetotype.plot <- ggplot(df, aes(x=totalps, y=intensity, color=conflict))+
  geom_point() + 
  labs(x="number of power-sharing arenas", y="intensity") +
  ggtitle("Number of Power-Sharing areas and Veto intensity") +
  geom_text(aes(label=conflict,vjust=vjust), hjust=0,size=4)+
  scale_x_continuous(limits=c(1, 5))+
  scale_y_continuous(limits=c(1, 6))+
  theme(legend.position="none")+
  facet_wrap(~vetotype, nrow=2)

plot(vetotype.plot)




回答2:


ggrepel can now do this easily:

https://twitter.com/slowkow/status/686341190749392896




回答3:


Here's what your plot looks like with ggrepel:

library(ggrepel)

ggplot(vetotype.x, aes(x=totalps, y=intensity, color=conflict))+
  geom_point() + 
  labs(x="number of power-sharing arenas", y="intensity") +
  ggtitle("Number of Power-Sharing areas and Veto intensity") +
  geom_text_repel(
    aes(label=conflict), size=4, box.padding = unit(0.5, "lines")
  )+
  scale_x_continuous(limits=c(1, 5))+
  theme(legend.position="none")+
  facet_wrap(~vetotype, nrow=2)



来源:https://stackoverflow.com/questions/21585128/how-to-avoid-overlapping-labels-with-identical-data-points-in-scatterplot-ggpl

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!