Heatmap-like plot for three categorical variables

Posted by 不想你离开。 on 2019-12-25 06:55:32

Question


I'm working with a data frame of categorical variables in case form. It has three variables (Color, Shape and Size) plus the frequency of each combination. An example of the data frame looks like this:

   Color   Shape     Size    Freq
1  Yellow  Square    Big       10
2  Yellow  Square    Medium     6
3  Yellow  Square    Small      3
4  Yellow  Triangle  Big        4
5  Yellow  Triangle  Medium     6
6  Yellow  Triangle  Small      8
7  Red     Square    Big        2
8  Red     Square    Medium     6
9  Red     Square    Small      5
10 Red     Triangle  Big       12
.......

The "color" variable is measured against the "shape" and "size" variables, having a frequency for each case.
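To experiment with the example, the ten rows shown above can be typed in by hand. This is only a sketch: the name `df` is an assumption (chosen because the answer's code refers to a data frame called `df`), and the rows hidden behind the dots are not reconstructed.

```r
# Only the ten rows visible in the question; the elided rows are left out.
# The object name `df` is an assumption made for illustration.
df <- data.frame(
  Color = c(rep("Yellow", 6), rep("Red", 4)),
  Shape = c(rep(c("Square", "Triangle"), each = 3), rep("Square", 3), "Triangle"),
  Size  = c(rep(c("Big", "Medium", "Small"), 3), "Big"),
  Freq  = c(10L, 6L, 3L, 4L, 6L, 8L, 2L, 6L, 5L, 12L),
  stringsAsFactors = FALSE
)

# Quick structural check of the result
str(df)
```

Because the Red/Triangle group has only one visible row here, any summary computed from this subset may differ from one computed on the full data.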

From this data frame I'm struggling to create a heatmap-like plot that shows only the relation between "Color" and "Shape", with each cell filled according to the "Size" that has the highest frequency for that combination. Bit tricky, isn't it!

For example, for the "Yellow" - "Square" cases I should only display "Big", since "Big" is the size with the highest frequency. Every size should have its own fill color (e.g. "red" for Big, "green" for Medium, and "orange" for Small).

Frank


Answer 1:


How about this?

library(dplyr)
library(ggplot2)

# For each Color/Shape pair, keep only the row with the highest Freq
df_max <- df %>%
  group_by(Color, Shape) %>%
  slice(which.max(Freq))

head(df_max)
# Source: local data frame [4 x 4]
# Groups: Color, Shape [4]
# 
#    Color    Shape   Size  Freq
#    (chr)    (chr)  (chr) (int)
# 1    Red   Square Medium     6
# 2    Red Triangle    Big    12
# 3 Yellow   Square    Big    10
# 4 Yellow Triangle  Small     8

# One tile per Color/Shape pair, filled by the winning Size
ggplot(df_max, aes(x = Color, y = Shape, fill = Size)) +
  geom_tile()
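The plot above uses ggplot2's default fill palette, while the question asks for specific colors per size. One possible way to get them is `scale_fill_manual()` with a named vector. The following is a self-contained sketch under some assumptions: the data frame holds only the ten rows visible in the question, and the names `size_colours` and `p` are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Only the ten rows visible in the question; the elided rows are left out.
df <- data.frame(
  Color = c(rep("Yellow", 6), rep("Red", 4)),
  Shape = c(rep(c("Square", "Triangle"), each = 3), rep("Square", 3), "Triangle"),
  Size  = c(rep(c("Big", "Medium", "Small"), 3), "Big"),
  Freq  = c(10L, 6L, 3L, 4L, 6L, 8L, 2L, 6L, 5L, 12L),
  stringsAsFactors = FALSE
)

# Same selection as in the answer: the highest-Freq row per Color/Shape pair
df_max <- df %>%
  group_by(Color, Shape) %>%
  slice(which.max(Freq)) %>%
  ungroup()

# Colors requested in the question; the names must match the Size values
# (`size_colours` is a hypothetical name)
size_colours <- c(Big = "red", Medium = "green", Small = "orange")

p <- ggplot(df_max, aes(x = Color, y = Shape, fill = Size)) +
  geom_tile(colour = "white") +              # white borders between tiles
  scale_fill_manual(values = size_colours)   # fixed color per Size

print(p)
```

Note that `which.max()` returns the position of the first maximum, so if two sizes tie within a Color/Shape pair, the one that appears earlier in the data is kept.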



Source: https://stackoverflow.com/questions/32851208/heatmap-like-plot-for-three-categorical-variables
