Time-series histogram

Submitted by 倾然丶 夕夏残阳落幕 on 2019-12-03 03:37:29

Here is one possible solution using R and ggplot2.

Your data, ready to paste into the R console:

dat = structure(list(date = structure(c(15541, 15541, 15541, 15541, 
    15541, 15541, 15541, 15541, 15541, 15541, 15541, 15541, 15541, 
    15541, 15541, 15541, 15541, 15542, 15542, 15542, 15542, 15542, 
    15542, 15542, 15542, 15542, 15542, 15542, 15542, 15542, 15542, 
    15542, 15543, 15543, 15543, 15543, 15543, 15543, 15543, 15543, 
    15543, 15543, 15543, 15543, 15543, 15543, 15543, 15543, 15543, 
    15543, 15543, 15544, 15544, 15544, 15544, 15544, 15544, 15544, 
    15544, 15544, 15544, 15544, 15544, 15544, 15544, 15544, 15544, 
    15544, 15544, 15544, 15544, 15544, 15545, 15545, 15545, 15545, 
    15545, 15545, 15545, 15545, 15545, 15545, 15545, 15545, 15545, 
    15545, 15545, 15545, 15545, 15546, 15546, 15546, 15546, 15546, 
    15546, 15546, 15546, 15546, 15546, 15546, 15546, 15546, 15546, 
    15546, 15546, 15546, 15547, 15547, 15547, 15547, 15547, 15547, 
    15547, 15547, 15547, 15547, 15547, 15547, 15547, 15547, 15547, 
    15547, 15547, 15547, 15547), class = "Date"), bucket = c(800L, 
    900L, 1000L, 1100L, 1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 
    1800L, 1900L, 2000L, 2100L, 2200L, 2300L, 2400L, 800L, 900L, 
    1000L, 1100L, 1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 1800L, 
    1900L, 2000L, 2100L, 2200L, 900L, 1000L, 1100L, 1200L, 1300L, 
    1400L, 1500L, 1600L, 1700L, 1800L, 1900L, 2000L, 2100L, 2200L, 
    2300L, 2400L, 2500L, 2600L, 2800L, 800L, 900L, 1000L, 1100L, 
    1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 1800L, 1900L, 2000L, 
    2100L, 2200L, 2300L, 2400L, 2500L, 2600L, 2700L, 2800L, 800L, 
    900L, 1000L, 1100L, 1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 
    1800L, 1900L, 2000L, 2100L, 2200L, 2300L, 2400L, 800L, 900L, 
    1000L, 1100L, 1200L, 1300L, 1400L, 1500L, 1600L, 1700L, 1800L, 
    1900L, 2000L, 2100L, 2200L, 2300L, 2400L, 1300L, 1400L, 1500L, 
    1600L, 1700L, 1800L, 1900L, 2000L, 2100L, 2200L, 2300L, 2400L, 
    2500L, 2600L, 2700L, 2800L, 2900L, 3000L, 3200L), cnt = c(119L, 
    123L, 173L, 226L, 284L, 257L, 268L, 244L, 191L, 204L, 187L, 177L, 
    164L, 125L, 140L, 109L, 103L, 123L, 165L, 237L, 278L, 338L, 306L, 
    316L, 269L, 271L, 241L, 188L, 174L, 158L, 153L, 132L, 154L, 241L, 
    246L, 300L, 305L, 301L, 292L, 253L, 251L, 214L, 189L, 179L, 159L, 
    161L, 144L, 139L, 132L, 136L, 105L, 120L, 156L, 209L, 267L, 299L, 
    316L, 318L, 307L, 295L, 273L, 283L, 229L, 192L, 193L, 170L, 164L, 
    154L, 138L, 101L, 115L, 103L, 105L, 156L, 220L, 255L, 308L, 338L, 
    318L, 255L, 278L, 260L, 235L, 230L, 185L, 145L, 147L, 157L, 109L, 
    104L, 191L, 201L, 238L, 223L, 229L, 286L, 256L, 240L, 233L, 202L, 
    180L, 184L, 161L, 125L, 110L, 101L, 132L, 117L, 124L, 154L, 167L, 
    137L, 169L, 175L, 168L, 188L, 137L, 173L, 164L, 167L, 115L, 116L, 
    118L, 125L, 104L)), .Names = c("date", "bucket", "cnt"), 
    class = "data.frame", row.names = c(NA, -125L))

Plotting code:

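library(ggplot2)

# One tile per (date, bucket) pair, shaded by count.
plot_1 = ggplot(dat, aes(x=date, y=bucket, fill=cnt)) +
         geom_tile() +
         # Light-to-dark blue gradient for low-to-high counts.
         scale_fill_continuous(low="#F7FBFF", high="#2171B5") +
         theme_bw()

# Save the plot as a 6 x 4 inch PNG.
ggsave("plot_1.png", plot_1, width=6, height=4)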

The plot might look better if your data included rows for the empty buckets, with a count of zero. Then you could change low="#F7FBFF" to low="white".
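One way to add the zero rows is sketched below in base R. It uses a small stand-in data frame with the same columns as dat, and it assumes the buckets are spaced 100 apart (which is how they look in the data above); the names dat_small, full_grid and dat_full are only illustrative.

```r
# Small stand-in for the `dat` data frame above (same column names).
dat_small <- data.frame(
  date   = as.Date(c("2012-07-20", "2012-07-20", "2012-07-21")),
  bucket = c(800L, 900L, 900L),
  cnt    = c(119L, 123L, 154L)
)

# Every date x bucket combination a complete grid would contain.
# Assumption: buckets are uniformly spaced, 100 apart.
full_grid <- expand.grid(
  date   = unique(dat_small$date),
  bucket = seq(min(dat_small$bucket), max(dat_small$bucket), by = 100L)
)

# Left-join the observed counts onto the grid; unobserved cells become NA.
dat_full <- merge(full_grid, dat_small, by = c("date", "bucket"), all.x = TRUE)

# Treat cells with no observations as zero counts.
dat_full$cnt[is.na(dat_full$cnt)] <- 0L
```

Passing dat_full instead of dat to ggplot() would then draw the empty cells too, so a white low end of the colour scale blends into the background.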

Here's a version in D3, modeled after @bdemarest's answer using ggplot2:

This version uses tiled rect elements. If you have a large dataset, you might get better performance from a pixel-based heatmap.

If you want to compute the buckets using D3, you can use d3.nest to group the data by day and by value. There's also d3.layout.histogram, but since you presumably want uniformly-spaced bins and the same bins for every day, d3.nest should be sufficient.
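For readers following the R answer, the same grouping by day and by floored value can be sketched in base R. The raw data frame, its value column and the bin width of 100 are assumptions made for illustration; none of them come from the original question.

```r
# Hypothetical raw observations, one row per measurement (not from the post).
raw <- data.frame(
  date  = as.Date(c("2012-07-20", "2012-07-20", "2012-07-20", "2012-07-21")),
  value = c(812, 897, 905, 1234)
)

bin_width <- 100  # assumed uniform bin width, shared by every day

# Floor each value to the lower edge of its bin, e.g. 897 -> 800.
raw$bucket <- floor(raw$value / bin_width) * bin_width

# Count observations per (date, bucket) pair by summing a column of ones.
binned <- aggregate(cnt ~ date + bucket,
                    data = transform(raw, cnt = 1L),
                    FUN  = sum)
```

The result has the same date, bucket and cnt columns as the data frame used for the ggplot2 version.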

One subtle consideration: I placed the tick marks on the scale between the tiles to indicate visually how the values are binned. For example, the bottom-left bucket corresponds to all values between 800 and 900 on July 20 (where July 20 is the midnight-to-midnight interval); at least, that’s what I assumed from looking at your data. This is slightly clearer than labeling the middle of the rect because it indicates that the values are floored rather than rounded.
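The same idea can be tried in R. geom_tile() centres each tile on its (date, bucket) value, so one option is to compute the tile edges explicitly and draw them with geom_rect(). This is only a sketch: the 100-wide buckets and one-day columns are assumptions, and tiles and plot_2 are illustrative names.

```r
# Stand-in rows with the same columns as `dat` above.
tiles <- data.frame(
  date   = as.Date(c("2012-07-20", "2012-07-20", "2012-07-21")),
  bucket = c(800L, 900L, 900L),
  cnt    = c(119L, 123L, 154L)
)

bin_width <- 100L  # assumed bucket width

# Explicit tile edges: each tile starts at its label and extends forward,
# which matches floored (not rounded) values.
tiles$xmin <- tiles$date
tiles$xmax <- tiles$date + 1
tiles$ymin <- tiles$bucket
tiles$ymax <- tiles$bucket + bin_width

# Draw the rectangles if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  plot_2 <- ggplot(tiles, aes(xmin = xmin, xmax = xmax,
                              ymin = ymin, ymax = ymax, fill = cnt)) +
    geom_rect() +
    theme_bw()
}
```

With edges defined this way, the axis breaks at 800, 900, and so on fall on tile boundaries rather than tile centres.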

Put your numbers in a matrix and use 'image(mat)'? That looks to be all it is. A grid. A raster. Or am I missing something?
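A minimal sketch of that matrix approach in base R follows, using a few rows copied from the data above. The variable names and the white-to-blue palette are my own choices, and cells with no data are filled with zero.

```r
# A few rows taken from the data above (same column names as `dat`).
tiles <- data.frame(
  date   = as.Date(c("2012-07-20", "2012-07-20", "2012-07-20",
                     "2012-07-21", "2012-07-21", "2012-07-22")),
  bucket = c(800L, 900L, 1000L, 800L, 900L, 900L),
  cnt    = c(119L, 123L, 173L, 123L, 165L, 154L)
)

# Date x bucket matrix of counts; combinations with no data come back as NA.
mat <- tapply(tiles$cnt, list(tiles$date, tiles$bucket), sum)
mat[is.na(mat)] <- 0

# image() puts matrix rows along x and matrix columns along y.
image(
  x    = seq_len(nrow(mat)),
  y    = as.numeric(colnames(mat)),
  z    = mat,
  col  = colorRampPalette(c("white", "#2171B5"))(12),
  xaxt = "n", xlab = "date", ylab = "bucket"
)

# Label the x axis with the dates stored in the row names.
axis(1, at = seq_len(nrow(mat)), labels = rownames(mat))
```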

There are also ways to do this with ggplot, raster, and probably other packages.
