R histogram showing time spent in each bin

Posted by 谁都会走 on 2019-12-06 11:39:03

Question


I'm trying to create a plot similar to the ones here:

Basically I want a histogram where each bin shows how much time was spent in that range of cadence (e.g. 1 hour at 0-20 rpm, 3 hours at 21-40 rpm, etc.).

library("rjson") # 3rd party library, so: install.packages("rjson")

# Load data from Strava API.
# Ride used for example is http://app.strava.com/rides/13542320
url <- "http://app.strava.com/api/v1/streams/13542320?streams[]=cadence,time"
# collapse="" joins all lines into a single JSON string before parsing
d <- fromJSON(paste(readLines(url), collapse=""))

Each value in d$cadence (rpm) is paired with the same index in d$time (the number of seconds from the start).

The samples are not necessarily evenly spaced in time (as can be seen if you compare plot(x=d$time, y=d$cadence, type='l') with plot(d$cadence, type='l')).

If I do the simplest possible thing:

hist(d$cadence)

...this produces something very close, but the Y axis shows "frequency" (the number of samples) instead of time, and it ignores the time between data points (so the 0 rpm segment in particular will be underrepresented).
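To make the mismatch concrete, here is a minimal sketch using a made-up stream (the `time` and `cadence` values below are hypothetical, not taken from the Strava ride). It assumes samples arrive every second while pedalling and that there is one 7-second gap during a stop. The histogram counts add up to the number of samples, not to the length of the ride.

```r
# Hypothetical stream: dense samples while pedalling, one long gap during a stop.
time    <- c(0, 1, 2, 3, 10, 11, 12)     # seconds from start
cadence <- c(80, 85, 90, 0, 0, 85, 90)   # rpm

# Compute the histogram without drawing it.
h <- hist(cadence, breaks = seq(0, 100, by = 20), plot = FALSE)

total_samples <- sum(h$counts)           # what hist() adds up: data points
ride_seconds  <- max(time) - min(time)   # what we actually want to distribute
gaps          <- diff(time)              # time between consecutive samples

print(total_samples)
print(ride_seconds)
print(gaps)
```

The uneven values in `gaps` are the information that hist(d$cadence) discards.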


Answer 1:


You need to create a new column to account for the time between samples.

I prefer data.frames to lists for this kind of thing, so:

# Read the stream into a data frame: one row per sample.
d <- as.data.frame(fromJSON(paste(readLines(url), collapse="")))
# Seconds elapsed since the previous sample; the first sample gets 0.
d$sample.time <- c(0, diff(d$time))

Now that you have the sample times, you can simply "repeat" each cadence measurement once per second of its sample time and plot a histogram of the result:

hist(rep(x=d$cadence, times=d$sample.time),
     main="Histogram of Cadence", xlab="Cadence (RPM)",
     ylab="Time (presumably seconds)")

There's bound to be a more elegant solution that wouldn't fall apart for non-integer sample times, but this works with your sample data.
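The reason non-integer sample times are a problem is that rep() truncates fractional `times` values towards zero, so part of the elapsed time is silently dropped. A small sketch with hypothetical values (not from the ride data):

```r
# Hypothetical samples with fractional intervals, in seconds.
cadence     <- c(60, 90)
sample.time <- c(1.9, 0.5)

# rep() truncates each entry of times: 1.9 is used as 1, 0.5 as 0.
expanded <- rep(cadence, times = sample.time)

print(expanded)
print(length(expanded))   # seconds represented after truncation
print(sum(sample.time))   # seconds actually elapsed
```

Rounding the intervals, or rescaling them to a finer unit such as tenths of a second before calling rep(), reduces the loss but does not remove it.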

EDIT: Regarding a more elegant, generalized solution: you can handle non-integer sample times with something like new.d <- aggregate(sample.time~cadence, data=d, FUN=sum), which gives the total time spent at each distinct cadence value. The problem then becomes plotting a histogram of something that looks like a frequency table but has non-integer frequencies. After some poking around, my conclusion is that you have to roll your own histogram in this case: aggregate the data further into bins, then display the bin totals with a bar chart.
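One way the roll-your-own approach could look is sketched below. This is an illustration rather than the answerer's code: the data frame is made up, the 20 rpm bin width is an arbitrary choice, and cut(), tapply() and barplot() are just one combination of base R functions that does the job.

```r
# Hypothetical data with non-integer sample intervals (seconds).
d <- data.frame(
  time    = c(0, 0.5, 1.5, 4.0, 4.5, 6.0),
  cadence = c(0, 10, 35, 0, 62, 78)
)
d$sample.time <- c(0, diff(d$time))

# Assign each sample to a cadence bin (assumed bin width: 20 rpm).
breaks <- seq(0, 80, by = 20)
d$bin  <- cut(d$cadence, breaks = breaks, include.lowest = TRUE)

# Total time per bin; bins with no samples come back as NA, so set them to 0.
time.per.bin <- tapply(d$sample.time, d$bin, sum)
time.per.bin[is.na(time.per.bin)] <- 0

# space = 0 makes the bars touch, so the chart reads like a histogram.
barplot(time.per.bin, space = 0,
        main = "Time spent per cadence range",
        xlab = "Cadence (RPM)", ylab = "Time (seconds)")
```

Because the bin totals are plain sums, they add up to the total elapsed time regardless of whether the intervals are integers.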



Source: https://stackoverflow.com/questions/11529146/r-histogram-showing-time-spent-in-each-bin
