How do I split a data frame based on range of column values in R?

时间秒杀一切 submitted on 2019-12-01 04:44:37

You can combine split with cut to do this in a single line of code, avoiding the need to subset with a bunch of different expressions for different data ranges:

split(dat, cut(dat$Age, c(0, 5, 10, 15), include.lowest=TRUE))
# $`[0,5]`
#   Users Age
# 1     1   2
# 4     4   3
# 
# $`(5,10]`
#   Users Age
# 2     2   7
# 3     3  10
# 5     5   8
# 
# $`(10,15]`
# [1] Users Age  
# <0 rows> (or 0-length row.names)

cut splits up data based on the specified break points, and split splits up a data frame based on the provided categories. If you stored the result of this computation into a list called l, you could access the smaller data frames with l[[1]], l[[2]], and l[[3]] or the more verbose:

l$`[0,5]`
l$`(5,10]`
l$`(10,15]`
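
For a version that runs from start to finish, here is a minimal sketch. The question's data frame is not reproduced in this answer, so the values of dat are an assumption reconstructed from the output printed above:

```r
# Assumed data, reconstructed from the printed output above
dat <- data.frame(Users = 1:5, Age = c(2, 7, 10, 3, 8))

# cut() labels each row with an age bracket; split() groups the rows by that label
l <- split(dat, cut(dat$Age, c(0, 5, 10, 15), include.lowest = TRUE))

names(l)          # the bracket labels, exactly as they must be typed after l$
l[["(5,10]"]]     # the same data frame as l[[2]]
sapply(l, nrow)   # number of rows in each bracket, including the empty one
```

The labels produced by cut contain no spaces, so copying them from names(l) is the safest way to get them right.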

First, here's the dataset I'll use for this answer: foo=data.frame(Users=1:6,Age=c(2,7,10,3,8,20))

Here's your first dataset with ages 0–5: subset(foo,Age<=5&Age>=0)

  Users Age
1     1   2
4     4   3

Here's your second with ages 6–10: subset(foo,Age<=10&Age>=6)

  Users Age
2     2   7
3     3  10
5     5   8

Your third (using subset(foo,Age<=15&Age>=11)) is empty – your last Age observation is over 15.

Note also that fractional ages between 5 and 6 or between 10 and 11 (e.g., 5.1, 10.5) would fall into none of these groups, because this code follows your question very literally. If you want anyone younger than 6 to go in the first group, change that group's code to subset(foo,Age<6&Age>=0). If you'd rather put a hypothetical person with Age=5.1 in the second group, that group's code would be subset(foo,Age<=10&Age>5).
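
The same boundary choice can also be made in one call by combining split with cut and its right argument. This is only a sketch: foo2 and its extra row with Age 5.1 are hypothetical, added purely to show where a fractional age ends up.

```r
foo <- data.frame(Users = 1:6, Age = c(2, 7, 10, 3, 8, 20))

# Hypothetical extra user with a fractional age, for illustration only
foo2 <- rbind(foo, data.frame(Users = 7, Age = 5.1))

# Default right = TRUE: groups [0,5], (5,10], (10,15], so 5.1 joins the second group
by_right <- split(foo2, cut(foo2$Age, c(0, 5, 10, 15), include.lowest = TRUE))

# right = FALSE with breaks 0, 6, 11, 16: groups [0,6), [6,11), [11,16],
# so 5.1 joins the first group
by_left <- split(foo2, cut(foo2$Age, c(0, 6, 11, 16),
                           right = FALSE, include.lowest = TRUE))

by_right[[2]]$Users   # users in the second group under the first scheme
by_left[[1]]$Users    # users in the first group under the second scheme
```

In both cases the row with Age 20 lies outside the break points, so cut returns NA for it and split leaves it out of every group.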
