I have a set of data in which I need to code values of certain variables (numeric) into 3 classes.
My data set is similar to this but has 60 more variables:
I think Greg's answers cover "standard operating procedure", but I find many uses for the findInterval function as well. For each value it returns the number of the interval, as defined by the break points in the second argument, into which that value falls.
data$int <- findInterval(data$wt, c(179, 200, 300, Inf))
data
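As a rough sketch of what findInterval returns, the snippet below uses the weights from the question's example data. The label names in class_labels are hypothetical and only illustrate how the interval number can be used as an index.

```r
# Weights from the question's example data
wt <- c(181, 179, 180.5, 201, 201.5, 245, 246.4,
        189.3, 301, 354, 369, 205, 199, 394, 231.3)

# Each break is the lower bound of a class; intervals are closed on the left
breaks <- c(179, 200, 300, Inf)
int <- findInterval(wt, breaks)
int
# 1 1 1 2 2 2 2 1 3 3 3 2 1 3 2

# A value below the first break is reported as interval 0
below <- findInterval(150, breaks)
below
# 0

# The interval number can index a vector of labels (hypothetical names)
class_labels <- c("light", "medium", "heavy")
labelled <- class_labels[int]
head(labelled, 4)
# "light" "light" "light" "medium"
```

Because values below the first break come back as 0, it is worth checking the range of each variable before using the result as an index.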
Just for completeness and info, the classInt package (on CRAN) is another handy way to classify numbers into classes.
Just to show an alternative method (similar to recode in SPSS), using recode from package car:
> library(car)
> data$SWT <- with(data, recode(wt, "lo:200=1; 300:hi=3; else=2"))
> data
anim wt SWT
1 1 181.0 1
2 2 179.0 1
3 3 180.5 1
4 4 201.0 2
5 5 201.5 2
6 6 245.0 2
7 7 246.4 2
8 8 189.3 1
9 9 301.0 3
10 10 354.0 3
11 11 369.0 3
12 12 205.0 2
13 13 199.0 1
14 14 394.0 3
15 15 231.3 2
The cut method as outlined by @Greg is probably what you want here. One thing to note is that cut returns a factor by default, which you can suppress by supplying labels = FALSE to return the integer values:
cut(data$wt, c(178, 200, 300, Inf), labels = FALSE)
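One detail worth checking with cut is which side of each interval is closed. The sketch below uses a few made-up edge values (not from the question's data) to show how the right argument changes the class assigned to values that sit exactly on a break.

```r
breaks <- c(178, 200, 300, Inf)
edge <- c(199, 200, 201, 300)   # illustrative values on and around the breaks

# Default right = TRUE: intervals are (178,200], (200,300], (300,Inf]
closed_right <- cut(edge, breaks, labels = FALSE)
closed_right
# 1 1 2 2

# right = FALSE: intervals are [178,200), [200,300), [300,Inf)
closed_left <- cut(edge, breaks, labels = FALSE, right = FALSE)
closed_left
# 1 2 2 3
```

With right = FALSE the classes match conditions of the form wt >= 200 & wt < 300.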
Alternatively, if your cutting does not lend itself to natural breaks, you can use ifelse(). You can "nest" the ifelse statements similar to Excel. I use "with" to cut down on the typing needed:
data$group2 <- with(data, ifelse(wt >= 179 & wt < 200, 1,
                          ifelse(wt >= 200 & wt < 300, 2, 3)))
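Note that the final "else" branch of a nested ifelse catches everything not matched earlier, which includes values below the lowest bound. The sketch below uses made-up weights to show this. The second variant, which returns NA for values under 179, is one possible way to handle that case and is an assumption rather than part of the original answer.

```r
wt <- c(150, 181, 250, 350)   # illustrative values; 150 lies below the lowest bound

# Nested ifelse as above: anything not in class 1 or 2 ends up in class 3
fallthrough <- ifelse(wt >= 179 & wt < 200, 1,
               ifelse(wt >= 200 & wt < 300, 2, 3))
fallthrough
# 3 1 2 3

# Variant (assumption): mark values below 179 as missing instead
explicit <- ifelse(wt < 179, NA_real_,
            ifelse(wt < 200, 1,
            ifelse(wt < 300, 2, 3)))
explicit
# NA 1 2 3
```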
You can try cut:
anim <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
wt <- c(181,179,180.5,201,201.5,245,246.4,
        189.3,301,354,369,205,199,394,231.3)
data <- data.frame(anim,wt)
EDIT: fixed group - right = FALSE, got rid of split example.
group = cut(data$wt, c(178, 200, 300, Inf), right=FALSE)
data$swt = as.numeric(group)
data
anim wt swt
1 1 181.0 1
2 2 179.0 1
3 3 180.5 1
4 4 201.0 2
5 5 201.5 2
6 6 245.0 2
7 7 246.4 2
8 8 189.3 1
9 9 301.0 3
10 10 354.0 3
11 11 369.0 3
12 12 205.0 2
13 13 199.0 1
14 14 394.0 3
15 15 231.3 2
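Since the question mentions many more variables, the same cut call can be applied to several columns at once. This is a minimal sketch under assumptions: the column wt2, the vector vars, and the "s" prefix for the new columns are hypothetical, and it assumes all variables share the same breaks.

```r
# Small example data frame; wt2 is a hypothetical second variable
data <- data.frame(
  anim = 1:5,
  wt   = c(181, 201, 301, 199, 394),
  wt2  = c(180, 250, 350, 299, 200)
)

breaks <- c(178, 200, 300, Inf)
vars <- c("wt", "wt2")   # names of the columns to classify

# lapply returns a list with one integer vector of class codes per column
coded <- lapply(data[vars], cut, breaks = breaks, labels = FALSE, right = FALSE)

# Store the codes in new columns named swt, swt2
data[paste0("s", vars)] <- coded
data
```

If different variables need different breaks, a named list of break vectors combined with Map would be one way to extend this.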