I have a dataset of about 105,000 rows and 30 columns. It has a categorical variable that I would like to map to a number. In Excel, I would probably do something with
You could use mapvalues() from the plyr package.
Initial data:
dat <- data.frame(HouseType = c("Semi", "Single", "Row", "Single", "Apartment", "Apartment", "Row"))
> dat
  HouseType
1      Semi
2    Single
3       Row
4    Single
5 Apartment
6 Apartment
7       Row
Lookup / crosswalk table:
lookup <- data.frame(type_text = c("Semi", "Single", "Row", "Apartment"), type_num = c(1, 2, 3, 4))
> lookup
  type_text type_num
1      Semi        1
2    Single        2
3       Row        3
4 Apartment        4
Create the new variable:
dat$house_type_num <- plyr::mapvalues(dat$HouseType, from = lookup$type_text, to = lookup$type_num)
Or for simple replacements you can skip creating a long lookup table and do this directly in one step:
dat$house_type_num <- plyr::mapvalues(dat$HouseType,
                                      from = c("Semi", "Single", "Row", "Apartment"),
                                      to = c(1, 2, 3, 4))
Result:
> dat
  HouseType house_type_num
1      Semi              1
2    Single              2
3       Row              3
4    Single              2
5 Apartment              4
6 Apartment              4
7       Row              3
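One caveat: as far as I can tell, mapvalues() keeps the type of its input, so house_type_num ends up as character (or factor, if HouseType was a factor) even though it prints like a number. If you need a genuinely numeric column, here is a minimal base R sketch using match() against the same lookup table. The stringsAsFactors = FALSE arguments are my addition, so that it behaves the same on older R versions:

```r
# Same example data as above; stringsAsFactors = FALSE keeps the text columns as character
dat <- data.frame(HouseType = c("Semi", "Single", "Row", "Single", "Apartment", "Apartment", "Row"),
                  stringsAsFactors = FALSE)
lookup <- data.frame(type_text = c("Semi", "Single", "Row", "Apartment"),
                     type_num = c(1, 2, 3, 4),
                     stringsAsFactors = FALSE)

# match() returns, for each HouseType, its row position in the lookup table;
# indexing type_num by that position yields a numeric vector.
# Any value that is missing from the lookup table becomes NA.
dat$house_type_num <- lookup$type_num[match(dat$HouseType, lookup$type_text)]

str(dat$house_type_num)
#  num [1:7] 1 2 3 2 4 4 3
```

If you would rather stay with mapvalues(), wrapping its result in as.numeric(as.character(...)) should give the same numeric column.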