I currently have three separate columns (year, month, and day) in a data file in R. How do I merge these three columns into a single column that R recognizes as a date?
Try:
df$date <- as.Date(with(df, paste(year, mon, day,sep="-")), "%Y-%m-%d")
df$date
#[1] "1947-01-01" "1947-04-01" "1947-07-01" "1947-10-01" "1948-01-01"
#[6] "1948-04-01"
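For anyone who wants to try this without the original file, here is a minimal sketch of the same base R approach. The sample values are made up for illustration and only loosely mirror the output above; the column names year, mon, and day are taken from the answer.

```r
# Hypothetical sample data; column names follow the answer above.
df <- data.frame(
  year = c(1947, 1947, 1947, 1947, 1948, 1948),
  mon  = c(1, 4, 7, 10, 1, 4),
  day  = 1  # a single value is recycled to every row
)

# paste() builds strings such as "1947-1-1";
# as.Date() then parses them using the given format.
df$date <- as.Date(paste(df$year, df$mon, df$day, sep = "-"),
                   format = "%Y-%m-%d")

class(df$date)  # "Date"
```

Once the column has class Date, sorting, differences, and seq() by month all work on it directly.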
Or you could use the lubridate package, which makes working with dates and times in R much easier in general.
For example:
library(lubridate)
df$date <- with(df, ymd(sprintf('%04d%02d%02d', year, mon, day)))
df$date
# [1] "1947-01-01 UTC" "1947-04-01 UTC" "1947-07-01 UTC" "1947-10-01 UTC"
# [5] "1948-01-01 UTC" "1948-04-01 UTC"
The ymd function takes a string representing year, month, and day, such as "19470101", "1947-01-01", or "1947/01/01". There are also mdy and dmy for when the elements are ordered differently. You can also optionally specify a time zone.
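A short sketch of those parsers in action, assuming lubridate is installed. The input strings are illustrative only. Note that, depending on the lubridate version, ymd() without a tz argument may return a plain Date rather than the UTC date-time shown in the output above.

```r
library(lubridate)

# The same calendar date written with three different separator styles
ymd(c("19470101", "1947-01-01", "1947/01/01"))

# The same date with the components in a different order
mdy("01/04/1947")  # month, day, year: 4 January 1947
dmy("04/01/1947")  # day, month, year: 4 January 1947

# Supplying tz asks for a date-time (POSIXct) in that zone
ymd("1947-01-01", tz = "UTC")
```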
There is also a simpler solution using lubridate and magrittr:
library(lubridate); library(magrittr)  # ymd() and the %>% pipe
df$date <- paste(df$year, df$mon, df$day, sep="-") %>% ymd() %>% as.Date()
This worked for me even though my days and months were written with a mix of single digits (e.g. 1) and double digits (e.g. 01). Everything was parsed correctly.
Since your year, month, and day columns are numeric, the make_date function from the lubridate package is a good fit, because it builds dates directly from numeric components without pasting strings together. A tidyverse-style solution is therefore:
library(tidyverse)
library(lubridate)
data %>%
  mutate(date = make_date(year, month, day))
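Here is a self-contained sketch of the make_date approach, assuming dplyr and lubridate are installed. The sample values are hypothetical; the column names year, month, and day follow this answer. Unlike the snippet above, it assigns the result back so the new column is kept.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample data with numeric year, month, and day columns
data <- data.frame(
  year  = c(1947, 1947, 1948),
  month = c(1, 4, 1),
  day   = c(1, 1, 1)
)

# make_date() combines the numeric components row by row;
# assigning the result keeps the new date column.
data <- data %>%
  mutate(date = make_date(year, month, day))

class(data$date)  # "Date"
```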