How to convert date to format “yyyy-mm-dd” in R when input value can be of different formats

半腔热情 提交于 2021-02-04 21:22:15

问题


I can receive files where date can be in one of the two formats: "mm/dd/yyyy" or "yyyy-mm-dd". They are factors to start with. Irrespective of the date format i receive it in, i want to be able to convert them to 'Date' data type of format "yyyy-mm-dd". I have tried using

df_1$Date <- as.Date(as.character(df_1$Date), format = "%Y-%m-%d")

This works with the format "yyyy-mm-dd" but gives NA when input values are of format "mm/dd/yyyy". Similarly, other methods i have tried works for only one of these 2 formats. I need it to work for both.

Below i have posted code for creating the datasets and replicating the issue.

df_1 <- structure(list(Text.Identifier = c(4L, 5L, 7L, 1838L), Date = structure(c(2L, 
                                                                      2L, 1L, 3L), .Label = c("5/18/2016", "7/12/2015", "8/29/2016"
                                                                      ), class = "factor")), .Names = c("Text.Identifier", "Date"), class = "data.frame", row.names = c(NA, 
                                                                                                                                                                        -4L))

df_2 <- structure(list(Text.Identifier = 1:4, Date = structure(c(5L, 
                                                     5L, 5L, 1L), .Label = c("2015-07-12", "2016-05-01", "2016-05-05", 
                                                                             "2016-05-09", "2016-05-12", "2016-05-18", "2016-08-01", "2016-08-19", 
                                                                             "2016-08-29", "2016-09-20"), class = "factor")), .Names = c("Text.Identifier", 
                                                                                                                                         "Date"), row.names = c(NA, 4L), class = "data.frame")

df_1$Date <- as.Date(df_1$Date, format = "%Y-%m-%d")
df_2$Date <- as.Date(df_2$Date, format = "%Y-%m-%d")
View(df_1)
View(df_2)

回答1:


I suggest using the anydate() function from the anytime library. It is better suited to this case than lubridate's parse_date_time(), since it recognizes the dates without requiring any user input concerning the order of the entries, like ymd or dmy. Furthermore, there is no problem if the data is stored as factors.

Here's an example:

my_dates <- c("2015-07-12", "2016-05-01", "2016-05-05", "2016-05-09", 
 "2016-05-12", "2016-05-18", "2016-08-01", "2016-08-19", "2016-08-29",
 "2016-09-20", "5/18/2016", "7/12/2015", "8/29/2016")
my_dates <- as.factor(my_dates)
library(anytime)
anydate(my_dates)
# [1] "2015-07-12" "2016-05-01" "2016-05-05" "2016-05-09" "2016-05-12" "2016-05-18"
# [7] "2016-08-01" "2016-08-19" "2016-08-29" "2016-09-20" "2016-05-18" "2015-07-12"
#[13] "2016-08-29

One can verify that the class of the output is indeed Date

class(anydate(my_dates))
#[1] "Date"

whereas another posted solution yields a POSIX object with unnecessary time zone information.




回答2:


If you're sure the only two date formats are "mm/dd/yyyy" or "yyyy-mm-dd", this should work. This solution uses grep to find a forward slash in the dates, and converts those dates from "mm/dd/yyyy" to "yyyy-mm-dd".

Here is an example:

dates <- data.frame(date = c("2015-11-01", "12/12/2016", "1992-05-28", "03/05/2011"), stringsAsFactors = FALSE)

dates$date[grep("/", dates$date)] <- as.character(as.Date(dates$date[grep("/", dates$date)], "%m/%d/%Y"))



回答3:


Adding my comment as an answer so we can mark this question answered.

The lubridate package is smart enough to choose between different date separators and also different date orders.

library(lubridate)
data = c("01-12-2000", "02-11-2001", "2016-06-20", "2016-12-05")
parse_date_time(data, c('dmy', 'ymd'))

## returns
## [1] "2000-12-01 UTC" "2001-11-02 UTC" "2016-06-20 UTC" "2016-12-05 UTC"


来源:https://stackoverflow.com/questions/40942750/how-to-convert-date-to-format-yyyy-mm-dd-in-r-when-input-value-can-be-of-diffe

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!