问题
I can receive files where date can be in one of the two formats: "mm/dd/yyyy" or "yyyy-mm-dd". They are factors to start with. Irrespective of the date format i receive it in, i want to be able to convert them to 'Date' data type of format "yyyy-mm-dd". I have tried using
df_1$Date <- as.Date(as.character(df_1$Date), format = "%Y-%m-%d")
This works with the format "yyyy-mm-dd" but gives NA when input values are of format "mm/dd/yyyy". Similarly, other methods i have tried works for only one of these 2 formats. I need it to work for both.
Below i have posted code for creating the datasets and replicating the issue.
df_1 <- structure(list(Text.Identifier = c(4L, 5L, 7L, 1838L), Date = structure(c(2L,
2L, 1L, 3L), .Label = c("5/18/2016", "7/12/2015", "8/29/2016"
), class = "factor")), .Names = c("Text.Identifier", "Date"), class = "data.frame", row.names = c(NA,
-4L))
df_2 <- structure(list(Text.Identifier = 1:4, Date = structure(c(5L,
5L, 5L, 1L), .Label = c("2015-07-12", "2016-05-01", "2016-05-05",
"2016-05-09", "2016-05-12", "2016-05-18", "2016-08-01", "2016-08-19",
"2016-08-29", "2016-09-20"), class = "factor")), .Names = c("Text.Identifier",
"Date"), row.names = c(NA, 4L), class = "data.frame")
df_1$Date <- as.Date(df_1$Date, format = "%Y-%m-%d")
df_2$Date <- as.Date(df_2$Date, format = "%Y-%m-%d")
View(df_1)
View(df_2)
回答1:
I suggest using the anydate()
function from the anytime
library. It is better suited to this case than lubridate
's parse_date_time()
, since it recognizes the dates without requiring any user input concerning the order of the entries, like ymd
or dmy
. Furthermore, there is no problem if the data is stored as factors.
Here's an example:
my_dates <- c("2015-07-12", "2016-05-01", "2016-05-05", "2016-05-09",
"2016-05-12", "2016-05-18", "2016-08-01", "2016-08-19", "2016-08-29",
"2016-09-20", "5/18/2016", "7/12/2015", "8/29/2016")
my_dates <- as.factor(my_dates)
library(anytime)
anydate(my_dates)
# [1] "2015-07-12" "2016-05-01" "2016-05-05" "2016-05-09" "2016-05-12" "2016-05-18"
# [7] "2016-08-01" "2016-08-19" "2016-08-29" "2016-09-20" "2016-05-18" "2015-07-12"
#[13] "2016-08-29
One can verify that the class of the output is indeed Date
class(anydate(my_dates))
#[1] "Date"
whereas another posted solution yields a POSIX object with unnecessary time zone information.
回答2:
If you're sure the only two date formats are "mm/dd/yyyy" or "yyyy-mm-dd", this should work. This solution uses grep to find a forward slash in the dates, and converts those dates from "mm/dd/yyyy" to "yyyy-mm-dd".
Here is an example:
dates <- data.frame(date = c("2015-11-01", "12/12/2016", "1992-05-28", "03/05/2011"), stringsAsFactors = FALSE)
dates$date[grep("/", dates$date)] <- as.character(as.Date(dates$date[grep("/", dates$date)], "%m/%d/%Y"))
回答3:
Adding my comment as an answer so we can mark this question answered.
The lubridate
package is smart enough to choose between different date separators and also different date orders.
library(lubridate)
data = c("01-12-2000", "02-11-2001", "2016-06-20", "2016-12-05")
parse_date_time(data, c('dmy', 'ymd'))
## returns
## [1] "2000-12-01 UTC" "2001-11-02 UTC" "2016-06-20 UTC" "2016-12-05 UTC"
来源:https://stackoverflow.com/questions/40942750/how-to-convert-date-to-format-yyyy-mm-dd-in-r-when-input-value-can-be-of-diffe