I have a large data set with many columns containing dates in two different formats:
"1996-01-04" "1996-01-05" "1996-01-08" "1996-01-09" "1996-01-10"
If your dataset contains duplicated date values, one approach is to build a de-duplicated reference table, convert only the unique values, and then map the results back onto the full column. This is faster than converting the date fields on all records, because each distinct date string is parsed only once.
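To check the speed claim on your own data, here is a rough timing sketch. The test vector `dates` is hypothetical (many rows, only 365 distinct values), and the actual timings will depend on your machine and on how many duplicates your data contains:

```r
# Hypothetical test vector: 200,000 strings drawn from only 365 distinct dates
set.seed(1)
dates <- format(as.Date("1996-01-01") + sample(0:364, 2e5, replace = TRUE),
                "%d/%m/%Y")

# Direct conversion: parses every element
t_direct <- system.time(direct <- as.Date(dates, format = "%d/%m/%Y"))

# De-duplicated conversion: parse unique values, then map back with match()
t_dedup <- system.time({
  uniq  <- unique(dates)
  dedup <- as.Date(uniq, format = "%d/%m/%Y")[match(dates, uniq)]
})

# Both approaches should produce the same dates
all(direct == dedup)
print(t_direct)
print(t_dedup)
```

The gain shrinks as the share of unique values grows, so this is mainly worthwhile when dates repeat heavily.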
Data
df <- data.frame(
  X1 = c("1996-01-04", "1996-01-05", "1996-01-08", "1996-01-09", "1996-01-10", rep("1996-01-11", 100)),
  X2 = c("02/01/1996", "03/01/1996", "04/01/1996", "05/01/1996", "08/01/1996", rep("09/01/1996", 100)),
  stringsAsFactors = FALSE)
Create unique Date rows for mapping
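date_mapping <- function(date_col){
  # Reference table holding each distinct date string once
  ref_df <- data.frame(date1 = unique(date_col), stringsAsFactors = FALSE)
  # If every value contains "/", parse as dd/mm/yyyy; otherwise as yyyy-mm-dd
  if(all(grepl("/", ref_df$date1))) {
    ref_df$date2 <- as.Date(ref_df$date1, format = "%d/%m/%Y")
  } else {
    ref_df$date2 <- as.Date(ref_df$date1)
  }
  # Look up the converted date for every original value
  date_col_mapped <- ref_df[match(date_col, ref_df$date1), "date2"]
  return(date_col_mapped)
}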
date_mapping(df$X1)
date_mapping(df$X2)
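Since the question mentions many columns, the same idea can be applied to every column in one pass. This is a minimal sketch, assuming each column is a character vector in exactly one of the two formats; `convert_dates` and `df_demo` are hypothetical names, and the helper is a condensed restatement of the mapping approach rather than a replacement for it:

```r
# Hypothetical condensed helper: parse only the unique values, then map back
convert_dates <- function(date_col) {
  uniq <- unique(date_col)
  # Assumption: a column is either all dd/mm/yyyy or all yyyy-mm-dd, not mixed
  fmt <- if (all(grepl("/", uniq, fixed = TRUE))) "%d/%m/%Y" else "%Y-%m-%d"
  parsed <- as.Date(uniq, format = fmt)
  parsed[match(date_col, uniq)]
}

# Hypothetical demo data in the same two formats as the example above
df_demo <- data.frame(
  X1 = c("1996-01-04", "1996-01-05", rep("1996-01-11", 100)),
  X2 = c("02/01/1996", "03/01/1996", rep("09/01/1996", 100)),
  stringsAsFactors = FALSE
)

# Assigning into df_demo[] replaces each column but keeps the data.frame
df_demo[] <- lapply(df_demo, convert_dates)
str(df_demo)
```

If only some columns hold dates, index them first (for example with a character vector of column names) instead of applying the helper to the whole data frame.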