I hope you didn't think I was asking for relationship advice.
Infrequently, I have to offer survey respondents the ability to specify when an event occurred. What resu
My sympathy that your date didn't turn out as pretty as expected. ;-)
I have constructed a (still partial) solution along the lines suggested by @Rguy.
(Please note that this code still has a bug: it doesn't always return the correct time. The leading ^.* in the pattern is greedy, so it consumes as many characters as it can while still allowing a match. That can leave only the last digit before the colon for \d+, which is why 11:00 is sometimes returned as 1:00.)
First, construct a helper function that wraps around gsub and grep. This function takes a character vector as one of its arguments and collapses this into a single string separated by |. The effect of this is to allow you to easily pass multiple patterns to be matched by a regex:
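find.pattern <- function(x, pattern_list){
  # Combine all candidate patterns into a single alternation: p1|p2|...
  pattern <- paste(pattern_list, collapse="|")
  # Replace each string with the part captured by the pattern
  ret <- gsub(paste("^.*(", pattern, ").*", sep=""), "\\1", x, ignore.case=TRUE)
  # Strings that gsub left unchanged are treated as having no match
  ret[ret==x] <- NA
  # ...unless the whole string is itself a match, in which case keep it
  ret2 <- grepl(paste("^(", pattern, ")$", sep=""), x, ignore.case=TRUE)
  ret[ret2] <- x[ret2]
  ret
}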
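One possible way around the 1:00 versus 11:00 problem is to extract the match directly instead of stripping the text around it. The sketch below is an assumption on my part, not part of the original solution: find.first is a hypothetical name, and it relies only on the base R functions regexpr and regmatches. Note that it returns the first match in each string, whereas find.pattern, because of its greedy prefix, tends to return the last one.

```r
# Hypothetical alternative to find.pattern: extract the first match directly,
# so no greedy "^.*" prefix can eat leading digits.
find.first <- function(x, pattern_list) {
  x <- as.character(x)
  pattern <- paste(pattern_list, collapse = "|")
  # regexpr gives the start of the first match, or -1 when there is none
  m <- regexpr(pattern, x, ignore.case = TRUE)
  hit <- !is.na(m) & m > -1L
  ret <- rep(NA_character_, length(x))
  # regmatches returns the matched substrings for the matching elements only
  ret[hit] <- regmatches(x, m)
  ret
}

find.first(c("11:00 am", "Feb 8, 2011 / 12:20pm", "df"),
           c("\\d+[.:h]\\d+", "12 noon"))
```

With this approach the hour for "11:00 am" should come back as 11:00, and strings without any match stay NA.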
Next, use some built-in variable names to construct a vector of months and abbreviations:
all.month <- c(month.name, month.abb)
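As a quick illustration of what the helper does with such a vector, here is a small sketch of the collapsed pattern. The name month.pattern is only used for this example.

```r
all.month <- c(month.name, month.abb)
length(all.month)   # 12 full names plus 12 abbreviations ("May" appears twice)

# Collapsing with "|" turns the vector into one alternation pattern
month.pattern <- paste(month.abb, collapse = "|")
month.pattern

# The single pattern now matches any of the abbreviations
grepl(month.pattern, c("Mar 18, 2011", "12:16"), ignore.case = TRUE)
```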
Finally, construct a data frame with different extracts:
ret <- data.frame(
  data  = dat,
  date1 = find.pattern(dat, "\\d+/\\d+/\\d+"),
  date2 = find.pattern(dat,
    paste(all.month, "\\s*\\d+[(th)|,]*\\s{0,3}[(2010)|(2011)]*", collapse="|", sep="")),
  year  = find.pattern(dat, c(2010, 2011)),
  month = find.pattern(dat, month.abb),  # Use base R variable called month.abb for month names
  hour  = find.pattern(dat, c("\\d+[\\.:h]\\d+", "12 noon")),
  ampm  = find.pattern(dat, c("am", "pm"))
)
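One detail in the date2 pattern may be worth a second look: square brackets define a character class, so [(th)|,] matches any single one of the characters (, t, h, ), | and the comma, rather than "th" or a comma. It happens to work on this data, but a group with alternation probably states the intent more precisely. The following is a sketch under that assumption; date.suffix and date2.pattern are names made up for the example.

```r
# Character class: any ONE of the listed characters
grepl("^[(th)|,]$", c("t", "|", "th"))      # TRUE TRUE FALSE

# Group with alternation: the literal string "th" or a comma
grepl("^(th|,)$", c("t", "|", "th", ","))   # FALSE FALSE TRUE TRUE

# A possible rewrite of the date2 pattern using groups (hypothetical names)
all.month <- c(month.name, month.abb)
date.suffix <- "\\s*\\d+(th|,)?\\s{0,3}(2010|2011)?"
date2.pattern <- paste(all.month, date.suffix, sep = "", collapse = "|")

x <- "Mar 18, 2011 9:33am"
regmatches(x, regexpr(date2.pattern, x, ignore.case = TRUE))
```

Because the outer parentheses added by find.pattern open first, they remain group 1, so the extra inner groups should not interfere with the "\\1" backreference there.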
The results:
head(ret, 50)
    data                     date1   date2         year  month  hour   ampm
20  April 4th around 10am            April 4th           Apr           am
21  April 4th around 10am            April 4th           Apr           am
22  Mar 18, 2011 9:33am              Mar 18, 2011  2011  Mar    9:33   am
23  Mar 18, 2011 9:27am              Mar 18, 2011  2011  Mar    9:27   am
24  df
25  fg
26  12:16                                                       12:16
27  9:50                                                        9:50
28  Feb 8, 2011 / 12:20pm            Feb 8, 2011   2011  Feb    2:20   pm
29  8:34 am 2/4/11           2/4/11                             8:34   am
30  Jan 31, 2011 2:50pm              Jan 31, 2011  2011  Jan    2:50   pm
31  Jan 31, 2011 2:45pm              Jan 31, 2011  2011  Jan    2:45   pm
32  Jan 31, 2011 2:38pm              Jan 31, 2011  2011  Jan    2:38   pm
33  Jan 31, 2011 2:26pm              Jan 31, 2011  2011  Jan    2:26   pm
34  11h09                                                       11h09
35  11:00 am                                                    1:00   am
36  1h02 pm                                                     1h02   pm
37  10h03                                                       10h03
38  2h10                                                        2h10
39  Jan 13, 2011 9:50am Van          Jan 13, 2011  2011  Jan    9:50   am
40  Jan 12, 2011                     Jan 12, 2011  2011  Jan