Question
I have a data.frame in which one column holds a date followed by the sequence of times that belong to that date, and I'd like to convert it into some kind of POSIX date-time field.
For example:
"7/16/2014", "5:06:59 PM", "11:51:26 AM", "7/13/2014", "3:53:16 PM", "3:24:19 PM", "11:47:49 AM", "7/12/2014", "11:57:41 AM", "7/11/2014", "10:01:48 AM", "7/10/2014", "4:54:08 PM", "2:23:04 PM", "11:34:09 AM"
Conceptually, the approach seems to be to split this MIXED vector into a DATEONLY vector and a TIMEONLY vector using regular expressions, keeping every element in its original position, then use something like the fill function from tidyr to fill in the blank spots in the DATEONLY vector, and finally recombine the DATEONLY and TIMEONLY columns. But I'm a bit stumped as to where to start.
I'd like the result to look like this:
"7/16/2014 5:06:59 PM", "7/16/2014 11:51:26 AM", "7/13/2014 3:53:16 PM" etc...
Answer 1:
This is probably not the most concise way to do it, but the following works. I could not come up with a good way to split the vector (i.e., x) directly, so I decided to work with a data frame. First, I created a group variable: as you mentioned in your question, I searched for the indices of the dates (month/day/year), numbered those positions, and used na.locf() to fill in the rest of the group column. Then I split the data by group and pasted each date onto its times with stri_join(). Finally, I unlisted the result. The output is a character vector; if you want date-time objects, you still need to convert it.
library(zoo)
library(magrittr)
library(stringi)
x <- c("7/16/2014", "5:06:59 PM", "11:51:26 AM",
"7/13/2014", "3:53:16 PM", "3:24:19 PM", "11:47:49 AM",
"7/12/2014", "11:57:41 AM", "7/11/2014", "10:01:48 AM",
"7/10/2014", "4:54:08 PM", "2:23:04 PM", "11:34:09 AM")
# Create a data frame; keep the strings as character rather than factor
mydf <- data.frame(date = x, group = NA, stringsAsFactors = FALSE)
# Get indices of the date entries (month/day/year)
ind <- grep(pattern = "\\d+/\\d+/\\d+", x = mydf$date)
# Number the date rows, then carry each number forward
# over the NA rows that follow it
mydf$group[ind] <- seq_along(ind)
mydf$group <- na.locf(mydf$group)
# Split the data frame by group and paste each group's date (its first
# element) onto the times that follow it (result is character)
split(mydf, mydf$group) %>%
  lapply(function(g) {
    # g$date[-1] is empty for a date with no times, unlike 2:length()
    stri_join(g$date[1], g$date[-1], sep = " ")}) %>%
  unlist
                     11                      12                      21                      22
 "7/16/2014 5:06:59 PM" "7/16/2014 11:51:26 AM"  "7/13/2014 3:53:16 PM"  "7/13/2014 3:24:19 PM"
                     23                       3                       4                      51
"7/13/2014 11:47:49 AM" "7/12/2014 11:57:41 AM" "7/11/2014 10:01:48 AM"  "7/10/2014 4:54:08 PM"
                     52                      53
 "7/10/2014 2:23:04 PM" "7/10/2014 11:34:09 AM"
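For the remaining step of getting actual date-time objects, here is a minimal base R sketch rather than a definitive solution. It follows the idea from the question (mark the date entries, carry each date forward, recombine with the times) and then parses the strings with as.POSIXct(). The names is_date, dates, stamps and parsed are made up for illustration. It assumes that the first element of x is a date, that tz = "UTC" is acceptable (the source data does not say which time zone applies), and that the session locale recognizes "AM"/"PM" for %p.

```r
x <- c("7/16/2014", "5:06:59 PM", "11:51:26 AM",
       "7/13/2014", "3:53:16 PM", "3:24:19 PM", "11:47:49 AM",
       "7/12/2014", "11:57:41 AM", "7/11/2014", "10:01:48 AM",
       "7/10/2014", "4:54:08 PM", "2:23:04 PM", "11:34:09 AM")

# TRUE where the element looks like a date (month/day/year)
is_date <- grepl("^\\d+/\\d+/\\d+$", x)

# cumsum() over the logical vector gives every element the index of the
# most recent date, which carries each date forward over its times
# (assumes x starts with a date, otherwise the leading index would be 0)
dates <- x[is_date][cumsum(is_date)]

# Keep only the time rows and paste each one onto its date
stamps <- paste(dates[!is_date], x[!is_date])

# Parse to POSIXct: %I is the 12-hour clock, %p the AM/PM marker.
# tz = "UTC" is an assumption; replace it with the real time zone.
parsed <- as.POSIXct(stamps, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
```

Here stamps holds the same ten strings as the output above, just without names, and parsed is the corresponding POSIXct vector. Any element that fails to parse comes back as NA, so checking anyNA(parsed) afterwards is a cheap sanity test.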
Source: https://stackoverflow.com/questions/34169485/r-separating-out-a-mixed-data-column-date-above-multiple-times