Can I apply a function over a vector using base tryCatch?


既然无缘 2021-01-16 16:24

I'm trying to parse dates (using lubridate functions) from a vector which has mixed date formats.

departureDate <- c("Aug 17, 2020 12:00:00 AM", "Nov


      
      
        
3 answers

抹茶落季 (original poster) 2021-01-16 17:09

One method is to iterate through a vector of candidate formats and apply each one only to the elements that have not yet been parsed successfully.

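# Candidate formats, most likely first.
# Note: ?strptime documents %p as used with %I, not %H. With %H the hour is
# taken as written, which is why the parsed times below all read 12:00:00.
fmts <- c("%b %d, %Y %H:%M:%S %p", "%d/%m/%Y")
# Start with an all-NA POSIXct vector of the same length as the input.
dates <- rep(Sys.time()[NA], length(departureDate))
for (fmt in fmts) {
  isna <- is.na(dates)       # elements not parsed yet
  if (!any(isna)) break      # everything parsed: stop trying formats
  dates[isna] <- as.POSIXct(departureDate[isna], format = fmt)
}
dates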
#  [1] "2020-08-17 12:00:00 PDT" "2019-11-19 12:00:00 PST" "2020-12-21 12:00:00 PST"
#  [4] "2020-12-24 12:00:00 PST" "2020-12-24 12:00:00 PST" "2020-04-19 12:00:00 PDT"
#  [7] "2019-06-28 00:00:00 PDT" "2019-08-16 00:00:00 PDT" "2019-02-04 00:00:00 PST"
# [10] "2019-04-10 00:00:00 PDT" "2019-07-28 00:00:00 PDT" "2019-07-26 00:00:00 PDT"
# [13] "2020-06-22 12:00:00 PDT" "2020-04-05 12:00:00 PDT" "2021-05-01 12:00:00 PDT"
as.Date(dates)
#  [1] "2020-08-17" "2019-11-19" "2020-12-21" "2020-12-24" "2020-12-24" "2020-04-19" "2019-06-28"
#  [8] "2019-08-16" "2019-02-04" "2019-04-10" "2019-07-28" "2019-07-26" "2020-06-22" "2020-04-05"
# [15] "2021-05-01"


I encourage you to put the most-likely formats first in the fmts vector.

The way this is set up, as soon as all elements are correctly found, no further formats are attempted (i.e., break).
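
For reuse, the loop can be wrapped in a small function. This is only a sketch: parse_mixed is a made-up name, the tz = "UTC" default is my assumption to keep the result reproducible, and the first format uses %I (12-hour clock) so that %p actually affects the hour. Month abbreviations and AM/PM markers are matched according to the current locale, so this assumes an English or C locale.

```r
# Hypothetical helper: try each format only on elements that are still NA.
parse_mixed <- function(x, fmts, tz = "UTC") {
  # All-NA POSIXct vector with an explicit time zone (tz is an assumption).
  out <- .POSIXct(rep(NA_real_, length(x)), tz = tz)
  for (fmt in fmts) {
    todo <- is.na(out)
    if (!any(todo)) break            # nothing left to parse
    out[todo] <- as.POSIXct(x[todo], format = fmt, tz = tz)
  }
  out
}

# Two values taken from the question plus one deliberately invalid string.
x <- c("Aug 17, 2020 12:00:00 AM", "28/06/2019", "not a date")
res <- parse_mixed(x, c("%b %d, %Y %I:%M:%S %p", "%d/%m/%Y"))
format(res, "%Y-%m-%d %H:%M:%S")
```

Elements that match none of the formats stay NA, so is.na(res) shows which inputs still need attention.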



Edit: if there is a difference in LOCALE where AM/PM are not locally recognized, then one method would be to first remove them from the strings:

departureDate <- gsub("\\s[AP]M$", "", departureDate)
departureDate
#  [1] "Aug 17, 2020 12:00:00" "Nov 19, 2019 12:00:00" "Dec 21, 2020 12:00:00"
#  [4] "Dec 24, 2020 12:00:00" "Dec 24, 2020 12:00:00" "Apr 19, 2020 12:00:00"
#  [7] "28/06/2019"            "16/08/2019"            "04/02/2019"           
# [10] "10/04/2019"            "28/07/2019"            "26/07/2019"           
# [13] "Jun 22, 2020 12:00:00" "Apr 5, 2020 12:00:00"  "May 1, 2021 12:00:00" 


and then use a simpler format:

fmts <- c("%b %d, %Y %H:%M:%S", "%d/%m/%Y")
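
A quick check on one of the strings above shows what this workaround costs. This is a sketch with tz = "UTC" as my own assumption: once the marker is stripped, the hour is read as written, so a 12:00:00 AM input ends up at noon. If only the date matters, as with as.Date() here, that is harmless. Note that %b still depends on the locale's month abbreviations.

```r
x <- "Aug 17, 2020 12:00:00 AM"

# Drop a trailing AM/PM marker, as in the gsub() call above.
stripped <- gsub("\\s[AP]M$", "", x)

# Parse with the simpler 24-hour format; tz = "UTC" is an assumption.
parsed <- as.POSIXct(stripped, format = "%b %d, %Y %H:%M:%S", tz = "UTC")

format(parsed, "%H:%M")  # hour is taken literally, the AM information is gone
as.Date(parsed)          # the calendar date is unaffected
```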

    
             
                                                        
            
            
              
                