Why are lubridate functions so slow when compared with as.POSIXct?

前端未结

关注

 2  1527

As the title goes. Why is the lubridate function so much slower?

library(lubridate)
library(microbenchmark)

Dates <- sample(c(dates = format(seq(ISOdate(


                      
              相关标签:


      
      
        
          2条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  一整个雨季        
                
              
                            
                2020-12-14 17:33
              
            
            
                                                                       
For the same reason cars are slow in comparison to riding on top of rockets. The added ease of use and safety make cars much slower than a rocket but you're less likely to get blown up and it's easier to start, steer, and brake a car. However, in the right situation (e.g., I need to get to the moon) the rocket is the right tool for the job. Now if someone invented a car with a rocket strapped to the roof we'd have something.

Start with looking at what dmy is doing and you'll see the difference for the speed (by the way from your bechmarks I wouldn't say that lubridate is that much slower as these are in milliseconds):

dmy  #type this into the command line and you get:

>dmy
function (..., quiet = FALSE, tz = "UTC") 
{
    dates <- unlist(list(...))
    parse_date(num_to_date(dates), make_format("dmy"), quiet = quiet, 
        tz = tz)
}
<environment: namespace:lubridate>


Right away I see parse_date and num_to_date and make_format.  Makes one wonder what all these guys are.  Let's see:

parse_date

> parse_date
function (x, formats, quiet = FALSE, seps = find_separator(x), 
    tz = "UTC") 
{
    fmt <- guess_format(head(x, 100), formats, seps, quiet)
    parsed <- as.POSIXct(strptime(x, fmt, tz = tz))
    if (length(x) > 2 & !quiet) 
        message("Using date format ", fmt, ".")
    failed <- sum(is.na(parsed)) - sum(is.na(x))
    if (failed > 0) {
        message(failed, " failed to parse.")
    }
    parsed
}
<environment: namespace:lubridate>


num_to_date

> getAnywhere(num_to_date)
A single object matching ‘num_to_date’ was found
It was found in the following places
  namespace:lubridate
with value

function (x) 
{
    if (is.numeric(x)) {
        x <- as.character(x)
        x <- paste(ifelse(nchar(x)%%2 == 1, "0", ""), x, sep = "")
    }
    x
}
<environment: namespace:lubridate>


make_format

> getAnywhere(make_format)
A single object matching ‘make_format’ was found
It was found in the following places
  namespace:lubridate
with value

function (order) 
{
    order <- strsplit(order, "")[[1]]
    formats <- list(d = "%d", m = c("%m", "%b"), y = c("%y", 
        "%Y"))[order]
    grid <- expand.grid(formats, KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE)
    lapply(1:nrow(grid), function(i) unname(unlist(grid[i, ])))
}
<environment: namespace:lubridate>


Wow we got strsplit-ting, expand-ing.grid-s, paste-ing, ifelse-ing, unname-ing etc. plus a Whole Lotta Error Checking Going On (play on the Zep song).  So what we have here is some nice syntactic sugar.  Mmmmm tasty but it comes with a price, speed.

Compare that to as.POSIXct:

getAnywhere(as.POSIXct)  #tells us to use methods to see the business
methods('as.POSIXct')    #tells us all the business
as.POSIXct.date          #what I believe your code is using (I don't use dates though)


There's a lot more Internal coding and less error checking going on with as.POSIXct  So you have to ask do I want ease and safety or speed and power?  Depends on the job.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  感情败类        
                
              
                            
                2020-12-14 17:58
              
            
            
                                                                       
@Tyler's answer is correct. Here's some more info including a tip on making lubridate faster - from the help file:


  " Lubridate has an inbuilt very fast POSIX parser, ported from the
  fasttime package by Simon Urbanek. This functionality is as yet
  optional and could be activated with options(lubridate.fasttime =
  TRUE). Lubridate will automatically detect POSIX strings and use fast
  parser instead of the default strptime utility. "

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复