R data.table fread - read column as Date

Submitted by 帅比萌擦擦* on 2020-05-25 04:29:21

Question


I would like to use fread from data.table to read a file that has a column of dates in "YYYY-MM-DD" format. By default, fread reads the column as chr. I would like the column to have class Date instead, as if I had applied as.Date to it.

I have tried to use

dt[,starttime.date := as.Date(starttime.date)]

but it takes very long to run (I have approx. 43 million rows).


Answer 1:


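Using the fasttime package, as suggested in the fread documentation, is far faster than as.Date or as.IDate (roughly 80 to 90 times faster in the benchmark below). Note that fastPOSIXct returns a POSIXct date-time, parsed as UTC, rather than a Date: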

library(data.table)
library(fasttime)

dt[,starttime.date := fastPOSIXct(starttime.date)]
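Since the question asks for a Date column, one possible follow-up is to wrap the call in as.Date. This is a minimal sketch, not part of the original answer: the three-row table is made up for illustration, and both time zones are set to UTC explicitly so that the calendar day cannot shift during the conversion.

```r
library(data.table)
library(fasttime)

# Made-up sample data standing in for the real 43-million-row table
dt <- data.table(starttime.date = c("2015-03-19", "2016-12-31", "2020-05-25"))

# fastPOSIXct() parses the strings as UTC and returns POSIXct;
# as.Date() with the same time zone then drops the time-of-day part.
dt[, starttime.date := as.Date(fastPOSIXct(starttime.date, tz = "UTC"), tz = "UTC")]

class(dt$starttime.date)
```

The extra as.Date step works on numeric values rather than strings, so it should add little to the parsing cost, but that is worth confirming on your own data.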

Benchmark results:

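library(data.table)
library(microbenchmark)
library(fasttime)

# 100,000 random dates as character strings
# (month and day are not zero-padded here, e.g. "1987-3-7")
DT <- data.table(start_date = paste(sample(1900:2018, 100000, replace = TRUE),
                                    sample(1:12, 100000, replace = TRUE),
                                    sample(1:28, 100000, replace = TRUE),
                                    sep = "-"))

# Time the three parsers on the same character vector
microbenchmark(
  as.Date(DT$start_date),
  as.IDate(DT$start_date),
  fastPOSIXct(DT$start_date)
)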

> Unit: milliseconds
>                        expr    mean 
>      as.Date(DT$start_date)  383.89
>     as.IDate(DT$start_date)  405.89
>  fastPOSIXct(DT$start_date)    4.59 
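Depending on the installed data.table version, the conversion step may not be needed at all. The sketch below assumes data.table 1.13.0 or later, where fread is expected to recognise "YYYY-MM-DD" columns while reading and return them as IDate (which inherits from Date). The temporary file and its contents are made up for illustration, and the fallback branch covers older versions where the column still arrives as character.

```r
library(data.table)

# Write a tiny example file; the name and contents are hypothetical
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,starttime.date", "1,2015-03-19", "2,2016-12-31"), tmp)

dt <- fread(tmp)

# Assumption: data.table >= 1.13.0 detects the ISO dates while reading.
# On older versions the column is character, so convert it afterwards.
if (!inherits(dt$starttime.date, "Date")) {
  dt[, starttime.date := as.IDate(starttime.date)]
}

class(dt$starttime.date)
unlink(tmp)
```

Checking the class after reading, rather than assuming it, keeps the script working across data.table versions.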


Source: https://stackoverflow.com/questions/29140416/r-data-table-fread-read-column-as-date
