Converting string/chr to date using sparklyr

别说谁变了你拦得住时间么 posted on 2019-12-11 07:06:48

Question


I've brought a table into Hue that has a column of dates, and I'm trying to work with it using sparklyr in RStudio.

I'd like to convert a character column into a date column like so:

Weather_data = mutate(Weather_data, date2 = as.Date(date, "%m/%d/%Y"))

This runs fine, but when I check:

head(Weather_data) 

How do I properly convert the chr column to dates?

Thanks!!!!


Answer 1:


The problem is that sparklyr doesn't correctly support Spark's DateType. It can parse dates and put them into the correct format, but it does not represent them as proper DateType columns. If that's enough, follow the instructions below.

In Spark 2.2 or later, use to_date with a Java SimpleDateFormat-compatible string:

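library(sparklyr)
library(dplyr)
sc <- spark_connect(master = "local")  # assumption: local Spark; use your own connection

df <- copy_to(sc, data.frame(date=c("01/01/2010")))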
parsed <- df %>% mutate(date_parsed = to_date(date, "MM/dd/yyyy"))
parsed
# Source:   lazy query [?? x 2]
# Database: spark_connection
        date date_parsed
       <chr>       <chr>
1 01/01/2010  2010-01-01
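
The question wrote its pattern in R's strptime notation ("%m/%d/%Y"), whereas to_date expects Java SimpleDateFormat letters. As a rough aid, the sketch below translates a handful of common tokens. Note that strptime_to_java is a hypothetical helper written for this illustration, not part of sparklyr, and it handles only the tokens listed; literal text and "%%" are not covered.

```r
# Hypothetical helper (not part of sparklyr): translate a few common
# strptime tokens into Java SimpleDateFormat pattern letters.
strptime_to_java <- function(fmt) {
  tokens <- c(
    "%Y" = "yyyy",  # four-digit year
    "%y" = "yy",    # two-digit year
    "%m" = "MM",    # month number
    "%d" = "dd",    # day of month
    "%H" = "HH",    # hour, 00-23
    "%M" = "mm",    # minute
    "%S" = "ss"     # second
  )
  # Replace each token literally; replacements contain no "%",
  # so one substitution cannot create a new token for a later one.
  for (tok in names(tokens)) {
    fmt <- gsub(tok, tokens[[tok]], fmt, fixed = TRUE)
  }
  fmt
}

strptime_to_java("%m/%d/%Y")  # "MM/dd/yyyy"
```

The result can then be passed as the second argument of to_date inside mutate.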

Interestingly, the internal Spark object still uses a DateType column:

parsed %>% spark_dataframe
<jobj[120]>
  class org.apache.spark.sql.Dataset
  [date: string, date_parsed: date]
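
If the column still arrives in R as character after collect(), one possible workaround is to finish the conversion locally. The sketch below uses a plain data frame, collected, as a hypothetical stand-in for the collected result, and assumes the values come back as ISO yyyy-MM-dd strings. It only makes sense for data small enough to bring into R.

```r
# Hypothetical stand-in for the result of `parsed %>% collect()`,
# assuming date_parsed comes back as ISO-formatted character strings.
collected <- data.frame(
  date = "01/01/2010",
  date_parsed = "2010-01-01",
  stringsAsFactors = FALSE
)

# as.Date() parses the yyyy-mm-dd form without an explicit format.
collected$date_parsed <- as.Date(collected$date_parsed)

class(collected$date_parsed)  # "Date"
```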

For earlier versions, use unix_timestamp and cast (but watch for possible timezone problems):

df %>%
  mutate(date_parsed = sql(
    "CAST(CAST(unix_timestamp(date, 'MM/dd/yyyy') AS timestamp) AS date)"))
# Source:   lazy query [?? x 2]
# Database: spark_connection
        date date_parsed
       <chr>       <chr>
1 01/01/2010  2010-01-01
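
The timezone caveat comes from the detour through a timestamp: the date is recovered from an instant, and the calendar day that instant falls on depends on the timezone used to read it. The base R sketch below is only a local analogy, not a reproduction of Spark's behaviour, and the zone "America/Los_Angeles" is an arbitrary choice for illustration.

```r
# Midnight on 1 January 2010, interpreted as UTC.
ts <- as.POSIXct("2010-01-01 00:00:00", tz = "UTC")

# Read back in UTC, the instant falls on the same calendar day.
utc_day <- as.Date(ts, tz = "UTC")

# Read back in a zone west of UTC, the same instant is still the
# previous evening, so the calendar day shifts back by one.
la_day <- as.Date(ts, tz = "America/Los_Angeles")

format(utc_day)  # "2010-01-01"
format(la_day)   # "2009-12-31"
```

If the parsed dates come out one day off, a mismatch between the timezone used for parsing and the one used for reading is a likely place to look.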

Edit:

It looks like this problem has been resolved on the current master (sparklyr_0.7.0-9105):

# Source:   lazy query [?? x 2]
# Database: spark_connection
        date date_parsed
       <chr>      <date>
1 01/01/2010  2010-01-01


Source: https://stackoverflow.com/questions/46453646/converting-string-chr-to-date-using-sparklyr
