How to filter on partial match using sparklyr

悲哀的现实 2020-12-10 05:40

I'm new to sparklyr (but familiar with Spark and PySpark), and I've got a really basic question. I'm trying to filter a column based on a partial match. In dplyr, I'd wr

1 answer
  • 2020-12-10 06:02

    As in standard Spark, you can use either rlike (Java regular expressions):

    df <- copy_to(sc, iris) 
    
    df %>% filter(rlike(Species, "osa"))
    
    # or anchored
    df %>% filter(rlike(Species, "^.*osa.*$"))
    

    or like (simple SQL wildcard patterns, where % matches any sequence of characters and _ matches a single character):

    df %>% filter(like(Species, "%osa%"))
    

    Both methods can also be used in infix notation, as

    df %>% filter(Species %rlike% "^.*osa.*$")
    

    and

    df %>% filter(Species %like% "%osa%")
    

    respectively.
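
    For comparison with plain dplyr on a local data frame, the partial match done by rlike corresponds roughly to grepl. The following is only a sketch on the built-in iris data, with no Spark connection involved; the names local_match and local_anchored are made up for illustration. Note that R's default regex engine is not the Java one Spark uses, so more complex patterns may not carry over unchanged.

    ```r
    library(dplyr)

    # Local (non-Spark) counterpart: grepl() finds the pattern anywhere in the
    # string, which mirrors the partial match of rlike(Species, "osa").
    local_match <- iris %>% filter(grepl("osa", Species))

    # The anchored form selects the same rows.
    local_anchored <- iris %>% filter(grepl("^.*osa.*$", Species))

    # Inspect which species survived the filter and whether both results agree.
    unique(as.character(local_match$Species))
    identical(local_match, local_anchored)
    ```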

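    For details see vignette("sql-translation") in dbplyr (in recent dbplyr releases the same material appears under vignette("translation-function")).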
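
    One way to check how these expressions reach Spark is to print the generated SQL with show_query() instead of running the query. This is a minimal sketch, assuming a local Spark installation is available (for example via spark_install()); the connection settings are an assumption, not part of the original answer.

    ```r
    library(sparklyr)
    library(dplyr)

    # Assumption: Spark is installed locally and can be started in local mode.
    sc <- spark_connect(master = "local")
    df <- copy_to(sc, iris, overwrite = TRUE)

    # show_query() prints the SQL that would be sent to Spark.
    # %like% is rendered as a SQL LIKE predicate.
    df %>% filter(Species %like% "%osa%") %>% show_query()

    # rlike() is not an R function; it is passed through to Spark SQL as written.
    df %>% filter(rlike(Species, "osa")) %>% show_query()

    spark_disconnect(sc)
    ```

    Reading the printed SQL is a quick way to confirm that a function was passed to Spark rather than evaluated in R.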
