How to use a predicate while reading from a JDBC connection?

轮回少年 2020-12-30 17:24

By default, spark_read_jdbc() reads an entire database table into Spark. I've used the following syntax to create these connections.

library(sparklyr)
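The snippet in the question is cut off after the library call. For context, a typical sparklyr JDBC read looks roughly like the sketch below; the master, URL, table name, and credentials are placeholders, and the MySQL JDBC driver jar must be on Spark's classpath:

    library(sparklyr)

    # Placeholder connection -- substitute your own master and credentials
    sc <- spark_connect(master = "local")

    # By default this pulls the entire table into Spark
    db_tbl <- spark_read_jdbc(
      sc,
      name    = "table_name",
      options = list(
        url      = "jdbc:mysql://localhost:3306/schema_name",
        user     = "root",
        password = "password",
        dbtable  = "table_name"
      )
    )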
1 Answer
  • 2020-12-30 17:35

    You can replace dbtable with query:

    db_tbl <- sc %>%
      spark_read_jdbc(sc      = .,
                      name    = "table_name",
                      options = list(url      = "jdbc:mysql://localhost:3306/schema_name",
                                     user     = "root",
                                     password = "password",
                                     dbtable  = "(SELECT * FROM table_name WHERE field > 1) as my_query"))


    but with a simple condition like this, Spark should push the predicate down automatically when you filter:

    db_tbl %>% filter(field > 1)
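If you want to check what the lazy filter translates to, dplyr's show_query() prints the SQL that will be sent to Spark (the WHERE clause should appear there); whether Spark pushes that predicate further down to the JDBC source is visible in the physical plan, e.g. in the Spark UI. A sketch, assuming the db_tbl connection above:

    library(dplyr)

    # Prints the generated SQL for the lazy Spark table,
    # including the translated WHERE clause
    db_tbl %>%
      filter(field > 1) %>%
      show_query()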
    

    Just make sure to set:

    memory = FALSE
    

    in spark_read_jdbc, so the table is read lazily rather than being cached in full before the filter is applied.
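Putting the two together, a lazy read followed by a filter might look like this; table name, URL, and credentials are again placeholders:

    # memory = FALSE avoids eagerly caching the whole table, so the
    # subsequent filter can be pushed down to the JDBC source
    db_tbl <- spark_read_jdbc(
      sc,
      name    = "table_name",
      memory  = FALSE,
      options = list(
        url      = "jdbc:mysql://localhost:3306/schema_name",
        user     = "root",
        password = "password",
        dbtable  = "table_name"
      )
    )

    filtered <- db_tbl %>% dplyr::filter(field > 1)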
