sparklyr: removing a table from the Spark context

Submitted anonymously (unverified) on 2019-12-03 08:59:04

Question:

I would like to remove a single data table from the Spark context (sc). I know a single cached table can be un-cached, but that isn't the same as removing the object from sc, as far as I can tell.

library(sparklyr)
library(dplyr)
library(titanic)
library(Lahman)

spark_install(version = "2.0.0")
sc <- spark_connect(master = "local")

batting_tbl <- copy_to(sc, Lahman::Batting, "batting")
titanic_tbl <- copy_to(sc, titanic_train, "titanic", overwrite = TRUE)
src_tbls(sc)
# [1] "batting" "titanic"

tbl_cache(sc, "batting") # Speeds up computations -- loaded into memory
src_tbls(sc)
# [1] "batting" "titanic"

tbl_uncache(sc, "batting")
src_tbls(sc)
# [1] "batting" "titanic"

To disconnect from sc entirely, I would use spark_disconnect(sc), but in this example that would destroy both the "titanic" and "batting" tables stored inside sc.

Instead, I would like to delete just one table, e.g. "batting", with something like spark_disconnect(sc, tableToRemove = "batting"), but this doesn't seem possible.

Answer 1:

dplyr::db_drop_table(sc, "batting") 

I tried this function and it seems to work.
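
A quick check, reusing the session from the question (the expected output assumes only "batting" and "titanic" were registered):

dplyr::db_drop_table(sc, "batting")
src_tbls(sc)
# [1] "titanic"

Depending on your package versions, this generic may live in dbplyr rather than dplyr (dbplyr::db_drop_table), since the database backends were split out of dplyr.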



Answer 2:

A slightly lower-level alternative is:

tbl_name <- "batting"
DBI::dbGetQuery(sc, paste("DROP TABLE", tbl_name))
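
If the table might not exist, Spark SQL also accepts DROP TABLE IF EXISTS, which avoids an error on a missing table. A minimal sketch building only on the DBI::dbGetQuery call above; drop_spark_table is a hypothetical helper name, not a sparklyr function:

# Hypothetical convenience wrapper around the DROP TABLE statement above
drop_spark_table <- function(sc, tbl_name) {
  DBI::dbGetQuery(sc, paste("DROP TABLE IF EXISTS", tbl_name))
  src_tbls(sc)  # list the tables still registered in this session
}

drop_spark_table(sc, "batting")
# [1] "titanic"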

