Can sparklyr be used with Spark deployed on a YARN-managed Hadoop cluster?
Is the sparklyr R package able to connect to YARN-managed Hadoop clusters? This doesn't seem to be documented in the cluster deployment documentation. With the SparkR package that ships with Spark, it is possible by doing:

    # set R environment variables
    Sys.setenv(YARN_CONF_DIR=...)
    Sys.setenv(SPARK_CONF_DIR=...)
    Sys.setenv(LD_LIBRARY_PATH=...)
    Sys.setenv(SPARKR_SUBMIT_ARGS=...)

    sparkr_lib_dir <- ... # install specific
    library(SparkR, lib.loc = c(sparkr_lib_dir, .libPaths()))
    sc <- sparkR.init(master = "yarn-client")

However, when I swapped the last lines above with

    library(sparklyr)
    sc <- spark