I want to use R packages from CRAN, such as forecast, with SparkR, and I have run into the following two problems.
Should I pre-install all those packages on w
Adding libraries this way works with Spark 2.0+, including the Databricks environment. For example, the code below adds the forecast package on every node of the cluster.
schema <- structType(structField("out", "string"))
out <- gapply(
  df,
  c("p", "q"),
  function(key, x) {
    # Attach forecast on the worker, installing it first if it is missing
    if (!("forecast" %in% .packages())) {
      if (!require("forecast")) {
        install.packages("forecast", repos = "http://cran.us.r-project.org", INSTALL_opts = c("--no-lock"))
        library(forecast)  # attach the freshly installed package
      }
    }
    # use forecast
    # data frame out
    data.frame(out = x$column, stringsAsFactors = FALSE)
  },
  schema)
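
If you need more than one package, the install-if-missing check can be pulled into a small helper and called at the top of the function you pass to gapply. This is only a sketch: ensure_package is a hypothetical name I made up, not something SparkR provides, and the repos URL and --no-lock option are simply copied from the code above. It assumes the worker nodes can reach CRAN.

```r
# Hypothetical helper (not part of SparkR): attach a package,
# installing it first if it is not available on this machine.
ensure_package <- function(pkg, repos = "http://cran.us.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # --no-lock (taken from the answer above) skips R's install lock directory
    install.packages(pkg, repos = repos, INSTALL_opts = c("--no-lock"))
  }
  # character.only = TRUE lets library() take the package name from a variable
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" ships with base R, so this call does not trigger an install.
ensure_package("stats")
```

Inside the gapply function you would then call ensure_package("forecast") before using the package.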