Question
We have a sparklyr project that is set up like this:
# load functions
source('./a.R')
source('./b.R')
source('./c.R')
....
# main script computations
sc <- spark_connect(...)
spark_read_csv(sc, path = "s3://path")
....
We run it on EMR with:
spark-submit --deploy-mode client s3://path/to/my/script.R
Running the script this way fails: spark-submit appears to accept only a single R script, but our entry point sources functions from several other files, and those files are not shipped along with it. Is there a way to package all of the files together (the equivalent of an egg/jar file) and pass that as an argument to spark-submit?
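For what it's worth, the only workaround we have found so far is to flatten everything into a single file before uploading. A minimal sketch (bundle.R, main.R, and the S3 destination are made-up names for illustration):

# Sketch: concatenate the helper files and the main script into one
# submittable file. Assumes a.R/b.R/c.R contain only function definitions
# and main.R holds the computations with its source() calls removed.
cat a.R b.R c.R main.R > bundle.R

# Upload the bundle and submit it as a single script (paths are illustrative).
aws s3 cp bundle.R s3://path/to/my/bundle.R
spark-submit --deploy-mode client s3://path/to/my/bundle.R

This works, but it is brittle and loses the project structure, which is why we would prefer a proper packaging mechanism.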
Source: https://stackoverflow.com/questions/62400076/egg-jar-equivalent-for-sparklyr-projects