Unable to launch SparkR in RStudio

Anonymous (unverified), submitted on 2019-12-03 03:04:01

Question:

After a long and difficult installation of SparkR, I am now running into new problems launching it.

My Settings

    R 3.2.0
    RStudio 0.98.1103
    Rtools 3.3
    Spark 1.4.0
    Java Version 8
    SparkR 1.4.0
    Windows 7 SP 1 64 Bit

Now I am trying to run the following code in R:

    library(devtools)
    library(SparkR)
    Sys.setenv(SPARK_MEM="1g")
    Sys.setenv(SPARK_HOME="C:/spark-1.4.0")
    sc <- sparkR.init(master="local")

I receive the following:

JVM is not ready after 10 seconds

I also tried adding some system variables, such as the Spark path and the Java path.

Do you have any advice on how to fix this problem?

After testing on localhost, my next step would be to run tests on my running Hadoop cluster.

Answer 1:

I think it was a bug that has now been resolved. Try the following:

    # The use of \\ is for the Windows environment.
    Sys.setenv(SPARK_HOME="C:\\spark-1.4.0")
    .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
    # lib.loc points at the same R\lib directory that was added to .libPaths() above.
    library("SparkR", lib.loc="C:\\spark-1.4.0\\R\\lib")
    library(SparkR)
    sc=sparkR.init(master="local")

    Launching java with spark-submit command C:\spark-1.4.0/bin/spark-submit.cmd sparkr-shell C:\Users\Ashish\AppData\Local\Temp\RtmpWqFsOB\backend_portbdc329477c6

Hope this helps.



Answer 2:

I had the same issue, and my spark-submit.cmd file was also not executing from the command line. The following steps worked for me.

Go to your environment variables and, under the system variables, select the variable named PATH. Add c:/Windows/System32/ to the existing values, separated by a semicolon. This made spark-submit.cmd run from the command line and eventually from RStudio.

I have realized that this issue appears only when some of the required path values are missing. Make sure all your path values (R, Rtools) are specified in the environment variables. For instance, my Rtools path was c:\Rtools\bin;c:\Rtools\gcc-4.6.3\bin
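One way to check this from inside R is to compare the directories on PATH with the ones you expect. The sketch below is only illustrative: the helper name missing_from_path is hypothetical, and the directory list is taken from the examples in this answer, so adjust it to your own installation.

```r
# Hypothetical helper: return the expected directories that are NOT on PATH.
# The comparison ignores case, slash direction and trailing slashes.
missing_from_path <- function(expected, path = Sys.getenv("PATH"),
                              sep = .Platform$path.sep) {
  normalise <- function(p) {
    p <- tolower(gsub("\\\\", "/", p))  # backslashes -> forward slashes
    sub("/+$", "", p)                   # drop trailing slashes
  }
  on_path <- normalise(strsplit(path, sep, fixed = TRUE)[[1]])
  expected[!(normalise(expected) %in% on_path)]
}

# Directories mentioned in this answer; adjust to your own installation.
expected <- c("C:/Windows/System32", "C:/Rtools/bin", "C:/Rtools/gcc-4.6.3/bin")
print(missing_from_path(expected))
```

Anything printed is a directory that the current R session does not see on PATH. Restart RStudio after changing environment variables so that the new values are picked up.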

I hope this helps.



Answer 3:

That didn't work for me. If anyone has the same problem, try giving execute permissions to c:/sparkpath/bin/spark-submit.cmd.
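A quick way to see whether R can find and execute the script is sketched below. The helper name check_spark_submit is hypothetical, and it assumes SPARK_HOME is set (or that you pass the Spark directory yourself). The results of file.access() on Windows are not always reliable, so treat them as a hint rather than proof.

```r
# Hypothetical helper: report whether spark-submit(.cmd) exists and looks executable.
check_spark_submit <- function(spark_home = Sys.getenv("SPARK_HOME")) {
  script <- if (.Platform$OS.type == "windows") "spark-submit.cmd" else "spark-submit"
  path <- file.path(spark_home, "bin", script)
  list(
    path = path,
    exists = file.exists(path),
    # file.access() returns 0 on success; mode = 1 tests execute permission.
    executable = unname(file.access(path, mode = 1) == 0)
  )
}

str(check_spark_submit())
```

If exists is FALSE, SPARK_HOME probably points at the wrong directory; if exists is TRUE but executable is FALSE, the permissions are worth a look.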



Answer 4:

I had exactly the same issue: I could start SparkR from the command line, but not from RStudio on Windows. Here is the solution that worked for me.

  1. Clean up all the paths you set while trying to fix this issue. This includes the paths you set in the Windows environment from the Windows Control Panel; also use Sys.unsetenv() to unset SPARK_HOME.

  2. Find your RStudio default working directory by running getwd() in RStudio, then create a .Rprofile file in that directory. Put the following line in this file: .libPaths("C:/Apache/Spark-1.5.1/R/lib")

  3. In Windows Control Panel->System->Advanced system settings->Environment Variables, add ";C:\Apache\Spark-1.5.1\bin" at the end of your existing PATH variable.

  4. Start RStudio. If you type .libPaths(), you can see that the SparkR library path is already in the library path.

  5. Use library(SparkR) to load the SparkR library.

  6. sc=sparkR.init(master="local")

I tried this on both Spark 1.4.1 and 1.5.1, and both work fine. I hope this helps anyone who still has the issue after all the suggestions above.
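The .Rprofile entry from step 2 can also be written so that it prepends the SparkR directory instead of replacing the search path, since calling .libPaths() with a single new path keeps only that path plus the site and system libraries. This is a minimal sketch, not the exact file from this answer: the function name add_sparkr_lib is hypothetical, and the default path is the one used above.

```r
# Sketch of a .Rprofile entry: prepend the SparkR library directory if it exists.
# The default path is the one used in this answer; adjust it to your installation.
add_sparkr_lib <- function(spark_lib = "C:/Apache/Spark-1.5.1/R/lib") {
  if (!dir.exists(spark_lib)) {
    message("SparkR library directory not found: ", spark_lib)
    return(invisible(FALSE))
  }
  .libPaths(c(spark_lib, .libPaths()))  # keep the existing library paths
  invisible(TRUE)
}

add_sparkr_lib()
```

After restarting RStudio, .libPaths() should list the SparkR directory first, followed by your usual libraries.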



Answer 5:

I had a similar issue. In my case the problem was a hyphen ('-') in the package coordinate.
Changing the code from:

sc <- sparkR.init(master = "local[*]",sparkPackages = c("com.databricks:spark-csv_2.11-1.4.0"))

to:

sc <- sparkR.init(master = "local[*]",sparkPackages = c("com.databricks:spark-csv_2.11:1.4.0"))

worked for me. Note the change: the version is separated by a colon (2.11:1.4.0) rather than a hyphen (2.11-1.4.0).

P.S.: Also copy the jar into your SPARK_HOME\lib folder.

Edit 1: Also check that you have configured HADOOP_HOME.
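The sparkPackages argument takes Maven coordinates of the form groupId:artifactId:version, so a typo like the one above can be caught before calling sparkR.init. The check below is a rough sketch; the helper name is_maven_coordinate is hypothetical and it only counts the colon-separated parts.

```r
# Hypothetical helper: TRUE when x looks like "groupId:artifactId:version".
is_maven_coordinate <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) length(p) == 3 && all(nzchar(p)), logical(1))
}

pkgs <- c("com.databricks:spark-csv_2.11-1.4.0",  # hyphen: only two parts
          "com.databricks:spark-csv_2.11:1.4.0")  # colon: three parts
print(is_maven_coordinate(pkgs))
```

The first string fails the check and the second passes, which matches the fix described in this answer.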


Hope this helps.



Answer 6:

The following solution works on Mac OS.

After installing Hadoop and then Spark (the code assumes Spark was installed with Homebrew), run:

    spark_path <- strsplit(system("brew info apache-spark", intern=TRUE)[4], ' ')[[1]][1] # Get your Spark path
    .libPaths(c(file.path(spark_path, "libexec", "R", "lib"), .libPaths()))
    library(SparkR)



Answer 7:

I also had this error, but from a different cause. Under the hood, SparkR calls

system2(sparkSubmitBin, combinedArgs, wait = F)

There are many ways this can go wrong. In my case the underlying error (invisible until I called system2 directly as an experiment) was "UNC path are not supported." I had to change my working directory in RStudio to a directory that was not part of a network share, and then it started working.
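A rough sketch of both checks follows: detecting a UNC working directory, and running spark-submit directly so that its output is captured instead of being lost. The helper name is_unc_path is hypothetical, and the second part assumes SPARK_HOME is set and that you are on Windows (hence spark-submit.cmd); it is skipped if the script is not found.

```r
# Hypothetical helper: TRUE when a path starts with two backslashes or two
# slashes, i.e. looks like a UNC path such as \\server\share.
is_unc_path <- function(path) {
  grepl("^(\\\\\\\\|//)", path)
}

if (is_unc_path(getwd())) {
  message("Working directory is on a network share: ", getwd())
}

# Run spark-submit directly and capture stdout/stderr to see the real error.
spark_submit <- file.path(Sys.getenv("SPARK_HOME"), "bin", "spark-submit.cmd")
if (file.exists(spark_submit)) {
  out <- system2(spark_submit, "--version", stdout = TRUE, stderr = TRUE)
  print(out)
}
```

If the first check fires, setwd() to a local directory before calling sparkR.init.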


