Question:
I'm trying to run pyspark on my MacBook Air. When I try starting it up, I get the error:
Exception: Java gateway process exited before sending the driver its port number
when sc = SparkContext() is called on startup. I have tried running the following commands:
./bin/pyspark
./bin/spark-shell
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
to no avail. I have also looked here:
Spark + Python - Java gateway process exited before sending the driver its port number?
but the question has never been answered. Please help! Thanks.
Answer 1:
This should help you.
One solution is adding pyspark-shell to the shell environment variable PYSPARK_SUBMIT_ARGS:
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
There is a change in python/pyspark/java_gateway.py which requires that PYSPARK_SUBMIT_ARGS include pyspark-shell whenever the variable is set by the user.
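When pyspark is started from a plain Python script or a notebook rather than from the shell, the same variable can be set from Python before the SparkContext is created. This is only a sketch: the helper name with_pyspark_shell is made up for illustration and is not part of PySpark, and the local[2] master is taken from the command above.

```python
import os


def with_pyspark_shell(submit_args: str) -> str:
    """Return submit_args with a 'pyspark-shell' token appended if it is missing.

    Hypothetical helper, not a PySpark API.
    """
    tokens = submit_args.split()
    if "pyspark-shell" not in tokens:
        tokens.append("pyspark-shell")
    return " ".join(tokens)


# Set this before creating the SparkContext: the variable is read when the
# Java gateway process is launched.
os.environ["PYSPARK_SUBMIT_ARGS"] = with_pyspark_shell("--master local[2]")
```

After this runs, sc = SparkContext() can be called as usual in the same process.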
Answer 2:
One possible reason is that JAVA_HOME is not set because Java is not installed.
I encountered the same issue. It says
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/spark/launcher/Main : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:296)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:406)
Traceback (most recent call last):
  File "", line 1, in
  File "/opt/spark/python/pyspark/conf.py", line 104, in __init__
    SparkContext._ensure_initialized()
  File "/opt/spark/python/pyspark/context.py", line 243, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/opt/spark/python/pyspark/java_gateway.py", line 94, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number
at sc = pyspark.SparkConf(). I solved it by running
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
which is from https://www.digitalocean.com/community/tutorials/how-to-install-java-with-apt-get-on-ubuntu-16-04
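The "Unsupported major.minor version 51.0" line in the trace names the class-file version that the running JVM refused to load, which hints at how old the installed Java is. Here is a small sketch for decoding that number, relying on the standard class-file numbering (major version 49 is Java 5, and each later release adds one). The function name is made up for illustration.

```python
def java_release_for_class_version(major: int) -> int:
    """Map a class-file major version to the Java release that introduced it."""
    if major < 49:
        raise ValueError("versions below 49 predate Java 5 and use 1.x naming")
    return major - 44


# 51.0 in the error above means the Spark launcher classes need Java 7 or
# newer, so the JVM that raised the error was older than that.
print(java_release_for_class_version(51))  # 7
```

This is consistent with the fix above: installing Java 8 gives a JVM that accepts class-file versions up to 52.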
Answer 3:
I had the same issue with my IPython notebook (IPython 3.2.1) on Linux (Ubuntu).
What was missing in my case was setting the master URL in the $PYSPARK_SUBMIT_ARGS environment variable like this (assuming you use bash):
export PYSPARK_SUBMIT_ARGS="--master spark://<host>:<port>"
e.g.
export PYSPARK_SUBMIT_ARGS="--master spark://192.168.2.40:7077"
You can put this into your .bashrc file. The correct URL appears in the log of the Spark master (the location of this log is reported when you start the master with sbin/start-master.sh).
Answer 4:
I got the same "Java gateway process exited......port number" exception even though I set PYSPARK_SUBMIT_ARGS properly. I'm running Spark 1.6 and trying to get pyspark to work with IPython4/Jupyter (OS: Ubuntu as a VM guest).
While I got this exception, I noticed an hs_err_*.log was generated and it started with:
There is insufficient memory for the Java Runtime Environment to continue. Native memory allocation (malloc) failed to allocate 715849728 bytes for committing reserved memory.
So I increased the memory allocated to my Ubuntu guest in the VirtualBox settings and restarted it. The Java gateway exception then went away and everything worked fine.
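Because the gateway exception itself says nothing about the cause, it can help to look for such crash logs right after a failed start. Below is a minimal sketch, assuming the hs_err_*.log file was written to the directory pyspark was started from (the JVM's default location). Both function names are made up for illustration.

```python
import glob
import os


def find_jvm_crash_logs(directory: str = ".") -> list:
    """Return the hs_err_*.log files found in directory, newest first."""
    paths = glob.glob(os.path.join(directory, "hs_err_*.log"))
    return sorted(paths, key=os.path.getmtime, reverse=True)


def first_lines(path: str, count: int = 5) -> list:
    """Read the first few lines of a crash log, where the JVM states the cause."""
    with open(path, errors="replace") as handle:
        return [line.rstrip("\n") for _, line in zip(range(count), handle)]


if __name__ == "__main__":
    for log in find_jvm_crash_logs():
        print(log)
        for line in first_lines(log):
            print("   ", line)
```

If the first lines mention insufficient memory, as in the log quoted above, giving the machine or VM more memory is the thing to try.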
Answer 5:
I got the same "Exception: Java gateway process exited before sending the driver its port number" in the Cloudera VM when trying to start IPython with CSV support, because of a syntax error:
PYSPARK_DRIVER_PYTHON=ipython pyspark --packages com.databricks:spark-csv_2.10.1.4.0
will throw the error, while:
PYSPARK_DRIVER_PYTHON=ipython pyspark --packages com.databricks:spark-csv_2.10:1.4.0
will not.
The difference is the last colon in the second (working) example, which separates the Scala version number from the package version number.
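A typo like this is easy to catch before launching, because --packages takes a comma-separated list of Maven coordinates in the form groupId:artifactId:version. The following is a rough sketch of such a check. The function names are made up for illustration, and the check only looks at the shape of each coordinate, not at whether the package exists.

```python
def is_valid_package_coordinate(coordinate: str) -> bool:
    """Check for the groupId:artifactId:version shape that --packages expects."""
    parts = coordinate.split(":")
    return len(parts) == 3 and all(part.strip() for part in parts)


def malformed_packages(packages: str) -> list:
    """Return the malformed entries of a comma-separated --packages value."""
    return [c for c in packages.split(",") if not is_valid_package_coordinate(c)]


# The failing value from above has only one colon, the working one has two.
print(malformed_packages("com.databricks:spark-csv_2.10.1.4.0"))
print(malformed_packages("com.databricks:spark-csv_2.10:1.4.0"))
```

The first call reports the broken coordinate, while the second returns an empty list.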
Answer 6:
In my case, the error appeared in a script that had been running fine before, so I figured it might be due to my Java update. I had been using Java 1.8 but had accidentally updated to Java 1.9. When I switched back to Java 1.8, the error disappeared and everything ran fine. For those who get this error for the same reason but do not know how to switch back to an older Java version on Ubuntu, run
sudo update-alternatives --config java
and select the Java version you want.
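To see which Java version is actually picked up before and after switching, the output of java -version can be parsed. This sketch assumes java is on the PATH, and the function names are made up for illustration. Note that java -version prints its banner to stderr, and that releases up to Java 8 report themselves as 1.x.

```python
import re
import subprocess


def parse_java_major(version_output: str) -> int:
    """Extract the major Java release from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    first, second = match.group(1), match.group(2)
    # Releases up to Java 8 use the 1.x scheme, for example 1.8.0_151.
    if first == "1" and second is not None:
        return int(second)
    return int(first)


def installed_java_major() -> int:
    """Run `java -version` and parse the banner it prints to stderr."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return parse_java_major(result.stderr + result.stdout)
```

Calling installed_java_major() before starting pyspark makes it obvious when a system update has changed the default Java.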
Answer 7:
I got this error because I was running low on disk space.
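A quick way to rule this out from Python is sketched below. It checks the system temp directory, which is only an assumption about where space matters (Spark's spark.local.dir setting defaults to /tmp), and the function name is made up for illustration.

```python
import shutil
import tempfile


def free_gib(path: str) -> float:
    """Free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 2**30


# Assumption: the temp directory is where scratch space is needed.
tmp = tempfile.gettempdir()
print(f"{free_gib(tmp):.1f} GiB free in {tmp}")
```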
Answer 8:
I had the same issue; installing Java with the commands below solved it.
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
Answer 9:
I had the same exception; installing the Java JDK worked for me.