Question
I am using Spark 2.4.4 and am having trouble running spark-submit on my Ubuntu machine with the following config and system environment:
cpchung:small$ minikube start --kubernetes-version=1.17.0
😄 minikube v1.6.2 on Ubuntu 19.10
✨ Selecting 'kvm2' driver from user configuration (alternates: [virtualbox none])
💡 Tip: Use 'minikube start -p <name>' to create a new cluster, or 'minikube delete' to delete this one.
🏃 Using the running kvm2 "minikube" VM ...
⌛ Waiting for the host to be provisioned ...
🐳 Preparing Kubernetes v1.17.0 on Docker '18.09.9' ...
💾 Downloading kubelet v1.17.0
💾 Downloading kubeadm v1.17.0
🚜 Pulling images ...
🚀 Launching Kubernetes ...
🏄 Done! kubectl is now configured to use "minikube"
cpchung:small$ minikube docker-env
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.39.246:2376"
export DOCKER_CERT_PATH="/home/cpchung/.minikube/certs"
# Run this command to configure your shell:
# eval $(minikube docker-env)
cpchung:small$ eval $(minikube docker-env)
cpchung:small$ ./run.sh
Inside my run.sh is this:
spark-submit --class SimpleApp \
--master "k8s://https://192.168.39.246:8443" \
--deploy-mode cluster \
--conf spark.kubernetes.container.image=somename:latest \
--conf spark.kubernetes.image.pullPolicy=IfNotPresent \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
local:///small.jar
Then I got this:
simpleapp-1580791262398-driver 0/1 Error 0 21s
cpchung:small$ kubectl logs simpleapp-1580791262398-driver
20/02/04 04:41:10 INFO Utils: Successfully started service 'SparkUI' on port 4040.
20/02/04 04:41:10 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://simpleapp-1580791262398-driver-svc.default.svc:4040
20/02/04 04:41:10 INFO SparkContext: Added JAR file:///small.jar at spark://simpleapp-1580791262398-driver-svc.default.svc:7078/jars/small.jar with timestamp 1580791270603
20/02/04 04:41:14 INFO ExecutorPodsAllocator: Going to request 1 executors from Kubernetes.
20/02/04 04:41:14 WARN WatchConnectionManager: Exec Failure: HTTP 403, Status: 403 -
java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
Answer 1:
Spark up to v2.4.4 uses fabric8 Kubernetes client v4.1.2, which works only with Kubernetes APIs 1.9.0 - 1.12.0; please refer to the compatibility matrix and pom.xml. Spark 2.4.5 will upgrade the Kubernetes client to v4.6.1 and will support Kubernetes APIs up to 1.15.2.
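As a quick sanity check, you can confirm which client version your Spark distribution actually bundles by listing its jars (assuming SPARK_HOME points at the distribution); a stock 2.4.4 ships the 4.1.2 client:
# List the fabric8 client jar bundled with the distribution
ls $SPARK_HOME/jars | grep kubernetes-client
# kubernetes-client-4.1.2.jar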
So you have the following options:
- Downgrade the Kubernetes cluster to 1.12
- Wait until Spark 2.4.5/3.0.0 is released and downgrade the Kubernetes cluster to 1.15.2
- Upgrade Spark's fabric8 Kubernetes client dependency and do a custom build of Spark and its Docker image (a sketch follows this list)
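For the third option, here is a minimal sketch of the custom build, assuming a checkout of the Spark source with the dependency bump applied; the repository name and tag are placeholders for your own registry:
# Build Spark with the Kubernetes module enabled
./build/mvn -Pkubernetes -DskipTests clean package
# Build and push the image referenced by spark.kubernetes.container.image
./bin/docker-image-tool.sh -r <repo> -t <tag> build
./bin/docker-image-tool.sh -r <repo> -t <tag> push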
Hope it helps.
Update 1
In order to run Spark on a Kubernetes cluster with API version 1.17.0+, you need to patch the Spark code based on tag v2.4.5 with the following changes:
diff --git a/resource-managers/kubernetes/core/pom.xml b/resource-managers/kubernetes/core/pom.xml
index c9850ca512..41468e1363 100644
--- a/resource-managers/kubernetes/core/pom.xml
+++ b/resource-managers/kubernetes/core/pom.xml
@@ -29,7 +29,7 @@
<name>Spark Project Kubernetes</name>
<properties>
<sbt.project.name>kubernetes</sbt.project.name>
- <kubernetes.client.version>4.6.1</kubernetes.client.version>
+ <kubernetes.client.version>4.7.1</kubernetes.client.version>
</properties>
<dependencies>
diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala
index 575bc54ffe..c180ea3a7b 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala
@@ -67,7 +67,8 @@ private[spark] class BasicDriverFeatureStep(
.withAmount(driverCpuCores)
.build()
val driverMemoryQuantity = new QuantityBuilder(false)
- .withAmount(s"${driverMemoryWithOverheadMiB}Mi")
+ .withAmount(driverMemoryWithOverheadMiB.toString)
+ .withFormat("Mi")
.build()
val maybeCpuLimitQuantity = driverLimitCores.map { limitCores =>
("cpu", new QuantityBuilder(false).withAmount(limitCores).build())
diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
index d89995ba5e..9c589ace92 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
@@ -86,7 +86,8 @@ private[spark] class BasicExecutorFeatureStep(
// executorId
val hostname = name.substring(Math.max(0, name.length - 63))
val executorMemoryQuantity = new QuantityBuilder(false)
- .withAmount(s"${executorMemoryTotal}Mi")
+ .withAmount(executorMemoryTotal.toString)
+ .withFormat("Mi")
.build()
val executorCpuQuantity = new QuantityBuilder(false)
.withAmount(executorCoresRequest)
QuantityBuilder changed the way it parses its input with this PR, which is available since fabric8 Kubernetes client v4.7.0.
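Once the image is rebuilt with this patch, one way to confirm the bumped client actually made it in is to list the jars inside the image (assuming the stock image layout with jars under /opt/spark/jars; the image name and tag are placeholders):
# Inspect the bundled client version inside the rebuilt image
docker run --rm --entrypoint sh <repo>/spark:<tag> -c 'ls /opt/spark/jars | grep kubernetes-client'
# kubernetes-client-4.7.1.jar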
Update 2
Spark images are based on openjdk:8-jdk-slim, which runs Java 8u252, which in turn has a bug related to OkHttp. To fix it we require fabric8 Kubernetes client v4.9.2; please refer to its release notes for more details.
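You can check the Java level of the base image directly; the exact update you get depends on when the image was pulled, but 8u252 is the level with the OkHttp-related bug mentioned above:
# Print the JDK version shipped in the base image
docker run --rm openjdk:8-jdk-slim java -version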
Also, the patch above can be simplified (note that io.fabric8.kubernetes.api.model.Quantity may need to be added to the imports in both files if it is not already in scope):
diff --git a/resource-managers/kubernetes/core/pom.xml b/resource-managers/kubernetes/core/pom.xml
index c9850ca512..595aaba8dd 100644
--- a/resource-managers/kubernetes/core/pom.xml
+++ b/resource-managers/kubernetes/core/pom.xml
@@ -29,7 +29,7 @@
<name>Spark Project Kubernetes</name>
<properties>
<sbt.project.name>kubernetes</sbt.project.name>
- <kubernetes.client.version>4.6.1</kubernetes.client.version>
+ <kubernetes.client.version>4.9.2</kubernetes.client.version>
</properties>
<dependencies>
diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala
index 575bc54ffe..a1559e07a4 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicDriverFeatureStep.scala
@@ -66,9 +66,7 @@ private[spark] class BasicDriverFeatureStep(
val driverCpuQuantity = new QuantityBuilder(false)
.withAmount(driverCpuCores)
.build()
- val driverMemoryQuantity = new QuantityBuilder(false)
- .withAmount(s"${driverMemoryWithOverheadMiB}Mi")
- .build()
+ val driverMemoryQuantity = new Quantity(s"${driverMemoryWithOverheadMiB}Mi")
val maybeCpuLimitQuantity = driverLimitCores.map { limitCores =>
("cpu", new QuantityBuilder(false).withAmount(limitCores).build())
}
diff --git a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
index d89995ba5e..b439ebf837 100644
--- a/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
+++ b/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
@@ -85,9 +85,7 @@ private[spark] class BasicExecutorFeatureStep(
// name as the hostname. This preserves uniqueness since the end of name contains
// executorId
val hostname = name.substring(Math.max(0, name.length - 63))
- val executorMemoryQuantity = new QuantityBuilder(false)
- .withAmount(s"${executorMemoryTotal}Mi")
- .build()
+ val executorMemoryQuantity = new Quantity(s"${executorMemoryTotal}Mi")
val executorCpuQuantity = new QuantityBuilder(false)
.withAmount(executorCoresRequest)
.build()
Source: https://stackoverflow.com/questions/60050782/spark-on-kubernetes-expected-http-101-response-but-was-403-forbidden