PySpark SparkSession Builder with Kubernetes Master

我在风中等你 2021-02-09 04:19

I recently saw a pull request that was merged to the Apache/Spark repository that apparently adds initial Python bindings for PySpark on K8s. I posted a comment to the PR asking

1 answer
  • 2021-02-09 05:15

    PySpark client mode on Kubernetes works as of Spark 2.4.0, the latest version when I tried it.

    This is how I did it (in JupyterLab):

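    import os

    # PYSPARK_PYTHON is the interpreter used by the executors,
    # PYSPARK_DRIVER_PYTHON the one used by the driver; keep them on the same version.
    os.environ['PYSPARK_PYTHON'] = "/usr/bin/python3.6"
    os.environ['PYSPARK_DRIVER_PYTHON'] = "/usr/bin/python3.6"

    from pyspark import SparkConf
    from pyspark.sql import SparkSession

    sparkConf = SparkConf()
    # The k8s:// prefix points Spark at the Kubernetes API server.
    sparkConf.setMaster("k8s://https://localhost:6443")
    sparkConf.setAppName("KUBERNETES-IS-AWESOME")
    # Image used for the executor pods, and the namespace they are created in.
    sparkConf.set("spark.kubernetes.container.image", "robot108/spark-py:latest")
    sparkConf.set("spark.kubernetes.namespace", "playground")

    # The driver runs in this notebook process (client mode); executors run as pods.
    spark = SparkSession.builder.config(conf=sparkConf).getOrCreate()
    sc = spark.sparkContext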
    

    Note: I am running Kubernetes locally on macOS with Docker Desktop.
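
In client mode the executor pods have to connect back to the driver, which works out of the box on a local Docker Desktop cluster but may need `spark.driver.host` set on a remote one. The following is a minimal sketch, not part of the original answer, that collects the same settings in a plain dict so they can be inspected before a session is created. The helper names `k8s_client_conf` and `build_session` and the parameters `driver_host` and `executors` are hypothetical; the configuration keys are standard Spark properties.

```python
def k8s_client_conf(master_url, image, namespace, driver_host=None, executors=2):
    """Return Spark settings for a client-mode session on Kubernetes as a dict.

    Hypothetical helper: the function and parameter names are illustrative only.
    """
    # Spark expects the API server URL to carry the k8s:// prefix.
    if not master_url.startswith("k8s://"):
        master_url = "k8s://" + master_url

    settings = {
        "spark.master": master_url,
        "spark.submit.deployMode": "client",
        "spark.kubernetes.container.image": image,
        "spark.kubernetes.namespace": namespace,
        "spark.executor.instances": str(executors),
    }
    # Only needed when executors cannot reach the driver under its default hostname.
    if driver_host:
        settings["spark.driver.host"] = driver_host
    return settings


def build_session(settings, app_name="KUBERNETES-IS-AWESOME"):
    """Create a SparkSession from the settings dict (requires pyspark and a cluster)."""
    # Imported lazily so the dict can be built and checked without pyspark installed.
    from pyspark import SparkConf
    from pyspark.sql import SparkSession

    conf = SparkConf().setAppName(app_name).setAll(list(settings.items()))
    return SparkSession.builder.config(conf=conf).getOrCreate()


# Same values as in the answer above.
settings = k8s_client_conf(
    "https://localhost:6443", "robot108/spark-py:latest", "playground"
)
```

Calling `build_session(settings)` in the notebook should then give a session equivalent to the one built above, assuming the cluster and image are reachable.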
