Why does Spark fail with “Failed to get broadcast_0_piece0 of broadcast_0” in local mode?

孤城傲影 · 2021-02-06 10:59

I'm running this snippet to sort an RDD of points by distance from a given point and take the K nearest:

def getKNN(sparkContext:SparkContext, k:         
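A minimal sketch of the approach described, assuming a hypothetical Point case class and distance helper (the names here are illustrative, not from the original snippet):

    import org.apache.spark.rdd.RDD

    // Hypothetical point type; the original snippet does not show one.
    case class Point(x: Double, y: Double)

    object KNNSketch {
      // Squared Euclidean distance is sufficient for ranking neighbours.
      def distSq(a: Point, b: Point): Double = {
        val dx = a.x - b.x
        val dy = a.y - b.y
        dx * dx + dy * dy
      }

      // Sort the RDD by distance to `query` and take the k closest points,
      // mirroring the sort-then-take approach described above. For large RDDs,
      // points.takeOrdered(k)(Ordering.by(distSq(_, query))) avoids a full sort.
      def getKNN(points: RDD[Point], query: Point, k: Int): Array[Point] =
        points.sortBy(p => distSq(p, query)).take(k)
    }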


        
5 Answers

南旧 · 2021-02-06 11:40

    I was getting this error as well. I haven't seen many concrete code examples, so I'll share my solution. It cleared the error for me, though there may well be more than one fix for this problem. Still, it's worth a try, since it keeps everything within the code.

    It looks as though the SparkContext was shutting down, which triggered the error. I think the issue arises when the SparkContext is created in one class that other classes then extend: the extension causes it to shut down, which is a bit annoying. Below is the implementation I used to clear the error.

    Spark Initialisation Class:

    import org.apache.spark.{SparkConf, SparkContext}

    class Spark extends Serializable {
      def getContext: SparkContext = {
        // Lazily build a local-mode configuration on first access.
        @transient lazy val conf: SparkConf =
          new SparkConf()
            .setMaster("local")
            .setAppName("test")

        // Create the context and silence Spark's console logging.
        @transient lazy val sc: SparkContext = new SparkContext(conf)
        sc.setLogLevel("OFF")

        sc
      }
    }
    

    Main Class:

    import org.apache.spark.rdd.RDD

    object Test extends Spark {

      def main(args: Array[String]): Unit = {
        val sc = getContext
        val irisRDD: RDD[String] = sc.textFile("...")
        // ...
      }
    }
    

    Then just extend your other classes with the Spark class and it should all work out, for example:
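    Here is a hypothetical consumer class (the name and body are illustrative, not from the original answer) that mixes in the same initialisation:

    import org.apache.spark.rdd.RDD

    // Hypothetical example of extending another class with the Spark class.
    class IrisStats extends Spark {
      // Spark allows only one active SparkContext per JVM, so fetch the
      // context once and reuse it rather than calling getContext repeatedly.
      private val sc = getContext

      def lineCount(path: String): Long = sc.textFile(path).count()
    }

    Note that constructing a second SparkContext in the same JVM fails, so if several classes need one, share a single instance (or consider SparkContext.getOrCreate, which returns the existing context instead of building a new one).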

    I was getting the error while running LogisticRegression models, so I would assume this fix also applies when using other machine learning libraries.
