Apache Spark: network errors between executors


I'm running Apache Spark 1.3.1 on Scala 2.11.2, and when running on an HPC cluster with large enough data, I get numerous errors like the ones at the bottom of my post (rep…

2 Answers
  • 2020-12-08 20:36

    This appears to be a bug in the Netty-based networking system (the block transfer service) introduced in Spark 1.2. Adding .set("spark.shuffle.blockTransferService", "nio") to my SparkConf fixed the bug, so now everything works perfectly (see the sketch below).
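
    A minimal sketch of that workaround, in case it helps; the app name is a placeholder, and note that this setting only applies to older releases (the nio transfer service was later removed from Spark):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("MyApp") // placeholder app name
      // Fall back to the older NIO-based block transfer service instead of
      // Netty, which became the default in Spark 1.2
      .set("spark.shuffle.blockTransferService", "nio")
    val sc = new SparkContext(conf)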

    I found a post on the spark-user mailing list from someone who was running into similar errors, and they suggested trying nio instead of Netty.

    SPARK-5085 is similar: changing from Netty to nio fixed the issue there as well, though they were also able to fix it by changing some networking settings. (I haven't tried that myself yet, since I'm not sure I have the necessary privileges on the cluster.)

  • 2020-12-08 20:59

    It's also possible that the Spark version in your Maven config doesn't match the Spark version installed on your cluster.

    For example, you may have picked up a pom.xml from a blog post tutorial:

    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <!-- The artifactId suffix is the Scala version, not the Spark version -->
            <artifactId>spark-core_2.11</artifactId>
            <version>1.3.1</version>
        </dependency>
    </dependencies>
    

    But you may have downloaded the latest 2.3 release from the Apache Spark website, in which case the version your driver compiles against won't match the version the cluster runs.
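
    One quick way to check for such a mismatch is to compare the version the cluster actually runs against the one in your pom; for instance, from a spark-shell session on the cluster (sc is the SparkContext the shell creates for you), or by running spark-submit --version from the command line:

    // Print the Spark version the cluster is running, e.g. "1.3.1" or "2.3.0"
    println(sc.version)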
