Spark query running very slow

野性不改 2021-02-20 07:09

I have a cluster on AWS with 2 slaves and 1 master. All instances are of type m1.large. I'm running Spark version 1.4. I'm benchmarking the performance of Spark over 4m data c

2 answers
  •  醉话见心
    2021-02-20 08:02

    1. Set spark.default.parallelism to 2
    2. Start Spark with --num-executor-cores 8
    3. Modify this part

    df.registerTempTable('test')
    d=sqlContext.sql("""...

    to

    df.registerTempTable('test')
    sqlContext.cacheTable("test")
    d=sqlContext.sql("""...
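The change in step 3 can be wrapped in a small helper. This is only a sketch: the name `cache_and_query` and its parameters are made up for illustration, and it assumes `sql_context` is a Spark 1.x `SQLContext` and `df` is a DataFrame, as in the question. It uses only the three calls already shown above.

```python
def cache_and_query(sql_context, df, table_name, query):
    """Register df as a temp table, cache it, then run query (hypothetical helper).

    Assumes the Spark 1.x API: DataFrame.registerTempTable,
    SQLContext.cacheTable and SQLContext.sql.
    """
    # Expose the DataFrame to SQL under the given table name.
    df.registerTempTable(table_name)
    # Ask Spark to keep the table in memory so repeated queries
    # do not have to re-read the source data.
    sql_context.cacheTable(table_name)
    # Run the query against the (now cacheable) table.
    return sql_context.sql(query)
```

It would be called as `d = cache_and_query(sqlContext, df, 'test', query)`, where `query` is the SQL string from the question. Caching mostly pays off when the same table is queried more than once, so a benchmark that runs a single query may not show much difference.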
