I'm trying to understand physical plans in Spark, but I don't understand some parts because they seem different from a traditional RDBMS. For example, in this plan below,
Tungsten is the new memory engine in Spark since 1.4, which manages data outside the JVM to save some GC overhead. As you can imagine, doing that involves copying data to and from the JVM. That's it. In Spark 1.5 you can turn Tungsten off through spark.sql.tungsten.enabled, and then you will see the "old" plan. In Spark 1.6 I think you can't turn it off any more.
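If you want to compare the two plans yourself, something like the following should work in spark-shell on Spark 1.5.x, where `sqlContext` is predefined. This is only a sketch: the `query` helper and the aggregation it builds are made up for illustration, so substitute your own DataFrame, and the exact operators printed will depend on your version and query.

```scala
// Assumes spark-shell on Spark 1.5.x, which provides `sqlContext`.
// `query` is a hypothetical stand-in for your own DataFrame; it is a def
// so that the DataFrame is rebuilt after the setting changes.
def query = sqlContext.range(0, 1000).groupBy("id").count()

// Physical plan with Tungsten enabled (the 1.5 default).
query.explain()

// Turn Tungsten off, then print the plan again to see the "old" operators.
sqlContext.setConf("spark.sql.tungsten.enabled", "false")
query.explain()
```

Comparing the two outputs side by side should make it clear which steps in the plan exist only because of Tungsten.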