Kryo vs Encoder vs Java Serialization in Spark?

廉价感情. Submitted on 2021-02-08 10:40:35

Question


Which serialization is used in which case?
The Spark documentation says that it provides two serialization libraries:
1. Java (default)
2. Kryo
So where do Encoders come from, and why are they not mentioned in that part of the docs?
Databricks also says that Encoders perform faster for Datasets. What about RDDs, and how do all of these fit together? Which serializer should be used in which case?


Answer 1:


  • Encoders are used only with Datasets.
  • Kryo is used internally by Spark.
  • Kryo and Java serialization are both available for you to choose between for shuffling your own data.
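
To make the first point concrete, here is a minimal sketch of a Dataset using an Encoder. It assumes Spark SQL is on the classpath and a local master; the `Person` case class and the `EncoderSketch` object are hypothetical names used only for illustration. Normally `import spark.implicits._` provides the encoder implicitly; it is spelled out here to show where it comes from.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int)

object EncoderSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("encoder-sketch")
      .master("local[*]") // assumption: a local run for demonstration
      .getOrCreate()

    // Explicit encoder for a case class; `import spark.implicits._`
    // would normally supply an equivalent one implicitly.
    implicit val personEncoder: Encoder[Person] = Encoders.product[Person]

    // createDataset needs an Encoder[Person] in scope to build the Dataset.
    val people: Dataset[Person] =
      spark.createDataset(Seq(Person("Ann", 31), Person("Bob", 17)))

    people.filter(_.age >= 18).show()

    spark.stop()
  }
}
```

For a class with no built-in encoder, `Encoders.kryo[T]` can act as a fallback that stores each object as a single binary column.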

As for which one to use: if you work with RDDs rather than Datasets, Kryo is your best option. With Datasets you don't really have a choice, since Encoders are what gets used.
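
As a rough sketch of the RDD case, Kryo can be switched on through the `spark.serializer` setting. The example assumes Spark is on the classpath and a local master; the `Click` case class and the `KryoRddSketch` object are hypothetical names chosen for illustration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Hypothetical record type, used only for this illustration.
case class Click(userId: Long, url: String)

object KryoRddSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kryo-rdd-sketch")
      .setMaster("local[*]") // assumption: a local run for demonstration
      // Replace the default Java serializer with Kryo.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // Registering classes lets Kryo avoid writing full class names.
      .registerKryoClasses(Array(classOf[Click]))

    val spark = SparkSession.builder().config(conf).getOrCreate()

    val clicks = spark.sparkContext.parallelize(
      Seq(Click(1L, "/a"), Click(1L, "/b"), Click(2L, "/a")))

    // groupByKey triggers a shuffle, so the Click values pass through
    // the configured serializer.
    val clicksPerUser = clicks
      .map(c => c.userId -> c)
      .groupByKey()
      .mapValues(_.size)
      .collect()

    clicksPerUser.foreach(println)

    spark.stop()
  }
}
```

The same setting can also be passed at submit time with `--conf spark.serializer=org.apache.spark.serializer.KryoSerializer` instead of being set in code.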



Source: https://stackoverflow.com/questions/59298413/kryo-vs-encoder-vs-java-serialization-in-spark
