Spark: Dataset Serialization
Question:

If I have a Dataset in which each record is a case class, and I persist that Dataset as shown below so that serialization is used:

myDS.persist(StorageLevel.MEMORY_ONLY_SER)

does Spark use Java or Kryo serialization to serialize the Dataset? Or, as with a DataFrame, does Spark have its own way of storing the data in the Dataset?

Answer 1:

A Spark Dataset does not use the standard serializers. Instead, it uses Encoders, which "understand" the internal structure of the data and can efficiently transform objects
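
To make the role of Encoders concrete, here is a minimal sketch, not a definitive implementation. The `Person` case class, the `EncoderSketch` object, the local master setting, and the sample rows are all assumptions made for illustration; only `myDS` and the storage level come from the question. It prints the schema that the case class encoder exposes, persists the Dataset with the storage level from the question, and then shows the explicit Kryo encoder for comparison.

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}
import org.apache.spark.storage.StorageLevel

// Hypothetical record type, used only for illustration.
case class Person(name: String, age: Int)

object EncoderSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimenting; a real job would get its master from spark-submit.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("encoder-sketch")
      .getOrCreate()
    import spark.implicits._

    // Encoder derived from the case class; its schema has one column per field.
    val productEncoder: Encoder[Person] = Encoders.product[Person]
    productEncoder.schema.printTreeString()

    // toDS() picks up the implicit case class encoder from spark.implicits._
    val myDS = Seq(Person("Ann", 30), Person("Bob", 25)).toDS()
    myDS.persist(StorageLevel.MEMORY_ONLY_SER)
    myDS.count() // run an action so the Dataset is actually cached

    // Kryo is used only when requested explicitly; the whole object is then
    // expected to be stored as one opaque binary column.
    val kryoEncoder: Encoder[Person] = Encoders.kryo[Person]
    kryoEncoder.schema.printTreeString()

    myDS.unpersist()
    spark.stop()
  }
}
```

Comparing the two printed schemas is a quick way to check which representation a given Dataset uses: a column per field suggests the built-in encoder, while a single binary column suggests a Kryo or Java serialization encoder was supplied.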