What is an RDD in Spark?

傲寒 2020-12-12 19:20

Definition says:

An RDD is an immutable distributed collection of objects

I don't quite understand what it means. Is it like da

9 answers
  •  既然无缘
    2020-12-12 19:55

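    RDD stands for Resilient Distributed Dataset. It is the core abstraction in Spark and its low-level API; DataFrames and Datasets are built on top of RDDs. An RDD is a collection of records split into partitions that sit on any number of executors. RDDs are immutable, which means you cannot change an RDD once it has been created. You can, however, derive a new RDD from an existing one with a transformation (such as map or filter). Actions (such as collect or count) do not create RDDs; they run the computation and return a result to the driver.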
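
To make the immutability and transformation/action split concrete, here is a minimal pure-Python sketch. It is a toy model, not Spark's API: the `MiniRDD` class and its `_derive` helper are hypothetical names invented for illustration. The method names `map`, `filter`, `collect` and `count` mirror methods that real RDDs expose, but everything here runs in one process and only imitates the semantics described above.

```python
from typing import Callable, Iterable, Iterator, Tuple

Op = Callable[[Iterator], Iterator]


class MiniRDD:
    """Toy stand-in for an RDD (hypothetical class, NOT part of Spark)."""

    def __init__(self, partitions: Iterable[Iterable]):
        # Stored as tuples of tuples, so the data cannot be modified in place.
        # Each inner tuple plays the role of one partition on one executor.
        self._partitions: Tuple[tuple, ...] = tuple(tuple(p) for p in partitions)
        # The recorded transformations (the "lineage"); empty for source data.
        self._ops: Tuple[Op, ...] = ()

    def _derive(self, op: Op) -> "MiniRDD":
        # A transformation never touches `self`; it returns a new object
        # that remembers the parent's steps plus one more.
        child = MiniRDD(self._partitions)
        child._ops = self._ops + (op,)
        return child

    # Transformations: lazy, each returns a new MiniRDD.
    def map(self, f: Callable) -> "MiniRDD":
        return self._derive(lambda rows: (f(x) for x in rows))

    def filter(self, pred: Callable) -> "MiniRDD":
        return self._derive(lambda rows: (x for x in rows if pred(x)))

    # Actions: run the recorded steps and return a plain Python value.
    def collect(self) -> list:
        out = []
        for part in self._partitions:
            rows: Iterator = iter(part)
            for op in self._ops:
                rows = op(rows)
            out.extend(rows)
        return out

    def count(self) -> int:
        return len(self.collect())


numbers = MiniRDD([[1, 2, 3], [4, 5, 6]])          # two "partitions"
squares = numbers.map(lambda x: x * x)              # new MiniRDD, nothing computed yet
even_squares = squares.filter(lambda x: x % 2 == 0)

print(numbers.collect())       # [1, 2, 3, 4, 5, 6]  (the original is unchanged)
print(squares.collect())       # [1, 4, 9, 16, 25, 36]
print(even_squares.collect())  # [4, 16, 36]
print(even_squares.count())    # 3
```

Because each derived object keeps only the source partitions and the list of steps, its contents can always be recomputed from scratch. In this sketch that is a rough analogue of the "resilient" part of the name: Spark can rebuild a lost partition by replaying the transformations that produced it.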
