The definition says:
An RDD is an immutable distributed collection of objects.
I don't quite understand what it means. Is it like da
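An RDD (Resilient Distributed Dataset) is Spark's basic way of representing data. The data can come from JSON, CSV, a text file, or some other source. "Distributed" means the data is split into partitions that are spread across the nodes of the cluster, and "immutable" means an RDD is never changed in place: every transformation produces a new RDD. An RDD is fault tolerant, but not because every piece of data is copied to multiple locations by default. Spark records the lineage of each RDD (the chain of transformations that produced it), so if a node fails, the lost partitions can be recomputed from the source data. However, the RDD API is low-level: code written against it tends to be more verbose, and it does not benefit from Spark's query optimizer, so it is often slower. For most structured data the DataFrame and Dataset APIs are now preferred, although they are still built on top of RDDs.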
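To make "immutable", "distributed" and "recoverable" a bit more concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark's implementation or API: the class `ToyRDD` and its method `compute_partition` are hypothetical names made up for illustration, and everything runs in one process with lists standing in for partitions on different nodes. The method names `parallelize`, `map`, `filter` and `collect` only mirror the PySpark RDD methods of the same names.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, NOT Spark's API).

    It is immutable, split into partitions, and every derived
    ToyRDD remembers its lineage (parent + function) instead of
    storing its own copy of the data.
    """

    def __init__(self, partitions=None, parent=None, fn=None):
        # Only a source ToyRDD holds data; tuples keep it read-only.
        self._source = (
            tuple(tuple(p) for p in partitions) if partitions is not None else None
        )
        self._parent = parent  # lineage: the ToyRDD this one was derived from
        self._fn = fn          # lineage: how one partition is derived

    @classmethod
    def parallelize(cls, data, num_partitions):
        # Split the data round-robin; each slice plays the role of
        # a partition that would live on a different node.
        data = list(data)
        return cls(partitions=[data[i::num_partitions] for i in range(num_partitions)])

    @property
    def num_partitions(self):
        if self._source is not None:
            return len(self._source)
        return self._parent.num_partitions

    def map(self, f):
        # Returns a NEW ToyRDD; self is left untouched.
        return ToyRDD(parent=self, fn=lambda part: [f(x) for x in part])

    def filter(self, pred):
        return ToyRDD(parent=self, fn=lambda part: [x for x in part if pred(x)])

    def compute_partition(self, i):
        # Rebuild partition i by replaying the lineage from the source.
        # This is the idea behind recovering a partition lost with a node.
        if self._parent is None:
            return list(self._source[i])
        return self._fn(self._parent.compute_partition(i))

    def collect(self):
        # Gather all partitions into one local list.
        result = []
        for i in range(self.num_partitions):
            result.extend(self.compute_partition(i))
        return result


numbers = ToyRDD.parallelize(range(1, 11), num_partitions=3)
squares_of_evens = numbers.filter(lambda x: x % 2 == 0).map(lambda x: x * x)

print(sorted(squares_of_evens.collect()))     # squares of the even numbers
print(sorted(numbers.collect()))              # the original is unchanged
print(squares_of_evens.compute_partition(1))  # one partition rebuilt from lineage
```

Note that `filter` and `map` never modify `numbers`; they only describe how to derive new data from it. That recorded description is what lets a single lost partition be recomputed without keeping a second copy of it, which is roughly what the "resilient" in the name refers to.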