Is it possible to create nested RDDs in Apache Spark?

Asked by 甜味超标 on 2020-12-06 13:28 · 2 answers · 903 views

I am trying to implement the k-nearest neighbors algorithm in Spark. I was wondering if it is possible to work with nested RDDs. This would make my life a lot easier. Consider t

2 Answers
  •  情话喂你
    2020-12-06 14:30

    I ran into a NullPointerException while trying something of this sort, since we can't perform operations on RDDs from within another RDD.

    Spark doesn't support nesting of RDDs. The reason is that performing an operation on an RDD, or creating a new one, requires access to the SparkContext object, which is available only on the driver, not inside the tasks running on the executors.
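    As a concrete alternative for k-NN, Spark exposes `RDD.cartesian` to pair every element of one RDD with every element of another, which is the usual substitute for a nested-RDD loop. Here is a minimal pure-Python sketch of that pattern (plain lists stand in for RDDs; the function names, data points, and `k` are made up for illustration):

    ```python
    # Pure-Python analogue of the cartesian-product k-NN pattern.
    # In Spark you would write query_rdd.cartesian(data_rdd), then group
    # by query point, instead of iterating one RDD inside another.

    def euclidean(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    def knn_cartesian(queries, data, k):
        """For each query point, return its k nearest data points."""
        # All (query, data point) pairs, as cartesian() would produce them.
        pairs = [(q, p) for q in queries for p in data]
        by_query = {}
        for q, p in pairs:
            by_query.setdefault(q, []).append((euclidean(q, p), p))
        # Keep only the k closest data points per query.
        return {q: [p for _, p in sorted(ds)[:k]] for q, ds in by_query.items()}

    queries = [(0.0, 0.0)]
    data = [(1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
    print(knn_cartesian(queries, data, k=2))
    # → {(0.0, 0.0): [(0.5, 0.0), (1.0, 1.0)]}
    ```

    Note that a full cartesian product shuffles a lot of data; for large inputs you would normally partition or pre-filter candidates first.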

    Hence, if you want to operate on nested RDDs, you can collect the parent RDD on the driver node and then iterate over its items as an ordinary array or collection.
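    The collect-then-iterate workaround can be sketched as follows. This is a pure-Python stand-in with hypothetical names; in real Spark code you would collect the smaller RDD and typically broadcast it with `sc.broadcast` before referencing it inside a `map`:

    ```python
    # Sketch of the collect-on-driver workaround (hypothetical names).
    # In Spark:  centers = small_rdd.collect()        # pull small RDD to the driver
    #            bc = sc.broadcast(centers)           # ship it to executors once
    #            labeled = big_rdd.map(lambda p: (nearest(p, bc.value), p))

    def nearest(point, centers):
        """Index of the center closest to `point` (squared Euclidean distance)."""
        dist = lambda c: sum((x - y) ** 2 for x, y in zip(point, c))
        return min(range(len(centers)), key=lambda i: dist(centers[i]))

    centers = [(0.0, 0.0), (10.0, 10.0)]       # stands in for small_rdd.collect()
    big_data = [(1.0, 1.0), (9.0, 9.0), (0.0, 2.0)]
    labeled = [(nearest(p, centers), p) for p in big_data]  # stands in for big_rdd.map(...)
    print(labeled)
    # → [(0, (1.0, 1.0)), (1, (9.0, 9.0)), (0, (0.0, 2.0))]
    ```

    This only works when the collected RDD fits in driver (and executor) memory, which is exactly why it is a workaround rather than a general replacement for nesting.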

    Note: the RDD class itself is serializable (its declaration extends `Serializable`), so the failure you see is not a serialization problem but the missing SparkContext on the executors.
