cross combine two RDDs using pyspark
Question

How can I cross combine the two RDDs (is that the correct way to describe it)?

input:
rdd1 = [a, b]
rdd2 = [c, d]

output:
rdd3 = [(a, c), (a, d), (b, c), (b, d)]

I tried rdd3 = rdd1.flatMap(lambda x: rdd2.map(lambda y: (x, y))), but it complains: "It appears that you are attempting to broadcast an RDD or reference an RDD from an action or transformation." I guess that means you cannot nest actions the way you would in a list comprehension, and that one statement can only perform one action.

Answer 1:

So as you have