How to add a new column to a Spark RDD?
I have an RDD with many columns (e.g., hundreds). How do I add one more column at the end of this RDD? For example, if my RDD looks like this:

123, 523, 534, ..., 893
536, 98, 1623, ..., 98472
537, 89, 83640, ..., 9265
7297, 98364, 9, ..., 735
......
29, 94, 956, ..., 758

how can I add a column whose value is the sum of the second and third columns? Thank you very much.

You do not have to use Tuple* objects at all to add a new column to an RDD. It can be done by mapping each row to its original contents plus the elements you want to append, for example:

val rdd = ... val
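A minimal sketch of the map-based approach described above, assuming each row is held as an `Array[Int]`. The object `AddColumn`, the helper `appendSum`, and the four-column sample rows are hypothetical names and data introduced only for illustration. A plain Scala `Seq` stands in for the RDD so the example runs without a Spark installation; an `RDD[Array[Int]]` accepts the same function through its own `map` method.

```scala
object AddColumn {
  // Hypothetical helper: returns a copy of the row with one extra element,
  // the sum of the second and third columns (indices 1 and 2).
  def appendSum(row: Array[Int]): Array[Int] =
    row :+ (row(1) + row(2))

  def main(args: Array[String]): Unit = {
    // Made-up sample rows; the elided middle columns from the question are left out.
    val rows = Seq(
      Array(123, 523, 534, 893),
      Array(536, 98, 1623, 98472)
    )

    // With Spark, the same function would be passed to RDD.map, e.g.
    //   val withSum = rdd.map(appendSum)   // assuming rdd: RDD[Array[Int]]
    val withSum = rows.map(appendSum)

    // Print each extended row as comma-separated values.
    withSum.foreach(r => println(r.mkString(", ")))
  }
}
```

This assumes every row has at least three elements. If the RDD holds raw text lines instead, each line would first need to be split and parsed into numbers before the new column can be computed.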