Python: reduce by key with an if-condition statement?

走远了吗. Submitted on 2019-12-11 19:13:41

Question


(K1, (v1, v2))
(K2, (v3, v4))
(K1, (v1, v5))
(K2, (v3, v6))

How can I sum the second values for each key, provided the first value is the same, so that I get (K1, (v1, v2+v5)), (K2, (v3, v4+v6))?


Answer 1:


IIUC, you need to make the first value part of the key before the reduce, and then map the result back into the desired format.

You should be able to do the following:

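# Re-key on (key, first value) so rows are merged only when both match.
new_rdd = rdd.map(lambda row: ((row[0], row[1][0]), row[1][1]))\
    .reduceByKey(lambda a, b: a + b)\
    .map(lambda row: (row[0][0], (row[0][1], row[1])))  # restore (key, (first, total))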
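
If you want to check the logic without a Spark session, the same re-keying idea can be sketched in plain Python. This is only an illustration: the function name reduce_by_key_and_first and the sample numbers are made up here, not taken from the question.

```python
from collections import defaultdict


def reduce_by_key_and_first(rows):
    """Sum the second values of rows that share both key and first value."""
    totals = defaultdict(int)
    for key, (first, second) in rows:
        # The composite key plays the role of the re-keyed RDD element.
        totals[(key, first)] += second
    # Map back to the original (key, (first, total)) shape.
    return [(key, (first, total)) for (key, first), total in totals.items()]


# Hypothetical sample data standing in for (K1, (v1, v2)), etc.
rows = [
    ("K1", (1, 2)),
    ("K2", (3, 4)),
    ("K1", (1, 5)),
    ("K2", (3, 6)),
]
result = reduce_by_key_and_first(rows)
print(sorted(result))
```

Rows whose first value differs end up under a different composite key, so they are left unmerged, which is the "if condition" the question asks for.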


Source: https://stackoverflow.com/questions/54226549/python-reduce-by-key-with-if-condition-statement
