ValueError: could not convert string to float in Pyspark

ε祈祈猫儿з submitted on 2019-12-08 14:20:35

Question


My Spark RDD looks something like this:

totalDistance=flightsParsed.map(lambda x:x.distance)
totalDistance.take(5)


[1979.0, 640.0, 1947.0, 1590.0, 874.0]

But when I run reduce on it, I get the error shown below:

totalDistance=flightsParsed.map(lambda x:x.distance).reduce(lambda y,z:y+z)

ValueError: could not convert string to float:

Please help.


Answer 1:


Did you try:

totalDistance=flightsParsed.map(lambda x: int(x.distance or 0))

or

totalDistance=flightsParsed.map(lambda x: float(x.distance or 0))

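You may have missing or inconsistent data inside flightsParsed. Note that take(5) only evaluates the first few records, whereas reduce processes every record, so a bad distance value further into the data only surfaces when reduce runs.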
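If the bad values are not just empty strings, a more defensive conversion may help. The following is a minimal sketch in plain Python, assuming the distance field can arrive as an empty string, None, or non-numeric text. safe_float is a hypothetical helper name, not part of PySpark, and the sample values are made up for illustration.

```python
def safe_float(value, default=0.0):
    """Convert value to float; fall back to default for missing or malformed input."""
    try:
        return float(value)
    except (TypeError, ValueError):
        # TypeError covers None, ValueError covers '' and non-numeric text
        return default


# Hypothetical sample mixing clean and dirty distance values
sample = ["1979.0", "640.0", "", None, "NA", 874.0]
cleaned = [safe_float(v) for v in sample]
total = sum(cleaned)
print(cleaned)
print(total)
```

With the RDD from the question, the same helper could presumably be used as flightsParsed.map(lambda x: safe_float(x.distance)).reduce(lambda y, z: y + z). If the float conversion actually happens inside the function that builds flightsParsed, the helper would need to be applied there instead. Substituting 0 for bad values changes the total silently, so filtering those records out may be preferable depending on the data.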



Source: https://stackoverflow.com/questions/47559522/valueerror-could-not-convert-string-to-float-in-pyspark
