PySpark When item in list

Submitted by 倾然丶 夕夏残阳落幕 on 2019-12-09 03:46:40

Question


This is what I'm trying to achieve:

types = ["200","300"]
def Count(ID):
    cnd = F.when((**F.col("type") in types**), 1).otherwise(F.lit(0))
    return F.sum(cnd).alias("CountTypes")

The syntax in bold is not correct. Any suggestions on how to get the right syntax for PySpark?


Answer 1:


I'm not sure what you are trying to achieve, but here is the correct syntax. Python's in operator cannot be applied to a Column expression; use the Column's isin method instead:

types = ["200","300"]
from pyspark.sql import functions as F

cnd = F.when(F.col("type").isin(types),F.lit(1)).otherwise(F.lit(0))
sum_on_cnd = F.sum(cnd).alias("count_types")
# Column<b'sum(CASE WHEN (type IN (200, 300)) THEN 1 ELSE 0 END) AS `count_types`'>


Source: https://stackoverflow.com/questions/41328352/pyspark-when-item-in-list
