Updating a dataframe column in spark

庸人自扰 2020-11-28 02:55

Looking at the new Spark DataFrame API, it is unclear whether it is possible to modify DataFrame columns.

How would I go about changing a value in row x?

5 answers
  •  盖世英雄少女心
    2020-11-28 03:40

    Commonly, when updating a column, we want to map an old value to a new value. Here's a way to do that in PySpark without UDFs:

    # Update df[update_col], mapping old_value --> new_value;
    # rows that do not match keep their current value.
    from pyspark.sql import functions as F
    df = df.withColumn(
        update_col,
        F.when(df[update_col] == old_value, new_value)
         .otherwise(df[update_col]),
    )
    
