If DataFrames in Spark are immutable, why are we able to modify them with operations such as withColumn()?

感动是毒 2020-12-20 23:15

This is probably a stupid question originating from my ignorance. I have been working with PySpark for a few weeks now and do not have much programming experience to start wit

2 answers
  •  天涯浪人
    2020-12-20 23:51

    You aren't modifying it; the documentation for withColumn explicitly says:

    Returns a new Dataset by adding a column or replacing the existing column that has the same name.

    If you keep a variable referring to the dataframe you called withColumn on, it won't have the new column.
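
The point above can be illustrated without a Spark cluster. What follows is a toy model in plain Python, not Spark code: `ToyFrame` and `with_column` are hypothetical names invented for this sketch, and the class only tracks column names. It mimics the pattern the answer describes, where the method builds and returns a new object and leaves the one it was called on untouched. What looks like modification is usually just rebinding the variable name to the new object.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyFrame:
    """Hypothetical stand-in for an immutable DataFrame (not Spark's class)."""
    columns: Tuple[str, ...]

    def with_column(self, name: str) -> "ToyFrame":
        # Build a new object; self is never changed.
        if name in self.columns:
            # Same name: the column is "replaced", so the column list is unchanged.
            return ToyFrame(self.columns)
        return ToyFrame(self.columns + (name,))


df = ToyFrame(("id", "value"))
original = df                      # second reference to the same object

df2 = df.with_column("doubled")    # returns a new frame
print(df.columns)                  # ('id', 'value')
print(df2.columns)                 # ('id', 'value', 'doubled')

# Rebinding the name looks like an in-place change, but it is not:
df = df.with_column("doubled")
print(df.columns)                  # ('id', 'value', 'doubled')
print(original.columns)            # ('id', 'value')
```

In PySpark the same habit shows up as `df = df.withColumn(...)`: the name `df` now points at the new DataFrame, while any other variable still holding the old one sees no new column.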
