Looking at the new spark dataframe api, it is unclear whether it is possible to modify dataframe columns.
How would I go about changing a value in row x
importing col, when from pyspark.sql.functions and updating fifth column to integer(0,1,2) based on the string(string a, string b, string c) into a new DataFrame.
from pyspark.sql.functions import col, when
data_frame_temp = data_frame.withColumn("col_5",when(col("col_5") == "string a", 0).when(col("col_5") == "string b", 1).otherwise(2))