This is probably a stupid question originating from my ignorance. I have been working on PySpark for a few weeks now and do not have much programming experience to start wit
You aren't; the documentation explicitly says
Returns a new Dataset by adding a column or replacing the existing column that has the same name.
If you keep a variable referring to the dataframe you called withColumn on, it won't have the new column.
withColumn