I am converting some code from pandas to PySpark. In pandas, let's imagine I have the following mock dataframe, df:
Also in pandas, I define a certain variable: