I have a dataframe in PySpark. Some of its numerical columns contain 'nan', so when I read the data and check the schema of the dataframe, those columns come back as 'string' type.
You could use cast after replacing NaN with 0, for example:
# Cast the call_time column to float, storing the result in a new column "Plays"
data_df = df.withColumn("Plays", df.call_time.cast('float'))