Is there a way to calculate the size in bytes of an Apache Spark DataFrame using PySpark?
Why don't you just cache the DataFrame, then look in the Spark UI under the Storage tab and convert the reported units to bytes?

df.cache()   # mark the DataFrame for caching (lazy; nothing happens yet)
df.count()   # run an action so the cache is populated and its size shows up in the Storage tab
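If you want the number programmatically rather than reading it off the UI, a rough sketch is below. It leans on Spark's JVM bridges (`_jsc`, `_jdf`), which are internal and can change between versions, so treat it as an illustration of the same idea (cache, then read the storage statistics), not a stable recipe. The optimizer-statistics call at the end assumes Spark 2.4+ where `stats` takes no arguments.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("df-size-example").getOrCreate()
df = spark.range(1_000_000).toDF("id")   # any DataFrame you want to measure

df.cache()
df.count()   # an action is required so the cache is actually materialized

# getRDDStorageInfo() reports what the Storage tab shows: bytes in memory / on disk per cached RDD
for rdd_info in spark.sparkContext._jsc.sc().getRDDStorageInfo():
    print(rdd_info.name(), rdd_info.memSize(), "bytes in memory,",
          rdd_info.diskSize(), "bytes on disk")

# An alternative estimate from the Catalyst optimizer's statistics (internal API, version-dependent)
size_in_bytes = df._jdf.queryExecution().optimizedPlan().stats().sizeInBytes()
print("optimizer estimate:", size_in_bytes)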
Source: https://stackoverflow.com/questions/38180140/how-can-you-calculate-the-size-of-an-apache-spark-data-frame-using-pyspark