Question:
Is there a way to calculate the size in bytes of an Apache Spark DataFrame using PySpark?
Answer 1:
Why not just cache the DataFrame, then look in the Spark UI under the Storage tab and convert the units shown there to bytes?
df.cache()  # mark the DataFrame for caching (lazy)
df.count()  # caching is lazy, so run an action to actually populate the Storage tab
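
If you would rather get the number programmatically than read it off the UI, a common workaround is to reach into the JVM objects behind the Python API. This is only a sketch: it relies on Spark internals (the private _jdf and _jsc handles, and the developer-API getRDDStorageInfo), so it can break between Spark versions, and the no-argument stats() call shown needs Spark 2.3+.

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.range(1_000_000)  # stand-in for your DataFrame

# Catalyst's own estimate of the plan's output size in bytes
# (no caching required; goes through the private _jdf handle):
print(df._jdf.queryExecution().optimizedPlan().stats().sizeInBytes())

# Or read the same numbers the UI's Storage tab displays,
# once the cache has been materialized by an action:
df.cache()
df.count()  # force materialization
for rdd_info in spark.sparkContext._jsc.sc().getRDDStorageInfo():
    print(rdd_info.name(), rdd_info.memSize(), "bytes in memory,",
          rdd_info.diskSize(), "bytes on disk")

Note that the Catalyst figure is an estimate of the uncompressed plan output, while the storage numbers reflect the (possibly serialized and compressed) cached representation, so the two can differ substantially.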
Source: https://stackoverflow.com/questions/38180140/how-can-you-calculate-the-size-of-an-apache-spark-data-frame-using-pyspark