How to get the size of a DataFrame before doing a broadcast join in PySpark
Question: I am new to Spark. I want to do a broadcast join, and before that I am trying to get the size of the DataFrame that I want to broadcast. Is there any way to find the size of a DataFrame? I am using Python as my programming language for Spark. Any help is much appreciated.

Answer 1: If you are looking for the size in bytes as well as the size in row count, follow this:

Alternative 1

```
// ### Alternative -1
/**
 * file content
 * spark-test-data.json
 * --------------------
 * {"id":1,"name":"abc1"}
 * {"id":2,"name":
```
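The answer's code is cut off above. As a rough sketch of one way to get both numbers in PySpark (the language the question asks about): the row count comes from an ordinary `count()`, and the byte size can be read from the optimizer's plan statistics. Note that the `_jdf` / `queryExecution()` chain below is an internal API reached through py4j and may differ between Spark versions, and the small sample DataFrame is only a stand-in for the `spark-test-data.json` file mentioned in the answer.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("estimate-df-size").getOrCreate()

# Stand-in for the spark-test-data.json content shown in the answer.
df = spark.createDataFrame([(1, "abc1"), (2, "abc2")], ["id", "name"])

# Size in row count: a plain action, no internals needed.
row_count = df.count()

# Size in bytes: the Catalyst optimizer's estimate for this DataFrame's plan.
# NOTE: this reaches into JVM-side internals via py4j and the exact call chain
# can change between Spark versions (the shape below matches Spark 2.3+ / 3.x).
size_in_bytes = int(
    df._jdf.queryExecution().optimizedPlan().stats().sizeInBytes().toString()
)

print(f"rows={row_count}, estimated bytes={size_in_bytes}")
```

If the estimated size is below `spark.sql.autoBroadcastJoinThreshold` (10 MB by default), Spark may broadcast the table on its own; otherwise you can hint the broadcast explicitly with `pyspark.sql.functions.broadcast(df)` in the join.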