Parquet files store a row count for each block (row group) in the file footer. Spark seems to read it at some point (SpecificParquetRecordReaderBase.java#L151).
I tried this in spark-
spark-
To make a large count easier to read, we can also format it with grouping separators:
java.text.NumberFormat.getIntegerInstance.format(sparkdf.count)
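
As a rough illustration of what that NumberFormat call does, here is a minimal standalone Java sketch. The class name CountFormat, the helper format, and the hard-coded value are made up for the example; in Spark the number would come from sparkdf.count. The locale is pinned to US here only so the output is predictable, whereas the one-liner above uses the JVM's default locale.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical helper class, not part of Spark or Parquet.
final class CountFormat {
    // Formats a row count as an integer with the locale's grouping separators.
    static String format(long count, Locale locale) {
        return NumberFormat.getIntegerInstance(locale).format(count);
    }

    public static void main(String[] args) {
        // Stand-in value; in a Spark job this would be the result of sparkdf.count.
        long rowCount = 1234567L;
        System.out.println(format(rowCount, Locale.US)); // prints 1,234,567
    }
}
```

With a different locale the separator changes (for example, Locale.GERMANY uses a period), so passing the locale explicitly may be preferable when the output is shown to users.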