Can someone explain why reading the same subset of columns from Parquet using different methods can result in different Input and shuffle sizes (when the data is shuffled)?