I am pretty new to PySpark and have been processing a large number of files. I have around 2.5 TB of data, and I am extracting some metadata from each of these files and storin