In Spark there is a function, input_file_name, that I can use to create a new column with the path/filename for each row.
df.withColumn("path", f.inp