I have a CSV file that is structured this way:
Header
Blank Row
"Col1","Col2"
"1,200","1,456"
"2,000","3,450"
I have two problems: skipping the header line and the blank row at the top, and parsing the quoted values that contain commas.
Why not try the DataFrameReader API from pyspark.sql? It is quite easy, and for this problem a single line should be enough:
df = spark.read.csv("myFile.csv") # By default, quote char is " and separator is ','
With this API you can also tweak a few other parameters, such as header handling and ignoring leading and trailing whitespace. Here is the link: DataFrameReader API
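For reference, the same skip-then-parse logic can be sketched without Spark using Python's built-in csv module; this is just a standalone illustration, assuming the file looks exactly like the sample in the question:

```python
import csv

# Sample file content from the question: a title line, a blank row,
# then a quoted header and quoted values that contain embedded commas.
raw = 'Header\n\n"Col1","Col2"\n"1,200","1,456"\n"2,000","3,450"\n'

# Skip the first two lines (title + blank row), then let csv.reader
# honor the quoting so embedded commas are not treated as separators.
lines = raw.splitlines()[2:]
rows = list(csv.reader(lines))

header, data = rows[0], rows[1:]
print(header)  # ['Col1', 'Col2']
print(data)    # [['1,200', '1,456'], ['2,000', '3,450']]
```

In Spark you would get the equivalent effect by reading with `quote='"'` (the default) and dropping the leading non-data rows before, or after, the read.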