My csv data looks like this:
heading1,heading2,heading3,heading4,heading5,value1_1,value2_1,value3_1,value4_1,value5_1,value1_2,value2_2,value3_2,val
function CSVParse(csvFile)
{
this.rows = [];
var fieldRegEx = new RegExp('(?:\s*"((?:""|[^"])*)"\s*|\s*((?:""|[^",\r\n])*(?:""|[^"\s,\r\n]))?\s*)(,|[\r\n]+|$)', "g");
var row = [];
var currMatch = null;
while (currMatch = fieldRegEx.exec(this.csvFile))
{
row.push([currMatch[1], currMatch[2]].join('')); // concatenate with potential nulls
if (currMatch[3] != ',')
{
this.rows.push(row);
row = [];
}
if (currMatch[3].length == 0)
break;
}
}
I like to have the regex do as much as possible. This regex treats all items as either quoted or unquoted, followed by either a column delimiter, or a row delimiter. Or the end of text.
Which is why that last condition -- without it it would be an infinite loop since the pattern can match a zero length field (totally valid in csv). But since $ is a zero length assertion, it won't progress to a non match and end the loop.
And FYI, I had to make the second alternative exclude quotes surrounding the value; seems like it was executing before the first alternative on my javascript engine and considering the quotes as part of the unquoted value. I won't ask -- just got it to work.