Is anyone familiar with how to parse a CSV file and put it into a string list? Right now I am taking the entire CSV file and putting it into the string list. I am try
One might prefer to do it this way:
QStringList MainWindow::parseCSV(const QString &string)
{
    enum State {Normal, Quote} state = Normal;
    QStringList fields;
    QString value;

    for (int i = 0; i < string.size(); i++)
    {
        const QChar current = string.at(i);

        // Normal state
        if (state == Normal)
        {
            // Comma
            if (current == ',')
            {
                // Save field
                fields.append(value.trimmed());
                value.clear();
            }
            // Double-quote
            else if (current == '"')
            {
                state = Quote;
                value += current;
            }
            // Other character
            else
                value += current;
        }
        // In-quote state
        else if (state == Quote)
        {
            // Another double-quote
            if (current == '"')
            {
                // A double double-quote?
                if (i+1 < string.size() && string.at(i+1) == '"')
                {
                    value += '"';
                    // Skip a second quote character in a row
                    i++;
                }
                else
                {
                    state = Normal;
                    value += '"';
                }
            }
            // Other character
            else
                value += current;
        }
    }

    // Save the last field (skipped if nothing follows the final comma)
    if (!value.isEmpty())
        fields.append(value.trimmed());

    // Quotes are left in until here; so when fields are trimmed, only whitespace outside of
    // quotes is removed. The quotes are removed here.
    for (int i=0; i<fields.size(); ++i)
        if (fields[i].length()>=1 && fields[i].left(1)=='"')
        {
            fields[i]=fields[i].mid(1);
            if (fields[i].length()>=1 && fields[i].right(1)=='"')
                fields[i]=fields[i].left(fields[i].length()-1);
        }

    return fields;
}
Edit: I've finally got around to making this trim spaces before and after the fields. Neither whitespace nor commas inside quotes are touched. Otherwise, all whitespace is trimmed from the start and end of a field. After puzzling about this for a while, I hit on the idea that the quotes could be left around the field while parsing, so that every field can simply be trimmed. That way, only whitespace before and after the quotes or text is removed. A final step was then added to strip the quotes from fields that start and end with quotes.
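To illustrate the trim-then-strip order described in this edit, here is a minimal stand-alone sketch in standard C++ (no Qt), so it can be compiled on its own. The helper name trimThenUnquote is made up for this illustration and is not part of the code above, and the whitespace set only approximates what QString::trimmed() removes.

```cpp
#include <string>

// Hypothetical helper (not part of parseCSV): trims whitespace around a raw
// field first, then removes the enclosing double quotes.
std::string trimThenUnquote(const std::string &raw)
{
    // Assumption: this ASCII set stands in for QString::trimmed().
    const char *ws = " \t\n\v\f\r";

    const std::size_t first = raw.find_first_not_of(ws);
    if (first == std::string::npos)
        return std::string();               // field was empty or all whitespace
    const std::size_t last = raw.find_last_not_of(ws);
    std::string field = raw.substr(first, last - first + 1);

    // Same order as the final loop of parseCSV: drop a leading quote, and
    // drop a trailing quote only if a leading one was present.
    if (!field.empty() && field.front() == '"')
    {
        field.erase(0, 1);
        if (!field.empty() && field.back() == '"')
            field.pop_back();
    }
    return field;
}
```

Because the quotes are still in place when the trimming happens, whitespace inside them is out of reach of the trim and survives; only the whitespace outside the quotes is lost.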
Here is a more or less challenging test case:
QStringList sl=
{
    "\"one\"",
    " \" two \"\"\" , \" and a half ",
    "three ",
    "\t four"
};
for (int i=0; i < sl.size(); ++i)
    qDebug() << parseCSV(sl[i]);
This corresponds to the file
"one"
" two """ , " and a half
three
four
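where the second line begins and ends with a space, the third line ends with a space, and the fourth line begins with a tab followed by a space.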
Its output is as follows (qDebug() escapes quotes inside a string as \", wraps each string in quotes, and wraps each list in parentheses):
("one")
(" two \"", " and a half")
("three")
("four")
You can observe that the quote and the extra spaces were preserved inside the quote for item "two". In the malformed case for "and a half", the space before the quote, and those after the last word, were removed; but the others were not. Missing terminal spaces in this routine could be an indication of a missing terminal quote. Quotes in a field that don't start or end it are just treated as part of a string. A quote isn't removed from the end of a field if one doesn't start it. To detect an error here, just check for a field that starts with a quote, but doesn't end with one; and/or one that contains quotes but doesn't start and end with one, in the final loop.
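The error check suggested above could look roughly like the following. This is only a sketch in standard C++ (no Qt) so that it compiles on its own; the names QuoteCheck and checkQuotes are invented for the illustration. It assumes the field has already been trimmed but still carries its quotes, which is the state of each field just before the final loop strips them.

```cpp
#include <string>

// Hypothetical result type for the sketch.
enum class QuoteCheck { Ok, MissingClosingQuote, StrayQuote };

// Hypothetical helper: inspects a trimmed field that still has its quotes.
QuoteCheck checkQuotes(const std::string &field)
{
    if (field.find('"') == std::string::npos)
        return QuoteCheck::Ok;                      // no quotes at all

    const bool startsWithQuote = field.front() == '"';
    // A lone quote character counts as an opening quote only.
    const bool endsWithQuote = field.size() >= 2 && field.back() == '"';

    if (startsWithQuote && !endsWithQuote)
        return QuoteCheck::MissingClosingQuote;     // e.g. " and a half
    if (!startsWithQuote)
        return QuoteCheck::StrayQuote;              // quote inside or at the end only
    return QuoteCheck::Ok;                          // starts and ends with a quote
}
```

This is a heuristic, not a full validator: escaped quotes have already been collapsed by the time the check runs, so some malformed input can still look well-formed. Inside parseCSV itself, testing whether state is still Quote once the main loop finishes would probably be a more direct sign of a missing closing quote.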
More than was needed for your test case, I know; but a solid general answer to the question nonetheless, perhaps useful for others who find it.
Adapted from: https://github.com/hnaohiro/qt-csv/blob/master/csv.cpp