CSV parser in JAVA, double quotes in string (SuperCSV, OpenCSV)

Posted by 十年热恋 on 2019-12-01 10:44:52

It's not clear from your question which of these you're asking, so I'll answer both.

1. My data contains quotes - why are they being stripped out?

In this case, I'd point you to the CSV specification (RFC 4180): your CSV file is not properly escaped, so those quotes are read as field delimiters and aren't actually part of your data.

It should be

1,"""Bob""",London,12

not

1,"Bob",London,12

2. How do I apply quotes when writing (even if the data doesn't contain commas, quotes, etc)?

By default Super CSV only quotes a field if necessary (the field contains a comma, double quote or newline).

If you really want to force quotes, then you can configure Super CSV with a quote mode.

For example, you could always quote the name column in your example with the following preferences:

// Column numbers are 1-based, so 2 is the name column (id, name, city, age).
private static final CsvPreference ALWAYS_QUOTE_NAME_COL =
    new CsvPreference.Builder(CsvPreference.STANDARD_PREFERENCE)
        .useQuoteMode(new ColumnQuoteMode(2))
        .build();

Alternatively, if you want to quote everything then you can use AlwaysQuoteMode, or if you want a completely custom solution, then you can write your own QuoteMode.
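To make the difference between "quote only if necessary" and "always quote this column" concrete, here is a minimal plain-Java sketch of the writing side. It is not Super CSV's implementation; the class QuoteModeSketch and its methods are hypothetical names made up for illustration, and it assumes 1-based column numbers and a comma delimiter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Super CSV: it imitates "quote if necessary"
// plus forced quoting for selected columns.
class QuoteModeSketch {

    // True when the value contains a character that forces quoting
    // (comma, double quote, carriage return or line feed).
    static boolean needsQuotes(String value) {
        return value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
    }

    // Wraps the value in double quotes and doubles any embedded double quote.
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Formats one row. Columns listed in alwaysQuoteColumns (1-based, an
    // assumption of this sketch) are quoted even when they don't need it.
    static String formatRow(List<String> values, Set<Integer> alwaysQuoteColumns) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            String value = values.get(i);
            boolean force = alwaysQuoteColumns.contains(i + 1);
            out.add(force || needsQuotes(value) ? quote(value) : value);
        }
        return String.join(",", out);
    }

    public static void main(String[] args) {
        List<String> plain = List.of("1", "Bob", "London", "12");
        List<String> withQuotes = List.of("1", "\"Bob\"", "London", "12");

        System.out.println(formatRow(plain, Set.of()));      // 1,Bob,London,12
        System.out.println(formatRow(plain, Set.of(2)));     // 1,"Bob",London,12
        System.out.println(formatRow(withQuotes, Set.of())); // 1,"""Bob""",London,12
    }
}
```

Note the last line: when the quotes are really part of the data, the field gets wrapped in quotes and each embedded quote is doubled, which is the properly escaped form from the first part of this answer.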

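You can create your own preference, for example one that uses the single quote as the quote character.

// quote char ', delimiter ',', end-of-line symbols "\n"
CsvPreference singleQuotePreference = new CsvPreference.Builder('\'', ',', "\n").build();
CsvListReader parser = new CsvListReader(Files.newBufferedReader(pathToFile, StandardCharsets.UTF_8), singleQuotePreference);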

After that, the output will be as expected. With this preference, double quotes are no longer special, so they are kept untouched as part of the data; single quotes that wrap a field in your CSV file will be stripped instead.

In the CsvPreference.EXCEL_PREFERENCE you've used, the quote character is ", as described in the javadoc. The quote character is the character used to wrap a field containing special characters that you want to appear literally.

As such, for these preferences, the appropriate way to produce your CSV content would be

id, name, city, age
1,"""Bob""",London,12

Otherwise, the CSV parser simply thinks

"Bob"

means, literally,

Bob

since there is no other special character between the quotes. A quote is itself a special character, though, so when it appears (doubled) inside a quoted field, it is read, literally, as a quote.

Alternatively, provide a different CsvPreference object which has a different quote character.

Make this decision only after you are certain about what your CSV producer is sending you.
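To illustrate why the parser drops the quotes around "Bob" but keeps them when they are escaped, here is a rough single-line parser in plain Java. It is only a sketch of the quoting rule described above, not how Super CSV or OpenCSV are actually implemented; QuoteParsingSketch and parseLine are hypothetical names, and multi-line fields are deliberately ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-line parser used only to illustrate the quoting rule.
class QuoteParsingSketch {

    static List<String> parseLine(String line, char quote, char delimiter) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == quote) {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == quote) {
                    // Doubled quote inside a quoted field: one literal quote.
                    current.append(quote);
                    i++;
                } else {
                    // Opening or closing quote: a wrapper, not part of the data.
                    inQuotes = !inQuotes;
                }
            } else if (c == delimiter && !inQuotes) {
                // Unquoted delimiter ends the current field.
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // Wrapping quotes are stripped: second field is Bob
        System.out.println(parseLine("1,\"Bob\",London,12", '"', ',').get(1));
        // Escaped quotes survive: second field is "Bob"
        System.out.println(parseLine("1,\"\"\"Bob\"\"\",London,12", '"', ',').get(1));
        // Different quote character: the double quotes are ordinary data, so "Bob"
        System.out.println(parseLine("1,\"Bob\",London,12", '\'', ',').get(1));
    }
}
```

The third call corresponds to supplying a preference with a different quote character: the file stays as it is, and the double quotes simply stop being special.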
