How to avoid triggering an ArrayIndexOutOfBoundsException while parsing empty positions in a line of CSV?

Submitted by 杀马特。学长 韩版系。学妹 on 2020-02-02 14:08:05

Question


String[] values = line.split(",");

Long locId = Long.parseLong(replaceQuotes(values[0]));
String country = replaceQuotes(values[1]);
String region = replaceQuotes(values[2]);
String city = replaceQuotes(values[3]);
String postalCode = replaceQuotes(values[4]);
String latitude = replaceQuotes(values[5]);
String longitude = replaceQuotes(values[6]);
String metroCode = replaceQuotes(values[7]);
String areaCode = replaceQuotes(values[8]);

//...

public String replaceQuotes(String txt){
    txt = txt.replaceAll("\"", "");
    return txt;
}

I'm using the code above to parse a CSV with data in this format:

828,"US","IL","Melrose Park","60160",41.9050,-87.8641,602,708

However, when I encounter a line of data such as the following, I get java.lang.ArrayIndexOutOfBoundsException: 7:

1,"O1","","","",0.0000,0.0000,,

Does this mean that any time I even try to access the value at values[7], an Exception will be thrown?

If so, how do I parse lines that don't contain data in that position of the text line?


Answer 1:


First of all, String.split() is not a great CSV parser: it doesn't know about quotes and will mess up as soon as one of your quoted values contains a comma.
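To make the quoting problem concrete, here is a minimal sketch. The input row is hypothetical (a variant of the question's sample line with a comma added inside the quoted city field), and the class and method names are made up for illustration.

```java
import java.util.Arrays;

class NaiveSplitDemo {
    // Splits on every comma, including commas that sit inside quoted fields.
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        // Hypothetical row: five logical fields, one of them containing a comma.
        String line = "828,\"US\",\"IL\",\"Melrose Park, West\",\"60160\"";
        String[] values = naiveSplit(line);

        System.out.println(values.length);           // prints 6, not 5
        System.out.println(Arrays.toString(values)); // the city is cut in two
    }
}
```

The city ends up spread over values[3] and values[4], so every later index is shifted by one.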

That being said, by default String.split() leaves out empty trailing elements. You can influence that by using the two-argument variant:

String[] values = line.split(",", -1);
  • -1 (or any negative value) means that the array will be as large as necessary and that empty trailing values are kept.
  • A positive value caps the length of the resulting array: the separator is applied at most limit - 1 times, so the last element holds the rest of the line, even if it contains a comma.
  • 0 (the default if you use the one-argument variant) means that the array will be as large as necessary, but empty trailing values will be left out of the array (which is exactly what is happening to you).
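The three cases can be checked against the problematic line from the question. This is only a sketch; the class name and the fieldCount helper are made up for illustration.

```java
class SplitLimitDemo {
    // The line from the question that triggers the exception.
    static final String LINE = "1,\"O1\",\"\",\"\",\"\",0.0000,0.0000,,";

    // Number of elements String.split returns for the given limit.
    static int fieldCount(String line, int limit) {
        return line.split(",", limit).length;
    }

    public static void main(String[] args) {
        System.out.println(fieldCount(LINE, 0));  // prints 7: the two trailing empty values are dropped
        System.out.println(fieldCount(LINE, -1)); // prints 9: values[7] and values[8] exist and are ""
        System.out.println(fieldCount(LINE, 3));  // prints 3: the third element holds the rest of the line
    }
}
```

With a limit of 0 the array has only seven elements, which is why accessing values[7] throws. Note that the quoted empty fields ("") are not empty strings to split(), since they still contain the two quote characters, so only the unquoted trailing fields are dropped.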



Answer 2:


As a general rule you should never, ever hack up your own (faulty) parser if a working one already exists. CSV is not easy to parse correctly, and String.split will not do the job, since CSV allows a , to appear between double quotes without acting as a separator.

Consider using OpenCSV. This will solve both the problem you have now and the problem you will face when a user uses a , as part of the data.



Source: https://stackoverflow.com/questions/6581932/how-to-avoid-triggering-an-arrayindexoutofboundsexception-while-parsing-empty-po
