I have a csv file in the below format.
H,\"TestItems_20100107.csv\",07/01/2010,20:00:00,\"TT1198\",\"MOBb\",\"AMD\",NEW,,
I require the spl
I came across this same problem today and found a simpe solution for csv files: adding an extra field containing just one space at the time the split is executed:
(line + ", ").split(",");
This way no matter how many consecutive empty fields may exist at the end of the csv file, split() will return always n+1 fields
Example session (using bsh)
bsh % line = "H,\"TestItems_20100107.csv\",07/01/2010,20:00:00,\"TT1198\",\"MOBb\",\"AMD\",NEW,,
bsh % System.out.println(line);
H,"TestItems_20100107.csv",07/01/2010,20:00:00,"TT1198","MOBb","AMD",NEW,,
bsh % String[] items = line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
bsh % System.out.println(items.length);
8
bsh % items = (line + ", ").split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
bsh % System.out.println(items.length - 1 );
10
bsh %
There's nothing wrong with the regular expression. The problem is that split discards empty matches at the end:
This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
A workaround is to supply an argument greater than the number of columns you expect in your CSV file:
String[] tokens = line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", 99);