Apache Commons CSV skip lines

耗尽温柔 posted on 2020-06-12 02:38:45

Question


How can I skip lines in the input file with Apache Commons CSV? In my file, the first few lines are useful meta-information such as the date. I can't find any option for this.

private void parse() throws Exception {
    Iterable<CSVRecord> records = CSVFormat.EXCEL
            .withQuote('"').withDelimiter(';').parse(new FileReader("example.csv"));
    for (CSVRecord csvRecord : records) {
        //do something            
    }
}

Answer 1:


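Wrap the FileReader in a BufferedReader and call readLine() on it before starting the for-loop. FileReader itself has no readLine() method.

Your example:

private void parse() throws Exception {
  // Requires java.io.BufferedReader and java.io.FileReader.
  try (BufferedReader reader = new BufferedReader(new FileReader("example.csv"))) {
    reader.readLine(); // Consume the first line so the parser never sees it.

    Iterable<CSVRecord> records = CSVFormat.EXCEL.withQuote('"').withDelimiter(';').parse(reader);
    for (CSVRecord csvRecord : records) {
      // do something
    }
  }
}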
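
The same idea extends to several leading lines. Below is a minimal sketch that uses only java.io, so it does not depend on the parser; the class and method names (LineSkipper, skipLines) are hypothetical and not part of Commons CSV. The returned reader could then be passed to CSVFormat's parse(reader) as in the example above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper for illustration; not part of Commons CSV.
final class LineSkipper {

    // Consumes up to linesToSkip lines and returns a reader positioned after them.
    static BufferedReader skipLines(Reader source, int linesToSkip) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        for (int i = 0; i < linesToSkip; i++) {
            if (reader.readLine() == null) {
                break; // input ended before all requested lines were skipped
            }
        }
        return reader;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample: two meta-information lines followed by CSV content.
        String sample = "exported 2020-06-12\nsource: demo\nid;name\n1;Alice\n";
        BufferedReader reader = skipLines(new StringReader(sample), 2);
        System.out.println(reader.readLine()); // prints id;name
    }
}
```

This only works when the number of leading lines is known in advance and none of them contains a line break inside a quoted field.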



Answer 2:


There is no built-in facility to skip an unknown number of lines.

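If you want to skip only the first line (the header line), you can call withFirstRecordAsHeader() while building the format. Note that withSkipHeaderRecord() on its own only takes effect when a header has also been configured with withHeader(...).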

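A more general solution would be to call next() on the iterator. This skips parsed records rather than raw lines, so the skipped lines must still be readable as CSV: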

Iterable<CSVRecord> parser = CSVFormat.DEFAULT.parse(new FileReader("example.csv"));
Iterator<CSVRecord> iterator = parser.iterator();

// amountToSkip is the number of leading records to discard (supplied by the caller).
for (int i = 0; i < amountToSkip; i++) {
    if (iterator.hasNext()) {
        iterator.next();
    }
}

while (iterator.hasNext()) {
    CSVRecord record = iterator.next();
    System.out.println(record);
}
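
When the number of leading lines is not known but the first real line can be recognized, one option is to consume lines before the reader is handed to the parser. The following is a minimal sketch using only java.io; PreambleSkipper, skipUntil and MAX_LINE_LENGTH are hypothetical names, and it assumes that no line is longer than MAX_LINE_LENGTH characters, since mark() and reset() are used to put the matching line back.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.function.Predicate;

// Hypothetical helper for illustration; not part of Commons CSV.
final class PreambleSkipper {

    // Assumption: no line in the preamble or header exceeds this many characters.
    private static final int MAX_LINE_LENGTH = 8192;

    // Discards lines until one matches; leaves the reader positioned at the
    // start of the matching line. Returns false if no line matched.
    static boolean skipUntil(BufferedReader reader, Predicate<String> isFirstWantedLine)
            throws IOException {
        while (true) {
            reader.mark(MAX_LINE_LENGTH);   // remember where this line starts
            String line = reader.readLine();
            if (line == null) {
                return false;               // reached end of input without a match
            }
            if (isFirstWantedLine.test(line)) {
                reader.reset();             // rewind so the caller sees this line
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample: an unknown number of meta lines, then the CSV header.
        String sample = "exported 2020-06-12\nsource: demo\nid;name\n1;Alice\n";
        BufferedReader reader = new BufferedReader(new StringReader(sample));
        boolean found = skipUntil(reader, line -> line.startsWith("id;"));
        System.out.println(found + " " + reader.readLine()); // prints true id;name
    }
}
```

After skipUntil returns true, the same reader could be passed to the parser, which then starts at the recognized line.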



Answer 3:


The iterator returned by CSVParser.iterator() should most definitely not throw an exception from hasNext(), because that makes it nearly impossible to recover from an error condition such as a malformed line.

But where there is a will there is a way, and I present a Terrible Idea that sorta works™:

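    public void runOnFile(Path file) {
        try {
            // fixHeaders is a helper that is not shown in this answer.
            BufferedReader in = fixHeaders(file);
            // Parse once only to obtain the header names from the first record.
            CSVParser parsed = CSVFormat.DEFAULT.withFirstRecordAsHeader().parse(in);
            Map<String, Integer> headerMap = parsed.getHeaderMap();
            String[] headers = headerMap.keySet().toArray(new String[0]);

            // Caveat: the parser buffers input from `in`, so lines it has already
            // buffered may not be returned by in.readLine() here.
            String line;
            while ((line = in.readLine()) != null) {
                try {
                    // Parse each line on its own so one bad line cannot abort the rest.
                    CSVRecord record = CSVFormat.DEFAULT.withHeader(headers)
                            .parse(new StringReader(line)).getRecords().get(0);
                    // do something with your record
                } catch (Exception e) {
                    System.out.println("ignoring line:" + line);
                }
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }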


Source: https://stackoverflow.com/questions/33972243/apache-commons-csv-skip-lines
