How can I process a large file via CSVParser?

眉间皱痕 submitted on 2019-11-30 18:39:26

No matter what you do, all of the data in the file has to reach your local machine, because your system must parse each row to determine whether it is valid. Whether the parser reads the remote file row by row or you copy the whole file over first, every byte still crosses to the local side. You have to get the data local first, then trim the excess.

Calling csvFileParser.getRecords() is already a lost battle: the documentation states that this method loads every row of the file into memory. To parse the records while conserving memory, iterate over the parser instead; the documentation implies that the following code holds only one record in memory at a time:

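import java.io.File;
import java.nio.charset.StandardCharsets;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;
// UTF-8 is an assumption; pass the charset your file actually uses.
try (CSVParser csvFileParser = CSVParser.parse(new File("filePath"), StandardCharsets.UTF_8, csvFileFormat)) {
    for (CSVRecord csvRecord : csvFileParser) {
        // qualify the csvRecord; output qualified row to new file and flush as needed.
    }
}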

Since you explained that "filePath" is not local, the above solution can fail partway through if the connection drops. To eliminate connectivity issues, I recommend that you copy the entire remote file to local storage, confirm the copy is accurate by comparing checksums, parse the local copy to create your target file, and then delete the local copy once you are done.
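The copy-and-verify step could be sketched roughly as follows. This is only an illustration under some assumptions: the remote file is reachable as a java.nio.file.Path (for example on a mounted network share), SHA-256 is an acceptable checksum, and the class and method names (VerifiedCopy, sha256, copyAndVerify) are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Commons CSV.
final class VerifiedCopy {

    // Streams the file through SHA-256 so the whole file is never held in memory.
    static String sha256(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Copies remote to local, then compares checksums; removes the local copy on mismatch.
    static Path copyAndVerify(Path remote, Path local) throws IOException {
        Files.copy(remote, local, StandardCopyOption.REPLACE_EXISTING);
        if (!sha256(remote).equals(sha256(local))) {
            Files.deleteIfExists(local);
            throw new IOException("Checksum mismatch after copying " + remote);
        }
        return local;
    }
}
```

You would call VerifiedCopy.copyAndVerify(remotePath, localPath), parse localPath with the CSVParser, and finish with Files.deleteIfExists(localPath). Note that hashing the remote file reads it over the network a second time; if the remote side already publishes a checksum, comparing against that value avoids the extra transfer.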

This is a late response, but you CAN use a BufferedReader with the CSVParser:

import java.io.*;
import org.apache.commons.csv.*;
try (BufferedReader reader = new BufferedReader(new FileReader(fileName), 1048576 * 10)) { // 10 MiB buffer
    Iterable<CSVRecord> records = CSVFormat.RFC4180.parse(reader);
    for (CSVRecord line : records) {
        // Process each line here
    }
} catch (IOException e) {
    // handle exceptions from your BufferedReader here
}