Number of lines in a file in Java

抹茶落季 2020-11-22 05:31

I work with huge data files, and sometimes I only need to know the number of lines in them. Usually I open them and read them line by line until I reach the end of the file.

19 answers
  •  挽巷 (original poster)
    2020-11-22 05:46

    I know this is an old question, but the accepted solution didn't quite match what I needed it to do. So, I refined it to accept various line terminators (rather than just line feed) and to use a specified character encoding (rather than ISO-8859-n). All in one method (refactor as appropriate):

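    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.io.Reader;
    import java.nio.charset.Charset;

    public static long getLinesCount(String fileName, String encodingName) throws IOException {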
        long linesCount = 0;
        File file = new File(fileName);
        FileInputStream fileIn = new FileInputStream(file);
        try {
            Charset encoding = Charset.forName(encodingName);
            Reader fileReader = new InputStreamReader(fileIn, encoding);
            int bufferSize = 4096;
            Reader reader = new BufferedReader(fileReader, bufferSize);
            char[] buffer = new char[bufferSize];
            int prevChar = -1;
            int readCount = reader.read(buffer);
            while (readCount != -1) {
                for (int i = 0; i < readCount; i++) {
                    int nextChar = buffer[i];
                    switch (nextChar) {
                        case '\r': {
                            // The current line is terminated by a carriage return or by a carriage return immediately followed by a line feed.
                            linesCount++;
                            break;
                        }
                        case '\n': {
                            if (prevChar == '\r') {
                                // The current line is terminated by a carriage return immediately followed by a line feed.
                                // The line has already been counted.
                            } else {
                                // The current line is terminated by a line feed.
                                linesCount++;
                            }
                            break;
                        }
                    }
                    prevChar = nextChar;
                }
                readCount = reader.read(buffer);
            }
            if (prevChar != -1) {
                switch (prevChar) {
                    case '\r':
                    case '\n': {
                        // The last line is terminated by a line terminator.
                        // The last line has already been counted.
                        break;
                    }
                    default: {
                        // The last line is terminated by end-of-file.
                        linesCount++;
                    }
                }
            }
        } finally {
            fileIn.close();
        }
        return linesCount;
    }
    

    This solution is comparable in speed to the accepted solution, about 4% slower in my tests (though timing tests in Java are notoriously unreliable).
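
    To make the terminator rules above easier to check, here is a minimal sketch of the same counting logic applied to an arbitrary Reader, so it can be exercised on in-memory strings instead of files. The class name LineCountSketch and both countLines overloads are hypothetical names introduced for this illustration; they are not part of the method above or of any library.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper class, for illustration only.
final class LineCountSketch {

    /** Counts lines from a Reader, treating "\r", "\n", and "\r\n" each as one terminator. */
    static long countLines(Reader reader) throws IOException {
        long lines = 0;
        char[] buffer = new char[4096];
        int prevChar = -1; // -1 means nothing has been read yet
        int readCount;
        while ((readCount = reader.read(buffer)) != -1) {
            for (int i = 0; i < readCount; i++) {
                char c = buffer[i];
                // A "\n" directly after "\r" belongs to the terminator that was already counted.
                if (c == '\r' || (c == '\n' && prevChar != '\r')) {
                    lines++;
                }
                prevChar = c;
            }
        }
        // Non-empty input that does not end in a terminator has one final unterminated line.
        if (prevChar != -1 && prevChar != '\r' && prevChar != '\n') {
            lines++;
        }
        return lines;
    }

    /** Convenience overload for in-memory text (an assumption made here to simplify testing). */
    static long countLines(String text) {
        try (Reader reader = new StringReader(text)) {
            return countLines(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countLines("a\nb\r\nc\rd")); // prints 4: mixed terminators, unterminated last line
        System.out.println(countLines("a\n"));          // prints 1: the trailing terminator adds no extra line
        System.out.println(countLines(""));             // prints 0: empty input
    }
}
```

    Because prevChar is kept across buffer refills, a "\r\n" pair that straddles two reads is still counted once.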
