Number of lines in a file in Java

抹茶落季 2020-11-22 05:31

I work with huge data files, and sometimes I only need to know the number of lines in them. Usually I open them and read them line by line until I reach the end of the file.
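
The line-by-line approach described above can be sketched roughly as follows. This is only an illustration, not the asker's actual code: the class name `ReadLineCounter` and the method names `countLines` and `countLinesInString` are made up for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class ReadLineCounter {

    // Counts lines by calling readLine() until it returns null (end of input).
    // An unterminated final line is counted; an empty input yields 0.
    static long countLines(Reader source) throws IOException {
        try (BufferedReader reader = new BufferedReader(source)) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }

    // Hypothetical convenience wrapper for in-memory text, handy for quick checks.
    static long countLinesInString(String text) {
        try {
            return countLines(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a real file, something like `ReadLineCounter.countLines(Files.newBufferedReader(Paths.get("data.txt")))` should work, where `data.txt` is a placeholder path. Every line is materialized as a String and then discarded, which is the overhead the answers below try to avoid.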

19 answers
  •  孤城傲影
    2020-11-22 06:10

    I concluded that wc -l's method of counting newlines is fine, but it returns non-intuitive results on files whose last line doesn't end with a newline.

    @er.vikas's solution, which is based on LineNumberReader and adds one to the line count, returns non-intuitive results on files whose last line does end with a newline.

    I therefore wrote an algorithm that behaves as follows:

    @Test
    public void empty() throws IOException {
        assertEquals(0, count(""));
    }
    
    @Test
    public void singleNewline() throws IOException {
        assertEquals(1, count("\n"));
    }
    
    @Test
    public void dataWithoutNewline() throws IOException {
        assertEquals(1, count("one"));
    }
    
    @Test
    public void oneCompleteLine() throws IOException {
        assertEquals(1, count("one\n"));
    }
    
    @Test
    public void twoCompleteLines() throws IOException {
        assertEquals(2, count("one\ntwo\n"));
    }
    
    @Test
    public void twoLinesWithoutNewlineAtEnd() throws IOException {
        assertEquals(2, count("one\ntwo"));
    }
    
    @Test
    public void aFewLines() throws IOException {
        assertEquals(5, count("one\ntwo\nthree\nfour\nfive\n"));
    }
    

    And it looks like this:

    static long countLines(InputStream is) throws IOException {
        try (LineNumberReader lnr = new LineNumberReader(new InputStreamReader(is))) {
            char[] buf = new char[8192];
            int n, previousN = -1;
            int ln = 0;
            // read() returns at least one char, or -1 at end of stream; no need to buffer more
            while ((n = lnr.read(buf)) != -1) {
                previousN = n;
                // Sample after each successful read so ln counts only the terminators seen so far;
                // since JDK 16 the reader may also count an unterminated final line at end of stream
                ln = lnr.getLineNumber();
            }
            if (previousN == -1) {
                // No data read at all, i.e. the file was empty
                return 0;
            }
            char lastChar = buf[previousN - 1];
            if (lastChar == '\n' || lastChar == '\r') {
                // Last line is terminated: the terminator count is the line count
                return ln;
            }
            // Last line has no terminator: count it as well
            return ln + 1;
        }
    }
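
    The tests above call a count(String) helper that the answer never shows. A self-contained sketch of how such a helper could be wired up is given here. It is an assumption, not the author's code: the class name IntuitiveLineCount is invented, UTF-8 is assumed as the encoding, and the counting is done by hand instead of with LineNumberReader, so the result does not depend on how a particular JDK treats end of stream.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class IntuitiveLineCount {

    // Counts "\n", "\r" and "\r\n" as one terminator each, and counts a
    // trailing line that has no terminator as a line of its own.
    static long countLines(InputStream is) throws IOException {
        try (Reader reader = new InputStreamReader(is, StandardCharsets.UTF_8)) {
            char[] buf = new char[8192];
            long terminators = 0;
            boolean lastWasCr = false;    // previous char was '\r'
            boolean pendingLine = false;  // data seen since the last terminator
            int n;
            while ((n = reader.read(buf)) != -1) {
                for (int i = 0; i < n; i++) {
                    char c = buf[i];
                    if (c == '\r') {
                        terminators++;
                        pendingLine = false;
                    } else if (c == '\n') {
                        if (!lastWasCr) {
                            terminators++;  // the '\n' of "\r\n" was already counted
                        }
                        pendingLine = false;
                    } else {
                        pendingLine = true;
                    }
                    lastWasCr = (c == '\r');
                }
            }
            return pendingLine ? terminators + 1 : terminators;
        }
    }

    // Hypothetical stand-in for the count(String) helper used by the tests.
    static long count(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try {
            return countLines(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

    With this helper, the seven cases listed in the tests should hold, for example count("one\ntwo") == 2 and count("one\ntwo\n") == 2.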
    

    If you want intuitive results, you can use this. If you just want wc -l compatibility, simply use @er.vikas's solution, but don't add one to the result, and retry the skip until it returns 0:

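    try (LineNumberReader lnr = new LineNumberReader(new FileReader(new File("File1")))) {
        // skip() may skip fewer characters than requested, so repeat until it returns 0
        while (lnr.skip(Long.MAX_VALUE) > 0) { }
        // Caveat: since JDK 16 the reader may also count an unterminated final line
        // at end of stream, which wc -l does not
        return lnr.getLineNumber();
    }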
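
    If strict wc -l behaviour is the goal, another option is to count '\n' bytes directly. The sketch below is an alternative technique, not @er.vikas's solution: the class name NewlineCounter and its method names are invented for illustration, and it assumes an ASCII-compatible encoding such as UTF-8, where the byte 0x0A only ever means a line feed. Unlike LineNumberReader, it does not treat a lone '\r' as a line terminator.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class NewlineCounter {

    // Returns the number of '\n' bytes in the stream. A final line without
    // a trailing newline is therefore not counted.
    static long countNewlines(InputStream in) throws IOException {
        byte[] buf = new byte[8192];
        long count = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                if (buf[i] == '\n') {
                    count++;
                }
            }
        }
        return count;
    }

    // Hypothetical wrapper for in-memory text, used for quick checks.
    static long countNewlinesInString(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        try {
            return countNewlines(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

    For a file, the stream could be opened with Files.newInputStream(Paths.get("File1")) inside a try-with-resources block, since countNewlines does not close the stream itself.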
    
