Does the Scanner class load the entire file into memory at once?

六眼飞鱼酱① submitted on 2019-11-28 02:03:08
Edwin Dalorzo

If you read the source code you can answer the question yourself.

The implementation of the Scanner constructor in question is:

public Scanner(File source) throws FileNotFoundException {
        this((ReadableByteChannel)(new FileInputStream(source).getChannel()));
}

Later, this is wrapped into a Reader:

private static Readable makeReadable(ReadableByteChannel source, CharsetDecoder dec) {
    return Channels.newReader(source, dec, -1);
}

And it is read using a buffer of this size:

private static final int BUFFER_SIZE = 1024; // change to 1024;

As you can see in the final constructor in the construction chain:

private Scanner(Readable source, Pattern pattern) {
        assert source != null : "source should not be null";
        assert pattern != null : "pattern should not be null";
        this.source = source;
        delimPattern = pattern;
        buf = CharBuffer.allocate(BUFFER_SIZE);
        buf.limit(0);
        matcher = delimPattern.matcher(buf);
        matcher.useTransparentBounds(true);
        matcher.useAnchoringBounds(false);
        useLocale(Locale.getDefault(Locale.Category.FORMAT));
    }

So it appears that Scanner does not read the entire file at once.

From reading the code, it appears to load up to 1024 characters at a time by default (the size of its internal CharBuffer). The buffer can grow for long lines of text, up to the size of the longest line you have.
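One way to check this without reading the JDK source is to count how many bytes Scanner pulls from its source before returning the first line. The following is a minimal sketch, not a benchmark: CountingInputStream, sampleData and bytesConsumedAfterFirstLine are hypothetical helper names introduced here, and the input is generated in memory rather than read from a real file.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class ScannerReadDemo {

    // Hypothetical helper (not part of the JDK): counts bytes pulled from the wrapped stream.
    static class CountingInputStream extends FilterInputStream {
        long count = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b >= 0) count++;
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) count += n;
            return n;
        }
    }

    // Builds in-memory input consisting of the given number of short lines.
    static byte[] sampleData(int lines) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            sb.append("Ash\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Reads one line with Scanner and reports how many bytes were consumed from the source.
    static long bytesConsumedAfterFirstLine(byte[] data) {
        CountingInputStream counting = new CountingInputStream(new ByteArrayInputStream(data));
        try (Scanner scanner = new Scanner(counting, "UTF-8")) {
            scanner.nextLine();
            return counting.count;
        }
    }

    public static void main(String[] args) {
        byte[] data = sampleData(250_000); // about 1 MB of input
        long consumed = bytesConsumedAfterFirstLine(data);
        System.out.println("total bytes: " + data.length
                + ", consumed after one line: " + consumed);
    }
}
```

The exact number of bytes consumed depends on the decoder's internal buffering, so it may be larger than the Scanner buffer itself, but it should stay far below the total input size.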

You're better off going with something like BufferedReader with a FileReader for large files. A basic example can be found here.

In ACM contests, fast input reading is very important. In Java, we found that something like the following is much faster:

    import java.io.*;
    import java.nio.charset.StandardCharsets;
    import java.util.*;

    public class TreeCount {
        public static void main(String[] args) throws IOException {
            Map<String, Integer> map = new HashMap<>();
            int trees = 0; // total number of lines read
            // try-with-resources closes the reader and the underlying stream
            try (BufferedReader in = new BufferedReader(new InputStreamReader(
                    new FileInputStream("input.txt"), StandardCharsets.UTF_8))) {
                for (String s; (s = in.readLine()) != null; trees++) {
                    map.merge(s, 1, Integer::sum); // count occurrences of each name
                }
            }
        }
    }

The file contains, in that case, tree names...

Red Alder
Ash
Aspen
Basswood
Ash
Beech
Yellow Birch
Ash
Cherry
Cottonwood

You can use StringTokenizer to extract any part of a line that you want.
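As a minimal sketch of that idea, the snippet below splits one line into whitespace-separated tokens with StringTokenizer. The class name TokenizeLineDemo and the helper tokens are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizeLineDemo {

    // Splits a line on the default delimiters (whitespace) and collects the tokens.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("Yellow Birch")); // prints [Yellow, Birch]
    }
}
```

A second constructor argument can supply other delimiter characters, for example new StringTokenizer(line, ",") for comma-separated input.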

We ran into errors when using Scanner on large files: it read only 100 lines from a file with 10000 lines!

As the API documentation says:

A scanner can read text from any object which implements the Readable interface. If an invocation of the underlying readable's Readable.read(java.nio.CharBuffer) method throws an IOException then the scanner assumes that the end of the input has been reached. The most recent IOException thrown by the underlying readable can be retrieved via the ioException() method.
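The behavior described in the API documentation can be reproduced with a small sketch. FlakyReadable is a hypothetical source invented for this example: it delivers two lines and then throws an IOException on the next read. The Scanner simply stops as if the input had ended, and the failure is only visible through ioException().

```java
import java.io.IOException;
import java.nio.CharBuffer;
import java.util.Scanner;

class ScannerIoExceptionDemo {

    // Hypothetical source: delivers two lines, then fails on the next read.
    static class FlakyReadable implements Readable {
        private boolean delivered = false;

        @Override
        public int read(CharBuffer cb) throws IOException {
            if (!delivered) {
                delivered = true;
                String chunk = "Ash\nBeech\n";
                cb.put(chunk);
                return chunk.length();
            }
            throw new IOException("simulated read failure");
        }
    }

    // Counts the lines the scanner manages to return before it reports end of input.
    static int countLines(Scanner scanner) {
        int lines = 0;
        while (scanner.hasNextLine()) {
            scanner.nextLine();
            lines++;
        }
        return lines;
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner(new FlakyReadable());
        int lines = countLines(scanner);
        // No exception reached this code; the scanner stored it instead.
        IOException failure = scanner.ioException();
        System.out.println("lines read: " + lines);
        System.out.println("swallowed exception: " + failure);
    }
}
```

If a Scanner stops earlier than expected on a large file, checking ioException() after the loop is a cheap way to tell a real end of input from a hidden read error.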

Good luck!
