Read large amount of data from file in Java

一生所求 2020-12-03 01:36

I've got a text file that contains 1 000 002 numbers in the following format:

123 456
1 2 3 4 5 6 .... 999999 100000

Now I need

7 answers
  •  再見小時候
    2020-12-03 02:15

    Using a StreamTokenizer on top of a BufferedReader already gives quite good performance. You shouldn't need to write your own readInt() function.

    Here is the code I used to do some local performance testing:

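    import java.io.BufferedInputStream;
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.StreamTokenizer;
    import java.util.Scanner;

    /**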
     * Created by zhenhua.xu on 11/27/16.
     */
    public class MyReader {
    
    private static final String FILE_NAME = "./1m_numbers.txt";
    private static final int n = 1000000;
    
    public static void main(String[] args) {
        try {
            readByScanner();
            readByStreamTokenizer();
            readByStreamTokenizerOnBufferedReader();
            readByBufferedInputStream();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
    
    public static void readByScanner() throws Exception {
        long startTime = System.currentTimeMillis();
    
        Scanner stdin = new Scanner(new File(FILE_NAME));
        int array[] = new int[n];
        for (int i = 0; i < n; i++) {
            array[i] = stdin.nextInt();
        }
    
        long endTime = System.currentTimeMillis();
        System.out.println(String.format("Total time by Scanner: %d ms", endTime - startTime));
    }
    
    public static void readByStreamTokenizer() throws Exception {
        long startTime = System.currentTimeMillis();
    
        StreamTokenizer st = new StreamTokenizer(new FileReader(FILE_NAME));
        int array[] = new int[n];
    
        for (int i = 0; st.nextToken() != StreamTokenizer.TT_EOF; i++) {
            array[i] = (int) st.nval;
        }
    
        long endTime = System.currentTimeMillis();
        System.out.println(String.format("Total time by StreamTokenizer: %d ms", endTime - startTime));
    }
    
    public static void readByStreamTokenizerOnBufferedReader() throws Exception {
        long startTime = System.currentTimeMillis();
    
        StreamTokenizer st = new StreamTokenizer(new BufferedReader(new FileReader(FILE_NAME)));
        int array[] = new int[n];
    
        for (int i = 0; st.nextToken() != StreamTokenizer.TT_EOF; i++) {
            array[i] = (int) st.nval;
        }
    
        long endTime = System.currentTimeMillis();
        System.out.println(String.format("Total time by StreamTokenizer with BufferedReader: %d ms", endTime - startTime));
    }
    
    public static void readByBufferedInputStream() throws Exception {
        long startTime = System.currentTimeMillis();
    
        BufferedInputStream bis = new BufferedInputStream(new FileInputStream(FILE_NAME));
        int array[] = new int[n];
        for (int i = 0; i < n; i++) {
            array[i] = readInt(bis);
        }
    
        long endTime = System.currentTimeMillis();
        System.out.println(String.format("Total time with BufferedInputStream: %d ms", endTime - startTime));
    }
    
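    // Reads the next non-negative decimal integer, skipping any non-digit
    // bytes before it. Returns 0 if the stream ends before a digit is seen.
    private static int readInt(InputStream in) throws IOException {
        int ret = 0;
        boolean dig = false; // true once at least one digit has been consumed
    
        for (int c = 0; (c = in.read()) != -1; ) {
            if (c >= '0' && c <= '9') {
                dig = true;
                ret = ret * 10 + (c - '0');
            } else if (dig) {
                break; // the first non-digit after the number ends it
            }
        }
    
        return ret;
    }
    }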

    Results I got:

    • Total time by Scanner: 789 ms
    • Total time by StreamTokenizer: 226 ms
    • Total time by StreamTokenizer with BufferedReader: 80 ms
    • Total time with BufferedInputStream: 95 ms
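
    As a minimal sketch of the fastest variant above, the StreamTokenizer-on-BufferedReader loop can be wrapped in a small helper that closes the reader and never writes past the end of the array. The class name TokenizerSketch, the method readInts, and the maxCount parameter are hypothetical names chosen for illustration; they are not part of the benchmark code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class TokenizerSketch {

    // Hypothetical helper: reads at most maxCount whitespace-separated
    // numbers from the source and returns them as an int array.
    public static int[] readInts(Reader source, int maxCount) {
        int[] values = new int[maxCount];
        int count = 0;
        // try-with-resources closes the reader even if parsing fails
        try (BufferedReader reader = new BufferedReader(source)) {
            StreamTokenizer st = new StreamTokenizer(reader);
            while (count < maxCount && st.nextToken() != StreamTokenizer.TT_EOF) {
                // Skip anything that is not a number (words, stray symbols)
                if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    // nval is a double; the cast is exact for int-sized values
                    values[count++] = (int) st.nval;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Trim the array to the number of values actually read
        return Arrays.copyOf(values, count);
    }

    public static void main(String[] args) {
        // Small in-memory sample shaped like the input in the question:
        // two numbers on the first line, then the rest on the second.
        int[] values = readInts(new StringReader("123 456\n1 2 3 4 5 6"), 16);
        System.out.println(values.length + " numbers, first = " + values[0]);
    }
}
```

    To read from a file instead, pass a FileReader for the path in place of the StringReader; the tokenizing loop stays the same.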
