Fastest way to read huge number of int from binary file

广开言路 2020-12-21 01:09

I use Java 1.5 on an embedded Linux device and want to read a binary file containing 2 MB of int values. (They are currently 4-byte big-endian, but I am free to choose the format.)

Using D

3 answers
  • 2020-12-21 01:14

    You can use IntBuffer from the java.nio package: http://docs.oracle.com/javase/6/docs/api/java/nio/IntBuffer.html

    int[] intArray = new int[ 500000 ];
    
    IntBuffer intBuffer = IntBuffer.wrap( intArray );
    
    ...
    

    Fill in the buffer, by making calls to inChannel.read(intBuffer).

    Once the buffer is full, your intArray will contain 500000 integers.

    EDIT

    Channels only support ByteBuffer, so read the bytes into a ByteBuffer and then view it as an IntBuffer:

    // assume I already know that there are 500 000 ints to read:
    int numInts = 500000;
    // the array that will receive the result
    int[] result = new int[numInts];
    
    // 4 bytes per int, direct buffer
    ByteBuffer buf = ByteBuffer.allocateDirect( numInts * 4 );
    
    // BIG_ENDIAN byte order
    buf.order( ByteOrder.BIG_ENDIAN );
    
    // Fill in the buffer
    while ( buf.hasRemaining( ) )
    {
       // Per EJP's suggestion check EOF condition
       if( inChannel.read( buf ) == -1 )
       {
           // Hit EOF
           throw new EOFException( );
       }
    }
    
    buf.flip( );
    
    // Create IntBuffer view
    IntBuffer intBuffer = buf.asIntBuffer( );
    
    // result will now contain all ints read from file
    intBuffer.get( result );
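
    Since the question targets Java 1.5, the snippet above can be wrapped in a method that closes the stream in a finally block. This is only a minimal sketch: the class name IntFileReader and the method readInts are hypothetical, and it assumes the caller already knows how many ints to read.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;

// Hypothetical helper; the class and method names are not from the original answer.
class IntFileReader {

    // Reads exactly numInts big-endian 4-byte ints from the start of the file.
    static int[] readInts(File file, int numInts) throws IOException {
        FileInputStream stream = new FileInputStream(file);
        try {
            FileChannel inChannel = stream.getChannel();

            // 4 bytes per int, direct buffer, big-endian as in the question
            ByteBuffer buf = ByteBuffer.allocateDirect(numInts * 4);
            buf.order(ByteOrder.BIG_ENDIAN);

            // read() may deliver fewer bytes than requested, so loop until the buffer is full
            while (buf.hasRemaining()) {
                if (inChannel.read(buf) == -1) {
                    throw new EOFException("file holds fewer than " + numInts + " ints");
                }
            }
            buf.flip();

            // Bulk-copy the decoded ints into a plain array
            int[] result = new int[numInts];
            buf.asIntBuffer().get(result);
            return result;
        } finally {
            // Closing the stream also closes its channel
            stream.close();
        }
    }
}
```

    A call such as IntFileReader.readInts(new File(path), 500000) would then return the array, where path stands for whatever file name you use.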
    
  • 2020-12-21 01:20

    I don't know if this will be any faster than what Alexander provided, but you could try mapping the file.

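        // try-with-resources requires Java 7+; on Java 1.5, close the stream in a finally block instead
        try (FileInputStream stream = new FileInputStream(filename)) {
            FileChannel inChannel = stream.getChannel();
    
            // Map the whole file into memory, read-only
            ByteBuffer buffer = inChannel.map(FileChannel.MapMode.READ_ONLY, 0, inChannel.size());
            // Assumes the file holds at least 500000 ints; otherwise get() throws BufferUnderflowException
            int[] result = new int[500000];
    
            buffer.order( ByteOrder.BIG_ENDIAN );
            IntBuffer intBuffer = buffer.asIntBuffer( );
            intBuffer.get(result);
        }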
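
    If the number of ints is not known in advance, the array can be sized from the mapped buffer itself. The following is a rough sketch rather than a tuned implementation: MappedIntReader and readAllInts are hypothetical names, the file is assumed to be smaller than 2 GB (the limit of a single mapping), and the stream is closed in a finally block so the code also compiles on Java 1.5.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteOrder;
import java.nio.IntBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Hypothetical helper; the class and method names are illustrative only.
class MappedIntReader {

    // Maps the whole file and returns its contents as big-endian ints.
    static int[] readAllInts(File file) throws IOException {
        FileInputStream stream = new FileInputStream(file);
        try {
            FileChannel inChannel = stream.getChannel();

            // Assumption: the file is smaller than 2 GB, so one mapping covers it
            MappedByteBuffer buffer =
                inChannel.map(FileChannel.MapMode.READ_ONLY, 0, inChannel.size());
            buffer.order(ByteOrder.BIG_ENDIAN);

            IntBuffer intBuffer = buffer.asIntBuffer();

            // remaining() is the file size divided by 4, rounded down,
            // so any trailing partial int is ignored
            int[] result = new int[intBuffer.remaining()];
            intBuffer.get(result);
            return result;
        } finally {
            stream.close();
        }
    }
}
```

    Whether mapping beats a plain channel read for a 2 MB file on an embedded device is something only a measurement on that device can tell.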
    
  • 2020-12-21 01:31

    I ran a fairly careful experiment comparing DataInputStream with ObjectInputStream (serialize/deserialize), both reading from a ByteArrayInputStream to avoid I/O effects. For a million ints, readObject took about 20 ms and a readInt loop took about 116 ms. The serialization overhead on a million-int array was 27 bytes. This was on a 2013-ish MacBook Pro.

    Having said that, object serialization is sort of evil, and you have to have written the data out with a Java program.
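
    For reference, the two approaches compared above look roughly like this. It is an illustrative sketch, not the benchmark code itself: IntArrayCodec and its method names are made up, and everything goes through in-memory byte arrays, as in the experiment.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical class; names are illustrative only.
class IntArrayCodec {

    // DataOutputStream: 4 big-endian bytes per int, no header
    static byte[] writeWithDataStream(int[] values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (int i = 0; i < values.length; i++) {
            out.writeInt(values[i]);
        }
        out.flush();
        return bytes.toByteArray();
    }

    // One readInt() call per value
    static int[] readWithDataStream(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int[] result = new int[data.length / 4];
        for (int i = 0; i < result.length; i++) {
            result[i] = in.readInt();
        }
        return result;
    }

    // ObjectOutputStream: the whole array as one serialized object
    static byte[] writeWithObjectStream(int[] values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bytes);
        out.writeObject(values);
        out.close();
        return bytes.toByteArray();
    }

    // A single readObject() call returns the whole array
    static int[] readWithObjectStream(byte[] data)
            throws IOException, ClassNotFoundException {
        ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data));
        return (int[]) in.readObject();
    }
}
```

    The readObject path only works on data written by writeObject, which is the limitation mentioned above.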
