Is there any way to create a primitive array without initialization?

前端 未结 4 1981
轮回少年
轮回少年 2020-12-10 13:07

As we know Java always initialises arrays upon creation. I.e. new int[1000000] always returns an array with all elements = 0. I understand that it\'s a must for

相关标签:
4条回答
  • 2020-12-10 13:45

    I can imagine that O(n) cost of new int[n] could be a burden in some data structures or algorithms.

    A way to have amortized O(1) cost of memory allocation in Java for a primitive array of size n is to do recycling of allocated arrays with an object pool or some other strategy. Recycled array can be considered "uninitialized" for the next allocation.

    0 讨论(0)
  • 2020-12-10 13:53

    I'm going to move this to an answer because it probably should be.

    An "Array" in java is not what you think it is. It's not just a pointer to a chunk of contiguous memory on the stack or heap.

    An Array in Java is an Object just like everything else (except primitives) and is on the heap. When you call new int[100000] you're creating a new object just like every other object, and it gets initialized, etc.

    The JLS provides all the specific info about this:

    http://docs.oracle.com/javase/specs/jls/se5.0/html/arrays.html

    So, no. You can't avoid "initializing" an array. That's just not how Java works. There's simply no such thing as uninitialized heap memory; many people call that a "feature" as it prevents you from accessing uninitialized memory.

    0 讨论(0)
  • 2020-12-10 13:57

    I've done some investigation. There is no legal way to create uninitialized array in Java. Even JNI NewXxxArray creates initialized arrays. So it is impossible to know exactly the cost of array zeroing. Nevertheless I've done some measurements:

    1) 1000 byte arrays creation with different array size

            long t0 = System.currentTimeMillis();
            for(int i = 0; i < 1000; i++) {
    //          byte[] a1 = new byte[1];
                byte[] a1 = new byte[1000000];
            }
            System.out.println(System.currentTimeMillis() - t0);
    

    on my PC it gives < 1ms for byte[1] and ~500 ms for byte[1000000]. Sounds impressive to me.

    2) We don't have a fast (native) method in JDK for filling arrays, Arrays.fill is too slow, so let's see at least how much 1000 copying of 1,000,000 size array takes with native System.arraycopy

        byte[] a1 = new byte[1000000];
        byte[] a2 = new byte[1000000];
        for(int i = 0; i < 1000; i++) {
            System.arraycopy(a1, 0, a2, 0, 1000000);
        }
    

    It is 700 ms.

    It gives me reasons to believe that a) creating long arrays is expensive b) it seems to be expensive because of useless initialization.

    3) Let's take sun.misc.Unsafe http://www.javasourcecode.org/html/open-source/jdk/jdk-6u23/sun/misc/Unsafe.html. It is protected from external usage but not too much

        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe)f.get(null);
    

    Here is the cost of memory allocation test

        for(int i = 0; i < 1000; i++) {
            long m = u.allocateMemory(1000000);
        }
    

    It takes < 1 ms, if you remember, for new byte[1000000] it took 500ms.

    4) Unsafe has no direct methods to work with arrays. It needs to know class fields, but reflection shows no fields in an array. There is not much info about arrays internals, I guess it is JVM / platform specific. Nevertheless, it is, like any other Java Object, header + fields. On my PC/JVM it looks like

    header - 8 bytes
    int length - 4 bytes
    long bufferAddress - 8 bytes
    

    Now, using Unsafe, I will create byte[10], allocate a 10 byte memory buffer and use it as my array's elements:

        byte[] a = new byte[10];
        System.out.println(Arrays.toString(a));
        long mem = unsafe.allocateMemory(10);
        unsafe.putLong(a, 12, mem);
        System.out.println(Arrays.toString(a));
    

    it prints

    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    [8, 15, -114, 24, 0, 0, 0, 0, 0, 0]
    

    You can see thay array's data are not initialized.

    Now I'll change our array length (though it still points to 10 bytes memory)

        unsafe.putInt(a, 8, 1000000);
        System.out.println(a.length);
    

    it shows 1000000. It was just to prove that the idea works.

    Now performance test. I will create an empty byte array a1, allocate a buffer of 1000000 bytes, assign this buffer to a1 an set a1.length = 10000000

        long t0 = System.currentTimeMillis();
        for(int i = 0; i < 1000; i++) {
            byte[] a1 = new byte[0];
            long mem1 = unsafe.allocateMemory(1000000);
            unsafe.putLong(a1, 12, mem);
            unsafe.putInt(a1, 8, 1000000);
        }
        System.out.println(System.currentTimeMillis() - t0);
    

    it takes 10ms.

    5) There are malloc and alloc in C++, malloc just allocates memory block , calloc also initializes it with zeroes.

    cpp

    ...
    JNIEXPORT void JNICALL Java_Test_malloc(JNIEnv *env, jobject obj, jint n) {
         malloc(n);
    } 
    

    java

    private native static void malloc(int n);
    
    for (int i = 0; i < 500; i++) {
        malloc(1000000);
    }
    

    results malloc - 78 ms; calloc - 468 ms

    Conclusions

    1. It seems that Java array creation is slow because of useless element zeroing.
    2. We cannot change it, but Oracle can. No need to change anything in JLS, just add native methods to java.lang.reflect.Array like

      public static native xxx[] newUninitialziedXxxArray(int size);

    for all primitive numeric types (byte - double) and char type. It could be used all over the JDK, like in java.util.Arrays

        public static int[] copyOf(int[] original, int newLength) {
            int[] copy = Array.newUninitializedIntArray(newLength);
            System.arraycopy(original, 0, copy, 0, Math.min(original.length, newLength));
            ...
    

    or java.lang.String

       public String concat(String str) {
            ...   
            char[] buf = Array.newUninitializedCharArray(count + otherLen);
            getChars(0, count, buf, 0);
            ...
    
    0 讨论(0)
  • 2020-12-10 13:58

    Java 9 actually starts to expose this via jdk.internal.misc.Unsafe.allocateUninitializedArray method. It would actually require JDK.Unsupported module declaration.

    0 讨论(0)
提交回复
热议问题