String Deduplication feature of Java 8

后端 未结 4 1863
难免孤独
难免孤独 2020-11-28 02:06

Since String in Java (like other languages) consumes a lot of memory because each character consumes two bytes, Java 8 has introduced a new feature called

4条回答
  •  情歌与酒
    2020-11-28 02:50

    Imagine you have a phone book, which contains people, which have a String firstName and a String lastName. And it happens that in your phone book, 100,000 people have the same firstName = "John".

    Because you get the data from a database or a file those strings are not interned so your JVM memory contains the char array {'J', 'o', 'h', 'n'} 100 thousand times, one per John string. Each of these arrays takes, say, 20 bytes of memory so those 100k Johns take up 2 MB of memory.

    With deduplication, the JVM will realise that "John" is duplicated many times and make all those John strings point to the same underlying char array, decreasing the memory usage from 2MB to 20 bytes.

    You can find a more detailed explanation in the JEP. In particular:

    Many large-scale Java applications are currently bottlenecked on memory. Measurements have shown that roughly 25% of the Java heap live data set in these types of applications is consumed by String objects. Further, roughly half of those String objects are duplicates, where duplicates means string1.equals(string2) is true. Having duplicate String objects on the heap is, essentially, just a waste of memory.

    [...]

    The actual expected benefit ends up at around 10% heap reduction. Note that this number is a calculated average based on a wide range of applications. The heap reduction for a specific application could vary significantly both up and down.

提交回复
热议问题