Why the odd performance curve differential between ByteBuffer.allocate() and ByteBuffer.allocateDirect()

醉酒成梦 2020-12-07 10:16

I'm working on some SocketChannel-to-SocketChannel code which will do best with a direct byte buffer--long lived and large (tens to hundreds of me

4 answers

-上瘾入骨i (original poster)

2020-12-07 10:47
Thread Local Allocation Buffers (TLAB)

I wonder whether the thread local allocation buffer (TLAB) in effect during the test was around 256K. TLABs let a thread allocate from the heap without synchronization, which would explain why the non-direct allocations of <=256K are fast.
- http://blogs.oracle.com/jonthecollector/entry/a_little_thread_privacy_please
What is commonly done is to give each thread a buffer that is used exclusively by that thread to do allocations. You have to use some synchronization to allocate the buffer from the heap, but after that the thread can allocate from the buffer without synchronization. In the hotspot JVM we refer to these as thread local allocation buffers (TLAB's). They work well.
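One rough way to look for the suspected 256K knee is to time both allocation methods at a few sizes on either side of it. The following is only a minimal sketch: the class name `AllocationTiming`, the size list, and the iteration counts are my own choices, and a naive `System.nanoTime()` loop like this is noisy and subject to JIT effects, so treat the numbers as a hint rather than a measurement.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not from the original question: a naive timing loop.
final class AllocationTiming {

    // Returns the average nanoseconds per allocation of a buffer of `size` bytes.
    static double averageNanos(int size, boolean direct, int iterations) {
        long sink = 0; // consume each buffer so the allocation is not trivially dead code
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            ByteBuffer buf = direct
                    ? ByteBuffer.allocateDirect(size)
                    : ByteBuffer.allocate(size);
            sink += buf.capacity();
        }
        long elapsed = System.nanoTime() - start;
        if (sink != (long) size * iterations) {
            throw new IllegalStateException("unexpected buffer capacity");
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        // Sizes straddling the hypothesized 256K TLAB boundary (an assumption).
        int[] sizes = {64 * 1024, 128 * 1024, 256 * 1024, 512 * 1024, 1024 * 1024};
        int iterations = 50; // kept small so direct memory is not exhausted
        for (int size : sizes) {
            // One untimed pass as a crude warm-up.
            averageNanos(size, false, iterations);
            averageNanos(size, true, iterations);
            double heap = averageNanos(size, false, iterations);
            double direct = averageNanos(size, true, iterations);
            System.out.printf("%8d bytes: heap %.0f ns, direct %.0f ns%n", size, heap, direct);
        }
    }
}
```

If the hypothesis holds, the heap column should jump somewhere past the TLAB size while the direct column stays on its own curve. A harness such as JMH would give more trustworthy figures.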

Large allocations bypassing the TLAB

If my hypothesis about a 256K TLAB is correct, then information later in the article suggests that the >256K allocations for the larger non-direct buffers bypass the TLAB. These allocations go straight to the heap, which requires thread synchronization and so incurs the performance hit.
- http://blogs.oracle.com/jonthecollector/entry/a_little_thread_privacy_please
An allocation that can not be made from a TLAB does not always mean that the thread has to get a new TLAB. Depending on the size of the allocation and the unused space remaining in the TLAB, the VM could decide to just do the allocation from the heap. That allocation from the heap would require synchronization but so would getting a new TLAB. If the allocation was considered large (some significant fraction of the current TLAB size), the allocation would always be done out of the heap. This cut down on wastage and gracefully handled the much-larger-than-average allocation.
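The decision described in that quote can be pictured with a toy model. This is not HotSpot's actual algorithm: the class `TlabModel`, the `largeFraction` parameter, and the order of the checks are all assumptions I made to illustrate the idea that a request which does not fit either triggers a new TLAB or, if it is large relative to the TLAB, goes to the shared heap.

```java
// Toy model only; names and thresholds are hypothetical, not HotSpot internals.
final class TlabModel {

    enum Path { TLAB, NEW_TLAB, SHARED_HEAP }

    private final long tlabSize;
    private final double largeFraction; // assumed cutoff, as a fraction of the TLAB size
    private long used;                  // bytes already handed out from the current TLAB

    TlabModel(long tlabSize, double largeFraction) {
        if (tlabSize <= 0 || largeFraction <= 0 || largeFraction > 1) {
            throw new IllegalArgumentException("bad TLAB model parameters");
        }
        this.tlabSize = tlabSize;
        this.largeFraction = largeFraction;
    }

    // Decides where an allocation of `size` bytes would be served in this model.
    Path allocate(long size) {
        if (size <= tlabSize - used) {
            used += size;            // fits: bump the pointer, no synchronization
            return Path.TLAB;
        }
        if (size >= largeFraction * tlabSize) {
            return Path.SHARED_HEAP; // large: synchronized heap allocation, TLAB kept
        }
        used = size;                 // small but does not fit: retire and refill the TLAB
        return Path.NEW_TLAB;
    }

    public static void main(String[] args) {
        TlabModel model = new TlabModel(256 * 1024, 0.25);
        long[] requests = {200 * 1024, 512 * 1024, 60 * 1024, 64 * 1024};
        for (long request : requests) {
            System.out.println(request + " bytes -> " + model.allocate(request));
        }
    }
}
```

In this model every buffer larger than the TLAB takes the synchronized path, which is the behavior the hypothesis relies on for the >256K non-direct buffers.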

Tweaking the TLAB parameters

This hypothesis could be tested using information from a later article indicating how to tweak the TLAB and get diagnostic info:
- http://blogs.oracle.com/jonthecollector/entry/the_real_thing
To experiment with a specific TLAB size, two -XX flags need to be set, one to define the initial size, and one to disable the resizing:
```
-XX:TLABSize= -XX:-ResizeTLAB
```
The minimum size of a tlab is set with -XX:MinTLABSize which defaults to 2K bytes. The maximum size is the maximum size of an integer Java array, which is used to fill the unallocated portion of a TLAB when a GC scavenge occurs.

Diagnostic Printing Options
```
-XX:+PrintTLAB
```
Prints at each scavenge one line for each thread (starts with "TLAB: gc thread: " without the "'s) and one summary line.
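When experimenting with these flags it is easy to lose track of which ones a given run actually received. The sketch below lists the TLAB-related arguments the current JVM was started with, using the standard `RuntimeMXBean.getInputArguments()` call; the class name `TlabFlagCheck` and the simple "contains TLAB" filter are my own. As an example, the JVM could be started with `-XX:TLABSize=256k -XX:-ResizeTLAB`, where 256k is just the value suggested by the hypothesis. Note also that, as far as I know, `-XX:+PrintTLAB` was dropped when unified logging arrived in JDK 9, with `-Xlog:gc+tlab=trace` as the replacement.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for checking which TLAB flags a test run was launched with.
final class TlabFlagCheck {

    // Keeps only the JVM arguments that mention TLAB (e.g. -XX:TLABSize=..., -XX:-ResizeTLAB).
    static List<String> tlabFlags(List<String> jvmArgs) {
        return jvmArgs.stream()
                .filter(arg -> arg.contains("TLAB"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        List<String> flags = tlabFlags(jvmArgs);
        if (flags.isEmpty()) {
            System.out.println("No TLAB flags set; the JVM is using its defaults.");
        } else {
            flags.forEach(flag -> System.out.println("TLAB flag: " + flag));
        }
    }
}
```

Printing this at the start of each benchmark run makes it easier to match a timing curve to the TLAB configuration that produced it.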