Question
I have read about TheUnsafe, but I get confused by the fact that, unlike in C/C++, we have to work out the offset of things ourselves. There is also the 32-bit VM vs. the 64-bit VM, which may or may not have different pointer sizes depending on a particular VM setting being turned on or off (also, I'm assuming all offsets to data are actually based on pointer arithmetic, so this would influence them too).
Unfortunately, it seems everything ever written about how to use TheUnsafe stems from one article only (the one that happened to be the first), and all the others copy-pasted from it to a certain degree. Not many of them exist, and some are unclear because the author apparently did not speak English.
My question is:
How can I find the offset of a field + the pointer to the instance that owns that field (or field of a field, or field of a field of a field...) using TheUnsafe?
How can I use it to perform a memcpy to another pointer + offset memory address?
Considering the data may have several GB in size, and considering the heap offers no direct control over data alignment and it may most certainly be fragmented because:
1) I don't think there's anything stopping the VM from allocating field1 at offset + 10 and field2 at offset sizeof(field1) + 32, is there?
2) I would also assume the GC would move big chunks of data around, leading to a field with 1GB in size being fragmented sometimes.
So is the memcpy operation as I described even possible?
If data is fragmented because of GC, of course the heap has a pointer to where the next chunk of data is, but using the simple process described above doesn't seem to cover that.
So must the data be off-heap for this to (maybe) work? If so, how do I allocate off-heap data using TheUnsafe, make such data work as a field of an instance, and of course free the allocated memory once done with it?
I encourage anyone who didn't quite understand the question to ask for any specifics they need to know.
I also urge people to refrain from answering if their whole idea is "put all objects you need to copy in an array and use System.arraycopy". I know it's common practice in this wonderful forum, instead of answering what's been asked, to offer a complete alternate solution that, in principle, has nothing to do with the original question apart from the fact that it gets the same job done.
Best regards.
Answer 1:
First a big warning: “Unsafe must die” http://blog.takipi.com/still-unsafe-the-major-bug-in-java-6-that-turned-into-a-java-9-feature/
Some prerequisites
static class DataHolder {
int i1;
int i2;
int i3;
DataHolder d1;
DataHolder d2;
public DataHolder(int i1, int i2, int i3, DataHolder dh) {
this.i1 = i1;
this.i2 = i2;
this.i3 = i3;
this.d1 = dh;
this.d2 = this;
}
}
Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
theUnsafe.setAccessible(true);
Unsafe unsafe = (Unsafe) theUnsafe.get(null);
DataHolder dh1 = new DataHolder(11, 13, 17, null);
DataHolder dh2 = new DataHolder(23, 29, 31, dh1);
The basics
To get the offset of a field (i1), you can use the following code:
Field fi1 = DataHolder.class.getDeclaredField("i1");
long oi1 = unsafe.objectFieldOffset(fi1);
and then, to access the field value of instance dh1, you can write
System.out.println(unsafe.getInt(dh1, oi1)); // will print 11
You can use similar code to access an object reference (d1):
Field fd1 = DataHolder.class.getDeclaredField("d1");
long od1 = unsafe.objectFieldOffset(fd1);
and you can use it to get the reference to dh1 from dh2:
System.out.println(dh1 == unsafe.getObject(dh2, od1)); // will print true
Field ordering and alignment
To get the offsets of all declared instance fields of a class:
for (Field f: DataHolder.class.getDeclaredFields()) {
if (!Modifier.isStatic(f.getModifiers())) {
System.out.println(f.getName()+" "+unsafe.objectFieldOffset(f));
}
}
In my tests it seems that the JVM reorders fields as it sees fit (i.e. adding a field can yield completely different offsets on the next run).
An object's address in native memory
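Since the layout is up to the JVM, offsets are best looked up at runtime rather than hard-coded. The following is a minimal sketch that lists the instance fields of a class sorted by offset, which makes the chosen layout visible. The class `Sample` and the helper names `loadUnsafe` and `layoutOf` are made up for this illustration; the concrete offsets printed depend on the JVM and its settings.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;
import sun.misc.Unsafe;

public class FieldLayoutSketch {

    // Hypothetical sample class, a cut-down version of DataHolder.
    static class Sample {
        int i1;
        int i2;
        Sample d1;
    }

    // Obtains the Unsafe singleton via reflection, as in the prerequisites.
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not accessible", e);
        }
    }

    // Returns the instance-field names of a class keyed by their offset,
    // in ascending offset order.
    static TreeMap<Long, String> layoutOf(Class<?> type) {
        Unsafe unsafe = loadUnsafe();
        TreeMap<Long, String> layout = new TreeMap<>();
        for (Field f : type.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers())) {
                layout.put(unsafe.objectFieldOffset(f), f.getName());
            }
        }
        return layout;
    }

    public static void main(String[] args) {
        // The printed offsets are JVM-specific, so no expected output is given.
        for (Map.Entry<Long, String> e : layoutOf(Sample.class).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```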
It's important to understand that the following code is going to crash your JVM sooner or later, because the Garbage Collector will move your objects at random times, without you having any control on when and why it happens.
Also it's important to understand that the following code depends on the JVM type (32 bits versus 64 bits) and on some start parameters for the JVM (namely, usage of compressed oops on 64 bit JVMs).
On a 32 bit VM a reference to an object has the same size as an int. So what do you get if you call int addr = unsafe.getInt(dh2, od1); instead of unsafe.getObject(dh2, od1)? Could it be the native address of the object?
Let's try:
System.out.println(unsafe.getInt(null, unsafe.getInt(dh2, od1)+oi1));
will print out 11 as expected.
On a 64 bit VM without compressed oops (-XX:-UseCompressedOops), you will need to write
System.out.println(unsafe.getInt(null, unsafe.getLong(dh2, od1)+oi1));
On a 64 bit VM with compressed oops (-XX:+UseCompressedOops), things are a bit more complicated. This variant has 32 bit object references that are turned into 64 bit addresses by multiplying them by 8L (this assumes the common zero-based mode with the default 8-byte object alignment):
System.out.println(unsafe.getInt(null, 8L*(0xffffffffL&unsafe.getInt(dh2, od1))+oi1));
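The multiplication by 8 is one case of a more general decoding scheme: treat the 32 bit value as unsigned, shift it left, and add a heap base. Below is a minimal sketch of that arithmetic only. The method name `decode` and the `base` and `shift` parameters are made up for this illustration; the values a particular JVM actually uses are internal to it and are assumed here, not queried.

```java
public class CompressedOopSketch {

    // Turns a 32 bit compressed reference into a 64 bit address.
    // base and shift are assumptions supplied by the caller:
    // base = 0 and shift = 3 corresponds to the "multiply by 8L" case.
    static long decode(int narrow, long base, int shift) {
        long unsigned = narrow & 0xffffffffL; // avoid sign extension
        return base + (unsigned << shift);
    }

    public static void main(String[] args) {
        // 0x10 shifted left by 3 is 0x80
        System.out.println(Long.toHexString(decode(0x10, 0L, 3))); // prints 80
    }
}
```

The mask matters: without it, a compressed reference with the top bit set would be sign-extended into a negative long and point far outside the heap.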
What is the problem with these accesses
The problem is the Garbage Collector together with this code. The Garbage Collector can move objects around as it pleases. Since the JVM knows about its object references (the local variables dh1 and dh2, the fields d1 and d2 of these objects), it can adjust these references accordingly, and your code will never notice.
By extracting object references into int/long variables you turn these object references into primitive values that happen to have the same bit-pattern as an object reference, but the Garbage Collector does not know that these were object references (they could have been generated by a random generator as well) and therefore does not adjust these values while moving objects around. So as soon as a Garbage Collection cycle is triggered your extracted addresses are no longer valid, and trying to access memory at these addresses might crash your JVM immediately (the good case) or you might trash your memory without noticing on the spot (the bad case).
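Memory allocated through Unsafe itself is not managed by the Garbage Collector, so an address into it stays valid until it is freed. The question's memcpy can therefore be sketched safely with off-heap blocks and primitive arrays, without ever extracting an object's address. This is only an illustration: the class name `OffHeapCopySketch` and the helpers `loadUnsafe` and `roundTrip` are made up here, and the sketch copies a small byte array rather than gigabytes of data.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import sun.misc.Unsafe;

public class OffHeapCopySketch {

    // Obtains the Unsafe singleton via reflection, as in the prerequisites.
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not accessible", e);
        }
    }

    // Copies src into native memory, copies that block into a second
    // native block, and reads the result back into a new heap array.
    static byte[] roundTrip(byte[] src) {
        Unsafe unsafe = loadUnsafe();
        long base = unsafe.arrayBaseOffset(byte[].class);
        long len = src.length;
        long a = 0;
        long b = 0;
        try {
            a = unsafe.allocateMemory(len);
            b = unsafe.allocateMemory(len);
            unsafe.copyMemory(src, base, null, a, len); // heap array -> native
            unsafe.copyMemory(a, b, len);               // native -> native
            byte[] dst = new byte[src.length];
            unsafe.copyMemory(null, b, dst, base, len); // native -> heap array
            return dst;
        } finally {
            // Native memory is never reclaimed automatically.
            if (a != 0) unsafe.freeMemory(a);
            if (b != 0) unsafe.freeMemory(b);
        }
    }

    public static void main(String[] args) {
        byte[] out = roundTrip(new byte[] {11, 13, 17});
        System.out.println(Arrays.toString(out)); // prints [11, 13, 17]
    }
}
```

Passing the array itself plus arrayBaseOffset, instead of a raw address, keeps the heap side of the copy visible to the JVM, so a moving collector cannot invalidate it in the way described above.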
Source: https://stackoverflow.com/questions/38114099/understanding-how-to-memcpy-with-theunsafe