Why is Joda Time serialized form so large, and what to do about it?

Submitted by 梦想与她 on 2019-12-12 11:53:36

Question


On my machine, the following code snippet:

DateTime now = DateTime.now();
System.out.println(now);
System.out.println("Date size:\t\t"+serialiseToArray(now).length);
System.out.println("DateString size:\t"+serialiseToArray(now.toString()).length);
System.out.println("java.util.Date size:\t"+serialiseToArray(new Date()).length);
Duration twoHours = Duration.standardHours(2);
System.out.println(twoHours);
System.out.println("Duration size:\t\t"+serialiseToArray(twoHours).length);
System.out.println("DurationString size:\t"+serialiseToArray(twoHours.toString()).length);

Gives the following output:

2013-09-09T15:07:44.642+01:00
Date size:      273
DateString size:    36
java.util.Date size:    46
PT7200S
Duration size:      107
DurationString size:    14

As you can see, the serialized org.joda.time.DateTime is more than 5 times larger than both its String form (which seems to describe it completely) and the equivalent java.util.Date. The Duration representing 2 hours is also much larger than I would expect: looking at the source, its only member variable appears to be a single long.

Why are these serialized objects so large? And is there any pre-existing solution for getting a smaller representation?

The serialiseToArray method, for reference:

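// Serializes the object with standard Java serialization and returns the raw bytes.
private static byte[] serialiseToArray(Serializable s)
{
    ByteArrayOutputStream byteArrayBuffer = new ByteArrayOutputStream();
    // try-with-resources closes (and thereby flushes) the stream before the bytes are read
    try (ObjectOutputStream out = new ObjectOutputStream(byteArrayBuffer))
    {
        out.writeObject(s);
    }
    catch (IOException ex)
    {
        throw new RuntimeException(ex);
    }
    return byteArrayBuffer.toByteArray();
}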

Answer 1:


Serialization has some overhead. In this case the most noticeable part is that the class structure is described in the output itself. Duration has a serializable base class (BaseDuration), so a second class descriptor is written along with its own, which makes the overhead somewhat larger than for Date, whose stream describes only a single class.

Those classes are referenced by their fully-qualified class names in the serialized data, and those names alone take up a fair number of bytes.
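This is easy to check. The following is a minimal sketch, not part of the original answer: the class name ClassNameInStream and its helper methods are made up for illustration, and it uses java.util.Date so that it runs with the JDK alone. It serializes one object and looks for the class name as plain text in the resulting bytes:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Date;

public class ClassNameInStream {

    // Serializes an object with standard Java serialization.
    static byte[] serialize(Serializable s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(s);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    // Returns true if the class name occurs as plain text in the serialized bytes.
    static boolean containsClassName(byte[] bytes, String className) {
        // ISO-8859-1 maps every byte to exactly one char, so no byte is lost.
        String raw = new String(bytes, StandardCharsets.ISO_8859_1);
        return raw.contains(className);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new Date());
        System.out.println("total bytes: " + bytes.length);
        System.out.println("contains class name: "
                + containsClassName(bytes, Date.class.getName()));
    }
}
```

The same check should work for a Joda Time object with Joda Time on the classpath, where both org.joda.time.Duration and org.joda.time.base.BaseDuration would be expected to show up.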

Good news: that overhead is only paid once per output stream. If you serialize another Duration object to the same stream, the total size grows only slightly.

I used the jdeserialize project to inspect the result of serializing a java.util.Date and a Duration (note that this tool does not need access to the .class files, so everything it dumps is actually contained in the serialized data):

The result for java.util.Date:

read: java.util.Date _h0x7e0001 = r_0x7e0000;
//// BEGIN stream content output
java.util.Date _h0x7e0001 = r_0x7e0000;
//// END stream content output

//// BEGIN class declarations (excluding array classes)
class java.util.Date implements java.io.Serializable {
}

//// END class declarations

//// BEGIN instance dump
[instance 0x7e0001: 0x7e0000/java.util.Date
  object annotations:
    java.util.Date
        [blockdata 0x00: 8 bytes]

  field data:
    0x7e0000/java.util.Date:
]
//// END instance dump

The result for Duration:

read: org.joda.time.Duration _h0x7e0002 = r_0x7e0000;
//// BEGIN stream content output
org.joda.time.Duration _h0x7e0002 = r_0x7e0000;
//// END stream content output

//// BEGIN class declarations (excluding array classes)
class org.joda.time.Duration extends org.joda.time.base.BaseDuration implements java.io.Serializable {
}

class org.joda.time.base.BaseDuration implements java.io.Serializable {
    long iMillis;
}

//// END class declarations

//// BEGIN instance dump
[instance 0x7e0002: 0x7e0000/org.joda.time.Duration
  field data:
    0x7e0001/org.joda.time.base.BaseDuration:
        iMillis: 0
    0x7e0000/org.joda.time.Duration:
]
//// END instance dump

Note that the "class declaration" block is quite a bit longer for Duration. This also explains why serializing a single Duration takes 107 bytes, but serializing two (distinct) Duration objects takes only 121 bytes.
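The once-per-stream effect can be reproduced without Joda Time. The sketch below is an illustration only (the class PerStreamOverhead and its method are hypothetical names, and java.util.Date stands in for Duration); it writes one object and then two distinct objects to a single ObjectOutputStream and compares the sizes:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Date;

public class PerStreamOverhead {

    // Writes all given objects to ONE ObjectOutputStream and returns the total size.
    static int sizeInOneStream(Serializable... objects) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            for (Serializable o : objects) {
                out.writeObject(o);
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        int one = sizeInOneStream(new Date(1L));
        int two = sizeInOneStream(new Date(1L), new Date(2L));
        System.out.println("one Date:  " + one);
        System.out.println("two Dates: " + two);
        // The second object only needs a back-reference to the class
        // descriptor plus its own data, so this difference is small.
        System.out.println("cost of the second Date: " + (two - one));
    }
}
```

The second object costs far less than the first because the class descriptor is written once and then referred to by a handle.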




Answer 2:


From the DateTime source:

Internally, the class holds two pieces of data. Firstly, it holds the datetime as milliseconds from the Java epoch of 1970-01-01T00:00:00Z. Secondly, it holds a Chronology which determines how the millisecond instant value is converted into the date time fields. The default Chronology is org.joda.time.chrono.ISOChronology which is the agreed international standard and compatible with the modern Gregorian calendar.

ISOChronology derives from AssembledChronology, most of whose fields (but not all) are declared transient.
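If the chronology is always the default ISO one, a smaller representation only needs the millisecond instant and the time zone ID. The following is a rough sketch of that idea and not something the answers above propose: CompactInstantCodec and InstantData are hypothetical names, and the code deliberately avoids Joda Time so that it runs on its own. With Joda Time, the two values could be read with getMillis() and getZone().getID(), and the object rebuilt with new DateTime(millis, DateTimeZone.forID(zoneId)).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class CompactInstantCodec {

    // Hypothetical value holder: the state a DateTime in the default ISO
    // chronology needs (millis since the epoch plus a time zone ID).
    static final class InstantData {
        final long millis;
        final String zoneId;

        InstantData(long millis, String zoneId) {
            this.millis = millis;
            this.zoneId = zoneId;
        }
    }

    // Encodes as 8 bytes of millis followed by the zone ID in modified UTF-8.
    static byte[] encode(InstantData data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(data.millis);
            out.writeUTF(data.zoneId);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    // Reads the two values back in the order they were written.
    static InstantData decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return new InstantData(in.readLong(), in.readUTF());
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = encode(new InstantData(1378735664642L, "Europe/London"));
        InstantData back = decode(bytes);
        System.out.println(bytes.length + " bytes: " + back.millis + " " + back.zoneId);
    }
}
```

This trades generality for size: no class metadata is written, so the reader has to know the layout in advance, and a non-ISO chronology would be lost.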



Source: https://stackoverflow.com/questions/18700347/why-is-joda-time-serialized-form-so-large-and-what-to-do-about-it
