Simple, hassle-free, zero-boilerplate serialization in Scala/Java similar to Python's Pickle?

无人共我 2021-01-30 13:10

Is there a simple, hassle-free approach to serialization in Scala/Java that's similar to Python's pickle? Pickle is a dead-simple solution that's reasonably efficient in spa

5 answers
  •  耶瑟儿~
    2021-01-30 14:15

    I actually think you'd be best off with Kryo. I'm not aware of any alternative that needs less schema definition, other than non-binary protocols. You mention that pickle is not susceptible to the slowdowns and bloat that Kryo gets without registering classes, but Kryo is still faster and more compact than pickle even without registering classes. See the following micro-benchmark (obviously take it with a grain of salt, but this is what I could do easily):

    Python pickle

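    # Python 2 syntax (xrange, print statement), matching the Python 2.7.1 run reported below.
    import pickle
    import time
    class Person:
        def __init__(self, name, age):
            self.name = name
            self.age = age
    people = [Person("Alex", 20), Person("Barbara", 25), Person("Charles", 30), Person("David", 35), Person("Emily", 40)]
    # Warm-up pass; print the serialized size once.
    for i in xrange(10000):
        output = pickle.dumps(people, -1)  # -1 selects the highest pickle protocol
        if i == 0: print len(output)
    # Timed pass: 10000 serializations.
    start_time = time.time()
    for i in xrange(10000):
        output = pickle.dumps(people, -1)
    print time.time() - start_time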
    

    Outputs 174 bytes and 1.18-1.23 seconds for me (Python 2.7.1 on 64-bit Linux)
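
    The benchmark above uses Python 2 syntax. For readers on Python 3, a minimal port might look like the sketch below. It is not the code that produced the figures above: the helper names `make_people` and `benchmark` are hypothetical, `time.perf_counter` replaces `time.time`, and both the size and the timing will differ because newer interpreters use newer pickle protocols.

```python
import pickle
import time


class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


def make_people():
    """Build the same five-person list used in the original benchmark."""
    return [Person("Alex", 20), Person("Barbara", 25), Person("Charles", 30),
            Person("David", 35), Person("Emily", 40)]


def benchmark(obj, iterations=10000):
    """Return (serialized size in bytes, seconds taken by `iterations` dumps)."""
    # Warm-up pass, mirroring the first loop of the original script.
    for _ in range(iterations):
        payload = pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)
    # Timed pass.
    start = time.perf_counter()
    for _ in range(iterations):
        payload = pickle.dumps(obj, pickle.HIGHEST_PROTOCOL)
    elapsed = time.perf_counter() - start
    return len(payload), elapsed


if __name__ == "__main__":
    size, seconds = benchmark(make_people())
    print(size, "bytes")
    print(seconds, "seconds")
```

    No expected output is given, since it depends on the interpreter version and the machine.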

    Scala kryo

    // Kryo 1.x API (ObjectBuffer), matching the Kryo 1.04 run reported below.
    import com.esotericsoftware.kryo._
    import java.io._
    class Person(val name: String, val age: Int)
    object MyApp extends App {
      val people = Array(new Person("Alex", 20), new Person("Barbara", 25), new Person("Charles", 30), new Person("David", 35), new Person("Emily", 40))
      val kryo = new Kryo
      // Allow serializing classes that were not registered up front.
      kryo.setRegistrationOptional(true)
      val buffer = new ObjectBuffer(kryo)
      // Warm-up pass; print the serialized size once.
      for (i <- 0 until 10000) {
        val output = new ByteArrayOutputStream
        buffer.writeObject(output, people)
        if (i == 0) println(output.size)
      }
      // Timed pass: 10000 serializations, reported in seconds.
      val startTime = System.nanoTime
      for (i <- 0 until 10000) {
        val output = new ByteArrayOutputStream
        buffer.writeObject(output, people)
      }
      println((System.nanoTime - startTime) / 1e9)
    }
    

    Outputs 68 bytes and 30-40 ms for me (Kryo 1.04, Scala 2.9.1, Java 1.6.0.26 HotSpot JVM on 64-bit Linux). For comparison, it outputs 51 bytes and takes 18-25 ms if I register the classes.

    Comparison

    Without registering classes, Kryo uses about 40% of the space and 3% of the time that Python pickle does; with registered classes, about 30% of the space and 2% of the time. And you can always write a custom serializer when you want more control.
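
    The percentages can be rechecked from the figures reported above. The sketch below is only arithmetic on those numbers; taking the midpoint of each timing range is an assumption made here for illustration, and the names (`ratios`, `midpoint`, the constants) are hypothetical.

```python
# Figures as reported in the answer (sizes in bytes, timing ranges in seconds).
PICKLE_BYTES = 174
PICKLE_SECONDS = (1.18, 1.23)
KRYO_UNREGISTERED = {"bytes": 68, "seconds": (0.030, 0.040)}
KRYO_REGISTERED = {"bytes": 51, "seconds": (0.018, 0.025)}


def midpoint(bounds):
    """Midpoint of a (low, high) range; an assumption, not part of the original."""
    low, high = bounds
    return (low + high) / 2


def ratios(kryo):
    """Return (space ratio, time ratio) of a Kryo run relative to pickle."""
    space = kryo["bytes"] / PICKLE_BYTES
    time_ratio = midpoint(kryo["seconds"]) / midpoint(PICKLE_SECONDS)
    return space, time_ratio


for label, run in [("unregistered", KRYO_UNREGISTERED), ("registered", KRYO_REGISTERED)]:
    space, time_ratio = ratios(run)
    print(f"{label}: {space:.0%} of the space, {time_ratio:.0%} of the time")
# prints:
# unregistered: 39% of the space, 3% of the time
# registered: 29% of the space, 2% of the time
```

    These match the rounded 40%/3% and 30%/2% quoted in the comparison.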
