
What are the pros and cons of Java serialization vs Kryo serialization?

Question: In Spark, Java serialization is the default. If Kryo is that efficient, why is it not the default? Are there drawbacks to using Kryo, and in which scenarios should we use Kryo rather than Java serialization?

Answer 1: Here is a comment from the documentation: Kryo is significantly faster and more compact than Java serialization (often as much as 10x), but does not support all Serializable types and requires you to register the classes you’ll use in the program in advance for best performance. So it is not used
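To make the "more compact" claim concrete, here is a minimal sketch that measures how many bytes standard Java serialization spends on a tiny object compared with the raw bytes of its fields. It does not use Kryo or Spark at all; it only illustrates the stream-header and class-descriptor overhead that compact serializers are designed to reduce. The `Point` class and the helper method names are hypothetical, introduced purely for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSizeDemo {

    // Hypothetical payload class, used only for illustration.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Bytes produced by standard Java serialization
    // (stream header + class descriptor + field values).
    static int javaSerializedSize(Serializable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.size();
    }

    // Bytes needed for the two int fields alone, with no metadata.
    static int rawFieldSize(Point p) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(p.x);
            out.writeInt(p.y);
        }
        return buffer.size();
    }

    public static void main(String[] args) throws IOException {
        Point p = new Point(1, 2);
        System.out.println("Java serialization: " + javaSerializedSize(p) + " bytes");
        System.out.println("Raw fields only:    " + rawFieldSize(p) + " bytes");
    }
}
```

The gap between the two numbers is metadata that Java serialization writes for the class. Serializers that refer to pre-registered classes by a short ID can avoid most of it, which is consistent with the documentation's advice to register classes in advance.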


If your application uses a serialization tool such as Google Protobuf or Apache Thrift, you need to register your own serializer. Taking Protobuf and Thrift as examples:

Google Protobuf example. Register the ProtobufSerializer:

    final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.getConfig().registerTypeWithKryoSerializer(PbSdkStat.DataRecords.class, ProtobufSerializer.class);

Add the Maven dependency:

    <dependency>
      <groupId>com.twitter</groupId>
      <artifactId>chill-protobuf</artifactId>
      <version>0.7.6</version>
      <!-- exclusions for dependency conversion -->
      <exclusions>
        <exclusion>
          <groupId>com.esotericsoftware.kryo</groupId>
          <artifactId>kryo


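I have been working remotely from my hometown lately. WeChat suddenly pinged: a colleague had run into a strange problem and asked me to take a look. The symptom, as the title says, is that a cached value cannot be retrieved. It sounded interesting, so I decided to get to the bottom of it. Here is part of the code to reconstruct the scene:

    @CreateCache(name = "demo", expire = 600)
    private Cache<String, ThirdPartyEventResponse> cache;

    @Test
    public void test() {
        ThirdPartyEventResponse eventResponse = new ThirdPartyEventResponse();
        eventResponse.setTicketCategories(Arrays.asList(ticketCategoryResponse));
        // omitted .....
        // put
        cache.put(DisChannelType.PIAONIU.getValue(), eventResponse);
        // get
        ThirdPartyEventResponse resp = cache.get(DisChannelType.PIAONIU.getValue());
    }

A Get immediately after the Put returns no value. That is rather baffling, so let's investigate carefully. First, the expiration time is 600 seconds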

Does Kryo help in SparkSQL?

Question: Kryo helps improve the performance of Spark applications through its efficient serialization approach. I'm wondering whether Kryo will also help in the case of SparkSQL, and how I should use it. In SparkSQL applications we do a lot of column-based operations like $"c1", $"c2") , and the schema of a DataFrame Row is not quite static. I'm not sure how to register one or several serializer classes for this use case. For example: case class Info(name: String, address: String) ... val df = spark

No such property: ToInputStream for class: Script4

Question: I have a situation where I want to import my graph data into a database. I have JanusGraph (latest version) running with Cassandra (version 3) and Elasticsearch (version 6.6.0) using Docker. It was suggested that I use the Gryo format, so I tried this command "my_graph.kryo"), graph); but ended up with the error No such property: ToInputStream for class: Script4. The documentation I am following is here. Please take a look and

