How to register InternalRow with Kryo in Spark

Submitted by 心已入冬 on 2019-12-11 00:59:38

Question


I want to run Spark with Kryo serialisation, so I set spark.serializer=org.apache.spark.serializer.KryoSerializer and spark.kryo.registrationRequired=true.

When I then run my code I get the error:

Class is not registered: org.apache.spark.sql.catalyst.InternalRow[]

According to this post I used

sc.getConf.registerKryoClasses(Array( classOf[ org.apache.spark.sql.catalyst.InternalRow[_] ] ))

But then the error is:

org.apache.spark.sql.catalyst.InternalRow does not take type parameters


Answer 1:


You should use a separate registrator class, such as:

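import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

class MyRegistrator extends KryoRegistrator {
  // Kryo reports the array type as InternalRow[]; in Scala that is Array[InternalRow].
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[Array[org.apache.spark.sql.catalyst.InternalRow]])
  }
}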

Source: http://spark.apache.org/docs/0.6.0/tuning.html
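A registrator class is only used once Spark is pointed at it through the spark.kryo.registrator setting. The following is a minimal sketch of that wiring, not a tested setup: the object name KryoConfSketch and the helper buildConf are hypothetical, and MyRegistrator is repeated here only so the snippet compiles on its own.

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.{KryoRegistrator, KryoSerializer}
import org.apache.spark.sql.catalyst.InternalRow

// Same registrator as in the answer, repeated so this sketch is self-contained.
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[Array[InternalRow]])
  }
}

// Hypothetical helper object, introduced only for illustration.
object KryoConfSketch {
  // Builds a SparkConf with the two settings from the question plus the registrator.
  def buildConf(): SparkConf =
    new SparkConf()
      .set("spark.serializer", classOf[KryoSerializer].getName)
      .set("spark.kryo.registrationRequired", "true")
      // Must be the fully qualified class name of the registrator.
      .set("spark.kryo.registrator", classOf[MyRegistrator].getName)
}
```

The resulting conf would be passed to the SparkContext or SparkSession builder when it is created. With spark.kryo.registrationRequired=true, further "Class is not registered" errors for other classes may still appear, and each can be added to the same registerClasses method.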

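Or, if you want to register the class inside your Spark application, do it on the SparkConf before the session is created, because SparkContext.getConf returns a copy and changes made to it afterwards have no effect:

val cls = classOf[Array[org.apache.spark.sql.catalyst.InternalRow]]

// Register on the SparkConf first, then build the session from it.
val conf = new org.apache.spark.SparkConf().registerKryoClasses(Array(cls))
val spark = org.apache.spark.sql.SparkSession.builder().config(conf).getOrCreate()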

I use the first approach and it works perfectly; I haven't tested the second one.
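As for why the attempt in the question failed: the InternalRow[] in the Kryo message is Java's notation for an array class, which Scala writes as Array[InternalRow], not InternalRow[_]. Here is a small sketch of that correspondence, using String as a stand-in for InternalRow (an assumption made only to keep the example free of Spark dependencies). The object name ArrayClassDemo is hypothetical.

```scala
object ArrayClassDemo {
  // Scala's spelling of the Java array class String[].
  val arrayClass: Class[Array[String]] = classOf[Array[String]]

  def main(args: Array[String]): Unit = {
    // The JVM reports the simple name in Java notation, as in the Kryo error.
    println(arrayClass.getSimpleName) // prints "String[]"
    // It is an array class whose elements are String.
    println(arrayClass.isArray) // prints "true"
    println(arrayClass.getComponentType == classOf[String]) // prints "true"
  }
}
```

The same pattern applies to any element type, so classOf[Array[org.apache.spark.sql.catalyst.InternalRow]] is the class that Kryo reports as InternalRow[].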



Source: https://stackoverflow.com/questions/49109967/how-to-register-internalrow-with-kryo-in-spark
