Invalid int encoding on deserializing Kafka Avro topics in Spark Structured Streaming

Submitted by 三世轮回 on 2019-12-11 08:33:56

Question


I'm trying to process streaming Avro data from Kafka using Spark Structured Streaming (version 2.3.1), so I tried this example to deserialize it. It works only if the topic's value part contains a StringType, but in my case the schema contains long and int fields, like below:

public static final String USER_SCHEMA = "{"
        + "\"type\":\"record\","
        + "\"name\":\"variables\","
        + "\"fields\":["
        + "  { \"name\":\"time\", \"type\":\"long\" },"
        + "  { \"name\":\"thingId\", \"type\":\"string\" },"
        + "  { \"name\":\"controller\", \"type\":\"int\" },"
        + "  { \"name\":\"module\", \"type\":\"int\" }"
        + "]}";

It throws an exception at

sparkSession.udf().register("deserialize", (byte[] data) -> {
    GenericRecord record = recordInjection.invert(data).get(); // throws at the invert method
    return RowFactory.create(record.get("time"), record.get("thingId").toString(),
            record.get("controller"), record.get("module"));
}, DataTypes.createStructType(type.fields()));

saying

Failed to invert: [B@22a45e7
Caused by java.io.IOException: Invalid int encoding.

because time, controller and module are declared as long and int types in the schema.

I guess this is some sort of encoding/decoding error with the byte array byte[] data.
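That guess is on the right track. A minimal, Spark-free sketch of the failure mode: Avro's binary encoding of ints and longs (zig-zag varints) routinely produces byte values that are not valid UTF-8 on their own, so any code path that turns the raw payload into a UTF-8 String and back silently corrupts it (the byte values here are illustrative, not taken from the question):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8RoundTrip {
    public static void main(String[] args) {
        // Bytes like 0x80 or a dangling 0xC5 are invalid as standalone UTF-8,
        // yet they are perfectly normal inside an Avro varint-encoded int/long.
        byte[] original = {0x02, (byte) 0x80, (byte) 0xC5, 0x01};

        // Decoding replaces invalid sequences with U+FFFD; re-encoding then
        // yields different bytes, so the round trip is lossy.
        String asString = new String(original, StandardCharsets.UTF_8);
        byte[] roundTripped = asString.getBytes(StandardCharsets.UTF_8);

        System.out.println(Arrays.equals(original, roundTripped)); // prints false
    }
}
```

This is why a string-typed record survives the trip (its bytes are valid UTF-8) while a record with int/long fields blows up with "Invalid int encoding" when the decoder hits the mangled bytes.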


Answer 1:


Did you take a look at https://issues.apache.org/jira/browse/AVRO-1650? It describes exactly the issue you might be running into: the default UTF-8 encoding can lose data during the encode/decode round trip, because arbitrary binary sequences are not valid UTF-8.

If you are dealing with binary-encoded data, I would also suggest Base64-encoding it before saving or transmitting it as text, since Base64 output is plain ASCII and survives any text encoding; per the link above, ISO-8859-1 is the right charset to use when you must map raw bytes to a string directly.



Source: https://stackoverflow.com/questions/55914401/invalid-int-encoding-on-deserializing-kafka-avro-topics-in-spark-structured-stre
