Question
Which class or method in Kafka Streams can we use to serialize a Java object to a byte array and deserialize it back? The following link proposes using ByteArrayOutputStream and ObjectOutputStream, but they are not thread-safe.
Send Custom Java Objects to Kafka Topic
Another option is to use ObjectMapper and ObjectReader (for thread safety), but that converts POJO -> JSON -> byte array, which seems like a heavyweight route. Is there a direct, thread-safe way to translate an object into a byte array and back? Please suggest.
import java.util.Map;

import org.apache.kafka.common.serialization.Serializer;

public class HouseSerializer<T> implements Serializer<T> {
    private Class<T> tClass;

    public HouseSerializer() {
    }

    @SuppressWarnings("unchecked")
    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // The target class is passed in through the "POJOClass" config entry.
        tClass = (Class<T>) configs.get("POJOClass");
    }

    @Override
    public void close() {
    }

    @Override
    public byte[] serialize(String topic, T data) {
        // Object serialization to be performed here
        return null;
    }
}
Note: Kafka version - 0.10.1
Answer 1:
"Wanted to check if there is a direct way to translate object into bytearray"
I would suggest looking at Avro serialization, ideally with the Confluent Schema Registry, although the registry is not required. JSON is a good fallback, but it takes more space "on the wire", so MsgPack would be the alternative there.
See Avro code example here
The example above uses the avro-maven-plugin to generate a LogLine class from the schema file in src/main/resources/avro.
Otherwise, it's up to you how you serialize your object into a byte array. For example, a String is commonly packed as
[(length of string) (UTF8 encoded bytes)]
while a boolean is typically written as a single byte holding 0 or 1.
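As a rough illustration of that kind of hand-rolled layout, here is a minimal sketch using only the JDK. The House class, its two fields, and the HouseCodec helper are hypothetical names made up for this example, and the 4-byte length prefix is just one possible choice. The boolean takes a whole byte because that is the smallest unit a byte array can hold.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical POJO used only for illustration; not taken from the question.
final class House {
    final String address;
    final boolean occupied;

    House(String address, boolean occupied) {
        this.address = address;
        this.occupied = occupied;
    }
}

// Hypothetical helper. Layout: [4-byte length][UTF-8 bytes of address][1 byte: 0 or 1]
final class HouseCodec {
    static byte[] toBytes(House house) {
        byte[] address = house.address.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + address.length + 1);
        buffer.putInt(address.length);                 // length prefix
        buffer.put(address);                           // UTF-8 encoded string
        buffer.put((byte) (house.occupied ? 1 : 0));   // boolean as one byte
        return buffer.array();
    }

    static House fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        byte[] address = new byte[buffer.getInt()];    // read the length prefix first
        buffer.get(address);
        boolean occupied = buffer.get() != 0;
        return new House(new String(address, StandardCharsets.UTF_8), occupied);
    }
}
```

In a Kafka Serializer or Deserializer, methods like these would be called from serialize and deserialize. They only touch local variables, so concurrent calls do not interfere with each other.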
"... which is threadsafe"
I understand the concern, but you aren't commonly sharing deserialized data between threads. Each message is sent, read, and processed independently of the others.
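To make that concrete for the ByteArrayOutputStream and ObjectOutputStream approach mentioned in the question: a single stream instance is not safe to share, but if the streams are created inside the method on every call, no state is shared between threads. Below is a minimal sketch under that assumption. JavaSerialization, toBytes, and fromBytes are hypothetical names, and the object being written is assumed to implement java.io.Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper; every call works on its own stream instances.
final class JavaSerialization {
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the ObjectOutputStream flushes everything into 'bytes'.
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The body of a serialize method could delegate to toBytes in this style. Java's built-in serialization is Java-only and fairly verbose on the wire, so the Avro, JSON, or MsgPack options mentioned earlier are still worth considering.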
Source: https://stackoverflow.com/questions/50377332/kafka-streams-pojo-serialization-deserialization