I am receiving from a remote server Kafka Avro messages in Python (using the consumer of Confluent Kafka Python library), that represent clickstream data with json dictionar
If you use Confluent Schema Registry and want to deserialize avro messages, just add message_bytes.seek(5) to the decode function, since Confluent adds 5 extra bytes before the typical avro-formatted data.
def decode(msg_value):
message_bytes = io.BytesIO(msg_value)
message_bytes.seek(5)
decoder = BinaryDecoder(message_bytes)
event_dict = reader.read(decoder)
return event_dict