Flink Kafka - Custom Class Data is always null

Posted by 一个人想着一个人 on 2019-12-25 02:21:43

Question


Custom Class

Person

class Person {
    private Integer id;
    private String name;
    // getters and setters
}

Kafka Flink Connector

TypeInformation<Person> info = TypeInformation.of(Person.class);
TypeInformationSerializationSchema<Person> schema = new TypeInformationSerializationSchema<>(info, new ExecutionConfig());
DataStream<Person> input = env.addSource(new FlinkKafkaConsumer08<>("persons", schema, getKafkaProperties()));

Now, if I send the JSON below

{ "id" : 1, "name": "Synd" }

through the Kafka console producer, the Flink code throws a NullPointerException. But if I use SimpleStringSchema instead of the custom schema defined above, the stream gets printed.

What is wrong with the above setup?


Answer 1:


The TypeInformationSerializationSchema is a de-/serialization schema that uses Flink's own serialization stack, and therefore Flink's own serializers. When you use this schema, Flink expects the incoming bytes to have been written by Flink's serializer for the Person type.

Given the excerpt of the Person class, Flink will most likely use its PojoSerializer. That serializer reads Flink's binary format, so it cannot make sense of JSON input.

If you want to use JSON as the input format, then you have to define your own DeserializationSchema which can parse JSON into Person.
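The mismatch described above can be illustrated with a small stand-alone sketch. It assumes a made-up binary layout (a 4-byte int followed by a length-prefixed string, written with `DataOutputStream`); this is only an analogy and is not Flink's actual wire format. The class and method names are hypothetical. The point is that a reader expecting one binary layout fails when it is handed JSON text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class BinaryVsJsonSketch {

    // Hypothetical binary layout: id as a 4-byte int, then name as a
    // length-prefixed string. An analogy only, NOT Flink's real format.
    static byte[] writeBinary(int id, String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(out)) {
            data.writeInt(id);
            data.writeUTF(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reads the layout produced by writeBinary and renders it as "id:name".
    // Throws UncheckedIOException when the bytes do not follow that layout.
    static String readBinary(byte[] bytes) {
        try (DataInputStream data = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int id = data.readInt();
            String name = data.readUTF();
            return id + ":" + name;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Bytes written by the matching writer can be read back.
        System.out.println(readBinary(writeBinary(1, "Synd")));

        // JSON text is just characters; the binary reader misinterprets them.
        byte[] json = "{ \"id\" : 1, \"name\": \"Synd\" }".getBytes(StandardCharsets.UTF_8);
        try {
            System.out.println(readBinary(json));
        } catch (UncheckedIOException e) {
            System.out.println("binary reader rejected JSON text: " + e.getCause());
        }
    }
}
```

In this sketch the reader takes the first JSON characters as an int and the next two as a string length far larger than the message, so the read fails. A schema that parses JSON explicitly avoids this kind of mismatch.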




Answer 2:


An answer for anyone who has the same question.

Custom deserialization schema

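import java.io.IOException;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.flink.api.common.typeinfo.TypeInformation;
// Flink 1.4+; older releases ship this interface in org.apache.flink.streaming.util.serialization
import org.apache.flink.api.common.serialization.DeserializationSchema;

class PersonSchema implements DeserializationSchema<Person> {

    private final ObjectMapper mapper = new ObjectMapper();

    // Parse the raw Kafka message bytes as JSON into a Person
    @Override
    public Person deserialize(byte[] bytes) throws IOException {
        return mapper.readValue(bytes, Person.class);
    }

    // The Kafka stream is unbounded, so no element marks its end
    @Override
    public boolean isEndOfStream(Person person) {
        return false;
    }

    // Tell Flink which type this schema produces
    @Override
    public TypeInformation<Person> getProducedType() {
        return TypeInformation.of(Person.class);
    }
}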

Using the schema

DataStream<Person> input = env.addSource( new FlinkKafkaConsumer08<>("persons", new PersonSchema() , getKafkaProperties()));
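A schema like this one throws from `deserialize` as soon as a message is not valid JSON, for example when a string value is missing its quotes. The sketch below shows one possible way to tolerate such records by returning null instead of throwing. It uses only the JDK: the regex-based `strictParse` is a hypothetical stand-in for `mapper.readValue`, and the class and method names are made up for illustration. Whether the Kafka consumer skips records for which `deserialize` returns null depends on the Flink version, so treat that as an assumption to check against the documentation for your release.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TolerantParseSketch {

    // Hypothetical stand-in for a real JSON parser: accepts only a flat
    // object of the form {"id": <digits>, "name": "<text>"}.
    private static final Pattern PERSON = Pattern.compile(
            "\\s*\\{\\s*\"id\"\\s*:\\s*(\\d+)\\s*,\\s*\"name\"\\s*:\\s*\"([^\"]*)\"\\s*\\}\\s*");

    // Strict variant: throws on anything that does not match, much like
    // a JSON parser would on malformed input. Returns "id:name".
    static String strictParse(byte[] bytes) throws IOException {
        Matcher m = PERSON.matcher(new String(bytes, StandardCharsets.UTF_8));
        if (!m.matches()) {
            throw new IOException("not a valid person record");
        }
        return m.group(1) + ":" + m.group(2);
    }

    // Tolerant variant: a malformed record yields null instead of an exception.
    static String tolerantParse(byte[] bytes) {
        try {
            return strictParse(bytes);
        } catch (IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] good = "{ \"id\" : 1, \"name\": \"Synd\" }".getBytes(StandardCharsets.UTF_8);
        byte[] bad = "{ \"id\" : 1, \"name\": Synd }".getBytes(StandardCharsets.UTF_8);
        System.out.println(tolerantParse(good));
        System.out.println(tolerantParse(bad));
    }
}
```

The same try/catch shape could wrap the `mapper.readValue` call inside `PersonSchema.deserialize`; logging the rejected payload there makes bad messages easier to trace.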


Source: https://stackoverflow.com/questions/51139625/flink-kafka-custom-class-data-is-always-null
