Cannot see message while sinking kafka stream and cannot see print message in flink 1.2

南笙酒味 提交于 2019-12-11 17:45:56

问题


My goal is to use kafka to read in a string in json format, do a filter to the string and then sink the message out (still in json string format).

For testing purpose, my input string message looks like:

{"a":1,"b":2}

And my code of implementation is:

def main(args: Array[String]): Unit = {

// parse input arguments
val params = ParameterTool.fromArgs(args)

if (params.getNumberOfParameters < 4) {
  println("Missing parameters!\n"
    + "Usage: Kafka --input-topic <topic> --output-topic <topic> "
    + "--bootstrap.servers <kafka brokers> "
    + "--zookeeper.connect <zk quorum> --group.id <some id> [--prefix <prefix>]")
  return
}

val env = StreamExecutionEnvironment.getExecutionEnvironment
env.getConfig.disableSysoutLogging
env.getConfig.setRestartStrategy(RestartStrategies.fixedDelayRestart(4, 10000))
// create a checkpoint every 5 seconds
env.enableCheckpointing(5000)
// make parameters available in the web interface
env.getConfig.setGlobalJobParameters(params)

// create a Kafka streaming source consumer for Kafka 0.10.x
val kafkaConsumer = new FlinkKafkaConsumer010(
  params.getRequired("input-topic"),
  new JSONKeyValueDeserializationSchema(false),
  params.getProperties)

val messageStream = env.addSource(kafkaConsumer)

val filteredStream: DataStream[ObjectNode] = messageStream.filter(node => node.get("a").asText.equals("1")
                      && node.get("b").asText.equals("2"))

messageStream.print()
// Refer to: https://stackoverflow.com/documentation/apache-flink/9004/how-to-define-a-custom-deserialization-schema#t=201708080802319255857
filteredStream.addSink(new FlinkKafkaProducer010[ObjectNode](
  params.getRequired("output-topic"),
  new SerializationSchema[ObjectNode] {
    override def serialize(element: ObjectNode): Array[Byte] = element.toString.getBytes()
  }, params.getProperties
))

env.execute("Kafka 0.10 Example")
}

As can be seen, I want to print message stream to the console and sink the filtered message to kafka. However, I can see neither of them.

The interesting thing is, if I modify the schema of KafkaConsumer from JSONKeyValueDeserializationSchema to SimpleStringSchema, I can see messageStream print to the console. Code as shown below:

 val kafkaConsumer = new FlinkKafkaConsumer010(
  params.getRequired("input-topic"),
  new SimpleStringSchema,
  params.getProperties)

val messageStream = env.addSource(kafkaConsumer)
messageStream.print()

This makes me think if I use JSONKeyValueDeserializationSchema, my input message is actually not accepted by Kafka. But this seems so weird and quite different from the online document(https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/kafka.html)

Hope someone can help me out!


回答1:


The JSONKeyValueDeserializationSchema() expects message key with each kafka msg and I am assuming that no key is supplied when the JSON messages are produced and sent over the kafka topic.

Thus to solve the issue, try using JSONDeserializationSchema() which expects only the message and creates an object node based on the message received.



来源:https://stackoverflow.com/questions/45564373/cannot-see-message-while-sinking-kafka-stream-and-cannot-see-print-message-in-fl

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!