How to fix “java.io.NotSerializableException: org.apache.kafka.clients.consumer.ConsumerRecord” in Spark Streaming Kafka Consumer?

Asked by 挽巷 on 2020-12-10 03:57
  • Spark 2.0.0
  • Apache Kafka 0.10.1.0
  • Scala 2.11.8

When I use Spark Streaming and Kafka integration with Kafka broker version 0.10.1, I get the java.io.NotSerializableException for ConsumerRecord shown in the title.
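The rest of the question body is cut off here, but a typical Spark 2.0.0 / Kafka 0.10 direct-stream setup that can hit this exception looks roughly like the sketch below. The broker address, topic name, and group id are placeholder assumptions, not values from the question; the key point is that ConsumerRecord is not Serializable, so any operation that ships records off the executors (for example stream.print(), rdd.collect(), or windowing with checkpointing) raises the NotSerializableException.

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    object KafkaStreamSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("kafka-010-direct-stream")
        val ssc  = new StreamingContext(conf, Seconds(5))

        // Placeholder connection settings; the original question does not show them.
        val kafkaParams = Map[String, Object](
          "bootstrap.servers"  -> "localhost:9092",
          "key.deserializer"   -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id"           -> "spark-streaming-group",
          "auto.offset.reset"  -> "latest",
          "enable.auto.commit" -> (false: java.lang.Boolean)
        )

        // Creates a DStream[ConsumerRecord[String, String]], not an RDD.
        val stream = KafkaUtils.createDirectStream[String, String](
          ssc,
          PreferConsistent,
          Subscribe[String, String](Seq("my-topic"), kafkaParams)
        )

        // Calling e.g. stream.print() here would try to serialize ConsumerRecord
        // objects to move them to the driver and fail with
        // java.io.NotSerializableException.

        ssc.start()
        ssc.awaitTermination()
      }
    }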

3 Answers
  • 悲&欢浪女, answered 2020-12-10 04:12

    KafkaUtils.createDirectStream creates an org.apache.spark.streaming.dstream.DStream, not an RDD. Spark Streaming creates RDDs temporarily as it runs. To work with those RDDs, use stream.foreachRDD() to get each RDD, and then RDD.foreach to get each object in the RDD. Those objects are Kafka ConsumerRecords, on which you call the value() method to read the message from the Kafka topic:

    stream.foreachRDD { rdd =>
      rdd.foreach { record =>
        // record is an org.apache.kafka.clients.consumer.ConsumerRecord;
        // value() returns the message payload read from the Kafka topic.
        val value = record.value()
        // `map` refers to a lookup collection from the asker's code (not shown above).
        println(map.get(value))
      }
    }
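
    If the data later has to go through a shuffle, a window, checkpointing, or a collect(), a common follow-up (a minimal sketch, not part of the answer above) is to map each record to its value() first, so that only plain Strings flow through those operations and the non-serializable ConsumerRecord never leaves the executor. The lookup table `wordCounts` below is a hypothetical stand-in for the `map` referenced in the answer:

    // Hypothetical lookup table standing in for the `map` used above.
    val wordCounts: Map[String, Long] = Map("hello" -> 1L, "world" -> 2L)

    stream.foreachRDD { rdd =>
      // Extract the serializable payload before any operation that has to
      // ship data between JVMs; ConsumerRecord itself stays on the executor.
      val values = rdd.map(record => record.value())
      values.foreach { value =>
        println(wordCounts.get(value))
      }
    }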
    
