How to fix “java.io.NotSerializableException: org.apache.kafka.clients.consumer.ConsumerRecord” in Spark Streaming Kafka Consumer?

Asked by 挽巷 on 2020-12-10 03:57
  • Spark 2.0.0
  • Apache Kafka 0.10.1.0
  • Scala 2.11.8

When I use Spark Streaming and Kafka integration with Kafka broker version 0.10.1.0, the job fails with java.io.NotSerializableException: org.apache.kafka.clients.consumer.ConsumerRecord.

3 Answers
  •  春和景丽
    2020-12-10 04:25

    The ConsumerRecord objects come from the DStream. Trying to print them directly raises this exception because ConsumerRecord is not serializable. Instead, extract the values from each ConsumerRecord and print those.

    Instead of stream.print(), do:

    stream.map(record => record.value().toString).print()
    

    This should solve your problem.
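    The underlying issue can be reproduced outside Spark: Java serialization rejects any object whose class does not implement java.io.Serializable, which is exactly what happens when Spark tries to serialize a ConsumerRecord. A minimal sketch, using a hypothetical RecordLike class as a stand-in for ConsumerRecord:

    ```scala
    import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

    // Hypothetical stand-in for Kafka's ConsumerRecord: it carries a String value
    // but does not implement java.io.Serializable.
    class RecordLike(val value: String)

    // Returns true if obj survives Java serialization, false if the attempt
    // throws NotSerializableException.
    def canJavaSerialize(obj: AnyRef): Boolean =
      try {
        new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
        true
      } catch {
        case _: NotSerializableException => false
      }

    val record = new RecordLike("payload")
    println(canJavaSerialize(record))       // false: the wrapper cannot be serialized
    println(canJavaSerialize(record.value)) // true: the extracted String can
    ```

    This is why mapping to record.value() before print() works: String is serializable, so the mapped DStream contains only objects Spark can safely serialize.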

    GOTCHA

    For anyone else seeing this exception: any call to checkpoint on a DStream triggers a persist with storageLevel = MEMORY_ONLY_SER, which serializes the stream's elements. So don't call checkpoint on the raw stream of ConsumerRecords; call map first to extract serializable values, and checkpoint the mapped stream.
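    Putting the two answers together, a minimal driver sketch of the correct ordering. The broker address, topic name, checkpoint path, and group id below are placeholder assumptions, not from the question:

    ```scala
    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

    object KafkaValuesOnly {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("kafka-values-only").setMaster("local[2]")
        val ssc  = new StreamingContext(conf, Seconds(5))
        ssc.checkpoint("/tmp/spark-checkpoint") // placeholder path

        val kafkaParams = Map[String, Object](
          "bootstrap.servers"  -> "localhost:9092", // placeholder broker
          "key.deserializer"   -> classOf[StringDeserializer],
          "value.deserializer" -> classOf[StringDeserializer],
          "group.id"           -> "example-group",  // placeholder group id
          "enable.auto.commit" -> (false: java.lang.Boolean)
        )

        val stream = KafkaUtils.createDirectStream[String, String](
          ssc, PreferConsistent,
          Subscribe[String, String](Array("test-topic"), kafkaParams)) // placeholder topic

        // Map to plain String values BEFORE checkpointing: checkpoint persists
        // with MEMORY_ONLY_SER, and Strings are serializable while
        // ConsumerRecord is not.
        val values = stream.map(record => record.value())
        values.checkpoint(Seconds(10))
        values.print()

        ssc.start()
        ssc.awaitTermination()
      }
    }
    ```

    Checkpointing the raw stream instead (stream.checkpoint(...) before the map) would reproduce the NotSerializableException.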
