PySpark 2.4.0, read Avro from Kafka with read stream - Python
Question

I am trying to read Avro messages from Kafka using PySpark 2.4.0.

The external spark-avro module provides this solution for reading Avro files:

    df = spark.read.format("avro").load("examples/src/main/resources/users.avro")
    df.select("name", "favorite_color").write.format("avro").save("namesAndFavColors.avro")

However, I need to read streamed Avro messages. The library documentation suggests using the from_avro() function, which is only available for Scala and Java.

Are there any other
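For illustration, one possible workaround for the streaming case in PySpark 2.4.0 is to decode the Kafka `value` bytes in a Python UDF. The sketch below is not a confirmed solution from the question. It assumes each message is a single raw Avro-encoded record with no container-file header and no schema-registry prefix, and that the writer schema matches the `users.avro` example (`name`: string, `favorite_number`: union of int and null, `favorite_color`: union of string and null). The names `decode_user` and `avro_stream` are hypothetical. The decoder is hand-written to keep the example self-contained; a library decoder such as the `avro` or `fastavro` package could take its place.

```python
from io import BytesIO


def read_long(buf):
    """Decode one Avro int/long: a zigzag-encoded variable-length integer."""
    shift, acc = 0, 0
    while True:
        byte = buf.read(1)
        if not byte:
            raise EOFError("truncated Avro payload")
        acc |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear marks the last byte
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)


def read_string(buf):
    """Decode one Avro string: a long byte count followed by UTF-8 bytes."""
    length = read_long(buf)
    return buf.read(length).decode("utf-8")


def decode_user(payload):
    """Decode one raw User record into a (name, favorite_color) tuple."""
    if payload is None:  # Kafka messages may carry a null value
        return None
    buf = BytesIO(payload)
    name = read_string(buf)
    # favorite_number is the assumed union ["int", "null"]; index 0 means int.
    if read_long(buf) == 0:
        read_long(buf)  # read only to advance the buffer, then discarded
    # favorite_color is the assumed union ["string", "null"]; index 0 means string.
    color = read_string(buf) if read_long(buf) == 0 else None
    return name, color


def avro_stream(spark, servers, topic):
    """Return a streaming DataFrame with name and favorite_color columns."""
    # Imported lazily so the decoder above can be used without Spark installed.
    from pyspark.sql.functions import col, udf
    from pyspark.sql.types import StringType, StructField, StructType

    result_type = StructType([
        StructField("name", StringType()),
        StructField("favorite_color", StringType()),
    ])
    decode = udf(decode_user, result_type)
    raw = (spark.readStream.format("kafka")
           .option("kafka.bootstrap.servers", servers)
           .option("subscribe", topic)
           .load())
    # The Kafka source exposes the message body as a binary `value` column.
    return raw.select(decode(col("value")).alias("user")).select("user.*")
```

A call such as `avro_stream(spark, "localhost:9092", "users")` (placeholder broker address and topic) would return a streaming DataFrame that can be passed to `writeStream`. The Kafka source is provided by the external spark-sql-kafka-0-10 module, which has to be on the classpath when the job is submitted. A Python UDF decodes rows one at a time, so throughput will likely be lower than with the built-in Scala `from_avro()`.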