I read that Spark Structured Streaming doesn't support schema inference for reading Kafka messages as JSON. Is there a way to retrieve the schema the same way Spark Streaming does:
It is not possible. Structured Streaming supports limited schema inference in development, with spark.sql.streaming.schemaInference set to true:
By default, Structured Streaming from file based sources requires you to specify the schema, rather than rely on Spark to infer it automatically. This restriction ensures a consistent schema will be used for the streaming query, even in the case of failures. For ad-hoc use cases, you can reenable schema inference by setting spark.sql.streaming.schemaInference to true.
but it cannot be used to extract JSON from Kafka messages, and DataFrameReader.json doesn't support streaming Datasets as arguments.
You have to provide the schema manually, as described in "How to read records in JSON format from Kafka using Structured Streaming?"
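
For illustration, here is a minimal PySpark sketch of the manual-schema approach. It is not taken from the linked answer, and the field names (id, name, ts), the broker address localhost:9092 and the topic name events are all made-up placeholders; replace them with your own. It assumes the Kafka source package (spark-sql-kafka-0-10) is on the classpath and relies on from_json accepting a DDL-formatted schema string.

```python
# Hypothetical message layout: {"id": 1, "name": "a", "ts": "2020-01-01T00:00:00"}
# The schema is written by hand because it cannot be inferred from the stream.
MESSAGE_SCHEMA_DDL = "id INT, name STRING, ts TIMESTAMP"


def parse_kafka_json(kafka_df, schema=MESSAGE_SCHEMA_DDL):
    """Decode the Kafka `value` bytes as JSON using an explicit schema."""
    # Imported here so the module can be loaded without a Spark installation.
    from pyspark.sql.functions import col, from_json

    return (
        kafka_df
        # Kafka delivers `value` as binary, so cast it to a string first.
        .select(from_json(col("value").cast("string"), schema).alias("data"))
        # Flatten the parsed struct into top-level columns.
        .select("data.*")
    )


def read_stream(spark, servers="localhost:9092", topic="events"):
    """Open the (placeholder) Kafka topic as a streaming DataFrame and parse it."""
    kafka_df = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", servers)
        .option("subscribe", topic)
        .load()
    )
    return parse_kafka_json(kafka_df)
```

With an existing SparkSession named spark, read_stream(spark) should give you a streaming DataFrame with one column per schema field. Values that don't parse against the schema come back as null rather than failing the query, so it is worth checking for unexpected nulls.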