Writing large DataFrame from PySpark to Kafka runs into timeout
I'm trying to write a DataFrame with about 230 million records to Kafka, more specifically to a Kafka-enabled Azure Event Hub, but I'm not sure whether that's actually the source of my issue.

```python
EH_SASL = 'kafkashaded.org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://myeventhub.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=****";'

dfKafka \
    .write \
    .format("kafka") \
    .option("kafka.sasl
```
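The snippet is cut off at the SASL option. For context, a minimal sketch of what such a batch write against a Kafka-enabled Event Hub typically looks like is shown below, assuming the Event Hubs Kafka endpoint on port 9093 with SASL_SSL/PLAIN; the topic name and timeout value are placeholders for illustration, not necessarily the ones used in the actual job.

```python
# Minimal sketch of a batch write to a Kafka-enabled Event Hub.
# dfKafka must expose a string/binary "value" column (and optionally "key"),
# since that is what Spark's Kafka sink serializes.
(dfKafka
    .write
    .format("kafka")
    .option("kafka.bootstrap.servers", "myeventhub.servicebus.windows.net:9093")  # Event Hubs Kafka endpoint
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "PLAIN")
    .option("kafka.sasl.jaas.config", EH_SASL)
    .option("kafka.request.timeout.ms", "120000")  # producer request timeout; placeholder value
    .option("topic", "mytopic")                    # placeholder topic name
    .save())
```

Options prefixed with `kafka.` are passed straight through to the underlying Kafka producer, which is why producer-level settings such as `request.timeout.ms` can be tuned this way.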