What is the correct way to access the log4j logger of Spark using pyspark on an executor?
It's easy to do so in the driver, but I cannot seem to understand how to access it on the executors.
I have yet another approach to solving the logging issue in PySpark. The idea is as follows: use a remote log-management service (for example, Loggly) and configure Python's standard logging module on every node, so that both the driver and the executors ship their log records to that service over the network.
This is a good approach if you are already using cloud services, as many of them also offer log collection/management services.
I have a simple wordcount example on GitHub that demonstrates this approach: https://github.com/chhantyal/wordcount
The Spark app sends logs to Loggly using the standard logging module, from the driver (master node) as well as from the executors (worker nodes).
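The repo has the full details, but here is a minimal sketch of the same pattern. It assumes a hypothetical HTTP log-ingestion endpoint (`LOG_HOST` and `LOG_PATH` are placeholders you would replace with your service's values, e.g. your Loggly token URL) and uses only the standard library's `logging.handlers.HTTPHandler`, so no extra dependencies are needed on the workers:

```python
import logging
import logging.handlers

from pyspark.sql import SparkSession

# Hypothetical ingestion endpoint -- replace with your log service's host/path.
LOG_HOST = "logs.example.com:443"
LOG_PATH = "/inputs/YOUR-TOKEN/"


def get_logger(name="wordcount"):
    """Return a logger that ships records to a remote HTTP endpoint.

    Safe to call on both the driver and the executors, since it relies
    only on Python's standard logging module (no Spark/JVM objects,
    which cannot be pickled to the workers).
    """
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid adding duplicate handlers on reuse
        handler = logging.handlers.HTTPHandler(
            LOG_HOST, LOG_PATH, method="POST", secure=True)
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


def count_words(lines):
    # Runs on the executors; each worker configures its own logger.
    logger = get_logger()
    for line in lines:
        logger.info("processing a line on an executor")
        for word in line.split():
            yield (word, 1)


if __name__ == "__main__":
    spark = SparkSession.builder.appName("wordcount").getOrCreate()
    get_logger().info("job started on the driver")  # driver-side logging
    counts = (spark.sparkContext.textFile("input.txt")
              .mapPartitions(count_words)
              .reduceByKey(lambda a, b: a + b))
    print(counts.take(10))
    spark.stop()
```

The key point is that the logger is created inside the function that runs on the executors, rather than captured from the driver: driver-side logger objects cannot be serialized to the workers, but a plain configuration function like `get_logger` can.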