Question
I used to develop in Scala Spark using IntelliJ, where I was able to inspect variable contents in debug mode by setting a breakpoint.
I recently started a new project using PySpark with PyCharm, and I found that the code does not stop at breakpoints placed inside Spark operations such as `map`.
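The original code snippet is not shown, but the likely cause is that functions passed to RDD transformations such as `map` execute in Spark worker processes, not in the driver process the PyCharm debugger is attached to, so breakpoints inside them are never hit. A common workaround is to pull a small sample to the driver and call the function in plain Python, where breakpoints do fire. A minimal sketch (names are hypothetical, and a plain list stands in for `rdd.take(3)` since Spark is not runnable here):

```python
def parse_record(line):
    # This function would normally be passed to rdd.map(parse_record).
    # Inside Spark it runs on worker processes, so a breakpoint placed
    # here is not hit while the driver is being debugged.
    fields = line.split(",")
    return (fields[0], int(fields[1]))

# Driver-side workaround: fetch a small sample and apply the function
# locally, where the PyCharm debugger works normally.
# With a real RDD this would be:  sample = rdd.take(3)
sample = ["a,1", "b,2", "c,3"]  # stand-in for rdd.take(3)
parsed = [parse_record(line) for line in sample]
print(parsed)  # [('a', 1), ('b', 2), ('c', 3)]
```

Once the function behaves correctly on the sample, it can be handed back to `rdd.map(parse_record)` unchanged.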
Another question: the completion hint is wrong for the value returned by the `map` function. It seems the IDE does not know that the variable returned by `map` is still an RDD; my guess is that this is because the Python function does not declare a return type.
These may be naive questions for PySpark developers. Any help would be great, thank you!
Answer 1:
"...code does not stop at break point in Spark operations, like below..." - Could you please clarify what is your PyCharm version and OS?
"And another question is the prompt hint does not give right hint for instance from "map" function. Seems IDE does not know the variable from "map" function is still rdd..." - I believe it is related to this feature request https://youtrack.jetbrains.com/issue/PY-29811
Source: https://stackoverflow.com/questions/52452981/pyspark-how-to-inspect-variables-within-rdd-operations