Question
When debugging a Spark program, I can pause execution at a breakpoint and inspect the stack frame to see all of a DataFrame's metadata: partition metadata such as input splits, logical plan metadata, underlying RDD metadata, and so on. But I cannot see the contents of the DataFrame itself. The data lives in executor JVMs, somewhere on another node or even on the same node (on a local training cluster), rather than in the driver being debugged. So my question: does anyone have a troubleshooting technique for inspecting the contents of a DataFrame's partitions the way the driver program itself can be inspected in a debugger?
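For concreteness, here is a minimal sketch of the situation (the app name, the `local[*]` master, and the toy DataFrame are illustrative assumptions, not my real job): at the breakpoint the frame shows the plan and RDD lineage, and the only way I know to see actual rows is to evaluate an action such as `df.show()` or `df.rdd.glom().collect()` in the debugger's expression evaluator, which ships partition contents back to the driver.

```scala
import org.apache.spark.sql.SparkSession

object DebugPartitions {
  def main(args: Array[String]): Unit = {
    // Assumption: a local training cluster, so everything runs in-process.
    val spark = SparkSession.builder()
      .appName("debug-partitions")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Toy DataFrame standing in for the real one.
    val df = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "value")

    // Set a breakpoint on the next line. The frame shows df's metadata
    // (logical plan, underlying RDD), but not its rows. Evaluating an
    // action pulls the data back to the driver JVM, making it visible:
    df.show() // prints a sample of rows to stdout

    // One Array[Row] per partition, collected to the driver.
    val perPartition = df.rdd.glom().collect()
    perPartition.zipWithIndex.foreach { case (rows, i) =>
      println(s"partition $i: ${rows.mkString(", ")}")
    }

    spark.stop()
  }
}
```

Collecting whole partitions to the driver obviously only works for small data, which is why I am asking whether there is a better approach for real-sized DataFrames.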
Source: https://stackoverflow.com/questions/45243388/viewing-internal-spark-dataframe-contents