Capture Console output of Spark Action Node in Oozie as variable across the Oozie Workflow

ⅰ亾dé卋堺 submitted on 2019-12-11 00:26:32

Question


Is there a way to capture the console output of a Spark job in Oozie? I want to use a specific printed value in the action node that follows the Spark job.

I was thinking that I could use ${wf:actionData("action-id")["Variable"]}, but it seems that Oozie cannot capture output from a Spark action node. In a Shell action, by contrast, you can just echo "var=12345" and then call wf:actionData to use the value as an Oozie variable across the workflow.

I want this because I would like to print the number of records processed, store it as an Oozie variable, and use it in the next action nodes in the workflow. I would prefer to do this without anything that requires storing the data outside the workflow, such as saving it in a table or setting it as a system variable from inside the Spark Scala program.

Any help would be greatly appreciated, since I'm still a novice Spark programmer. Thank you very much.


Answer 1:


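As the Spark action does not support capture-output, you'll have to write the data to a file on HDFS. A later action that does support capture-output, such as a Shell action, can then read that file. This post explains how to do that from Spark.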
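
To get the value back into the workflow, one option is a small Shell action that runs after the Spark action, reads the file, and echoes it as key=value. What follows is only a sketch under assumptions: the property name recordCount, the function names, and the idea that the Spark job wrote a single number to the given HDFS path are all hypothetical and not from the original post.

```shell
#!/usr/bin/env bash
# Sketch of a script for an Oozie Shell action declared with <capture-output/>.
# Assumption: the preceding Spark job wrote one number (the record count)
# to a file on HDFS. All names below are hypothetical.

# Print one key=value line, removing all whitespace from the value
# (the value is expected to be a single number).
emit_property() {
  local key="$1" value="$2"
  value="$(printf '%s' "$value" | tr -d '[:space:]')"
  printf '%s=%s\n' "$key" "$value"
}

# Read the file written by the Spark job and emit it for Oozie to capture.
publish_record_count() {
  local count_file="$1"
  local count
  count="$(hdfs dfs -cat "$count_file")" || return 1
  emit_property "recordCount" "$count"
}

# Run only when an HDFS path is supplied, e.g. through an <argument>
# element of the Shell action.
if [ "$#" -gt 0 ]; then
  publish_record_count "$1"
fi
```

If the Shell action were named, say, read-count (again a made-up name), later nodes could refer to the value as ${wf:actionData("read-count")["recordCount"]}, the same way as in the echo "var=12345" case from the question. Note that this still goes through a file on HDFS, so it does not avoid storing the value outside the workflow.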



Source: https://stackoverflow.com/questions/44171000/capture-console-output-of-spark-action-node-in-oozie-as-variable-across-the-oozi
