Why do my application-level logs disappear when executed in Oozie?

谎友^ 2020-12-12 04:57

I'm using Oozie in a CDH5 environment. I'm also using the Oozie web console. I'm not able to see any of the logs from my application. I can see Hadoop logs, Spark logs,

1 Answer
  • 2020-12-12 05:52

    Oozie runs each Action in a different "launcher" job -- actually a YARN job with a single mapper (see exceptions below).

    Whenever you see an "external ID" of the form job_000000000_0000, you can reach the YARN logs for application_000000000_0000, i.e. the same numbers with a different prefix (yeah, "job" is the legacy naming convention from Hadoop 1, still used by the JobHistory service, but YARN has another naming convention).

    Your application output is actually dumped into the YARN logs for that Oozie "launcher":

    • your StdErr is dumped as-is and can be retrieved in the "stderr" section
    • your StdOut is dumped with a prefix on each line (that prefix is used by Oozie to manage its <capture_output/> trick for Shell and Pig actions) at the end of the atrociously verbose "stdout" section
    • and nothing gets into the "syslog" section AFAIK

    Bottom line:

    1. run oozie job -info ****** to get the list of Actions and the corresponding "external IDs" for your Oozie workflow execution
    2. for each job_*****_** legacy ID, run yarn logs -applicationId application_*****_** | more to skim the global YARN logs, then zoom in on your specific app logs
    3. now you can try to automate that thing... have fun B-)
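
If you do want to automate it, here is a rough Python sketch of the steps above, not a tested tool. It assumes the oozie and yarn commands are on your PATH, that OOZIE_URL is set (otherwise you would need to add -oozie <url> to the command), and that the external IDs can be picked out of the oozie job -info output with a simple regex. The function names are made up for illustration.

```python
import re
import subprocess

# Legacy MapReduce-style IDs look like job_<cluster timestamp>_<sequence>.
JOB_ID_PATTERN = re.compile(r"\bjob_(\d+)_(\d+)\b")


def yarn_application_ids(oozie_info_text):
    """Map every job_* ID found in the text to its application_* ID.

    Order of first appearance is kept and duplicates are dropped.
    """
    app_ids = []
    for timestamp, sequence in JOB_ID_PATTERN.findall(oozie_info_text):
        app_id = f"application_{timestamp}_{sequence}"
        if app_id not in app_ids:
            app_ids.append(app_id)
    return app_ids


def fetch_launcher_logs(workflow_id):
    """Run the two commands from the steps above.

    Returns a dict {application_id: raw output of 'yarn logs'}.
    """
    # Step 1: list the Actions and their external IDs.
    info = subprocess.run(
        ["oozie", "job", "-info", workflow_id],
        capture_output=True, text=True, check=True,
    ).stdout

    # Step 2: pull the aggregated YARN logs for each launcher.
    logs = {}
    for app_id in yarn_application_ids(info):
        result = subprocess.run(
            ["yarn", "logs", "-applicationId", app_id],
            capture_output=True, text=True,
        )
        logs[app_id] = result.stdout
    return logs
```

Call fetch_launcher_logs with your workflow ID, then search each returned text for the "stderr" and "stdout" sections to find your own application's output.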


    Exceptions to the "launcher" Oozie job principle -- the E-mail Action and the Filesystem Action are just API calls executed directly from the Oozie server process; and the MapReduce Action spawns a regular YARN job with multiple Mappers and Reducers.
