Oozie

Oozie not sending SLA email alerts

♀尐吖头ヾ Submitted on 2019-11-27 08:47:51

Question: I used this link from the Oozie documentation to set up SLAs for my Oozie workflow. I then scheduled a job that ran longer than the defined SLAs. However, I am not getting any email alerts from Oozie for the SLA miss. Any idea how I should debug this? Thank you!

Source: https://stackoverflow.com/questions/57281650/oozie-not-sending-sla-email-alerts
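For context, the SLA setup the documentation describes attaches an `<sla:info>` block to a workflow action. A minimal sketch (element names from the `uri:oozie:sla:0.2` schema; the durations and contact address are placeholders):

```xml
<workflow-app name="sla-demo-wf" xmlns="uri:oozie:workflow:0.5"
              xmlns:sla="uri:oozie:sla:0.2">
  <start to="my-action"/>
  <action name="my-action">
    <!-- action body omitted -->
    <ok to="end"/>
    <error to="fail"/>
    <sla:info>
      <sla:nominal-time>${nominal_time}</sla:nominal-time>
      <sla:should-start>${10 * MINUTES}</sla:should-start>
      <sla:should-end>${30 * MINUTES}</sla:should-end>
      <sla:max-duration>${30 * MINUTES}</sla:max-duration>
      <sla:alert-events>start_miss,end_miss,duration_miss</sla:alert-events>
      <sla:alert-contact>admin@example.com</sla:alert-contact>
    </sla:info>
  </action>
  <kill name="fail"><message>failed</message></kill>
  <end name="end"/>
</workflow-app>
```

A common reason alerts never arrive is that the server-side SLA and email services are not enabled: `oozie.services.ext` in oozie-site.xml should include `org.apache.oozie.service.EventHandlerService` and `org.apache.oozie.sla.service.SLAService`, and the `oozie.email.smtp.*` properties must point at a working SMTP server. Checking those settings and grepping oozie.log for SLA events is a reasonable first debugging step — a suggestion, not the confirmed fix for the question above.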

Why do my application-level logs disappear when executed in Oozie?

余生颓废 Submitted on 2019-11-27 08:31:18

Question: I'm using Oozie in a CDH5 environment, along with the Oozie web console. I can't see any of the logs from my application. I can see Hadoop logs, Spark logs, etc., but no application-specific logs. In my application I've included src/main/resources/log4j.properties:

```
# Root logger option
log4j.rootLogger=INFO, stdout
# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache
```
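The snippet above is cut off by the scrape. For comparison, a complete console-appender configuration of the same shape looks like this (standard log4j 1.x classes; the conversion pattern is illustrative):

```properties
# Root logger option
log4j.rootLogger=INFO, stdout

# Direct log messages to stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Target=System.out
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
```

Note that when an action runs under Oozie, anything the application writes to stdout ends up in the YARN container logs of the launched job, not in the Oozie web console itself; retrieving them with `yarn logs -applicationId <appId>` (or through the ResourceManager UI) is usually where such "missing" logs turn up.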

Copying files from a hdfs directory to another with oozie distcp-action

ぃ、小莉子 Submitted on 2019-11-27 07:31:49

Question: My action start_fair_usage ends with status OK, but test_copy returns: Main class [org.apache.oozie.action.hadoop.DistcpMain], main() threw exception, null. In /user/comverse/data/${1}_B I have a lot of different files, some of which I want to copy to ${NAME_NODE}/user/evkuzmin/output. For that I try to pass paths from copy_files.sh, which holds an array of paths to the files I need.

```
<action name="start_fair_usage">
  <shell xmlns="uri:oozie:shell-action:0.1">
    <job-tracker>${JOB_TRACKER}</job
```
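For reference, a distcp action normally passes the source path(s) followed by the destination as `<arg>` elements; a minimal sketch (schema `uri:oozie:distcp-action:0.2`; the file name is hypothetical):

```xml
<action name="test_copy">
  <distcp xmlns="uri:oozie:distcp-action:0.2">
    <job-tracker>${JOB_TRACKER}</job-tracker>
    <name-node>${NAME_NODE}</name-node>
    <!-- sources first, destination last -->
    <arg>${NAME_NODE}/user/comverse/data/some_dir_B/some_file.csv</arg>
    <arg>${NAME_NODE}/user/evkuzmin/output</arg>
  </distcp>
  <ok to="end"/>
  <error to="fail"/>
</action>
```

A null exception from DistcpMain often points at missing or unresolved arguments; in particular, shell-style positional parameters such as `${1}` are not resolved by Oozie's EL in workflow.xml, so a path built that way may reach distcp empty. This is a plausible cause to check, not a confirmed diagnosis of the question above.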

Ambari automated uninstall shell script

穿精又带淫゛_ Submitted on 2019-11-27 04:09:33

```
#!/bin/bash
# Program:
#   Uninstall Ambari automatically
# History:
#   2014/01/13 - Ivan - 2862099249@qq.com - First release
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:~/bin
export PATH

# Get all hostnames in the cluster. Note: in /etc/hosts the IP and the
# hostname must be separated by exactly one space.
hostList=$(cat /etc/hosts | tail -n +3 | cut -d ' ' -f 2)
yumReposDir=/etc/yum.repos.d/
alterNativesDir=/etc/alternatives/
pingCount=5
logPre=TDP

read -p "Please input your master hostname: " master
master=${master:-"master"}
ssh $master "ambari-server stop"
# Reset the Ambari database
ssh $master "ambari-server reset"
for host in
```

Common big data commands

限于喜欢 Submitted on 2019-11-27 00:29:09

Linux:
1. Clear the memory cache
   1. Check memory usage before cleaning: `free -m`
   2. Run `sync` first to flush dirty pages to disk and avoid losing data. Note that under the Linux kernel's normal behavior there is usually no need to drop caches manually; cached content speeds up file reads and writes.
   3. Drop the caches: `echo 1 > /proc/sys/vm/drop_caches`
   4. Check memory usage afterwards: `free -m`

Hadoop:
1. JobHistory service
   Start the history server: `mr-jobhistory-daemon.sh start historyserver`
   Stop the history server: `mr-jobhistory-daemon.sh stop historyserver`
   Access the Hadoop JobHistory web UI in a browser: http://node-1:19888

Hive:
```
nohup /export/servers/hive/bin/hive --service metastore &
nohup /export/servers/hive/bin/hive --service hiveserver2 &
/export/servers/hive/bin/beeline
! connect jdbc:hive2://node-1:10000
```

Oozie:

Error on running multiple workflows in Oozie 4.1.0

試著忘記壹切 Submitted on 2019-11-26 23:23:24

Question: I installed Oozie 4.1.0 on a Linux machine by following the steps at http://gauravkohli.com/2014/08/26/apache-oozie-installation-on-hadoop-2-4-1/
Hadoop version: 2.6.0; Maven: 3.0.4; Pig: 0.12.0.
Cluster setup: MASTER NODE running Namenode, ResourceManager, proxyserver; SLAVE NODE running Datanode, NodeManager.
When I run a single workflow job, it succeeds. But when I try to run more than one workflow job, both jobs get stuck in the ACCEPTED state. Inspecting the error log, I drill down the
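On small or pseudo-distributed clusters, a frequently reported cause of this symptom is that the first workflow's Oozie launcher job plus its child job consume all of the memory or ApplicationMaster share that YARN allows, so the second workflow's AM can never start. One knob worth inspecting (a suggestion, not a confirmed fix) is the CapacityScheduler's AM resource cap in capacity-scheduler.xml, which defaults to 0.1:

```xml
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.5</value>
  <description>
    Maximum fraction of cluster resources that can be used to run
    ApplicationMasters. Raising it lets more concurrent AMs start.
  </description>
</property>
```

Lowering the launcher's memory request, or raising `yarn.nodemanager.resource.memory-mb`, are related levers for the same bottleneck.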

Job queue for Hive action in oozie

拜拜、爱过 Submitted on 2019-11-26 22:11:20

Question: I have an Oozie workflow. I am submitting all the Hive actions with:

```
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
```

But for a few Hive actions, the launched job is not in the specified queue; it runs in the default queue. Please suggest the cause of this behavior and a solution.

Answer 1: A. Oozie specifics. Oozie propagates the "regular" Hadoop properties to a "regular" MapReduce action. But for other types of action (Shell, Hive, Java, etc.), where Oozie runs a single Mapper task in
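The truncated answer is pointing at the Oozie launcher job, which is configured separately from the child job via `oozie.launcher.*`-prefixed properties. A hedged sketch of setting both queues in a Hive action (`${queueName}` as in the question; the rest of the action is omitted):

```xml
<hive xmlns="uri:oozie:hive-action:0.2">
  <!-- job-tracker / name-node omitted -->
  <configuration>
    <!-- queue for the Oozie launcher (the single-mapper wrapper job) -->
    <property>
      <name>oozie.launcher.mapred.job.queue.name</name>
      <value>${queueName}</value>
    </property>
    <!-- queue for the actual Hive-spawned MapReduce jobs -->
    <property>
      <name>mapred.job.queue.name</name>
      <value>${queueName}</value>
    </property>
  </configuration>
  <!-- script and params omitted -->
</hive>
```

If only the plain property is set, the launcher (and anything inheriting its configuration) can still land in the default queue, which matches the behavior described in the question.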

Big data tutorial (13.3): Azkaban overview & installation

只谈情不闲聊 Submitted on 2019-11-26 14:57:25

The previous section covered chaining multiple Flume agents together. In this section I will introduce the basic concepts of Azkaban and walk through a simple installation. Because recent Azkaban releases take too long to set up (they must be compiled from source), this tutorial uses the pre-built older version 2.5; once the whole tutorial series is finished, the latest versions of each tool will be covered in supplementary chapters.

1. The workflow scheduler Azkaban: overview

1.1 Why a workflow scheduling system is needed
A complete data analysis system is usually composed of a large number of task units: shell scripts, Java programs, MapReduce programs, Hive scripts, and so on. These task units have timing and dependency relationships between them, and a workflow scheduling system is needed to organize and execute such a complex plan.
For example, suppose a business system produces 20 GB of raw data every day, and we must process it daily with the following steps:
1. Sync the raw data to HDFS with Hadoop;
2. Transform the raw data with the MapReduce framework, storing the results as partitioned Hive tables;
3. JOIN the data from several Hive tables to produce one wide, detailed Hive table;
4. Run complex statistical analysis on the detail data to produce report results;
5. Sync the analysis results back to the business system for it to consume.

1.2 Ways to implement workflow scheduling
Simple task scheduling: define jobs directly with Linux crontab;
Complex task scheduling
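The "simple scheduling with crontab" option mentioned above can be illustrated with a single crontab entry (the script path and log path are hypothetical):

```
# Run a daily ingest script at 01:30, appending output to a log file.
# Fields: minute hour day-of-month month day-of-week command
30 1 * * * /opt/scripts/sync_to_hdfs.sh >> /var/log/sync_to_hdfs.log 2>&1
```

crontab handles time-based triggering well, but it cannot express the inter-task dependencies in the five-step pipeline above; that gap is exactly what schedulers such as Azkaban (or Oozie) fill.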

OOZIE: properties defined in file referenced in global job-xml not visible in workflow.xml

烈酒焚心 Submitted on 2019-11-26 14:51:38

Question: I'm new to Hadoop, and I'm now testing a simple workflow with just a single Sqoop action. It works if I use plain values rather than global properties. My objective, however, was to define some global properties in a file referenced by the job-xml tag in the global section. After a long fight and reading many articles, I still cannot make it work. I suspect something simple is wrong, since I found articles suggesting that this feature works fine. Hopefully you can give me a hint. In short: I have properties,
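A sketch of the setup being described (the workflow name and property are hypothetical; the file must be a standard Hadoop configuration file reachable from the workflow's HDFS directory):

```xml
<!-- workflow.xml (sketch) -->
<workflow-app name="sqoop-wf" xmlns="uri:oozie:workflow:0.4">
  <global>
    <job-xml>global-props.xml</job-xml>
  </global>
  <!-- actions omitted -->
</workflow-app>

<!-- global-props.xml (standard Hadoop configuration format) -->
<configuration>
  <property>
    <name>my.target.dir</name>
    <value>/user/demo/target</value>
  </property>
</configuration>
```

One frequent source of confusion, and possibly what is happening here: properties loaded through `job-xml` are merged into the Hadoop Configuration handed to each action, but they do not become EL variables, so referencing them as `${my.target.dir}` inside workflow.xml fails. EL parameterization comes from job.properties (or a `<parameters>` section) instead. This is a hedged guess at the cause, not a confirmed answer to the truncated question.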

Oozie: Launch Map-Reduce from Oozie <java> action?

一世执手 Submitted on 2019-11-26 07:46:49

Question: I am trying to execute a MapReduce task in an Oozie workflow using a <java> action. O'Reilly's Apache Oozie (Islam and Srinivasan 2015) notes that: "While it's not recommended, Java action can be used to run Hadoop MapReduce jobs because MapReduce jobs are nothing but Java programs after all. The main class invoked can be a Hadoop MapReduce driver and can call Hadoop APIs to run a MapReduce job. In that mode, Hadoop spawns more mappers and reducers as required and runs them on the cluster."
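The book's point can be made concrete with a sketch of such an action (the main class and paths are hypothetical; the class would be a normal MapReduce driver that calls `job.waitForCompletion(true)` so the launcher blocks until the child job finishes):

```xml
<action name="mr-via-java">
  <java>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <main-class>com.example.MyMapReduceDriver</main-class>
    <arg>${nameNode}/user/demo/input</arg>
    <arg>${nameNode}/user/demo/output</arg>
  </java>
  <ok to="end"/>
  <error to="fail"/>
</action>
```

The trade-off the book hints at: with a `<java>` action, Oozie only sees the launcher JVM, so callbacks, counters, and recovery for the child MapReduce job are not managed the way they are with a native `<map-reduce>` action.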