Scheduling spark jobs on a timely basis
Which is the recommended tool for scheduling Spark Jobs on a daily/weekly basis. 1) Oozie 2) Luigi 3) Azkaban 4) Chronos 5) Airflow Thanks in advance. Joe Harris Updating my previous answer from here: Suggestion for scheduling tool(s) for building hadoop based data pipelines Airflow: Try this first. Decent UI, Python-ish job definition, semi-accessible for non-programmers, dependency declaration syntax is weird. Airflow has built in support for the fact that jobs scheduled jobs often need to be rerun and/or backfilled. Make sure you build your pipelines to support this. Azkaban: Nice UI,