How to integrate Airflow with GitHub for running scripts

核能气质少年 submitted on 2020-02-03 08:19:25

Question


If we maintain our code/scripts in a GitHub repository, is there any way to copy these scripts from the repository and execute them on another cluster (which could be Hadoop or Spark)?

Does Airflow provide any operator that connects to GitHub to fetch such files?

Maintaining scripts in GitHub would provide more flexibility, as every change to the code would be reflected and used directly from there.

Any ideas on this scenario would really help.


Answer 1:


You can use GitPython as part of a PythonOperator task to run the pull on a specified schedule.

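import git

# Path to an existing local clone on the worker (placeholder; adjust to your setup).
git_dir = "/path/to/local/repo"

g = git.cmd.Git(git_dir)  # run git commands with git_dir as the working directory
g.pull()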

Don't forget to make sure that you have added the relevant keys so that the Airflow workers have permission to pull from the repository.
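
To show how the pull could sit inside a scheduled DAG, here is a minimal sketch rather than a definitive setup. It assumes Airflow 2.x import paths and the `schedule_interval` argument, and it swaps GitPython for the plain `git` command line via `subprocess` so that the helper has no extra dependency. The repository URL, local directory, DAG id, task ids, and the `job.py` script are all hypothetical names chosen for illustration.

```python
import subprocess
from datetime import datetime
from pathlib import Path

# Hypothetical values for illustration; replace with your own.
REPO_URL = "git@github.com:example-org/example-scripts.git"
LOCAL_DIR = "/tmp/example-scripts"


def sync_repo(repo_url, local_dir, runner=subprocess.run):
    """Clone the repository on first use, otherwise fast-forward it."""
    local = Path(local_dir)
    if (local / ".git").is_dir():
        cmd = ["git", "-C", str(local), "pull", "--ff-only"]
    else:
        cmd = ["git", "clone", repo_url, str(local)]
    runner(cmd, check=True)  # subprocess.run raises CalledProcessError if git fails
    return cmd


def build_dag():
    """Build the DAG; Airflow is imported here so sync_repo works without it."""
    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.python import PythonOperator

    with DAG(
        dag_id="run_scripts_from_github",
        start_date=datetime(2020, 1, 1),
        schedule_interval="@hourly",
        catchup=False,
    ) as dag:
        pull = PythonOperator(
            task_id="pull_scripts",
            python_callable=sync_repo,
            op_kwargs={"repo_url": REPO_URL, "local_dir": LOCAL_DIR},
        )
        # job.py is a hypothetical script inside the repository.
        run = BashOperator(
            task_id="run_script",
            bash_command=f"python {LOCAL_DIR}/job.py",
        )
        pull >> run  # run the script only after a successful pull
    return dag
```

In an actual DAG file you would assign `dag = build_dag()` at module level so the scheduler can discover it. If there is more than one worker, both tasks need to see the same checkout, for example by pinning them to one worker or by keeping the clone on shared storage.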



Source: https://stackoverflow.com/questions/53406177/how-to-integrate-airflow-with-github-for-running-scripts
