How to obtain and process MySQL records using Airflow?

Submitted by 北城以北 on 2019-12-13 11:55:01

Question


I need to:

1. Run a SELECT query on a MySQL database and fetch the records.
2. Process the records with a Python script.

I am unsure how to proceed. Is XCom the way to go here? Also, MySqlOperator only executes the query; it doesn't fetch the records. Is there a built-in transfer operator I can use? How can I use a MySQL hook here?

You may want to use a PythonOperator that uses the hook to get the data, apply a transformation and ship the (now scored) rows back to some other place.

Can someone explain how to proceed with this approach?

Reference: http://markmail.org/message/x6nfeo6zhjfeakfe

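# Import paths below are for Airflow 1.x.
from airflow.hooks.mysql_hook import MySqlHook
from airflow.operators.python_operator import PythonOperator

def do_work():
    # 'mysql_default' stands in for the undefined connection_id;
    # the schema is set on the hook, since get_records() does not take one.
    mysqlserver = MySqlHook(mysql_conn_id='mysql_default', schema='testdb')
    sql = "SELECT * from table where col > 100"
    records = mysqlserver.get_records(sql)
    print(records[0][0])  # first column of the first row

callMYSQLHook = PythonOperator(
    task_id='fetch_from_testdb',
    python_callable=do_work,  # was mysqlHook, which is not defined
    dag=dag
)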

Is this the correct way to proceed? Also, how do we use XComs to store the records for the following MySqlOperator?

t = MySqlOperator(
    mysql_conn_id='mysql_default',  # the operator's argument is mysql_conn_id, not conn_id
    task_id='basic_mysql',
    sql="SELECT count(*) from table1 where id > 10",
    dag=dag)

Answer 1:


Sure, just create a hook (directly or inside an operator) and call its get_records() method: https://airflow.readthedocs.io/en/stable/_modules/airflow/hooks/dbapi_hook.html
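To make that concrete, here is a minimal sketch of the hook plus XCom approach, assuming Airflow 1.10-style import paths (in Airflow 2 the hook lives in airflow.providers.mysql.hooks.mysql and the operator in airflow.operators.python, and provide_context is no longer needed). The names process_rows, use_records, build_dag, the DAG id, the column layout and the "score" transformation are all hypothetical and only there for illustration. The Airflow imports are done inside the functions so the processing step can be exercised without Airflow installed.

```python
def process_rows(rows):
    """Hypothetical processing step: turn (id, value) tuples into small,
    JSON-friendly dicts with a doubled 'score'."""
    return [{"id": row[0], "score": row[1] * 2} for row in rows]


def fetch_and_process(**context):
    # Imported lazily so process_rows stays testable without Airflow.
    # Airflow 1.10 path (assumption about the installed version).
    from airflow.hooks.mysql_hook import MySqlHook

    # Connection id and schema are taken from the question.
    hook = MySqlHook(mysql_conn_id="mysql_default", schema="testdb")
    # Assumed layout: first column is an id, second a numeric value.
    rows = hook.get_records("SELECT id, col FROM table1 WHERE col > 100")
    # Whatever a PythonOperator callable returns is pushed to XCom
    # under the key 'return_value'.
    return process_rows(rows)


def use_records(**context):
    # Pull the upstream task's return value from XCom.
    records = context["ti"].xcom_pull(task_ids="fetch_from_testdb")
    print("received %d records" % len(records))
    return len(records)


def build_dag():
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator

    dag = DAG("mysql_fetch_example",  # hypothetical DAG id
              start_date=datetime(2019, 1, 1),
              schedule_interval=None)
    fetch = PythonOperator(task_id="fetch_from_testdb",
                           python_callable=fetch_and_process,
                           provide_context=True,
                           dag=dag)
    consume = PythonOperator(task_id="use_records",
                             python_callable=use_records,
                             provide_context=True,
                             dag=dag)
    fetch >> consume
    return dag

# In a real DAG file, call this at module level so the scheduler finds it:
# dag = build_dag()
```

XCom values are stored in the Airflow metadata database, so this only suits small result sets; for larger ones it is probably better to process the rows inside the same task or write them to external storage. Because sql is a templated field on MySqlOperator, a downstream MySqlOperator can also read the pushed value with a Jinja expression such as {{ ti.xcom_pull(task_ids='fetch_from_testdb') }}.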



Source: https://stackoverflow.com/questions/46359497/how-to-obtain-and-process-mysql-records-using-airflow
