Question
Currently I have a script that first deletes the table and then uploads it from MySQL to BigQuery. It has failed many times, and it runs only once a day. I am looking for a scalable, real-time solution. Your help will be much appreciated :)
Answer 1:
Read this series of posts from WePay, where they detail how they sync their MySQL databases to BigQuery using Airflow:
- https://wecode.wepay.com/posts/wepays-data-warehouse-bigquery-airflow
- https://wecode.wepay.com/posts/airflow-wepay
- (3rd one is about BigQuery)
As a summary (quoting):
- Set up authentication, connections, DAG.
- Define which columns to pull from MySQL and load into BigQuery.
- Choose how to load the data: incrementally, or fully.
- De-duplicating.
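
To make the incremental-load and de-duplication steps more concrete, here is a minimal sketch in plain Python. It is not WePay's code: the table name `payments`, the columns, and the helpers `build_incremental_query` and `dedupe_latest` are hypothetical, and it assumes each MySQL row has a primary key plus a modification timestamp. In a real pipeline the de-duplication would more likely run as a query inside BigQuery; the in-memory version below only illustrates the logic.

```python
from datetime import datetime


def build_incremental_query(table, columns, watermark_column):
    """Build a parameterized MySQL query that selects only the rows
    modified inside a time window (hypothetical helper)."""
    column_list = ", ".join(f"`{c}`" for c in columns)
    return (
        f"SELECT {column_list} FROM `{table}` "
        f"WHERE `{watermark_column}` >= %s AND `{watermark_column}` < %s"
    )


def dedupe_latest(rows, key, version):
    """Keep only the most recent version of each row, identified by `key`
    (hypothetical helper)."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[version] >= current[version]:
            latest[row[key]] = row
    return list(latest.values())


# Two incremental batches: row 1 was updated between the two pulls,
# so it appears twice once the batches are appended together.
batch_1 = [
    {"id": 1, "status": "pending", "modify_time": datetime(2016, 11, 16, 10, 0)},
    {"id": 2, "status": "paid", "modify_time": datetime(2016, 11, 16, 10, 5)},
]
batch_2 = [
    {"id": 1, "status": "paid", "modify_time": datetime(2016, 11, 16, 10, 20)},
]

# Assumed table and column names, for illustration only.
query = build_incremental_query(
    "payments", ["id", "status", "modify_time"], "modify_time"
)
deduped = dedupe_latest(batch_1 + batch_2, key="id", version="modify_time")
```

The `%s` placeholders are meant to be filled in by the MySQL driver with the start and end of each load window, so every scheduled run pulls only the rows changed since the previous one instead of dropping and reloading the whole table.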
Source: https://stackoverflow.com/questions/40629086/how-to-sync-mysql-into-bigquery-in-realtime