How to load data from AWS RDS to Google BigQuery in streaming mode?

删除回忆录丶 submitted on 2021-02-19 08:09:07

Question


I have data in Amazon RDS (SQL Server) and want to load it into Google BigQuery in real time.


Answer 1:


There is no direct way to stream changes from Amazon RDS into Google BigQuery. It can be done with a pipeline like this:

Amazon RDS ----Lambda/DMS----> Kinesis Data Streams -----Lambda----> BigQuery

  1. Capture changes from Amazon RDS and write them to Kinesis Data Streams, using either Lambda or AWS DMS (Database Migration Service). You can also push them to Kinesis Data Firehose to aggregate or batch records.
  2. Use a Lambda function that reads from the Kinesis stream (or Firehose) and inserts the records into BigQuery with tabledata.insertAll, the BigQuery streaming API.
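The second step could look roughly like the sketch below. It is only an illustration under several assumptions: each Kinesis record carries one JSON object whose keys match the BigQuery column names, the `google-cloud-bigquery` package and Google credentials are bundled with the Lambda function, and the `BQ_TABLE_ID` environment variable and its default value are hypothetical names. `insert_rows_json` is the Python client method that calls tabledata.insertAll, and the Kinesis `eventID` is passed as the row ID so that BigQuery can do best-effort de-duplication on retries.

```python
import base64
import json
import os

# Hypothetical table ID; replace with your own "project.dataset.table".
TABLE_ID = os.environ.get("BQ_TABLE_ID", "my-project.my_dataset.my_table")


def decode_kinesis_records(event):
    """Turn a Kinesis-triggered Lambda event into (row_id, row) pairs.

    Assumes every record payload is a single JSON object.
    """
    pairs = []
    for record in event.get("Records", []):
        payload = base64.b64decode(record["kinesis"]["data"])
        pairs.append((record["eventID"], json.loads(payload)))
    return pairs


def stream_to_bigquery(client, table_id, event):
    """Insert the decoded rows through the BigQuery streaming API."""
    pairs = decode_kinesis_records(event)
    if not pairs:
        return 0
    row_ids = [row_id for row_id, _ in pairs]
    rows = [row for _, row in pairs]
    # insert_rows_json returns a list of per-row errors (empty on success).
    errors = client.insert_rows_json(table_id, rows, row_ids=row_ids)
    if errors:
        # Raising makes Lambda retry the batch instead of dropping it.
        raise RuntimeError(f"BigQuery rejected rows: {errors}")
    return len(rows)


def lambda_handler(event, context):
    # Imported lazily; the package must be shipped with the function.
    from google.cloud import bigquery

    client = bigquery.Client()
    return stream_to_bigquery(client, TABLE_ID, event)
```

Keeping the BigQuery client as a parameter of `stream_to_bigquery` is a design choice that lets the decoding and insert logic be exercised locally with a stub client before deploying.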



Answer 2:


You can use the Storage Transfer Service together with BigQuery load jobs. This is the recommended migration method for this use case, although it loads data in scheduled batches rather than streaming it row by row. First export the data from AWS RDS to CSV files and move them to S3. Transferring from Amazon S3 is then a two-step process:

  1. The Transfer Service brings the data from S3 into Google Cloud Storage (GCS).
  2. A BigQuery load job loads the data from GCS into BigQuery.
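As a rough sketch of the load-job step, assuming the CSV files have already landed in a GCS bucket, have a header row, and that schema auto-detection is acceptable. The bucket, prefix, and table names are hypothetical, and `gcs_wildcard_uri` is a helper introduced here only for illustration:

```python
def gcs_wildcard_uri(bucket, prefix):
    """Build a gs:// URI that matches every CSV file under a prefix."""
    return f"gs://{bucket}/{prefix.strip('/')}/*.csv"


def load_csv_into_bigquery(uri, table_id):
    """Run one BigQuery load job for the CSV files matched by `uri`."""
    # Imported lazily; requires the google-cloud-bigquery package.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # assumes each file starts with a header row
        autodetect=True,      # assumes inferring the schema is acceptable
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    job = client.load_table_from_uri(uri, table_id, job_config=job_config)
    job.result()  # block until the load job finishes, raising on failure
    return job.output_rows


if __name__ == "__main__":
    # Hypothetical bucket, prefix, and table names.
    uri = gcs_wildcard_uri("my-transfer-bucket", "rds-export")
    print(load_csv_into_bigquery(uri, "my-project.my_dataset.my_table"))
```

Running this on a schedule after each transfer completes gives periodic batch loads; it does not provide the per-row latency of the streaming API.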

Another interesting solution I found uses AWS Data Pipeline to export data from MySQL and feed it into BigQuery.

Moreover, you can use an ETL tool that integrates with both Amazon RDS and BigQuery to transfer the data. One of the best, in my view, is Fivetran.

I hope it helps you.



Source: https://stackoverflow.com/questions/60287594/how-to-load-data-from-aws-rds-to-google-bigquery-in-streaming-mode
