Question
I am new to Talend and have very limited experience with it. My task is to perform a daily incremental update from RDS to Redshift, but my job runs with a very slow transfer rate. Details are listed below.
My RDS SQL query is:
SELECT
*
FROM
test.ankit2
WHERE
id > (SELECT COALESCE(max(id), 0) as id FROM test.stagetable)
ankit2 is the table in my RDS instance and stagetable is the table in Redshift. I used a tMap component to link the RDS input component to the Redshift output component. Please have a look at the image of the Talend ETL job.
Please have a look and share your suggestions. Any help will be appreciated.
Regards, Ankit
Answer 1:
For the best Redshift performance, use the tRedshiftOutputBulkExec component instead of tRedshiftOutput. It stages the data as CSV in the S3 bucket you select and then loads it with Redshift's very fast COPY command (with this approach I managed to reach a write speed of 20000 rows/s).
If possible, it is even better to unload directly from RDS to S3.
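For reference, here is a rough sketch of what a Redshift COPY from S3 looks like when run by hand, assuming the new rows have already been written to S3 as CSV with a header row. The bucket name, key prefix, and IAM role ARN are placeholders I made up for illustration, and the statement the Talend component actually issues may differ.

```sql
-- Hypothetical example: load staged CSV files from S3 into the Redshift table.
-- 'my-bucket', the 'incremental/ankit2_' prefix, and the role ARN are placeholders.
COPY test.stagetable
FROM 's3://my-bucket/incremental/ankit2_'   -- loads every object whose key starts with this prefix
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
FORMAT AS CSV
IGNOREHEADER 1;                             -- skip the header row; drop this if the files have none
```

COPY loads the files in parallel across the cluster, which is why it is so much faster than row-by-row inserts through tRedshiftOutput.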
Source: https://stackoverflow.com/questions/35404230/talend-job-running-with-slow-transfer-rate