Azure Data Flow taking minutes to trigger the next pipeline

Submitted by 痞子三分冷 on 2019-12-07 11:39:17

The document "Monitor data flow performance" mentions that:

Note that you can assume 1 minute of cluster job execution set-up time in your overall performance calculations and if you are using the default Azure Integration Runtime, you may need to add 5 minutes of cluster spin-up time as well.
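To make the arithmetic in that note concrete, here is a minimal sketch. The 1-minute and 5-minute figures come from the quoted documentation; the function name and the example transform time are made up for illustration, and real timings will vary.

```python
# Overheads quoted in the documentation (approximate, not guaranteed).
JOB_SETUP_MINUTES = 1        # cluster job execution set-up time
CLUSTER_SPINUP_MINUTES = 5   # extra spin-up on the default Azure Integration Runtime


def estimated_wall_clock_minutes(transform_minutes, cold_cluster=True):
    """Rough estimate of how long a Data Flow activity occupies the pipeline.

    transform_minutes: time the data flow logic itself needs (hypothetical input).
    cold_cluster: True when no warm cluster is available, so spin-up is paid.
    """
    overhead = JOB_SETUP_MINUTES
    if cold_cluster:
        overhead += CLUSTER_SPINUP_MINUTES
    return transform_minutes + overhead


# A data flow whose own work takes 2 minutes:
print(estimated_wall_clock_minutes(2))                      # cold cluster
print(estimated_wall_clock_minutes(2, cold_cluster=False))  # warm cluster
```

So even a small data flow can hold up the next pipeline activity for several minutes when it starts on a cold cluster.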

That may be the reason. You can start with the tutorial "Mapping data flows performance and tuning guide".

The document "Execute data flow activity in Azure Data Factory" can also help improve performance.

Choose the compute environment for this execution of your data flow. The default is the Azure Auto-Resolve Default Integration Runtime. This choice will execute the data flow on the Spark environment in the same region as your data factory. The compute type will be a job cluster, which means the compute environment will take several minutes to start-up.

You have control over the Spark execution environment for your Data Flow activities. In the Azure integration runtime are settings to set the compute type (general purpose, memory optimized, and compute optimized), number of worker cores, and time-to-live to match the execution engine with your Data Flow compute requirements. Also, setting TTL will allow you to maintain a warm cluster that is immediately available for job executions.

Note:

The Integration Runtime selection in the Data Flow activity only applies to triggered executions of your pipeline. Debugging your pipeline with Data Flows with Debug will execute against the 8-core default Spark cluster.
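As a rough illustration of the integration runtime settings quoted above (compute type, core count, time-to-live), the sketch below builds a JSON definition for an Azure IR. The property names (`dataFlowProperties`, `computeType`, `coreCount`, `timeToLive`) reflect my understanding of the ARM layout and should be checked against the current documentation; the runtime name `DataFlowWarmIR` and the helper function are hypothetical.

```python
import json


def build_azure_ir_definition(name, compute_type="General", core_count=8, ttl_minutes=10):
    """Return a dict shaped like an Azure IR definition with Data Flow settings.

    Assumption: field names follow the ARM template layout as I understand it;
    verify them against the Data Factory documentation before using this.
    """
    return {
        "name": name,  # hypothetical runtime name
        "properties": {
            "type": "Managed",
            "typeProperties": {
                "computeProperties": {
                    "location": "AutoResolve",
                    "dataFlowProperties": {
                        "computeType": compute_type,  # general purpose / memory / compute optimized
                        "coreCount": core_count,      # Spark cores for the cluster
                        "timeToLive": ttl_minutes,    # minutes to keep the cluster warm
                    },
                }
            },
        },
    }


definition = build_azure_ir_definition("DataFlowWarmIR", ttl_minutes=15)
print(json.dumps(definition, indent=2))
```

You would then select this integration runtime in the Data Flow activity instead of the Auto-Resolve default.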

Hope this helps.

You will hit the Databricks cluster spin-up time during job (triggered) execution.

As long as you are in Debug mode, you'll always hit a warmed cluster while the debug session is still green.

We've added TTL to the Azure IR in the Data Flow configuration section so that you can keep a cluster alive for your next data flow activity and you won't incur the start-up penalty on your next execution.

Note that the option is greyed out at this time, but we will enable it soon.
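To see why a TTL removes the start-up penalty for back-to-back data flow activities, here is a simplified simulation. It is only a model under stated assumptions: runs are sequential and sorted by start time, the cluster stays warm for `ttl_minutes` after a run finishes, and the spin-up time itself is ignored when computing when a run ends. The function name and the sample schedule are made up.

```python
def cold_starts(runs, ttl_minutes):
    """Flag which runs would start on a cold cluster.

    runs: list of (start_minute, duration_minutes), sorted by start_minute.
    ttl_minutes: how long the cluster is kept alive after a run finishes.
    Returns a list of booleans, True meaning the run pays the spin-up penalty.
    """
    flags = []
    warm_until = None  # minute until which a warm cluster is available
    for start, duration in runs:
        is_cold = warm_until is None or start > warm_until
        flags.append(is_cold)
        # Simplification: spin-up time is not added to the run's end time.
        warm_until = start + duration + ttl_minutes
    return flags


schedule = [(0, 10), (15, 10), (60, 5)]  # hypothetical (start, duration) in minutes
print(cold_starts(schedule, ttl_minutes=10))
print(cold_starts(schedule, ttl_minutes=0))
```

With a 10-minute TTL the second run reuses the warm cluster, while the third starts long after the TTL has expired and is cold again. With no TTL every run is cold.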
