Data Lake Analytics U-SQL EXTRACT speed (Local vs Azure)
I've been looking into using Azure Data Lake Analytics to manipulate some gzipped XML data I have stored in Azure Blob Storage, but I'm running into an interesting issue. When we use U-SQL locally to process 500 of these XML files, processing is extremely quick: roughly 40 seconds using 1 AU locally (which appears to be the limit). However, when we run the same job in Azure using 5 AUs, processing takes 17+ minutes. We eventually want to scale this up to ~20,000 files and more, but have reduced the set to try and measure