Uploading files to Azure blob storage taking more time for larger files

Submitted by 一世执手 on 2020-01-17 04:03:21

Question


Hi All...

I am trying to upload large files (more than 100 MB) to Azure Blob Storage. Below is the code.

My problem is that even though I am using BeginPutBlock with TPL (Task Parallelism), the upload is still slow (about 20 minutes for 100 MB), and I need to upload files larger than 2 GB. Can anyone please help me with this?

namespace BlobSamples {
    public class UploadAsync
    {
        static void Main(string[] args)
        {
            //string filePath = @"D:\Frameworks\DNCMag-Issue26-DoubleSpread.pdf";
            string filePath = @"E:\E Books\NageswaraRao Meterial\ebooks\applied_asp.net_4_in_context.pdf";
            string accountName = "{account name}";
            string accountKey = "{account key}";
            string containerName = "sampleContainer";
            string blobName = Path.GetFileName(filePath);
            //byte[] fileContent = File.ReadAllBytes(filePath);
            Stream fileContent = System.IO.File.OpenRead(filePath);

            StorageCredentials creds = new StorageCredentials(accountName, accountKey);
            CloudStorageAccount storageAccount = new CloudStorageAccount(creds, useHttps: true);
            CloudBlobClient blobclient = storageAccount.CreateCloudBlobClient();
            CloudBlobContainer container = blobclient.GetContainerReference(containerName);
            CloudBlockBlob blob = container.GetBlockBlobReference(blobName);

            // Define your retry strategy: retry 5 times, starting 1 second apart
            // and adding 2 seconds to the interval each retry.
            var retryStrategy = new Incremental(5, TimeSpan.FromSeconds(1),
              TimeSpan.FromSeconds(2));

            // Define your retry policy using the retry strategy and the Azure storage
            // transient fault detection strategy.
            var retryPolicy =
              new RetryPolicy<StorageTransientErrorDetectionStrategy>(retryStrategy);

            // Receive notifications about retries.
            retryPolicy.Retrying += (sender, arg) =>
                {
                    // Log details of the retry.
                    var msg = String.Format("Retry - Count:{0}, Delay:{1}, Exception:{2}",
                        arg.CurrentRetryCount, arg.Delay, arg.LastException);
                };

            Console.WriteLine("Upload Started" + DateTime.Now);
            ChunkedUploadStreamAsync(blob, fileContent, (1024*1024), retryPolicy);
            Console.WriteLine("Upload Ended" + DateTime.Now);
            Console.ReadLine();
        }

        private static Task PutBlockAsync(CloudBlockBlob blob, string id, Stream stream, RetryPolicy policy)
        {
            Func<Task> uploadTaskFunc = () => Task.Factory
                .FromAsync(
                    (asyncCallback, state) => blob.BeginPutBlock(id, stream, null, null, null, null, asyncCallback, state)
                    , blob.EndPutBlock
                    , null
                );
            Console.WriteLine("Uploaded " + id + DateTime.Now);
            return policy.ExecuteAsync(uploadTaskFunc);
        }

        public static Task ChunkedUploadStreamAsync(CloudBlockBlob blob, Stream source, int chunkSize, RetryPolicy policy)
        {
            var blockids = new List<string>();
            var blockid = 0;

            int count;

            // first create a list of TPL Tasks for uploading blocks asynchronously
            var tasks = new List<Task>();

            var bytes = new byte[chunkSize];
            while ((count = source.Read(bytes, 0, bytes.Length)) != 0)
            {
                var id = Convert.ToBase64String(BitConverter.GetBytes(++blockid));
                blockids.Add(id);
                tasks.Add(PutBlockAsync(blob, id, new MemoryStream(bytes, true), policy));
                bytes = new byte[chunkSize]; //need a new buffer to avoid overriding previous one
            }

            return Task.Factory.ContinueWhenAll(
                tasks.ToArray(),
                array =>
                {
                    // propagate exceptions and make all faulted Tasks as observed
                    Task.WaitAll(array);
                    policy.ExecuteAction(() => blob.PutBlockListAsync(blockids));
                    Console.WriteLine("Uploaded Completed " + DateTime.Now);
                });
        }
    }
}

Answer 1:


If a command-line tool is acceptable, you can try AzCopy, which transfers Azure Storage data with high performance and can resume interrupted transfers.

If you want to control the transfer jobs programmatically, please use the Azure Storage Data Movement Library, which is the core of AzCopy.
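
In case a sketch helps, the snippet below shows roughly how the Data Movement Library can be used for a large single-file upload. This is not part of the original answer; the connection string, file path, and blob name are placeholders, and the exact namespaces (Microsoft.WindowsAzure.Storage.DataMovement vs. Microsoft.Azure.Storage.DataMovement) depend on which package version you install:

using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using Microsoft.WindowsAzure.Storage.DataMovement;

public class DataMovementUploadSample
{
    public static async Task UploadLargeFileAsync()
    {
        // Placeholder connection string, container, and blob names.
        CloudStorageAccount account = CloudStorageAccount.Parse("{connection string}");
        CloudBlobContainer container = account.CreateCloudBlobClient().GetContainerReference("sampleContainer");
        CloudBlockBlob destBlob = container.GetBlockBlobReference("largefile.bin");

        // Number of parallel block uploads; tune to your bandwidth and CPU.
        TransferManager.Configurations.ParallelOperations = 64;

        // Report progress as bytes are transferred.
        var context = new SingleTransferContext
        {
            ProgressHandler = new Progress<TransferStatus>(
                p => Console.WriteLine("Bytes uploaded: " + p.BytesTransferred))
        };

        await TransferManager.UploadAsync(
            @"E:\largefile.bin", destBlob, null /* UploadOptions */, context, CancellationToken.None);

        Console.WriteLine("Upload completed " + DateTime.Now);
    }
}

The library splits the file into blocks and handles retries itself, which is essentially what the code in the question re-implements by hand.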




Answer 2:


As I know, block blobs are made up of blocks, and a block can be up to 4 MB in size. According to your code, you set the block size to 1 MB and upload each block in parallel programmatically. A simpler way is to leverage the ParallelOperationThreadCount property to upload blob blocks in parallel as follows:

//set the number of blocks that may be simultaneously uploaded
var requestOption = new BlobRequestOptions()
{
    ParallelOperationThreadCount = 5,
    //Gets or sets the maximum size of a blob in bytes that may be uploaded as a single blob
    SingleBlobUploadThresholdInBytes = 10 * 1024 * 1024 // maximum 64 MB, default 32 MB
};

//upload a file to blob
blob.UploadFromFile("{filepath}", options: requestOption);

With this option, when your blob (file) is larger than the value of SingleBlobUploadThresholdInBytes, the storage client breaks the file into blocks (4 MB in size) automatically and uploads the blocks simultaneously.
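
As a side note (not part of the original answer), the same options object can also be passed to the async overload; the file path is a placeholder and, depending on the SDK version, the overload may additionally require a FileMode argument:

// Hypothetical async variant of the call above; requestOption is the BlobRequestOptions shown earlier.
await blob.UploadFromFileAsync("{filepath}", null /* AccessCondition */, requestOption, null /* OperationContext */);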

Based on your requirement, I created an ASP.NET Web API application which exposes an API to upload files to Azure Blob Storage.

Project URL: AspDotNet-WebApi-AzureBlobFileUploadSample

Note:

In order to upload large files, you need to increase maxRequestLength and maxAllowedContentLength in your web.config as follows:

<system.web>
   <httpRuntime maxRequestLength="2097152"/>  <!--KB in size, 4MB by default, increase it to 2GB-->
</system.web>
<system.webServer> 
      <security> 
          <requestFiltering> 
             <requestLimits maxAllowedContentLength="2147483648" />  <!-- Bytes in size, increase it to 2 GB -->
          </requestFiltering> 
      </security> 
</system.webServer>



Answer 3:


I'd suggest you use AzCopy when uploading large files; it saves a lot of coding time and is more efficient. To upload a single file, run the command below:

AzCopy /Source:C:\folder /Dest:https://youraccount.blob.core.windows.net/container /DestKey:key /Pattern:"test.txt"
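
A hedged addition (not in the original answer): the classic Windows AzCopy can also upload an entire folder recursively with the /S flag, for example:

AzCopy /Source:C:\folder /Dest:https://youraccount.blob.core.windows.net/container /DestKey:key /S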


Source: https://stackoverflow.com/questions/41054784/uploading-files-to-azure-blob-storage-taking-more-time-for-larger-files
