Large File download from SQL via WebApi after custom MultipartFormDataStreamProvider upload


I finally figured it out. I had over-complicated the write process, which brought about most of the struggle. Here is my solution to the initial issue:

To keep .NET from buffering the file in memory (so that you can handle large file uploads), you first need to override WebHostBufferPolicySelector so that it doesn't buffer the input stream for your controller, and then replace the default buffer policy selector.

public class NoBufferPolicySelector : WebHostBufferPolicySelector
{
    public override bool UseBufferedInputStream(object hostContext)
    {
        var context = hostContext as HttpContextBase;

        //disable input buffering only for the upload controller
        if (context != null)
        {
            var controller = context.Request.RequestContext.RouteData.Values["controller"];

            if (controller != null &&
                string.Equals(controller.ToString(), "upload", StringComparison.InvariantCultureIgnoreCase))
            {
                return false;
            }
        }

        return true;
    }

    public override bool UseBufferedOutputStream(HttpResponseMessage response)
    {
        return base.UseBufferedOutputStream(response);
    }
}

Then, to replace the default buffer policy selector:

GlobalConfiguration.Configuration.Services.Replace(typeof(IHostBufferPolicySelector), new NoBufferPolicySelector());
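
The replacement has to run before any requests are handled; in a classic ASP.NET Web API project that usually means Application_Start in Global.asax. A minimal sketch of that placement (the exact startup hook and WebApiConfig registration are assumptions based on the standard project template):

//Global.asax.cs - placement is an assumption; any startup hook that runs
//before the first request will do
protected void Application_Start()
{
    GlobalConfiguration.Configuration.Services.Replace(
        typeof(IHostBufferPolicySelector),
        new NoBufferPolicySelector());

    GlobalConfiguration.Configure(WebApiConfig.Register);
}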

Then, to avoid the default behavior of having the file stream written to disk, you need to provide a stream provider that writes to the database instead. To do this, inherit from MultipartStreamProvider and override the GetStream method to return the stream that will write to your database.

public class CustomMultipartLargeFileStreamProvider : MultipartStreamProvider
{
    //tracks which parts are plain form fields (true) versus file parts (false)
    private readonly List<bool> _isFormData = new List<bool>();

    public override Stream GetStream(HttpContent parent, HttpContentHeaders headers)
    {
        // For form data, Content-Disposition header is a requirement
        ContentDispositionHeaderValue contentDisposition = headers.ContentDisposition;
        if (contentDisposition != null && !String.IsNullOrEmpty(contentDisposition.FileName))
        {
            // We won't post process files as form data
            _isFormData.Add(false);

            //create unique identifier for this file upload
            var identifier = Guid.NewGuid();
            var fileName = contentDisposition.FileName;

            var boundaryObj = parent.Headers.ContentType.Parameters.SingleOrDefault(a => a.Name == "boundary");

            var boundary = (boundaryObj != null) ? boundaryObj.Value : "";

            //strip any client-side path information and surrounding quotes from the file name
            if (fileName.Contains("\\"))
            {
                fileName = fileName.Substring(fileName.LastIndexOf("\\") + 1).Replace("\"", "");
            }

            //write parent container for the file chunks that are being stored
            WriteLargeFileContainer(fileName, identifier, headers.ContentType.MediaType, boundary);

            //create an instance of the custom stream that will write the chunks to the database
            var stream = new CustomSqlStream();
            stream.Filename = fileName;
            stream.FullFilename = contentDisposition.FileName.Replace("\"", "");
            stream.Identifier = identifier.ToString();
            stream.ContentType = headers.ContentType.MediaType;
            stream.Boundary = (!string.IsNullOrEmpty(boundary)) ? boundary : "";

            return stream;
        }
        else
        {
            // We will post process this as form data
            _isFormData.Add(true);

            // If no filename parameter was found in the Content-Disposition header then return a memory stream.
            return new MemoryStream();
        }
    }
}
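
The WriteLargeFileContainer helper called above isn't shown in the original; it just records the parent row that the chunks will reference. A minimal sketch of it as a member of the provider, assuming a stored procedure of the same name and these parameter names:

//a sketch only - the stored procedure name and its parameter names are assumptions
private void WriteLargeFileContainer(string fileName, Guid identifier, string contentType, string boundary)
{
    using (var conn = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["PortalBetaConnectionString"].ConnectionString))
    using (var cmd = new SqlCommand("WriteLargeFileContainer", conn))
    {
        cmd.CommandType = System.Data.CommandType.StoredProcedure;
        cmd.Parameters.Add(new SqlParameter("@FileName", fileName));
        cmd.Parameters.Add(new SqlParameter("@Identifier", identifier));
        cmd.Parameters.Add(new SqlParameter("@ContentType", contentType));
        cmd.Parameters.Add(new SqlParameter("@Boundary", boundary));

        conn.Open();
        cmd.ExecuteNonQuery();
    }
}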

The custom stream you create needs to inherit Stream and override the Write method. This is where I overthought the problem: I assumed I needed to parse the multipart boundary markers out of the buffer parameter, but the framework has already done that for you - the offset and count parameters delimit the raw file bytes.

public override void Write(byte[] buffer, int offset, int count)
{
    //no boundary is included in buffer - copy just the file bytes and persist them
    byte[] fileData = new byte[count];
    Buffer.BlockCopy(buffer, offset, fileData, 0, count);
    WriteData(fileData);
}
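
The rest of CustomSqlStream is plumbing: the metadata properties the provider sets, write-only stream stubs, and a WriteData method that persists each chunk. A minimal sketch of those remaining members; the WriteAttachmentChunk stored procedure and its parameters are assumptions (the download side below reads chunks back via ReadAttachmentChunks, so the write side presumably mirrors it):

public class CustomSqlStream : Stream
{
    //...plus the Write override shown above

    //metadata populated by the stream provider before writing begins
    public string Filename { get; set; }
    public string FullFilename { get; set; }
    public string Identifier { get; set; }
    public string ContentType { get; set; }
    public string Boundary { get; set; }

    //write-only stream: reading and seeking are not supported
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }

    //persist one chunk - "WriteAttachmentChunk" and its parameters are assumptions
    private void WriteData(byte[] chunk)
    {
        using (var conn = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["PortalBetaConnectionString"].ConnectionString))
        using (var cmd = new SqlCommand("WriteAttachmentChunk", conn))
        {
            cmd.CommandType = System.Data.CommandType.StoredProcedure;
            cmd.Parameters.Add(new SqlParameter("@Identifier", Identifier));
            cmd.Parameters.Add(new SqlParameter("@Data", chunk));

            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}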

From there, it's just a matter of plugging in the API methods for upload and download. For upload:

public Task<HttpResponseMessage> PostFormData()
{
    //reject non-multipart requests up front
    if (!Request.Content.IsMimeMultipartContent())
    {
        throw new HttpResponseException(HttpStatusCode.UnsupportedMediaType);
    }

    var provider = new CustomMultipartLargeFileStreamProvider();

    // Read the form data and return an async task.
    return Request.Content.ReadAsMultipartAsync(provider).ContinueWith<HttpResponseMessage>(t =>
    {
        if (t.IsFaulted || t.IsCanceled)
        {
            return Request.CreateErrorResponse(HttpStatusCode.InternalServerError, t.Exception);
        }

        return Request.CreateResponse(HttpStatusCode.OK);
    });
}
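
To exercise the upload end to end, something like the following works from any async client code. The endpoint URL and file path are assumptions since the controller's routing isn't shown above; note that the controller must be named "upload" for NoBufferPolicySelector to disable input buffering:

//a quick client-side test - the URL and file path are assumptions
public static async Task UploadAsync()
{
    using (var client = new HttpClient())
    using (var content = new MultipartFormDataContent())
    using (var fileStream = File.OpenRead(@"C:\temp\largefile.zip"))
    {
        content.Add(new StreamContent(fileStream), "file", "largefile.zip");

        var response = await client.PostAsync("http://localhost/api/upload", content);
        response.EnsureSuccessStatusCode();
    }
}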

For download, in order to keep the memory footprint low, I leveraged PushStreamContent to push the chunks back to the user:

[HttpGet]
[Route("file/{id}")]
public async Task<HttpResponseMessage> GetFile(string id)
{
    string mimeType = string.Empty;
    string filename = string.Empty;
    if (!string.IsNullOrEmpty(id))
    {
        //get the headers for the file being sent back to the user
        using (var myConn = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["PortalBetaConnectionString"].ConnectionString))
        {
            using (var myCmd = new SqlCommand("ReadLargeFileInfo", myConn))
            {
                myCmd.CommandType = System.Data.CommandType.StoredProcedure;

                var pIdentifier = new SqlParameter("@Identifier", id);

                myCmd.Parameters.Add(pIdentifier);

                await myConn.OpenAsync();

                //dispose the reader when done so the connection is released promptly
                using (var dataReader = await myCmd.ExecuteReaderAsync())
                {
                    while (await dataReader.ReadAsync())
                    {
                        mimeType = dataReader.GetString(0);
                        filename = dataReader.GetString(1);
                    }
                }
            }
        }

        var result = new HttpResponseMessage()
        {
            Content = new PushStreamContent(async (outputStream, httpContent, transportContext) =>
            {
                //pull the data back from the db and stream the data back to the user
                await WriteDataChunksFromDBToStream(outputStream, httpContent, transportContext, id);
            }),
            StatusCode = HttpStatusCode.OK
        };

        result.Content.Headers.ContentType = new MediaTypeHeaderValue(mimeType);
        result.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment") { FileName = filename };

        return result;
    }

    return new HttpResponseMessage(HttpStatusCode.BadRequest);
}

    private async Task WriteDataChunksFromDBToStream(Stream responseStream, HttpContent httpContent, TransportContext transportContext, string fileIdentifier)
    {
        // PushStreamContent requires the responseStream to be closed
        // for signaling it that you have finished writing the response.
        using (responseStream)
        {
            using (var myConn = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["PortalBetaConnectionString"].ConnectionString))
            {
                await myConn.OpenAsync();

                //stored proc to pull the data back from the db
                using (var myCmd = new SqlCommand("ReadAttachmentChunks", myConn))
                {
                    myCmd.CommandType = System.Data.CommandType.StoredProcedure;

                    var fileName = new SqlParameter("@Identifier", fileIdentifier);

                    myCmd.Parameters.Add(fileName);

                    // The reader needs to be executed with the SequentialAccess behavior to enable network streaming
                    // Otherwise ReadAsync will buffer the entire BLOB into memory which can cause scalability issues or even OutOfMemoryExceptions
                    using (var reader = await myCmd.ExecuteReaderAsync(CommandBehavior.SequentialAccess))
                    {
                        while (await reader.ReadAsync())
                        {
                            //confirm the column that has the binary data of the file returned is not null
                            if (!(await reader.IsDBNullAsync(0)))
                            {
                                //read the binary data of the file into a stream
                                using (var data = reader.GetStream(0))
                                {
                                    // Asynchronously copy the stream from the server to the response stream
                                    await data.CopyToAsync(responseStream);
                                    //flush the copied bytes to the client (flushing the source stream would be a no-op)
                                    await responseStream.FlushAsync();
                                }
                            }
                        }
                    }
                }
            }
        }// close response stream
    }

Ugh. This is nasty. With the upload, you have to make sure to:

  1. Separate the headers from the content portion - you must follow the HTTP RFCs' requirements for multipart messages.
  2. Allow for chunked transfer encoding.
  3. Of course, the content portion (unless you are transmitting text) will be binary data rather than readable strings.
  4. Allow for compressed transfers, i.e. GZIP or DEFLATE.
  5. Maybe - just maybe - take the text encoding into account (ASCII, Unicode, UTF-8, etc.).

You can't really ensure that you're persisting the right information to the database without accounting for all of these. For the latter items, the metadata telling you what to do is somewhere in the headers, so the headers aren't just throwaways.
