Node JS - Stream data from Busboy to AWS S3

Posted by 安稳与你 on 2019-12-08 08:37:00

Question


I am trying to upload a file to S3 via EC2. My first approach was to upload the file to EC2 completely and then upload that file to S3. This approach is not good because the transfer from EC2 to S3 adds time on top of the client upload.

Currently I am trying to pipe the busboy upload stream into the S3 upload stream, so that the upload to EC2 and the transfer from EC2 to S3 happen simultaneously; the S3 "upload" method supports a stream as the upload Body.

Here is my code:

var express = require('express');
var AWS = require('aws-sdk');            // AWS SDK for JavaScript v2
var Busboy = require('busboy');          // busboy 0.x (constructor-style API)
var inspect = require('util').inspect;

var router = express.Router();

router.post('/s3StreamUpload', function(req, res, next) {
   var busboy = new Busboy({headers: req.headers});
   busboy.on('file', function (fieldname, file, filename, encoding, mimetype) {
      console.log('Before Upload: ' + new Date());
      console.log('File [' + fieldname + ']: filename: ' + filename + ', encoding: ' + encoding + ', mimetype: ' + mimetype);

      var s3 = new AWS.S3();
      // The file stream is the upload Body. partSize and queueSize are options
      // of upload() (second argument), not of the AWS.S3 constructor.
      s3.upload(
         {Bucket: 'sswa', Key: filename, Body: file},
         {partSize: 5 * 1024 * 1024, queueSize: 10}   // 5 MB parts
      ).on('httpUploadProgress', function (evt) {
         console.log(evt);
      }).send(function (err, data) {
         console.log('After Upload: ' + new Date());
         console.log(err, data);
      });
   });
   busboy.on('field', function(fieldname, val, fieldnameTruncated, valTruncated, encoding, mimetype) {
      console.log('Field [' + fieldname + ']: value: ' + inspect(val));
   });
   busboy.on('finish', function() {
      console.log('Done parsing form!');
      res.writeHead(303, { Connection: 'close', Location: '/' });
      res.end();
   });
   req.pipe(busboy);
});

I am not sure whether this really uploads to S3 simultaneously as a stream. Are there any drawbacks to this approach?


Answer 1:


To test whether the multipart streaming upload to S3 is working, I logged the time at three points of execution:

  1. Before the upload from the client starts (uploadStartTime)
  2. After the file is uploaded to EC2 (busboyFinishTime)
  3. After the file is transferred to S3 (s3UploadFinishTime)

Then I ran it on EC2. After uploading video files of various sizes (36.1 MB, 33.3 MB, 52.5 MB), I observed that parts are transferred to S3 as soon as each 5 MB (the part size I defined) has been uploaded to EC2. While parts are being uploaded to S3, the following line logs the upload progress of each part along with its part number.

console.log(evt);
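
The part-by-part behaviour described above can be pictured with a small model. This is only an illustrative sketch, not the AWS SDK's actual implementation: `createPartBuffer` and its callback are hypothetical names, and the tiny 5-byte part size stands in for the 5 MB used in the test. It buffers incoming chunks and hands over a part as soon as enough bytes have arrived, which is why the transfer to S3 can keep pace with the upload to EC2.

```javascript
// Illustrative model only (hypothetical helper, not part of aws-sdk or busboy):
// collect incoming chunks and emit a part as soon as partSize bytes are buffered.
function createPartBuffer(partSize, onPart) {
  var pending = [];
  var pendingBytes = 0;
  var partNumber = 0;

  function emit(buffer) {
    partNumber += 1;
    onPart(partNumber, buffer);
  }

  return {
    // Called for every chunk that arrives from the client.
    write: function (chunk) {
      pending.push(chunk);
      pendingBytes += chunk.length;
      while (pendingBytes >= partSize) {
        var all = Buffer.concat(pending, pendingBytes);
        emit(all.subarray(0, partSize));
        var rest = all.subarray(partSize);
        pending = rest.length ? [rest] : [];
        pendingBytes = rest.length;
      }
    },
    // Called when the incoming stream ends: flush the last, smaller part.
    end: function () {
      if (pendingBytes > 0) {
        emit(Buffer.concat(pending, pendingBytes));
        pending = [];
        pendingBytes = 0;
      }
    }
  };
}

// Demo with a 5-byte part size standing in for 5 MB.
var demo = createPartBuffer(5, function (partNumber, part) {
  console.log('part ' + partNumber + ': ' + part.length + ' bytes');
});
demo.write(Buffer.from('abc'));     // 3 bytes buffered, no part yet
demo.write(Buffer.from('defgh'));   // 8 bytes buffered, part 1 is emitted
demo.write(Buffer.from('ijklmn'));  // 9 bytes buffered, part 2 is emitted
demo.end();                         // remaining 4 bytes become part 3
```

In this model only the final part can be smaller than the part size, and every earlier part is handed over while the client is still sending data.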

For all three uploads, busboyFinishTime and s3UploadFinishTime are the same or differ by barely 1 second.

Example: for the 52.5 MB upload

{
  "uploadStartTime": "2016-04-28T14:19:51.365Z",
  "busboyFinishTime": "2016-04-28T14:22:26.292Z",
  "s3UploadFinishTime": "2016-04-28T14:22:26.558Z"
}
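
To make the numbers in this log easier to read, the intervals can be computed directly from the three timestamps. A minimal sketch; `elapsedMs` and the variable names are illustrative and not part of the original code.

```javascript
// Timestamps copied from the 52.5 MB example log.
var timings = {
  uploadStartTime: '2016-04-28T14:19:51.365Z',
  busboyFinishTime: '2016-04-28T14:22:26.292Z',
  s3UploadFinishTime: '2016-04-28T14:22:26.558Z'
};

// Milliseconds between two ISO 8601 timestamps.
function elapsedMs(from, to) {
  return new Date(to).getTime() - new Date(from).getTime();
}

// Time for the client upload to reach EC2.
var clientToEc2Ms = elapsedMs(timings.uploadStartTime, timings.busboyFinishTime);
// Extra time S3 needed after EC2 had received everything.
var s3LagMs = elapsedMs(timings.busboyFinishTime, timings.s3UploadFinishTime);

console.log(clientToEc2Ms); // 154927
console.log(s3LagMs);       // 266
```

The client upload took about 155 seconds, while the S3 transfer finished only 266 ms after it, which is consistent with the parts having been sent to S3 during the upload rather than afterwards.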

Full code:

var express = require('express');
var AWS = require('aws-sdk');            // AWS SDK for JavaScript v2
var Busboy = require('busboy');          // busboy 0.x (constructor-style API)
var inspect = require('util').inspect;

var router = express.Router();

router.post('/s3StreamUpload', function(req, res, next) {
   var busboy = new Busboy({headers: req.headers});
   var uploadStartTime = new Date(),
      busboyFinishTime = null,
      s3UploadFinishTime = null;

   // Send the timing log once both busboy and the S3 upload have finished,
   // whichever of the two completes last.
   function respondIfDone() {
      if(busboyFinishTime && s3UploadFinishTime) {
         res.json({
            uploadStartTime: uploadStartTime,
            busboyFinishTime: busboyFinishTime,
            s3UploadFinishTime: s3UploadFinishTime
         });
      }
   }

   busboy.on('file', function (fieldname, file, filename, encoding, mimetype) {
      console.log('File [' + fieldname + ']: filename: ' + filename + ', encoding: ' + encoding + ', mimetype: ' + mimetype);

      var s3 = new AWS.S3();
      // The file stream is the upload Body. partSize and queueSize are options
      // of upload() (second argument), not of the AWS.S3 constructor.
      s3.upload(
         {Bucket: 'sswa', Key: filename, Body: file},
         {partSize: 5 * 1024 * 1024, queueSize: 10}   // 5 MB parts
      ).on('httpUploadProgress', function (evt) {
         console.log(evt);   // progress of each uploaded part
      }).send(function (err, data) {
         console.log(err, data);
         if(err) return next(err);   // do not leave the request hanging on failure
         s3UploadFinishTime = new Date();
         respondIfDone();
      });
   });
   busboy.on('field', function(fieldname, val, fieldnameTruncated, valTruncated, encoding, mimetype) {
      console.log('Field [' + fieldname + ']: value: ' + inspect(val));
   });
   busboy.on('finish', function() {
      console.log('Done parsing form!');
      busboyFinishTime = new Date();
      respondIfDone();
   });
   req.pipe(busboy);
});

According to my observations, I feel confident that this is one of the best solutions to upload a file to S3 via EC2 using a REST API deployed on EC2.




Answer 2:


Are you trying to upload to S3 directly from the browser? If so, you can use a presigned PUT URL for direct browser-to-S3 uploads.

This is how you generate a presigned PUT URL using minio-js:

// s3Client is an already-configured minio-js client; 1000 is the URL expiry in seconds.
s3Client.presignedPutObject('my-bucketname', 'my-objectname', 1000, function(e, presignedUrl) {
  if (e) return console.log(e)
  console.log(presignedUrl)
})

Now you pass this presigned URL to the browser client which can use XMLHttpRequest to directly PUT a file to S3.
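
The browser side of that step could look roughly like the following. This is a hedged sketch: `putToPresignedUrl` is a hypothetical helper name, and the optional `XhrCtor` parameter is only there so the function can be exercised outside a browser. In a page it falls back to the global XMLHttpRequest.

```javascript
// Hypothetical helper: PUT a File/Blob to a presigned URL from the browser.
// onDone(err) is called with null on success or an Error on failure.
function putToPresignedUrl(presignedUrl, file, onDone, XhrCtor) {
  var Xhr = XhrCtor || XMLHttpRequest;   // browser global unless one is injected
  var xhr = new Xhr();
  xhr.open('PUT', presignedUrl, true);
  xhr.onload = function () {
    if (xhr.status >= 200 && xhr.status < 300) {
      onDone(null);
    } else {
      onDone(new Error('Upload failed with status ' + xhr.status));
    }
  };
  xhr.onerror = function () {
    onDone(new Error('Network error during upload'));
  };
  xhr.send(file);   // the file goes straight to the storage service, not via EC2
  return xhr;
}
```

In a page this might be called as putToPresignedUrl(presignedUrl, fileInput.files[0], callback), where fileInput is an assumed file input element. With this approach the file never passes through the EC2 server at all.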



Source: https://stackoverflow.com/questions/36911100/node-js-stream-data-from-busboy-to-aws-s3
