How do I bulk upload to s3?

折月煮酒 submitted on 2019-11-30 05:03:09

Question


I recently refactored some of my code to load rows into a database using LOAD DATA, and it works great. However, for each record I also have to upload two files to S3, and this completely destroys the magnificent speed upgrade I was obtaining: where I could process 600+ of these documents/second, they now trickle in at 1/second because of S3.

What are your workarounds for this? Looking at the API, I see that it is mostly RESTful, so I'm not sure what to do; maybe I should just store all of this in the database. The text files are usually no more than 1.5 KB (the other file we store is an XML representation of the text).

I already cache these files in HTTP requests to my web server, as they are used quite a lot.

By the way, our current implementation uses Java; I have not yet tried threads, but that might be an option.

Recommendations?


Answer 1:


You can use the [putObjects][1] method of JetS3t to upload multiple files at once.

Alternatively you could use a background thread to upload to S3 from a queue, and add files to the queue from your code that loads the data into the database.

[1]: http://jets3t.s3.amazonaws.com/api/org/jets3t/service/multithread/S3ServiceMulti.html#putObjects(org.jets3t.service.model.S3Bucket, org.jets3t.service.model.S3Object[])
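The queue-based approach above can be sketched in Java as a simple producer/consumer: the load-data loop enqueues file keys and returns immediately, while a pool of background threads drains the queue and performs the slow S3 calls. This is a minimal illustration of the pattern only; the `Consumer<String>` stands in for the real upload call (e.g. JetS3t's `putObjects` or an S3 client's `putObject`), and all names here are illustrative, not from the original answer.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class QueuedUploader {
    private static final String STOP = "__STOP__"; // poison pill to end workers

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final ExecutorService workers;
    private final Consumer<String> uploadFn;
    private final int numThreads;

    public QueuedUploader(int numThreads, Consumer<String> uploadFn) {
        this.numThreads = numThreads;
        this.uploadFn = uploadFn;
        this.workers = Executors.newFixedThreadPool(numThreads);
        for (int i = 0; i < numThreads; i++) {
            workers.submit(this::drain);
        }
    }

    // Called from the fast load-data loop; returns immediately.
    public void enqueue(String fileKey) {
        queue.add(fileKey);
    }

    // Each worker blocks on the queue and uploads items as they arrive.
    private void drain() {
        try {
            while (true) {
                String key = queue.take();
                if (STOP.equals(key)) {
                    return;
                }
                uploadFn.accept(key); // in real code: the S3 upload call
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Signal all workers to finish and wait for remaining uploads.
    public void shutdown() throws InterruptedException {
        for (int i = 0; i < numThreads; i++) {
            queue.add(STOP);
        }
        workers.shutdown();
        workers.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

With, say, four worker threads, the database loader is no longer serialized behind each S3 round trip; throughput is limited by the workers rather than by one upload at a time.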




Answer 2:


I just found a nice solution for uploading an entire directory with PHP, using the AWS SDK's uploadDirectory method on an S3Client:

$client->uploadDirectory(
  SOURCE_FOLDER,
  YOUR_BUCKET_NAME,
  DESTINATION,
  array(
    'concurrency' => 5,
    'debug'       => TRUE,
    'force'       => FALSE,
    'params'      => array(
      'ServerSideEncryption' => 'AES256',
    ),
  )
);


Source: https://stackoverflow.com/questions/667478/how-do-i-bulk-upload-to-s3
