amazon-s3

Copy files from S3 to EMR local using Lambda

对着背影说爱祢 submitted on 2020-01-05 04:57:06
Question: I need to move files from S3 to EMR's local directory /home/hadoop programmatically using Lambda. S3DistCp only copies to HDFS; I then have to log into EMR and run a copyToLocal hdfs command on the command line to get the files into /home/hadoop. Is there a programmatic way, using boto3 in Lambda, to copy from S3 to EMR's local directory? Answer 1: I wrote a test Lambda function that submits a job step to EMR which copies files from S3 to EMR's local directory. This worked. emrclient = boto3.client('emr', region_name='us-west-2')
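The answer's snippet is truncated above; as a minimal sketch of the general pattern it describes, a Lambda handler can submit a step via add_job_flow_steps that runs "aws s3 cp" through command-runner.jar on the cluster. The cluster ID, bucket, and prefix below are hypothetical placeholders, not values from the original answer:

```python
import boto3

emrclient = boto3.client('emr', region_name='us-west-2')

# Hypothetical identifiers for illustration only.
CLUSTER_ID = 'j-XXXXXXXXXXXXX'
SRC = 's3://my-bucket/my-prefix/'
DEST = '/home/hadoop/'

def lambda_handler(event, context):
    # Submit a step that runs "aws s3 cp" on the master node,
    # copying the objects straight to the local filesystem.
    response = emrclient.add_job_flow_steps(
        JobFlowId=CLUSTER_ID,
        Steps=[{
            'Name': 'Copy S3 objects to /home/hadoop',
            'ActionOnFailure': 'CONTINUE',
            'HadoopJarStep': {
                'Jar': 'command-runner.jar',
                'Args': ['aws', 's3', 'cp', SRC, DEST, '--recursive'],
            },
        }],
    )
    return response['StepIds']
```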

using local endpoint with boto2

对着背影说爱祢 submitted on 2020-01-05 04:12:12
Question: I am trying to mock AWS S3 API calls using boto2. I create a local S3 endpoint using localstack and can use it easily with boto3, as below: import boto3 s3_client = boto3.client('s3', endpoint_url='http://localhost:4572') bucket_name = 'my-bucket' s3_client.create_bucket(Bucket=bucket_name) But I have not found a way to do this using boto2. Is there any way, preferably via ~/.boto or ~/.aws/config? I tried providing the endpoint with boto2, but it failed. import boto boto.s3.S3RegionInfo(name='test-s3
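For reference, a minimal sketch of pointing boto2 at a custom endpoint programmatically (rather than via ~/.boto), assuming the same localstack endpoint on port 4572; this is an illustration of the connect_s3 parameters, not necessarily the resolution the original thread reached:

```python
import boto
from boto.s3.connection import OrdinaryCallingFormat

# Connect boto2 directly to the localstack S3 endpoint.
conn = boto.connect_s3(
    aws_access_key_id='dummy',
    aws_secret_access_key='dummy',
    host='localhost',
    port=4572,
    is_secure=False,                        # localstack serves plain HTTP here
    calling_format=OrdinaryCallingFormat(), # path-style addressing, no bucket subdomain
)

bucket = conn.create_bucket('my-bucket')
print([b.name for b in conn.get_all_buckets()])
```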

In Boto3, how to create a Paginator for list_objects with additional keyword arguments?

时光怂恿深爱的人放手 submitted on 2020-01-05 04:11:29
Question: I'm using a Paginator to iterate over the contents of an S3 bucket (following http://boto3.readthedocs.io/en/latest/guide/paginators.html#creating-paginators): client = boto3.client('s3') paginator = client.get_paginator('list_objects') page_iterator = paginator.paginate(Bucket=<my_bucket>) for page in page_iterator: for object in page['Contents']: key = object['Key'] In this example, the method name 'list_objects' is passed as a string. However, I would actually like to use a 'partial'
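The question is cut off above; assuming it is heading toward functools.partial, a short sketch of two ways to attach extra keyword arguments to the pagination. paginate() accepts the underlying list_objects parameters (Prefix, Delimiter, etc.), so a partial can pre-bind them; the bucket name and prefix here are hypothetical:

```python
import functools
import boto3

client = boto3.client('s3')
paginator = client.get_paginator('list_objects')

# Option 1: pass the extra list_objects arguments straight to paginate().
for page in paginator.paginate(Bucket='my-bucket', Prefix='logs/2020/'):
    for obj in page.get('Contents', []):
        print(obj['Key'])

# Option 2: pre-bind the arguments with functools.partial and call it later.
list_logs = functools.partial(paginator.paginate, Bucket='my-bucket', Prefix='logs/2020/')
for page in list_logs():
    for obj in page.get('Contents', []):
        print(obj['Key'])
```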

Sign PUT request to s3

喜夏-厌秋 submitted on 2020-01-05 03:54:06
Question: I am trying to make a call with curl to upload a file to S3 (eu-central-1) without using the awscli (this is a requirement) or boto3 for the upload. I am using Python and some methods from botocore to sign the request, as follows: import datetime from botocore.credentials import Credentials from botocore.handlers import calculate_md5 from botocore.awsrequest import AWSRequest from botocore.auth import S3SigV4Auth if __name__ == "__main__": access_key = '<ACCESS>' secret_key = '<SECRET>' bucket =
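The original snippet is truncated above; as a rough sketch of where that approach typically leads, the request can be built with AWSRequest and signed in place with S3SigV4Auth, after which the generated headers can be copied into a curl command. The bucket, key, and file name below are placeholders, not the asker's values:

```python
from botocore.credentials import Credentials
from botocore.awsrequest import AWSRequest
from botocore.auth import S3SigV4Auth

access_key = '<ACCESS>'
secret_key = '<SECRET>'
bucket = 'my-bucket'          # placeholder
key = 'path/to/object.bin'    # placeholder
region = 'eu-central-1'

with open('local-file.bin', 'rb') as f:   # placeholder file
    body = f.read()

url = 'https://{}.s3.{}.amazonaws.com/{}'.format(bucket, region, key)
request = AWSRequest(method='PUT', url=url, data=body)

# SigV4-sign the request in place; this adds the Authorization,
# X-Amz-Date, and X-Amz-Content-SHA256 headers.
S3SigV4Auth(Credentials(access_key, secret_key), 's3', region).add_auth(request)

# These headers can be passed to curl as -H 'Name: value' arguments.
for name, value in request.headers.items():
    print('{}: {}'.format(name, value))
```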

Easy Thumbnail with Django raising access denied error

雨燕双飞 submitted on 2020-01-05 03:45:11
Question: I'm using S3Boto3Storage to save documents in my AWS S3 bucket and tried to use easy-thumbnails to generate thumbnail images; please find the code below. Model class: class ThumbnailTestModel(models.Model): sample1 = models.FileField( storage=S3Boto3Storage(), help_text="Field to store the sample document of Professional", null=True, blank=True, upload_to=s3_professional_sample_storage_path) sample1_file_name = models.CharField(blank=True, null=True, max_length=1000, default=True) View class: class
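The excerpt ends before the error details. One common cause with easy-thumbnails plus django-storages is that the generated thumbnails go through a different storage backend (or through credentials/ACLs that lack access) than the uploaded originals. A hedged settings sketch assuming that is the issue; the setting names are the documented django-storages and easy-thumbnails ones, the values are placeholders, and this is not confirmed as the fix for the original question:

```python
# settings.py (sketch)
AWS_ACCESS_KEY_ID = '<your-access-key>'
AWS_SECRET_ACCESS_KEY = '<your-secret-key>'
AWS_STORAGE_BUCKET_NAME = 'my-bucket'          # placeholder
AWS_DEFAULT_ACL = None                         # defer to the bucket's own ACL/policy

# Send uploaded files and generated thumbnails through the same S3 backend.
DEFAULT_FILE_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'
THUMBNAIL_DEFAULT_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'
```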

Spark: How to generate file paths to read from S3 with Scala

廉价感情. submitted on 2020-01-05 03:35:45
Question: How do I generate and load multiple S3 file paths in Scala so that I can use: sqlContext.read.json("s3://..../*/*/*")? I know I can use wildcards to read multiple files, but is there any way to generate the paths? For example, my file structure looks like this: BucketName/year/month/day/files, e.g. s3://testBucket/2016/10/16/part00000. These files are all JSON. The issue is that I need to load only a specific duration of files, e.g. say 16 days, then I need to load files for the start day (Oct 16
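A sketch of the path-generation idea, shown in Python/PySpark for illustration (the original question asks for Scala, where the same date arithmetic applies). It builds one explicit path per day from the example layout and relies on read.json accepting a list of paths; the bucket name and date range are taken from the example above:

```python
from datetime import date, timedelta
from pyspark.sql import SparkSession

def daily_paths(bucket, start, days):
    """Build one explicit path per day: s3://bucket/YYYY/M/D/*"""
    return [
        's3://{}/{}/{}/{}/*'.format(bucket, d.year, d.month, d.day)
        for d in (start + timedelta(n) for n in range(days))
    ]

spark = SparkSession.builder.getOrCreate()
paths = daily_paths('testBucket', date(2016, 10, 16), 16)

# DataFrameReader.json accepts a list of paths, so no wildcard over
# unrelated days is needed.
df = spark.read.json(paths)
```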

Using amazon data pipeline to backup dynamoDB data to S3

生来就可爱ヽ(ⅴ<●) submitted on 2020-01-05 03:30:58
Question: I need to back up my DynamoDB table data to S3 using Amazon Data Pipeline. My question is: can I use a single data pipeline to back up multiple DynamoDB tables to S3, or do I have to make a separate pipeline for each of them? Also, since my tables have a year_month prefix (e.g. 2014_3_tableName), I was thinking of using the Data Pipeline SDK to change the table name in the pipeline definition once the month changes. Will this work? Is there an alternate/better way? Thanks! Answer 1: If you are setting up
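The answer above is truncated; as an illustration of the "update the definition when the month changes" idea, a hedged boto3 sketch that rewrites a table-name field in a pipeline definition. The pipeline ID is hypothetical, and the assumption that the DynamoDB node exposes a tableName field should be checked against the actual definition:

```python
import boto3

dp = boto3.client('datapipeline', region_name='us-east-1')
PIPELINE_ID = 'df-XXXXXXXXXXXX'   # hypothetical pipeline id
NEW_TABLE = '2014_4_tableName'    # the table for the new month

# Fetch the current definition, rewrite the table name field,
# then push the definition back and re-activate the pipeline.
definition = dp.get_pipeline_definition(pipelineId=PIPELINE_ID)

for obj in definition['pipelineObjects']:
    for field in obj.get('fields', []):
        if field.get('key') == 'tableName':   # assumes the DynamoDB node uses this field
            field['stringValue'] = NEW_TABLE

dp.put_pipeline_definition(
    pipelineId=PIPELINE_ID,
    pipelineObjects=definition['pipelineObjects'],
)
dp.activate_pipeline(pipelineId=PIPELINE_ID)
```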

JGit S3 support for Standard-US buckets only?

我们两清 submitted on 2020-01-05 02:46:09
Question: Is it possible to use locations other than US Standard for S3 buckets with JGit (i.e. through the config file, etc.), or am I doing something wrong here? If I try to use an S3 bucket with JGit that is located in the EU, JGit throws an error: jgit push origin master Counting objects: 3 Finding sources: 100% (3/3) Getting sizes: 100% (2/2) Compressing objects: 100% (1/1) Writing objects: 100% (3/3) java.lang.NullPointerException at org.eclipse.jgit.transport.AmazonS3.error(AmazonS3.java:518) at

Choosing a cloud storage service with a web API that can FTP to a third-party server [closed]

点点圈 submitted on 2020-01-04 14:01:22
Question (closed as off-topic for Stack Overflow): I need to find a storage service that can programmatically (via a REST API) send (FTP) a file to a third-party service. I was thinking of using Amazon S3, but I found a previous similar question here: Sending file from S3 to third party FTP server using CloudFront, and apparently it can't be done. What I want to avoid is

Preventing a user from even knowing about other users (folders) on AWS S3

允我心安 submitted on 2020-01-04 13:47:52
Question: I have a question about writing IAM policies for AWS S3 that was partially answered here, in this nice post by Jim Scharf: https://aws.amazon.com/blogs/security/writing-iam-policies-grant-access-to-user-specific-folders-in-an-amazon-s3-bucket/ Taking Jim's post as a starting point, what I am trying to achieve is preventing a user from even knowing about the existence of other users who have access to the same bucket while using the S3 console. Jim's solution, as well as others I've found,
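The excerpt stops mid-sentence. For context, the pattern from the linked blog post restricts s3:ListBucket with an s3:prefix condition keyed on ${aws:username}; a rough sketch of that policy, expressed as a Python dict, is below. The bucket name and the "home/" prefix are placeholders from the blog's example layout, and omitting the blog's root-listing statement is what hides sibling folders, at the cost of console navigation to the user's own folder:

```python
import json

BUCKET = 'my-company-bucket'   # placeholder bucket name

# Restrict listing to the user's own folder, so the console cannot
# enumerate sibling "home/<other-user>/" prefixes.
user_folder_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowListingOfOwnFolderOnly",
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::{}".format(BUCKET),
            "Condition": {
                "StringLike": {"s3:prefix": ["home/${aws:username}/*"]}
            },
        },
        {
            "Sid": "AllowAllActionsInOwnFolder",
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::{}/home/${{aws:username}}/*".format(BUCKET),
        },
    ],
}

print(json.dumps(user_folder_policy, indent=2))
```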