amazon-s3

CloudFront and Lambda@Edge: Remove response header

我怕爱的太早我们不能终老 submitted on 2020-06-15 05:59:10
Question: I am trying to remove some headers from a CloudFront response using Lambda@Edge on the ViewerResponse event. The origin is an S3 bucket. I have been able to change a header like this: exports.handler = (event, context, callback) => { const response = event.Records[0].cf.response; response.headers.server = [{'key': 'server', 'value': 'bunny'}]; callback(null, response); }; However, it does not seem to work to remove headers altogether, e.g. like this: exports.handler = (event, context
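
The snippet in the question is for the Node.js runtime. As a reference, here is a minimal sketch of the same idea for a Lambda@Edge function on the Python runtime, where the response headers are a dict keyed by lowercase header name and dropping a key removes that header from the response. The header names below are only examples, and note that CloudFront treats some headers as read-only depending on the trigger, so an origin-response trigger is sometimes needed instead of a viewer-response one.

```python
# Hypothetical Lambda@Edge handler (Python runtime). Deleting a key from the
# headers dict removes that header from the response CloudFront returns.
def handler(event, context):
    response = event['Records'][0]['cf']['response']
    headers = response['headers']

    # Example header names; drop them if present.
    for name in ('server', 'x-amz-request-id', 'x-amz-id-2'):
        headers.pop(name, None)

    return response
```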

How do I map multiple domains to the same bucket on Amazon S3?

走远了吗. submitted on 2020-06-13 20:01:09
Question: Is it possible to do that? I need to be able to access mydomain.com by typing my-domain.com into the address bar of the browser. I added a DNS entry: my-domain.com CNAME mydomain.com But this doesn't seem to work; I get a 404 Not Found error. Answer 1: You can only map a single domain to your S3 bucket. However, you could use CloudFront to do this. See my answer to another similar question for more information. Answer 2: We had the same issue, and basically I set our CI to publish to two S3 buckets
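
Besides the CloudFront approach from Answer 1, a common pattern is a second bucket named after the alternate domain that redirects every request to the primary domain, with the alternate domain's DNS pointing at that bucket's S3 website endpoint. Below is a minimal boto3 sketch of that idea, reusing the domain names from the question as placeholders.

```python
import boto3

s3 = boto3.client('s3')

# The redirect bucket's name must match the alternate domain exactly.
# (Outside us-east-1, create_bucket also needs a CreateBucketConfiguration
# with a LocationConstraint.)
s3.create_bucket(Bucket='my-domain.com')

# Redirect every request to the primary domain.
s3.put_bucket_website(
    Bucket='my-domain.com',
    WebsiteConfiguration={
        'RedirectAllRequestsTo': {
            'HostName': 'mydomain.com',
            'Protocol': 'http',
        }
    },
)
```

The my-domain.com DNS record then points at that bucket's S3 website endpoint (e.g. via a CNAME), not at mydomain.com directly.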

I would like to export a DynamoDB table to an S3 bucket in CSV format using Python (Boto3)

…衆ロ難τιáo~ submitted on 2020-06-13 08:20:13
Question: This question has been asked earlier in the following link: How to write dynamodb scan data's in CSV and upload to s3 bucket using python? I have amended the code as advised in the comments. The code now looks as follows: import csv import boto3 import json dynamodb = boto3.resource('dynamodb') db = dynamodb.Table('employee_details') def lambda_handler(event, context): AWS_BUCKET_NAME = 'session5cloudfront' s3 = boto3.resource('s3') bucket = s3.Bucket(AWS_BUCKET_NAME) path = '/tmp/' +
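
The code in the question is cut off, but the general pattern it follows is: scan the table, write the items to a CSV under /tmp, and upload the file with boto3. Here is a minimal sketch of that pattern, reusing the table and bucket names from the question; the output key and the handling of the header row are assumptions.

```python
import csv
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('employee_details')   # table name from the question
s3 = boto3.resource('s3')
BUCKET = 'session5cloudfront'                # bucket name from the question


def lambda_handler(event, context):
    # Scan the whole table, following LastEvaluatedKey for results over 1 MB.
    items = []
    response = table.scan()
    items.extend(response['Items'])
    while 'LastEvaluatedKey' in response:
        response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
        items.extend(response['Items'])

    if not items:
        return {'status': 'empty table'}

    # Write the items to a CSV in /tmp (the only writable path in Lambda),
    # using the union of all attribute names as the header row.
    fieldnames = sorted({key for item in items for key in item})
    path = '/tmp/employee_details.csv'
    with open(path, 'w', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(items)

    # Upload the file to S3.
    s3.Bucket(BUCKET).upload_file(path, 'employee_details.csv')
    return {'status': 'ok', 'rows': len(items)}
```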

Ungzipping chunks of bytes from S3 using iter_chunks()

冷暖自知 submitted on 2020-06-13 05:01:35
Question: I am encountering issues ungzipping chunks of bytes that I am reading from S3 using the iter_chunks() method from boto3. The strategy of ungzipping the file chunk by chunk originates from this issue. The code is as follows: dec = zlib.decompressobj(32 + zlib.MAX_WBITS) for chunk in app.s3_client.get_object(Bucket=bucket, Key=key)["Body"].iter_chunks(2 ** 19): data = dec.decompress(chunk) print(len(chunk), len(data)) # 524288 65505 # 524288 0 # 524288 0 # ... This code initially prints out
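
The output shown (the first chunk decompresses to roughly 64 KB and later chunks decompress to 0 bytes) is consistent with an object made of several concatenated gzip members: zlib's decompressobj stops at the end of the first member and leaves the remaining bytes in unused_data. Below is a sketch of one way to handle that case by restarting the decompressor whenever it reports eof; the bucket, key, and chunk size are placeholders following the question.

```python
import zlib

import boto3

s3_client = boto3.client('s3')


def ungzip_stream(chunks):
    """Yield decompressed bytes from an iterable of gzip-compressed chunks.

    Handles objects built from several concatenated gzip members: when the
    decompressor reaches the end of one member (dec.eof), the leftover bytes
    sit in dec.unused_data, so a fresh decompressor is started on them.
    """
    dec = zlib.decompressobj(32 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = dec.decompress(chunk)
        if data:
            yield data
        while dec.eof and dec.unused_data:
            leftover = dec.unused_data
            dec = zlib.decompressobj(32 + zlib.MAX_WBITS)
            data = dec.decompress(leftover)
            if data:
                yield data


body = s3_client.get_object(Bucket='my-bucket', Key='my-key.gz')['Body']
for data in ungzip_stream(body.iter_chunks(2 ** 19)):
    print(len(data))
```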

Duplicate partition columns on write to S3

*爱你&永不变心* submitted on 2020-06-12 15:41:55
Question: I'm processing data and writing it to S3 using the following code: spark = SparkSession.builder.config('spark.sql.sources.partitionOverwriteMode', 'dynamic').getOrCreate() df = spark.read.parquet('s3://<some bucket>/<some path>').filter(F.col('processing_hr') == <val>) transformed_df = do_lots_of_transforms(df) # here's the important bit on how I'm writing it out transformed_df.write.mode('overwrite').partitionBy('processing_hr').parquet('s3://bucket_name/location') Basically, I'm trying to
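
The question is cut off, but for reference here is a minimal, self-contained sketch of the dynamic-partition-overwrite pattern the snippet is using: with spark.sql.sources.partitionOverwriteMode set to 'dynamic', mode('overwrite') replaces only the partitions present in the dataframe being written rather than the whole output path. The paths, the filter value, and the transform are placeholders.

```python
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    # With 'dynamic', overwrite replaces only the partitions that appear in
    # the dataframe being written, not the entire target directory.
    .config('spark.sql.sources.partitionOverwriteMode', 'dynamic')
    .getOrCreate()
)

df = (
    spark.read.parquet('s3://source-bucket/input/')
    .filter(F.col('processing_hr') == '2020-06-12-15')
)

transformed_df = df  # stand-in for do_lots_of_transforms(df)

(
    transformed_df.write
    .mode('overwrite')
    .partitionBy('processing_hr')
    .parquet('s3://target-bucket/output/')
)
```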

Saving a CSV file to S3 using boto3

依然范特西╮ submitted on 2020-06-12 08:04:52
Question: I am trying to write a CSV file and save it to a specific (existing) folder in S3. This is my code: from io import BytesIO import pandas as pd import boto3 s3 = boto3.resource('s3') d = {'col1': [1, 2], 'col2': [3, 4]} df = pd.DataFrame(data=d) csv_buffer = BytesIO() bucket = 'bucketName/folder/' filename = "test3.csv" df.to_csv(csv_buffer) content = csv_buffer.getvalue() def to_s3(bucket,filename,content): s3.Object(bucket,filename).put(Body=content) to_s3(bucket,filename,content) this is the
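
Two things stand out in the snippet: the bucket name is given as 'bucketName/folder/', but a bucket name cannot contain a path, so the folder belongs in the object key; and DataFrame.to_csv writes text, so a StringIO buffer is the more natural fit than BytesIO. Here is a sketch with those two changes, keeping the names from the question as placeholders.

```python
from io import StringIO

import boto3
import pandas as pd

s3 = boto3.resource('s3')

df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# to_csv writes text, so use a text buffer.
csv_buffer = StringIO()
df.to_csv(csv_buffer, index=False)

# The bucket name must not include the folder; the "folder" is a key prefix.
bucket = 'bucketName'
key = 'folder/test3.csv'

s3.Object(bucket, key).put(Body=csv_buffer.getvalue())
```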
