UTF-8 filename in S3 bucket

Submitted by 拥有回忆 on 2021-02-09 11:13:14

Question


Is it possible to add a key to S3 with a UTF-8 encoded name like "åøæ.jpg"?

I'm getting the following error when uploading with boto:

<Error><Code>InvalidURI</Code><Message>Couldn't parse the specified URI.</Message>

Answer 1:


@2083: This is a fairly old question, but in case you haven't found a solution yet, and for anyone else who lands here looking for an answer like I did:

From the official documentation (http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html):

Although you can use any UTF-8 characters in an object key name, the following key naming best practices help ensure maximum compatibility with other applications. Each application may parse special characters differently. The following guidelines help you maximize compliance with DNS, web safe characters, XML parsers, and other APIs.

Safe Characters

The following character sets are generally safe for use in key names:

Alphanumeric characters [0-9a-zA-Z]

Special characters !, -, _, ., *, ', (, and )

The following are examples of valid object key names:

4my-organization

my.great_photos-2014/jan/myvacation.jpg

videos/2014/birthday/video1.wmv
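
As a rough illustration of the guideline quoted above, here is a minimal Java sketch that checks whether a key sticks to the "safe" characters. `SafeKeyCheck` and `isSafeKey` are hypothetical names, not part of the AWS SDK, and "/" is allowed only on the assumption that it counts as a delimiter, since the documentation's own example keys contain it. The question's example name is written with Unicode escapes so the snippet compiles regardless of source encoding.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the AWS SDK.
final class SafeKeyCheck {
    // Alphanumerics plus ! - _ . * ' ( ) from the quoted list.
    // "/" is added as an assumption, because the example keys use it as a delimiter.
    private static final Pattern SAFE = Pattern.compile("[0-9a-zA-Z!\\-_.*'()/]+");

    // Returns true only if the key is non-empty and every character is in the safe set.
    static boolean isSafeKey(String key) {
        return SAFE.matcher(key).matches();
    }

    public static void main(String[] args) {
        // One of the documentation's example keys.
        System.out.println(isSafeKey("my.great_photos-2014/jan/myvacation.jpg")); // true
        // The question's example name: a-ring, o-slash, ae-ligature, then ".jpg".
        System.out.println(isSafeKey("\u00e5\u00f8\u00e6.jpg")); // false
    }
}
```

A key that fails this check is still a valid S3 key; the check only flags names that fall outside the compatibility guideline.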

However, if what you really want, as I did, is a download filename that allows UTF-8 characters (note that this can differ from the key name), there is a way to do it!

From http://www.bennadel.com/blog/2591-embedding-foreign-characters-in-your-content-disposition-filename-header.htm and http://www.bennadel.com/blog/2696-overriding-content-type-and-content-disposition-headers-in-amazon-s3-pre-signed-urls.htm (kudos to Ben Nadel): you can do that by making sure that S3 overrides the Content-Disposition header when the file is downloaded.

I did this in Java, so I'm including that code here; I'm sure you'll be able to translate it to Python easily :)

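        import java.io.UnsupportedEncodingException;
        import java.net.URLEncoder;
        import com.amazonaws.services.s3.AmazonS3;
        import com.amazonaws.services.s3.model.ObjectMetadata;
        import com.amazonaws.services.s3.model.PutObjectRequest;
        import com.amazonaws.services.s3.model.StorageClass;

        // The statements below belong inside a method. S3Controller and LOG are my own
        // helpers; fileName (String) and file (presumably a java.io.File) come from the caller.
        AmazonS3 s3 = S3Controller.getS3Client();

        // Drop the application-specific prefix up to and including the first "-".
        // If there is no "-", indexOf returns -1 and the whole name is kept.
        String displayName = fileName.substring(fileName.indexOf("-") + 1);

        // As per http://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html:
        // keep only a conservative subset of the "safe" characters in the key.
        String key = displayName.replaceAll("[^a-zA-Z0-9._]", "");
        PutObjectRequest putObjectRequest = new PutObjectRequest(
                S3Controller.bucketNameForBucket(S3Controller.Bucket.EXPORT_BUCKET),
                key,
                file);
        // These files can always be regenerated, so StorageClass.ReducedRedundancy
        // would also be an option; Standard is used here.
        putObjectRequest.setStorageClass(StorageClass.Standard);

        // Fall back to the ASCII key if the UTF-8 name cannot be encoded.
        String urlEncodedUTF8Filename = key;
        try {
            //http://www.bennadel.com/blog/2696-overriding-content-type-and-content-disposition-headers-in-amazon-s3-pre-signed-urls.htm
            //http://www.bennadel.com/blog/2591-embedding-foreign-characters-in-your-content-disposition-filename-header.htm
            //Issue#179
            // URLEncoder writes spaces as "+", which filename* treats literally, so use %20.
            urlEncodedUTF8Filename = URLEncoder.encode(displayName, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            LOG.warn("Could not URLEncode a filename. Original Filename: " + fileName, e);
        }

        // filename= is the ASCII fallback; filename*= carries the percent-encoded UTF-8 name.
        ObjectMetadata metadata = new ObjectMetadata();
        metadata.setContentDisposition("attachment; filename=\"" + key + "\"; filename*=UTF-8''" + urlEncodedUTF8Filename);
        putObjectRequest.setMetadata(metadata);

        s3.putObject(putObjectRequest);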

It should help :)
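
To make the header format easier to see on its own, here is a minimal sketch that builds the Content-Disposition value using only the Java standard library. `ContentDispositionSketch` and `build` are hypothetical names chosen for illustration, and the sketch assumes the ASCII fallback name has already been sanitized (no quotes or backslashes). The sample name is written with Unicode escapes so the snippet compiles regardless of source encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of the AWS SDK.
final class ContentDispositionSketch {
    // asciiFallback: name for clients that ignore filename* (assumed already sanitized).
    // utf8Name: the name the user should see, possibly containing non-ASCII characters.
    static String build(String asciiFallback, String utf8Name) {
        try {
            // URLEncoder targets HTML forms: it writes spaces as "+" and leaves "*" as-is,
            // so both are converted to percent escapes for use in filename*.
            String encoded = URLEncoder.encode(utf8Name, "UTF-8")
                    .replace("+", "%20")
                    .replace("*", "%2A");
            return "attachment; filename=\"" + asciiFallback + "\"; filename*=UTF-8''" + encoded;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // a-ring, o-slash, ae-ligature, a space, then "1.jpg".
        System.out.println(build("1.jpg", "\u00e5\u00f8\u00e6 1.jpg"));
        // attachment; filename="1.jpg"; filename*=UTF-8''%C3%A5%C3%B8%C3%A6%201.jpg
    }
}
```

The resulting string is what would be passed to `metadata.setContentDisposition(...)` in the upload code above.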




Answer 2:


From AWS FAQ: A key is a sequence of Unicode characters whose UTF-8 encoding is at most 1024 bytes long.

From my experience, use ASCII.
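
The 1024-byte limit applies to the UTF-8 encoding, not to the character count, so non-ASCII names use up the budget faster. A minimal sketch of that check, with `KeyLengthCheck`, `utf8Length`, and `fitsKeyLimit` as hypothetical names chosen for illustration:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of the AWS SDK.
final class KeyLengthCheck {
    // Limit taken from the FAQ sentence quoted above.
    static final int MAX_KEY_BYTES = 1024;

    // Number of bytes the key occupies once encoded as UTF-8.
    static int utf8Length(String key) {
        return key.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the encoded key fits within the limit.
    static boolean fitsKeyLimit(String key) {
        return utf8Length(key) <= MAX_KEY_BYTES;
    }

    public static void main(String[] args) {
        // The question's example name: 7 characters, but each of the first three
        // takes 2 bytes in UTF-8, so the key is 10 bytes long.
        String key = "\u00e5\u00f8\u00e6.jpg";
        System.out.println(key.length());      // 7
        System.out.println(utf8Length(key));   // 10
        System.out.println(fitsKeyLimit(key)); // true
    }
}
```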



Source: https://stackoverflow.com/questions/21074800/utf-8-filename-in-s3-bucket
