GSUtil rsync gives a 400 non-retryable exception on S3 bucket

Submitted by 邮差的信 on 2019-12-12 13:15:24

Question


I'm using gsutil rsync to copy from S3 to GCS, and I get the following error after gsutil has gone partway through a bucket:

Caught non-retryable exception while listing s3://[bucket]/: BadRequestException: 400 None
CommandException: Caught non-retryable exception - aborting rsync

This is undesirable, because I can manually copy other files from S3 to GCS. I can't work around it with the "-C" switch, since the error occurs while listing the bucket, not while copying.

Edit: It appears that if an S3 filename contains a "#", gsutil replaces it with "?versionId=". For example:

S3 filename: Updaet#2_Montgomery Building Permits.xlsx

gsutil lists it in the debug output as: Updaet?versionId=2_Montgomery Building Permits.xlsx


Answer 1:


Can you please provide more details about this failure by running:

gsutil -D rsync your-source your-destination

and then excerpting the HTTP request/response that shows the error? When you do, please redact the authorization: header.

If you'd prefer not to post the details of your request on the public forum, you can email them to me at gs-team@google.com

Thanks.




Answer 2:


This same thing happened to me yesterday, and the '#' is indeed the problem.

The issue appears to be in boto, not necessarily gsutil, though I don't know exactly where the fix belongs. BotoTranslation._StorageUriForObject() calls boto.storage_uri(), which uses VERSION_RE ('(?P<versionless_uri_str>.+)#(?P<version_id>.+)$') to find a version in the uri_str/path. If the object name contains a '#', everything after it is therefore treated as an S3 version ID. I don't see any way at present to escape or encode the '#' so that it isn't treated as a version separator.
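To make the effect of that pattern concrete, here is a minimal sketch that applies the regular expression quoted above to the filename from the question. It uses only Python's standard re module. The pattern is copied from this answer rather than imported from boto, and split_version is a hypothetical helper name for illustration, not a boto or gsutil function. It does not reproduce how boto actually applies the pattern.

```python
import re

# Pattern as quoted in the answer above; copied here, not imported from boto.
VERSION_RE = re.compile(r'(?P<versionless_uri_str>.+)#(?P<version_id>.+)$')


def split_version(path):
    """Hypothetical helper: split a path the way the quoted pattern would.

    Returns (versionless_part, version_id), or (path, None) when the
    pattern does not match.
    """
    match = VERSION_RE.search(path)
    if match is None:
        return path, None
    return match.group('versionless_uri_str'), match.group('version_id')


if __name__ == '__main__':
    name = 'Updaet#2_Montgomery Building Permits.xlsx'
    versionless, version_id = split_version(name)
    print('object name seen:', versionless)
    print('version id seen: ', version_id)
```

With the filename from the question, the part before the '#' becomes the object name and the rest becomes the version ID, which is consistent with the "?versionId=" text in the debug output. Because the first group is greedy, a name with several '#' characters is split at the last one. Running a check like this over the source bucket's object names before an rsync would at least identify which objects are affected.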



Source: https://stackoverflow.com/questions/28582964/gsutil-rsync-gives-a-400-non-retryable-exception-on-s3-bucket
