How to get the first 100 lines of a file on S3?

Submitted by 冷眼眸甩不掉的悲伤 on 2020-01-03 16:44:42

Question


I have a huge (~6 GB) file on Amazon S3 and want to get the first 100 lines of it without having to download the whole thing. Is this possible?

Here's what I'm doing now:

aws s3 cp s3://foo/bar - | head -n 100

But this takes a while to execute. I'm confused -- shouldn't head close the pipe once it has read enough lines, causing aws s3 cp to crash with a BrokenPipeError before it has time to download the entire file?


Answer 1:


Using the Range HTTP header in a GET request, you can retrieve a specific range of bytes from an object stored in Amazon S3 (see http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGET.html).

If you use the AWS CLI, you can run aws s3api get-object --range bytes=0-xxx, where xxx is the offset of the last byte you want (see http://docs.aws.amazon.com/cli/latest/reference/s3api/get-object.html).

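A byte range is not the same as a number of lines, but it lets you retrieve just the beginning of the file and so avoid downloading the full object. Pick a range large enough to cover the lines you need, then trim the result with head.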
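
For illustration, here is a minimal sketch of how the range request and head could be combined. The function name s3_head, the temporary file, and the 1 MiB default range are my own assumptions, not something from the AWS docs; foo and bar are the placeholder bucket and key from the question. If the first 100 lines are longer than the requested range, the last line will be cut off, so the range may need to be increased.

```shell
#!/usr/bin/env bash
# Sketch only: s3_head is a hypothetical helper, not part of the AWS CLI.
# Usage: s3_head BUCKET KEY [LINES] [BYTES]
s3_head() {
  local bucket="$1" key="$2" lines="${3:-100}" bytes="${4:-1048576}"
  local tmp
  tmp="$(mktemp)" || return 1

  # Ask S3 for the first $bytes bytes only (the range is inclusive, hence the -1).
  # get-object writes the body to the outfile and prints JSON metadata on stdout,
  # so the metadata is discarded to keep it out of the result.
  if ! aws s3api get-object --bucket "$bucket" --key "$key" \
        --range "bytes=0-$((bytes - 1))" "$tmp" > /dev/null; then
    rm -f "$tmp"
    return 1
  fi

  # Trim the downloaded chunk to the requested number of lines.
  head -n "$lines" "$tmp"
  rm -f "$tmp"
}
```

With the bucket and key from the question this would be called as s3_head foo bar 100, which transfers at most 1 MiB instead of the whole ~6 GB object.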



Source: https://stackoverflow.com/questions/39258347/how-to-get-the-first-100-lines-of-a-file-on-s3
