Listing directories at a given level in Amazon S3

Happy的楠姐 2021-01-03 05:25

I am storing two million files in an Amazon S3 bucket. There is a given root (l1) below, a list of directories under l1, and each directory contains files. So my bucket

2 answers
  •  情歌与酒
    2021-01-03 06:08

    This thread is quite old, but I ran into this issue recently and wanted to add my two cents...

    It seems to be a hassle and a half to cleanly list the folders under a given path in an S3 bucket. Most of the current gem wrappers around the S3 API (the official AWS-SDK, S3) don't correctly parse the return object (specifically the CommonPrefixes), so it is difficult to get back a list of folders (delimiter nightmares).
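
To make the CommonPrefixes point concrete, here is a minimal sketch of pulling the folder names out of a listing response with REXML. The XML is a hypothetical, trimmed-down response (no namespace, made-up bucket and directory names) for a request with prefix "l1/" and delimiter "/"; the helper names common_prefixes and folder_names are mine, not part of any gem. It assumes Ruby 2.5 or later and that REXML is available (it ships as a bundled gem in recent Ruby versions).

```ruby
require 'rexml/document'

# Hypothetical, simplified response for prefix "l1/" and delimiter "/".
# A real response carries more elements and an XML namespace.
SAMPLE_XML = <<~XML
  <ListBucketResult>
    <Name>example-bucket</Name>
    <Prefix>l1/</Prefix>
    <Delimiter>/</Delimiter>
    <IsTruncated>false</IsTruncated>
    <CommonPrefixes><Prefix>l1/dir1/</Prefix></CommonPrefixes>
    <CommonPrefixes><Prefix>l1/dir2/</Prefix></CommonPrefixes>
  </ListBucketResult>
XML

# Collect the raw CommonPrefixes/Prefix values (the "directories").
# The top-level <Prefix> element is not matched by this path.
def common_prefixes(xml)
  prefixes = []
  REXML::Document.new(xml).elements.each('ListBucketResult/CommonPrefixes/Prefix') do |e|
    prefixes << e.text
  end
  prefixes
end

# Strip the leading request prefix and the trailing delimiter,
# leaving only the folder name at this level.
def folder_names(xml, prefix, delimiter = '/')
  common_prefixes(xml).map { |p| p.delete_prefix(prefix).chomp(delimiter) }
end

puts folder_names(SAMPLE_XML, 'l1/').inspect
```

The key idea is that, with a delimiter set, S3 groups keys sharing everything up to the next delimiter into one CommonPrefixes entry, so a bucket with millions of files under l1/ only returns one entry per directory at that level.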

    Here is a quick fix for those using the S3 gem... Sorry it isn't one-size-fits-all, but it's the best I wanted to do.

    https://github.com/qoobaa/s3/issues/61

    Code snippet:

    module S3
      class Bucket
        # this method recurses if the response coming back
        # from S3 includes a truncation flag (IsTruncated == 'true')
        # then parses the combined response(s) XML body
        # for CommonPrefixes/Prefix AKA directories
        def directory_list(options = {}, responses = [])
          options = {:delimiter => "/"}.merge(options)
          response = bucket_request(:get, :params => options)
    
          if is_truncated?(response.body)
            directory_list(options.merge({:marker => next_marker(response.body)}), responses << response.body)
          else
            parse_xml_array(responses + [response.body], options)
          end
        end
    
        private
    
        def parse_xml_array(xml_array, options = {}, clean_path = true)
          names = []
          xml_array.each do |xml|
            rexml_document(xml).elements.each("ListBucketResult/CommonPrefixes/Prefix") do |e|
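              if clean_path
                # strip only the leading prefix and the trailing delimiter;
                # gsub would also remove matches in the middle of the name
                names << e.text.sub(/\A#{Regexp.escape(options[:prefix].to_s)}/, '').chomp(options[:delimiter].to_s)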
              else
                names << e.text
              end
            end
          end
          names
        end
    
        def next_marker(xml)
          marker = nil
          rexml_document(xml).elements.each("ListBucketResult/NextMarker") {|e| marker ||= e.text }
          if marker.nil?
            raise StandardError, "response is truncated but contains no NextMarker"
          else
            marker
          end
        end
    
        def is_truncated?(xml)
          is_truncated = nil
          rexml_document(xml).elements.each("ListBucketResult/IsTruncated") {|e| is_truncated ||= e.text }
          is_truncated == 'true'
        end
      end
    end
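
For what it's worth, the same truncation handling can also be written as a loop rather than recursion. The following is only a sketch under stated assumptions: FETCH_PAGE is a stand-in lambda that returns canned, hypothetical XML pages instead of calling S3, and first_text and all_common_prefixes are illustrative names that do not exist in the S3 gem.

```ruby
require 'rexml/document'

# Two hypothetical pages of a truncated listing, trimmed to the relevant
# elements and keyed by the marker that would be sent with the request.
PAGES = {
  nil => '<ListBucketResult><IsTruncated>true</IsTruncated>' \
         '<NextMarker>l1/b/</NextMarker>' \
         '<CommonPrefixes><Prefix>l1/a/</Prefix></CommonPrefixes>' \
         '<CommonPrefixes><Prefix>l1/b/</Prefix></CommonPrefixes>' \
         '</ListBucketResult>',
  'l1/b/' => '<ListBucketResult><IsTruncated>false</IsTruncated>' \
             '<CommonPrefixes><Prefix>l1/c/</Prefix></CommonPrefixes>' \
             '</ListBucketResult>'
}.freeze

# Stand-in for the HTTP request; a real version would pass the marker to S3.
FETCH_PAGE = ->(marker) { PAGES.fetch(marker) }

# Text of the first element matching the path, or nil if there is none.
def first_text(doc, path)
  element = doc.elements[path]
  element && element.text
end

# Keep requesting pages until IsTruncated is no longer 'true',
# accumulating every CommonPrefixes/Prefix value along the way.
def all_common_prefixes(fetch_page)
  prefixes = []
  marker = nil
  loop do
    doc = REXML::Document.new(fetch_page.call(marker))
    doc.elements.each('ListBucketResult/CommonPrefixes/Prefix') { |e| prefixes << e.text }
    break unless first_text(doc, 'ListBucketResult/IsTruncated') == 'true'

    marker = first_text(doc, 'ListBucketResult/NextMarker')
    raise 'truncated response without NextMarker' if marker.nil?
  end
  prefixes
end

puts all_common_prefixes(FETCH_PAGE).inspect
```

A loop avoids growing the call stack when a listing spans many pages, which may matter for a bucket with a very large number of directories. As far as I know, S3 only includes NextMarker in the response when a delimiter was specified, which is the case here.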
    
