Regex with named capture groups getting all matches in Ruby

滥情空心 2021-02-02 08:12

I have a string:

s="123--abc,123--abc,123--abc"

I tried using Ruby 1.9's new feature "named groups" to fetch all named group info:

10 answers
  •  渐次进展
    2021-02-02 09:05

    A year ago I wanted regular expressions that were easier to read and that named their captures, so I made the following addition to String (it should maybe not live there, but it was convenient at the time):

    scan2.rb:

    class String  
      #Works like scan but stores the result in a hash indexed by variable/constant names (regexp PLACEHOLDERS) within parentheses.
      #Example: Given the (constant) strings BTF, RCVR and SNDR and the regexp /#BTF# (#RCVR#) (#SNDR#)/
      #the matches will be returned in a hash with the keys :RCVR and :SNDR.
      #Note: The #STRING_VARIABLE_OR_CONST# syntax has to be used. All occurrences of #STRING# work like #{STRING},
      #but the marker syntax is needed for the method to see the names to be used as indices.
      def scan2(regexp2_str, mark='#')
        regexp              = regexp2_str.to_re(mark)                       #Evaluates the strings. Note: Must be reachable from here!
        hash_indices_array  = regexp2_str.scan(/\(#{mark}(.*?)#{mark}\)/).flatten #Look for string variable names within (#VAR#), where # is the mark character
        match_array         = self.scan(regexp)
    
        #Save matches in hash indexed by string variable names:
        match_hash = Hash.new
        match_array.flatten.each_with_index do |m, i|
          match_hash[hash_indices_array[i].to_sym] = m
        end
        return match_hash  
      end
    
      def to_re(mark='#')
        re = /#{mark}(.*?)#{mark}/
        return Regexp.new(self.gsub(re){eval $1}, Regexp::MULTILINE)    #Evaluates the strings, creates RE. Note: Variables must be reachable from here!
      end
    
    end
    

    Example usage (irb1.9):

    > load 'scan2.rb'
    > AREA = '\d+'
    > PHONE = '\d+'
    > NAME = '\w+'
    > "1234-567890 Glenn".scan2('(#AREA#)-(#PHONE#) (#NAME#)')
    => {:AREA=>"1234", :PHONE=>"567890", :NAME=>"Glenn"}
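
For comparison, a similar result can probably be had with Ruby's built-in named groups, `(?<name>...)`, without extending String. The sketch below runs on the string from the question and collects one hash per match. The helper name `all_named_matches` and the group names `number` and `chars` are made up for illustration; the only library calls used are `String#scan`, `Regexp.last_match` and `MatchData#names`.

```ruby
# Collect every match of a regexp with named groups, one hash per match.
# `all_named_matches` is a hypothetical helper name, not part of Ruby.
def all_named_matches(string, regexp)
  matches = []
  # When scan is given a block, the last-match data is set for each match,
  # so Regexp.last_match returns the MatchData of the current match.
  string.scan(regexp) { matches << Regexp.last_match }
  matches.map do |m|
    # MatchData#names lists the group names; m[name] reads each capture.
    m.names.each_with_object({}) { |name, h| h[name.to_sym] = m[name] }
  end
end

s = "123--abc,123--abc,123--abc"
result = all_named_matches(s, /(?<number>\d+)--(?<chars>\w+)/)
# result holds three hashes, each with :number => "123" and :chars => "abc"
p result
```

Returning an array of hashes keeps repeated matches apart, which a single hash keyed by group name cannot do.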
    

    Notes:

    Of course, it would have been more elegant to put the patterns (e.g. AREA, PHONE, ...) in a hash and pass that hash to scan2 as an additional argument.
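
The note above about passing the patterns in a hash could look roughly like the following. This is only a sketch, not the original scan2: the method name `scan_with_patterns`, the `patterns` argument and the second sample contact are assumptions made for illustration. It looks the placeholders up in the hash instead of using eval, and it returns one hash per match, so a string with several matches does not run past the list of names.

```ruby
# Hypothetical variant of scan2: the patterns are passed in as a hash
# instead of being resolved with eval. Returns an array with one hash per match.
def scan_with_patterns(string, template, patterns, mark = '#')
  quoted = Regexp.escape(mark)
  placeholder = /#{quoted}(.*?)#{quoted}/
  # Names written as (#NAME#), i.e. the captured placeholders, in order.
  names = template.scan(/\(#{quoted}(.*?)#{quoted}\)/).flatten.map(&:to_sym)
  # Replace every #NAME# with its pattern; fetch raises KeyError if one is missing.
  source = template.gsub(placeholder) { patterns.fetch($1.to_sym) }
  regexp = Regexp.new(source, Regexp::MULTILINE)
  # scan yields one array of captures per match; pair each with its name.
  string.scan(regexp).map { |captures| names.zip(captures).to_h }
end

patterns = { AREA: '\d+', PHONE: '\d+', NAME: '\w+' }
# The second contact is made-up sample data to show more than one match.
result = scan_with_patterns("1234-567890 Glenn, 5678-123456 Anna",
                            '(#AREA#)-(#PHONE#) (#NAME#)', patterns)
p result
```

Like scan2, this assumes the patterns themselves contain no capturing groups and that the template has at least one captured placeholder; otherwise the names and captures no longer line up.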
