Regex with recursive expression to match nested braces?

执笔经年 2021-01-18 05:25

I'm trying to match text like sp { ...{...}... }, where the curly braces are allowed to nest. This is what I have so far:

my $regex = qr/
(             


        
2 Answers
  • 2021-01-18 05:36

    There are numerous problems. The recursive bit should be:

    (
       (?: \{ (?-1) \}
       |   [^{}]++
       )*
    )
    

    All together:

    my $regex = qr/
       sp\s+
       \{
          (
             (?: \{ (?-1) \}
             |   [^{}]++
             )*
          )
       \}
    /x;
    
    print "$1\n" if 'sp { { word } }' =~ /($regex)/;
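
    As a quick check of how the recursion behaves, here is a minimal sketch that runs the same pattern over a few inputs. The sample strings and the @results array are illustrative assumptions, not part of the original question.

    ```perl
    use strict;
    use warnings;

    # Same pattern as above: (?-1) recurses into the nearest preceding capture group.
    my $regex = qr/
       sp\s+
       \{
          (
             (?: \{ (?-1) \}   # a nested, balanced brace group
             |   [^{}]++       # or a run of non-brace characters (possessive)
             )*
          )
       \}
    /x;

    # Hypothetical sample inputs: two balanced, one missing a closing brace.
    my @inputs = (
        'sp { { word } }',
        'sp { a { b { c } } d }',
        'sp { { unbalanced }',
    );

    my @results;
    for my $input (@inputs) {
        # Store the whole match, or undef when the braces do not balance.
        push @results, $input =~ /($regex)/ ? $1 : undef;
    }

    print defined $_ ? "matched: $_\n" : "no match\n" for @results;
    ```

    The possessive [^{}]++ keeps the engine from retrying different splits of the same run of plain characters, which matters when the match is going to fail anyway, as with the unbalanced input.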
    
  • 2021-01-18 05:47

    This is a case for the underused Text::Balanced, a very handy core module for this kind of thing. It relies on the string's pos already pointing at the start of the delimited sequence, so I typically invoke it like this:

    #!/usr/bin/env perl
    
    use strict;
    use warnings;
    
    use Text::Balanced 'extract_bracketed';
    
    sub get_bracketed {
      my $str = shift;
    
      # seek to beginning of bracket
      # (the brace is escaped because newer perls complain about a bare "{" in some patterns)
      return undef unless $str =~ /(sp\s+)(?=\{)/gc;
    
      # store the prefix
      my $prefix = $1;
    
      # get everything from the start brace to the matching end brace
      my ($bracketed) = extract_bracketed( $str, '{}');
    
      # no closing brace found
      return undef unless $bracketed;
    
      # return the whole match
      return $prefix . $bracketed;
    }
    
    my $str = 'sp { { word } }';
    
    print get_bracketed $str;
    

    The g modifier, used in scalar context, makes the string remember where the match ended (its pos), and the c modifier keeps that position from being reset if a later match fails. extract_bracketed reads that position to know where to start.
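
    To make the role of pos concrete, here is a small sketch assuming the same sample string. The variable names ($after_match, $after_fail) and the deliberately failing pattern are illustrative only.

    ```perl
    use strict;
    use warnings;

    use Text::Balanced 'extract_bracketed';

    my $str = 'sp { { word } }';

    # A successful scalar-context /g match leaves pos($str) just past the match.
    # The lookahead is zero-width, so pos stops right before the opening brace.
    my $matched     = $str =~ /sp\s+(?=\{)/gc;
    my $after_match = pos($str);

    # With /c, a failed match leaves pos where it was instead of resetting it.
    my $failed     = $str =~ /no-such-text/gc;
    my $after_fail = pos($str);

    # extract_bracketed starts at pos($str); in list context the first
    # return value is the extracted bracketed text.
    my ($bracketed) = extract_bracketed($str, '{}');

    print "pos after match: $after_match\n";
    print "pos after failed match: $after_fail\n";
    print "extracted: $bracketed\n";
    ```

    Without the c modifier, the failed match would reset pos, and extract_bracketed would then start at the beginning of the string, where there is no opening brace.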
