Parsing a possibly nested braced item using a grammar

白昼怎懂夜的黑 submitted on 2019-11-30 21:57:08

Without knowing how you want the resulting data to look, I would change it to something like this:

my $str = 「author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},」;

grammar ExtractBraced {
    token TOP {
        'author='
        $<author> = <.braced-item>
        .*
    }
    token braced-item {
        '{' ~ '}'
            [
            || <- [{}] >+
            || <.before '{'> <.braced-item>
            ]*
    }
}

ExtractBraced.parse( $str ).say;
「author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},」
 author => 「{Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.}」

If you want a bit more structure, it might look more like this:

my $str = 「author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},」;

grammar ExtractBraced {
    token TOP {
        'author='
        $<author> = <.braced-item>
        .*
    }
    token braced-part {
        || <- [{}] >+
        || <.before '{'> <braced-item>
    }
    token braced-item {
        '{' ~ '}'
            <braced-part>*
    }
}

class Print {
    method TOP ($/){
        make $<author>.made
    }
    method braced-part ($/){
        make $<braced-item>.?made // ~$/
    }
    method braced-item ($/){
        make [~] @<braced-part>».made
    }
}


my $r = ExtractBraced.parse( $str, :actions(Print) );
say $r;
put();
say $r.made;
「author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},」
 author => 「{Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.}」
  braced-part => 「Belayneh, M. and Geiger, S. and Matth」
  braced-part => 「{\"{a}}」
   braced-item => 「{\"{a}}」
    braced-part => 「\"」
    braced-part => 「{a}」
     braced-item => 「{a}」
      braced-part => 「a」
  braced-part => 「i, S.K.」

Belayneh, M. and Geiger, S. and Matth\"ai, S.K.

Note that the + on <-[{}]>+ and the <.before '{'> lookahead are both only optimizations. Both can be omitted and the grammar will still work.
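
To illustrate that note, here is a minimal sketch of the first grammar with both optimizations removed. The name ExtractBracedMinimal is made up for this example and is not part of the original answer; everything else follows the first grammar above.

```raku
my $str = 「author={Belayneh, M. and Geiger, S. and Matth{\"{a}}i, S.K.},」;

# Hypothetical name; same structure as the first grammar, minus the optimizations.
grammar ExtractBracedMinimal {
    token TOP {
        'author='
        $<author> = <.braced-item>
        .*
    }
    token braced-item {
        '{' ~ '}'
            [
            || <-[{}]>          # one non-brace character per iteration (no +)
            || <.braced-item>   # nested item, tried directly (no lookahead)
            ]*
    }
}

my $m = ExtractBracedMinimal.parse( $str );

# Stringify the capture instead of printing the whole match tree.
put ~$m<author>;
```

Without the +, the loop runs once per character instead of once per run of non-brace characters. Without the lookahead, braced-item is simply attempted and fails on its opening '{' when the next character is not a brace. The text that gets matched should be the same either way.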
