Raku regex: How to use capturing group inside lookaheads

Posted by 随声附和 on 2021-02-20 06:32:14

Question


How can I use capturing groups inside a lookahead assertion?

This code:

say "ab" ~~ m/(a) <?before (b) > /;

returns:

「a」
 0 => 「a」

But I was expecting to also capture 'b'.

Is there a way to do so?

I don't want to leave 'b' outside of the lookahead because I don't want 'b' to be part of the match.

Is there a way to capture 'b' but still leave it outside of the match?

NOTE:

I tried to use Raku's capture markers, as in:

say "ab" ~~ m/<((a))> (b) /;

「a」
 0 => 「a」
 1 => 「b」

But this does not work as I expect: even though 'b' is left outside the match, the regex has still consumed 'b', and I don't want it to be consumed.

For example:

say 'abab' ~~ m:g/(a)<?before b>|b/;

(「a」
    0 => 「a」
 「b」 
 「a」
    0 => 「a」
 「b」)

# Four matches (what I want)
 

say 'abab' ~~ m:g/<((a))>b|b/;

(「a」
    0 => 「a」 
 「a」
    0 => 「a」)

# Two matches

Answer 1:


Is there a way to do so?

Not really, but sort of. Three things conspire against us in trying to make this happen.

  1. Raku regex captures form trees of matches. Thus (a(b)) results in one positional capture that contains another positional capture. Why do I mention this? Because the same thing is going on with things like before, which take a regex as an argument: the regex passed to before gets its own Match object.
  2. The ? implies "do not capture". We may think of dropping it to get <before (b)>, and indeed there is a before key in the Match object now, which sounds promising except...
  3. before doesn't actually return what it matched on the inside, but instead a zero-width Match object; otherwise, if we forgot the ?, it would no longer behave as a lookahead.
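
As a rough illustration of points 2 and 3, the sketch below drops the ? and inspects what ends up under the before key. The variable name $m is only an assumption for the example; going by the explanation above, the before entry should exist but span no text.

```raku
# Sketch only: `$m` is a name chosen for this example.
# Without the `?`, the `before` assertion is captured under the key "before".
my $m = "ab" ~~ m/ (a) <before (b)> /;

say $m;                    # the match tree, now with a `before` entry
say $m<before>.Str.chars;  # how many characters the `before` capture spans
```

If that count is zero, the inner (b) is not reachable through the outer match, which is why the answer goes on to rescue it by hand.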

If only we could rescue the Match object from inside of the lookahead. Well, we can! We can declare a variable and then bind the $/ inside of the before argument regex into it:

my $lookahead;
say "ab" ~~ m/(a) <?before b {$lookahead = $/}> /;
say $lookahead;

Which gives:

「a」
 0 => 「a」
「b」

Which works, although it's unfortunately not attached like a normal capture. There's not a way to do that, although we can attach it via make:

say "ab" ~~ m/(a) :my $lookahead; <?before (b) {$lookahead = $0}> { make $lookahead } /;
say $/.made;

With the same output, except now it will be reliably attached to each match object coming back from m:g, and so will be robust, even if not beautiful.
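
To make the m:g claim concrete, here is a minimal sketch that applies the same make technique to the string 'abab' from the question. The names @matches and $m are assumptions for the example, and the alternation from the question is left out to keep the sketch small.

```raku
# Sketch only: `@matches` and `$m` are names chosen for this example.
# Each 'a' is matched on its own; the 'b' seen by the lookahead is
# stored on that match via `make` and read back with `.made`.
my @matches = 'abab' ~~ m:g/ (a) :my $lookahead; <?before (b) { $lookahead = $0 }> { make $lookahead } /;

for @matches -> $m {
    say "matched '{$m}' at {$m.from}..{$m.to}, lookahead saw '{$m.made}'";
}
```

Because the lookahead is zero-width, each match should end right after the 'a', so the following 'b' stays available to whatever pattern runs next.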



Source: https://stackoverflow.com/questions/64898346/raku-regex-how-to-use-capturing-group-inside-lookaheads
