Regex to match string between curly braces (that allows to escape them via 'doubling')

Posted by 前提是你 on 2021-02-08 08:44:35

Question


I was using the regex from Extract values within single curly braces:

(?<!{){[^{}]+}(?!})

However, it does not cover use case #3 (see below).

I would like to know if it's possible to define a regular expression that satisfies the use cases below.

Use case 1

Given:

Hola {name}

It should match {name} and capture name

I would also like to be able to escape curly braces when needed by doubling them, as C# does for interpolated strings.

Use case 2

Hola {name}, this will be {{unmatched}}

The {{unmatched}} part should be ignored because its braces are doubled. Notice the {{ and }}.

Use case 3

In the last, most complex case, a text like this:

Buenos {{{dias}}}

The text {dias} should be a match (and capture dias) because the outermost doubled curly braces are escaped and should be treated like any other character, so the match is the inner {dias} within {{{dias}}}.

My ultimate goal is to replace the matches later with another string, like a variable.

EDIT

This fourth use case pretty much summarizes all the requirements:

Given: Hola {name}, buenos {{{dias}}}

Results in:

  • Match 1:
    • Matched text: {name}
    • Captured text: name
  • Match 2:
    • Matched text: {dias}
    • Captured text: dias

Answer 1:


You can use

(?<!{)(?:{{)*{([^{}]*)}(?:}})*(?!})

See the .NET regex demo.

In C#, you can use

var results = Regex.Matches(text, @"(?<!{)(?:{{)*{([^{}]*)}(?:}})*(?!})")
    .Cast<Match>()
    .Select(x => x.Groups[1].Value)
    .ToList();

Alternatively, to get the text between the braces as the whole match value (with no capture group needed), wrap the left- and right-hand contexts in lookarounds:

(?<=(?<!{)(?:{{)*{)[^{}]*(?=}(?:}})*(?!}))

See this regex demo. In C#:

var results = Regex.Matches(text, @"(?<=(?<!{)(?:{{)*{)[^{}]*(?=}(?:}})*(?!}))")
    .Cast<Match>()
    .Select(x => x.Value)
    .ToList();

Regex details

  • (?<=(?<!{)(?:{{)*{) - immediately to the left, there must be a { char preceded by zero or more {{ substrings, and that run must not itself be immediately preceded by a { char
  • [^{}]* - zero or more chars other than { and }
  • (?=}(?:}})*(?!})) - immediately to the right, there must be a } char followed by zero or more }} substrings, and that run must not be immediately followed by a } char.
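
Since the stated goal is to replace the matches with values, here is a minimal sketch of how that could look. It is not part of the original answer: it assumes a variant of the capturing pattern above in which the escaped {{ and }} runs are put into named groups (open, name and close are names made up for this illustration), so they can be written back around the substituted value. The TemplateDemo class and Fill method are hypothetical as well.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TemplateDemo
{
    // Same shape as (?<!{)(?:{{)*{([^{}]*)}(?:}})*(?!}), but the escaped
    // brace runs are captured so they survive the replacement.
    private static readonly Regex Placeholder = new Regex(
        @"(?<!{)(?<open>(?:{{)*){(?<name>[^{}]*)}(?<close>(?:}})*)(?!})");

    public static string Fill(string template, IDictionary<string, string> values)
    {
        return Placeholder.Replace(template, m =>
        {
            string name = m.Groups["name"].Value;
            // Assumption: unknown names are left untouched; throwing is another option.
            if (!values.TryGetValue(name, out string value))
            {
                return m.Value;
            }
            // Put the escaped braces back around the substituted value.
            return m.Groups["open"].Value + value + m.Groups["close"].Value;
        });
    }

    public static void Main()
    {
        var values = new Dictionary<string, string>
        {
            ["name"] = "Ana",
            ["dias"] = "lunes"
        };
        Console.WriteLine(Fill("Hola {name}, buenos {{{dias}}}", values));
        // Prints: Hola Ana, buenos {{lunes}}
    }
}
```

The doubled braces are left as they are; collapsing {{ to { and }} to } would be a separate step after the replacement.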



Answer 2:


To optionally match doubled curly braces, you could use a conditional and take the value from capture group 2.

(?<!{)({{)?{([^{}]+)}(?(1)}})(?!})

Explanation

  • (?<!{) Assert not { directly to the left
  • ({{)? Optionally capture {{ in group 1
  • {([^{}]+)} Match from { till } without matching { and } in between
  • (?(1)}}) Conditional: if group 1 participated in the match, match }}
  • (?!}) Assert not } directly to the right

.NET regex demo | C# demo

string pattern = @"(?<!{)({{)?{([^{}]+)}(?(1)}})(?!})";
string input = @"Hola {name}
    Hola {name}, this will be {{unmatched}}
    Buenos {{{dias}}}";

foreach (Match m in Regex.Matches(input, pattern))
{
    Console.WriteLine(m.Groups[2].Value);
}

Output

name
name
dias

If the doubled curly braces should be balanced, you might use this approach, which relies on .NET balancing groups:

(?<!{){(?>(?<={){{(?<c>)|([^{}]+)|}}(?=})(?<-c>))*(?(c)(?!))}(?!})

.NET regex demo
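
A small sketch for trying the balanced pattern on a few inputs may help here. The BalancedDemo class, the Capture helper and the last two sample strings are assumptions added for illustration, not part of the original answer. Because .NET numbers unnamed groups before named ones, the text between the braces is read from group 1.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedDemo
{
    // The balanced pattern from above: (?<c>) pushes for every opening {{,
    // (?<-c>) pops for every closing }}, and (?(c)(?!)) fails the match
    // if any opening pair is left unclosed.
    private static readonly Regex Balanced = new Regex(
        @"(?<!{){(?>(?<={){{(?<c>)|([^{}]+)|}}(?=})(?<-c>))*(?(c)(?!))}(?!})");

    // Returns the text captured by the unnamed group, or null when nothing matches.
    public static string Capture(string input)
    {
        Match m = Balanced.Match(input);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string[] samples = { "{name}", "{{unmatched}}", "{{{dias}}}", "{{{{{x}}}}}", "{{{dias}" };
        foreach (string s in samples)
        {
            Console.WriteLine($"{s} -> {Capture(s) ?? "(no match)"}");
        }
        // {name} -> name
        // {{unmatched}} -> (no match)
        // {{{dias}}} -> dias
        // {{{{{x}}}}} -> x
        // {{{dias} -> (no match)
    }
}
```

Unlike the conditional version, which allows at most one escaped pair on each side, this pattern accepts any number of doubled pairs as long as every opening pair has a closing one.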



Source: https://stackoverflow.com/questions/65716941/regex-to-match-string-between-curly-braces-that-allows-to-escape-them-via-doub
