Question
I am implementing plugin code for my CMS, something like a shortcode but applicable in many more scenarios. I want an admin to be able to write commands like this:
Example 1:
{COMMAND_NAME}Strings of text that contain HTML tags, symbols, just anything{/COMMAND_NAME}
Example 2:
{COMMAND_NAME}
Example 3:
{COMMAND_NAME{attribute1=value attribute2=value}}
Example 4:
{COMMAND_NAME{attribute1=value attribute2=value}}Strings of anything including text, HTML tags and anything at all {/COMMAND_NAME}
A regex should match the strings above and, from a single pattern, get the COMMAND_NAME, the text in between, and the closing {/COMMAND_NAME}.
In the regex, I want to capture the COMMAND_NAME, the attributes if provided, the text in between if the {COMMAND_NAME} has a closing {/COMMAND_NAME}, and the closing {/COMMAND_NAME} if provided.
Here is what I've tried so far; it gives an incomplete result.
$regex = '#\{(RAW|ACCESS|DWNLINK|MODL)[\{]{0,1}([\w\W\s]*?)\}{0}\}([\w\s]+)([\{/RAW|ACCESS|DWNLINK|MODL]*)\}#i';
$strings = '<div class="blog-list-item blog"><header class="entry-title">
<h1>Welcome to our website</h1>
</header><article id="entry-72" class="entry post-72 page et-bg-layout-dark et-white-bg"><div class="jumbotron row">
<div class="col-md-8">
<ul>
<li>You have a pending job on your neck?…</li>
<li>Do your company need a website makeover ?…</li>
<li>Or a competitive web application ? ?…</li>
<li>Do you need a customized plugin, or a tweak ?…</li>
<li>Maybe you want a personal website ?…</li>
<li>Or a graphic for your new project ?…</li>
</ul>
<div class="bg-primary well">
<h4 class="text-center text-white shadow">Track your project as we work it to perfection...</h4>
</div>
</div>
<div class="pull-right col-md-4">
<h4 class="bg-primary text-white well">Other services we offer</h4>
{ACCESS{type=500}}
<ul>
<li>SEO work for an existing website or new</li>
<li>Bulk SMS</li>
<li>E-currency exchange</li>
<li>Facebook AD</li>
<li>Google AD</li>
</ul>
{/ACCESS}</div>
{RAW{say=email,access=500}} {RAW} <a class="btn button large tall green" href="client-area">Place new Job now as we deliver at the quickest <em>reasonable time</em></a>{/RAW}</div></article></div>';
A PHP var_dump of the matches gives the following result:
array(5) {
[0]=>
array(1) {
[0]=>
string(224) "{ACCESS{type=500}}
<ul>
<li>SEO work for an existing website or new</li>
<li>Bulk SMS</li>
<li>E-currency exchange</li>
<li>Facebook AD</li>
<li>Google AD</li>
</ul>
{/ACCESS}</div>
{RAW{say=email,access=500}} {RAW}"
}
[1]=>
array(1) {
[0]=>
string(6) "ACCESS"
}
[2]=>
array(1) {
[0]=>
string(209) "type=500}}
<ul>
<li>SEO work for an existing website or new</li>
<li>Bulk SMS</li>
<li>E-currency exchange</li>
<li>Facebook AD</li>
<li>Google AD</li>
</ul>
{/ACCESS}</div>
{RAW{say=email,access=500}"
}
[3]=>
array(1) {
[0]=>
string(1) " "
}
[4]=>
array(1) {
[0]=>
string(4) "{RAW"
}
}
This is not what I need to retrieve.
Once again, I want to capture the COMMAND_NAME, the attributes only if provided, the text in between if the {COMMAND_NAME} has a closing {/COMMAND_NAME}, and the closing {/COMMAND_NAME} if provided. That means the command can be inline ({COMMAND_NAME}) or not ({COMMAND_NAME} some strings {/COMMAND_NAME}), and can have attributes ({COMMAND_NAME{attr1=value attr2=value2}}) or not.
Answer 1:
This regex will work as you specified:
$regex = '~
#opening tag
\{(RAW|ACCESS|DWNLINK|MODL|\w+)
#optional attributes
(?>
\{ ([^}]*) }
)?
}
#optional text and closing tag
(?:
( #text:= any char except "{", or a "{" not followed by /commandname
[^{]*+
(?>\{(?!/?\1[{}])[^{]*)*+
)
#closing tag
( \{/\1} )
)?
~ix';
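To see which capture group holds what, here is a minimal sketch that runs the pattern with preg_match_all. The sample input and the $found array are made up for illustration, and the pattern is only re-indented, not changed.

```php
<?php
// The pattern is the one given above; the sample input is invented for this sketch.
$regex = '~
    \{(RAW|ACCESS|DWNLINK|MODL|\w+)     # group 1: command name
    (?> \{ ([^}]*) } )?                 # group 2: attributes, if any
    }
    (?:
        (                               # group 3: text in between
            [^{]*+
            (?>\{(?!/?\1[{}])[^{]*)*+
        )
        ( \{/\1} )                      # group 4: closing tag
    )?
~ix';

$input = '{ACCESS{type=500}}<ul><li>SEO</li></ul>{/ACCESS} {MODL} '
       . '{RAW{say=email}}<a href="client-area">Go</a>{/RAW}';

preg_match_all($regex, $input, $matches, PREG_SET_ORDER);

$found = [];
foreach ($matches as $m) {
    // Trailing groups that did not take part in the match may be missing
    // from $m, so fall back to an empty string.
    $found[] = [
        'name'  => $m[1],
        'attrs' => $m[2] ?? '',
        'text'  => $m[3] ?? '',
        'close' => $m[4] ?? '',
    ];
}
print_r($found);
```

With PREG_SET_ORDER each element of $matches is one command, so an inline command such as {MODL} simply comes back with empty text and close entries.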
Compared to what you had:
First of all, I used the x modifier (at the end), which makes the engine ignore whitespace and #comments in the pattern.
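As a small illustration of the x modifier, the two hypothetical patterns below are the same expression, once written compactly and once spread over several lines with comments. Both are made up for this sketch and are not part of the full regex.

```php
<?php
// Same expression twice: compact, and in extended (x) mode with comments.
$compact  = '~\{(\w+)}~';
$extended = '~
    \{        # literal opening brace
    (\w+)     # command name
    }         # literal closing brace
~x';

$input = 'Hello {RAW} world';
preg_match($compact, $input, $a);
preg_match($extended, $input, $b);

// Both arrays hold the full match "{RAW}" and the captured name "RAW".
var_dump($a === $b);
```

Keep in mind that in x mode a literal space or # in the pattern has to be escaped or placed inside a character class.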
In the opening tag, I used your options, but you may as well use \w+ to match any command name:
\{(RAW|ACCESS|DWNLINK|MODL|\w+)
For the optional attributes, you had [\{]{0,1}([\w\W\s]*?)\}{0}, which was a valid attempt to make every part optional. Instead, I'm using a (?> group )? (see non-capturing groups and atomic groups) to make the whole subpattern optional (with the ? quantifier).
(?>
\{ ([^}]*) }
)?
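A quick way to check the optional group in isolation is to run only the opening-tag part against a tag with and without attributes. This is a sketch with a reduced pattern; $openTag and $results are names chosen here for illustration.

```php
<?php
// Simplified opening-tag pattern, reduced from the full regex for this sketch.
$openTag = '~\{(\w+)(?>\{([^}]*)})?}~';

$results = [];
foreach (['{RAW}', '{RAW{say=email,access=500}}'] as $tag) {
    preg_match($openTag, $tag, $m);
    // $m[2] is absent when the optional attribute group did not match.
    $results[$tag] = ['name' => $m[1], 'attrs' => $m[2] ?? null];
}
var_dump($results);
```

The first tag yields a null attrs entry, the second yields the raw attribute string, which you can then split however your attribute syntax requires.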
The same logic is applied to the text and closing tag, to make it optional.
You were using [\w\s]+ to match the text, which matches word characters and whitespace but fails to match punctuation and other characters. I could have used .*? (with the s modifier, so that . also matches newlines) and it would have worked just as well. However, I used the following construct, which matches the same text but performs better:
( #text:= any char except "{", or a "{" not followed by /commandname
[^{]*+
(?>\{(?!/?\1[{}])[^{]*)*+
)
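To compare the two approaches side by side, here is a sketch with two simplified patterns, one using the lazy .*? and one using the unrolled construct. Both patterns and the input are assumptions made for this example, and the comparison only shows that they capture the same text on this input, not how they perform.

```php
<?php
// Hypothetical simplified patterns: both capture the text between {NAME} and {/NAME}.
$lazy     = '~\{(\w+)}(.*?)\{/\1}~s';   // s modifier: "." also matches newlines
$unrolled = '~\{(\w+)}([^{]*+(?>\{(?!/?\1[{}])[^{]*)*+)\{/\1}~';

// Text with stray braces and a newline between the tags.
$input = "{RAW}a {b} c\n<p>{x}</p>{/RAW}";

preg_match($lazy, $input, $lazyMatch);
preg_match($unrolled, $input, $unrolledMatch);

// Group 2 is the text in between for both patterns.
var_dump($lazyMatch[2] === $unrolledMatch[2]);
```

The unrolled version consumes runs of non-brace characters in one step and only stops to test the lookahead at each "{", which is where the speed-up over a character-by-character lazy match comes from.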
And finally, I'm matching the closing tag using \1, which is a backreference to the text matched in group 1 (the opening tag name):
\{/\1}
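The effect of the backreference can be checked with a reduced pattern (an assumption for this sketch, using a plain [^{]* for the text): the closing tag only matches when its name equals the opening name, ignoring case because of the i modifier.

```php
<?php
// Reduced pattern for illustration: opening tag, brace-free text, closing tag via \1.
$pair = '~\{(\w+)}([^{]*)\{/\1}~i';

$checks = [
    'same name'      => preg_match($pair, '{RAW}text{/RAW}'),
    'different name' => preg_match($pair, '{RAW}text{/ACCESS}'),
    'different case' => preg_match($pair, '{RAW}text{/raw}'),
];
var_dump($checks);
```

Only the 'different name' entry comes back as 0, so a mismatched closing tag is never paired with the opening one.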
Assumptions:
- An attribute does not contain a closing brace, even inside quotes such as "te}xt", because that would break the match.
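The sketch below shows what that assumption means in practice, using a simplified opening-tag pattern reduced from the full regex. The title attribute and its values are invented for the example.

```php
<?php
// Simplified opening-tag pattern, reduced from the full regex for this sketch.
$openTag = '~\{(\w+)(?>\{([^}]*)})?}~';

$fine   = '{RAW{title="text"}}';
$broken = '{RAW{title="te}xt"}}';   // closing brace inside the quoted value

// [^}]* stops at the first "}", so the quoted brace ends the attributes early
// and the tag as a whole no longer matches.
$okFine   = preg_match($openTag, $fine);
$okBroken = preg_match($openTag, $broken);
var_dump($okFine, $okBroken);
```

If quoted braces have to be supported, the attribute part would need a pattern that understands quoted strings, which is beyond what this regex attempts.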
Source: https://stackoverflow.com/questions/33841196/how-to-match-text-inside-starting-and-closing-curly-brace-the-tags-and-the-spec