Regular expression with multiple results

Submitted by 僤鯓⒐⒋嵵緔 on 2019-12-12 04:47:56

Question


What's wrong with my regex ?

"/Blabla\(2\)&nbsp;:.*<tr><td class=\"generic\">(.*)<\/td>.+<\/tr>/Uis"

....

<tr>
<td class="aaa">Blabla(1)&nbsp;:</td>
<td>
<table class="bbb"><tbody>
<tr class="ccc"><th>title1</th><th>title2</th><th>title3</th></tr>
<tr><td class="generic">word1</td><td class="generic">word2 </td><td class="generic">word3</td></tr>
<tr><td class="generic">word4</td><td class="generic">word5 </td><td class="generic">word6</td></tr>
</tbody></table>
</td>
</tr>

<tr>
<td class="aaa">Blabla(2)&nbsp;:</td>
<td>
<table class="bbb"><tbody>
<tr class="ccc"><th>title1</th><th>title2</th><th>title3</th></tr>
<tr><td class="generic">word1b</td><td class="generic">word2b </td><td class="generic">word3b</td></tr>
<tr><td class="generic">word4b</td><td class="generic">word5b </td><td class="generic">word6b</td></tr>
</tbody></table>
</td>
</tr>

What I want is the content of the FIRST TD of each TR in the block that begins with Blabla(2).

So the expected result is word1b AND word4b, but only the first one is returned.

Thank you for your help. Please don't tell me to use a DOM parser; that's not possible in my case.


Answer 1:


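That's an interesting regex; I learned about the ungreedy flag from it, nice!

As for your problem: the label Blabla(2) occurs only once, so a pattern that must start at it can only match once. You can use \G, which matches at the position where the previous match ended, together with the g flag (assuming a PCRE engine):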

/(?:Blabla\(2\)&nbsp;:|(?<!^)\G).*<tr><td class=\"generic\">(.*)<\/td>.+<\/tr>/Uisg

regex101 demo

Or a little shorter with different delimiters:

'~(?:Blabla\(2\)&nbsp;:|(?<!^)\G).*<tr><td class="generic">(.*)</td>.+</tr>~Uisg'
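If your engine has no \G (Python's built-in re module, for instance, does not support it), the same "continue where the last match ended" idea can be emulated by hand. The following is only a rough sketch under a few assumptions: the function name first_cells and the shortened sample markup are made up for illustration, and the U flag is replaced by explicit lazy quantifiers. It finds the label once, then repeatedly calls Pattern.match with a start position, which plays the role of \G.

```python
import re

# Shortened copy of the markup from the question (illustrative only).
html = """<tr>
<td class="aaa">Blabla(1)&nbsp;:</td>
<td>
<table class="bbb"><tbody>
<tr class="ccc"><th>title1</th><th>title2</th></tr>
<tr><td class="generic">word1</td><td class="generic">word2 </td></tr>
<tr><td class="generic">word4</td><td class="generic">word5 </td></tr>
</tbody></table>
</td>
</tr>

<tr>
<td class="aaa">Blabla(2)&nbsp;:</td>
<td>
<table class="bbb"><tbody>
<tr class="ccc"><th>title1</th><th>title2</th></tr>
<tr><td class="generic">word1b</td><td class="generic">word2b </td></tr>
<tr><td class="generic">word4b</td><td class="generic">word5b </td></tr>
</tbody></table>
</td>
</tr>
"""

# The label that opens the block of interest; it occurs only once.
ANCHOR = re.compile(r"Blabla\(2\)&nbsp;:", re.I)

# One row: skip lazily to the next "generic" row, capture its first cell,
# then consume the rest of that row up to the first closing </tr>.
ROW = re.compile(r'.*?<tr><td class="generic">(.*?)</td>.+?</tr>', re.S | re.I)


def first_cells(text):
    """Return the first cell of every generic row after the label."""
    start = ANCHOR.search(text)
    if start is None:
        return []
    cells = []
    pos = start.end()
    while True:
        # match(text, pos) must begin exactly at pos, like \G in PCRE.
        m = ROW.match(text, pos)
        if m is None:
            break
        cells.append(m.group(1))
        pos = m.end()
    return cells


print(first_cells(html))  # ['word1b', 'word4b']
```

Like the \G pattern above, this keeps going as long as another generic row exists anywhere later in the text, so if further blocks follow Blabla(2) you would need to cut the input at the end of the block first.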



Answer 2:


Thanks to @Jerry, I learned some new tricks today:

(Blabla\(2\)&nbsp;:.*?|\G)<tr><td class=\"generic\">\K([^<]+).+?<\/tr>\r\n


Source: https://stackoverflow.com/questions/19142687/regular-expression-with-multiple-results
