Question
I'm fooling around with Yahoo! Pipes and I'm hitting a wall with a regular expression. I'm familiar with regular expressions from Perl, but the rules seem to be different in Yahoo! Pipes.
What I'm doing is fetching a page and trying to turn it into a feed. My regex for stripping the link out of the HTML works fine, but the title, which I want to be the text inside the <i> tags, just outputs the original text.
Sample text that matches in Perl and on this online regexp tester:
<a rel="nofollow" target="_blank" HREF="http://changed.to/protect/the-guilty.html"><i>"Fee Fi Fo Fun" (English Man)</i></a> (See also this other site <a rel="nofollow" target="_blank" href="http://stackoverflow.com">Nada</a>) Some other text here
Answer 1:
RegEx for the title:
(?i).*?<i>([^<]*).* [ ] g [x] s [ ] m [ ] i
RegEx for the link:
(?i).*?href="([^"]*).* [ ] g [x] s [ ] m [ ] i
Somehow the case-insensitive checkbox seems broken. Luckily you can substitute with (?i), which works nicely.
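For anyone who wants to check the two patterns outside of Pipes, here is a minimal Python sketch. It assumes the Pipes Regex module replaces the whole match with the first capture group ($1), which the answer does not state explicitly, and it uses re.DOTALL to stand in for the checked "s" box. The helper name `extract` is made up for this illustration.

```python
import re

# Sample text from the question (note the upper-case HREF on the first link).
SAMPLE = (
    '<a rel="nofollow" target="_blank" '
    'HREF="http://changed.to/protect/the-guilty.html">'
    '<i>"Fee Fi Fo Fun" (English Man)</i></a> '
    '(See also this other site <a rel="nofollow" target="_blank" '
    'href="http://stackoverflow.com">Nada</a>) Some other text here'
)

# (?i) is the inline case-insensitive flag; re.DOTALL mirrors the "s" checkbox.
TITLE_RE = re.compile(r'(?i).*?<i>([^<]*).*', re.DOTALL)
LINK_RE = re.compile(r'(?i).*?href="([^"]*).*', re.DOTALL)


def extract(pattern, text):
    """Replace the whole match with group 1 (assumed equivalent of "$1" in Pipes)."""
    return pattern.sub(r'\1', text)


print(extract(TITLE_RE, SAMPLE))  # "Fee Fi Fo Fun" (English Man)
print(extract(LINK_RE, SAMPLE))   # http://changed.to/protect/the-guilty.html
```

Because each pattern starts with a lazy .*? and ends with a greedy .*, it consumes the entire input, so only the captured group is left after the replacement.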
Here is a nice web2.0-ish tool to test regular expressions with: RegExr. But for some reason it's still beta. ;-)
Answer 2:
One important thing to watch out for with Yahoo! Pipes: do not trust the debug screen. It has a small quirk of hiding some tags from view, which can cause no end of confusion when you are writing regexes. To expose any hidden HTML, replace '<' with something like '#'.
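As a rough illustration of that trick outside of Pipes, the Python sketch below swaps every '<' for '#' so that a viewer which hides tags would show them as plain text. The function name `expose_tags` and the sample string are made up for this example; in Pipes itself the same replacement would be done with a Regex module rule.

```python
def expose_tags(text, marker="#"):
    """Replace every '<' with a visible marker so tags are not hidden as markup."""
    return text.replace("<", marker)


print(expose_tags('<a href="x"><i>Title</i></a>'))
# #a href="x">#i>Title#/i>#/a>
```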
Source: https://stackoverflow.com/questions/360492/regular-expression-on-yahoo-pipes