Here's the deal. Is there a way to have strings in a line tokenized based on multiple regexes?
One example: I have to get all href tags, their corresponding texts, and some other text, each matched by a different regex.
From perlop:
A useful idiom for lex-like scanners is /\G.../gc. You can combine several regexps like this to process a string part-by-part, doing different actions depending on which regexp matched. Each regexp tries to match where the previous one leaves off.

    LOOP: {
        print(" digits"),       redo LOOP if /\G\d+\b[,.;]?\s*/gc;
        print(" lowercase"),    redo LOOP if /\G[a-z]+\b[,.;]?\s*/gc;
        print(" UPPERCASE"),    redo LOOP if /\G[A-Z]+\b[,.;]?\s*/gc;
        print(" Capitalized"),  redo LOOP if /\G[A-Z][a-z]+\b[,.;]?\s*/gc;
        print(" MiXeD"),        redo LOOP if /\G[A-Za-z]+\b[,.;]?\s*/gc;
        print(" alphanumeric"), redo LOOP if /\G[A-Za-z0-9]+\b[,.;]?\s*/gc;
        print(" line-noise"),   redo LOOP if /\G[^A-Za-z0-9]+/gc;
        print ". That's all!\n";
    }
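Applied to the href example from the question, the same idiom looks something like the sketch below. This is only a minimal illustration under my own assumptions: the sample string, the regexes, and the token actions are hypothetical, it only handles simple, well-formed anchor tags on one line, and for real HTML you would normally use a module such as HTML::Parser instead of regexes.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical input line: one anchor tag plus some surrounding text.
    my $line = '<a href="http://example.com">Example</a> plus some plain text';

    pos($line) = 0;
    TOKEN: {
        # Anchor tag: capture the href value and the link text.
        if ($line =~ /\G<a\s+href="([^"]*)"[^>]*>([^<]*)<\/a>\s*/gc) {
            print "LINK  href=$1 text=$2\n";
            redo TOKEN;
        }
        # A run of anything that is not the start of a tag.
        if ($line =~ /\G([^<]+)/gc) {
            print "TEXT  $1\n";
            redo TOKEN;
        }
        # Any other tag we do not care about: skip it.
        if ($line =~ /\G<[^>]*>\s*/gc) {
            redo TOKEN;
        }
        # Nothing matched at the current position, so we are done.
    }

The /c modifier matters here: when one of the alternatives fails in list position, it keeps pos() where it is instead of resetting it, so the next regex in the chain gets to try from the same spot.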