Emulation of lex like functionality in Perl or Python

梦毁少年i 2021-01-13 23:46

Here's the deal. Is there a way to tokenize strings in a line based on multiple regexes?

One example:

I have to get all href tags, their corresponding

8 Answers
  •  滥情空心
    2021-01-14 00:10

    From perlop:

    A useful idiom for lex-like scanners is /\G.../gc. You can combine several regexps like this to process a string part-by-part, doing different actions depending on which regexp matched. Each regexp tries to match where the previous one leaves off.

     LOOP:
        {
          print(" digits"),       redo LOOP if /\G\d+\b[,.;]?\s*/gc;
          print(" lowercase"),    redo LOOP if /\G[a-z]+\b[,.;]?\s*/gc;
          print(" UPPERCASE"),    redo LOOP if /\G[A-Z]+\b[,.;]?\s*/gc;
          print(" Capitalized"),  redo LOOP if /\G[A-Z][a-z]+\b[,.;]?\s*/gc;
          print(" MiXeD"),        redo LOOP if /\G[A-Za-z]+\b[,.;]?\s*/gc;
          print(" alphanumeric"), redo LOOP if /\G[A-Za-z0-9]+\b[,.;]?\s*/gc;
          print(" line-noise"),   redo LOOP if /\G[^A-Za-z0-9]+/gc;
          print ". That's all!\n";
        }
    
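    The same part-by-part scan can be approximated in Python: `Pattern.match(s, pos)` anchors each attempt at `pos`, much like Perl's `\G`, so you can try each regexp in turn where the previous match left off. A minimal sketch (the `tokenize` helper and token names are illustrative, mirroring the Perl example above):

    ```python
    import re

    # Token patterns, in the same priority order as the Perl example
    PATTERNS = [
        ("digits",       re.compile(r"\d+\b[,.;]?\s*")),
        ("lowercase",    re.compile(r"[a-z]+\b[,.;]?\s*")),
        ("UPPERCASE",    re.compile(r"[A-Z]+\b[,.;]?\s*")),
        ("Capitalized",  re.compile(r"[A-Z][a-z]+\b[,.;]?\s*")),
        ("MiXeD",        re.compile(r"[A-Za-z]+\b[,.;]?\s*")),
        ("alphanumeric", re.compile(r"[A-Za-z0-9]+\b[,.;]?\s*")),
        ("line-noise",   re.compile(r"[^A-Za-z0-9]+")),
    ]

    def tokenize(s):
        """Yield token names; each pattern is anchored where the last match ended."""
        pos = 0
        while pos < len(s):
            for name, rx in PATTERNS:
                m = rx.match(s, pos)  # anchored at pos, like Perl's \G
                if m:
                    yield name
                    pos = m.end()
                    break
            else:
                break  # nothing matched; give up

    print(list(tokenize("123 abc XYZ Hello!")))
    # → ['digits', 'lowercase', 'UPPERCASE', 'Capitalized', 'line-noise']
    ```

    As in the Perl version, order matters: "Hello" falls through UPPERCASE (the `\b` fails after the initial "H") and lands on Capitalized.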
