Emulation of lex like functionality in Perl or Python

梦毁少年i 2021-01-13 23:46

Here's the deal. Is there a way to tokenize strings in a line based on multiple regexes?

One example:

I have to get all href tags, their corresponding

8 Answers
  •  滥情空心
    2021-01-14 00:10

    From perlop:

    A useful idiom for lex-like scanners is /\G.../gc. You can combine several regexps like this to process a string part-by-part, doing different actions depending on which regexp matched. Each regexp tries to match where the previous one leaves off.

     LOOP:
        {
          print(" digits"),       redo LOOP if /\G\d+\b[,.;]?\s*/gc;
          print(" lowercase"),    redo LOOP if /\G[a-z]+\b[,.;]?\s*/gc;
          print(" UPPERCASE"),    redo LOOP if /\G[A-Z]+\b[,.;]?\s*/gc;
          print(" Capitalized"),  redo LOOP if /\G[A-Z][a-z]+\b[,.;]?\s*/gc;
          print(" MiXeD"),        redo LOOP if /\G[A-Za-z]+\b[,.;]?\s*/gc;
          print(" alphanumeric"), redo LOOP if /\G[A-Za-z0-9]+\b[,.;]?\s*/gc;
          print(" line-noise"),   redo LOOP if /\G[^A-Za-z0-9]+/gc;
          print ". That's all!\n";
        }
    
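    The same part-by-part scan can be approximated in Python: `Pattern.match(s, pos)` anchors each attempt at `pos`, much like Perl's `\G`, so you can try each regexp in turn where the previous match left off. A minimal sketch (the `tokenize` helper and token names are illustrative, mirroring the Perl example above):

    ```python
    import re

    # Token patterns, in the same priority order as the Perl example
    PATTERNS = [
        ("digits",       re.compile(r"\d+\b[,.;]?\s*")),
        ("lowercase",    re.compile(r"[a-z]+\b[,.;]?\s*")),
        ("UPPERCASE",    re.compile(r"[A-Z]+\b[,.;]?\s*")),
        ("Capitalized",  re.compile(r"[A-Z][a-z]+\b[,.;]?\s*")),
        ("MiXeD",        re.compile(r"[A-Za-z]+\b[,.;]?\s*")),
        ("alphanumeric", re.compile(r"[A-Za-z0-9]+\b[,.;]?\s*")),
        ("line-noise",   re.compile(r"[^A-Za-z0-9]+")),
    ]

    def tokenize(s):
        """Yield token names; each pattern is anchored where the last match ended."""
        pos = 0
        while pos < len(s):
            for name, rx in PATTERNS:
                m = rx.match(s, pos)  # anchored at pos, like Perl's \G
                if m:
                    yield name
                    pos = m.end()
                    break
            else:
                break  # nothing matched; give up

    print(list(tokenize("123 abc XYZ Hello!")))
    # → ['digits', 'lowercase', 'UPPERCASE', 'Capitalized', 'line-noise']
    ```

    As in the Perl version, order matters: "Hello" falls through UPPERCASE (the `\b` fails after the initial "H") and lands on Capitalized.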
