Match regex from right to left?

后端未结

关注

 6  449

Is there any way of matching a regex from right to left? What Im looking for is a regex that gets

MODULE WAS INSERTED              EVENT
LOST SIGNAL ON E1/T


                      
              相关标签:


      
      
        
          6条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  再見小時候        
                
              
                            
                2020-12-11 15:43
              
            
            
                                                                       
Can you do field-oriented processing, rather than a regex?  In awk/sh, this would look like:


   < $datafile awk '{ print $(NF-3), $(NF-2) }' | column


which seems rather cleaner than specifying a regex.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  一个人的身影        
                
              
                            
                2020-12-11 15:50
              
            
            
                                                                       
Does the input file fit nicely into fixed width tabular text like this? Because if it does, then the simplest solution is to just take the right substring of each line, from column 56 to column 94.

In Unix, you can use the cut command:

cut -c56-94 yourfile


See also


Wikipedia/Cut (Unix)




In Java, you can write something like this:

String[] lines = {
    "CLI MUX trap received: (022) CL-B  MCL-2ETH             MODULE WAS INSERTED              EVENT   07-05-2010 12:08:40",
    "CLI MUX trap received: (090) IO-2  ML-1E1        EX1    LOST SIGNAL ON E1/T1 LINK        OFF     04-06-2010 09:58:58",
    "CLI MUX trap received: (094) IO-2  ML-1E1        EX1    CRC ERROR                        EVENT   04-06-2010 09:58:59",
    "CLI MUX trap received: (009)                            CLK IS DIFF FROM MASTER CLK SRC  OFF     07-05-2010 12:07:32",
};
for (String line : lines) {
    System.out.println(line.substring(56, 94));
}


This prints:

MODULE WAS INSERTED              EVENT
LOST SIGNAL ON E1/T1 LINK        OFF  
CRC ERROR                        EVENT
CLK IS DIFF FROM MASTER CLK SRC  OFF  




A regex solution

This is most likely not necessary, but something like this works (as seen on ideone.com):

line.replaceAll(".*  \\b(.+  .+)   \\S+ \\S+", "$1")


As you can see, it's not very readable, and you have to know your regex to really understand what's going on.

Essentially you match this to each line:

.*  \b(.+  .+)   \S+ \S+


And you replace it with whatever group 1 matched. This relies on the usage of two consecutive spaces exclusively for separating the columns in this table.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  谎友^        
                
              
                            
                2020-12-11 15:50
              
            
            
                                                                       
How about

.{56}(.*(EVENT|OFF))

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  无人及你        
                
              
                            
                2020-12-11 15:51
              
            
            
                                                                       
If tokens are guaranteed to be separated by more than one space and words within the string before EVENT|OFF are guaranteed to be separated by just one space - only then you can look for single-space-separated words followed by spaces followed by EVENT or OFF

var s = "CLI MUX trap received: (022) CL-B  MCL-2ETH             MODULE WAS INSERTED              EVENT   07-05-2010 12:08:40"
        + "\nCLI MUX trap received: (090) IO-2  ML-1E1        EX1    LOST SIGNAL ON E1/T1 LINK        OFF     04-06-2010 09:58:58"
        + "\nCLI MUX trap received: (094) IO-2  ML-1E1        EX1    CRC ERROR                        EVENT   04-06-2010 09:58:59"
        + "\nCLI MUX trap received: (009)                            CLK IS DIFF FROM MASTER CLK SRC  OFF     07-05-2010 12:07:32"

var r = /\([0-9]+\).+?((?:[^ ]+ )* +(?:EVENT|OFF))/g;
var m;
while((m = r.exec(s)) != null)
  console.log(m[1]);


Output:

MODULE WAS INSERTED              EVENT
LOST SIGNAL ON E1/T1 LINK        OFF
CRC ERROR                        EVENT
CLK IS DIFF FROM MASTER CLK SRC  OFF




Regex: /\([0-9]+\).+?((?:[^ ]+ )* +(?:EVENT|OFF))/g

\([0-9]+\)       #digits in parentheses followed by  
.+?              #some characters - minimum required (non-greedy)  
(                #start capturing 
(?:[^ ]+ )*      #non-space characters separated by a space  
` +`             #more spaces (separating string and event/off - 
                 #backticks added for emphasis), followed by
(?:EVENT|OFF)    #EVENT or OFF
)                #stop capturing

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  离开以前        
                
              
                            
                2020-12-11 15:53
              
            
            
                                                                       
With regex, you could simply replace this:

^.{56}|.{19}$


with the empty string.

But really, you only need to cut out the string from "position 56" to "string-length - 19" with a substring function. That's easier and much faster than regex.

Here's an example in JavaScript, other languages work more or less the same:

var lines = [
  'CLI MUX trap received: (022) CL-B  MCL-2ETH             MODULE WAS INSERTED              EVENT   07-05-2010 12:08:40',
  'CLI MUX trap received: (090) IO-2  ML-1E1        EX1    LOST SIGNAL ON E1/T1 LINK        OFF     04-06-2010 09:58:58',
  'CLI MUX trap received: (094) IO-2  ML-1E1        EX1    CRC ERROR                        EVENT   04-06-2010 09:58:59',
  'CLI MUX trap received: (009)                            CLK IS DIFF FROM MASTER CLK SRC  OFF     07-05-2010 12:07:32'
];
for (var i=0; i<lines.length; i++) {
  alert( lines[i].substring(56, lines[i].length-19) );
}

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  被撕碎了的回忆        
                
              
                            
                2020-12-11 16:02
              
            
            
                                                                       
In .NET you could use the RightToLeft option :

Regex RE = new Regex(Pattern, RegexOptions.RightToLeft);
Match theMatch = RE.Match(Source);

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复