Extracting multiple strings using Python's regular expressions


北荒 2021-01-27 21:42

I have a log file with the following output. I have shortened it here because the full file runs to thousands of lines:

Time = 1

smoothSolver:  Solving for Ux, Initial residual


      
      
        
3 answers

不要未来只要你来 (original poster), 2021-01-27 22:29

You can do this with a regex, assuming the log format is the same for every entry. An explanation of what is going on follows the code:

import re

s = """Time = 1

smoothSolver:  Solving for Ux, Initial residual = 0.230812, Final residual = 0.0134171, No Iterations 2
smoothSolver:  Solving for Uy, Initial residual = 0.283614, Final residual = 0.0158797, No Iterations 3
smoothSolver:  Solving for Uz, Initial residual = 0.190444, Final residual = 0.016567, No Iterations 2
GAMG:  Solving for p, Initial residual = 0.0850116, Final residual = 0.00375608, No Iterations 3
time step continuity errors : sum local = 0.00999678, global = 0.00142109, cumulative = 0.00142109
smoothSolver:  Solving for omega, Initial residual = 0.00267604, Final residual = 0.000166675, No Iterations 3
bounding omega, min: -26.6597 max: 18468.7 average: 219.43
smoothSolver:  Solving for k, Initial residual = 1, Final residual = 0.0862096, No Iterations 2
ExecutionTime = 4.84 s  ClockTime = 5 s

Time = 2

smoothSolver:  Solving for Ux, Initial residual = 0.230812, Final residual = 0.0134171, No Iterations 2
smoothSolver:  Solving for Uy, Initial residual = 0.283614, Final residual = 0.0158797, No Iterations 3
smoothSolver:  Solving for Uz, Initial residual = 0.190444, Final residual = 0.016567, No Iterations 2
GAMG:  Solving for p, Initial residual = 0.0850116, Final residual = 0.00375608, No Iterations 3
time step continuity errors : sum local = 0.00999678, global = 0.00142109, cumulative = 0.00123456
smoothSolver:  Solving for omega, Initial residual = 0.00267604, Final residual = 0.000166675, No Iterations 3
bounding omega, min: -26.6597 max: 18468.7 average: 219.43
smoothSolver:  Solving for k, Initial residual = 1, Final residual = 0.0862096, No Iterations 2
ExecutionTime = 4.84 s  ClockTime = 5 s
"""

# Raw string, so the backslashes reach the regex engine unchanged.
# DOTALL lets "." cross newlines; MULTILINE lets "^" match at each line start.
regex = re.compile(r"^Time = (\d+).*?cumulative = (\d{0,10}\.\d{0,10})", re.DOTALL | re.MULTILINE)

# findall returns one (time, cumulative) tuple per log entry.
for x in regex.findall(s):
    print("{} => {}".format(x[0], x[1]))

This outputs two results (because I've added two log entries, instead of just the one you provided):

1 => 0.00142109
2 => 0.00123456

What is happening?

The regex being used is this:

^Time = (\d+).*?cumulative = (\d{0,10}\.\d{0,10})

This regex looks for your Time = string at the beginning of a line and captures the digits that follow. It then does a non-greedy match (.*?) up to the string cumulative = and captures the decimal number after it. The non-greedy match is important: with a greedy .* you would get only one result for the entire log, because the match would run from the first instance of Time = to the last instance of cumulative =. The time group uses a greedy \d+, because a non-greedy \d+? followed by .*? would capture only the first digit of a multi-digit time such as 10.
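
To see the greedy versus non-greedy difference in isolation, here is a minimal sketch on a shortened, made-up two-entry log. The sample text and variable names are illustrative and are not taken from the original log:

```python
import re

# Shortened, hypothetical two-entry log used only for illustration.
sample = (
    "Time = 1\n"
    "errors : cumulative = 0.1\n"
    "Time = 2\n"
    "errors : cumulative = 0.2\n"
)

flags = re.DOTALL | re.MULTILINE

# Greedy ".*" runs to the LAST "cumulative = ", swallowing the second entry.
greedy = re.findall(r"^Time = (\d+).*cumulative = (\d*\.\d*)", sample, flags)

# Non-greedy ".*?" stops at the FIRST "cumulative = " after each "Time = ".
lazy = re.findall(r"^Time = (\d+).*?cumulative = (\d*\.\d*)", sample, flags)

print(greedy)  # [('1', '0.2')]
print(lazy)    # [('1', '0.1'), ('2', '0.2')]
```

The greedy version pairs the first time with the last cumulative value, which is exactly the failure described above.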

It then prints each result. Each captured result contains the time value and the cumulative value. This portion of the code can be modified to print to a file if required.
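
As one possible way to send the results to a file instead of the console, the sketch below writes them as CSV using the standard csv module. The function name write_cumulative_csv, the column names, and the tiny demo log are assumptions made for illustration; the pattern is adapted from the answer:

```python
import csv
import io
import re

# Pattern adapted from the answer; "pattern" is an illustrative name.
pattern = re.compile(
    r"^Time = (\d+).*?cumulative = (\d{0,10}\.\d{0,10})",
    re.DOTALL | re.MULTILINE,
)

def write_cumulative_csv(log_text, out_file):
    """Write one (time, cumulative) row per log entry to an open text file."""
    writer = csv.writer(out_file)
    writer.writerow(["time", "cumulative"])  # hypothetical column names
    rows = pattern.findall(log_text)
    writer.writerows(rows)
    return len(rows)

# Tiny made-up log for demonstration; a real run would read the log from disk.
demo_log = "Time = 1\nsum local = 0.5, cumulative = 0.25\n"
buffer = io.StringIO()  # stands in for a real file object
count = write_cumulative_csv(demo_log, buffer)
print(buffer.getvalue())
```

To write to disk, pass a file opened with open(path, "w", newline="") in place of the StringIO buffer, where path is whatever output file you choose.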

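This regex works across multiple lines because it uses two flags: DOTALL, which lets . match newline characters, and MULTILINE, which lets ^ match at the start of every line rather than only at the start of the string.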
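
The effect of each flag can be checked separately. This is a small sketch on a short made-up text; the text and variable names are illustrative only:

```python
import re

# Short made-up text; not taken from the original log.
text = "header\nTime = 7\nvalue = 3\n"

# Without MULTILINE, "^" matches only at the very start of the string,
# so a "Time = " on the second line is not found.
no_multiline = re.findall(r"^Time = (\d+)", text)

# With MULTILINE, "^" also matches right after each newline.
with_multiline = re.findall(r"^Time = (\d+)", text, re.MULTILINE)

# Without DOTALL, "." stops at a newline, so the pattern cannot reach "value".
no_dotall = re.findall(r"Time = (\d+).*?value = (\d+)", text)

# With DOTALL, "." matches newlines too, so the match can span lines.
with_dotall = re.findall(r"Time = (\d+).*?value = (\d+)", text, re.DOTALL)

print(no_multiline, with_multiline)  # [] ['7']
print(no_dotall, with_dotall)        # [] [('7', '3')]
```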
    
             
                                                        
            
            
              
                