Extracting multiple strings using Python's regular expressions

北荒 2021-01-27 21:42

I have a log file with the following output. I have shortened it here, as the full file runs to thousands of lines:

Time = 1

smoothSolver:  Solving for Ux, Initial residual         


        
3 Answers
  •  不要未来只要你来
    2021-01-27 22:29

    You can do this with a regex, assuming that the log format is the same for all of your entries. An explanation of what is going on follows the code:

    import re
    
    s = """Time = 1
    
    smoothSolver:  Solving for Ux, Initial residual = 0.230812, Final residual = 0.0134171, No Iterations 2
    smoothSolver:  Solving for Uy, Initial residual = 0.283614, Final residual = 0.0158797, No Iterations 3
    smoothSolver:  Solving for Uz, Initial residual = 0.190444, Final residual = 0.016567, No Iterations 2
    GAMG:  Solving for p, Initial residual = 0.0850116, Final residual = 0.00375608, No Iterations 3
    time step continuity errors : sum local = 0.00999678, global = 0.00142109, cumulative = 0.00142109
    smoothSolver:  Solving for omega, Initial residual = 0.00267604, Final residual = 0.000166675, No Iterations 3
    bounding omega, min: -26.6597 max: 18468.7 average: 219.43
    smoothSolver:  Solving for k, Initial residual = 1, Final residual = 0.0862096, No Iterations 2
    ExecutionTime = 4.84 s  ClockTime = 5 s
    
    Time = 2
    
    smoothSolver:  Solving for Ux, Initial residual = 0.230812, Final residual = 0.0134171, No Iterations 2
    smoothSolver:  Solving for Uy, Initial residual = 0.283614, Final residual = 0.0158797, No Iterations 3
    smoothSolver:  Solving for Uz, Initial residual = 0.190444, Final residual = 0.016567, No Iterations 2
    GAMG:  Solving for p, Initial residual = 0.0850116, Final residual = 0.00375608, No Iterations 3
    time step continuity errors : sum local = 0.00999678, global = 0.00142109, cumulative = 0.00123456
    smoothSolver:  Solving for omega, Initial residual = 0.00267604, Final residual = 0.000166675, No Iterations 3
    bounding omega, min: -26.6597 max: 18468.7 average: 219.43
    smoothSolver:  Solving for k, Initial residual = 1, Final residual = 0.0862096, No Iterations 2
    ExecutionTime = 4.84 s  ClockTime = 5 s
    """
    
    # Raw string so the backslashes reach the regex engine unchanged.
    regex = re.compile(r"^Time = (\d+).*?cumulative = (\d{0,10}\.\d{0,10})", re.DOTALL | re.MULTILINE)
    
    for time_value, cumulative in regex.findall(s):
        print("{} => {}".format(time_value, cumulative))
    

    This outputs two results (because I've added two log entries, instead of just the one you provided):

    1 => 0.00142109
    2 => 0.00123456
    

    What is happening?

    The regex being used is this:

    ^Time = (\d+).*?cumulative = (\d{0,10}\.\d{0,10})
    

    The regex looks for your Time = string at the beginning of a line and captures the digits that follow. The \d+ here is greedy so that multi-digit time steps such as 10 are captured whole. The pattern then does a non-greedy match up to the string cumulative = and captures the number that follows it. The non-greedy match is important; otherwise you'd get only one result for your entire log, because the pattern would match from the first instance of Time = to the last instance of cumulative =.
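
    To see why the non-greedy quantifier matters, here is a small sketch that runs a greedy and a non-greedy variant over a made-up, two-entry log. The sample string, the simplified number pattern, and the variable names are illustrative assumptions, not taken from the real log:

```python
import re

# Made-up, shortened log with two entries (illustrative only).
sample = (
    "Time = 1\n"
    "sum local = 0.1, global = 0.2, cumulative = 0.3\n"
    "Time = 2\n"
    "sum local = 0.4, global = 0.5, cumulative = 0.6\n"
)

flags = re.DOTALL | re.MULTILINE
# .*? stops at the first "cumulative = " after each "Time = ".
lazy = re.compile(r"^Time = (\d+).*?cumulative = (\d+\.\d+)", flags)
# .* runs on to the last "cumulative = " in the whole string.
greedy = re.compile(r"^Time = (\d+).*cumulative = (\d+\.\d+)", flags)

lazy_matches = lazy.findall(sample)
greedy_matches = greedy.findall(sample)

print(lazy_matches)    # one tuple per log entry
print(greedy_matches)  # a single tuple spanning both entries
```

    The greedy variant pairs the first time step with the last cumulative value, which is the failure mode described above.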

    The loop then prints each result; each one is a tuple holding the time value and the cumulative value. This part of the code can be changed to write to a file instead if required.
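
    As one possible way to write the results to a file, the sketch below saves them as CSV. The function name write_cumulative_csv, the column headers, and the file names in the usage note are hypothetical; the pattern is a variant of the one above that captures the whole time step with \d+:

```python
import csv
import re

# Variant of the pattern discussed above; all names here are hypothetical.
PATTERN = re.compile(
    r"^Time = (\d+).*?cumulative = (\d{0,10}\.\d{0,10})",
    re.DOTALL | re.MULTILINE,
)


def write_cumulative_csv(log_text, out_path):
    """Write one (time, cumulative) row per log entry; return the row count."""
    rows = PATTERN.findall(log_text)
    # newline="" lets the csv module control line endings itself.
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["time", "cumulative"])
        writer.writerows(rows)
    return len(rows)
```

    Assuming the log is saved as log.txt, you would read it with open("log.txt").read() and pass the text plus an output path such as "cumulative.csv" to the function.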

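    This regex works across multiple lines because it uses two flags: DOTALL, which lets . match newline characters, and MULTILINE, which lets ^ match at the start of every line rather than only at the start of the string.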
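
    The effect of each flag can be checked in isolation. This is a minimal sketch on a made-up three-line string (the text and variable names are assumptions for illustration):

```python
import re

text = "header\nTime = 1\nvalue = 7\n"

# Without MULTILINE, ^ only matches at the very start of the string,
# and the string starts with "header", so nothing is found.
no_multiline = re.findall(r"^Time = (\d+)", text)
with_multiline = re.findall(r"^Time = (\d+)", text, re.MULTILINE)

# Without DOTALL, . does not match a newline, so the match cannot
# cross from the "Time" line to the "value" line.
no_dotall = re.findall(r"Time = (\d+).*?value = (\d+)", text)
with_dotall = re.findall(r"Time = (\d+).*?value = (\d+)", text, re.DOTALL)

print(no_multiline, with_multiline)
print(no_dotall, with_dotall)
```

    Only the flagged versions find a match, which is why the pattern for the log needs both flags.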
