Regexp match within a log file: return dynamic content above and below the match

Happy的楠姐 2020-12-12 03:06

I have some catchall log files in a format as follows:

timestamp event summary
foo details
account name: userA
bar more details
timestamp event summary
baz d         


        
4 Answers
  • 2020-12-12 03:37

    Below is a pure Batch solution that does not use grep. It locates timestamp lines by the word "summary", which must not appear in any other line; this word can be changed to another one if needed.

    EDIT: I changed the word that identifies timestamp lines to "Auth."; I also changed the FINDSTR search to ignore case. This is the new version:

    @echo off
    setlocal EnableDelayedExpansion
    
    :parselog <username> <logfile>
    echo Searching %~2 for records containing %~1...
    
    set n=0
    set previousMatch=Auth.
    for /F "tokens=1* delims=:" %%a in ('findstr /I /N "Auth\. %~1" %2') do (
       set currentMatch=%%b
       if "!previousMatch:Auth.=!" neq "!previousMatch!" (
          if "!currentMatch:Auth.=!" equ "!currentMatch!" (
             set /A n+=1
             set /A skip[!n!]=!previousLine!-1
          )
       ) else (
          set /A end[!n!]=%%a-1
       )
       set previousLine=%%a
       set previousMatch=%%b
    )
    if %n% equ 0 (
       echo No records found
       goto :EOF
    )
    
    if not defined end[%n%] set end[%n%]=-1
    set i=1
    :nextRecord
       echo/
       echo ---------------start of record %i%-------------
       if !skip[%i%]! equ 0 (
          set skip=
       ) else (
          set skip=skip=!skip[%i%]!
       )
       set end=!end[%i%]!
       for /F "%skip% tokens=1* delims=:" %%a in ('findstr /N "^" %2') do (
          echo(%%b
          if %%a equ %end% goto endOfRecord
       )
       :endOfRecord
       echo ---------------end of record %i%-------------
       set /A i+=1
    if %i% leq %n% goto nextRecord
    

    Example command:

    C:>test user6q catch-all.log
    

    Result:

    Searching catch-all.log for records containing user6q...
    
    ---------------start of record 1-------------
    2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security    11730159    Mon Mar 25 08:02:29 2013    680 Security    NT AUTHORITY\SYSTEM N/A Audit Failure   dc3 9   Logon attempt by:   MICROSOFT_AUTHENTICATION_PACKAGE_V1_0
    
    Logon account:  USER6Q
    
    Source Workstation: dc3
    
    Error Code: 0xC0000234
    ---------------end of record 1-------------
    
    ---------------start of record 2-------------
    2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security    11730160    Mon Mar 25 08:02:29 2013    539 Security    NT AUTHORITY\SYSTEM N/A Audit Failure   dc3 2   Logon Failure:
    
        Reason:     Account locked out
    
        User Name:  USER6Q@MYDOMAIN.TLD
    
        Domain: MYDOMAIN
    
        Logon Type: 3
    
        Logon Process:  Advapi  
    
        Authentication Package: Negotiate
    
        Workstation Name:   dc3
    
        Caller User Name:   dc3$
    
        Caller Domain:  MYDOMAIN
    
        Caller Logon ID:    (0x0,0x3E7)
    
        Caller Process ID: 400
    
        Transited Services: -
    
        Source Network Address: 169.254.7.89
    
        Source Port:    55314
    ---------------end of record 2-------------
    

    This method uses a single execution of findstr to locate all matching records, and then one additional findstr execution to show each record. Note that the first for /F command works over the results of findstr "Auth. user..", while the second for /F command has a "skip=N" option and a GOTO that breaks the loop as soon as the record has been displayed. This means that the FOR commands do not slow down the program; its speed depends on the speed of the FINDSTR command.

    However, the second for /F "%skip% ... in ('findstr /N "^" %2') command may take too long, because the whole FINDSTR output is collected before the FOR starts processing it. If this happens, the second FOR could be replaced by a faster method (an asynchronous pipe that is broken early, for example). Please report the result.

    Antonio
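
The two-pass idea described above (one scan to locate the wanted records, then a second read that skips ahead and stops early) can also be sketched outside of Batch. The following is only an illustrative POSIX shell and awk sketch, not the Batch code from this answer: the sample log, the variable names `log`, `ranges` and `records`, and the `start end` range format (with `0` meaning "up to the end of the file") are all assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sample data in the shape of the log shown in the results above.
log='2013-03-25 08:02:32 Auth.Critical first event
Logon account:  USER6Q
2013-03-25 08:02:33 Auth.Info second event
Logon account:  OTHER
2013-03-25 08:02:34 Auth.Critical third event
User Name:  user6q@MYDOMAIN.TLD'

# Pass 1: a single case-insensitive scan keeps only header lines ("Auth.")
# and lines naming the user, each prefixed with its line number.
# A user line opens a record at the most recent header; the next header closes it.
ranges=$(printf '%s\n' "$log" | grep -n -i -e 'Auth\.' -e 'user6q' | awk -F: '
    { header = (index(tolower($0), "auth.") > 0) }
    header && start   { print start, $1 - 1; start = 0 }
    !header && !start { start = lastHeader }
    header            { lastHeader = $1 }
    END               { if (start) print start, 0 }   # 0: record runs to end of file
')

# Pass 2: for each range, print from the first line and quit at the last one,
# which mirrors the skip=N option plus the GOTO that leaves the loop.
records=$(printf '%s\n' "$ranges" | {
    i=0
    while read -r first last; do
        [ -n "$first" ] || continue
        i=$((i + 1))
        echo "--- start of record $i ---"
        printf '%s\n' "$log" | awk -v first="$first" -v last="$last" '
            NR >= first + 0 { print }
            last + 0 > 0 && NR == last + 0 { exit }
        '
        echo "--- end of record $i ---"
    done
})

printf '%s\n' "$records"
```

On this sample the first pass yields the ranges `1 2` and `5 0`, so only the first and third events are printed; the second event, which belongs to another account, is skipped.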

  • 2020-12-12 03:43

    Here's my effort:

    @ECHO OFF
    SETLOCAL
    ::
    :: Target username
    ::
    SET target=%1
    CALL :zaplines
    SET count=0
    FOR /f "delims=" %%I IN (rojoslog.txt) DO (
      ECHO.%%I| findstr /r "^20[0-9][0-9]-[0-9][0-9]-[0-9][0-9].[0-9][0-9]:[0-9][0-9]:[0-9][0-9]" >NUL
      IF NOT ERRORLEVEL 1 (
        IF DEFINED founduser CALL :report
        CALL :zaplines
      )
      (SET stored=)
      FOR /l %%L IN (1000,1,1200) DO IF NOT DEFINED stored IF NOT DEFINED line%%L (
        SET line%%L=%%I
        SET stored=Y
       )
      ECHO.%%I|FINDSTR /b /e /i /c:"account name: %target%" >NUL
      IF NOT ERRORLEVEL 1 (SET founduser=Y)
    )
    IF DEFINED founduser CALL :report
    GOTO :eof
    
    ::
    :: remove all envvars starting 'line'
    :: Set 'not found user' at same time
    ::
     :zaplines
    (SET founduser=)
    FOR /f "delims==" %%L IN ('set line 2^>nul') DO (SET %%L=)
    GOTO :eof
    
    :report
    IF NOT DEFINED line1000 GOTO :EOF 
    SET /a count+=1
    ECHO.
    ECHO.---------- START of record %count% ----------
    FOR /l %%L IN (1000,1,1200) DO IF DEFINED line%%L CALL ECHO.%%line%%L%%
    ECHO.----------- END of record %count% -----------
    GOTO :eof
    
  • 2020-12-12 03:51

    This is all you need with GNU awk (for IGNORECASE):

    $ cat tst.awk
    function prtRecord() {
        if (record ~ regexp) {
            printf "-------- start of record %d --------%s", ++numRecords, ORS
            printf "%s", record
            printf "--------- end of record %d ---------%s%s", numRecords, ORS, ORS
        }
        record = ""
    }
    BEGIN{ IGNORECASE=1 }
    /^[[:digit:]]+-[[:digit:]]+-[[:digit:]]+/ { prtRecord() }
    { record = record $0 ORS }
    END { prtRecord() }
    

    or with any awk:

    $ cat tst.awk
    function prtRecord() {
        if (tolower(record) ~ tolower(regexp)) {
            printf "-------- start of record %d --------%s", ++numRecords, ORS
            printf "%s", record
            printf "--------- end of record %d ---------%s%s", numRecords, ORS, ORS
        }
        record = ""
    }
    /^[[:digit:]]+-[[:digit:]]+-[[:digit:]]+/ { prtRecord() }
    { record = record $0 ORS }
    END { prtRecord() }
    

    Either way you'd run it on UNIX as:

    $ awk -v regexp=user6q -f tst.awk file
    

    I don't know the Windows syntax but I expect it's very similar if not identical.

    Note the use of tolower() in the second script to make both sides of the comparison lower case, so that the match is case-insensitive. If you can instead pass in a search regexp that already has the correct case, you don't need to call tolower() on either side of the comparison. It's no big deal either way; dropping the calls might just speed the script up slightly.

    $ awk -v regexp=user6q -f tst.awk file
    -------- start of record 1 --------
    2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security
        11730159    Mon Mar 25 08:02:29 2013    680 Security    NT AUTHORITY\SYSTEM N/A Audit Failure
    dc3 9   Logon attempt by:   MICROSOFT_AUTHENTICATION_PACKAGE_V1_0
    
    Logon account:  USER6Q
    
    Source Workstation: dc3
    
    Error Code: 0xC0000234
    --------- end of record 1 ---------
    
    -------- start of record 2 --------
    2013-03-25 08:02:32 Auth.Critical   169.254.8.110   Mar 25 08:02:32 dc3 MSWinEventLog   2   Security
        11730160    Mon Mar 25 08:02:29 2013    539 Security    NT AUTHORITY\SYSTEM N/A Audit Failure
    dc3 2   Logon Failure:
    
        Reason:     Account locked out
    
        User Name:  USER6Q@MYDOMAIN.TLD
    
        Domain: MYDOMAIN
    
        Logon Type: 3
    
        Logon Process:  Advapi
    
        Authentication Package: Negotiate
    
        Workstation Name:   dc3
    
        Caller User Name:   dc3$
    
        Caller Domain:  MYDOMAIN
    
        Caller Logon ID:    (0x0,0x3E7)
    
        Caller Process ID: 400
    
        Transited Services: -
    
        Source Network Address: 169.254.7.89
    
        Source Port:    55314
    --------- end of record 2 ---------
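
To see the effect of the tolower() comparison in isolation, here is a minimal sketch that counts matching lines. The three input lines and the mixed-case pattern `User6Q` are made up for the demonstration; only tolower() and the `-v regexp=...` convention come from the script above.

```shell
#!/bin/sh
# Count the lines that match the pattern regardless of letter case.
# The input lines and the pattern User6Q are hypothetical test data.
hits=$(printf '%s\n' \
        'Logon account:  USER6Q' \
        'User Name:  user6q@MYDOMAIN.TLD' \
        'Logon account:  OTHER' |
    awk -v regexp='User6Q' '
        tolower($0) ~ tolower(regexp) { n++ }   # both sides lower-cased, as in prtRecord()
        END { print n + 0 }
    ')
echo "$hits"
```

Lower-casing the pattern is only safe for simple patterns such as a user name. A regexp in which letter case carries meaning (for example the gawk operators `\W` or `\S`) would be altered by tolower(), so in that case it is probably better to pass a correctly cased pattern and drop the calls.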
    
  • 2020-12-12 03:52

    I think awk is all you need:

    awk "/---start of record---/,/---end of record---/ {print}" logfile
    

    That's all you need if the first line indicator is:

    ---start of record---
    

    and the last is:

    ---end of record---
    

    Note that there is no pattern for the lines in between: the "," simply separates the two regexps that mark the first and last line of the range.
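
A short demonstration of the range pattern follows. It assumes, as this answer does, that the log already contains literal marker lines; the five sample lines are invented for the example, and the sample in the question does not show such markers.

```shell
#!/bin/sh
# Print every line from a start marker through the matching end marker.
# The sample lines are hypothetical; only the awk program comes from the answer.
selected=$(printf '%s\n' \
        'noise before' \
        '---start of record---' \
        'account name: userA' \
        '---end of record---' \
        'noise after' |
    awk '/---start of record---/,/---end of record---/ { print }')
printf '%s\n' "$selected"
```

The range prints whole marked blocks and does not filter by user name, so a further test on the block contents would still be needed to answer the original question.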
