Bash script that analyzes report files

Posted by ぃ、小莉子 on 2019-12-02 02:04:30

As suggested by Dave Jarvis, awk:

  • handles this better than bash
  • is fairly easy to learn
  • is likely available wherever bash is available

I've never had to look farther than The AWK Manual.

It would make things easier if you used a consistent field separator for both the list of column names and the data. Perhaps you could do some pre-processing with sed in a bash script before feeding the data to awk. Either way, take a look at the sections on multi-dimensional arrays and on reading multiple lines in the manual.
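As a minimal sketch of that pre-processing idea, assume (hypothetically) that the column-name line is comma-separated while the data lines are dash-separated. sed could then rewrite the commas to dashes so that awk can split every line with a single -F. The sample lines and the helper name normalize_separators are made up for illustration; the real report layout may differ.

```shell
#!/bin/bash

# Hypothetical helper: turn every comma into a dash so that the header
# and the data lines share one field separator.
normalize_separators() {
    sed 's/,/-/g'
}

# Made-up sample: a comma-separated header followed by a dash-separated data line.
sample='Issue,Module,Severity
1212-core-3'

# With one separator, awk can address fields uniformly via -F.
normalized=$(printf '%s\n' "$sample" | normalize_separators)
first_fields=$(printf '%s\n' "$normalized" | awk -F'-' '{ print $1 }')

printf '%s\n' "$normalized"
```

Going the other way (dashes to commas) would work just as well; what matters is that awk sees one separator throughout.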

Below is a working awk implementation that uses its pseudo multidimensional arrays. I've included sample output to show you how it looks. I took the liberty of adding a 'Count' column to show how many times a certain "Issue" was hit for a given Error Code.

#!/bin/bash

awk '
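 # Section header: remember which error code the following lines belong to
 /Error Code for Issue/ {
   errCode[currCode=$5]=$5
 }
 # Data line: count each issue (the text before the first "-") per error code
 /^ +[0-9-]+$/ {
   split($0, tmpArr, "-")
   error[errCode[currCode],tmpArr[1]]++
 }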
 END {
   for (code in errCode) {
     printf("Error Code: %s\n", code)
     for (item in error) {
       split(item, subscr, SUBSEP)
       if (subscr[1] == code) {
         printf("\tIssue: %s\tCount: %s\n", subscr[2], error[item])
       }
     }
   }
 }
' *_report*.txt

Output

$ ./report.awk
Error Code: B
        Issue:    1212  Count: 3
Error Code: X
        Issue:    2211  Count: 1
        Issue:    1143  Count: 2
Error Code: Y
        Issue:    2961  Count: 1
        Issue:    6666  Count: 1
        Issue:    5555  Count: 2
        Issue:    5911  Count: 1
        Issue:    4949  Count: 1
Error Code: Z
        Issue:    2222  Count: 1
        Issue:    1111  Count: 1
        Issue:    2323  Count: 2
        Issue:    3333  Count: 1
        Issue:    1212  Count: 1
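
The script relies on how awk stores a two-subscript array: error[a,b] is really a one-dimensional array whose key is a and b joined by SUBSEP. Here is a minimal sketch of that mechanism, using made-up codes and issue numbers rather than the real reports, that builds such keys and splits them apart again:

```shell
#!/bin/bash

# Made-up input: one "code issue" pair per line.
pairs='X 1143
X 1143
Y 5555'

summary=$(printf '%s\n' "$pairs" | awk '
  { count[$1, $2]++ }                 # key is stored as $1 SUBSEP $2
  END {
    for (key in count) {
      split(key, part, SUBSEP)        # recover the two subscripts
      printf("%s %s %d\n", part[1], part[2], count[key])
    }
  }
' | sort)                             # for-in order is unspecified, so sort

printf '%s\n' "$summary"
```

Because awk does not define the order of a for-in loop, the grouping in the sample output may appear in a different order from run to run unless the result is sorted.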

Bash has one-dimensional arrays that are indexed by integers. Bash 4 adds associative arrays. That's it for data structures. AWK has one-dimensional associative arrays and fakes its way through two-dimensional arrays. If you need a data structure more advanced than that, you'll need to use another language, Python for example.
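
To illustrate the Bash side of that comparison, here is a small sketch that assumes Bash 4 or newer. A composite "code:issue" key stands in for the second dimension that Bash lacks. The input pairs and the key format are made up for illustration.

```shell
#!/bin/bash
# Requires Bash 4 or newer for "declare -A".

declare -A count    # associative array: string keys, one dimension only

# Made-up (code, issue) pairs; the composite key emulates a 2-D index.
while read -r code issue
do
    key="$code:$issue"
    count[$key]=$(( ${count[$key]:-0} + 1 ))
done <<'EOF'
X 1143
X 1143
Y 5555
EOF

# Split each composite key back into its two parts when reporting.
for key in "${!count[@]}"
do
    echo "Code: ${key%%:*}  Issue: ${key#*:}  Count: ${count[$key]}"
done
```

This only works while the chosen delimiter never appears in the code itself, which is the same trade-off awk makes with SUBSEP.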

That said, here's a rough outline of how you might parse the data you've shown.

#!/bin/bash    

# methods
analyzeStructuralErrors()
{
    local f=$1
    local Xpat="Error Code for Issue X"
    local notXpat="Error Code for Issue [^X]"
    local flag=false    # true while inside an "Issue X" section
    local line issue field
    local -a columns=() issues=() array=()

    while IFS= read -r line
    do
        if [[ $line =~ $Xpat ]]
        then
            flag=true
        elif [[ $line =~ $notXpat ]]
        then
            flag=false
        elif $flag && [[ $line =~ , ]]
        then
            # columns could be overwritten if there is more than one X section
            IFS=, read -ra columns <<< "$line"
        elif $flag && [[ $line =~ - ]]
        then
            issues+=("$line")
        else
            echo "unrecognized data line"
            echo "$line"
        fi
    done < "$f"    # read from the report file passed as $1

    for issue in "${issues[@]}"
    do
        IFS=- read -ra array <<< "$issue"
        # do something with ${array[0]}, ${array[1]}, etc.
        # or iterate
        for field in "${array[@]}"
        do
            # do something with $field
            :    # no-op placeholder; a loop body cannot be empty
        done
    done
}

# main
find . -name "*_report*.txt" | while IFS= read -r f
do
    echo "Processing $f"
    analyzeStructuralErrors "$f"
done