Question
Okay, I will update this again.
I have two files: File1.txt and File2.txt.
File1 is the base template.
File2 holds the status results.
file1.txt
N1,N2,N3,N4,N5,N6
XX,ZZ,XC,EE,RR,BB
XC,CF,FG,RG,GH,GH
file2.txt
DF,GH,MH,FR,FG,GH,NA
XX,ZZ,XC,EE,RR,BB,OK
The command below compares column 1 in both files. If it matches, it retrieves the value from the 7th field in file2 and appends it to file1.txt as the last column, under a new header.
If no match is found, NA is written instead.
Command used:
awk -F, '
FNR==NR { a[$1]=$7; next }
FNR==1 { print $0; len=length($0); next }
{
printf $0
cont=(($1 in a) ? ","a[$1] : ",NA")
for ( i=length($0)+1; i<=len-length(cont); i++)
printf " "
print cont
}
' file2.txt file1.txt > tmp &&
Day 1 - After running the above command
N1,N2,N3,N4,N5,N6,D1
XX,ZZ,XC,EE,RR,BB,OK
XC,CF,FG,RG,GH,GH,NA
Day 2 - After running the above command
N1,N2,N3,N4,N5,N6,D1,D2
XX,ZZ,XC,EE,RR,BB,OK,OK
XC,CF,FG,RG,GH,GH,NA,NA
On Day 3 I inserted a new row at the bottom of File1:
N1,N2,N3,N4,N5,N6,D1,D2
XX,ZZ,XC,EE,RR,BB,OK,OK
XC,CF,FG,RG,GH,GH,NA,NA
DM,LC,VF,GR,GH,ES
Now when I run the above command on Day 3, I need output like this:
N1,N2,N3,N4,N5,N6,D1,D2,D3
XX,ZZ,XC,EE,RR,BB,OK,OK,OK
XC,CF,FG,RG,GH,GH,NA,NA,NA
DM,LC,VF,GR,GH,ES,,,NA
Answer 1:
This awk script seems to do the job:
awk -F, '
BEGIN { OFS = FS }
FNR==NR { a[$1] = $7; next }
FNR==1 { n1 = n = NF + 1; $n = "D" (n-6); print; next }
{ $n1 = ($1 in a) ? a[$1] : "NA"; print }
' file2.txt file1.txt
OFS is the output field separator; FS is the (input) field separator. Both are set to a comma: FS by the -F option and OFS by the assignment. This makes it easy to get the correct number of fields in the output. awk's string concatenation with no operator, exemplified by "D" (n-6), is slightly weird; you get used to it, up to a point, but it still looks a little odd.
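A minimal sketch of both points, using a made-up one-line input rather than the question's files: assigning to a field makes awk rebuild the record with OFS, and two adjacent expressions are concatenated. The variable names default_ofs and comma_ofs are only for illustration.

```shell
# Hypothetical one-line input; not taken from file1.txt.
line='N1,N2,N3'

# Without OFS = FS, the rebuilt record is joined with the default OFS (a space).
default_ofs=$(printf '%s\n' "$line" |
    awk -F, '{ n = NF + 1; $n = "D" (n - 3); print }')

# With OFS = FS, the rebuilt record keeps its commas.
comma_ofs=$(printf '%s\n' "$line" |
    awk -F, 'BEGIN { OFS = FS } { n = NF + 1; $n = "D" (n - 3); print }')

printf '%s\n%s\n' "$default_ofs" "$comma_ofs"
# N1 N2 N3 D1
# N1,N2,N3,D1
```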
Example
The example run uses a program ow that has the synopsis:
ow file cmd …args…
It preserves the contents of the file by having the cmd …args… write to a temporary file, and if the command succeeds (exit status 0) and the output is not empty, it then preserves a copy of the original, ignores a number of signals, and then copies the temporary output over the original and cleans up. It is rather useful (code at the bottom). This is how I did my test. Clearly, I could use tmp=$(mktemp tmp.XXXXXX); awk … file1.txt > $tmp; mv $tmp file1.txt instead, or something along those lines. However, since I have ow, I use it.
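For readers without ow, the mktemp alternative might look like the sketch below. It assumes a scratch directory (workdir is a name introduced here) seeded with the question's sample data, so that nothing real is overwritten, and it replaces file1.txt only when awk exits successfully.

```shell
# Work in a scratch directory so the sketch does not touch real data.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' 'N1,N2,N3,N4,N5,N6' 'XX,ZZ,XC,EE,RR,BB' 'XC,CF,FG,RG,GH,GH' > file1.txt
printf '%s\n' 'DF,GH,MH,FR,FG,GH,NA' 'XX,ZZ,XC,EE,RR,BB,OK' > file2.txt

tmp=$(mktemp tmp.XXXXXX) || exit 1
if awk -F, '
BEGIN   { OFS = FS }
FNR==NR { a[$1] = $7; next }
FNR==1  { n1 = n = NF + 1; $n = "D" (n-6); print; next }
        { $n1 = ($1 in a) ? a[$1] : "NA"; print }
' file2.txt file1.txt > "$tmp"
then
    mv "$tmp" file1.txt      # replace the original only if awk succeeded
else
    rm -f "$tmp"             # leave file1.txt unchanged on failure
fi
cat file1.txt
```

One difference from ow is worth knowing: mv replaces the file rather than copying over it, so file1.txt ends up with the permissions of the temporary file that mktemp created.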
$ cat file1.txt
N1,N2,N3,N4,N5,N6
XX,ZZ,XC,EE,RR,BB
XC,CF,FG,RG,GH,GH
$ ow file1.txt awk -F, '
> BEGIN { OFS = FS }
> FNR==NR { a[$1] = $7; next }
> FNR==1 { n1 = n = NF + 1; $n = "D" (n-6); print; next }
> { $n1 = ($1 in a) ? a[$1] : "NA"; print }
> ' file2.txt file1.txt
$ cat file1.txt
N1,N2,N3,N4,N5,N6,D1
XX,ZZ,XC,EE,RR,BB,OK
XC,CF,FG,RG,GH,GH,NA
$ ow file1.txt awk -F, '
> BEGIN { OFS = FS }
> FNR==NR { a[$1] = $7; next }
> FNR==1 { n1 = n = NF + 1; $n = "D" (n-6); print; next }
> { $n1 = ($1 in a) ? a[$1] : "NA"; print }
> ' file2.txt file1.txt
$ cat file1.txt
N1,N2,N3,N4,N5,N6,D1,D2
XX,ZZ,XC,EE,RR,BB,OK,OK
XC,CF,FG,RG,GH,GH,NA,NA
$ echo DM,LC,VF,GR,GH,ES >> file1.txt
$ ow file1.txt awk -F, '
> BEGIN { OFS = FS }
> FNR==NR { a[$1] = $7; next }
> FNR==1 { n1 = n = NF + 1; $n = "D" (n-6); print; next }
> { $n1 = ($1 in a) ? a[$1] : "NA"; print }
> ' file2.txt file1.txt
$ cat file1.txt
N1,N2,N3,N4,N5,N6,D1,D2,D3
XX,ZZ,XC,EE,RR,BB,OK,OK,OK
XC,CF,FG,RG,GH,GH,NA,NA,NA
DM,LC,VF,GR,GH,ES,,,NA
$
Note that when you assign to $i and i is larger than NF was, NF increases to i, and any missing fields in between are created as empty fields.
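A small illustration of that behaviour, using a hypothetical two-field record instead of the question's data:

```shell
# Assigning to $5 on a two-field record extends NF to 5;
# fields 3 and 4 are created empty.
extended=$(printf 'a,b\n' |
    awk -F, 'BEGIN { OFS = FS } { $5 = "x"; print NF ": " $0 }')
printf '%s\n' "$extended"
# 5: a,b,,,x
```

This is why the DM row in the Day 3 run comes out as DM,LC,VF,GR,GH,ES,,,NA without any explicit padding loop.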
The first working version of this script had more complex logic, with a loop creating the empty fields, but since awk will do that automatically, the script simplified considerably. You'll often find that with a bit of time and care, initial solutions can be simplified and cleaned up.
However, it is probably also relevant to point out that this code is very trusting. It doesn't ensure that there are exactly 7 fields in file2.txt. It doesn't check that each line in file1.txt has either the same number of fields as the first line in the file or exactly 6 fields. If you supply screwy data in, you get screwy data out: the age-old GIGO principle, Garbage In, Garbage Out.
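If you want the script to be less trusting, one possible approach is sketched below. The checks, the error messages, the hdr variable and the deliberately malformed sample row are all additions for illustration, not part of the script above. It also assumes your awk accepts "/dev/stderr" as an output file, which gawk, mawk and most current awks do.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1
# The third line of file1.txt is deliberately short (5 fields).
printf '%s\n' 'N1,N2,N3,N4,N5,N6,D1' 'XX,ZZ,XC,EE,RR,BB,OK' 'DM,LC,VF,GR,GH' > file1.txt
printf '%s\n' 'XX,ZZ,XC,EE,RR,BB,OK' > file2.txt

awk -F, '
BEGIN   { OFS = FS }
# file2.txt: every line must have exactly 7 fields.
FNR==NR && NF != 7 {
    printf("%s:%d: expected 7 fields, got %d\n", FILENAME, FNR, NF) > "/dev/stderr"
    exit 1
}
FNR==NR { a[$1] = $7; next }
# file1.txt: remember the header width before adding the new column.
FNR==1  { hdr = NF; n1 = n = NF + 1; $n = "D" (n-6); print; next }
# file1.txt: data rows must match the header or be new 6-field rows.
NF != hdr && NF != 6 {
    printf("%s:%d: expected %d or 6 fields, got %d\n", FILENAME, FNR, hdr, NF) > "/dev/stderr"
    exit 1
}
        { $n1 = ($1 in a) ? a[$1] : "NA"; print }
' file2.txt file1.txt > out.txt 2> err.txt
status=$?
cat err.txt
```

Because awk exits with a non-zero status on the first malformed line, a wrapper such as ow leaves file1.txt unchanged in that case.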
ow
: "@(#)$Id: ow.sh,v 1.6 2005/06/30 18:14:08 jleffler Exp $"
#
# Overwrite file
# From: The UNIX Programming Environment by Kernighan and Pike
# Amended: remove PATH setting; handle file names with blanks.
case $# in
0|1)    echo "Usage: $0 file command [arguments]" 1>&2
        exit 1;;
esac
file="$1"
shift
new=${TMPDIR:-/tmp}/ovrwr.$$.1
old=${TMPDIR:-/tmp}/ovrwr.$$.2
trap "rm -f '$new' '$old' ; exit 1" 0 1 2 15
if "$@" >"$new"
then
    cp "$file" "$old"
    trap "" 1 2 15
    cp "$new" "$file"
    rm -f "$new" "$old"
    trap 0
    exit 0
else
    echo "$0: $1 failed - $file unchanged" 1>&2
    rm -f "$new" "$old"
    trap 0
    exit 1
fi
Adding date instead of Dn to heading
Is it possible that awk can print a date in the header instead of D1?
If you want the current date added, you have two main options. First, if you are using GNU awk (gawk, often also installed as awk), its time functions make it easy. Failing that, awk -v date=$(date +'%Y-%m-%d') -F, … has the system command date format a value and pass it into the awk script as the variable date, which you can then print where you want it. If you want arbitrary dates passed in, then the second mechanism is the one to use.
awk -F, -v date=$(date +'%Y-%m-%d') '
BEGIN { OFS = FS }
FNR==NR { a[$1] = $7; next }
FNR==1 { n1 = n = NF + 1; $n = date; print; next }
{ $n1 = ($1 in a) ? a[$1] : "NA"; print }
' file2.txt file1.txt
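For comparison, the first option could be sketched as follows. It assumes GNU awk is installed under the name gawk; strftime() is a gawk extension and is not part of POSIX awk. The scratch directory and sample data are only there to keep the example self-contained.

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1
printf '%s\n' 'N1,N2,N3,N4,N5,N6' 'XX,ZZ,XC,EE,RR,BB' 'XC,CF,FG,RG,GH,GH' > file1.txt
printf '%s\n' 'DF,GH,MH,FR,FG,GH,NA' 'XX,ZZ,XC,EE,RR,BB,OK' > file2.txt

gawk -F, '
BEGIN   { OFS = FS; date = strftime("%Y-%m-%d") }   # current local date (gawk extension)
FNR==NR { a[$1] = $7; next }
FNR==1  { n1 = n = NF + 1; $n = date; print; next }
        { $n1 = ($1 in a) ? a[$1] : "NA"; print }
' file2.txt file1.txt > out.txt
cat out.txt
```

This variant can only stamp the current date; passing the date in with -v works with any awk and with any date.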
That forces today's date into the command. You can also do things prospectively or retrospectively, such as:
tmp=$(mktemp coladd.XXXXXXXXX)
trap "rm -f $tmp; exit 1" 0 1 2 3 13 15
for dd in $(seq 1 31)
do
awk -F, -v date="2014-12-$dd" '
BEGIN { OFS = FS }
FNR==NR { a[$1] = $7; next }
FNR==1 { n1 = n = NF + 1; $n = date; print; next }
{ $n1 = ($1 in a) ? a[$1] : "NA"; print }
' file2.txt file1.txt > $tmp
mv $tmp file1.txt
done
Given this extra flexibility, I'd recommend using the externally-defined date over GNU's internal date manipulating functions, but YMMV.
Source: https://stackoverflow.com/questions/27572019/comparing-two-text-files-printing-result-in-new-header