I have a file called domain which contains some domains. For example:
google.com
facebook.com
...
yahoo.com
And I have ano
One way using an awk script:
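BEGIN {
    FS = "[. ]"    # split fields on dots and spaces
    OFS = "."      # rejoin the domain parts with dots
}

# First file: index each domain by its first field (e.g. "google").
FNR == NR {
    domain[$1] = $0
    next
}

# Second file: sum the last field of lines whose second field is a known key.
FNR < NR {
    if ($2 in domain) {
        # Rebuild the name from fields 2..NF-1, skipping empty fields.
        for (i = 2; i < NF; i++) {
            if ($i != "") {
                line = (line ? line OFS : "") $i
            }
        }
        total[line] += $NF
        line = ""
    }
}

# Print each name and its total; for-in iteration order is unspecified.
END {
    for (i in total) {
        printf "%s\t%s\n", i, total[i]
    }
}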
Run like:
awk -f script.awk domain.txt site.txt
Results:
facebook.com 37
google.com 18
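To try this end to end without the original files, here is a minimal sketch. The contents of site.txt are not shown above, so the sample lines below are purely hypothetical: they assume each line looks like "sub.domain.tld count", and the counts were picked only so the totals match the results listed. The function name sum_by_domain is also just an illustrative wrapper around the same awk logic.

```shell
#!/bin/sh
# Hypothetical wrapper around the awk logic above: $1 = domain list, $2 = site list.
sum_by_domain() {
    awk '
    BEGIN { FS = "[. ]"; OFS = "." }
    # First file: remember each domain under its first field.
    FNR == NR { domain[$1] = $0; next }
    # Second file: add up the last field for known domains.
    ($2 in domain) {
        line = ""
        for (i = 2; i < NF; i++)
            if ($i != "") line = (line ? line OFS : "") $i
        total[line] += $NF
    }
    END { for (d in total) printf "%s\t%s\n", d, total[d] }
    ' "$1" "$2"
}

# Assumed sample data (NOT the original files).
demo_dir=$(mktemp -d)
printf '%s\n' 'google.com' 'facebook.com' > "$demo_dir/domain.txt"
printf '%s\n' 'mail.google.com 10' 'www.google.com 8' \
              'www.facebook.com 37' 'www.example.org 5' > "$demo_dir/site.txt"

# Sort the output because awk does not guarantee for-in order.
sum_by_domain "$demo_dir/domain.txt" "$demo_dir/site.txt" | sort
rm -rf "$demo_dir"
```

With this assumed input, the www.example.org line is skipped because "example" is not a key from the domain file, and the two google.com lines are summed together. If your real site.txt has a different layout (for example, no subdomain prefix), the $2 test and the loop bounds would need adjusting.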