Question
I have duplicate words in a CSV file, and I need to number each repeated occurrence. I need to go from this:
jsmith
jsmith
kgonzales
shouston
dgenesy
kgonzales
jsmith
to this:
jsmith@email.com
jsmith1@email.com
kgonzales@email.com
shouston@email.com
dgenesy@email.com
kgonzales1@email.com
jsmith2@email.com
I have something like that, but it doesn't work properly for me, or I can't get it to work.
Answer 1:
A simple way to do it is to maintain an array indexed by username and increment the entry each time you read a user, e.g.
awk '{ print (($1 in a) ? $1 a[$1] : $1) "@email.com"; a[$1]++ }' file
The ternary (($1 in a) ? $1 a[$1] : $1) just checks whether the user is already in a[]. If so, it uses the name plus the value stored in the array, $1 a[$1]; if the user is not yet in the array, it uses just the name, $1. The result of the ternary is concatenated with "@email.com" to complete the output.
Lastly, the array element for the user is incremented with a[$1]++.
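The same logic can be spelled out in a longer form, which may be easier to follow than the one-liner. This is only a sketch: the function name add_email_suffix and the array name seen are made up for illustration and are not part of the answer.

```shell
# Hypothetical helper name; wraps the one-liner in a more readable form.
add_email_suffix() {
    awk '{
        # seen[$1] holds how many times this username has appeared so far
        if ($1 in seen)
            name = $1 seen[$1]   # repeat: append the running count
        else
            name = $1            # first occurrence: leave the name unchanged
        print name "@email.com"
        seen[$1]++               # record this occurrence
    }' "$@"
}

printf '%s\n' jsmith jsmith kgonzales jsmith | add_email_suffix
```

Note that the test ($1 in seen) does not create the array element, so a name only counts as seen after the increment on its first line has run.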
Example Use/Output
With your names in a file called users, you would have:
$ awk '{ print (($1 in a) ? $1 a[$1] : $1) "@email.com"; a[$1]++ }' users
jsmith@email.com
jsmith1@email.com
kgonzales@email.com
shouston@email.com
dgenesy@email.com
kgonzales1@email.com
jsmith2@email.com
To Keep E-mail In Input File
If your input already contains an e-mail at the end of the username, then you simply want to output that record and skip to the next record, e.g.
awk '$1~/@/{print; next} { print (($1 in a) ? $1 a[$1] : $1) "@email.com"; a[$1]++ }' users
That will preserve e.meeks@example.org from your comment.
Example Input
jsmith
jsmith
kgonzales
shouston
e.meeks@example.org
dgenesy
kgonzales
jsmith
Example Output
jsmith@email.com
jsmith1@email.com
kgonzales@email.com
shouston@email.com
e.meeks@example.org
dgenesy@email.com
kgonzales1@email.com
jsmith2@email.com
Answer 2:
Could you please try the following, based on the shown samples.
awk '{ n = arr[$0]++; print $0 (n ? n : "") "@email.com" }' Input_file
Explanation: arr[$0]++ evaluates to the number of times the current line ($0) has been seen before, then increments that count. Concatenated directly, it would print a 0 after the first occurrence (jsmith0@email.com), so the count is appended only when it is nonzero. @email.com is then added, which makes the output match what the OP showed.
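To see what the post-increment contributes on each line, here is a small sketch that prints every input line beside the value arr[$0]++ evaluates to. The function name show_counts is hypothetical and used only for this illustration.

```shell
# Hypothetical helper: show the value arr[$0]++ yields for each input line.
show_counts() {
    awk '{ n = arr[$0]++; print $0, n }' "$@"
}

printf '%s\n' jsmith jsmith kgonzales jsmith | show_counts
# prints:
# jsmith 0
# jsmith 1
# kgonzales 0
# jsmith 2
```

The first occurrence of each name evaluates to 0, so a command that concatenates this value unconditionally prints jsmith0@email.com for the first jsmith. Appending the count only when it is nonzero keeps the first name bare.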
Source: https://stackoverflow.com/questions/65836903/renumbering-duplicate-lines-with-counter-awk