Question
This one works:
arr[0]="XX1 1"
arr[1]="XX2 2"
arr[2]="XX3 3"
arr[3]="XX4 4"
arr[4]="XX5 5"
arr[5]="XX1 1"
arr[6]="XX7 7"
arr[7]="XX8 8"
duplicate() { printf '%s\n' "${arr[@]}" | sort -cu |& awk -F: '{ print $5 }'; }
duplicate_match=$(duplicate)
echo "array: ${arr[@]}"
# echo "duplicate: $duplicate_match"
[[ ! $duplicate_match ]] || { echo "Found duplicate:$duplicate_match"; exit 0; }
echo "no duplicate"
With the same code, this one doesn't work. Why?
arr[0]="XX"
arr[1]="wXyz"
arr[2]="ABC"
arr[3]="XX"
Answer 1:
To check for duplicates, this code is much simpler and works in both cases:
uniqueNum=$(printf '%s\n' "${arr[@]}"|awk '!($0 in seen){seen[$0];c++} END {print c}')
(( uniqueNum != ${#arr[@]} )) && echo "Found duplicates"
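One possible reading of why the original sort -cu check misfires on the second array: sort -c reports the first line that is out of order, not the first repeated line. In the first array the repeated "XX1 1" happens to be the first out-of-order line; in the second, "ABC" is out of order before the second "XX" is ever reached. The sketch below assumes bash and pins LC_ALL=C so the ordering is predictable. The sort | uniq -d pipeline is a swapped-in alternative, not part of the code above, and the names check_msg and dups are illustrative.

```shell
#!/usr/bin/env bash
arr=("XX" "wXyz" "ABC" "XX")

# sort -c stops at the first out-of-order line. Here that is "ABC"
# (line 3), which is not a duplicate at all.
check_msg=$(printf '%s\n' "${arr[@]}" | LC_ALL=C sort -cu 2>&1)
echo "sort -cu says: $check_msg"

# Sorting first makes equal lines adjacent; uniq -d then prints each
# repeated line once.
dups=$(printf '%s\n' "${arr[@]}" | LC_ALL=C sort | uniq -d)
echo "duplicates: $dups"
```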
EDIT: To print the duplicates, use this awk command:
printf '%s\n' "${arr[@]}"|awk '!($0 in seen){seen[$0];next} 1'
The awk command stores each line in the array seen if it isn't already there, and next then skips to the next line. The trailing 1 is therefore reached only for lines that are already in seen, so it prints only the duplicates.
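As a minimal sketch, the two awk commands can be combined into one check on the second array from the question. This assumes bash and that no element contains a newline; the variable dups is introduced here for illustration.

```shell
#!/usr/bin/env bash
arr=("XX" "wXyz" "ABC" "XX")

# Count distinct elements; a mismatch with the array length means duplicates.
uniqueNum=$(printf '%s\n' "${arr[@]}" | awk '!($0 in seen){seen[$0];c++} END {print c}')

# Print every repeated occurrence (an element seen three times prints twice).
dups=$(printf '%s\n' "${arr[@]}" | awk '!($0 in seen){seen[$0];next} 1')

if (( uniqueNum != ${#arr[@]} )); then
    echo "Found duplicates: $dups"
else
    echo "no duplicate"
fi
```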
Answer 2:
Slightly silly solution here. I just wanted to see if I could do this in a single command without explicit pipes. (I think for very large arrays/array elements, explicit pipes might actually be more efficient.)
Note that this is a test for the presence of duplicate array elements and doesn't output the duplicates themselves, although the awk command on its own will do that. Also note that if you're unlucky enough to have array elements that contain spaces, the command below won't behave as described.
[[ $( awk -v RS=" " ' a[$0]++ ' <<< "${arr[@]} " ) ]] && echo "dups found"
Explanation:
awk -v RS=" " - run the subsequent awk command on each input record, with a space as the record separator. Basically, this makes awk treat each array element as a separate "line".
' a[$0]++ ' - an awk command that does two things:
1. Returns the value at key $0 in array a. If this is greater than 0, awk prints the line. Compare to awk ' { $1=$2 } 1 '.
2. Adds 1 to the value at key $0 in array a.
<<< "${arr[@]} " - as the input of the awk command, use the string created when you print each element in arr as a separate word, i.e. separated by spaces PLUS AN ADDITIONAL SPACE AT THE END. The space between } and " is actually really important: without it, the final array element is not followed by a space, so its record also picks up the newline that the here-string appends and never matches an earlier identical element.
[[ $( ... ) ]] - if the contained awk command gives any output at all, the test returns exit status 0, i.e. TRUE.
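To illustrate the caveat about spaces, here is a hedged sketch that compares the space-separated form with a newline-separated variant. The variant feeds awk through process substitution, which is a swapped-in change and not part of the one-liner above. It assumes bash and that no element contains a newline; by_space and by_line are illustrative names.

```shell
#!/usr/bin/env bash
arr=("XX1 1" "XX2 1" "XX3 1")   # no duplicate elements, but the word "1" repeats

# Space-separated records, as in the one-liner above: the word "1" repeats,
# so this reports a duplicate that is not really there.
by_space=$(awk -v RS=" " 'a[$0]++' <<< "${arr[@]} ")

# Newline-separated records: each whole element is compared.
by_line=$(awk 'a[$0]++' < <(printf '%s\n' "${arr[@]}"))

[[ $by_space ]] && echo "space-separated: dups found (false positive)"
[[ $by_line ]] || echo "newline-separated: no dups"
```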
Source: https://stackoverflow.com/questions/22055238/search-duplicate-element-array