search duplicate element array

Submitted by ∥☆過路亽.° on 2020-01-24 12:25:21

Question


This one works:

arr[0]="XX1 1"
arr[1]="XX2 2" 
arr[2]="XX3 3"
arr[3]="XX4 4"
arr[4]="XX5 5"
arr[5]="XX1 1"
arr[6]="XX7 7"
arr[7]="XX8 8"

duplicate() { printf '%s\n' "${arr[@]}" | sort -cu |& awk -F: '{ print $5 }'; }

duplicate_match=$(duplicate)

echo "array: ${arr[@]}"

# echo "duplicate: $duplicate_match"

[[ ! $duplicate_match ]] || { echo "Found duplicate:$duplicate_match"; exit 0; }

echo "no duplicate"

With the same code, this one doesn't work. Why?

arr[0]="XX"
arr[1]="wXyz" 
arr[2]="ABC"
arr[3]="XX"

Answer 1:


To check for duplicates, this code is much simpler and works in both cases:

uniqueNum=$(printf '%s\n' "${arr[@]}"|awk '!($0 in seen){seen[$0];c++} END {print c}')

(( uniqueNum != ${#arr[@]} )) && echo "Found duplicates"

EDIT: To print duplicates use this awk:

printf '%s\n' "${arr[@]}"|awk '!($0 in seen){seen[$0];next} 1'

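The awk command adds each line to the array seen the first time that line appears, then uses next to skip to the next line. The trailing 1 is an always-true pattern, so it is only reached for lines that were already in seen, and it prints exactly those duplicate lines.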
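
As a rough sketch of how the two awk commands above fit together, they can be wrapped in small functions and run against both arrays from the question. The function names has_duplicates and print_duplicates and the array names arr1 and arr2 are made up for this illustration. The sketch assumes bash and that no element contains a newline.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; the awk programs are the ones from this answer.

# Succeeds (exit status 0) when at least one argument occurs more than once.
has_duplicates() {
  (( $# )) || return 1   # no elements: nothing can be duplicated
  local unique
  # Count distinct lines; each argument is printed on its own line.
  unique=$(printf '%s\n' "$@" | awk '!($0 in seen){seen[$0];c++} END {print c}')
  (( unique != $# ))
}

# Prints every repeated occurrence of an argument, one per line.
print_duplicates() {
  printf '%s\n' "$@" | awk '!($0 in seen){seen[$0];next} 1'
}

# The two arrays from the question.
arr1=("XX1 1" "XX2 2" "XX3 3" "XX4 4" "XX5 5" "XX1 1" "XX7 7" "XX8 8")
arr2=("XX" "wXyz" "ABC" "XX")

if has_duplicates "${arr1[@]}"; then
  echo "arr1 duplicates: $(print_duplicates "${arr1[@]}")"
fi
if has_duplicates "${arr2[@]}"; then
  echo "arr2 duplicates: $(print_duplicates "${arr2[@]}")"
fi
```

Passing the elements as "$@" keeps each one intact, so elements containing spaces (as in the first array) are compared as whole strings.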




Answer 2:


Slightly silly solution here. I just wanted to see if I could do this in a single command without explicit pipes. (I think for very large arrays/array elements, explicit pipes might actually be more efficient.)

Note that this is a test for the presence of duplicate array elements, and doesn't output the duplicates themselves, although the awk command on its own will do that. Also note that if you're unlucky enough to have array elements that contain spaces, the below won't evaluate as described.

[[ $( awk -v RS=" " ' a[$0]++ ' <<< "${arr[@]} " ) ]] && echo "dups found"

Explanation:

awk -v RS=" "

  • do the subsequent awk command on each input record with space as the record separator. Basically, this will make awk treat each array element as a separate "line".

' a[$0]++ '

  • awk command that does two things:

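    • Return the value at key $0 in array a. If this is greater than 0 (the record has been seen before), print the record. Compare to awk ' { $1=$2 } 1 ', where the trailing 1 is likewise a true pattern that triggers the default print action.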

    • Add 1 to the value at key $0 in array a.

<<< "${arr[@]} "

  • as the input of the awk command, use the string created when you print each element in arr as a separate word, i.e. separated by space PLUS AN ADDITIONAL SPACE AT THE END.

  • The space between } and " is actually really important. Without it, the final array element is followed not by a space but by the newline that the here-string appends, so awk reads it as a record ending in a newline, and it will never match an earlier, identical element.

[[ $( ... ) ]]

  • If the enclosed awk command gives any output at all, the test returns exit status 0, i.e. TRUE.
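
The caveat about spaces matters for the question's own data, because elements such as "XX1 1" contain a space. One possible variant, sketched here under the assumption that bash is in use and that no element contains a newline, separates the elements with newlines instead of spaces and still avoids an explicit pipe by using process substitution. The function name has_dups and the array name spaced are invented for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when any argument repeats.
# Records are separated by newlines, so spaces inside elements are harmless.
has_dups() {
  (( $# )) || return 1   # no elements: nothing can be duplicated
  [[ $(awk 'a[$0]++' < <(printf '%s\n' "$@")) ]]
}

# No element repeats here, but the word "1" appears twice.
spaced=("a 1" "b 1")

if has_dups "${spaced[@]}"; then
  echo "newline-separated version: dups found"
else
  echo "newline-separated version: no dups"
fi

# For comparison, the space-separated command from this answer splits
# "a 1" and "b 1" into the words a, 1, b, 1 and so sees "1" twice.
if [[ $(awk -v RS=" " 'a[$0]++' <<< "${spaced[@]} ") ]]; then
  echo "space-separated version: dups found"
fi
```

With newline-separated input there is no need for the extra trailing space, since printf ends every element, including the last one, with the record separator.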


Source: https://stackoverflow.com/questions/22055238/search-duplicate-element-array
