I'm using Ruby 1.8.6 with Rails 1.2.3, and need to determine whether two arrays have the same elements, regardless of whether or not they're in the same order. One of the
Combining & and size may be fast too.
require 'benchmark/ips'
require 'set'
# Sample inputs (assumed; this snippet did not define a and b)
a = [1, 2, 3, 4, 5, 6]
b = [1, 2, 3, 4, 5, 6]
Benchmark.ips do |x|
  x.report('sort') { a.sort == b.sort }
  x.report('sort!') { a.sort! == b.sort! }
  x.report('to_set') { a.to_set == b.to_set }
  x.report('minus') { ((a - b) + (b - a)).empty? }
  x.report('&.size') { a.size == b.size && (a & b).size == a.size }
end
Calculating -------------------------------------
  sort    896.094k (±11.4%) i/s -  4.458M in 5.056163s
  sort!     1.237M (± 4.5%) i/s -  6.261M in 5.071796s
  to_set  224.564k (± 6.3%) i/s -  1.132M in 5.064753s
  minus     2.230M (± 7.0%) i/s - 11.171M in 5.038655s
  &.size    2.829M (± 5.4%) i/s - 14.125M in 5.010414s
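A caveat on the '&.size' variant: Array#& removes duplicates from its result, so the size comparison can fail for arrays with repeated elements even when they match. A minimal sketch, assuming a hypothetical helper name same_by_intersection? that is not part of the original answer:

```ruby
# Hypothetical helper wrapping the '&.size' check from the benchmark.
# Assumption: it is only reliable when neither array contains duplicates.
def same_by_intersection?(a, b)
  a.size == b.size && (a & b).size == a.size
end

same_by_intersection?([1, 2, 3], [3, 1, 2])  # true: same elements, different order
same_by_intersection?([1, 2, 3], [1, 2, 4])  # false: 3 and 4 differ
same_by_intersection?([1, 1], [1, 1])        # false: (a & b) is [1], so the sizes differ
```

If duplicates are possible, a.sort == b.sort or a count-based comparison is the safer choice.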
Speed comparisons
require 'benchmark/ips'
require 'set'
a = [1, 2, 3, 4, 5, 6]
b = [1, 2, 3, 4, 5, 6]
Benchmark.ips do |x|
  x.report('sort') { a.sort == b.sort }
  x.report('sort!') { a.sort! == b.sort! }
  x.report('to_set') { a.to_set == b.to_set }
  x.report('minus') { ((a - b) + (b - a)).empty? }
end
Warming up --------------------------------------
  sort     88.338k i/100ms
  sort!   118.207k i/100ms
  to_set   19.339k i/100ms
  minus    67.971k i/100ms
Calculating -------------------------------------
  sort      1.062M (± 0.9%) i/s - 5.389M in 5.075109s
  sort!     1.542M (± 1.2%) i/s - 7.802M in 5.061364s
  to_set  200.302k (± 2.1%) i/s - 1.006M in 5.022793s
  minus   783.106k (± 1.5%) i/s - 3.942M in 5.035311s
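One thing to keep in mind when reading the sort! numbers: sort! reorders the receiver in place. The inputs in this benchmark happen to be sorted already, but with unsorted inputs every iteration after the first would be sorting data that is already sorted, and the arrays seen by the later reports would be changed too. The sketch below only illustrates the mutation; the variable names are illustrative.

```ruby
a = [3, 1, 2]
copy = a.dup      # untouched copy for comparison

sorted = a.sort   # non-destructive: returns a new array
a == copy         # true, a is unchanged

a.sort!           # destructive: reorders a itself
a == copy         # false, a is now [1, 2, 3]
```

Benchmarking on a.dup inside the block avoids the mutation, at the cost of also timing the copy.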
This doesn't require conversion to set:
a.sort == b.sort
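Note that sort relies on <=> between elements, so this comparison raises when an array mixes values that cannot be ordered against each other. A minimal sketch, where the helper name same_elements_sorted? is an assumption made for illustration:

```ruby
# Hypothetical helper: compares two arrays ignoring order.
# Duplicates stay significant because sort keeps every element.
def same_elements_sorted?(a, b)
  a.sort == b.sort
end

same_elements_sorted?([3, 1, 2], [1, 2, 3])  # true
same_elements_sorted?([1, 1, 2], [1, 2, 2])  # false: the duplicates differ

begin
  same_elements_sorted?([1, "a"], ["a", 1])
rescue ArgumentError => e
  # An integer and a string cannot be compared with <=>, so sort raises.
  puts "cannot sort: #{e.message}"
end
```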
One approach is to iterate over one array, assuming it has no duplicates, and check each element against the other:
# assume array a has no duplicates and you want to compare to b
!a.map { |n| b.include?(n) }.include?(false)
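The map returns an array of booleans: true for each element of a that is found in b. If any false appears, the outer include?(false) returns true, so the whole expression is negated to report a match. Note that this only verifies that every element of a occurs in b; the arrays hold the same elements only if they also have the same size.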
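The same check can be sketched with all?, which stops at the first missing element instead of building an intermediate array. Membership alone only shows that a is contained in b, so this version also compares sizes. It assumes, as above, that neither array has duplicates; the method name same_members? is hypothetical.

```ruby
# Hypothetical helper; assumes neither array contains duplicates.
def same_members?(a, b)
  # With no duplicates, equal sizes plus "every element of a is in b"
  # means both arrays hold exactly the same elements.
  a.size == b.size && a.all? { |n| b.include?(n) }
end

same_members?([1, 2, 3], [3, 2, 1])  # true
same_members?([1, 2], [1, 2, 3])     # false: sizes differ
same_members?([1, 2, 4], [1, 2, 3])  # false: 4 is not in b
```

Each include? call scans b, so the cost grows with the product of the two array lengths.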
If you know the arrays are of equal length and neither array contains duplicates then this works too:
( array1 & array2 ) == array1
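Explanation: the & operator in this case returns a copy of array1 without any items not found in array2, which equals the original array1 iff both arrays have the same contents with no duplicates.
Analysis: Given that the order is unchanged, I'm guessing this is implemented as a double iteration, so consistently O(n*n), notably worse for large arrays than array1.sort == array2.sort, which should perform with worst-case O(n*logn). That is only a guess: the Ruby documentation describes Array#& as comparing elements using their hash and eql? methods, which suggests a hash-based implementation that would be closer to O(n).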
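To see why the two preconditions matter, here is a small sketch of what happens when they do not hold (array contents chosen for illustration):

```ruby
array1 = [1, 2]
array2 = [2, 1, 3]

# Lengths differ, yet the check passes because every item of array1 is in array2.
length_case = (array1 & array2) == array1            # true, although the arrays differ

# The order of the right operand does not matter: & keeps the left operand's order.
order_case = ([1, 2, 3] & [3, 2, 1]) == [1, 2, 3]    # true

# Duplicates break it the other way: & drops repeats from its result.
dup_case = ([1, 1, 2] & [1, 1, 2]) == [1, 1, 2]      # false, the intersection is [1, 2]
```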
If you expect [:a, :b] != [:a, :a, :b], to_set doesn't work. You can use frequency instead:
class Array
  # Returns a hash mapping each element to the number of times it occurs.
  def frequency
    counts = Hash.new(0)
    each { |v| counts[v] += 1 }
    counts
  end
end
[:a, :b].frequency == [:a, :a, :b].frequency #=> false
[:a, :b].frequency == [:b, :a].frequency #=> true
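If reopening Array is undesirable, the same idea can be sketched as standalone methods. The names element_counts and same_multiset? are hypothetical. On Ruby 2.7 and later, the built-in Array#tally returns the same kind of count hash.

```ruby
# Hypothetical helper: counts how often each element occurs.
def element_counts(array)
  counts = Hash.new(0)
  array.each { |v| counts[v] += 1 }
  counts
end

# True when both arrays hold the same elements the same number of times.
def same_multiset?(a, b)
  element_counts(a) == element_counts(b)
end

same_multiset?([:a, :b], [:a, :a, :b])  # false
same_multiset?([:a, :b], [:b, :a])      # true
```

Hash equality ignores insertion order, so the order of the elements in either array does not affect the result.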