问题
I have a sorted unique array and want to efficiently insert an element into it that is not in the array like this:
a = [1,2,4,5,6]
new_elm = 3
insert_at = a.bsearch_index {|x| x > new_elm } # => 2
a.insert(insert_at, new_elm) # now a = [1,2,3,4,5,6]
The method bsearch_index doesn't exist: only bsearch, which returns the matching element rather than the matching element's index. Is there any built in way to accomplish this?
回答1:
You can use the Enumerator object returned by each_with_index to return a nested array of [value, index] pairs and then conduct your binary search on that array:
a = [1,2,4,5,6]
new_elm = 3
index = [*a.each_with_index].bsearch{|x, _| x > new_elm}.last
=> 2
a.insert(index, new_elm)
EDIT:
I've run some simple benchmarks in response to your question with an array of length 1e6 - 1:
require 'benchmark'
def binary_insert(a,e)
  index = [*a.each_with_index].bsearch{|x, _| x > e}.last
  a.insert(index, e)
end
a = *1..1e6
b = a.delete_at(1e5)
=> 100001
Benchmark.measure{binary_insert(a,b)}
=> #<Benchmark::Tms:0x007fd3883133d8 @label="", @real=0.37332, @cstime=0.0, @cutime=0.0, @stime=0.029999999999999805, @utime=0.240000000000002, @total=0.2700000000000018> 
With this in mind, you might consider trying using a heap or a trie instead of an array to store your values. Heaps in particular have constant insertion and removal time complexities, making them ideal for large storage applications. Check out this article here: Ruby algorithms: sorting, trie, and heaps
回答2:
How about using SortedSet?:
require 'set'
a = SortedSet.new [1,2,4,5,6]
new_elm = 3
a << new_elm # now a = #<SortedSet: {1, 2, 3, 4, 5, 6}>
SortedSet is implemented using rbtree. I've made the following benchmark:
def test_sorted(max_idx)
  arr_1 = (0..max_idx).to_a
  new_elm = arr_1.delete(arr_1.sample)
  arr_2 = arr_1.dup
  set_1 = SortedSet.new(arr_1)
  Benchmark.bm do |x|
    x.report { arr_1.insert(arr_1.index { |x| x > new_elm }) }
    x.report { arr_2.insert([*arr_2.each_with_index].bsearch{|x, _| x > new_elm}.last) }
    x.report { set_1 << new_elm }
  end
end
With the following results:
test_sorted 10_000
# =>       user     system      total        real
# =>   0.000000   0.000000   0.000000 (  0.000900)
# =>   0.010000   0.000000   0.010000 (  0.001868)
# =>   0.000000   0.000000   0.000000 (  0.000007)
test_sorted 100_000
# =>       user     system      total        real
# =>   0.000000   0.000000   0.000000 (  0.001150)
# =>   0.000000   0.010000   0.010000 (  0.048040)
# =>   0.000000   0.000000   0.000000 (  0.000013)
test_sorted 1_000_000
# =>       user     system      total        real
# =>   0.040000   0.000000   0.040000 (  0.062719)
# =>   0.280000   0.000000   0.280000 (  0.356032)
# =>   0.000000   0.000000   0.000000 (  0.000012)
回答3:
"The method bsearch_index doesn't exist": Ruby 2.3 introduces bsearch_index. (Kudos for getting the method name right before it existed). 
回答4:
Try this
(0...a.size).bsearch { |n| a[n] > new_element }
This uses bsearch defined on Range to search the array and thus returns the index. 
Performance will be way better than each_with_index which materializes O(n) temporary array tuples and thus clogs up garbage collection.
回答5:
Ruby 2.3.1 introduced bsearch_index thus the question can now be resolved this way:
a = [1,2,4,5,6]
new_elm = 3
index = a.bsearch_index{|x, _| x > new_elm}
=> 2
a.insert(index, new_elm)
回答6:
The index method accepts a block and will return the first index where the block is true
a = [1,2,4,5,6] 
new_elem = 3
insert_at = a.index{|b| b > new_elem}
#=> 2
a.insert(insert_at, new_elm) 
#=>[1,2,3,4,5,6]
来源:https://stackoverflow.com/questions/23481725/using-bsearch-to-find-index-for-inserting-new-element-into-sorted-array