Using bsearch to find index for inserting new element into sorted array

被刻印的时光 ゝ 提交于 2019-11-27 22:16:13

问题


I have a sorted unique array and want to efficiently insert an element into it that is not in the array like this:

a = [1,2,4,5,6]
new_elm = 3
insert_at = a.bsearch_index {|x| x > new_elm } # => 2
a.insert(insert_at, new_elm) # now a = [1,2,3,4,5,6]

The method bsearch_index doesn't exist: only bsearch, which returns the matching element rather than the matching element's index. Is there any built in way to accomplish this?


回答1:


You can use the Enumerator object returned by each_with_index to return a nested array of [value, index] pairs and then conduct your binary search on that array:

a = [1,2,4,5,6]
new_elm = 3

index = [*a.each_with_index].bsearch{|x, _| x > new_elm}.last
=> 2

a.insert(index, new_elm)

EDIT:

I've run some simple benchmarks in response to your question with an array of length 1e6 - 1:

require 'benchmark'

def binary_insert(a,e)
  index = [*a.each_with_index].bsearch{|x, _| x > e}.last
  a.insert(index, e)
end

a = *1..1e6
b = a.delete_at(1e5)
=> 100001

Benchmark.measure{binary_insert(a,b)}
=> #<Benchmark::Tms:0x007fd3883133d8 @label="", @real=0.37332, @cstime=0.0, @cutime=0.0, @stime=0.029999999999999805, @utime=0.240000000000002, @total=0.2700000000000018> 

With this in mind, you might consider trying using a heap or a trie instead of an array to store your values. Heaps in particular have constant insertion and removal time complexities, making them ideal for large storage applications. Check out this article here: Ruby algorithms: sorting, trie, and heaps




回答2:


How about using SortedSet?:

require 'set'

a = SortedSet.new [1,2,4,5,6]
new_elm = 3
a << new_elm # now a = #<SortedSet: {1, 2, 3, 4, 5, 6}>

SortedSet is implemented using rbtree. I've made the following benchmark:

def test_sorted(max_idx)
  arr_1 = (0..max_idx).to_a
  new_elm = arr_1.delete(arr_1.sample)
  arr_2 = arr_1.dup
  set_1 = SortedSet.new(arr_1)
  Benchmark.bm do |x|
    x.report { arr_1.insert(arr_1.index { |x| x > new_elm }) }
    x.report { arr_2.insert([*arr_2.each_with_index].bsearch{|x, _| x > new_elm}.last) }
    x.report { set_1 << new_elm }
  end
end

With the following results:

test_sorted 10_000
# =>       user     system      total        real
# =>   0.000000   0.000000   0.000000 (  0.000900)
# =>   0.010000   0.000000   0.010000 (  0.001868)
# =>   0.000000   0.000000   0.000000 (  0.000007)

test_sorted 100_000
# =>       user     system      total        real
# =>   0.000000   0.000000   0.000000 (  0.001150)
# =>   0.000000   0.010000   0.010000 (  0.048040)
# =>   0.000000   0.000000   0.000000 (  0.000013)

test_sorted 1_000_000
# =>       user     system      total        real
# =>   0.040000   0.000000   0.040000 (  0.062719)
# =>   0.280000   0.000000   0.280000 (  0.356032)
# =>   0.000000   0.000000   0.000000 (  0.000012)



回答3:


"The method bsearch_index doesn't exist": Ruby 2.3 introduces bsearch_index. (Kudos for getting the method name right before it existed).




回答4:


Try this

(0...a.size).bsearch { |n| a[n] > new_element }

This uses bsearch defined on Range to search the array and thus returns the index.


Performance will be way better than each_with_index which materializes O(n) temporary array tuples and thus clogs up garbage collection.




回答5:


Ruby 2.3.1 introduced bsearch_index thus the question can now be resolved this way:

a = [1,2,4,5,6]
new_elm = 3

index = a.bsearch_index{|x, _| x > new_elm}
=> 2

a.insert(index, new_elm)



回答6:


The index method accepts a block and will return the first index where the block is true

a = [1,2,4,5,6] 
new_elem = 3
insert_at = a.index{|b| b > new_elem}
#=> 2
a.insert(insert_at, new_elm) 
#=>[1,2,3,4,5,6]


来源:https://stackoverflow.com/questions/23481725/using-bsearch-to-find-index-for-inserting-new-element-into-sorted-array

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!