How to fix a deadlock in join() in Ruby

Posted by 柔情痞子 on 2019-12-04 17:44:16

Question


I am working with multithreading in Ruby. The code snippet is:

  threads_array = Array.new(num_of_threads)
  1.upto(num_of_threads) do |i|
    Thread.abort_on_exception = true
    threads_array[i-1] = Thread.new {
      catch(:exit) do
        print "s #{i}"
        user_id = nil
        loop do
          user_id = user_ids.pop()
          if user_id == nil
            print "a #{i}"
            Thread.stop()
          end
          dosomething(user_id)
        end
      end
    }
  end
  #puts "after thread"
  threads_array.each {|thread| thread.join}

I am not using any mutex locks, but I get a deadlock. This is the output of the above code snippet:

s 2s 6s 8s 1s 11s 7s 10s 14s 16s 21s 24s 5s 26s 3s 19s 20s 23s 4s 28s 9s 12s 18s 22s 29s 30s 27s 13s 17s 15s 25a 4a 10a 3a 6a 21a 24a 16a 9a 18a 5a 28a 20a 2a 22a 11a 29a 8a 14a 23a 26a 1a 19a 7a 12fatal: deadlock detected

The output above suggests that the deadlock occurs once the user_ids array is empty, and that it involves Thread's join and stop.

What actually is happening and what is the solution to this error?


Answer 1:


The simplest code to reproduce this issue is:

t = Thread.new { Thread.stop }
t.join # => exception in `join': deadlock detected (fatal)

Thread::stop → nil

Stops execution of the current thread, putting it into a “sleep” state, and schedules execution of another thread.

Thread#join → thr
Thread#join(limit) → thr

The calling thread will suspend execution and run thr. Does not return until thr exits or until limit seconds have passed. If the time limit expires, nil will be returned, otherwise thr is returned.

As far as I understand, you call Thread#join without parameters on the thread and wait for it to exit, but the child thread calls Thread.stop and goes to sleep. This is a deadlock: the main thread waits for the child thread to exit, while the child thread sleeps and nothing is left to wake it up.

If you call join with the limit parameter, join returns nil once the timeout expires, so the main thread is no longer blocked forever. The timeout does not abort the child thread; it stays asleep and is killed only when the main thread finishes and the program exits:

t = Thread.new { Thread.stop }
t.join 1 # => Process finished with exit code 0
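
To make the effect of the timed join more concrete, here is a minimal sketch that inspects the thread after the limit expires. The 0.2 second limit and the explicit kill at the end are illustrative choices, not part of the original answer:

```ruby
t = Thread.new { Thread.stop }

result = t.join(0.2)    # wait at most 0.2 seconds (arbitrary limit for the demo)
p result                # nil, because the limit expired before the thread exited
status_after_timeout = t.status
p status_after_timeout  # the thread is still parked by Thread.stop, not terminated

# Clean up explicitly instead of relying on program exit.
t.kill
t.join
p t.alive?
```

The timed join only protects the caller from blocking forever. The sleeping thread still has to be woken with Thread#wakeup or Thread#run, or terminated, if it should not linger until the program exits.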

I would recommend that your worker threads exit once they have finished their work, either by calling Thread.exit or by leaving the infinite loop so that the thread reaches the end of its block normally, for example:

if user_id == nil
  raise StopIteration  # Kernel#loop rescues StopIteration and ends the loop
end

# or
if user_id == nil
  Thread.exit          # terminates the current thread
end
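
Putting this advice together, a worker pool along the lines of the question might look like the sketch below. It is only an illustration under stated assumptions: the sample user_ids, the Mutex guarding the shared array, and the processed queue (standing in for dosomething) are hypothetical additions that do not appear in the question.

```ruby
user_ids = (1..20).to_a    # sample data standing in for the question's user_ids
num_of_threads = 4
lock = Mutex.new           # assumption: guard the shared array explicitly
processed = Queue.new      # thread-safe collector, stands in for dosomething

threads = Array.new(num_of_threads) do
  Thread.new do
    loop do
      user_id = lock.synchronize { user_ids.pop }
      break if user_id.nil?  # no work left: leave the loop so the thread ends
      processed << user_id   # placeholder for dosomething(user_id)
    end
  end
end

threads.each(&:join)         # returns normally because every thread terminates
puts "processed #{processed.size} ids"
```

Because each thread falls off the end of its block instead of parking itself with Thread.stop, join has something to wait for and the deadlock check is never triggered.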



Answer 2:


In addition to Alex Kliuchnikau's answer, I'll add that #join can also raise this error when the thread is waiting on Queue#pop. A simple and deliberate solution is to call #join with a timeout.

This is from ruby 2.2.2:

[27] pry(main)> q=Queue.new
=> #<Thread::Queue:0x00000003a39848>
[30] pry(main)> q << "asdggg"
=> #<Thread::Queue:0x00000003a39848>
[31] pry(main)> q << "as"
=> #<Thread::Queue:0x00000003a39848>
[32] pry(main)> t = Thread.new {
[32] pry(main)*   while s = q.pop
[32] pry(main)*     puts s
[32] pry(main)*   end  
[32] pry(main)* }  
asdggg
as
=> #<Thread:0x00000003817ce0@(pry):34 sleep>
[33] pry(main)> q << "asg"
asg
=> #<Thread::Queue:0x00000003a39848>
[34] pry(main)> q << "ashg"
ashg
=> #<Thread::Queue:0x00000003a39848>
[35] pry(main)> t.join
fatal: No live threads left. Deadlock?
from (pry):41:in `join'
[36] pry(main)> t.join(5)
=> nil
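
As an alternative to the timeout, the consumer can be given a way to finish. The sketch below assumes Ruby 2.3 or later, where Queue#close is available (the session above is from 2.2.2, so it would not work there). The printed array is a hypothetical stand-in for puts so the result can be inspected afterwards:

```ruby
q = Queue.new
printed = []   # collects what the consumer saw, in place of puts

consumer = Thread.new do
  # Queue#pop returns nil once the queue is closed and empty,
  # which ends the while loop and lets the thread finish.
  while (s = q.pop)
    printed << s
  end
end

%w[asdggg as asg ashg].each { |item| q << item }
q.close         # signal that no more items will arrive
consumer.join   # returns because the consumer thread terminates
p printed
```

On older Rubies a similar effect can be had by pushing a sentinel value such as nil onto the queue, since the while loop above stops on the first falsy item.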



Answer 3:


If I understand your intentions correctly, I would consider something simpler (and probably safer; calling user_ids.pop() from within a thread looks scary to me):

user_ids = (0..19).to_a
number_of_threads = 3

user_ids \
  .each_slice(user_ids.length / number_of_threads + 1) \
  .map { |slice| 
      Thread.new(slice) { |s| 
        puts s.inspect 
      }
  }.map(&:join)


Source: https://stackoverflow.com/questions/8925001/how-to-fix-a-deadlock-in-join-in-ruby
