The following is my Celluloid code.
client1.rb: the first of the two clients (I named it client 1).
client2.rb: the second of the two clients (named client 2).
Using your gists, I verified this issue can be reproduced in MRI 2.2.1 as well as JRuby 1.7.21 and Rubinius 2.5.8 ... The difference between server1.rb and server2.rb is the use of DisplayMessage and its message class method in the latter.
The sleep in DisplayMessage is outside Celluloid's scope. When sleep is used in server1.rb it is actually using Celluloid.sleep, but when used in server2.rb it is using Kernel.sleep ... which locks up the mailbox for Server until 60 seconds have passed. This prevents further method calls on that actor from being processed until the mailbox is processing messages (method calls on the actor) again.
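To make the locked-up mailbox concrete, here is a minimal sketch in plain Ruby with no Celluloid involved. TinyActor is a hypothetical stand-in I made up for illustration: a single thread draining a Queue in order. The 60 second sleep is shortened to 0.3 seconds. It only shows that a blocking sleep inside a handler delays every message queued behind it; it is not how Celluloid is implemented internally.

```ruby
# TinyActor is a hypothetical stand-in for an actor: one thread drains a
# mailbox (a Queue) strictly in order. It is not part of Celluloid.
class TinyActor
  attr_reader :log

  def initialize
    @mailbox = Queue.new
    @log = []
    @started = now
    @thread = Thread.new do
      while (message = @mailbox.pop) != :stop
        handle(message)
      end
    end
  end

  # Queue a message for the actor; returns immediately.
  def tell(message)
    @mailbox << message
  end

  # Ask the actor to finish its queued work, then wait for its thread.
  def stop
    @mailbox << :stop
    @thread.join
  end

  private

  def now
    Process.clock_gettime(Process::CLOCK_MONOTONIC)
  end

  def handle(message)
    # Record when the message was picked up, relative to actor creation.
    @log << [message, (now - @started).round(2)]
    # Blocking Kernel#sleep: the mailbox thread can do nothing else meanwhile.
    sleep 0.3 if message == 'client-1'
  end
end

# Sends two messages back to back and returns [[message, pickup_seconds], ...].
def blocked_mailbox_demo
  actor = TinyActor.new
  actor.tell('client-1')
  actor.tell('client-2')
  actor.stop
  actor.log
end

blocked_mailbox_demo.each do |message, seconds|
  puts "#{message} picked up after #{seconds}s"
end
```

Both messages are sent immediately, yet client-2 is only picked up once the sleep for client-1 has finished, which mirrors what the second client sees against server2.rb.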
Use a defer {} or future {} block.
Explicitly invoke Celluloid.sleep rather than sleep (if not explicitly invoked as Celluloid.sleep, sleep ends up calling Kernel.sleep, since DisplayMessage does not include Celluloid like Server does).
Bring the contents of DisplayMessage.message into handle_message as in server1.rb; or at least into Server, which is in Celluloid scope, and will use the correct sleep.
defer {} approach:
def handle_message(message)
  defer {
    DisplayMessage.message(message)
  }
end
Celluloid.sleep approach:
class DisplayMessage
  def self.message(message)
    #de ...
    Celluloid.sleep 60
  end
end
To reiterate, the deeper issue is not the scope of sleep ... that's why defer and future are my best recommendation. But to post something here that came out in my comments:
Using defer or future pushes a task that would otherwise tie up an actor onto another thread. If you use future, you can get the return value once the task is done; if you use defer, you can fire and forget.
But better yet, create another actor for tasks that tend to get tied up, and even pool that other actor... if defer or future don't work for you.
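The difference between the two styles can be sketched with plain Ruby threads. This is only an analogy under my own assumptions: slow_task, future_style and defer_style are hypothetical names, and Thread.new stands in for whatever Celluloid uses internally to run deferred work. The delay is shortened to 0.2 seconds.

```ruby
# Stand-in for a slow call such as DisplayMessage.message (delay shortened).
def slow_task(message)
  sleep 0.2
  "handled #{message}"
end

# future-style: hand the work to another thread and keep a handle to it,
# so the return value can be collected later.
def future_style(message)
  Thread.new { slow_task(message) }
end

# defer-style: hand the work off and ignore the result (fire and forget).
def defer_style(message)
  Thread.new { slow_task(message) }
  nil
end

# Returns [seconds spent dispatching, value produced by the future-style task].
def offload_demo
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  pending = future_style('client-1')
  defer_style('client-2')
  dispatch_time = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [dispatch_time, pending.value] # Thread#value waits for the block's result
end

dispatch_time, result = offload_demo
puts "dispatched in #{dispatch_time.round(3)}s, result: #{result}"
```

In both styles the caller gets control back right away; the only difference is whether it keeps something it can ask for the result later.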
I'd be more than happy to answer follow-up questions brought up by this question; we have a very active mailing list, and IRC channel. Your generous bounties are commendable, but plenty of us would help purely to help you.
Managed to reproduce and fix the issue.
Deleting my previous answer.
Apparently, the problem lies in sleep.
Confirmed by adding logs "actor/kernel sleeping" to the local copy of Celluloids.rb's sleep().
In server1.rb, the call to sleep is within Server, a class that includes Celluloid. Thus Celluloid's implementation of sleep overrides the native sleep.
class Server
  include Celluloid::ZMQ
  ...
  def run
    loop { async.handle_message @socket.read }
  end
  def handle_message(message)
    ...
    sleep 60
  end
end
Note the log actor sleeping from server1.rb. Log added to Celluloids.rb's sleep()
This suspends only the current "actor" in Celluloid i.e. only the current "Celluloid thread" handling the client1 sleeps.
In server2.rb, the call to sleep is within a different class, DisplayMessage, that does NOT include Celluloid. Thus it is the native sleep itself.
class DisplayMessage
  def self.message(message)
    ...
    sleep 60
  end
end
Note the ABSENCE of any actor sleeping log from server2.rb.
This suspends the current Ruby thread, i.e. the whole Server actor stops processing messages (not just the one task handling client1).
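The lookup difference between the two servers can be reproduced without Celluloid at all. In the sketch below, FakeActorSleep is a hypothetical module standing in for Celluloid: it just defines its own instance-level sleep and returns a marker instead of suspending anything. Server and DisplayMessage are stripped-down versions of the classes above.

```ruby
# FakeActorSleep is a hypothetical stand-in for a module that, like
# Celluloid, defines its own instance-level sleep.
module FakeActorSleep
  def sleep(_interval)
    :actor_sleep # marker only; nothing is actually suspended here
  end
end

class Server
  include FakeActorSleep

  def handle_message
    sleep 0 # resolves to FakeActorSleep#sleep, found before Kernel#sleep
  end
end

class DisplayMessage
  def self.message
    sleep 0 # no module included, so this is Kernel#sleep
  end
end

p Server.new.handle_message             # => :actor_sleep
p DisplayMessage.message                # => 0
p Server.instance_method(:sleep).owner  # => FakeActorSleep
p DisplayMessage.method(:sleep).owner   # => Kernel
```

A bare sleep is an ordinary method call on self, so which sleep runs depends only on the class the call is written in, not on which class called into it.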
In server2.rb, the appropriate sleep must be explicitly specified.
class DisplayMessage
  def self.message(message)
    puts "Received at #{Time.now.strftime('%I:%M:%S %p')} and message is #{message}"
    ## Intentionally added sleep to test whether Celluloid blocks the main process for 60 seconds or not.
    if message == 'client-1'
      puts 'Going to sleep now'.red
      # "sleep 60" will invoke the native sleep.
      # Use Celluloid.sleep to support concurrent execution
      Celluloid.sleep 60
    end
  end
end