Question
I spent quite a lot of time troubleshooting an issue in the application I am working on. This application is a web app exposing REST endpoints using scotty. It uses a TVar to hold its state, which is updated through STM actions triggered by the front-end layer.
As this application is based on event sourcing principles, any event generated by the business layer after an STM transaction completes is stored in an EventStore (currently a simple flat file...). Here is the relevant code fragment:

    newtype (EventStore m) => WebStateM s m a = WebStateM { runWebM :: ReaderT (TVar s) m a }
       deriving (Functor, Applicative, Monad, MonadIO, MonadTrans, MonadReader (TVar s))

    applyCommand :: (EventStore m, Serializable (Event a)) =>
                Command a
                -> TVar s
                -> WebStateM s m (Event a)
    applyCommand command = \ v -> do
         (e, etype :: EventType s) <- liftIO $ atomically $ actAndApply v
         storeEvent e etype
         return e
      where
        actAndApply = \ v -> do
            s <- readTVar v
            let view = getView s
            let e = view `act` command
            let bv = view `apply` e
            modifyTVar' v (setView bv)
            return (e, getType view)

This worked perfectly until a bug slipped into the storeEvent function. This function is responsible for serialising the event with the appropriate type, and I made a (gross) mistake in my serialisation routine for some type, which led to an infinite loop! All of a sudden, my cabal test run began to hang and fail with a timeout (I use wreq as the client library to test REST services). It took me a couple of hours to pin down the actual error on the server side: "tests: thread blocked indefinitely in an STM transaction". Suspecting the serialisation routine, it took me another couple of hours to nail down the culprit and fix the issue.
Although I am of course entirely responsible for the error (I should have tested my serialisation routine more thoroughly!), I found the message quite misleading. I would like to understand better where this error comes from and how to prevent it. I have read Edward Yang's post on the subject, and this mail thread, but I must confess the logical chain of events leading to this error is not entirely clear to me.
I think I understand that the thread calling applyCommand, which is spawned by scotty, dies from some exception (stack exhausted?) raised while evaluating storeEvent, but I do not understand how this is related to the transaction being garbage.
Answer 1:
The exception says that one thread tried to run a transaction and hit retry, which reruns the transaction when something it read changes. But the thing it is waiting on is no longer referenced anywhere else, so the retry can never happen. And that's a bug. Basically, that thread is hung now.
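To make this concrete, here is a minimal sketch of that situation, assuming GHC and the stm package. The name waitOnOrphanTVar is made up for illustration and is not part of the application in the question. The thread waits on a TVar that nothing else can reach, so no other thread could ever change it; when the runtime notices this, it raises BlockedIndefinitelyOnSTM in the waiting thread.

```haskell
import Control.Concurrent.STM (atomically, check, newTVarIO, readTVar)
import Control.Exception (BlockedIndefinitelyOnSTM (..), try)

-- | Hypothetical helper: block on a TVar that no other thread references.
-- Returns True if the runtime detected the deadlock and raised
-- BlockedIndefinitelyOnSTM, False if the transaction somehow completed.
waitOnOrphanTVar :: IO Bool
waitOnOrphanTVar = do
  flag <- newTVarIO False
  -- 'check' calls 'retry' while the flag is False; nobody can ever set it.
  outcome <- try (atomically (readTVar flag >>= check))
  case outcome of
    Left BlockedIndefinitelyOnSTM -> return True
    Right () -> return False

main :: IO ()
main = do
  detected <- waitOnOrphanTVar
  putStrLn (if detected
              then "caught BlockedIndefinitelyOnSTM"
              else "transaction completed")
```

Detection relies on the garbage collector finding the blocked thread unreachable, so the exception may arrive some time after the thread actually gets stuck.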
I would imagine that some thread somewhere was supposed to update this TVar, but it died because of an exception, thereby dropping the last reference to that TVar and provoking the exception.
That's what I think happened. Without seeing the entire application, it's difficult to be sure.
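If that guess is right, one way to avoid the hang is to make sure the thread that owns the update always publishes an outcome, even when its work throws. The following is only a sketch under that assumption, using the stm package; runWorker and waitResult are hypothetical names, not functions from the question's code base.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
  (TVar, atomically, newTVarIO, readTVar, retry, writeTVar)
import Control.Exception (ErrorCall (..), SomeException, throwIO, try)

-- | Hypothetical helper: run an action in a new thread and always record
-- its outcome, so a waiter is woken up whether the action succeeds or throws.
runWorker :: IO a -> IO (TVar (Maybe (Either SomeException a)))
runWorker action = do
  result <- newTVarIO Nothing
  _ <- forkIO $ do
    outcome <- try action            -- capture any exception as a value
    atomically (writeTVar result (Just outcome))
  return result

-- | Hypothetical helper: block until the worker has published an outcome.
waitResult :: TVar (Maybe (Either SomeException a)) -> IO (Either SomeException a)
waitResult result = atomically (readTVar result >>= maybe retry return)

main :: IO ()
main = do
  ok  <- runWorker (return (42 :: Int)) >>= waitResult
  bad <- runWorker (throwIO (ErrorCall "serialisation failed") :: IO Int)
           >>= waitResult
  putStrLn (either (const "worker failed") show ok)
  putStrLn (either (const "worker failed") show bad)
```

This only covers the case where the worker dies from an exception. It does not help if the action never terminates, as with the infinite loop described in the question, and a hardened version would also need to consider asynchronous exceptions arriving between the try and the write.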
Source: https://stackoverflow.com/questions/26117165/how-to-handle-or-avoid-blockedindefinitelyonstm-exception