Reentrant caching of “referentially transparent” IO calls

我怕爱的太早我们不能终老 提交于 2019-12-04 06:27:51

Yeah, basically reimplement the logic. Although it seems similar to what GHC is already doing, that's GHC's choice. Haskell can be implemented on VMs that work very differently, so in that sense it isn't already done for you.

But yeah, just use an MVar (Map InputType OutputType) or even an IORef (Map InputType OutputType) (make sure to modify with atomicModifyIORef), and just store the cache in there. If this imperative solution seems wrong, it's the "if not for the IO, this function would be pure" constraint. If it were just an arbitrary IO action, then the idea that you would have to keep state in order to know what to execute or not seems perfectly natural. The problem is that Haskell does not have a type for "pure IO" (which, if it depends on a database, it is just behaving pure under certain conditions, which is not the same as being a hereditarily pure).

import qualified Data.Map as Map
import Control.Concurrent.MVar

-- takes an IO function and returns a cached version
cache :: (Ord a) => (a -> IO b) -> IO (a -> IO b)
cache f = do
    r <- newMVar Map.empty
    return $ \x -> do
        cacheMap <- takeMVar r
        case Map.lookup x cacheMap of
            Just y -> do 
                putMVar r cacheMap
                return y
            Nothing -> do
                y <- f x
                putMVar (Map.insert x y cacheMap)
                return y

Yeah it's ugly on the inside. But on the outside, look at that! It's just like the type of a pure memoization function, except for it has IO stained all over it.

Here's some code implementing more or less what I was after in my original question:

import           Control.Concurrent
import           Control.Exception
import           Data.Either
import           Data.Map           (Map)
import qualified Data.Map           as Map
import           Prelude            hiding (catch)

-- |Memoizing wrapper for 'IO' actions
memoizeIO :: Ord a => (a -> IO b) -> IO (a -> IO b)
memoizeIO action = do
  cache <- newMVar Map.empty
  return $ memolup cache action

  where
    -- Lookup helper
    memolup :: Ord a => MVar (Map a (Async b)) -> (a -> IO b) -> a -> IO b
    memolup cache action' args = wait' =<< modifyMVar cache lup
      where
        lup tab = case Map.lookup args tab of
          Just ares' ->
            return (tab, ares')
          Nothing    -> do
            ares' <- async $ action' args
            return (Map.insert args ares' tab, ares')

The code above builds upon Simon Marlow's Async abstraction as described in Tutorial: Parallel and Concurrent Programming in Haskell:

-- |Opaque type representing asynchronous results.
data Async a = Async ThreadId (MVar (Either SomeException a))

-- |Construct 'Async' result. Can be waited on with 'wait'.
async :: IO a -> IO (Async a)
async io = do
  var <- newEmptyMVar
  tid <- forkIO ((do r <- io; putMVar var (Right r))
                 `catch` \e -> putMVar var (Left e))
  return $ Async tid var

-- |Extract value from asynchronous result. May block if result is not
-- available yet. Exceptions are returned as 'Left' values.
wait :: Async a -> IO (Either SomeException a)
wait (Async _ m) = readMVar m

-- |Version of 'wait' that raises exception.
wait' :: Async a -> IO a
wait' a = either throw return =<< wait a

-- |Cancels asynchronous computation if not yet completed (non-blocking).
cancel :: Async a -> IO ()
cancel (Async t _) = throwTo t ThreadKilled
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!