Automatic differentiation (AD) with respect to list of matrices in Haskell

我只是一个虾纸丫 提交于 2020-01-04 10:06:22

问题


I am trying to understand how can I use Numeric.AD (automatic differentiation) in Haskell.

I defined a simple matrix type and a scalar function taking an array and two matrices as arguments. I want to use AD to get the gradient of the scoring function with respect to both matrices, but I'm running into compilation problems. Here is the code

{-# LANGUAGE DeriveTraversable, DeriveFunctor, DeriveFoldable #-}
import Numeric.AD.Mode.Reverse as R
import Data.Traversable as T
import Data.Foldable as F

--- Non-linear function on "vectors"
logistic x = 1.0 / (1.0 + exp(-x) )
phi v = map logistic v
phi' (x:xs) = x : (phi xs)

--- dot product
dot u v = foldr (+) 0 $ zipWith (*) u v

--- simple matrix type
data Matrix a = M [[a]] deriving (Eq,Show,Functor,F.Foldable,T.Traversable)

--- action of a matrix on a vector
mv _ [] = []
mv (M []) _ = []
mv ( M m ) v = ( dot (head m)  v ) :  (mv (M (tail m)) v )

--- two matrices
mbW1 = M $ [[1,0,0],[-1,5,1],[1,2,-3]]
mbW2 = M $ [[0,0,0],[1,3,-1],[-2,4,6]]

--- two different scoring functions
sc1 v m = foldr (+) 0 $ (phi' . (mv m) )  v  

sc2 :: Floating a => [a] -> [Matrix a] -> a
sc2 v [m1, m2] = foldr (+) 0 $ (phi' . (mv m2) . phi' . (mv m1) ) v

strToInt = read :: String -> Double
strLToIntL = map strToInt
--- testing
main = do
        putStrLn $ "mbW1:" ++ (show mbW1)
        putStrLn $ "mbW2:" ++ (show mbW2)
        rawInput <-  readFile "/dev/stdin"
        let xin= strLToIntL $ lines rawInput
        putStrLn "sc xin mbW1"
        print $ sc1 xin mbW1  --- ok. = 
        putStrLn "grad (sc1 xin) mbW1"
        print $ grad ( sc1 xin) mbW1   -- yields an error: expects xin [Reverse s Double] instead of [Double]
        putStrLn "grad (sc1 [3,5,7]) mbW1"
        print $ grad ( sc1 [3,5,7]) mbW1   --- ok. =
        putStrLn "sc2 xin [mbW1,mbW2]"
        print $ sc2 xin [mbW1, mbW2]
        putStrLn "grad (sc2 [3,5,7) [mbW1,mbW2]"
        print $ grad ( sc2 [3,5,7]) [mbW1, mbW2]  --- Error: see text

The last line (grad on sc2) gives the following error:

Couldn't match type ‘Reverse s (Matrix Double)’
               with ‘Matrix (Reverse s (Matrix Double))’
Expected type: [Reverse s (Matrix Double)]
               -> Reverse s (Matrix Double)
  Actual type: [Matrix (Reverse s (Matrix Double))]
               -> Reverse s (Matrix Double)
In the first argument of ‘grad’, namely ‘(sc2 [3, 5, 7])’
In the second argument of ‘($)’, namely
  ‘grad (sc2 [3, 5, 7]) [mbW1, mbW2]’

I don't understand where the "Matrix of Matrix" in the actual type seen comes from. I'm feeding the grad with a curried version of sc2, making it a function on a list of Matrix.

Commenting out the two offending lines runs without problem, i.e., the first gradient works and is correctly calculated (I'm feeding [1,2,3] as input to the program):

mbW1:M [[1.0,0.0,0.0],[-1.0,5.0,1.0],[1.0,2.0,-3.0]]
mbW2:M [[0.0,0.0,0.0],[1.0,3.0,-1.0],[-2.0,4.0,6.0]]
sc1 xin mbW1
1
2
3
2.0179800657874893
grad (sc1 [3,5,7]) mbW1
M [[3.0,5.0,7.0],[7.630996942126885e-13,1.2718328236878141e-12,1.7805659531629398e-12],[1.0057130122694228e-3,1.6761883537823711e-3,2.3466636952953197e-3]]
sc2 xin [mbW1,mbW2]
1.8733609463863194

Both errors are an issue. I want to take the gradient of any such sc2 scoring function, depending on an array of matrices, evaluated at any given "point" xin. Clearly, I'm not yet understanding the AD library well enough. Any help would be appreciated.

来源:https://stackoverflow.com/questions/37088840/automatic-differentiation-ad-with-respect-to-list-of-matrices-in-haskell

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!