I am very much new to Haskell, and really impressed by the language's "architecture", but it still bothers me how monads can be pure.
As
redbneb's answer is almost right, except that for Monads the two timelines are intermingled, which is their essence:
a Haskell computation does take place after the outside world has supplied some inputs (say, in a previous computation step), and it constructs the next recipe ⁄ "computation description", which is then run in its turn. Otherwise it would not be a Monad but an Applicative, which constructs its recipes ⁄ descriptions from components known ahead of time.
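To make the contrast concrete in IO, here is a minimal sketch. The "outside world" is simulated by a queue of inputs held in an IORef; `nextInput`, `fixedShape`, `valueDependent` and `demo` are names made up for this illustration, not library functions.

```haskell
import Control.Monad (replicateM)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical stand-in for the outside world: pop the next queued
-- input, or yield 0 when the queue is empty.
nextInput :: IORef [Int] -> IO Int
nextInput ref = do
  queue <- readIORef ref
  case queue of
    []       -> pure 0
    (x : xs) -> writeIORef ref xs >> pure x

-- Applicative: the recipe is fully known ahead of time.
-- It always reads exactly two inputs, whatever they turn out to be.
fixedShape :: IORef [Int] -> IO Int
fixedShape ref = (+) <$> nextInput ref <*> nextInput ref

-- Monadic: the first input decides how many further reads the next
-- recipe contains, so that recipe can only be built at run time.
valueDependent :: IORef [Int] -> IO Int
valueDependent ref =
  nextInput ref >>= \n -> sum <$> replicateM n (nextInput ref)

-- Run both recipes against the same simulated inputs.
demo :: IO (Int, Int)
demo = do
  a <- newIORef [2, 10, 20, 30] >>= fixedShape
  b <- newIORef [2, 10, 20, 30] >>= valueDependent
  pure (a, b)
```

Running `demo` in GHCi should give `(12,30)`: the applicative recipe adds the first two inputs, while the monadic one reads `2` and then sums the next two inputs.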
And lowly Functor itself already has the two timelines (which is its essence): an IO a value describes an "outside world's" ⁄ future IO-computation "producing" an "inside" ⁄ pure result of type a.
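As a small sketch of the Functor case: a pure function can be attached to an IO description without running anything. The names `shout` and `shoutLater` are invented for this example.

```haskell
import Data.Char (toUpper)

-- A pure function, living entirely in the "inside" timeline.
shout :: String -> String
shout s = map toUpper s ++ "!"

-- fmap turns one description into another; nothing is executed here.
-- The pure function is applied only when the resulting action is run.
shoutLater :: IO String -> IO String
shoutLater = fmap shout
```

For instance, `shoutLater getLine` is itself just a description; only when it is run does a line get read and then passed through `shout`.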
Consider:
[f x | x <- xs]            f <$> xs    Functor        [r | x<-xs,r<-[f x]]
[y x | y <- f, x <- xs]    f <*> xs    Applicative    [r | y<-f,x<-xs,r<-[y x]]
[r | x <- xs, r <- f x]    f =<< xs    Monad          [r | x<-xs,r<- f x ]
(written with monad comprehensions). Of course a Functor (Applicative / Monad / ...) can be pure as well; still there are two timelines ⁄ "worlds" there.
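The three rows can be checked against each other for lists, where ordinary list comprehensions already suffice (for other monads the same notation needs GHC's MonadComprehensions extension). This is only a sketch; the function names are made up here, and the Applicative row's `f` is renamed `fs` since it is a list of functions.

```haskell
-- Each triple holds the three spellings from one row, specialised to lists.

functorRow :: (a -> b) -> [a] -> ([b], [b], [b])
functorRow f xs =
  ( [f x | x <- xs]
  , f <$> xs
  , [r | x <- xs, r <- [f x]]
  )

applicativeRow :: [a -> b] -> [a] -> ([b], [b], [b])
applicativeRow fs xs =
  ( [y x | y <- fs, x <- xs]
  , fs <*> xs
  , [r | y <- fs, x <- xs, r <- [y x]]
  )

-- In the Monad row the first and last spellings coincide:
-- f x is already a list, so it needs no extra brackets.
monadRow :: (a -> [b]) -> [a] -> ([b], [b], [b])
monadRow f xs =
  ( [r | x <- xs, r <- f x]
  , f =<< xs
  , [r | x <- xs, r <- f x]
  )

-- True when all three spellings produced the same list.
allSame :: Eq a => (a, a, a) -> Bool
allSame (a, b, c) = a == b && b == c
```

The only difference between the last column's rows is the final generator: `[f x]` and `[y x]` are singleton lists whose shape is known in advance, whereas `f x` is a list whose shape depends on `x`.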
A few concrete examples:
~> [x*2 | x<-[10,100]]
~> [r | x<-[10,100], r <- [x*2]] -- non-monadic
[20,200] -- (*2) <$> [10,100]
~> [x*y | x<-[10,100], y <- [2,3]]
~> [r | x<-[10,100], y <- [2,3], r <- [x*y]] -- non-monadic
[20,30,200,300] -- (*) <$> [10,100] <*> [2,3]
~> [r | x<-[10,100], y <- [2,3], r <- replicate 2 (x*y) ]
~> [r | x<-[10,100], y <- [2,3], r <- [x*y, x*y]] -- still non-monadic:
~> (\a b c-> a*b) <$> [10,100] <*> [2,3] <*> [(),()] -- it's applicative!
[20,20,30,30,200,200,300,300]
~> [r | x<-[10,100], y <- [2,3], r <- [x*y, x+y]] -- and even this
~> (\a b c-> c (a*b,a+b)) <$> [10,100] <*> [2,3] <*> [fst,snd] -- as well
~> (\a b c-> c a b) <$> [10,100] <*> [2,3] <*> [(*),(+)]
[20,12,30,13,200,102,300,103]
~> [r | x<-[10,100], y <- [2,3], r <- replicate y (x*y) ] -- only this is _essentially_
~> [10,100] >>= \x-> [2,3] >>= \y -> replicate y (x*y) -- monadic !!!!
[20,20,30,30,30,200,200,300,300,300]
Essentially-monadic computations are built of steps which can't be constructed ahead of the combined computation's run-time, because which recipe to construct next is determined by a previously computed value: a value produced by an earlier recipe's computation when it is actually performed.
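One way to see this with lists is to count results. In the sketch below (function names invented for the illustration), the applicative combination always yields as many results as the product of the input lengths, while the monadic one yields a number that depends on the values themselves.

```haskell
-- Applicative: the shape of the result is fixed by the shapes of the
-- inputs, so the count is always length xs * length ys.
applicativeCount :: [Int] -> [Int] -> Int
applicativeCount xs ys = length ((*) <$> xs <*> ys)

-- Monadic: each y decides how many results its step contributes,
-- so the count cannot be known from the input lengths alone.
monadicCount :: [Int] -> [Int] -> Int
monadicCount xs ys =
  length (xs >>= \x -> ys >>= \y -> replicate y (x * y))
```

With `[10,100]` and `[2,3]` the counts are 4 and 10; changing the second list to `[2,30]` leaves the applicative count at 4 but raises the monadic count to 64.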
The following image might also prove illuminating: