In Haskell, if I write
fac n = facRec n 1
where facRec 0 acc = acc
facRec n acc = facRec (n-1) (acc*n)
and compile it with GHC
Tail recursion does make sense in (GHC) Haskell if your accumulator is strict. To demonstrate the problem, here is a "trace" of your tail-recursive definition of fac
:
fac 4
~> facRec 4 1
~> facRec 3 (1*4)
~> facRec 2 ((1*4)*3)
~> facRec 1 (((1*4)*3)*2)
~> facRec 0 ((((1*4)*3)*2)*1)
~> (((1*4)*3)*2) * 1
~> ((1*4)*3) * 2
~> (1*4) * 3
~> 1*4
~> 4 * 3
~> 12 * 2
~> 24 * 1
~> 24
The indentation level corresponds (roughly) to stack level. Note that the accumulator is only evaluated at the very end, and that may cause a stack overflow. The trick, of course, is to make the accumulator strict. It's theoretically possible to show that facRec is strict if it is called in a strict context, but I am not aware of any compiler that does that, ATM. GHC does do tail call optimisation, though, so the facRec
calls use constant stack space.
For the same reason foldl'
is usually preferred over foldl
, since the former is strict in the accumulator.
Regarding your second part, runhaskell
/runghc
is just a wrapper over GHCi. If GHCi finds compiled code it will use that, otherwise it will use the bytecode interpreter which performs few optimisations, so don't expect any fancy optimisations to happen.
Your question isn't complete. I assume you mean GHC, and at least without optimizations the answer is "yes" because the worker function (facRec
in the first or fac
in the second) has an arity 2 compared to one and the assembly will reflect this. With optimizations or with JHC the answer is probably "no".
In haskell, it only helps to write your program in a tail-recursive way if your accumulator is strict and you need to whole result.
With ghc's runHaskell the program won't be optimised, so there won't be a strictness analysis, so you may stack overflow; while if you compile with optimisations the compiler may detect the accumulator needs to be strict and optimise accordingly.
To see how things happen differently (or not) the best way is to inspect the core langage generated, a good blog post from Don Stewart explains things . Many of his blog post are interesting if your interested about performance, by the way.