How does Y-combinator compute the fixed point programmatically?

问题

I believe I understand mathematically the idea of Y-combinator: it returns the fixed point of a given functional F, thus f = Y(F) where f satisfies f == F(f).

But I don't understand how it does the actual computation program wise?

Let's take the javascript example given here:

var Y = (F) => ( x => F( y => x(x)(y) ) )( x => F( y => x(x)(y) ) )
var Factorial = (factorial) => (n => n == 0 ? 1 : n * factorial(n-1))

Y(Factorial)(6) == 720    // => true
computed_factorial = Y(Factorial)

The part I don’t understand is how the computed_factorial function (the fixed point) actually get computed? By tracing the definition of Y, you’ll find it runs into a infinitely recursion at the x(x) part, I can't see any terminating case implied there. However, it strangely does return. Can anyone explain?

回答1:

I'm going to use ES6 arrow function syntax. Since you seem to know CoffeeScript, you should have no trouble reading it.

Here's your Y combinator

var Y = F=> (x=> F (y=> x (x) (y))) (x=> F (y=> x (x) (y)))

I'm going to use an improved version of your factorial function tho. This one uses an accumulator instead which will prevent the evaluation from turning into a big pyramid. The process of this function will be linear iterative, whereas yours would be recursive. When ES6 finally gets tail call elimination, this makes an even bigger difference.

The difference in syntax is nominal. It doesn't really matter anyway since you just want to see how the Y is evaluated.

var factorial = Y (fact=> acc=> n=>
  n < 2 ? acc : fact (acc*n) (n-1)
) (1);

Well this will already cause the computer to start doing some work. So let's evaluate this before we go any further...

I hope you have a really good bracket highlighter in your text editor...

var factorial
= Y (f=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (1)                                                                                                                                                                // sub Y
= (F=> (x=> F (y=> x (x) (y))) (x=> F (y=> x (x) (y)))) (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (1)                                                                                                         // apply F=> to fact=>
= (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (1)                                                               // apply x=> to x=>
= (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (1) // apply fact=> to y=>
= (acc=> n=> n < 2 ? acc : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (acc*n) (n-1)) (1)             // apply acc=> to 1
= n=> n < 2 ? 1 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (1*n) (n-1)                             // return value
= [Function] (n=> ...)

So you can see here, after we call:

var factorial = Y(fact=> acc=> n=> ...) (1);
//=> [Function] (n=> ...)

We get a function back that is waiting for a single input, n. Let's run a factorial now

Before we proceed, you can verify (and you should) that every line here is correct by copying/pasting it in a javascript repl. Each line will return 24 (which is the correct answer for factorial(4). Sorry if I spoiled that for you). This is like when you're simplifying fractions, solving algebraic equations, or balancing chemical formulas; each step should be a correct answer.

Be sure to scroll all the way to the right for my comments. I tell you which operation I completed on each line. The result of the completed operation is on the subsequent line.

And make sure you have that bracket highlighter handy again...

factorial (4)                                                                                                                                                                                                                     // sub factorial
= (n=> n < 2 ? 1 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (1*n) (n-1)) (4)                                 // apply n=> to 4
= 4 < 2 ? 1 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (1*4) (4-1)                                           // 4 < 2
= false ? 1 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (1*4) (4-1)                                           // ?:
= (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (1*4) (4-1)                                                       // 1*4
= (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (4) (4-1)                                                         // 4-1
= (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (4) (3)                                                           // apply y=> to 4
= (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (4) (3)                                                                     // apply x=> to x=>
= (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (4) (3)       // apply fact=> to y=>
= (acc=> n=> n < 2 ? acc : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (acc*n) (n-1)) (4) (3)                   // apply acc=> to 4
= (n=> n < 2 ? 4 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (4*n) (n-1)) (3)                                 // apply n=> to 3
= 3 < 2 ? 4 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (4*3) (3-1)                                           // 3 < 2
= false ? 4 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (4*3) (3-1)                                           // ?:
= (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (4*3) (3-1)                                                       // 4*2
= (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (12) (3-1)                                                        // 3-1
= (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (12) (2)                                                          // apply y=> to 12
= (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (12) (2)                                                                    // apply x=> to y=>
= (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (12) (2)      // apply fact=> to y=>
= (acc=> n=> n < 2 ? acc : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (acc*n) (n-1)) (12) (2)                  // apply acc=> 12
= (n=> n < 2 ? 12 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (12*n) (n-1)) (2)                               // apply n=> 2
= 2 < 2 ? 12 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (12*2) (2-1)                                         // 2 < 2
= false ? 12 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (12*2) (2-1)                                         // ?:
= (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (12*2) (2-1)                                                      // 12*2
= (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (24) (2-1)                                                        // 2-1
= (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (24) (1)                                                          // apply y=> to 24
= (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (24) (1)                                                                    // apply x=> to x=>
= (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (24) (1)      // apply fact=> to y=>
= (acc=> n=> n < 2 ? acc : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (acc*n) (n-1)) (24) (1)                  // apply acc=> to 24
= (n=> n < 2 ? 24 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (24*n) (n-1)) (1)                               // apply n=> to 1
= 1 < 2 ? 24 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (24*1) (1-1)                                         // 1 < 2
= true ? 24 : (y=> (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (x=> (fact=> acc=> n=> n < 2 ? acc : fact (acc*n) (n-1)) (y=> x (x) (y))) (y)) (24*1) (1-1)                                          // ?:
= 24

I've seen other implementations of Y as well. Here's a simple process to build another one (for use in javascript) from scratch.

// text book
var Y = f=> f (Y (f))

// prevent immediate recursion (javascript is applicative order)
var Y = f=> f (x=> Y (f) (x))

// remove recursion using U combinator
var Y = U (h=> f=> f (x=> h (h) (f) (x)))

// given: U = f=> f (f)
var Y = (h=> f=> f (x=> h (h) (f) (x))) (h=> f=> f (x=> h (h) (f) (x)))

回答2:

The Y combinator is one of the most interesting phenomenons in the lambda calculus. I doubt that right away seeing it, one can come up with an interpretation of it's functionality.

Y = f => (g => g(g))(g => n => f(g(g))(n));

The idea is to run a lambda (anonymous function) recursively.

Hey wait a minute..! How exactly you can do it if you don't have a name to reference a function and call it within itself in the first place..?

Let's try to understand it's derivation step by step. I will use the arrow functions so if you are not familiar with them please follow this link. They are very simple. x => x means function(x){return x;}. The JS this keyword has a different meaning within the arrows but that's off topic as per this subject.

So as always we will go with the factorial function but the Y combinator that we will derive will be valid for all recursive functions.

A factorial function can simply be expressed as follows

var fa = n => n === 0 ? 1 : n*fa(n-1);
fa(5) // <- 120

But say that we don't want to refer to the fa function recursively; instead we would like to derive a working factorial function from a hypothetical version of the factorial function. What's a hypothetical factorial function? The hypothetical factorial function takes a proper factorial function and returns us a working factorial function. Like the following

var fh = f => n => n === 0 ? 1 : n*f(n-1);

So if i pass fa function to fh as an argument it shall work. Like;

fh(fa)(5); // <- 120

But we don't want to refer to another factorial function like fa since we have already "sort of" defined the factorial logic within the fh function. Then we think. fh keeps the f argument in closure and returns me a working factorial function (n => n === 0 ? 1 : n*f(n-1)) if i pass a proper factorial function to it like fa. So what if i pass itself to it; A quick try fh(fh)(5) // <- NaN meh..!

So we start playing with the inner function. Normally i would pass this step but it might be helpful to see the transformations... so lets continue. I can define the fb function to return me a function which takes two arguments, itself and factorial count n

fb = (f,n) => n === 0 ? 1 : n* f(f,n-1), // fb(fb,5)  <- 120

So far so good but the two argument factorial function is no where close to what i am looking for. I can separate them by adding another functional step known as partial application.

fc = f => n => n === 0 ? 1 : n* f(f)(n-1), // fc(fc)(5)  <- 120

Now this is very close to our hypothetical function fh. But the inner shows f(f)(n-1) We have to correct this to show f(n-1). Well may be we can use a JS beauty IIFE to help us...

fd = f => n => ((g,n) => n === 0 ? 1 : n * g(n-1))(f(f),n) // fd(fd)(5)  <- 120

Can you see the IIFE..? ((g,n) => n === 0 ? 1 : n * g(n-1))(f(f),n) Yet, while this seems OK i have to get rid of the dual argument (g,n) IIFE to achieve the desired result. This will take another level of getting functional by partial application.

fe = f => n => (g => n => n === 0 ? 1 : n * g(n-1))(f(f))(n) // fe(fe)(5)  <- 120

Now we have inside g => n => n === 0 ? 1 : n * g(n-1) which is the body of our hypothetical function fh. Which means i can substitute (I love this part.. just like calculus substitution; in fact it is...) fh in the above function to read like;

fe = f => n => fh(f(f))(n) // fe(fe)(5)  <- 120

Ok time to wrap it up. What i have wanted in the first place..? I want to feed fh to a function (Y-combinator) and get it's fixed point. Here i know that fe(fe) uses fh and returns me the properly working factorial function. So lets define a function to take our hyporthetical recursive function as an argument and give us something working (fixed). IIFE to help again.

Y = f => (g => g(g))(g => n => f(g(g))(n));

So this should work for anything. Let's try our Y combinator with a hypothetical Fibonacci function.

var Y = f => (g => g(g))(g => n => f(g(g))(n)),
 fibH = f => n => n === 0 ? 0
                          : n === 1 ? 1
                                    : f(n-2) + f(n-1),
 fibo = Y(fibH);
console.log(fibo(10));

I hope it's all clear...

回答3:

In a lazy evaluation language, the Y-combinator can be defined as:

Y = (f =>
  (x => f( x(x) ))
  (x => f( x(x) )))

But since Javascript is an eager evaluation language, defining Y this way would cause the x(x) part to recurse indefinitely at the time when you try to apply Y to a function.

To get around this problem, an anonymous "wrapper" function can be introduced to delay the execution of x. This wrapper function would behave the same as x(x) when called, but would return instantly since it's just a function definition.

Knowing that x(x) would be bound to the recursive function, in the case of the example:

Factorial = f => n => n==0 ? 1 : n*f(n-1)

We can tell ahead of time that only a single argument would be passed to it. It allows us to use the following pattern to generate an anonymous function that behaves the same as any given function f(x):

f => x => f(x)

When we apply this pattern to the x(x) term, Y would no longer recurse indefinitely and becomes:

Y = (f =>
  (x => f( y => x(x)(y) ))
  (x => f( y => x(x)(y) )))

回答4:

I would like to elaborate (this will be a long read) on diwo's answer about eager infinite x(x) evaluation.

After reading diwo's answer I browsed very quickly and skipped uninterested parts in this and below is how I understand this.

Our own notation, definitions

Lets denote evaluation (executing of program) with x -> v meaning "x evaluates to value v".

In both eager and lazy evaluation, anonymous function is treated as value, i.e. it is already evaluated, so evaluation of function stops immediately.

Then eager evaluation of f(y) would go like this: f(y) -> (first evaluate y to v) -> f(v) -> (apply function f to argument v) -> f(v). (now (after second step) function is really applied, will see in second)

While for contrast, lazy evaluation of f(y) would skip first step: f(y) -> (apply function f to argument v) -> f(y) (now function is really applied, but note, that y remains unchanged).

Let now F be Factorial as in diwo's answer and Y as defined first time:

Y = (f =>
 (x => f( x(x) ))
 (x => f( x(x) )))

and

F = Factorial = f => n => n==0 ? 1 : n*f(n-1)

Let's get down to business

The whole evaluation (of not working) Y F would look like:

Y F -> w(w):= (x => F( x(x) ))(x => F( x(x) )) -> F( (x => F( x(x) ))(x => F( x(x) )) ) = F( w(w)), where w is notation for (x => F( x(x) )). Now this is same in both eager and lazy evaluation, but from now on, we get the difference.

First solution + lazy

In lazy evaluation, F would get "abstractly" (without evaluation) applied to w(w) and like so evaluate to

-> (n => n==0 ? 1 : n*(w(w))(n-1)) ) ,

which would then stop evaluation, since this is anonymous function and we already know that anonymous functions do not get evaluated any further.

First (not working) solution + eager

In eager evaluation for contrast, F(w(w)) -> F(v), which means argument w(w) has to be evaluated first.

Now evaluate w(w) = (x => F( x(x) ))(x => F( x(x) )) -> F( (x => F( x(x) )) (x => F( x(x) )) ) = F( w(w)). Now this in eager evaluation evaluates further using rule to first evaluate the argument, i.e. w(w). Which as we have just seen evaluates again to F( w(w)). We are therefore stuck in loop ... Y F -> w(w) -> F(w(w)) -> F( F(w(w)) ) -> F( F(F(w(w))) ) -> ... error.

Second (working) solution + eager

If we improve this with defining

Y = (f =>
  (x => f( y => x(x)(y) ))
  (x => f( y => x(x)(y) )))

instead, evaluation is similar to the lazy situation:

Y F -> z(z) := (x => F( y => x(x)(y) )) (x => F( y => x(x)(y) )) -> F( y => (x => F( y => x(x)(y) ))((x => F( y => x(x)(y) )))(y) )) = F( y => z(z)(y) )).

As eager rule, we now have to evaluate the argument (y => z(z)(y) ). Because this is anonymous function, evaluation of it is complete, so we continue with applying F to (y => z(z)(y) ) similarly as in lazy evaluation. We now get

F( y => z(z)(y) )) -> (n => n==0 ? 1 : n*(( y => z(z)(y) )(n-1)) ) which now ends evaluation since this is anonymous function. This is similar to the first lazy evaluation.

来源：https://stackoverflow.com/questions/37101637/how-does-y-combinator-compute-the-fixed-point-programmatically

标签

javascript

y-combinator