I understand that a closure is defined as:
[A] stack-frame which is not deallocated when the function returns. (as if a 'stack-frame' were malloc'e
Here is an example of how you can transform code that needs closures into code that doesn't. The essential points to pay attention to are: how function declarations are transformed, how function calls are transformed, and how accesses to local variables that have been moved to the heap are transformed.
Input:
var f = function (x) {
  x = x + 10
  var g = function () {
    return ++x
  }
  return g
}
var h = f(3)
console.log(h()) // 14
console.log(h()) // 15
Output:
// Header that goes at the top of the program:
// A list of environments, starting with the one
// corresponding to the innermost scope.
function Envs(car, cdr) {
  this.car = car
  this.cdr = cdr
}
Envs.prototype.get = function (k) {
  var e = this
  while (e) {
    // Use has() rather than a truthiness test so that falsy
    // values such as 0 or '' are still found.
    if (e.car.has(k)) return e.car.get(k)
    e = e.cdr
  }
  // returns undefined if lookup fails
}
Envs.prototype.set = function (k, v) {
  var e = this
  while (e) {
    if (e.car.has(k)) {
      e.car.set(k, v)
      return this
    }
    e = e.cdr
  }
  throw new ReferenceError()
}
// Initialize the global scope.
var envs = new Envs(new Map(), null)
// We have to use this special function to call our closures.
function call(f, ...args) {
  return f.func(f.envs, ...args)
}
// End of header.
var f = {
  func: function (envs, x) {
    envs = new Envs(new Map().set('x', x), envs)
    envs.set('x', envs.get('x') + 10)
    var g = {
      func: function (envs) {
        envs = new Envs(new Map(), envs)
        return envs.set('x', envs.get('x') + 1).get('x')
      },
      envs: envs
    }
    return g
  },
  envs: envs
}
var h = call(f, 3)
console.log(call(h)) // 14
console.log(call(h)) // 15
Let's break down how the three key transformations go. For the function declaration case, assume for concreteness that we have a function of two arguments `x` and `y` and one local variable `z`, and `x` and `z` can escape the stack frame and so need to be moved to the heap. Because of hoisting we may assume that `z` is declared at the beginning of the function.
Input:
var f = function f(x, y) {
  var z = 7
  ...
}
Output:
var f = {
  func: function f(envs, x, y) {
    envs = new Envs(new Map().set('x', x).set('z', 7), envs)
    ...
  },
  envs: envs
}
That's the tricky part. The rest of the transformation just consists of using `call` to call the function and replacing accesses to the variables moved to the heap with lookups in `envs`.
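To make the other two transformations concrete, here is a minimal sketch applied to a made-up function (the function body and the names `r` and `g` are my own illustration, not part of the example above). It shows a call site rewritten with `call`, a read and a write of heap-moved variables rewritten as `envs.get` and `envs.set`, and `y` left alone as an ordinary local.

```javascript
// Compact copy of the header so the sketch runs on its own;
// lookups use Map.has so falsy values are found too.
function Envs(car, cdr) { this.car = car; this.cdr = cdr }
Envs.prototype.get = function (k) {
  for (var e = this; e; e = e.cdr) if (e.car.has(k)) return e.car.get(k)
}
Envs.prototype.set = function (k, v) {
  for (var e = this; e; e = e.cdr) {
    if (e.car.has(k)) { e.car.set(k, v); return this }
  }
  throw new ReferenceError(k + ' is not defined')
}
var envs = new Envs(new Map(), null)
function call(f, ...args) { return f.func(f.envs, ...args) }

// Hypothetical input:
//   var f = function (x, y) {
//     var z = 7
//     var g = function () { x = x + 1; return x + z }
//     return g() + y
//   }
//   var r = f(1, 2)
var f = {
  func: function (envs, x, y) {
    envs = new Envs(new Map().set('x', x).set('z', 7), envs)
    var g = {
      func: function (envs) {
        envs = new Envs(new Map(), envs)
        envs.set('x', envs.get('x') + 1)      // was: x = x + 1
        return envs.get('x') + envs.get('z')  // was: x + z
      },
      envs: envs
    }
    return call(g) + y                        // was: g() + y (y is never captured)
  },
  envs: envs
}
var r = call(f, 1, 2)                         // was: f(1, 2)
console.log(r) // 11
```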
A couple of caveats.
How did we know that `x` and `z` needed to be moved to the heap but not `y`? Answer: the simplest (but possibly not optimal) thing is to just move anything to the heap that is referenced in an enclosed function body.
The implementation I have given leaks a ton of memory and requires function calls to access local variables moved to the heap instead of inlining those accesses. A real implementation wouldn't do these things.
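As a rough idea of what inlined access could look like, here is a sketch under the assumption that the compiler knows statically which environment record holds each variable. Each environment is then a plain object and every access becomes a direct property read or write. The `env` and `parent` names are hypothetical, and this is not a claim about how any particular engine works.

```javascript
// Hypothetical alternative encoding: one plain record per scope that has
// escaping variables, linked to the enclosing record through `parent`.
function call(f, ...args) { return f.func(f.env, ...args) }

var f = {
  func: function (parent, x) {
    var env = { parent: parent, x: x }  // only x escapes, so only x lives here
    env.x = env.x + 10                  // was: x = x + 10
    var g = {
      // g has no escaping locals of its own, so it allocates no record
      // and reaches x directly through the record it was created in.
      func: function (env) {
        return ++env.x                  // was: return ++x
      },
      env: env
    }
    return g
  },
  env: null                             // f captures nothing from the global scope
}

var h = call(f, 3)
var a = call(h)
var b = call(h)
console.log(a, b) // 14 15
```

Compared with the `Envs` version, there is no name lookup at run time and no environment is allocated for a function that has nothing to put in it.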
Finally, user3856986 posted an answer that makes some different assumptions than mine, so let's compare it.
The main difference is that I assumed that local variables would be kept on a traditional stack, while user3856986's answer only makes sense if the stack will be implemented as some kind of structure on the heap (but he or she is not very explicit about this requirement). A heap implementation like this can work, though it will put more load on the allocator and GC since you have to allocate and collect stack frames on the heap. With modern GC technology, this can be more efficient than you might think, but I believe that the commonly used VMs do use traditional stacks.
Also, something left vague in user3856986's answer is how the closure gets a reference to the relevant stack frame. In my code, this happens when the `envs` property is set on the closure while that stack frame is executing.
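One consequence worth seeing in running code: because `envs` is captured at the moment the closure object is built, every call to `f` yields a closure with its own environment. The sketch below reuses the transformed `f` from the first example; the names `h1` and `h2` are mine.

```javascript
// Compact copy of the header so the sketch runs on its own.
function Envs(car, cdr) { this.car = car; this.cdr = cdr }
Envs.prototype.get = function (k) {
  for (var e = this; e; e = e.cdr) if (e.car.has(k)) return e.car.get(k)
}
Envs.prototype.set = function (k, v) {
  for (var e = this; e; e = e.cdr) {
    if (e.car.has(k)) { e.car.set(k, v); return this }
  }
  throw new ReferenceError(k + ' is not defined')
}
var envs = new Envs(new Map(), null)
function call(f, ...args) { return f.func(f.envs, ...args) }

var f = {
  func: function (envs, x) {
    envs = new Envs(new Map().set('x', x), envs)
    envs.set('x', envs.get('x') + 10)
    var g = {
      func: function (envs) {
        envs = new Envs(new Map(), envs)
        return envs.set('x', envs.get('x') + 1).get('x')
      },
      envs: envs  // captured here, while this invocation of f is running
    }
    return g
  },
  envs: envs
}

var h1 = call(f, 3)
var h2 = call(f, 100)
var results = [call(h1), call(h1), call(h2)]
console.log(results) // [ 14, 15, 111 ]
```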
Finally, user3856986 writes, "All variables in b() become local variables to c() and nothing else. The function that called c() has no access to them." This is a little misleading. Given a reference to the closure `c`, the only thing that stops one from getting access to the closed variables from the call to `b` is the type system. One could certainly access these variables from assembly (otherwise, how could `c` access them?). On the other hand, as for the true local variables of `c`, it doesn't even make sense to ask if you can get access to them until some particular invocation of `c` has been specified (and if we consider some particular call, by the time control gets back to the caller, the information stored in them might already have been destroyed).
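The transformed code from my example makes this point visible, since there nothing hides the environment from whoever holds the closure. In this sketch (the `before` and `after` names are mine), the caller reads and then overwrites the captured `x` through `h.envs` without ever calling `h`.

```javascript
// Compact copy of the header so the sketch runs on its own.
function Envs(car, cdr) { this.car = car; this.cdr = cdr }
Envs.prototype.get = function (k) {
  for (var e = this; e; e = e.cdr) if (e.car.has(k)) return e.car.get(k)
}
Envs.prototype.set = function (k, v) {
  for (var e = this; e; e = e.cdr) {
    if (e.car.has(k)) { e.car.set(k, v); return this }
  }
  throw new ReferenceError(k + ' is not defined')
}
var envs = new Envs(new Map(), null)
function call(f, ...args) { return f.func(f.envs, ...args) }

var f = {
  func: function (envs, x) {
    envs = new Envs(new Map().set('x', x), envs)
    envs.set('x', envs.get('x') + 10)
    var g = {
      func: function (envs) {
        envs = new Envs(new Map(), envs)
        return envs.set('x', envs.get('x') + 1).get('x')
      },
      envs: envs
    }
    return g
  },
  envs: envs
}

var h = call(f, 3)
var before = h.envs.get('x')  // the caller reads the closed-over x directly
h.envs.set('x', 0)            // ...and overwrites it
var after = call(h)           // the closure now counts up from 0
console.log(before, after) // 13 1
```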