About Closure, Lexicalenvironment and Gc

About closure, LexicalEnvironment and GC

tl;dr answer: "Only variables referenced from inner fns are heap allocated in V8. If you use eval then all vars assumed referenced." In your second example, o2 can be allocated on the stack and is thrown away after f1 exits.

I don't think they can handle it. At least we know that some engines cannot, as this is known to be the cause of many memory leaks, as for example:

function outer(node) {
    node.onclick = function inner() { 
        // some code not referencing "node"
    };
}

where inner closes over node, forming a circular reference inner -> outer's VariableContext -> node -> inner, which will never be freed in for instance IE6, even if the DOM node is removed from the document. Some browsers handle this just fine though: circular references themselves are not a problem, it's the GC implementation in IE6 that is the problem. But now I digress from the subject.

A common way to break the circular reference is to null out all unnecessary variables at the end of outer, i.e., set node = null. The question is then whether modern JavaScript engines can do this for you: can they somehow infer that a variable is not used within inner?
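A minimal sketch of that nulling-out workaround. A plain object stands in for the DOM node so the snippet runs outside a browser; fakeNode and the click counter are illustrative assumptions, not part of the original question.

```javascript
function outer(node) {
    var clicks = 0;
    node.onclick = function inner() {
        // inner never mentions "node", only "clicks"
        clicks += 1;
        return clicks;
    };
    // Clear the local binding so outer's environment no longer points
    // back at the node; this breaks the inner -> environment -> node cycle.
    node = null;
}

// Assumption: a plain object standing in for a real DOM element.
var fakeNode = {};
outer(fakeNode);
```

The handler keeps working after node is cleared, because it only needs clicks.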

I think the answer is no, but I can be proven wrong. The reason is that the following code executes just fine:

function get_inner_function() {
    var x = "very big object";
    var y = "another big object";
    return function inner(varName) {
        alert(eval(varName));
    };
}

func = get_inner_function();

func("x");
func("y");

See for yourself using this jsfiddle example. There are no references to either x or y inside inner, but they are still accessible using eval. (Amazingly, if you alias eval to something else, say myeval, and call myeval, you DO NOT get a new execution context - this is even in the specification, see sections 10.4.2 and 15.1.2.1.1 in ECMA-262.)
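A small sketch of the direct versus aliased eval difference mentioned above. The names get_inner_functions, hiddenLocal and myeval are made up for illustration; the snippet assumes no global variable called hiddenLocal exists.

```javascript
function get_inner_functions() {
    var hiddenLocal = "very big object";
    var myeval = eval; // alias: calls through it are indirect evals
    return {
        // Direct eval runs in the calling function's scope chain.
        direct: function (varName) { return eval(varName); },
        // Indirect eval runs in the global scope only.
        indirect: function (varName) { return myeval(varName); }
    };
}

var fns = get_inner_functions();
var directResult = fns.direct("hiddenLocal");

var indirectError = null;
try {
    fns.indirect("hiddenLocal");
} catch (e) {
    indirectError = e; // hiddenLocal is not visible from the global scope
}
```

Only the direct call can see hiddenLocal, which is why an engine only has to give up on its analysis when it sees a direct eval.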

Edit: As per your comment, it appears that some modern engines actually do some smart tricks, so I tried to dig a little more. I came across this forum thread discussing the issue, and in particular, a link to a tweet about how variables are allocated in V8. It also specifically touches on the eval problem. It seems that it has to parse the code in all inner functions and see what variables are referenced, or if eval is used, and then determine whether each variable should be allocated on the heap or on the stack. Pretty neat. Here is another blog that contains a lot of details on the ECMAScript implementation.

This has the implication that even if an inner function never "escapes" the call, it can still force variables to be allocated on the heap. E.g.:

function init(node) {

    var someLargeVariable = "...";

    function drawSomeWidget(x, y) {
        library.draw(x, y, someLargeVariable);
    }

    drawSomeWidget(1, 1);
    drawSomeWidget(101, 1);

    return function () {
        alert("hi!");
    };
}

Now, as init has finished its call, someLargeVariable is no longer referenced and should be eligible for deletion, but I suspect that it is not, unless the inner function drawSomeWidget has been optimized away (inlined?). If so, this could probably occur pretty frequently when using self-executing functions to mimic classes with private / public methods.

Answer to Raynos comment below. I tried the above scenario (slightly modified) in the debugger, and the results are as I predict, at least in Chrome:

Screenshot of Chrome debugger
When the inner function is being executed, someLargeVariable is still in scope.

If I comment out the reference to someLargeVariable in the inner drawSomeWidget method, then you get a different result:

Screenshot of Chrome debugger 2
Now someLargeVariable is not in scope, because it could be allocated on the stack.

How JavaScript closures are garbage collected

I tested this in IE9+ and Firefox.

function f() {
  var some = [];
  while(some.length < 1e6) {
    some.push(some.length);
  }
  function g() { some; } //removing this fixes a massive memory leak
  return function() {};   //or removing this
}

var a = [];
var interval = setInterval(function() {
  var len = a.push(f());
  if(len >= 500) {
    clearInterval(interval);
  }
}, 10);

Live site here.

I hoped to wind up with an array of 500 function() {}'s, using minimal memory.

Unfortunately, that was not the case. Each empty function holds on to a (forever unreachable, but not GC'ed) array of a million numbers.

Chrome eventually halts and dies, Firefox finishes the whole thing after using nearly 4GB of RAM, and IE grows asymptotically slower until it shows "Out of memory".

Removing either one of the commented lines fixes everything.

It seems that all three of these browsers (Chrome, Firefox, and IE) keep an environment record per context, not per closure. Boris hypothesizes the reason behind this decision is performance, and that seems likely, though I'm not sure how performant it can be called in light of the above experiment.

If I need a closure referencing some (granted I didn't use it here, but imagine I did), if instead of

function g() { some; }

I use

var g = (function(some) { return function() { some; }; })(some);

it will fix the memory problems by moving the closure to a different context than my other function.
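Put together, the workaround might look like the sketch below. The array is shrunk to 1000 entries and the loop to 5 iterations so it runs quickly; count is a hypothetical name, and whether memory is actually reclaimed still depends on the engine.

```javascript
function f() {
  var some = [];
  while (some.length < 1000) {
    some.push(some.length);
  }
  // g is created inside the IIFE, so it closes over the IIFE's own
  // `some` parameter rather than over f's `some` binding.
  var g = (function (some) {
    return function () { return some.length; };
  })(some);
  var count = g(); // g is used while f runs
  // The returned function shares f's context, but no function created
  // directly in f references f's `some` any more.
  return function () { return count; };
}

var a = [];
for (var i = 0; i < 5; i++) {
  a.push(f());
}
```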

This will make my life much more tedious.

P.S. Out of curiosity, I tried this in Java (using its ability to define classes inside of functions). GC works as I had originally hoped for JavaScript.

JavaScript Memory leak from closure lexical environment

The primary answer is that in your second code block, no direct reference to either of the closures (closure1 or someMethod) survives the return of outer (nothing outside outer refers to them), and so there's nothing left that refers to the context where they were created, and that context can be cleaned up. In your first code block, though, a direct reference to someMethod survives the return, as part of the object that you're assigning to aThing, and so the context as a whole cannot be GC'd.

Let's follow what happens with your first block:

After the first execution of outer, we have (ignoring a bunch of details):


            +−−−−−−−−−−−−−+
aThing−−−−−>| (object #1) |     
            +−−−−−−−−−−−−−+     
            | str: ...    |     +−−−−−−−−−−−−−−−−−−−−+
            | someMethod  |−−−−>| (context #1)       |
            +−−−−−−−−−−−−−+     +−−−−−−−−−−−−−−−−−−−−+
                                | something: null    |
                                | closure1: function |
                                +−−−−−−−−−−−−−−−−−−−−+

after the second execution:


            +−−−−−−−−−−−−−+
aThing−−−−−>| (object #2) |     
            +−−−−−−−−−−−−−+     
            | str: ...    |     +−−−−−−−−−−−−−−−−−−−−+     
            | someMethod  |−−−−>| (context #2)       |     
            +−−−−−−−−−−−−−+     +−−−−−−−−−−−−−−−−−−−−+     +−−−−−−−−−−−−−+                           
                                | something          |−−−−>| (object #1) |                           
                                | closure1: function |     +−−−−−−−−−−−−−+                           
                                +−−−−−−−−−−−−−−−−−−−−+     | str: ...    |     +−−−−−−−−−−−−−−−−−−−−+
                                                           | someMethod  |−−−−>| (context #1)       |
                                                           +−−−−−−−−−−−−−+     +−−−−−−−−−−−−−−−−−−−−+
                                                                               | something: null    |
                                                                               | closure1: function |
                                                                               +−−−−−−−−−−−−−−−−−−−−+

after the third execution:


            +−−−−−−−−−−−−−+
aThing−−−−−>| (object #3) |     
            +−−−−−−−−−−−−−+     
            | str: ...    |     +−−−−−−−−−−−−−−−−−−−−+     
            | someMethod  |−−−−>| (context #3)       |     
            +−−−−−−−−−−−−−+     +−−−−−−−−−−−−−−−−−−−−+     +−−−−−−−−−−−−−+                                                                          
                                | something          |−−−−>| (object #2) |                                                                          
                                | closure1: function |     +−−−−−−−−−−−−−+                                                                          
                                +−−−−−−−−−−−−−−−−−−−−+     | str: ...    |     +−−−−−−−−−−−−−−−−−−−−+                                               
                                                           | someMethod  |−−−−>| (context #2)       |                                               
                                                           +−−−−−−−−−−−−−+     +−−−−−−−−−−−−−−−−−−−−+     +−−−−−−−−−−−−−+                           
                                                                               | something          |−−−−>| (object #1) |                           
                                                                               | closure1: function |     +−−−−−−−−−−−−−+                           
                                                                               +−−−−−−−−−−−−−−−−−−−−+     | str: ...    |     +−−−−−−−−−−−−−−−−−−−−+
                                                                                                          | someMethod  |−−−−>| (context #1)       |
                                                                                                          +−−−−−−−−−−−−−+     +−−−−−−−−−−−−−−−−−−−−+
                                                                                                                              | something: null    |
                                                                                                                              | closure1: function |
                                                                                                                              +−−−−−−−−−−−−−−−−−−−−+

You can see where this is going.

Since the second block never retains a reference to closure1 or someMethod, neither of them keeps the context in memory.
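The question's code is not reproduced here, so the following is only a reconstruction of what the first block presumably looks like, inferred from the names in the diagrams (aThing, outer, something, closure1, str, someMethod). The string size and return values are assumptions.

```javascript
var aThing = null;

function outer() {
    // `something` points at the previous aThing (null on the first call).
    var something = aThing;
    // closure1 references `something`, so `something` is stored in the
    // context shared by every function created in this call to outer.
    var closure1 = function () {
        if (something) { return "previous thing exists"; }
    };
    aThing = {
        str: new Array(1000).join("*"),
        // someMethod uses nothing from outer, but it is created in the
        // same context as closure1 and so keeps that context reachable.
        someMethod: function () { return "hi"; }
    };
}

outer(); // context #1
outer(); // context #2 -> object #1 -> context #1
outer(); // context #3 -> object #2 -> context #2 -> ...
```

Each call adds one more link to the chain shown in the diagrams, since the surviving someMethod keeps its whole context, including something, reachable.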

When originally answering your question in 2015 I was slightly surprised that V8 (Chrome's JavaScript engine) didn't optimize this leak away, since only someMethod is retained, and someMethod doesn't actually use something or closure1 (or eval or new Function or debugger). Although in theory it has references to them via the context, static analysis would show that they can't actually be used and so could be dropped. But closure optimization is really easy to disturb. I guess either something in there is disturbing it, or the V8 team found that doing that level of analysis wasn't worth the runtime cost. I do recall seeing a tweet from one of the V8 team saying that it used to do more closure optimization than it does now (this edit is in Sep 2021) because the trade-off wasn't worth it.

Over how much of its enclosing scope does a (javascript) closure close?

I'm mostly interested in whether there is some kind of specification for these cases

The ECMAScript specification does not really detail this. It simply says that a function closes over the whole lexical environment which includes all variables in all parent scopes, organised in so-called environment records.

Yet it does not specify how an implementation should do garbage-collection - so engines do have to optimise their closures themselves - and they typically do, when they can deduce that some "closed over" variable is never needed (referenced). Specifically, if you do use eval anywhere in the closure, they cannot do that of course, and have to retain everything.

not so much about the behavior of some specific implementation

Regardless, you'll want to have a look at How JavaScript closures are garbage collected, garbage collection with node.js, About closure, LexicalEnvironment and GC and How are closures and scopes represented at run time in JavaScript

JavaScript Closures Concerning Unreferenced Variables

The question is -- what happens to x?

The answer varies depending on theory vs. implementation.

In theory, yes, x is kept alive, because the closure (the anonymous function) has a reference to the binding object of the context of the call to foo, which includes x.

In practice, modern JavaScript engines are quite smart. If they can prove to themselves that x cannot be referenced from the closure, they can leave it out. The degree to which they do that will vary from engine to engine. Example: V8 (the engine in Chrome and elsewhere) will start out with x, y, and even the object that x refers to on the stack, not the heap; then when exiting foo, it looks to see what things still have outstanding references, and moves those to the heap. Then it pops the stack pointer, and the other things don't exist anymore. :-)

So, how can they prove it? Basically, if the code in the closure doesn't refer to it and doesn't use eval or new Function, the JavaScript engine is likely to be able to know that x isn't needed.

If you need to be sure that even if x still exists, the object is available for GC even on older browsers that might be literal (dumb) about it, you can do this:

x = undefined;

That means nothing keeps a reference to the object x used to refer to. So even though x still exists, at least the object it referred to is ready for reaping. And it's harmless. But again, modern engines will optimize things for you, I wouldn't worry about it unless you were faced with a specific performance problem and tracked it down to some code allocating large objects that aren't referenced once the function returns, but don't seem to be getting cleaned up.
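A small sketch of that defensive clearing. The returned closure deliberately uses a direct eval so that the engine has to keep every binding of foo alive, which makes the effect observable; the names y and probe and the shape of the object are assumptions for illustration.

```javascript
function foo() {
    var x = { payload: new Array(1000).fill(0) };
    var y = x.payload.length;
    // Drop the only reference to the large object before returning.
    x = undefined;
    // Direct eval forces the engine to retain all of foo's bindings.
    return function probe(code) { return eval(code); };
}

var probe = foo();
var yValue = probe("y");        // the small value survives
var xType = probe("typeof x");  // the binding exists but holds undefined
```

Even in this worst case, where x itself cannot be optimized away, the object it used to refer to is no longer reachable through the closure.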

Unfortunately, as you pointed out below, there are limits to this, such as the one mentioned in this question. But it's not all doom and gloom, see below under the profile snapshot for what you can do...

Let's look at this code in V8, using Chrome's heap snapshot feature:

function UsedFlagClass_NoFunction() {}
function UnusedFlagClass_NoFunction() {}
function build_NoFunction() {
  var notused = new UnusedFlagClass_NoFunction();
  var used = new UsedFlagClass_NoFunction();
  return function() { return used; };
}

function UsedFlagClass_FuncDecl() {}
function UnusedFlagClass_FuncDecl() {}
function build_FuncDecl() {
  var notused = new UnusedFlagClass_FuncDecl();
  var used = new UsedFlagClass_FuncDecl();
  function unreachable() { notused; }
  return function() { return used; };
}

function UsedFlagClass_FuncExpr() {}
function UnusedFlagClass_FuncExpr() {}
function build_FuncExpr() {
  var notused = new UnusedFlagClass_FuncExpr();
  var used = new UsedFlagClass_FuncExpr();
  var unreachable = function() { notused; };
  return function() { return used; };
}

window.noFunction = build_NoFunction();
window.funcDecl = build_FuncDecl();
window.funcExpr = build_FuncExpr();

And here's the expanded heap snapshot:

Screenshot of the expanded heap snapshot

When processing the build_NoFunction function, V8 successfully identifies that the object referenced from notused cannot be reached and gets rid of it, but it doesn't do so in either of the other scenarios, despite the fact that unreachable cannot be reached, and therefore notused cannot be reached through it.

So what can we do to avoid this kind of unnecessary memory consumption?

Well, for anything that can be handled via static analysis, we can throw a JavaScript-to-JavaScript compiler at it, like Google's Closure Compiler. Even in "simple" mode, the beautified result of "compiling" the code above with Closure Compiler looks like this:

function UsedFlagClass_NoFunction() {}
function UnusedFlagClass_NoFunction() {}
function build_NoFunction() {
    new UnusedFlagClass_NoFunction;
    var a = new UsedFlagClass_NoFunction;
    return function () {
        return a
    }
}

function UsedFlagClass_FuncDecl() {}
function UnusedFlagClass_FuncDecl() {}
function build_FuncDecl() {
    new UnusedFlagClass_FuncDecl;
    var a = new UsedFlagClass_FuncDecl;
    return function () {
        return a
    }
}

function UsedFlagClass_FuncExpr() {}
function UnusedFlagClass_FuncExpr() {}
function build_FuncExpr() {
    new UnusedFlagClass_FuncExpr;
    var a = new UsedFlagClass_FuncExpr;
    return function () {
        return a
    }
}
window.noFunction = build_NoFunction();
window.funcDecl = build_FuncDecl();
window.funcExpr = build_FuncExpr();

As you can see, static analysis told CC that unreachable was dead code, and so it removed it entirely.

But of course, you probably used unreachable for something during the course of the function, and just don't need it after the function completes. It's not dead code, but it is code you don't need when the function ends. In that case, you have to resort to:

notused = undefined;

at the end. Since you don't need the function anymore, you might also release it:

notused = unreachable = undefined;

(Yes, you can do that, even when it was created with a function declaration.)

And no, sadly, just doing:

unreachable = undefined;

...doesn't succeed (as of this writing) in making V8 figure out that notused can be cleaned up. :-(
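Applied to the function-declaration variant from the snapshot test, the manual release could look like this sketch. Here unreachable is actually called once, and the wasChecked flag is an assumption added so the helper has a visible effect.

```javascript
function UsedFlagClass_FuncDecl() {}
function UnusedFlagClass_FuncDecl() {}

function build_FuncDecl() {
  var notused = new UnusedFlagClass_FuncDecl();
  var used = new UsedFlagClass_FuncDecl();
  function unreachable() { return notused instanceof UnusedFlagClass_FuncDecl; }
  // The helper is needed while build_FuncDecl runs...
  var wasChecked = unreachable();
  // ...but not afterwards, so release the object and the helper.
  // Assigning to a name created by a function declaration is allowed.
  notused = unreachable = undefined;
  return function () { return wasChecked ? used : null; };
}

var funcDecl = build_FuncDecl();
var result = funcDecl();
```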

Does this pattern cause a circular reference in a closure?

This could cause a circular reference within the JavaScript engine (because the parent lexical environment of innerFn includes outerVar, which includes innerFn), but it does not cause a circular reference that can be observed by JavaScript code.

When outerFn runs, the function innerFn is defined. In JavaScript, a newly-defined function has access to all variables currently accessible in scope, so code inside of innerFn can access outerVar:

function outerFn() {
  var outerVar = {};
  function innerFn() {
    alert(outerVar);    // totally fine
  }
  return innerFn;
}

In ECMAScript terms, this is achieved because every function has a lexical environment, called [[Scope]], that is used to resolve variable identifiers. A newly-defined function's [[Scope]] internal property is set to the lexical environment of its parent function. So, here, the [[Scope]] of innerFn is the lexical environment of outerFn, which contains a reference to outerVar.

In ECMAScript terms, the circular reference path goes:

innerFn
innerFn's [[Scope]] (a lexical environment)
innerFn's [[Scope]]'s environment record
the outerVar binding in innerFn's [[Scope]]'s environment record
the variable associated with the outerVar binding in innerFn's [[Scope]]'s environment record
this variable has innerFn as a property value

However, since you can't access a function's [[Scope]] internal property from JavaScript code, you can't observe a circular reference from the code.

Bonus info

Note that a clever implementation will not actually store this circular reference in your code, because it sees that outerVar is never used in any of outerFn's child functions. The binding for outerVar can be safely forgotten entirely when outerFn ends. It is further interesting to note that this optimization is not possible with eval, because it's not possible to recognize whether innerFn will ever use outerVar:

function outerFn() {
  var outerVar = {};
  function innerFn(codeStr) {
    alert(eval(codeStr));    // will `codeStr` ever be "outerVar"?
  }
  return innerFn;
}

Is it true that every function in JavaScript is a closure?

Is the function created by Function constructor also a closure?

Yes, it closes over the global scope. That might be unintuitive because all other JavaScript closures close over their lexical scope, but it still matches our definition of a closure. In your example, a is a free variable, and resolves to the a in another scope when the inner/fn function is called somewhere.
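A short sketch of that difference. The name scopeProbe is hypothetical, and the global is set through globalThis so the snippet behaves the same in a browser and in Node.

```javascript
// Assumption: a global created only for this demonstration.
globalThis.scopeProbe = "global value";

function outer() {
    var scopeProbe = "local value";
    // Functions built with the Function constructor close over the
    // global scope, not over the scope in which they are created.
    var viaConstructor = new Function("return scopeProbe;");
    // An ordinary function expression closes over its lexical scope.
    var viaExpression = function () { return scopeProbe; };
    return [viaConstructor(), viaExpression()];
}

var results = outer();
```

The first entry of results comes from the global binding and the second from the local one.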

If an inner function doesn't have any free variables, can we still call it a closure?

Depends on whom you ask. Some say Yes, others call them "uninteresting closures". Personally I say No, because they don't reference an outer scope.

Arrow functions and memory leak

tl;dr

according to ECMAScript, the complete lexical environment is bound
in practice, engines optimize this if possible by binding only the used variables
the optimization is not possible for example when eval() is being used inside

I found a great article series where this is discussed in-depth:

http://dmitrysoshnikov.com/ecmascript/es5-chapter-3-1-lexical-environments-common-theory/, especially "Combined environment frame model" and the follow-up article
http://dmitrysoshnikov.com/ecmascript/es5-chapter-3-2-lexical-environments-ecmascript-implementation/, for example "Eval and inner functions may break optimizations"

The articles are quite old but still valid, which you can verify by yourself (see below).

For your example: in theory someValues would be bound (and not garbage collected) although it's not used in the record.getPrice closure. But in practice only the variable you use there (sum) is bound. And the fact that sum is bound has no effect on the binding of someValues, because sum is derived from someValues but needs no further reference to it. (It would be a different matter if it had been defined as const sum = () => _.sumBy(someValues, 'total').)
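The difference between the two definitions of sum can be made visible without looking at memory at all. The sketch below uses reduce instead of lodash, and makeRecord / makeLazyRecord are hypothetical names; only someValues, sum, total and getPrice come from the question.

```javascript
// sum is computed once; the closure only needs the resulting number.
function makeRecord(someValues) {
    const sum = someValues.reduce((acc, v) => acc + v.total, 0);
    return { getPrice: () => sum };
}

// sum is itself a closure over someValues, so the array stays referenced.
function makeLazyRecord(someValues) {
    const sum = () => someValues.reduce((acc, v) => acc + v.total, 0);
    return { getPrice: () => sum() };
}

const values = [{ total: 2 }, { total: 3 }];
const eager = makeRecord(values);
const lazy = makeLazyRecord(values);

// Changing the array afterwards only affects the lazy variant,
// which shows that it still holds on to someValues.
values.push({ total: 10 });
const eagerPrice = eager.getPrice();
const lazyPrice = lazy.getPrice();
```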

Verification: execute the following in the browser console:

(() => {
    //eval(); // <- uncomment this line for second test
    const thisIsUsed = 1;
    const isThisBound = 2;
    return () => {
        debugger;
        return ('result: ' + thisIsUsed);
    }
})()();

When the debugger kicks in, take a look at the "Scope" (Chrome). You could also add thisIsUsed and isThisBound to the "Watch" list.

Here's a screenshot using Chrome (Canary, version 85.0.4154.0):

Screenshot of Chrome Developer Tools Debugger

The same behavior can be observed with a current Firefox (version 76.0.1).

According to Dmitry Soshnikov's articles, eval() can break the optimization. This is easy to understand, as the engine then has to assume that any variable may be accessed. This behavior can also be verified: just uncomment the line in the code sample above.
