Why does a lambda expression in C# cause a memory leak?

Question

Note: this is not just random, useless code; it is an attempt to reproduce an issue with lambda expressions and memory leaks in C#.

Examine the following program in C#. It's a console application that simply:

Creates a new object of type Test
Writes to the console that the object was created
Calls garbage collection
Waits for any user input
Shuts down

I run this program using JetBrains DotMemory, and I take two memory snapshots: one after the object was initialized, and another after it's been collected. I compare the snapshots and get what I expect: one dead object of type Test.

But here's the quandary: I then create a local lambda expression inside the object's constructor and I DO NOT USE IT ANYWHERE. It's just a local constructor variable. I run the same procedure in DotMemory, and suddenly, I get an object of type Test+<>, which survives garbage collection.

See the attached retention path report from DotMemory: The lambda expression has a pointer to the Test+<> object, which is expected. But who has a pointer to the lambda expression, and why is it kept in memory?

Also, this Test+<> object: I assume it is just a temporary object that holds the lambda method and has nothing to do with the original Test object. Am I right?

using System;

public class Test
{
    public Test()
    {
        // this line causes a leak
        Func<object, bool> t = _ => true;
    }

    public void WriteFirstLine()
    {
        Console.WriteLine("Object allocated...");
    }

    public void WriteSecondLine()
    {
        Console.WriteLine("Object deallocated. Press any button to exit.");
    }
}

class Program
{
    static void Main(string[] args)
    {
        var t = new Test();
        t.WriteFirstLine();
        Console.ReadLine();
        t.WriteSecondLine();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        Console.ReadLine();
    }
}

Answer 1:

If you decompile your code with a tool such as dotPeek, you will see that the compiler generated something like this:

public class Test {
    public Test() {
        if (Test.ChildGeneratedClass.DelegateInstance != null)
            return;
        Test.ChildGeneratedClass.DelegateInstance = 
            Test.ChildGeneratedClass.Instance.DelegateFunc;
    }

    public void WriteFirstLine() {
        Console.WriteLine("Object allocated...");
    }

    public void WriteSecondLine() {
        Console.WriteLine("Object deallocated. Press any button to exit.");
    }

    [CompilerGenerated]
    [Serializable]
    private sealed class ChildGeneratedClass {
        // this is what's called Test.<c> <>9 in your snapshot
        public static readonly Test.ChildGeneratedClass Instance;
        // this is Test.<c> <>9__0_0
        public static Func<object, bool> DelegateInstance;

        static ChildGeneratedClass() {
            Test.ChildGeneratedClass.Instance = new Test.ChildGeneratedClass();
        }

        internal bool DelegateFunc(object _) {
            return true;
        }
    }
}

So the compiler created a nested class, made your function an instance method of that class, stored a singleton instance of that class in a static field, and finally created a second static field holding your Func<object, bool>, which references the method DelegateFunc. It is therefore no surprise that these compiler-generated static members cannot be collected by the GC. These objects are created only once, not once per Test object, so I would not really call this a "leak".
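To see these compiler-generated statics for yourself, a reflection-based sketch like the following can list them. It assumes the compiler emits a non-public nested class marked [CompilerGenerated], as in the decompiled output above; the actual type and field names are implementation details and may differ between compiler versions.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public class Test
{
    public Test()
    {
        // Non-capturing lambda, same as in the question.
        Func<object, bool> t = _ => true;
    }
}

class Program
{
    static void Main()
    {
        // Create several instances; the cached statics should exist only once.
        new Test();
        new Test();

        // Assumption: the lambda lives in a non-public nested type of Test.
        foreach (Type nested in typeof(Test).GetNestedTypes(BindingFlags.NonPublic))
        {
            if (!nested.IsDefined(typeof(CompilerGeneratedAttribute), false))
                continue;

            BindingFlags flags =
                BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic;

            // Print every static field and whatever it currently holds.
            foreach (FieldInfo field in nested.GetFields(flags))
            {
                object value = field.GetValue(null) ?? "null";
                Console.WriteLine($"{nested.Name}.{field.Name} = {value}");
            }
        }
    }
}
```

If the compiler behaves as described above, the listing contains one singleton field and one delegate field no matter how many Test objects are created.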

Answer 2:

I suspect that what you're seeing is the effect of a compiler optimization.

Suppose Test() is called multiple times. The compiler could create a new delegate each time, but that seems a little wasteful. The lambda expression doesn't capture this or any local variables or parameters, so a single delegate instance can be reused for all invocations of Test(). The compiler emits code that creates the delegate lazily and stores it in a static field. So it's like this:

private static Func<object, bool> cachedT;

public Test()
{
    if (cachedT == null)
    {
        cachedT = _ => true;
    }
    Func<object, bool> t = cachedT;
}

Now that does create an object that will never be garbage collected, but it reduces GC pressure if Test is called frequently. The compiler can't really know which is likely to be better, unfortunately.

This is detectable with reference equality by looking at the delegates resulting from the lambda expression. For example, this prints True (at least for me; it's a compiler implementation detail):

using System;

class Test
{
    private Func<object> CreateFunc()
    {
        return () => new object();
    }

    static void Main()
    {
        Test t = new Test();
        var f1 = t.CreateFunc();
        var f2 = t.CreateFunc();
        Console.WriteLine(ReferenceEquals(f1, f2));
    }
}

But if you change the lambda expression to () => this, it prints False: the lambda now captures this, so a single cached delegate can no longer be shared.
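For comparison, here is the capturing variant as a complete program. It is a minimal sketch that reuses the Test class from the example above and changes only the lambda body; as before, the result depends on compiler implementation details.

```csharp
using System;

class Test
{
    private Func<object> CreateFunc()
    {
        // Captures 'this', so the delegate is bound to this instance
        // and cannot be cached in a static field.
        return () => this;
    }

    static void Main()
    {
        Test t = new Test();
        var f1 = t.CreateFunc();
        var f2 = t.CreateFunc();

        // Each call is expected to produce a separate delegate object.
        Console.WriteLine(ReferenceEquals(f1, f2));
    }
}
```

Because each delegate is bound to a particular Test instance, it becomes collectible along with that instance once nothing else references it.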

Source: https://stackoverflow.com/questions/46962507/why-does-a-lambda-expression-in-c-sharp-cause-a-memory-leak

Tags

lambda

memory-leaks