In a program I'm using the dynamic keyword to invoke the best matching method. However, I have found that the framework crashes with a StackOverflowException.
The problem is that you are deriving a type from itself:
abstract class SetTreeNode<TKey> : KeyTreeNode<SetTreeNode<TKey>, TKey> { }
The type SetTreeNode<TKey> becomes KeyTreeNode<SetTreeNode<TKey>, TKey>, which becomes KeyTreeNode<KeyTreeNode<SetTreeNode<TKey>, TKey>, TKey>, and this goes on and on until the stack overflows.
I don't know what you are trying to accomplish by using this complex model, but that is your problem.
I managed to reduce it to this example which fails:
interface ISortedSet<TKey> { }
sealed class SetTree<TKey> : BalancedTree<SetTreeNode<TKey>>, ISortedSet<TKey> { }
abstract class BalancedTree<TNode> { }
abstract class SetTreeNode<TKey> : KeyTreeNode<SetTreeNode<TKey>, TKey> { }
abstract class KeyTreeNode<TNode, TKey> : TreeNode<TNode> { }
abstract class TreeNode<TNode> { }
And then I fixed it by doing this:
interface ISortedSet<TKey> { }
sealed class SetTree<TKey> : BalancedTree<SetTreeNode<TKey>>, ISortedSet<TKey> { }
abstract class BalancedTree<TNode> { }
abstract class SetTreeNode<TKey> : KeyTreeNode<TKey, TKey> { }
abstract class KeyTreeNode<TNode, TKey> : TreeNode<TNode> { }
abstract class TreeNode<TNode> { }
The only difference between the two is that I replaced KeyTreeNode<SetTreeNode<TKey>, TKey> with KeyTreeNode<TKey, TKey>.
I created a shorter, more to-the-point SSCCE that illustrates the problem:
class Program
{
    static void Main()
    {
        dynamic obj = new Third<int>();
        Print(obj); // causes stack overflow
    }

    static void Print(object obj) { }
}

class First<T> where T : First<T> { }
class Second<T> : First<T> where T : First<T> { }
class Third<T> : Second<Third<T>> { }
Looking at the call stack, it seems to be bouncing between two pairs of methods in the C# runtime binder:
Microsoft.CSharp.RuntimeBinder.SymbolTable.LoadSymbolsFromType(
System.Type originalType
)
Microsoft.CSharp.RuntimeBinder.SymbolTable.GetConstructedType(
System.Type type,
Microsoft.CSharp.RuntimeBinder.Semantics.AggregateSymbol agg
)
and
Microsoft.CSharp.RuntimeBinder.Semantics.TypeManager.SubstTypeCore(
Microsoft.CSharp.RuntimeBinder.Semantics.CType type,
Microsoft.CSharp.RuntimeBinder.Semantics.SubstContext pctx
)
Microsoft.CSharp.RuntimeBinder.Semantics.TypeManager.SubstTypeArray(
Microsoft.CSharp.RuntimeBinder.Semantics.TypeArray taSrc,
Microsoft.CSharp.RuntimeBinder.Semantics.SubstContext pctx
)
If I had to hazard a guess, some of the generic type constraint nesting you've got going on has managed to confuse the binder into recursively walking the types involved in the constraints along with the constraints themselves.
Go ahead and file a bug on Connect; if the compiler doesn't get caught by this, the runtime binder probably shouldn't either.
This code example runs correctly:
class Program
{
    static void Main()
    {
        dynamic obj = new Second<int>();
        Print(obj);
    }

    static void Print(object obj) { }
}

internal class First<T>
    where T : First<T> { }

internal class Second<T> : First<Second<T>> { }
This leads me to believe (without much knowledge of the internals of the runtime binder) that it's proactively checking for recursive constraints, but only one level deep. With an intermediate class in the hierarchy, the binder ends up not detecting the recursion and tries to walk it instead. (But that's all just an educated guess. I'd add it to your Connect bug as additional information and see if it helps.)
Can someone please explain what's going on? Is this a bug in the framework?
Yes.
The issue is in the way that generic types are resolved into their particular concrete uses.
Okay, let's start with some obvious stuff so as to build up to what the compiler is getting wrong. As you know, with something like List<int> the compilers (whether the dynamic compiler or any of the static compilers since C# 2 introduced generics) have to take the List<> type and the int type and combine information about both of them to produce the List<int> type.
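To make that combination step concrete, ordinary reflection (not the binder's own machinery) expresses the same idea: the open definition plus a type argument gives back the one constructed type. The class name here is just for illustration.

using System;
using System.Collections.Generic;

class CombineExample
{
    static void Main()
    {
        Type open = typeof(List<>);                       // the generic definition, no argument yet
        Type closed = open.MakeGenericType(typeof(int));  // combine it with int

        Console.WriteLine(closed == typeof(List<int>));   // True: the same constructed type
    }
}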
Now, consider:
public class Base<T, U>
{
}
public class Derived<T> : Base<T, int>
{
}
Derived<long> l = new Derived<long>();
Here you can see that, in doing the same sort of work on the types Derived<T> and long, the compiler has to fill three slots:
1. The T defined on Derived<>, which gets filled with long.
2. The T defined on Base<,>, which gets filled with the T defined on Derived<>, which was filled with long.
3. The U defined on Base<,>, which gets filled with int.

When you consider nested classes, long inheritance chains, generic types derived from other generic types and adding further generic parameters, and so on, you can see that there's a good few different permutations to cover. If you're starting with Derived<long> and have to answer the question "what is the base type of the class?" (which obviously compilers need to consider a lot) then all of this has to be worked out.
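As a quick concrete check of those fillings, reflection over the Base/Derived pair above shows the result the compiler has to arrive at (the driver class is just for illustration):

using System;

public class Base<T, U> { }
public class Derived<T> : Base<T, int> { }

class SlotExample
{
    static void Main()
    {
        Type constructed = typeof(Derived<long>);

        // Derived<>'s T was filled with long...
        Console.WriteLine(constructed.GetGenericArguments()[0]);              // System.Int64

        // ...and its base type has both of Base<,>'s slots filled: Base<long, int>.
        Console.WriteLine(constructed.BaseType == typeof(Base<long, int>));   // True
    }
}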
The dynamic compiler was based on the pre-Roslyn static compiler, which was based on the compiler before that, which was actually written in C++ rather than C# (there is still quite a lot of the dynamic compiler that, while it's in C#, smells of C++). The two can be considered more similar in end point (code that can be executed) than in start point: the static compiler starts from a bunch of text that has to be parsed to work out what types and operations are involved, whereas the dynamic compiler starts from already-existing types and operations represented by objects and flags.
One thing they both need to know is that if a type is mentioned several times, it's the same type (that's pretty much the most basic definition of what a type means, after all). If we compile new List<int>((int)x), that would obviously not work unless the compiler knew int meant the same thing both times. They also need to avoid chewing up gigabytes of RAM.
Both problems are solved by a hash-consing or flyweight-like approach. When the compiler goes to construct the object representing a particular type, it first checks whether it has already constructed that type, and only constructs a new one if necessary. This also helps a lot of relationships within hierarchies to be constructed correctly, though clearly not in the particular case in your question.
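A minimal sketch of that flyweight idea, with invented names rather than the binder's real data structures, might look like this:

using System;
using System.Collections.Generic;

// Invented sketch: ask for the same (definition, arguments) pair twice and you get back
// the one representation object that was built the first time round.
sealed class TypeStore<TRepresentation>
{
    readonly Dictionary<string, TRepresentation> store = new Dictionary<string, TRepresentation>();

    public TRepresentation GetOrAdd(Type definition, Type[] arguments, Func<TRepresentation> create)
    {
        // A string key is a shortcut for the sketch; the real thing hashes the parts directly.
        string key = definition + "[" + string.Join(",", Array.ConvertAll(arguments, a => a.ToString())) + "]";

        if (!store.TryGetValue(key, out TRepresentation existing))
            store[key] = existing = create();   // construct only if this combination is new
        return existing;
    }
}

Two calls with the same definition and arguments return the same object, which gives both the identity guarantee and the memory saving described above.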
For most types (all except for a few special cases like pointers, references, arrays, nullables [though there's an exception to that exception], type parameters… okay, there's actually quite a few exceptions) the state is mostly three things:
1. A symbol representing the type itself (the one List<T> behind every List<int>, List<string> and so on), which knows things like its name, its base type and its members.
2. The type parameters that the type declares (so for Dictionary<int, int> it has the TKey and TValue of Dictionary<TKey, TValue>).
3. The type arguments that fill those parameters (the T of List<T> for the open type, the int of List<int> for the constructed type, or a mix for e.g. Dictionary<T, int> relative to some generic type or method that defines T).

Okay, so far, so good. If it needs to do something with List<int>.Enumerator it first either finds the List<T> symbol in the store, or adds it if new; then finds the List<T>.Enumerator symbol in the store, or adds it if new; then finds int in the store (int is preloaded as a very common type); and finally finds the type that combines List<T>.Enumerator with int in the store, or adds it if new. We now have the only List<int>.Enumerator type object.
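The same walk can be mimicked from the outside with reflection (again, ordinary reflection rather than the binder's store), including the fact that there is only one resulting type object:

using System;
using System.Collections.Generic;

class EnumeratorExample
{
    static void Main()
    {
        Type openEnumerator = typeof(List<>.Enumerator);                      // List<T>.Enumerator, still open
        Type closedEnumerator = openEnumerator.MakeGenericType(typeof(int));  // combined with int

        Console.WriteLine(closedEnumerator == typeof(List<int>.Enumerator));                 // True
        Console.WriteLine(ReferenceEquals(closedEnumerator, typeof(List<int>.Enumerator)));  // True: one object
    }
}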
The problem, which caused your bug, comes in at the end of this last step. Consider what we said above about having to assign types to base types when creating a concrete implementation of a type. The base type of a concrete generic type is a concrete type, potentially itself a concrete generic type, but the information we have here is the generic type and some type arguments: we do not yet know what the concrete base type is.
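That is roughly the substitution job the SubstTypeCore and SubstTypeArray frames in the stack trace above are concerned with: take the base type as written on the generic definition, in terms of that definition's own type parameters, and fill in the actual arguments. The same idea can be expressed with ordinary reflection over the First/Second/Third classes from the earlier example (repeated here so the snippet stands alone; the Substitute helper is mine, not the binder's):

using System;
using System.Linq;

class First<T> where T : First<T> { }
class Second<T> : First<T> where T : First<T> { }
class Third<T> : Second<Third<T>> { }

static class Subst
{
    // Replace the definition's generic parameters with the actual arguments,
    // recursing through nested constructed types.
    public static Type Substitute(Type type, Type[] parameters, Type[] arguments)
    {
        if (type.IsGenericParameter)
            return arguments[Array.IndexOf(parameters, type)];
        if (!type.IsGenericType)
            return type;
        Type[] filled = type.GetGenericArguments()
                            .Select(a => Substitute(a, parameters, arguments))
                            .ToArray();
        return type.GetGenericTypeDefinition().MakeGenericType(filled);
    }
}

class SubstExample
{
    static void Main()
    {
        Type definition = typeof(Third<>);
        Type openBase = definition.BaseType;          // Second<Third<T>>, still in terms of Third<>'s own T

        Type concreteBase = Subst.Substitute(
            openBase,
            definition.GetGenericArguments(),         // Third<>'s T
            new[] { typeof(int) });                   // the argument used for Third<int>

        Console.WriteLine(concreteBase);                                  // Second`1[Third`1[System.Int32]], i.e. Second<Third<int>>
        Console.WriteLine(concreteBase == typeof(Third<int>).BaseType);   // True
    }
}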
The method for finding the base type is lazily evaluated, but it calls into the symbol, which doesn't know which type arguments to use.
The solution used was to temporarily define that symbol's base type in terms of the concrete base type, call the lazy-loading base-type method, and then set it back again.
I don't know why something was lazy-loaded when it was called immediately after creation. At a guess, I'd say it was something that made more sense in terms of the static compilation, and so got ported over that way rather than rewrite the mechanism from scratch (which would be a more risky approach in most regards).
This works pretty well, even with quite complicated hierarchies. However, if a hierarchy is both circular in terms of type parameters and has more than one step before a non-generic type (such as object) is reached, so the fix-up has to recurse on a base type too, then the binder isn't able to find the type it's in the process of making (remember the bit about storing the objects for types), because it's been temporarily changed to make the fix-up work, and so it has to make it again. And again, and again, until you hit a StackOverflowException.
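To make the shape of that failure concrete, here is a deliberately artificial sketch (invented names and a hard-coded dependency chain, not the binder's code): a memoized recursive walk whose in-progress entry is hidden while its dependencies are resolved, so the cycle never meets the entry that should have stopped it.

using System;
using System.Collections.Generic;

class FixUpSketch
{
    static readonly Dictionary<string, object> built = new Dictionary<string, object>();

    // What constructing each type needs next: the base type for the first two, and for
    // First<Third<int>> the type argument Third<int> itself, which is how the cycle closes.
    static readonly Dictionary<string, string> needs = new Dictionary<string, string>
    {
        ["Third<int>"]         = "Second<Third<int>>",
        ["Second<Third<int>>"] = "First<Third<int>>",
        ["First<Third<int>>"]  = "Third<int>",
    };

    static object Get(string name)
    {
        if (built.TryGetValue(name, out var existing))
            return existing;                  // a cycle is supposed to end here

        var result = new object();
        built[name] = result;                 // the in-progress type is registered...

        built.Remove(name);                   // ...but the temporary fix-up hides it again,
        Get(needs[name]);                     // so walking the dependency recurses forever
        built[name] = result;                 // restored only once the walk returns

        return result;
    }

    static void Main()
    {
        Get("Third<int>");                    // recurses Third<int> -> Second<Third<int>> -> First<Third<int>> -> Third<int> -> ... until StackOverflowException
    }
}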
From Adam Maras' answer:
This leads me to believe (without much knowledge of the internals of the runtime binder) that it's proactively checking for recursive constraints, but only one level deep.
It's almost the opposite, in that the problem is proactively setting the base class in such a way as to prevent it realising it already has a type it needs. I think I managed to fix it today, though it remains to be seen whether someone spots a problem with that fix that I missed (a good thing about contributing to the framework is that there's a high standard of code review, but that of course means I can't be sure a contribution will be accepted until it's in).