Interface conflict resolution in C#

问题

This is a spin-off question based on Eric Lippert's answer on this question.

I would like to know why the C# language is designed not being able to detect the correct interface member in the following specific case. I am not looking on feedback whether designing a class this way is considered best practice.

class Turtle { }
class Giraffe { }

class Ark : IEnumerable<Turtle>, IEnumerable<Giraffe>
{
    public IEnumerator<Turtle> GetEnumerator()
    {
        yield break;
    }

    // explicit interface member 'IEnumerable.GetEnumerator'
    IEnumerator IEnumerable.GetEnumerator()
    {
        yield break;
    }

    // explicit interface member 'IEnumerable<Giraffe>.GetEnumerator'
    IEnumerator<Giraffe> IEnumerable<Giraffe>.GetEnumerator()
    {
        yield break;
    }
}

In the code above, Ark has 3 conflicting implementation of GetEnumerator(). This conflict is resolved by treating IEnumerator<Turtle>'s implementation as default, and requiring specific casts for both others.

Retrieving the enumerators works like a charm:

var ark = new Ark();

var e1 = ((IEnumerable<Turtle>)ark).GetEnumerator();  // turtle
var e2 = ((IEnumerable<Giraffe>)ark).GetEnumerator(); // giraffe
var e3 = ((IEnumerable)ark).GetEnumerator();          // object

// since IEnumerable<Turtle> is the default implementation, we don't need
// a specific cast to be able to get its enumerator
var e4 = ark.GetEnumerator();                         // turtle

Why isn't there a similar resolution for LINQ's Select extension method? Is there a proper design decision to allow the inconsistency between resolving the former, but not the latter?

 // This is not allowed, but I don't see any reason why ..
 // ark.Select(x => x);                                // turtle expected

 // these are allowed
 ark.Select<Turtle, Turtle>(x => x);
 ark.Select<Giraffe, Giraffe>(x => x);

回答1:

It's important to first understand what mechanism is being used to resolve the call to the extension method Select. C# uses a generic type inference algorithm which is fairly complex; see the C# specification for the details. (I really should write a blog article explaining it all; I recorded a video about it in 2006 but unfortunately it has disappeared.)

But basically, the idea of generic type inference on Select is: we have:

public static IEnumerable<R> Select<A, R>(
  this IEnumerable<A> items,
  Func<A, R> projection)

From the call

ark.Select(x => x)

we must deduce what A and R was intended.

Since R depends on A, and in fact is equal to A, the problem reduces to finding A. The only information we have is the type of ark. We know that ark:

Is Ark
Extends object
Implements IEnumerable<Giraffe>
Implements IEnumerable<Turtle>
IEnumerable<T> extends IEnumerable and is covariant.
Turtle and Giraffe extend Animal which extends object.

Now, if those are the only things you know, and you know that we're looking for IEnumerable<A>, what conclusions can you reach about A?

There are a number of possibilities:

Choose Animal, or object.
Choose Turtle or Giraffe by some tiebreaker.
Decide that the situation is ambiguous, and give an error.

We can reject the first option. A design principle of C# is: when faced with a choice between options, always choose one of the options or produce an error. C# never says "you gave me a choice between Apple and Cake so I choose Food". It always chooses from the choices you gave it, or it says that it has no basis on which to make a choice.

Moreover, if we chose Animal, that just makes the situation worse. See the exercise at the end of this post.

You propose the second option, and your proposed tiebreaker is "an implicitly implemented interface gets priority over an explicitly implemented interface".

This proposed tiebreaker has some problems, starting with there is no such thing as an implicitly implemented interface. Let's make your situation slightly more complicated:

interface I<T>
{
  void M();
  void N();
}
class C : I<Turtle>, I<Giraffe>
{
  void I<Turtle>.M() {} 
  public M() {} // Used for I<Giraffe>.M
  void I<Giraffe>.N() {}
  public N() {}
  public static DoIt<T>(I<T> i) {i.M(); i.N();}
}

When we call C.DoIt(new C()) what happens? Neither interface is "explicitly implemented". Neither interface is "implicitly implemented". Interface members are implicitly or explicitly implemented, not interfaces.

Now we could say "an interface that has all of its members implicitly implemented is an implicitly implemented interface". Does that help? Nope. Because in your example, IEnumerable<Turtle> has one member implicitly implemented and one member explicitly implemented: the overload of GetEnumerator that returns IEnumerator is a member of IEnumerable<Turtle> and you've explicitly implemented it.

(ASIDE: A commenter notes that the above is inelegantly worded; it is not entirely clear from the specification whether members "inherited" from "base" interfaces are "members" of the "derived" interface, or whether it is simply the case that a "derivation" relationship between interfaces is simply the statement of a requirement that any implementor of the "derived" interface must also implement the "base". The specification has historically been unclear on this point and it is possible to make arguments either way. Regardless, my point is that the derived interface requires you to implement a certain set of members, and some of those members can be implicitly implemented and some can be explicitly implemented, and we can count how many there are of each should we choose to.)

So now maybe the proposed tiebreaker is "count the members, and the interface that has the least members explicitly implemented is the winner".

So let's take a step back here and ask the question: how on earth would you document this feature? How would you explain it? Suppose a customer comes to you and says "why are turtles being chosen over giraffes here?" How would you explain it?

Now suppose the customer asks "how can I make a prediction about what the compiler will do when I write the code?" Remember, that customer might not have the source code to Ark; it might be a type in a third-party library. Your proposal makes the invisible-to-users implementation decisions of third parties into relevant factors that control whether other people's code is correct or not. Developers generally are opposed to features that make it impossible for them to understand what their code does, unless there is a corresponding boost in power.

(For example: virtual methods make it impossible to know what your code does, but they are very useful; no one has made the argument that this proposed feature has a similar usefulness bonus.)

Suppose that third party changes a library so that a different number of members are explicitly implemented in a type you depend on. Now what happens? A third party changing whether or not a member is explicitly implemented can cause compilation errors in other people's code.

Even worse, it can not cause a compilation error; imagine a situation in which someone makes a change just in the number of methods that are implicitly implemented, and those methods are not even methods that you call, but that change silently causes a sequence of turtles to become a sequence of giraffes.

Those scenarios are really, really bad. C# was carefully designed to prevent this kind of "brittle base class" failure.

Oh, but it gets worse. Suppose we did like this tiebreaker; could we even implement it reliably?

How can we even tell if a member is explicitly implemented? The metadata in the assembly has a table that lists what class members are explicitly mapped to what interface members, but is that a reliable reflection of what is in the C# source code?

No, it is not! There are situations in which the C# compiler must secretly generate explicitly implemented interfaces on your behalf in order to satisfy the verifier (describing them would be quite off topic). So you cannot actually tell very easily how many interface members the type's implementor decided to implement explicitly.

It gets worse still: suppose the class is not even implemented in C#? Some languages always fill in the explicit interface table, and in fact I think Visual Basic might be one of those languages. So your proposal is to make the type inference rules possibly different for classes authored in VB than an equivalent type authored in C#.

Try explaining that to someone who just ported a class from VB to C# to have an identical public interface, and now their tests stop compiling.

Or, consider it from the perspective of the person implementing class Ark. If that person wishes to express the intention "this type can be used as both a sequence of turtles and giraffes, but if there is an ambiguity, choose turtles". Do you believe that any developer who wished to express that belief would naturally and easily come to the conclusion that the way to do that is to make one of the interfaces more implicitly implemented than the other?

If that were the sort of thing that developers needed to be able to disambiguate, then there should be a well-designed, clear, discoverable feature with those semantics. Something like:

class Ark : default IEnumerable<Turtle>, IEnumerable<Giraffe> ...

for example. That is, the feature should be obvious and searchable, rather than emerging by accident from an unrelated decision about what the public surface area of the type should be.

In short: The number of interface members that are explicitly implemented is not a part of the .NET type system. It's a private implementation strategy decision, not a public surface that the compiler should use to make decisions.

Finally, I've left the most important reason for last. You said:

I am not looking on feedback whether designing a class this way is considered best practice.

But that is an extremely important factor! The rules of C# are not designed to make good decisions about crappy code; they're designed to make crappy code into broken code that does not compile, and that has happened. The system works!

Making a class that implements two different versions of the same generic interface is a terrible idea and you should not do it. Because you should not do it, there is no incentive for the C# compiler team to spend even a minute figuring out how to help you do it better. This code gives you an error message. That is good. It should! That error message is telling you you're doing it wrong, so stop doing it wrong and start doing it right. If it hurts when you do that, stop doing that!

(One can certainly point out that the error message does a poor job of diagnosing the problem; this leads to another whole bunch of subtle design decisions. It was my intention to improve that error message for these scenarios, but the scenarios were too rare to make them a high priority and I did not get to it before I left Microsoft in 2012. Apparently no one else has made it a priority in the years that followed either.)

UPDATE: You ask why a call to ark.GetEnumerator can do the right thing automatically. That is a much easier question. The principle here is a simple one:

Overload resolution chooses the best member that is both accessible and applicable.

"Accessible" means that the caller has access to the member because it is "public enough", and "applicable" means "all the arguments match their formal parameter types".

When you call ark.GetEnumerator() the question is not "which implementation of IEnumerable<T> should I choose"? That's not the question at all. The question is "which GetEnumerator() is both accessible and applicable?"

There is only one, because explicitly implemented interface members are not accessible members of Ark. There is only one accessible member, and it happens to be applicable. One of the sensible rules of C# overload resolution is if there is only one accessible applicable member, choose it!

Exercise: What happens when you cast ark to IEnumerable<Animal>? Make a prediction:

I will get a sequence of turtles
I will get a sequence of giraffes
I will get a sequence of giraffes and turtles
I will get a compile error
I will get something else -- what?

Now try out your prediction and see what really happens. Draw conclusions as to whether it is a good or bad idea to write types that have multiple constructions of the same generic interface.

来源：https://stackoverflow.com/questions/56875750/interface-conflict-resolution-in-c-sharp

标签

linq

oop

interface

enumerator