"?" type modifier precedence vs logical AND operator (&) vs address-of operator (&)


Update: It seems that I am not being clear enough of what exactly I am asking (and as the question developed over time I also lost track a bit), so here is

2 answers

耶瑟儿～

2020-12-30 08:18

The clue here is the unsafe keyword. There are actually two different & operators: the binary AND operator you're used to (bitwise on integers, logical on bool), and the unary address-of & operator, which not only has higher precedence but, like all unary operators, is evaluated right-to-left.

This means that &b is evaluated first and results in a pointer value. The rest of the statement is, as the compiler complains, unparseable. It is either (a is byte?) (address), or, as the compiler tries to parse it, (a is byte) ? (address), in which case the : of the conditional operator is missing.

I get the same compile error when replacing & with + or -, both symbols that can be either unary or binary operators.

The reason the second statement compiles fine is that there isn't a unary, right-to-left high-precedence && operator.
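A possible way to sidestep the ambiguity is to parenthesize the type test, so that the token following ? is a closing parenthesis rather than something that could begin a unary expression. The sketch below assumes a is an object and b is a bool (the question's declarations are not shown here); the class and method names are hypothetical.

```csharp
public static class NullableIsWorkaround
{
    // Hypothetical helper: 'a' and 'b' stand in for the question's variables.
    // Inside the parentheses, '?' is followed by ')', which cannot start an
    // expression, so 'byte?' is read as a nullable type and '&' as the
    // binary (non-short-circuiting) logical AND on two bools.
    public static bool Check(object a, bool b)
    {
        return (a is byte?) & b;
    }
}
```

A boxed byte satisfies the byte? type test, while null and values of other types do not, so Check((byte)1, true) yields true and Check(null, true) yields false.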


谎友^

2020-12-30 08:27

To be honest, I am not quite sure whether I should post this as an answer or add this information to the - already quite verbose - question, but I finally found out why it behaves that way. (I still think it is not explicitly described in the standard, and that it is in fact a limitation of the current implementation of the compiler.)

Also, I am not going to accept my own answer for a while, hoping that someone might be able to give a better one.

I spent a little time with Roslyn, and I debugged through the lexing and parsing of the various statements from this code:

var test1 = a is byte & b;
var test2 = a is byte? & b;
var test3 = a is byte? && b;

The exact syntax trees are already added to the question, so I am not going to repeat them here.

The difference between the statements comes from this part of the compiling process (from LanguageParser.cs):

private TypeSyntax ParseTypeCore(
    bool parentIsParameter,
    bool isOrAs,
    bool expectSizes,
    bool isArrayCreation)
{
    var type = this.ParseUnderlyingType(parentIsParameter);

    if (this.CurrentToken.Kind == SyntaxKind.QuestionToken)
    {
        var resetPoint = this.GetResetPoint();
        try
        {
            var question = this.EatToken();

            // Comment added by me
            // This is where the difference occurs 
            // (as for '&' the IsAnyUnaryExpression() returns true)
            if (isOrAs && (IsTerm() || IsPredefinedType(this.CurrentToken.Kind) || SyntaxFacts.IsAnyUnaryExpression(this.CurrentToken.Kind)))
            {
                this.Reset(ref resetPoint);

                Debug.Assert(type != null);
                return type;
            }

            question = CheckFeatureAvailability(question, MessageID.IDS_FeatureNullable);
            type = syntaxFactory.NullableType(type, question);
        }
        finally
        {
            this.Release(ref resetPoint);
        }
    }

    // Check for pointer types (only if pType is NOT an array type)
    type = this.ParsePointerTypeMods(type);

    // Now check for arrays.
    if (this.IsPossibleRankAndDimensionSpecifier())
    {
        var ranks = this.pool.Allocate<ArrayRankSpecifierSyntax>();
        try
        {
            while (this.IsPossibleRankAndDimensionSpecifier())
            {
                bool unused;
                var rank = this.ParseArrayRankSpecifier(isArrayCreation, expectSizes, out unused);
                ranks.Add(rank);
                expectSizes = false;
            }

            type = syntaxFactory.ArrayType(type, ranks);
        }
        finally
        {
            this.pool.Free(ranks);
        }
    }

    Debug.Assert(type != null);
    return type;
}

The same result occurs for any token after the byte? part for which this function returns anything but SyntaxKind.None:

public static SyntaxKind GetPrefixUnaryExpression(SyntaxKind token)
{
    switch (token)
    {
        case SyntaxKind.PlusToken:
            return SyntaxKind.UnaryPlusExpression;
        case SyntaxKind.MinusToken:
            return SyntaxKind.UnaryMinusExpression;
        case SyntaxKind.TildeToken:
            return SyntaxKind.BitwiseNotExpression;
        case SyntaxKind.ExclamationToken:
            return SyntaxKind.LogicalNotExpression;
        case SyntaxKind.PlusPlusToken:
            return SyntaxKind.PreIncrementExpression;
        case SyntaxKind.MinusMinusToken:
            return SyntaxKind.PreDecrementExpression;
        case SyntaxKind.AmpersandToken:
            return SyntaxKind.AddressOfExpression;
        case SyntaxKind.AsteriskToken:
            return SyntaxKind.PointerIndirectionExpression;
        default:
            return SyntaxKind.None;
    }
}

So the problem is this: after an is (or as) operator, when the parser meets a ? token, it checks whether the next token can be interpreted as a unary operator. If it can, the parser discards the possibility that the ? is a type modifier, returns the type that precedes it, and parses the rest accordingly. (There are more conditions to be met, but this is the part relevant to my question.) The irony is that the & symbol can only be a unary operator in an unsafe context, yet this is never taken into consideration.
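The decision described above can be modelled in a few lines. This is only a toy approximation, not Roslyn code: tokens are plain strings, LooksLikeTerm is a crude hypothetical stand-in for IsTerm() and IsPredefinedType(), and the set of prefix tokens is copied from the cases of GetPrefixUnaryExpression shown earlier.

```csharp
using System.Collections.Generic;

public static class QuestionTokenModel
{
    // The tokens for which GetPrefixUnaryExpression returns something
    // other than SyntaxKind.None.
    private static readonly HashSet<string> PrefixUnaryTokens = new HashSet<string>
    {
        "+", "-", "~", "!", "++", "--", "&", "*"
    };

    // Crude, hypothetical approximation of IsTerm()/IsPredefinedType():
    // anything that starts like an identifier, keyword, or number.
    private static bool LooksLikeTerm(string token)
    {
        return token.Length > 0 && (char.IsLetterOrDigit(token[0]) || token[0] == '_');
    }

    // Returns true when the '?' would be kept as a nullable type modifier,
    // false when the parser backs off and leaves it for a conditional.
    public static bool TreatsQuestionAsNullable(bool isOrAs, string nextToken)
    {
        bool startsExpression = LooksLikeTerm(nextToken) || PrefixUnaryTokens.Contains(nextToken);
        return !(isOrAs && startsExpression);
    }
}
```

In this model a following "&" makes the parser give up on the nullable reading, while "&&" does not, which matches the difference between test2 and test3.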

As others have pointed out in the comments, this issue could perhaps be solved by looking ahead a little further. In this particular case we could check whether there is a matching : for the ? token and, if there is not, ignore the possibility of the unary & operator and treat the ? as a type modifier. If I have the time, I will try to implement a workaround and see where it causes even greater problems :) (Luckily there are a lot of tests in the Roslyn solution...)
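To make the lookahead idea concrete, here is a rough sketch over a list of string tokens. It is an illustration only and not how Roslyn works: the names are hypothetical, every later ? at the same nesting depth is assumed to open a nested conditional (which is itself ambiguous in real code), and the scan simply stops at the end of the statement.

```csharp
using System.Collections.Generic;

public static class ColonLookahead
{
    // Returns true if a ':' at the same nesting depth pairs with the '?'
    // at questionIndex; false if the statement or enclosing group ends first.
    public static bool HasMatchingColon(IReadOnlyList<string> tokens, int questionIndex)
    {
        int pendingQuestions = 0; // nested '?' still waiting for their ':'
        int depth = 0;            // nesting depth of () and []

        for (int i = questionIndex + 1; i < tokens.Count; i++)
        {
            string t = tokens[i];
            if (t == "(" || t == "[")
            {
                depth++;
            }
            else if (t == ")" || t == "]")
            {
                if (depth == 0) return false; // left the enclosing group
                depth--;
            }
            else if (t == ";")
            {
                return false; // end of statement without a matching ':'
            }
            else if (depth == 0 && t == "?")
            {
                pendingQuestions++;
            }
            else if (depth == 0 && t == ":")
            {
                if (pendingQuestions == 0) return true;
                pendingQuestions--;
            }
        }
        return false;
    }
}
```

For the tokens of a is byte? & b; no colon is found, so under this scheme the ? would be treated as a type modifier; for a is byte ? &b : &c; the colon is found and the conditional reading is kept.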

Thanks to everyone for their feedback.
