What are all the syntax problems introduced by the usage of angle brackets in C++ templates?

Submitted by 前提是你 on 2019-11-27 12:48:34

Question


In C++, templates are instantiated with angle brackets, as in vector<int>, and Java and C# have adopted the same syntax for their generics.

The creators of D, however, have been quite vocal about the problems that angle brackets bring, and they introduced a new syntax, foo!(int). But I've never seen many details about what those problems are, exactly.

One of them arose when instantiating a template with another template, as in vector<vector<int>>, which would cause some (older?) compilers to confuse the trailing '>>' with the bit-shift or streaming operators. The solution was to insert a space between the two angle brackets, but haven't compilers become able to parse that syntax nowadays?

Another problem arose when using the greater-than operator inside a template argument, as in foo<3 > 2>. The parser would think that the operator actually closes the template instantiation; the fix was to introduce parentheses: foo<(3 > 2)>. But I don't think there are that many cases where you need to do this and, at any rate, I'd rather type the extra parentheses when they are needed than introduce new syntax and always have to type the exclamation mark.
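As a minimal sketch of the parenthesized form, assuming a hypothetical class template foo that simply exposes its bool argument (the question does not say what foo is):

```cpp
// Hypothetical template for illustration: exposes its bool argument as `value`.
template <bool B>
struct foo {
    static constexpr bool value = B;
};

// foo<3 > 2> bad;         // rejected: the first '>' closes the argument list
using ok = foo<(3 > 2)>;   // parentheses make '>' the greater-than operator

static_assert(ok::value, "3 > 2 evaluates to true");
```

The commented-out line is left disabled on purpose, since a conforming compiler should refuse it.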

What other problems are there with angle brackets that made the D developers create a new syntax?


Answer 1:


but haven't compilers become able to parse that syntax, nowadays?

Of course. But it’s far from trivial. In particular, it prevents you from implementing a clean separation between a context-unaware lexer and the parser. This is particularly irksome for syntax highlighters and other support tools that need to parse C++ but don’t want to, or can’t, implement a fully fledged syntactic analyser.

It makes C++ so much harder to parse that a lot of tools simply won’t bother. This is a net loss for the ecosystem. Put differently: it makes developing a parsing tool much more expensive.

For instance, ctags fails for some template definitions, which makes it unusable with our current C++ project. Very annoying.

But I don't think there are that many cases where you need to [distinguish between angle brackets and less-than]

It doesn’t matter how often you need to do this. Your parser still needs to handle this.

D’s decision to drop angle brackets was a no-brainer. Any one reason would have sufficed, given that it’s a net benefit.




Answer 2:


Personally, the most hideous problem I have seen is the invocation of template functions in dependent context:

template <typename T>
void foo(T t) {
  t.bar<3>();
}

This looks simple, but it is in fact incorrect. Because bar is a dependent name, the C++ Standard requires the template keyword to distinguish a call to a member function template from the comparison t.bar < 3, yielding:

t.template bar<3>(); // iirk
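A compilable sketch of the corrected call, assuming a hypothetical Widget type with a member function template bar (neither is specified in the answer). The functions are marked constexpr only so the result can be checked at compile time:

```cpp
// Hypothetical type with a member function template, for illustration only.
struct Widget {
    template <int N>
    constexpr int bar() const { return N; }
};

template <typename T>
constexpr int foo(T t) {
    // `bar` is a dependent name here, so without the `template` keyword the
    // parser reads `t.bar < 3 > ()` as a chain of comparisons.
    return t.template bar<3>();
}

static_assert(foo(Widget{}) == 3, "calls Widget::bar<3>");
```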

litb made some very interesting posts regarding the possible interpretation a compiler could come up with.

Regarding the >> issue, it's fixed in C++0x (published as C++11), but that requires cleverer compilers.
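For comparison, here is a small sketch of both spellings, plus the flip side of the newer rule: a genuine right shift inside a template argument list now needs parentheses. The names old_style, new_style and shifted are made up for this illustration:

```cpp
#include <type_traits>
#include <vector>

// Older spelling: the space keeps the two '>' as separate tokens.
std::vector<std::vector<int> > old_style;

// C++11 and later: '>>' is accepted as closing both argument lists.
std::vector<std::vector<int>> new_style;

static_assert(std::is_same<decltype(old_style), decltype(new_style)>::value,
              "both spellings name the same type");

// Hypothetical template used to show a real shift as a template argument.
template <int N>
struct shifted {
    static constexpr int value = N;
};

// shifted<16 >> 2> would now end the argument list at the first '>'.
static_assert(shifted<(16 >> 2)>::value == 4, "16 >> 2 is 4");
```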




Answer 3:


The issue is keeping the language grammar context-free. When a program is tokenized, the lexer uses a technique called maximal munch, which means that it always takes the longest string that could designate a token. That means that >> is treated as the right bit-shift operator. So, if you have something like vector<pair<int, int>>, the >> at the end is treated as the right bit-shift operator instead of as part of a template instantiation.

For >> to be treated differently in this context, the lexer and parser must be context-sensitive instead of context-free; that is, they have to actually care about the context of the tokens being parsed. This complicates both considerably. The more complicated the lexer and parser are, the higher the risk of bugs and, more importantly, the harder it is for tools to implement them, which means fewer tools. When something like syntax highlighting in an IDE or code editor becomes complicated to implement, it's a problem.
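To make maximal munch concrete, here is a toy tokenizer sketch. It is a deliberately simplified assumption of how such a lexer behaves, not how any real C++ compiler is written; tokenize and check_examples are hypothetical names:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy maximal-munch tokenizer: identifiers, ">>", and single-character tokens.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;  // whitespace only separates tokens
        } else if (std::isalnum(c) || c == '_') {
            std::size_t j = i;
            while (j < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[j])) || src[j] == '_')) {
                ++j;  // take the longest run of identifier characters
            }
            tokens.push_back(src.substr(i, j - i));
            i = j;
        } else if (src.compare(i, 2, ">>") == 0) {
            tokens.push_back(">>");  // longest match wins over a single '>'
            i += 2;
        } else {
            tokens.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return tokens;
}

// Usage sketch: without context, the nested closer becomes one shift token.
inline void check_examples() {
    assert(tokenize("vector<pair<int, int>>").back() == ">>");
    assert(tokenize("vector<pair<int, int> >").back() == ">");
}
```

A context-free lexer like this one cannot know that the final >> closes two template argument lists; a parser that wants to accept it has to split the token or feed context back into the lexer.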

By using !() - which would result in vector!(pair!(int, int)) for the same declaration - D avoids the context sensitivity issue. D has made a number of such choices in its grammar explicitly with the idea of making it easier for tools to implement lexing or parsing when they need to in order to do what they do. And since there's really no downside to using !() for templates other than the fact that it's a bit alien to programmers who have used templates or generics in other languages which use <>, it's a sound language design choice.

And how often you do or don't use templates which would create ambiguities when using the angle bracket syntax - e.g. vector<pair<int, int>> - isn't really relevant to the language. The tools must implement it regardless. The decision to use !() rather than <> is entirely a matter of simplifying the language for tools, not for the programmer. And while you may or may not particularly like the !() syntax, it's quite easy to use, so it ultimately doesn't cause programmers any problems beyond learning it and the fact that it may go against their personal preference.




Answer 4:


In C++ another problem is that the preprocessor doesn't understand angle brackets, so this fails:

#define FOO(X) typename something<X>::type

FOO(std::map<int, int>)

The problem is that the preprocessor thinks FOO is being called with two arguments: std::map<int and int>. This is an example of the wider problem, that it's often ambiguous whether the symbol is an operator or a bracket.
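Two common workarounds can be sketched as follows. The trait something is a hypothetical stand-in (the answer does not define it), and Holder and IntMap are likewise invented for the example:

```cpp
#include <map>
#include <type_traits>

// Hypothetical trait standing in for `something<X>` from the answer.
template <typename X>
struct something {
    using type = X;
};

// Workaround 1: a variadic macro glues the comma-separated pieces back together.
#define FOO(...) typename something<__VA_ARGS__>::type

// Workaround 2: name the type first so that the macro argument has no comma.
using IntMap = std::map<int, int>;

template <typename Dummy = void>
struct Holder {
    FOO(std::map<int, int>) a;  // macro sees two arguments, rejoined by __VA_ARGS__
    FOO(IntMap) b;              // macro sees a single argument
};

static_assert(std::is_same<decltype(Holder<>::a), decltype(Holder<>::b)>::value,
              "both members are std::map<int, int>");
```

Neither workaround makes the preprocessor understand angle brackets; they only keep the comma from being read as an argument separator.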




Answer 5:


Have fun figuring out what this does:

bool b = A< B>::C == D<E >::F();
bool b = A<B>::C == D<E>::F();

Last time I checked, you could make it parse either way by changing what's in scope.
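One way this can happen is sketched below. All declarations are hypothetical and chosen so that the same token sequence parses once as template-ids and once as chained comparisons against the global names ::C and ::F; compilers may warn about the second form:

```cpp
// Globals reached through the unary scope operator in the second reading.
constexpr int C = 0;
constexpr int F() { return 0; }

namespace as_templates {
    struct B {};
    struct E {};
    template <typename T> struct A { static constexpr int C = 1; };
    template <typename T> struct D { static constexpr int F() { return 2; } };

    // A and D name templates, so '<' and '>' delimit template arguments.
    constexpr bool b = A<B>::C == D<E>::F();  // compares 1 with 2
}

namespace as_comparisons {
    constexpr int A = 1, B = 2, D = 3, E = 4;

    // A and D name variables, so this is ((A < B) > ::C) == ((D < E) > ::F()).
    constexpr bool b = A<B>::C == D<E>::F();
}

static_assert(!as_templates::b, "template reading: 1 == 2 is false");
static_assert(as_comparisons::b, "comparison reading: true == true");
```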

Using < and > as both matching and non-matching tokens is a disaster. As to the !() making the D usage longer: for the common case of a single argument, the () are optional, e.g. this is legal:

Set!int foo;



Answer 6:


I believe those were the only cases.

However, it's not so much a user problem as it is an implementer problem. This seemingly trivial difference makes it much harder to build a correct parser for C++ (as compared to D). D was also designed to be implementer-friendly, and as such they tried their best to avoid making ambiguous code possible.

(Side note: I do find the shift-exclamation point combination to be somewhat awkward... one advantage of angle brackets is definitely ease of typing!)




Answer 7:


Ultimately, what any compiler has to do is translate your semi-English source code, in whatever language, into the real machine code a computer can actually operate on. This is ultimately a series of incredibly complex mathematical TRANSFORMS.

Well, mathematics tells us that the mappings we need for compilation are "onto" or "surjective". All that means is that every legal program CAN be mapped unambiguously to assembly. This is what language keywords and punctuation like ";" exist for, and why every language has them. However, languages like C++ use the same symbols, like "{}" and "<>", for multiple things, so the compiler has to add extra steps to produce the overall, onto transform it needs (this is what you're doing in linear algebra when you multiply matrices). That adds to compile times, introduces significant complexity that itself can harbor bugs, and can limit the compiler's ability to optimize the output.

For example, Stroustrup could've used '@' for template arguments: it was an unused character that would've been perfect for letting compilers know that "this is, and only ever will be, some kind of template". That is actually a 1-to-1 transform, which is perfect for analytic tools. But he didn't; he used symbols that already mapped to greater-than and less-than. That alone immediately introduces ambiguity, and it only gets worse from there.

It sounds like "D" decided to make the sequence '!()' a special symbol, used only for templates, like my '@' example above. I'm willing to guess that its highly templated code compiles faster and with fewer bugs as a result.




Answer 8:


The >= (greater-than-or-equal) ambiguity is another case that wasn't mentioned:

Fails:

template <int>
using A = int;
void f(A<0>=0);

Works:

void f(A<0> =0);

Unlike >>, I think this did not change in C++11.

See this question for more details: Why does the template-id in "A<0>=0" not compile without space because of the greater-or-equal-than operator ">="?



Source: https://stackoverflow.com/questions/7304699/what-are-all-the-syntax-problems-introduced-by-the-usage-of-angle-brackets-in-c
