What is the rationale for parentheses in C++11's raw string literals R"(…)"?

空扰寡人 submitted on 2019-11-28 03:45:35

As the other answer explains, there must be something in addition to the quotation mark to avoid parsing ambiguity when ", )", or indeed any other closing sequence appears in the string itself.

As for the syntax itself, well, I agree it is suboptimal, but it is OK in general (think of it as "things could be worse", lol). I think it is a good compromise between ease of use and ease of parsing.

Proposal 1. Inspired by python. Cannot support string literals with triple quotes:
R"""any content, except for triple quotes, which you don't actually use that often."""

There is indeed a problem with this - "quotes, which you don't actually use that often". Firstly, the very idea of raw strings is to represent text exactly as it would appear in a text file, without any modification, regardless of the contents. Secondly, the syntax should be general, i.e. without adding variations like "almost raw string", etc.

How would you write one quote with this syntax? Two quotes? Note - those are very common cases, especially when your code is dealing with strings and parsing.

Proposal 2.
R"delim"content of string"delim".
R""Looks better, doesnt it?"".
R"#"Here are double quotes: "", thanks"#".

Well, this one might be a better candidate. One thing though: the double-quote character itself is very common in string contents, and raw strings should come in handy for exactly these cases (I believe this was a motivating case for the accepted syntax).

So, let's see. Normal string syntax:

s1 = "\"";
s2 = "\"quoted string\"";

Your syntax, e.g. with "x" as the delimiter:

s1 = R"x"""x";
s2 = R"x""quoted string""x";

Accepted syntax:

s1 = R"(")";
s2 = R"("quoted string")";

Yes, I agree that the brackets are visually a bit annoying. I suspect the authors of the syntax reasoned that an additional "delim" would rarely be needed, since )" does not appear inside strings very often. OTOH, leading, trailing and isolated quotes are quite common, so your proposed syntax (#2) would need a delimiter more often, which means changing R"".."" to R"delim"..."delim" more often. Hope you get the idea.
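
To make the comparison concrete, here is a minimal sketch checking that the escaped and the accepted raw spellings of s1 and s2 denote the same strings. The function names are made up for illustration and are not part of the original answer.

```cpp
#include <cassert>
#include <string>

// Ordinary literals: every double quote needs a backslash.
inline std::string escaped_quote()       { return "\""; }
inline std::string escaped_quoted_text() { return "\"quoted string\""; }

// Raw literals with the empty delimiter: quotes are written as-is.
inline std::string raw_quote()           { return R"(")"; }
inline std::string raw_quoted_text()     { return R"("quoted string")"; }

// The raw literal holds exactly one character plus the terminating NUL.
static_assert(sizeof(R"(")") == 2, "one quote character plus the terminator");

// Hypothetical helper that compares both spellings at run time.
inline void check_same_strings() {
    assert(escaped_quote() == raw_quote());
    assert(escaped_quoted_text() == raw_quoted_text());
}
```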

Could the syntax be better? I personally would prefer an even simpler variant of syntax:

Rdelim"string contents"delim;

With the above examples:

s1 = Rx"""x; 
s2 = Rx""quoted string""x;

However, to work correctly (if it's possible at all in the current grammar), this variant would require limiting the character set for the delim part, say to letters/digits only (because of existing operators), and maybe some further restrictions on the initial character to avoid clashes with possible future grammar.
So I believe a better choice could have been made, although nothing significantly better can be done in this case.

The purpose of the parentheses is to allow you to specify a custom delimiter:

R"foo(Hello World)foo"   // the string "Hello World"

In your example, and in typical use, the delimiter is simply empty, so the raw string is enclosed by the sequences R"( and )".
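
As a small sketch of the point above: the delimiter is not part of the string, so a literal with the delimiter foo and one with the empty delimiter produce the same contents. The function names are illustrative only.

```cpp
#include <cassert>
#include <string>

// Delimiter "foo": the literal opens with R"foo( and closes with )foo".
inline std::string with_delimiter()    { return R"foo(Hello World)foo"; }

// Empty delimiter: the literal opens with R"( and closes with )".
inline std::string without_delimiter() { return R"(Hello World)"; }

// Hypothetical helper: both spellings yield the plain text Hello World.
inline void check_delimiter_is_not_content() {
    assert(with_delimiter() == "Hello World");
    assert(without_delimiter() == "Hello World");
}
```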

Allowing for arbitrary delimiters is a design decision that reflects the desire to provide a complete solution without weird limitations or edge cases. You can pick any sequence of characters that does not occur in your string as the delimiter.

Without this, you would be in trouble if the string itself contained something like " (if you had just wanted R"..." as your raw string syntax) or )" (if the delimiter is empty). Both of those are perfectly common and frequent character sequences, especially in regular expressions, so it would be incredibly annoying if the decision whether or not you use a raw string depended on the specific content of your string.
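
For instance, a regular expression that matches a double-quoted word contains the sequence )" itself, so the empty delimiter would end the literal too early. The sketch below assumes an arbitrary delimiter re and a hypothetical helper first_quoted_word; neither comes from the original answer.

```cpp
#include <cassert>
#include <regex>
#include <string>

// The pattern text is "(\w+)" including both quotes. It contains )" so the
// empty-delimiter form would stop early; with the delimiter `re` the closing
// sequence becomes )re" instead.
inline std::string quoted_word_pattern() { return R"re("(\w+)")re"; }

// Returns the first double-quoted word in `text`, or an empty string if none.
inline std::string first_quoted_word(const std::string& text) {
    const std::regex pattern(quoted_word_pattern());
    std::smatch match;
    return std::regex_search(text, match, pattern) ? match[1].str()
                                                   : std::string();
}

// Hypothetical helper: the raw literal equals the fully escaped spelling.
inline void check_pattern_spelling() {
    assert(quoted_word_pattern() == "\"(\\w+)\"");
}
```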

Remember that inside the raw string there's no other escape mechanism, so the best you could do otherwise would be to concatenate pieces of string literal, which would be very impractical. By allowing a custom delimiter, all you need to do is pick an unusual character sequence once, and maybe modify it in very rare cases when you make a future edit.
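
A rough sketch of that trade-off, using a made-up piece of text, puts("(done)"), which contains )". With only the empty delimiter the literal has to be split and glued back together with an escaped piece; with a delimiter (src is an arbitrary choice here) it is written once, verbatim.

```cpp
#include <cassert>
#include <string>

// Workaround without a custom delimiter: the raw part must stop before the
// offending )" and the rest is supplied as an ordinary escaped literal.
// Adjacent string literals are concatenated by the compiler.
inline std::string by_concatenation() {
    return R"(puts("(done))" "\")";
}

// With a custom delimiter the text appears exactly as it is.
inline std::string by_delimiter() {
    return R"src(puts("(done)"))src";
}

// Hypothetical helper: both spellings produce the same text.
inline void check_equivalent() {
    assert(by_concatenation() == by_delimiter());
}
```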

But to stress once again, even the empty delimiter is already useful, since the R"(...)" syntax allows you to place naked quotation marks in your string. That by itself is quite a gain.
