I have a very simple program that outputs a simple JSON string, which I concatenate manually and write to the std::cout stream (the output really is that sim
Caveat
Whichever solution you choose, keep in mind that the JSON standard requires you to escape all control characters. Many developers get this wrong and escape only a handful of them.
All control characters means everything from '\x00' to '\x1f', not just those with a short representation such as '\x0a' (also known as '\n'). For example, you must escape the '\x02' character as \u0002.
See also: ECMA-404 The JSON Data Interchange Format, Page 10
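To make the range concrete, here is a minimal sketch of a check that reports whether a string still contains a raw character that JSON requires to be escaped. The names has_raw_control_char and check_has_raw_control_char are hypothetical and only serve as illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns true if `s` contains a raw byte in the range
// 0x00..0x1f, which JSON does not allow unescaped inside a string.
bool has_raw_control_char(const std::string &s) {
    for (unsigned char c : s) {
        if (c <= 0x1f) {
            return true;
        }
    }
    return false;
}

// Small self-check documenting the expected behavior.
void check_has_raw_control_char() {
    // '\x0a' has the short form \n, but the raw byte still needs escaping.
    assert(has_raw_control_char("line1\nline2"));
    // '\x02' has no short form and must become \u0002.
    assert(has_raw_control_char("\x02"));
    assert(!has_raw_control_char("plain text"));
    // Already escaped: six printable characters, no raw control byte.
    assert(!has_raw_control_char("\\u0002"));
}
```

Iterating as unsigned char keeps bytes of UTF-8 multi-byte sequences (0x80 and above) out of the control range regardless of whether plain char is signed on the platform.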
Simple solution
If you know for sure that your input string is UTF-8 encoded, you can keep things simple, because every byte outside the ASCII range can then be copied to the output unchanged.
Since JSON allows you to escape everything via \uXXXX, even " and \, a simple solution is:
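#include <iomanip>
#include <sstream>
#include <string>

std::string escape_json(const std::string &s) {
    std::ostringstream o;
    for (auto c = s.cbegin(); c != s.cend(); c++) {
        // Escape the quote, the backslash, and all control characters as \uXXXX.
        if (*c == '"' || *c == '\\' || ('\x00' <= *c && *c <= '\x1f')) {
            o << "\\u"
              << std::hex << std::setw(4) << std::setfill('0') << (int)*c;
        } else {
            // Everything else, including UTF-8 multi-byte sequences, is copied as is.
            o << *c;
        }
    }
    return o.str();
}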
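As a rough illustration of what this produces, the sketch below repeats the function and adds a few assertions on its output. The function check_escape_json is a hypothetical name used only for this example.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Copy of the simple solution above: everything that needs escaping
// is written as \uXXXX.
std::string escape_json(const std::string &s) {
    std::ostringstream o;
    for (auto c = s.cbegin(); c != s.cend(); c++) {
        if (*c == '"' || *c == '\\' || ('\x00' <= *c && *c <= '\x1f')) {
            o << "\\u"
              << std::hex << std::setw(4) << std::setfill('0') << (int)*c;
        } else {
            o << *c;
        }
    }
    return o.str();
}

// Hypothetical self-check showing the escaped form of a few inputs.
void check_escape_json() {
    // Quote (0x22) becomes \u0022.
    assert(escape_json("say \"hi\"") == "say \\u0022hi\\u0022");
    // Backslash (0x5c) becomes \u005c.
    assert(escape_json("C:\\tmp") == "C:\\u005ctmp");
    // Newline (0x0a) becomes \u000a, followed by the literal 'b'.
    assert(escape_json("a\nb") == "a\\u000ab");
    // A control character without a short form.
    assert(escape_json("\x02") == "\\u0002");
}
```

The result is only the escaped body of the string. The surrounding double quotes still have to be added when the value is concatenated into the JSON output.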
Shortest representation
For the shortest representation you may use JSON shortcuts, such as \" instead of \u0022. The following function produces the shortest JSON representation of a UTF-8 encoded string s:
#include <iomanip>
#include <sstream>
#include <string>

std::string escape_json(const std::string &s) {
    std::ostringstream o;
    for (auto c = s.cbegin(); c != s.cend(); c++) {
        switch (*c) {
        // Characters with a two-character short form.
        case '"': o << "\\\""; break;
        case '\\': o << "\\\\"; break;
        case '\b': o << "\\b"; break;
        case '\f': o << "\\f"; break;
        case '\n': o << "\\n"; break;
        case '\r': o << "\\r"; break;
        case '\t': o << "\\t"; break;
        default:
            // Remaining control characters have no short form: use \uXXXX.
            if ('\x00' <= *c && *c <= '\x1f') {
                o << "\\u"
                  << std::hex << std::setw(4) << std::setfill('0') << (int)*c;
            } else {
                o << *c;
            }
        }
    }
    return o.str();
}
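To tie this back to the question, here is a minimal sketch of how the function might be used when concatenating JSON by hand. The helper make_json_pair and its self-check are hypothetical names and not part of the original code.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Copy of the shortest-representation function above.
std::string escape_json(const std::string &s) {
    std::ostringstream o;
    for (auto c = s.cbegin(); c != s.cend(); c++) {
        switch (*c) {
        case '"': o << "\\\""; break;
        case '\\': o << "\\\\"; break;
        case '\b': o << "\\b"; break;
        case '\f': o << "\\f"; break;
        case '\n': o << "\\n"; break;
        case '\r': o << "\\r"; break;
        case '\t': o << "\\t"; break;
        default:
            if ('\x00' <= *c && *c <= '\x1f') {
                o << "\\u"
                  << std::hex << std::setw(4) << std::setfill('0') << (int)*c;
            } else {
                o << *c;
            }
        }
    }
    return o.str();
}

// Hypothetical helper: builds {"key":"value"} by plain concatenation,
// escaping both the key and the value and adding the surrounding quotes.
std::string make_json_pair(const std::string &key, const std::string &value) {
    return "{\"" + escape_json(key) + "\":\"" + escape_json(value) + "\"}";
}

// Hypothetical self-check for the helper.
void check_make_json_pair() {
    assert(make_json_pair("msg", "a \"quoted\"\tword")
           == "{\"msg\":\"a \\\"quoted\\\"\\tword\"}");
}
```

Writing make_json_pair("msg", "a \"quoted\"\tword") to std::cout would then print {"msg":"a \"quoted\"\tword"}.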
Pure switch statement
It is also possible to get by with a pure switch statement, that is, without if and <iomanip>. While this is quite cumbersome, it may be preferable from a "security by simplicity and purity" point of view:
#include <sstream>
#include <string>

std::string escape_json(const std::string &s) {
    std::ostringstream o;
    for (auto c = s.cbegin(); c != s.cend(); c++) {
        switch (*c) {
        case '\x00': o << "\\u0000"; break;
        case '\x01': o << "\\u0001"; break;
        ...
        case '\x0a': o << "\\n"; break;
        ...
        case '\x1f': o << "\\u001f"; break;
        case '\x22': o << "\\\""; break;
        case '\x5c': o << "\\\\"; break;
        default: o << *c;
        }
    }
    return o.str();
}
Using a library
You might want to have a look at https://github.com/nlohmann/json, which is an efficient header-only C++ library (MIT License) that seems to be very well-tested.
You can either call their escape_string() method directly, or you can take their implementation of escape_string() as a starting point for your own implementation:
https://github.com/nlohmann/json/blob/ec7a1d834773f9fee90d8ae908a0c9933c5646fc/src/json.hpp#L4604-L4697