Question
Consider the following snippet, and suppose that a, b, c and d are non-empty strings.
std::string a, b, c, d;
d = a + b + c;
When computing the sum of those three std::string instances, standard library implementations create a first temporary std::string object, copy the concatenated contents of a and b into its internal buffer, then perform the same operation between the temporary string and c.
A fellow programmer argued that, instead of this behaviour, operator+(std::string, std::string) could be defined to return a std::string_helper. This object's sole role would be to defer the actual concatenations until the moment it is converted to a std::string. Naturally, operator+(std::string_helper, std::string) would be defined to return the same helper, which would "keep in mind" that it has an additional concatenation to carry out.
Such behaviour would save the CPU cost of creating n-1 temporary objects, allocating their buffers, copying them, etc. So my question is: why doesn't it already work like that? I can't think of any drawback or limitation.
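To make the proposal concrete, here is a minimal sketch of such a deferring helper. It is only an illustration: the names sketch, Leaf, Concat and lazy are hypothetical, and since the standard operator+ for std::string cannot be replaced, the chain has to start with an explicit lazy(a) wrapper.

```cpp
#include <cstddef>
#include <string>

namespace sketch {  // hypothetical names throughout, for illustration only

// Leaf node: refers to one existing string, copies nothing.
struct Leaf {
    const std::string& s;
    std::size_t size() const { return s.size(); }
    void append_to(std::string& out) const { out += s; }
};

// Inner node: "everything in lhs, followed by rhs". Still nothing is copied.
template <class Lhs>
struct Concat {
    Lhs lhs;                 // stored by value; a node only holds references
    const std::string& rhs;

    std::size_t size() const { return lhs.size() + rhs.size(); }
    void append_to(std::string& out) const { lhs.append_to(out); out += rhs; }

    // The deferred work happens here: one reserve, then one copy per operand.
    operator std::string() const {
        std::string out;
        out.reserve(size());
        append_to(out);
        return out;
    }
};

// Entry point of a deferred chain.
inline Leaf lazy(const std::string& s) { return Leaf{s}; }

inline Concat<Leaf> operator+(Leaf lhs, const std::string& rhs) {
    return Concat<Leaf>{lhs, rhs};
}

template <class Lhs>
Concat<Concat<Lhs>> operator+(Concat<Lhs> lhs, const std::string& rhs) {
    return Concat<Concat<Lhs>>{lhs, rhs};
}

}  // namespace sketch
```

With this in place, std::string d = sketch::lazy(a) + b + c; builds d with a single reservation. The nodes only hold references, so the result has to be converted to std::string within the same full expression; storing it with auto would leave references that can dangle.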
Answer 1:
why doesn’t it already work like that?
I can only speculate about why it was originally designed like that. Perhaps the designers of the string library simply didn't think of it; perhaps they thought the extra type conversion (see below) might make the behaviour too surprising in some situations. It is one of the oldest C++ libraries, and a lot of wisdom that we take for granted simply didn't exist in past decades.
As to why it hasn't been changed to work like that: it could break existing code, by adding an extra user-defined type conversion. Implicit conversions can only involve at most one user-defined conversion. This is specified by C++11, 13.3.3.1.2/1:
A user-defined conversion sequence consists of an initial standard conversion sequence followed by a user-defined conversion followed by a second standard conversion sequence.
Consider the following:
struct thingy {
thingy(std::string);
};
void f(thingy);
f(some_string + another_string);
This code is fine if the type of some_string + another_string is std::string. That can be implicitly converted to thingy via the conversion constructor. However, if we were to change the definition of operator+ to give another type, then it would need two user-defined conversions (string_helper to string to thingy), and so would fail to compile.
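The one-conversion limit can be checked at compile time. The sketch below assumes a stand-in string_helper type (hypothetical, it only models "something convertible to std::string") and uses std::is_convertible to show that the implicit conversion to thingy exists for std::string but not for the helper.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-in for the deferred-concatenation result:
// all that matters here is that it converts to std::string.
struct string_helper {
    operator std::string() const { return std::string(); }
};

struct thingy {
    thingy(std::string) {}
};

// One user-defined conversion (std::string -> thingy): allowed implicitly.
static_assert(std::is_convertible<std::string, thingy>::value,
              "std::string converts implicitly to thingy");

// Two user-defined conversions (string_helper -> std::string -> thingy):
// not allowed in a single implicit conversion sequence.
static_assert(!std::is_convertible<string_helper, thingy>::value,
              "string_helper must not convert implicitly to thingy");
```

A caller could still write the intermediate step explicitly, for example f(std::string(helper)), but existing code that relies on the implicit conversion would stop compiling.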
So, if the speed of string building is important, you'll need to use alternative methods such as concatenation with +=. Or, according to Matthieu's answer, don't worry about it because C++11 fixes the inefficiency in a different way.
Answer 2:
The obvious answer: because the standard doesn't allow it. It impacts code by introducing an additional user-defined conversion in some cases: if C is a type with a user-defined constructor taking a std::string, then it would make:
C obj = stringA + stringB;
illegal.
Answer 3:
It depends.
In C++03, it is true that there may be a slight inefficiency there (comparable to Java and C#, as they use string interning, by the way). This can be alleviated using:
d = ((std::string("") += a) += b) += c; // parenthesized: += is right-associative, so without parentheses a and b would be modified
which is not really... idiomatic.
In C++11, operator+ is overloaded for rvalue references. Meaning that:
d = a + b + c;
is transformed into:
d.assign(std::move(operator+(a, b).append(c)));
which is (nearly) as efficient as you can get.
The only inefficiency left in the C++11 version is that the memory is not reserved once and for all at the beginning, so there might be reallocations and copies, up to two (one for each new string). Still, because appending is amortized O(1), unless c is much longer than b, at worst a single reallocation and copy should take place. And of course, we are talking about copying plain chars here (so a memcpy call).
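If even that remaining reallocation matters, the buffer can be reserved by hand before appending. The following is a minimal sketch; concat3 is a hypothetical helper name, not a standard function.

```cpp
#include <string>

// Hypothetical helper: concatenates three strings with one up-front reservation.
inline std::string concat3(const std::string& a,
                           const std::string& b,
                           const std::string& c) {
    std::string result;
    // Reserve the final size once, so the appends below need not reallocate.
    result.reserve(a.size() + b.size() + c.size());
    result += a;
    result += b;
    result += c;
    return result;
}
```

Used as d = concat3(a, b, c);, this trades the readability of a + b + c for explicit control over the allocation.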
Answer 4:
Sounds to me like something like this already exists: std::stringstream. Only you have << instead of +. Just because operator+ exists for std::string doesn't make it the most efficient option.
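As a small illustration of the stream-based approach, the sketch below builds a string with std::ostringstream; the function name describe and its parameters are made up for the example.

```cpp
#include <sstream>
#include <string>

// Hypothetical example: accumulate pieces in a stream, extract the result once.
inline std::string describe(const std::string& name, int count) {
    std::ostringstream out;
    out << name << " has " << count << " items";  // << also formats non-strings
    return out.str();                             // copies the accumulated text out
}
```

Whether this beats += in practice depends on the implementation, since streams carry formatting and locale overhead; measure before relying on it.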
Answer 5:
I think if you use +=, then it will be a little faster:
d += a;
d += b;
d += c;
It should be faster, as it doesn't create temporary objects. Or simply this:
d.append(a).append(b).append(c); //same as above: i.e using '+=' 3 times.
Answer 6:
The main reason for not doing a chain of individual + concatenations, and especially not doing that in a loop, is that it has O(n²) complexity.
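The loop case can be sketched as follows. Both functions are hypothetical illustrations: the first rebuilds the whole result on every iteration, while the second appends in place.

```cpp
#include <string>
#include <vector>

// Quadratic: every iteration copies all of `result` into a new temporary.
inline std::string join_by_plus(const std::vector<std::string>& pieces) {
    std::string result;
    for (const std::string& piece : pieces) {
        result = result + piece;
    }
    return result;
}

// Amortized linear: each piece is appended to the existing buffer.
inline std::string join_by_append(const std::vector<std::string>& pieces) {
    std::string result;
    for (const std::string& piece : pieces) {
        result += piece;
    }
    return result;
}
```

Both return the same string; the difference is only in how many characters get copied along the way.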
A reasonable alternative with O(n) complexity is to use a simple string builder, like
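#include <cassert>      // assert
#include <sstream>      // std::basic_ostringstream
#include <string>       // std::basic_string
#include <type_traits>  // std::is_same

// CPP_STATIC_ASSERT is the author's own macro (its definition is not shown here);
// C++11's static_assert( condition, "message" ) serves the same purpose.
template< class Char >
class ConversionToString
{
public:
    // Visual C++ 10.0 has some DLL linking problem with other types:
    CPP_STATIC_ASSERT((
        std::is_same< Char, char >::value || std::is_same< Char, wchar_t >::value
        ));
    typedef std::basic_string< Char >           String;
    typedef std::basic_ostringstream< Char >    OutStringStream;
    // Just a default implementation, not particularly efficient.
    template< class Type >
    static String from( Type const& v )
    {
        OutStringStream stream;
        stream << v;
        return stream.str();
    }
    // Strings need no conversion: return the argument itself.
    static String const& from( String const& s )
    {
        return s;
    }
};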
template< class Char, class RawChar = Char >
class StringBuilder;
template< class Char, class RawChar >
class StringBuilder
{
private:
    typedef std::basic_string< Char >       String;
    typedef std::basic_string< RawChar >    RawString;
    RawString   s_;
    template< class Type >
    static RawString fastStringFrom( Type const& v )
    {
        return ConversionToString< RawChar >::from( v );
    }
    static RawChar const* fastStringFrom( RawChar const* s )
    {
        assert( s != 0 );
        return s;
    }
    static RawChar const* fastStringFrom( Char const* s )
    {
        assert( s != 0 );
        CPP_STATIC_ASSERT( sizeof( RawChar ) == sizeof( Char ) );
        return reinterpret_cast< RawChar const* >( s );
    }
public:
    enum ToString { toString };
    enum ToPointer { toPointer };
    String const& str() const { return reinterpret_cast< String const& >( s_ ); }
    operator String const& () const { return str(); }
    String const& operator<<( ToString ) { return str(); }
    RawChar const* ptr() const { return s_.c_str(); }
    operator RawChar const* () const { return ptr(); }
    RawChar const* operator<<( ToPointer ) { return ptr(); }
    template< class Type >
    StringBuilder& operator<<( Type const& v )
    {
        s_ += fastStringFrom( v );
        return *this;
    }
};
template< class Char >
class StringBuilder< Char, Char >
{
private:
    typedef std::basic_string< Char >   String;
    String  s_;
    template< class Type >
    static String fastStringFrom( Type const& v )
    {
        return ConversionToString< Char >::from( v );
    }
    static Char const* fastStringFrom( Char const* s )
    {
        assert( s != 0 );
        return s;
    }
public:
    enum ToString { toString };
    enum ToPointer { toPointer };
    String const& str() const { return s_; }
    operator String const& () const { return str(); }
    String const& operator<<( ToString ) { return str(); }
    Char const* ptr() const { return s_.c_str(); }
    operator Char const* () const { return ptr(); }
    Char const* operator<<( ToPointer ) { return ptr(); }
    template< class Type >
    StringBuilder& operator<<( Type const& v )
    {
        s_ += fastStringFrom( v );
        return *this;
    }
};
namespace narrow {
    typedef StringBuilder<char>     S;
}  // namespace narrow
namespace wide {
    typedef StringBuilder<wchar_t>  S;
}  // namespace wide
Then you can write efficient and clear things like …
using narrow::S;
std::string a = S() << "The answer is " << 6*7;
foo( S() << "Hi, " << username << "!" );
Source: https://stackoverflow.com/questions/9619659/stdstring-and-multiple-concatenations