Keep temporary std::string and return c_str() to prevent memory leaks

Question

I found myself using the kind of code below to prevent memory leaks. Is there anything wrong with it in terms of performance, safety, style, or anything else?

The idea is that if I need to return an edited string (as a C string, not a std::string), I use a temporary std::string as a helper, set it to the value I want to return, and keep that temporary alive.

The next time I call the function, it resets the temporary to the new value. This works for my usage because I only read the returned C string; I never store it.

I should also mention that std::string is an implementation detail that I don't want to expose, so I can't return std::string and have to return a C string.

Anyway, here is the code:

// in header
class SomeClass
{
private:
    std::string _rawName;

public:
    const char* Name(); // return c-string
};

//in cpp file
std::string _tempStr; // my temporary helper std::string

const char* SomeClass::Name()
{
    return (_tempStr = "My name is: " +
            _rawName + ". Your name is: " + GetOtherName()).c_str();
}

Answer 1:

In C++ you cannot simply ignore object lifetimes. You cannot talk to an interface while ignoring object lifetimes.

If you think you are ignoring object lifetimes, you almost certainly have a bug.

Your interface ignores the lifetime of the returned buffer. It lasts "long enough" -- "until someone calls me again". That is a vague guarantee that will lead to really bad bugs.

Ownership should be clear. One way to make ownership clear is to use a C-style interface. Another is to use C++ library types and require your clients to match your library version. Another is to use custom smart objects and guarantee their stability across versions.

These all have downsides. C-style interfaces are annoying. Forcing the same C++ library on your clients is annoying. Having custom smart objects is code duplication, and forces your clients to use whatever string classes you wrote rather than whatever they want to use, or the well-written std ones.
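For comparison, the C-style option could look roughly like the sketch below. GetName and FetchName are hypothetical names, and the hard-coded text stands in for the real formatting; the point is only that the caller owns the buffer, so no lifetime question crosses the boundary.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical C-style accessor. Copies at most cap-1 characters plus a
// terminator into `out` and returns the full length (terminator excluded),
// so a caller can ask for the size first by passing (nullptr, 0).
std::size_t GetName(char* out, std::size_t cap) {
    const std::string full = "My name is: Alice";  // stand-in for the real text
    if (out != nullptr && cap > 0) {
        const std::size_t n = full.size() < cap - 1 ? full.size() : cap - 1;
        std::memcpy(out, full.data(), n);
        out[n] = '\0';
    }
    return full.size();
}

// Caller side: query the size, allocate, then fetch into caller-owned memory.
std::string FetchName() {
    std::vector<char> buf(GetName(nullptr, 0) + 1);
    GetName(buf.data(), buf.size());
    return std::string(buf.data());
}
```

The two-call dance is the "annoying" part: it is safe, but every call site has to repeat it or wrap it.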

A final way is to type erase, and guarantee the stability of the type erasure.

Let us look at that option. We type erase down to "assigning to a std-like container". This means we forget the type of the thing we erase, but we remember how to assign to it.

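#include <iterator>    // std::begin, std::end
#include <memory>      // std::addressof
#include <string>
#include <type_traits> // std::enable_if_t, std::decay_t (C++14)
#include <vector>

namespace container_writer {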
  using std::begin; using std::end;
  template<class C, class It, class...LowPriority>
  void append( C& c, It b, It e, LowPriority&&... ) {
    c.insert( end(c), b, e );
  }

  template<class C, class...LowPriority>
  void clear(C& c, LowPriority&&...) {
    c = {};
  }
  template<class T>
  struct sink {
    using append_f = void(*)(void*, T const* b, T const* e);
    using clear_f = void(*)(void*);
    void* ptr = nullptr;
    append_f append_to = nullptr;
    clear_f clear_it = nullptr;

    template<class C,
      std::enable_if_t< !std::is_same<std::decay_t<C>, sink>{}, int> =0
    >
    sink( C&& c ):
      ptr(std::addressof(c)),
      append_to([](void* ptr, T const* b, T const* e){
        auto* pc = static_cast< std::decay_t<C>* >(ptr);
        append( *pc, b, e );
      }),
      clear_it([](void* ptr){
        auto* pc = static_cast< std::decay_t<C>* >(ptr);
        clear(*pc);
      })
    {}
    sink(sink&&)=default;
    sink(sink const&)=delete;
    sink()=default;

    void set( T const* b, T const* e ) {
      clear_it(ptr);
      append_to(ptr, b, e);
    }
    explicit operator bool()const{return ptr;}
    template<class Traits>
    sink& operator=(std::basic_string<T, Traits> const& str) {
      set( str.data(), str.data()+str.size() );
      return *this;
    }
    template<class A>
    sink& operator=(std::vector<T, A> const& str) {
      set( str.data(), str.data()+str.size() );
      return *this;
    }
  };
}

Now, container_writer::sink<T> is a pretty darn DLL-safe class. Its state is 3 C-style pointers. While it is a template, it is also standard layout, and standard layout basically means "has a layout like a C struct would".

A C struct that contains 3 pointers is ABI safe.
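The layout claim can be checked at compile time. The sketch below uses sink_layout, a hypothetical struct that only mirrors the three data members of container_writer::sink<char>; it is not the sink itself, and passing the check says nothing about calling conventions on a particular platform.

```cpp
#include <type_traits>

// Hypothetical mirror of the sink's state: one object pointer and two
// function pointers, with the same default member initializers.
struct sink_layout {
    void* ptr = nullptr;
    void (*append_to)(void*, char const*, char const*) = nullptr;
    void (*clear_it)(void*) = nullptr;
};

// Fails to compile if the struct ever stops being standard layout.
static_assert(std::is_standard_layout<sink_layout>::value,
              "three plain pointers should form a standard-layout struct");
```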

Your code takes a container_writer::sink<char>, and inside your DLL you can assign a std::string or a std::vector<char> to it. (Extending it to support more ways to assign to it is easy.)

The DLL-calling code sees the container_writer::sink<char> interface, and on the client side converts a passed std::string to it. This creates some function pointers on the client side that know how to clear a std::string and append to it.

These function pointers (and a void*) pass over the DLL boundary. On the DLL side, they are blindly called.

No allocated memory passes from the DLL side to the client side, or vice versa. Despite that, every bit of data has a well-defined lifetime associated with an object (RAII style). There are no messy lifetime issues, because the client controls the lifetime of the buffer being written to, while the server writes to it through an automatically generated callback.

If you have a non-std style container and you want to support container_writer::sink, it is easy. Add append and clear free functions to the namespace of your type, and have them do the required action. container_writer::sink will automatically find them (through argument-dependent lookup) and use them to fill your container.

As an example, you can use CStringA like this:

void append( CStringA& str, char const* b, char const* e) {
  str += CStringA( b, e-b );
}
void clear( CStringA& str ) {
  str = CStringA{};
}

and magically CStringA is now a valid argument for something taking a container_writer::sink<char>.
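The same hook mechanism can be tried without MFC/ATL. Below is a sketch with a made-up type my::Log. fill is a hypothetical stand-in for what the sink's callbacks do (unqualified calls to clear and append, resolved through argument-dependent lookup); it is not the sink itself.

```cpp
#include <string>

namespace my {
  // Hypothetical non-std container with no insert() or iterator interface.
  struct Log {
    std::string text;
    int clears = 0;
  };

  // Hooks placed in the type's own namespace so ADL can find them.
  void append(Log& log, char const* b, char const* e) { log.text.append(b, e); }
  void clear(Log& log) { log.text.clear(); ++log.clears; }
}

// Unqualified calls: the compiler looks in the namespace of C's type,
// so my::clear and my::append are picked up for my::Log.
template <class C>
void fill(C& c, std::string const& src) {
  clear(c);
  append(c, src.data(), src.data() + src.size());
}
```

Calling fill on a my::Log replaces its text and bumps the clears counter, which shows that the user-supplied hooks ran rather than some generic fallback.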

The use of append is there just in case you need fancier construction of the container. You could write a container_writer::sink method that eats non-contiguous buffers by having it feed the stored container fixed sized chunks at a time; it does a clear, then repeated appends.
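As a rough sketch of that "clear once, then append repeatedly" idea: char_sink, make_sink and write_pieces below are hypothetical, reduced stand-ins for container_writer::sink<char> (no ADL hooks, only containers that offer clear() and insert(end, b, e)), written this way to keep the example short.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Reduced stand-in for the sink: a target pointer plus two callbacks.
struct char_sink {
    void* ptr;
    void (*append_to)(void*, char const* b, char const* e);
    void (*clear_it)(void*);
};

// Builds the callbacks for a concrete container type C on the caller's side.
template <class C>
char_sink make_sink(C& c) {
    return char_sink{
        &c,
        [](void* p, char const* b, char const* e) {
            C* target = static_cast<C*>(p);
            target->insert(target->end(), b, e);
        },
        [](void* p) { static_cast<C*>(p)->clear(); }
    };
}

// Feeds non-contiguous pieces: one clear, then one append per piece.
void write_pieces(char_sink s, std::initializer_list<std::string> pieces) {
    s.clear_it(s.ptr);
    for (std::string const& piece : pieces) {
        s.append_to(s.ptr, piece.data(), piece.data() + piece.size());
    }
}
```

The writer never needs a single contiguous copy of the whole result; the target container grows as each piece arrives.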

Now, this doesn't let you return the value from a function.

To get that to work, first do the above. Expose functions that return their strings through container_writer::sink<char> over the DLL barrier.

Make them private. Or mark them as not-to-be-called. Whatever.

Next, write inline public functions that call those functions, and return the filled std::string. These are pure header file constructs, so the code lives in the DLL client.

So we get:

class SomeClass
{
private:
   std::string _rawName;
   // implemented inside the DLL; writes the result into the caller's container
   void Name(container_writer::sink<char>);
public:
   // in header file exposed from DLL:
   // (block any kind of symbol export of this!)
   std::string Name() {
     std::string r;
     Name(r); // r converts to a sink<char> on the client side
     return r;
   }
};

void SomeClass::Name(container_writer::sink<char> s)
{
  std::string tempStr = "My name is: " +
        _rawName + ". Your name is: " + GetOtherName();
  s = tempStr; // clears the caller's container, then appends tempStr
}

and done. The DLL interface acts C++, but is actually just passing 3 raw C pointers through. All resources are owned at all times.

Answer 2:

This is a mistake. If you pass a pointer as a return value, the caller must have a guarantee that the pointer will remain valid as long as necessary. In this case the pointer could be invalidated if the owning object is destroyed, or if the function is called a second time causing a new string to be generated.

You want to avoid an implementation detail, but you're creating an implementation detail that is much worse than the one you want to avoid. C++ has strings, use them.

Answer 3:

This might backfire if you ever use your class in a multi-threaded environment. Instead of those tricks, just return std::string by value.
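A minimal sketch of the by-value version, assuming a simplified SomeClass (the constructor, rawName_, LegacyLength and Demo are made up for illustration). Each caller gets its own string, so there is no shared buffer for two threads or two calls to fight over, and c_str() can still be taken at the call site when a C API needs it.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

class SomeClass {
public:
    explicit SomeClass(std::string rawName) : rawName_(std::move(rawName)) {}

    // Returns a fresh string; the caller owns it.
    std::string Name() const { return "My name is: " + rawName_ + "."; }

private:
    std::string rawName_;
};

// Hypothetical C-style consumer that only reads the text.
std::size_t LegacyLength(const char* s) { return std::strlen(s); }

std::size_t Demo(const SomeClass& obj) {
    const std::string name = obj.Name();  // keeps the buffer alive locally
    return LegacyLength(name.c_str());    // pointer valid while `name` lives
}
```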

I have seen the remark about std::string being an "implementation detail". I do not agree with it. std::string is no more an implementation detail than const char* is. Both are ways to represent a string.

Source: https://stackoverflow.com/questions/33659609/keep-temporary-stdstring-and-return-c-str-to-prevent-memory-leaks

Tags

c++

c++11

memory-leaks

stdstring