A C++ program that uses several DLLs and QT should be equipped with a malloc replacement (like tcmalloc) for performance problems that can be verified to be caused by Window
Where does your premise "A C++ program that uses several DLLs and QT should be equipped with a malloc replacement" come from?
On Windows, if all the DLLs use the same shared MSVCRT, there is no need to replace malloc for correctness. By default, Qt builds against the shared MSVCRT DLL.
You will run into problems only if you:
1) mix DLLs that link the CRT statically with DLLs that use the shared VCRT
2) AND also free memory in a different module from the one that allocated it (i.e., free memory in a statically linked DLL that was allocated by the shared VCRT, or vice versa).
Note that adding your own ref-counted wrapper around a resource can help mitigate the problems associated with resources that need to be deallocated in a particular way (i.e., one wrapper that disposes of a resource via a callback into the originating DLL, a different wrapper for a resource that originates from another DLL, etc.).
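As a rough illustration of the wrapper idea, here is a minimal sketch that uses std::shared_ptr with a custom deleter as the ref-counted wrapper. The namespace plugin_a and the names Widget, create_widget, destroy_widget, live_count and make_widget are all hypothetical; in a real project create_widget and destroy_widget would be functions exported by the DLL that owns the allocation, and they are defined in the same file here only so that the example is self-contained.

```cpp
#include <memory>

// Hypothetical stand-in for a DLL: in a real project these two functions
// would be exported by the DLL, so allocation and deallocation both run
// against that DLL's own CRT heap.
namespace plugin_a {
    struct Widget { int value; };

    int live_count = 0;  // illustration only: tracks outstanding widgets

    Widget* create_widget(int v) {
        ++live_count;
        return new Widget{v};
    }

    void destroy_widget(Widget* w) {
        if (w == nullptr) return;  // the deleter may be handed a null pointer
        --live_count;
        delete w;
    }
}

// Ref-counted wrapper: the deleter is a callback into the originating
// module, so the last owner never calls free/delete from its own CRT.
std::shared_ptr<plugin_a::Widget> make_widget(int v) {
    return std::shared_ptr<plugin_a::Widget>(plugin_a::create_widget(v),
                                             &plugin_a::destroy_widget);
}
```

Callers hold the std::shared_ptr and copy it freely; when the last copy goes away, plugin_a::destroy_widget runs instead of a delete in the caller's module. A resource that comes from a different DLL would get its own factory with that DLL's destroy function as the deleter.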