I wrote my own malloc and free and compiled them in a shared library. I LD_PRELOAD that library with my program. In this way would my program always use my
Using LD_PRELOAD to override malloc etc. is expected to work; this is how e.g. DUMA works.
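Besides malloc, calloc and free, make sure you also override realloc, memalign and valloc (and, on glibc, posix_memalign and aligned_alloc). Otherwise a pointer obtained from a function you did not override can end up being passed to your free. You might also need to override the C++ operators new, new[], delete and delete[].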
See Overriding 'malloc' using the LD_PRELOAD mechanism for an example of how to do this right.
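To make the list of functions above concrete, here is a minimal sketch of a preloadable allocator in C. It is not how DUMA or the linked example work. It is a deliberately simple bump allocator over a fixed static arena, so free never returns memory and the arena eventually runs out. It is also not thread-safe. The file name, the arena size, the helper arena_alloc and the counters preload_alloc_count and preload_free_count are all hypothetical names chosen for illustration. posix_memalign and aligned_alloc are included as an assumption that the target is glibc.

```c
/* mymalloc.c (hypothetical name): sketch of an LD_PRELOAD allocator.
 * Bump allocation from a static arena; memory is never reused.
 * Not thread-safe; for illustration only. */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define ARENA_SIZE  ((size_t)64 * 1024 * 1024) /* assumed arena size */
#define MIN_ALIGN   ((size_t)16)
#define HEADER_SIZE ((size_t)16)               /* stores the block size */

static unsigned char arena[ARENA_SIZE] __attribute__((aligned(16)));
static size_t arena_used;      /* bytes of the arena handed out so far */
size_t preload_alloc_count;    /* illustrative counters, not a real API */
size_t preload_free_count;

/* Carve an aligned block out of the arena; align must be a power of two. */
static void *arena_alloc(size_t align, size_t size)
{
    if (align < MIN_ALIGN)
        align = MIN_ALIGN;
    if ((align & (align - 1)) != 0) { errno = EINVAL; return NULL; }
    if (align > ARENA_SIZE || size > ARENA_SIZE) { errno = ENOMEM; return NULL; }

    uintptr_t base = (uintptr_t)arena;
    uintptr_t addr = base + arena_used + HEADER_SIZE;
    addr = (addr + align - 1) & ~(uintptr_t)(align - 1);  /* round up */
    size_t start = (size_t)(addr - base);
    if (start > ARENA_SIZE || size > ARENA_SIZE - start) {
        errno = ENOMEM;
        return NULL;
    }
    unsigned char *p = arena + start;
    memcpy(p - HEADER_SIZE, &size, sizeof size);  /* remember size for realloc */
    arena_used = start + size;
    preload_alloc_count++;
    return p;
}

void *malloc(size_t size) { return arena_alloc(MIN_ALIGN, size); }

void free(void *ptr)
{
    if (ptr != NULL)
        preload_free_count++;  /* bump allocator: nothing is returned */
}

void *calloc(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size) { errno = ENOMEM; return NULL; }
    void *p = arena_alloc(MIN_ALIGN, n * size);
    if (p != NULL)
        memset(p, 0, n * size);
    return p;
}

void *realloc(void *ptr, size_t size)
{
    if (ptr == NULL)
        return arena_alloc(MIN_ALIGN, size);
    if (size == 0) { free(ptr); return NULL; }
    size_t old;
    memcpy(&old, (unsigned char *)ptr - HEADER_SIZE, sizeof old);
    void *fresh = arena_alloc(MIN_ALIGN, size);
    if (fresh == NULL)
        return NULL;           /* the old block stays valid on failure */
    memcpy(fresh, ptr, old < size ? old : size);
    free(ptr);
    return fresh;
}

void *memalign(size_t align, size_t size) { return arena_alloc(align, size); }

void *aligned_alloc(size_t align, size_t size) { return arena_alloc(align, size); }

void *valloc(size_t size)
{
    return arena_alloc((size_t)sysconf(_SC_PAGESIZE), size);
}

int posix_memalign(void **out, size_t align, size_t size)
{
    if (align < sizeof(void *) || (align & (align - 1)) != 0)
        return EINVAL;
    void *p = arena_alloc(align, size);
    if (p == NULL)
        return ENOMEM;
    *out = p;
    return 0;
}
```

Assuming gcc on Linux, this can be built with `gcc -shared -fPIC -o libmymalloc.so mymalloc.c` and tried with `LD_PRELOAD=./libmymalloc.so ./your_program`. The sketch avoids forwarding to the original allocator through dlsym(RTLD_NEXT, "malloc"), because dlsym can itself allocate memory and that needs extra care during startup. A real replacement would also need locking and a free that actually recycles memory.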