This question may sound fairly elementary, but this is a debate I had with another developer I work with.
I was taking care to stack allocate things where I could, i
Stack allocation is a couple instructions whereas the fastest rtos heap allocator known to me (TLSF) uses on average on the order of 150 instructions. Also stack allocations don't require a lock because they use thread local storage which is another huge performance win. So stack allocations can be 2-3 orders of magnitude faster depending on how heavily multithreaded your environment is.
In general heap allocation is your last resort if you care about performance. A viable in-between option can be a fixed pool allocator which is also only a couple instructions and has very little per-allocation overhead so it's great for small fixed size objects. On the downside it only works with fixed size objects, is not inherently thread safe and has block fragmentation problems.