How to delay an ARM Cortex M0+ for n cycles, without a timer?

Submitted by 夏天 on 2019-11-30 15:34:37

The code is going to depend on exactly what n is, and whether it needs to be dynamically variable, but given the M0+ core's instruction timings, establishing bounds for a particular routine is pretty straightforward.

For the smallest possible (6-byte) complete loop with a fixed 8-bit immediate counter:

   movs  r0, #NUM    @ 1 cycle
1: subs  r0, r0, #1  @ 1 cycle
   bne   1b          @ 2 if taken, 1 otherwise

With NUM=1 we get a minimum of 3 cycles, plus 3 cycles for every extra iteration, up to 765 cycles at NUM=255 (of course, you could have 2^32 iterations from NUM=0, but that seems a bit silly). That puts the lower bound for a loop being practical at about 6 cycles; below that, straight-line NOPs do the job.

With a fixed loop it's easy to pad NOPs (or even nested loops) inside it to lengthen each iteration, and before/after it to hit a delay that isn't a multiple of the loop length. If you can arrange for the iteration count to be ready in a register before you need to start waiting, then you can lose the initial movs, and N iterations take 3N - 1 cycles. If you need single-cycle resolution for a variable delay, the initial setup cost is going to be somewhat higher to correct for the remainder (a computed branch into a NOP sled is what I'd do for that).
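To sanity-check that arithmetic on a host machine, here is a rough Python model of the loop. It is only a sketch, not target code: it takes the instruction timings quoted above at face value, assumes zero wait states and no interrupts, and treats each NOP as one cycle. The names `loop_cycles` and `plan_delay` are made up for illustration.

```python
# Cycle model for the delay loop, using the timings quoted in the answer:
#   movs = 1, subs = 1, bne = 2 when taken, 1 when it falls through.
# Assumption: zero wait states, interrupts masked, NOP = 1 cycle.

MOVS, SUBS, BNE_TAKEN, BNE_NOT_TAKEN = 1, 1, 2, 1


def loop_cycles(num, include_movs=True):
    """Total cycles for the loop with counter value num (1..255)."""
    if not 1 <= num <= 255:
        raise ValueError("num must be non-zero and fit the 8-bit immediate")
    taken = (num - 1) * (SUBS + BNE_TAKEN)  # every iteration but the last
    last = SUBS + BNE_NOT_TAKEN             # final iteration falls through
    return (MOVS if include_movs else 0) + taken + last


def plan_delay(cycles):
    """Split a fixed delay into (NUM, number of trailing one-cycle NOPs)."""
    if cycles < 3:
        raise ValueError("shorter than one pass of the loop; use NOPs only")
    num, nops = divmod(cycles, 3)
    if num > 255:
        raise ValueError("too long for a single 8-bit counter")
    return num, nops


if __name__ == "__main__":
    print(loop_cycles(1), loop_cycles(255))      # shortest and longest loop
    print(loop_cycles(4, include_movs=False))    # counter preloaded: 3N - 1
    print(plan_delay(100))                       # NUM and NOP padding
```

For example, `plan_delay(100)` splits a 100-cycle delay into NUM=33 (99 cycles including the movs) plus one trailing NOP. Anything beyond 767 cycles needs the nested loops or in-loop padding mentioned above, which this model does not cover.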

I'm assuming that if you're at the point of cycle-critical timing you've already got interrupts off (otherwise throw in another cycle somewhere for CPSID), and that you don't have any bus wait states adding extra cycles to instruction fetches.

As for trying to do it in C: the fact that you have to hack in an empty asm to keep the "useless" loop from being optimised away is a tip-off. The abstract C machine has no notion of "instructions" or "cycles" so there is simply no way to reliably express this in the language. Trying to rely on particular C constructs to compile to suitable instructions is extremely fragile - change a compiler flag; upgrade the compiler; change some distant code which affects register allocation which affects instruction selection; etc. - pretty much anything could change the generated code unexpectedly, so I'd say hand-coded assembly is the only sensible approach for cycle-accurate code.

The shortest ARM loop that I can think of goes like:

movs r0, #COUNT
L:
subs r0, r0, #1
bne L

I don't have the device in question, so I can't say anything about the timing; cycle counts are core-dependent.
