mumble ... I do not like your solution for many reasons:
You are not taking in account the execution time of child processes/thread.
You should compare cpu-usage not the bare elapsed time. This way your statistics will not depend from, e.g., disk access congestion.
Let your child process do something. Remember that "modern" fork uses copy-on-write mechanisms to avoid to allocate memory to the child process until needed. It is too easy to exit immediately. This way you avoid quite all the disadvantages of fork.
CPU time is not the only cost you have to account. Memory consumption and slowness of IPC are both disadvantages of fork solution.
You could use "rusage" instead of "clock" to measure real resource usage.
P.S. I do not think you can really measure the process/thread overhead writing a simple test program. There are too many factors and, usually, the choice between threads and processes is driven by other reasons than mere cpu-usage.