i would to know how to write a profiler? What books and / or articles recommended? Can anyone help me please?
Someone has already done something like this?
As another answer, I just looked at LukeStackwalker on sourceforge. It is a nice, small, example of a stack-sampler, and a nice place to start if you want to write a profiler.
Here, in my opinion, is what it does right:
Sigh ... so near yet so far. Here, IMO, is what it (and other stack samplers like xPerf) should do:
It should retain the raw stack samples. As it is, it summarizes at the function level as it samples. This loses the key line-number information locating the problematic call sites.
It need not take so many samples, if storage to hold them is an issue. Since typical performance problems cost from 10% to 90%, 20-40 samples will show them quite reliably. Hundreds of samples give more measurement precision, but they do not increase the probability of locating the problems.
The UI should summarize in terms of statements, not functions. This is easy to do if the raw samples are kept. The key measure to attach to a statement is the fraction of samples containing it. For example:
5/20 MyFile.cpp:326 for (i = 0; i < strlen(s); ++i)
This says that line 326 in MyFile.cpp showed up on 5 out of 20 samples, in the process of calling strlen. This is very significant, because you can instantly see the problem, and you know how much speedup you can expect from fixing it. If you replace strlen(s) by s[i], it will no longer be spending time in that call, so these samples will not occur, and the speedup will be approximately 1/(1-5/20) = 20/(20-5) = 4/3 = 33% speedup. (Thanks to David Thornley for this sample code.)
The UI should have a "butterfly" view showing statements. (If it shows functions too, that's OK, but the statements are what really matter.) For example:
3/20 MyFile.cpp:502 MyFunction(myArgs)
2/20 HisFile.cpp:113 MyFunction(hisArgs)
5/20 MyFile.cpp:326 for (i = 0; i < strlen(s); ++i)
5/20 strlen.asm:23 ... some assembly code ...
In this example, the line containing the for statement is the "focus of attention". It occurred on 5 samples. The two lines above it say that on 3 of those samples, it was called from MyFile.cpp:502, and on 2 of those samples, it was called from HisFile.cpp:113. The line below it says that on all 5 of those samples, it was in strlen (no surprise there). In general, the focus line will have a tree of "parents" and a tree of "children". If for some reason, the focus line is not something you can fix, you can go up or down. The goal is to find lines that you can fix that are on as many samples as possible.
IMPORTANT: Profiling should not be looked at as something you do once. For example, in the sample above, we got a 4/3 speedup by fixing one line of code. When the process is repeated, other problematic lines of code should show up at 4/3 the frequency they did before, and thus be easier to find. I never hear of people talking about iterating the profiling process, but it is crucial to getting overall large compounded speedups.
P.S. If a statement occurs more than once in a single sample, that means there is recursion taking place. It is not a problem. It still only counts as one sample containing the statement. It is still the case that the cost of the statement is approximated by the fraction of samples containing it.