C Program to determine Levels & Size of Cache

北海茫月 2020-11-30 00:38

Full rewrite/update for clarity (and your sanity; it's a bit too long) ... (Old Post)

For an assignment, I need to find the levels (L1,L2,...) and si

6 Answers
  •  小蘑菇 (original poster)
    2020-11-30 01:29

    The time it takes to measure the time (that is, the cost of just calling clock()) is many, many times greater than the time it takes to perform arr[(i*16)&lengthMod]++. This extremely low signal-to-noise ratio (among other likely pitfalls) makes your plan unworkable. A large part of the problem is that you're trying to measure a single iteration of the loop, whereas the sample code you linked measures a full set of iterations: read the clock before starting the loop, read it again after the loop finishes, and do not call printf() inside the loop.

    If the timed loop runs enough iterations, the total time can become large relative to the cost of the two clock() calls, which may be enough to overcome the signal-to-noise problem.
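    To get a feel for that ratio on your own machine, here is a rough sketch that compares the average cost of one clock() call with the average cost of one strided increment timed over a whole loop. The helper names clock_call_cost and increment_cost are made up for illustration, and the numbers will vary with compiler flags and hardware.

```c
#include <assert.h>
#include <time.h>

/* Average cost of one clock() call, in seconds, estimated over `calls` calls.
   Hypothetical helper, not part of the original code. */
double clock_call_cost(long calls)
{
    assert(calls > 0);
    volatile clock_t sink = 0;   /* keeps the results observable */
    clock_t start = clock();
    for (long k = 0; k < calls; k++) {
        sink = clock();
    }
    clock_t end = clock();
    (void)sink;
    return ((double)(end - start) / CLOCKS_PER_SEC) / (double)calls;
}

/* Average cost of one strided increment, in seconds, with the clock read
   only before and after the loop. Hypothetical helper. */
double increment_cost(long steps)
{
    enum { LEN = 1024 * 1024 };            /* power of two, so LEN - 1 is a mask */
    static volatile int arr[LEN];          /* volatile so the loop is not removed */
    assert(steps > 0);
    clock_t start = clock();
    for (unsigned long k = 0; k < (unsigned long)steps; k++) {
        arr[(k * 16) & (LEN - 1)]++;
    }
    clock_t end = clock();
    return ((double)(end - start) / CLOCKS_PER_SEC) / (double)steps;
}
```

    Printing clock_call_cost(1000000) next to increment_cost(64L * 1024 * 1024) should show how much larger the first number is. If either result comes back as 0, the run was shorter than the clock's resolution and needs more iterations.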

    As to "what element is being incremented": arr is the address of a buffer of 1024 * 1024 ints (4 MB if int is 4 bytes); arr[(i * 16) & lengthMod]++ uses (i * 16) & lengthMod as an index from that address, and the int at that index is the one that gets incremented. You're performing a shift (i * 16 will turn into i << 4), a logical AND, an addition, and then either a read/add/write sequence or a single increment instruction, depending on your CPU.
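    The index calculation can be sketched in isolation. strided_index is a hypothetical helper name, and it uses size_t rather than the int from your code so that the multiplication cannot overflow a signed type.

```c
#include <assert.h>
#include <stddef.h>

/* Index touched on iteration i for a buffer of `length` elements.
   The mask trick is only valid when `length` is a power of two. */
size_t strided_index(size_t i, size_t length)
{
    assert(length != 0 && (length & (length - 1)) == 0);
    return (i * 16) & (length - 1);   /* same result as (i * 16) % length here */
}
```

    With length = 1024 * 1024, iteration 0 touches element 0, iteration 1 touches element 16, and iteration 65536 wraps back to element 0. With a 4-byte int, a step of 16 elements is 64 bytes, which is a common cache-line size, so each access likely lands on a different line.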

    Edit: As described, your code suffers from a poor SNR (signal to noise ratio) due to the relative speeds of memory access (cache or no cache) and calling functions just to measure the time. To get the timings you're currently getting, I assume you modified the code to look something like:

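    #include <stdio.h>
    #include <time.h>

    /* static: 4 MB (with 4-byte int) is too large for many default stacks,
       and static storage is zero-initialized */
    static int arr[1024 * 1024];

    int main(void) {
        int steps = 64 * 1024 * 1024;
        int lengthMod = (1024 * 1024) - 1;  /* mask; valid because the length is a power of two */
        int i;
        double timeTaken;
        clock_t start;

        start = clock();  /* read the clock once, before the loop */
        for (i = 0; i < steps; i++) {
            arr[(i * 16) & lengthMod]++;
        }
        /* read the clock once more, after the loop */
        timeTaken = (double)(clock() - start) / CLOCKS_PER_SEC;
        printf("Time for %d: %.12f \n", i, timeTaken);
        return 0;
    }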
    

    This moves the measurement outside the loop so you're not measuring a single access (which would really be impossible) but rather you're measuring steps accesses.

    You're free to increase steps as needed and this will have a direct impact on your timings. Since the times you're receiving are too close together, and in some cases even inverted (your time oscillates between sizes, which is not likely caused by cache), you might try changing the value of steps to 256 * 1024 * 1024 or even larger.

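    NOTE: The logical AND ensures that you wrap around in your buffer, so steps can be as large as a signed int allows, provided the index is computed without signed overflow. With a 32-bit int, i * 16 overflows (undefined behavior) once i reaches 128 * 1024 * 1024, so for larger step counts make i unsigned or cast it to an unsigned type before multiplying.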
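    Tying this back to the original goal of finding cache levels and sizes, one possible next step is to repeat the same timed loop for a range of buffer sizes and look for jumps in the time per access. The sketch below is only an illustration under assumptions: seconds_per_access and print_sweep are hypothetical names, the buffer lengths must be powers of two, and hardware prefetching may blur the jumps for a regular stride like this one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Seconds per access for `steps` strided increments over `length` ints.
   Returns a negative value if the buffer cannot be allocated. */
double seconds_per_access(size_t length, size_t steps)
{
    assert(length != 0 && (length & (length - 1)) == 0);  /* power of two */
    assert(steps != 0);
    volatile int *arr = calloc(length, sizeof *arr);  /* volatile keeps the loop */
    if (arr == NULL) {
        return -1.0;
    }
    size_t mask = length - 1;
    clock_t start = clock();              /* one clock read before the loop */
    for (size_t i = 0; i < steps; i++) {
        arr[(i * 16) & mask]++;           /* unsigned index, wraps via the mask */
    }
    clock_t end = clock();                /* one clock read after the loop */
    free((void *)arr);
    return ((double)(end - start) / CLOCKS_PER_SEC) / (double)steps;
}

/* Print the time per access for buffer lengths min_len, 2*min_len, ... max_len. */
void print_sweep(size_t min_len, size_t max_len, size_t steps)
{
    for (size_t len = min_len; len <= max_len; len *= 2) {
        double t = seconds_per_access(len, steps);
        if (t < 0.0) {
            printf("%10zu bytes: allocation failed\n", len * sizeof(int));
            break;
        }
        printf("%10zu bytes: %.3f ns/access\n", len * sizeof(int), t * 1e9);
    }
}
```

    Calling print_sweep(1024, 16 * 1024 * 1024, 64 * 1024 * 1024) from main would cover buffers from 4 KB to 64 MB (with a 4-byte int). A buffer size after which the time per access rises noticeably is a candidate for a cache-level boundary, but treat that as a hint to investigate rather than a measurement.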
