Memcpy takes the same time as memset

北海茫月 2021-01-07 04:44

I want to measure memory bandwidth using memcpy. I modified the code from this answer: why vectorizing the loop does not have performance improvement which used

2 answers
  •  遥遥无期
    2021-01-07 05:25

    The point is that malloc and calloc on most platforms don't allocate memory; they allocate address space.

    malloc etc. work as follows:

    • if the request can be fulfilled by the freelist, carve a chunk out of it
      • in the case of calloc: the equivalent of memset(ptr, 0, size) is issued
    • if not: ask the OS to extend the address space.
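The two cases above can be sketched as a toy allocator. This is only an illustration under loud assumptions, not how any real libc is written: the names toy_malloc, toy_calloc, toy_free and TOY_CHUNK are hypothetical, every chunk has the same fixed size, and the "ask the OS" step is modelled with an anonymous mmap on a POSIX-like system.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#define TOY_CHUNK 4096  /* hypothetical fixed chunk size, for illustration only */

/* A free chunk stores the link to the next free chunk in its own first bytes. */
typedef struct toy_node { struct toy_node *next; } toy_node;
static toy_node *toy_freelist = NULL;

/* Return one chunk; *recycled reports whether it came from the freelist. */
static void *toy_malloc(int *recycled) {
    if (toy_freelist != NULL) {          /* case 1: carve a chunk out of the freelist */
        toy_node *n = toy_freelist;
        toy_freelist = n->next;
        *recycled = 1;
        return n;
    }
    /* case 2: ask the OS to extend the address space */
    void *p = mmap(NULL, TOY_CHUNK, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    *recycled = 0;
    return p == MAP_FAILED ? NULL : p;
}

/* calloc flavour: a recycled chunk may hold old data, so it is cleared
   explicitly; a freshly mapped anonymous chunk already reads as zero. */
static void *toy_calloc(void) {
    int recycled = 0;
    void *p = toy_malloc(&recycled);
    if (p != NULL && recycled)
        memset(p, 0, TOY_CHUNK);
    return p;
}

/* Put a chunk back on the freelist instead of returning it to the OS. */
static void toy_free(void *p) {
    assert(p != NULL);
    toy_node *n = p;
    n->next = toy_freelist;
    toy_freelist = n;
}
```

Calling toy_calloc, dirtying the chunk, freeing it and calling toy_calloc again should hand back the same address, zeroed by the memset path rather than by the kernel.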

    For systems with demand paging (COW) (an MMU could help here), the second option boils down to:

    • create enough page table entries for the request, and fill them with a (COW) reference to /dev/zero
    • add these PTEs to the address space of the process

    This consumes no physical memory, except for the page tables themselves.
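One way to observe this is to ask the kernel which pages of a fresh mapping are actually resident. The sketch below assumes Linux (mincore with an unsigned char vector, MAP_ANONYMOUS); the helper names reserve_region, resident_pages and touch_pages are hypothetical and exist only for this illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve len bytes of anonymous address space; NULL on failure. */
static void *reserve_region(size_t len) {
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Count the pages of [addr, addr + len) that the kernel reports as resident.
   addr must be page-aligned. Returns -1 on error. */
static long resident_pages(void *addr, size_t len) {
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t n = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(n);
    if (vec == NULL)
        return -1;
    long count = -1;
    if (mincore(addr, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < n; i++)
            count += vec[i] & 1;         /* low bit set means "in core" */
    }
    free(vec);
    return count;
}

/* Write one byte into every page so that each page gets its own frame. */
static void touch_pages(void *addr, size_t len) {
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    volatile unsigned char *p = addr;
    for (size_t off = 0; off < len; off += (size_t)page)
        p[off] = 1;
}
```

On a typical Linux system I would expect resident_pages to report no resident pages right after reserve_region, and every page once touch_pages has run; sandboxed or emulated kernels may report residency differently.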

    • When the new memory is read, the read is served from /dev/zero. The /dev/zero device is a very special device, in this case mapped to every page of the new memory.
    • But if a new page is written to, the COW logic kicks in (via a page fault):
      • physical memory is allocated
      • the /dev/zero page is copied to the new page
      • the new page is detached from the mother page
      • and the calling process can finally do the update which started all this
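For a benchmark, the consequence is that the first write to a fresh buffer pays for these page faults, while a second pass over the same buffer does not. The following sketch counts minor page faults around two fills of the same mapping; it assumes a POSIX-like system where getrusage fills in ru_minflt, and the helper names minor_faults, faults_during_fill and compare_passes are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Minor (no disk I/O) page faults taken by this process so far; -1 on error. */
static long minor_faults(void) {
    struct rusage ru;
    return getrusage(RUSAGE_SELF, &ru) == 0 ? ru.ru_minflt : -1;
}

/* Fill buf with value and return how many minor faults the fill triggered. */
static long faults_during_fill(void *buf, size_t len, int value) {
    assert(buf != NULL);
    long before = minor_faults();
    memset(buf, value, len);
    return minor_faults() - before;
}

/* Map len fresh bytes, fill them twice, and report the faults of each pass.
   Returns 0 on success, -1 if the mapping could not be created. */
static int compare_passes(size_t len, long *first, long *second) {
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    *first = faults_during_fill(buf, len, 1);   /* first touch: COW faults */
    *second = faults_during_fill(buf, len, 2);  /* pages already private */
    munmap(buf, len);
    return 0;
}
```

If the first pass reports many faults and the second close to none, a memcpy timed into an untouched destination is measuring fault handling as well as copying; writing to the buffers once before starting the timer is the usual way to separate the two.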
