As others have already pointed out, Linux uses an optimistic memory allocation strategy.
The difference between the first and the following memcpys is the initialization of the destination buffer (DataDest): its pages are only allocated when the first memcpy writes to them.
As you have already seen, when you eliminate memset(DataSrc, 0, N), the first memcpy is even slower, because the pages for the source must be allocated as well. When you initialize both, e.g.
memset(DataSrc, 0, N); memset(DataDest, 0, N);
all memcpys will run at roughly the same speed.
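To make that concrete, here is a minimal sketch of how both buffers could be touched before any timing starts. The helper names alloc_prefaulted and time_memcpy_ms are made up for this illustration and are not from the question's code. The fill value is deliberately nonzero, on the assumption that some compilers may fold malloc followed by memset(..., 0, ...) into calloc, which might leave the pages untouched.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: wall-clock milliseconds taken by one memcpy of n bytes. */
double time_memcpy_ms(void *dst, const void *src, size_t n)
{
    struct timespec t0, t1;

    assert(dst != NULL && src != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    memcpy(dst, src, n);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
}

/* Hypothetical helper: allocate n bytes and write to every byte, so that
   each page is backed by physical memory before it is used in a timed copy.
   Returns NULL if the allocation fails. */
char *alloc_prefaulted(size_t n)
{
    char *p = malloc(n);

    if (p != NULL)
        memset(p, 0x5A, n);   /* nonzero fill, see the note above */
    return p;
}
```

If DataSrc and DataDest are both obtained from alloc_prefaulted(N), repeated calls to time_memcpy_ms(DataDest, DataSrc, N) should report roughly the same time, since no copy has to pay for page allocation.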
As for the second question: when you initialize the allocated memory with memset, all pages will be laid out consecutively. On the other hand, when the memory is allocated as you copy, the source and destination pages will be allocated interleaved, which might make the difference.
This is most likely due to lazy allocation in your VM subsystem. Typically, when you allocate a large amount of memory, only the first N pages are actually allocated and wired to physical memory. When you access memory beyond these first N pages, page faults are generated and further pages are allocated and wired in on an "on demand" basis.
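One way to observe this on-demand behaviour is to count the page faults taken while touching a freshly allocated buffer. The following is a rough sketch, assuming a Linux or BSD-style system where getrusage fills in the ru_minflt field. The function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor page faults taken by this process so far, or -1 on error. */
long minor_faults(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/* Write one byte to every page of buf and return how many minor
   page faults the process took while doing so. */
long faults_while_touching(volatile unsigned char *buf, size_t n)
{
    long page = sysconf(_SC_PAGESIZE);
    long before;

    assert(page > 0);
    before = minor_faults();
    for (size_t i = 0; i < n; i += (size_t)page)
        buf[i] = 1;
    return minor_faults() - before;
}
```

Calling faults_while_touching twice on the same large malloc'ed buffer would be expected to report many faults for the first pass and close to none for the second, although the exact counts depend on the kernel and on settings such as huge pages.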
As to the second part of the question, I believe some VM implementations actually track zeroed pages and handle them specially. Try initialising DataSrc to actual (e.g. random) values and repeat the test.
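A simple way to do that, sketched here with a hypothetical fill_random helper (rand is used only for convenience, the quality of the random numbers does not matter for this test):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fill buf with pseudo-random bytes, so that its pages are very
   unlikely to consist only of zeros. */
void fill_random(unsigned char *buf, size_t n, unsigned seed)
{
    assert(buf != NULL);
    srand(seed);
    for (size_t i = 0; i < n; i++)
        buf[i] = (unsigned char)(rand() & 0xFF);
}
```

Calling fill_random((unsigned char *)DataSrc, N, 1) in place of memset(DataSrc, 0, N) and then re-running the timings should show whether zeroed pages are being treated specially on your system.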