References: CPE380 Memories

The lecture slides as a PDF provide a good overview of everything.

The book also does a very good job explaining memory structures. Emphasis here should be on cache and TLB concepts, because these are the parts that are truly hardware; you'll see more about demand-paged virtual memory if you take an OS course. Here's a quick outline for what you should have a basic understanding of:

So, how much does memory access order matter? Consider the following program:

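#ifndef N
#define N 4096                  /* the value used below; can also be set with -DN=... */
#endif

volatile double a[N][N];        /* volatile: the compiler must perform every store */

int main(void)
{
        register int i, j;

a:      for (i=0; i<N; ++i) {
b:              for (j=0; j<N; ++j) {
                        a[i][j] = 0;
                }
        }
        return 0;
}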

Let's compile four versions of this for N=4096. The first version, x0, is as shown above. The second version, x1, swaps lines a: and b:, so that j becomes the outer loop and i the inner loop. The third and fourth versions, x2 and x3, are like x0 and x1 respectively, except that the loops run backwards, using for (i=N-1; i>=0; --i) { and for (j=N-1; j>=0; --j) {. On a rather old AMD Athlon 64 3200+ processor and on a modern Intel i7-8700 3.2GHz processor, the user times are:
Program   Athlon 64 3200+ User Time (seconds)   i7-8700 @ 3.2GHz User Time (seconds)
x0        0.136                                 0.024
x1        1.652                                 0.126
x2        0.148                                 0.024
x3        1.632                                 0.138

In summary, the best memory access order was about 12X faster than the worst one on the older processor (0.136 vs. 1.652 seconds). It didn't make as much difference on the newer processor, but the best order was still 5.75X faster than the worst (0.024 vs. 0.138 seconds). The difference is less on the newer processor because much more fits in cache -- it would take a larger N to see as big a performance hit on the newer machine.

However, there's another way to interpret these performance numbers. The AMD Athlon 64 3200+ was released in 2003. The Intel i7-8700 3.2GHz was released in 2018. Not surprisingly, the newer processor was between 5X and 13X faster than the old one using the same version of the code. Very surprisingly, the best access order on the 15-year-older processor (x0 at 0.136 seconds) was slightly FASTER than the worst access order on the new processor (x3 at 0.138 seconds)!


CPE380 Computer Organization and Design.