Implement the efficient cache-oblivious algorithm for transposing matrices NxN whose elements are 32-bits integers.
Measure the time for one transposition for N = 64, 69, 75, 80, 87, 92, 101, 109, 119, 128, 139, 150, 161, 174, 189, 203, 220, 237, 256, 280, 300, 322, 352, 378, 406, 442. 476, 512, 552, 598, 645, 705, 760, 812, 878, 960, 1024. Compare time with the direct transposition by definition. Repeat the measurement so that the result is accurate.
Write a report containing
Send an e-mail with subject of the form 'HW4-55973318' (i.e., HW4-student ID) with the following attachments:
The body of your e-mail shall have the following form:
Name: Jirka Fink
Pseudonym for web results: Guybrush Threepwood
Tutorial section: Monday-12:20-odd
Download the simulator and start it with the following parameters:
B
– size of a page;C
– number of pages in the cache.The simulator reads commands for transposing matrix on the standard input. Every row has one of the following forms:
N n
– initialize a matrix n×nX r1 s1 r2 s2
– swap elements
(r1,s1) and (r2,s2), assuming that coordinates are numbered from 0 to n-1E
– terminates the transpositionThe output of the simulator is a text file describing results of the experiment:
Elements: the number of elements of the matrix
Accesses: the number of memory accesses
Misses: the number of memory transfers from the main memory to the cache
Missed-bytes: the total number of elements loaded to the cache
We recommend to experiment the following parameters of cache:
Size of one page [B] | Number of pages | Size of cache [B] |
---|---|---|
64 | 64 | 4096 |
64 | 1024 | 65536 |
64 | 4096 | 262144 |
512 | 512 | 262144 |
4096 | 64 | 262144 |
The simulator expects that matrix is stored by rows and the first row starts on a beginning of a page.