Problem 1: Memory Hierarchy
Problem 1a: Assume that we have a 32-bit processor (with 32-bit words) and that this processoris byte-addressed (i.e. addresses specify bytes). Suppose that it has a 512-byte cache that is twowayset-associative, has 4-word cache lines, and uses LRU replacement. Split the 32-bit addressinto “tag”, “index”, and “cache-line offset” pieces. Which address bits comprise each piece?
tag:
index:
offset: bits 3—0 (we’ll give you this one).
Problem 1b: How many sets does this cache have? Explain.
Problem 1c: Draw a block diagram for this cache. Show a 32-bit address coming into thediagram and a 32-bit data result and “Hit” signal coming out. Include, all of the comparators inthe system and any muxes as well. Include the data storage memories (indexed by the “Index”),the tag matching logic, and any muxes. You can indicate RAM with a simple block, but makesure to label address widths and data widths. Make sure to label the function of various blocksand the width of any buses.
Problem 1d: Below is a series of memory read references set to the cache from part (a). Assumethat the cache is initially empty and classify each memory references as a hit or a miss. Identifyeach miss as either compulsory, conflict, or capacity. One example is shown. Hint: start bysplitting the address into components. Show your work.
Problem 1e: Calculate the miss rate and hit rate.
Problem 2: Basic Pipelining
Problem 1a: Assume that we have a 32-bit processor (with 32-bit words) and that this processoris byte-addressed (i.e. addresses specify bytes). Suppose that it has a 512-byte cache that is twowayset-associative, has 4-word cache lines, and uses LRU replacement. Split the 32-bit addressinto “tag”, “index”, and “cache-line offset” pieces. Which address bits comprise each piece?
tag:
index:
offset: bits 3—0 (we’ll give you this one).
Problem 1b: How many sets does this cache have? Explain.
Problem 1c: Draw a block diagram for this cache. Show a 32-bit address coming into thediagram and a 32-bit data result and “Hit” signal coming out. Include, all of the comparators inthe system and any muxes as well. Include the data storage memories (indexed by the “Index”),the tag matching logic, and any muxes. You can indicate RAM with a simple block, but makesure to label address widths and data widths. Make sure to label the function of various blocksand the width of any buses.
Problem 1d: Below is a series of memory read references set to the cache from part (a). Assumethat the cache is initially empty and classify each memory references as a hit or a miss. Identifyeach miss as either compulsory, conflict, or capacity. One example is shown. Hint: start bysplitting the address into components. Show your work.
Problem 1e: Calculate the miss rate and hit rate.
Problem 2: Basic Pipelining
Problem 2a: How many branch delay slots does this pipeline have? Explain
Problem 2b: Suppose that we include complex branch conditions: eg:bisqrt $r1, $r2, somewhere ; branch If $r1 square root of $r2Is this likely to change the number of branch delay slots? If not, explain. If so, how many willthere be now?
Problem 2c: What is a load delay-slot? Is it a feature of the instruction set or of a particularimplementation?
Problem 2d: Suppose that the data cache takes 1 cycle to access on a hit and 100 cycles for amiss. How many load delay slots will there be? Explain.
Problem 2e: Suppose that cache hits (both instructions and data) take 4 cycles but are pipelined.What does this affect?
Problem 2f: Modify the following datapath to handle forwarding. Be careful (don’t forgetforwarding for branches!). Pick an economical solution and make sure to include control signals.You can draw something below if required.
Problem 2b: Suppose that we include complex branch conditions: eg:bisqrt $r1, $r2, somewhere ; branch If $r1 square root of $r2Is this likely to change the number of branch delay slots? If not, explain. If so, how many willthere be now?
Problem 2c: What is a load delay-slot? Is it a feature of the instruction set or of a particularimplementation?
Problem 2d: Suppose that the data cache takes 1 cycle to access on a hit and 100 cycles for amiss. How many load delay slots will there be? Explain.
Problem 2e: Suppose that cache hits (both instructions and data) take 4 cycles but are pipelined.What does this affect?
Problem 2f: Modify the following datapath to handle forwarding. Be careful (don’t forgetforwarding for branches!). Pick an economical solution and make sure to include control signals.You can draw something below if required.
Problem 2g: Why might the following instruction sequence be important? Is there any way thatwe can handle it without stalling?
Problem 3: Open problem
Problem3a:
Problem3b: RISC V.S. CISC
Problem3c:
Problem 3: Open problem
Problem3a:
Problem3b: RISC V.S. CISC
Problem3c: