## 16.482 / 16.561: Computer Architecture and Design Fall 2014

Lecture 4: Key Questions September 25, 2014

|    | Depicificer 23, 2014                                        |  |  |  |  |
|----|-------------------------------------------------------------|--|--|--|--|
| 1. | (Review) What is the cause of a control (or branch) hazard? |  |  |  |  |
| 2. | What are the potential solutions for control hazards?       |  |  |  |  |
| 3. | What is instruction-level parallelism?                      |  |  |  |  |
| 4. | Why is ILP limited in basic blocks?                         |  |  |  |  |

5. What is loop-level parallelism, and how can we exploit it?

6. Briefly describe loop unrolling.

7. Define/describe a branch history table.

8. How do we choose a BHT entry for a given branch?

16.482/16.561: Computer Architecture & Design Fall 2014

Instructor: M. Geiger Lecture 4: Key Questions

9. Draw the state diagram for a 2-bit BHT.

- 10. <u>BHT Example 1:</u> Given a simple loop with 1 branch and 4 iterations, and an initial BHT state of 00:
- a. What are the predictions and outcomes for every branch the first time this loop executes? How does the BHT entry state change with each iteration?

b. Say we return to the loop a second time. Repeat part (a), but use the BHT state <u>after</u> completing part (a) (e.g., after having executed the loop once).

16.482/16.561: Computer Architecture & Design Fall 2014

11. <u>BHT Example 2:</u> You are given the nested loop shown below. The outer loop starts at address 0 and concludes with the BEQ at address 28; the inner loop starts at address 8 and concludes with the BNE at address 16:

Instructor: M. Geiger

Lecture 4: Key Questions

| Address |        |         |       |
|---------|--------|---------|-------|
| 0       | Loop1: |         |       |
| 8       | Loop2: |         |       |
| 16      |        | BNE R4, | Loop2 |
| 20      |        |         |       |
| 28      |        | BEQ R7, | Loop1 |

Assume you have a 4-entry BHT, with all entries initialized to 00. Answer the following:

- a. How many PC bits do you need to index into this BHT?
- b. Which PC bits should you use?
- c. What is the initial prediction for every branch?
- d. Assume the inner loop has 8 iterations and the outer loop has 4. What is the misprediction rate (i.e., how many mispredictions occur)?

<u>Hint:</u> The easiest way to track this may be through the use of a table that looks at every branch, the BHT entry it accesses, the prediction (and whether or not it is correct), and any changes to the BHT entry.

16.482/16.561: Computer Architecture & Design Fall 2014 Instructor: M. Geiger Lecture 4: Key Questions

## Extra space for BHT Example 2

16.482/16.561: Computer Architecture & Design
Fall 2014

Instructor: M. Geiger
Lecture 4: Key Questions

12. Define/describe a correlated branch predictor. What are the key differences between a correlated predictor and a simple BHT?

16.482/16.561: Computer Architecture & Design
Fall 2014

Instructor: M. Geiger
Lecture 4: Key Questions

- 13. <u>Correlating example:</u> Say we have one entry of a simple (2,2) predictor. Assume:
  - Entry state is currently: (00, 10, 11, 01)
  - Global history is currently:  $01 \rightarrow \text{Last 2 branches were NT, T (T most recent)}$

Say we have a branch accessing this entry of the table, and that instruction executes five times. What are the predictions for this branch, and how does the predictor entry change, given that:

- The first 2 times, the branch is taken
- The next 2 times, the branch is not taken
- The final time, the branch is taken

16.482/16.561: Computer Architecture & Design
Fall 2014

Instructor: M. Geiger
Lecture 4: Key Questions

14. Describe the operation and purpose of a branch target buffer.