CMPEN 331 – Computer Organization and Design
Final Project
You will convert your (almost complete) single-cycle processor that you built throughout the semester into a 5-stage pipelined processor. Open the Vivado project from HW5. We will start from there.
Important Notes on Grading
The final project has three parts, worth 15 pts, 5 pts, and 5 pts respectively. The parts must be completed in order to receive points. For example, you cannot do only the first and third parts and skip the second (the third part will not receive any credit in that case). At the very least, try to make the first part run correctly for 15 pts. The second and third parts are harder but worth only 5 pts each, and those points are given only if the code produces a correct result (no partial credit for trying and failing). Plan your time on the last two parts accordingly.
1. Implementing Pipelining (15 pts)
Below is the 5-stage pipeline processor diagram from our lecture slide. The main differences are (1) four pipeline registers are added (highlighted in yellow), and (2) signals go through the pipeline registers, instead of directly connecting the components. You do not need to implement the signal and logic related to branch/jump (red X).
1.1. Adding Pipeline Registers
Write four modules, one for each pipeline register. Until now, module skeletons were always provided, but this time you must write your own modules from scratch. Fortunately, the four modules are simple and look very similar. You should be able to identify the inputs and outputs of the registers from what you learned in the lecture. However, they are listed below to make your life easier.
· IF/ID Pipeline Register
o Inputs: 1-bit clk, 32-bit inst
o Outputs: 32-bit inst_d
o Description: On a positive edge of clk, inst_d is set to inst. The _d suffix indicates that the signal is an output of the IF/ID register.
· ID/EX Pipeline Register
o Inputs
§ 1-bit clk, regWrite, memToReg, memWrite, aluSrc, regDst, memRead
§ 32-bit regOut1, regOut2, imm32
§ 5-bit rt, rd (HINT: These are part of inst_d)
§ 4-bit aluControl
o Outputs: Duplicated versions of the above signals, with an _x suffix.
o Description: On a positive edge of clk, the output signals (with an _x suffix) are set to the corresponding input signals.
· EX/MEM Pipeline Register
o Inputs
§ 1-bit clk, regWrite_x, memToReg_x, memWrite_x, memRead_x
§ 32-bit aluOut, regOut2_x
§ 5-bit writeAddr
o Outputs: Duplicated versions of the above signals, with an _m suffix.
o Description: On a positive edge of clk, the output signals (with an _m suffix) are set to the corresponding input signals.
· MEM/WB Pipeline Register
o Inputs
§ 1-bit clk, regWrite_m, memToReg_m
§ 32-bit aluOut_m, memOut
§ 5-bit writeAddr_m
o Outputs: Duplicated versions of the above signals, with an _b suffix.
o Description: On a positive edge of clk, the output signals (with an _b suffix) are set to the corresponding input signals.
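As a minimal sketch of the pattern described in the list above, the smallest and the last of the four registers might look like the following. The module names if_id_reg and mem_wb_reg are assumptions (the assignment does not prescribe names), and the output names aluOut_b, memOut_b, and writeAddr_b assume that the _m suffix is replaced by _b rather than appended to. The other two registers would follow the same shape with more ports.

```verilog
// Sketch only: module names are hypothetical; port names follow the lists above.
module if_id_reg (
    input  wire        clk,
    input  wire [31:0] inst,
    output reg  [31:0] inst_d
);
    // Latch the fetched instruction on every rising clock edge.
    always @(posedge clk) begin
        inst_d <= inst;
    end
endmodule

module mem_wb_reg (
    input  wire        clk,
    input  wire        regWrite_m,
    input  wire        memToReg_m,
    input  wire [31:0] aluOut_m,
    input  wire [31:0] memOut,
    input  wire [4:0]  writeAddr_m,
    output reg         regWrite_b,
    output reg         memToReg_b,
    output reg  [31:0] aluOut_b,     // assumed name: _m replaced by _b
    output reg  [31:0] memOut_b,
    output reg  [4:0]  writeAddr_b   // assumed name: _m replaced by _b
);
    // Every output simply copies its input on the rising clock edge.
    always @(posedge clk) begin
        regWrite_b  <= regWrite_m;
        memToReg_b  <= memToReg_m;
        aluOut_b    <= aluOut_m;
        memOut_b    <= memOut;
        writeAddr_b <= writeAddr_m;
    end
endmodule
```

Non-blocking assignments (<=) are used so that all pipeline registers sample their inputs on the same clock edge before any of them changes its output.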
1.2. Re-routing Signals
Place each pipeline register between its two stages and connect it properly to the existing components. You will need to replace many of the existing wires with the outputs of the pipeline registers. Refer carefully to the above figure and the lecture slides to connect each component correctly. It is extremely easy to make mistakes in this step.
1.3. Testing
If you are done, try running your code. The result, unfortunately, will look like this.
The result is different from what we have seen in HW4! Why? (Write your answer in the report)
Replace the saved instructions in the instruction memory with the following and try running again. Don’t delete the original instructions entirely (just comment them out), as we will use them again later.
· memory[25] = {6'b100011, 5'd0, 5'd1, 16'd0};
· memory[26] = {6'b100011, 5'd0, 5'd2, 16'd4};
· memory[27] = {6'b100011, 5'd0, 5'd3, 16'd8};
· memory[28] = {6'b100011, 5'd0, 5'd4, 16'd16};
· memory[29] = {6'b000000, 5'd1, 5'd2, 5'd5, 11'b00000100000};
· memory[30] = {6'b100011, 5'd3, 5'd6, 16'hFFFC};
· memory[31] = {6'b000000, 5'd4, 5'd3, 5'd7, 11'b00000100010};
If successful, you should see a result like the one below:
What are the new instructions added? Is this result correct? (Explain in the report)
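When working out what each 32-bit word encodes, it can help to split it into fields. The following simulation-only sketch does that for two of the words listed above, assuming the standard MIPS field layout (opcode in bits 31:26, rs in 25:21, rt in 20:16, rd in 15:11, funct in 5:0, immediate in 15:0). The module name decode_demo is hypothetical and the module is not part of the required datapath.

```verilog
// Sketch only: decode_demo is a hypothetical simulation-only helper.
module decode_demo;
    reg [31:0] word;

    // Field positions, assuming the standard MIPS instruction formats.
    wire [5:0]  opcode = word[31:26];
    wire [4:0]  rs     = word[25:21];
    wire [4:0]  rt     = word[20:16];
    wire [4:0]  rd     = word[15:11];  // meaningful for R-type only
    wire [5:0]  funct  = word[5:0];    // meaningful for R-type only
    wire [15:0] imm16  = word[15:0];   // meaningful for I-type only

    initial begin
        // I-type word, copied from memory[25] above.
        word = {6'b100011, 5'd0, 5'd1, 16'd0};
        #1;
        $display("opcode=%b rs=%0d rt=%0d imm16=%h", opcode, rs, rt, imm16);

        // R-type word, copied from memory[29] above.
        word = {6'b000000, 5'd1, 5'd2, 5'd5, 11'b00000100000};
        #1;
        $display("opcode=%b rs=%0d rt=%0d rd=%0d funct=%b",
                 opcode, rs, rt, rd, funct);
        $finish;
    end
endmodule
```

The same slices are what the hints mean when they say that rs, rt, and rd are parts of inst_d: in the datapath they would be taken from inst_d rather than from a test register.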
1.4. Debugging
It is very easy to make small mistakes in this project. In the waveform view, carefully follow each signal through the pipeline and see if their behavior matches your expectation. See if there are any unexpected Xs or Zs. Debugging is an essential part of programming, and your ability to debug is part of the project evaluation. TAs will not debug your code for you, although they can guide you at a high level.
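Besides inspecting the waveform by eye, unknown values can be flagged from the simulation itself. The sketch below relies on the fact that the reduction XOR of a vector evaluates to x whenever any bit is x or z. The module name xz_check_demo and the signal bus are hypothetical; in your own testbench, bus would be replaced by whichever datapath signal you want to watch.

```verilog
// Sketch only: xz_check_demo and bus are hypothetical names.
module xz_check_demo;
    reg        clk = 0;
    reg [31:0] bus;   // stand-in for any datapath signal, e.g. inst_d

    always #5 clk = ~clk;

    // (^bus) is x if any bit of bus is x or z, so the case-equality
    // comparison with 1'bx detects unknown or floating bits.
    always @(posedge clk) begin
        if ((^bus) === 1'bx)
            $display("time %0t: bus contains X/Z bits: %h", $time, bus);
    end

    initial begin
        bus = 32'h0000_0004;          // fully known: nothing is reported
        #12 bus = 32'hxxxx_0004;      // partially unknown: reported at the next edge
        #10 bus = {16'hzzzz, 16'h0};  // floating bits: also reported
        #10 $finish;
    end
endmodule
```

A check like this only tells you where an unknown value first shows up; tracing it back to the unconnected or misnamed wire still has to be done in the waveform view.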
2. Data Forwarding (5 pts)
“One” reason the original code did not run correctly is data hazards. You will implement the EX forwarding and MEM forwarding that we learned in class to (almost) fix the problem. Below is the diagram from the lecture slides showing how forwarding works.
2.1. Adding an Additional Signal in the Pipeline Register
You will see that one signal is missing: ID/EX.RegisterRs (or, in our naming convention, rs_x) is not currently provided by our ID/EX pipeline register. The ID/EX pipeline register must be changed to include this signal.
· ID/EX Pipeline Register (revisited)
o (Additional) Input: 5-bit rs (HINT: this is part of inst_d)
o (Additional) Output: 5-bit rs_x.
o Description: On a positive edge of clk, rs_x is set to rs.
2.2. Implementing Additional Modules
You need to additionally implement modules for a forwarding unit and a 32-bit 3x1 mux.
· Forwarding Unit
o Inputs
§ 5-bit writeAddr_m, writeAddr_b, rs_x, rt_x
§ 1-bit regWrite_m, regWrite_b
o Outputs: 2-bit forwardA, forwardB
o Description: Set forwardA and forwardB based on the other signals. The C-code equivalent can be found in the slides.
· 32-bit 3x1 Mux
o Inputs: 32-bit in0, 32-bit in1, 32-bit in2, 2-bit sel
o Output: 32-bit out
o Description: If sel==0, out=in0. If sel==1, out=in1. If sel==2, out=in2.
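The two modules above might be sketched as follows. The module names mux3x1_32 and forwarding_unit are assumptions. The forwarding conditions and the encoding (00 = no forwarding, 10 = forward from EX/MEM, 01 = forward from MEM/WB) follow the common textbook formulation, which may differ from the C code in the slides; treat the slides as the authority.

```verilog
// Sketch only: module names are hypothetical; port names follow the lists above.
module mux3x1_32 (
    input  wire [31:0] in0,
    input  wire [31:0] in1,
    input  wire [31:0] in2,
    input  wire [1:0]  sel,
    output reg  [31:0] out
);
    always @(*) begin
        case (sel)
            2'd0:    out = in0;
            2'd1:    out = in1;
            2'd2:    out = in2;
            default: out = in0;  // sel==3 is unspecified; in0 is an arbitrary choice
        endcase
    end
endmodule

module forwarding_unit (
    input  wire [4:0] writeAddr_m,
    input  wire [4:0] writeAddr_b,
    input  wire [4:0] rs_x,
    input  wire [4:0] rt_x,
    input  wire       regWrite_m,
    input  wire       regWrite_b,
    output reg  [1:0] forwardA,
    output reg  [1:0] forwardB
);
    // Assumed encoding (textbook convention, not confirmed by the handout):
    //   2'b00 = use the register-file value from ID/EX
    //   2'b10 = forward from EX/MEM (EX forwarding)
    //   2'b01 = forward from MEM/WB (MEM forwarding)
    always @(*) begin
        forwardA = 2'b00;
        forwardB = 2'b00;

        // First ALU operand (rs). EX forwarding takes priority because
        // the EX/MEM stage holds the more recent result.
        if (regWrite_m && (writeAddr_m != 5'd0) && (writeAddr_m == rs_x))
            forwardA = 2'b10;
        else if (regWrite_b && (writeAddr_b != 5'd0) && (writeAddr_b == rs_x))
            forwardA = 2'b01;

        // Second ALU operand (rt), same priority rule.
        if (regWrite_m && (writeAddr_m != 5'd0) && (writeAddr_m == rt_x))
            forwardB = 2'b10;
        else if (regWrite_b && (writeAddr_b != 5'd0) && (writeAddr_b == rt_x))
            forwardB = 2'b01;
    end
endmodule
```

With this assumed encoding, each mux would take the ID/EX register value on in0, the write-back value on in1, and aluOut_m on in2. The check against register 0 prevents forwarding a value for a register that always reads as zero.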
2.3. Connecting Everything
As shown in the above figure, add an instance of the forwarding unit and two instances of the 3x1 mux, and redirect wires properly between them.
2.4. Testing Your Code
This time, initialize the instruction memory with the code below. This is new code that differs from both the original code (from the skeleton) and the code in 1.3:
· memory[25] = {6'b100011, 5'd0, 5'd1, 16'd0};
· memory[26] = {6'b100011, 5'd0, 5'd2, 16'd4};
· memory[27] = {6'b100011, 5'd0, 5'd4, 16'd16};
· memory[28] = {6'b000000, 5'd1, 5'd2, 5'd3, 11'b00000100010};
· memory[29] = {6'b100011, 5'd3, 5'd4, 16'hFFFC};
If you correctly implemented data forwarding, you should see a result like this (without data forwarding, the result will look different, of course):
Now, try running the original 4-line code from the skeleton. Unfortunately, your result will look like this:
This is still wrong. Why? (Explain in your report)
3. Detecting Load-use Hazards and Stalling (5 pts)
This is the final part of the project. You will implement a hazard unit that detects load-use hazards and stalls the pipeline. As we have learned, the hazard unit detects a load-use hazard and generates three signals: IF/ID.Bubble, PC.Write, and IF/ID.Write, as shown in the figure below. IF/ID.Bubble inserts zeros into the ID/EX pipeline register in place of the output signals of the control unit. IF/ID.Write prevents the IF/ID pipeline register from being updated. PC.Write prevents the PC from being updated.
We will simplify the design a little bit: We will generate only one signal, stall, that replaces all three aforementioned signals. Instead of adding another mux after the control unit, we will modify the control unit itself to take in stall as an input and generate zeros if stall==1.
3.1. Implementing the Hazard Unit
· Hazard Unit
o Inputs: 5-bit rt_x, rt_d, rs_d, 1-bit memRead_x
o Outputs: 1-bit stall
o Description: Depending on the input signals, generate the stall signal. The C-code equivalent can be found in the lecture slides. stall==1 means the pipeline is stalled.
o HINT: rs_d and rt_d are parts of inst_d.
o HINT: You might need to use an additional initial begin-end block to initialize the stall signal to be zero.
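A hazard unit with the ports listed above might be sketched as follows. The module name hazard_unit is an assumption, and the condition is the common textbook load-use check (the instruction in EX is a load whose destination rt_x matches a source register of the instruction in ID); the C code in the lecture slides takes precedence if it differs.

```verilog
// Sketch only: the module name is hypothetical; port names follow the list above.
module hazard_unit (
    input  wire [4:0] rt_x,       // destination of the load currently in EX
    input  wire [4:0] rt_d,       // source registers of the instruction in ID
    input  wire [4:0] rs_d,
    input  wire       memRead_x,  // 1 when the instruction in EX is a load
    output reg        stall
);
    // Simulation aid suggested by the hint: start with stall deasserted
    // so the pipeline is not held while the inputs are still unknown.
    initial stall = 1'b0;

    always @(*) begin
        // Assumed textbook condition for a load-use hazard.
        if (memRead_x && ((rt_x == rs_d) || (rt_x == rt_d)))
            stall = 1'b1;
        else
            stall = 1'b0;  // also taken in simulation when the condition is X
    end
endmodule
```

Writing the logic as an if/else rather than a single assignment keeps stall at a known 0 in simulation when some inputs are still X, because an unknown condition selects the else branch.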
3.2. Modifying control_unit, program_counter, and IF/ID Pipeline register
The modules below must be updated accordingly:
· control_unit
o (Additional) Input: 1-bit stall
o Description: If stall==1, set all the output signals to zero.
· program_counter & IF/ID pipeline register
o (Additional) Input: 1-bit stall
o Description: If stall==1, do not update the output.
o HINT: This should be just one additional if statement.
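The single additional if statement from the hint might look like this in the two clocked modules. The module names and the PC port names nextPc and pc are assumptions, since the handout does not list the ports of the existing program_counter; keep whatever ports and initialization your own module already has.

```verilog
// Sketch only: module names and the PC ports nextPc/pc are hypothetical.
module if_id_reg_stall (
    input  wire        clk,
    input  wire        stall,
    input  wire [31:0] inst,
    output reg  [31:0] inst_d
);
    always @(posedge clk) begin
        // Hold the current instruction while the pipeline is stalled.
        if (!stall)
            inst_d <= inst;
    end
endmodule

module program_counter_stall (
    input  wire        clk,
    input  wire        stall,
    input  wire [31:0] nextPc,
    output reg  [31:0] pc
);
    // Initialization of pc is omitted here; reuse what your existing
    // program_counter already does.
    always @(posedge clk) begin
        // Hold the current PC while the pipeline is stalled.
        if (!stall)
            pc <= nextPc;
    end
endmodule
```

Holding both registers for one cycle means the same instruction is fetched and decoded again, while the zeroed control signals from the control unit travel down the pipeline as a bubble.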
3.3. Connecting Everything
As shown in the above figure, add the hazard unit and connect it with other modules properly. If your code is correct, the original code from the skeleton (4-line version) must produce the same output as HW5:
4. Submission
You must submit (1) your new datapath.v and (2) a short report in a single zip file. The report must discuss the following:
· What are the 7 instructions from 1.3? Does the waveform result match the expected output? Explain why the 7 instructions from 1.3 run correctly, while the original 4 instructions fail to produce the same output as HW5.
· What are the 5 instructions from 2.4? Does the waveform result match the expected output? Explain why the 5 instructions from 2.4 run correctly, while the original 4 instructions fail to produce the same output as HW5.
The report must be in PDF or Microsoft Word format. Handwritten reports are not allowed.
Always write the code clearly and add proper comments. Hard-to-read code will lose points.