tags: verilog Digital IC front end cpu design
This article records my study of the RISC-V Hummingbird E203 processor.
This is the third source file I have read through in detail.
While studying the code, I added detailed comments, based on my own understanding, on each module's interface and its key internal signals.
Original work takes effort, so please respect the copyright: contact the author before reprinting, credit the source, and link to the original. Thank you.
e203_ifu_litebpu.v
/*
Copyright 2017 Silicon Integrated Microelectronics, Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
//=====================================================================
//-- _______ ___
//-- ( ____/ /__/
//-- \ \ __
//-- ____\ \ / /
//-- /_______\ /_/ MICROELECTRONICS
//--
//=====================================================================
// Designer : Bob Hu
//
// Description:
// The Lite-BPU module to handle very simple branch predication at IFU
//
// ====================================================================
// +++++++++++++++++++++++++++ My reading notes:
// This module takes the result of instruction mini-decoding and predicts whether a jump will be taken
// It uses the simplest form of static branch prediction: for conditional branches (BXX), a backward jump is predicted taken and a forward jump is predicted not taken
// JAL and JALR are unconditional, so they need no prediction: they always jump, and the IFU simply fetches from the target address
// On PC and IR: IR holds the instruction that has been fetched successfully, that is, the one to be executed by EXU, and PC is the address
// corresponding to the instruction in IR
// +++++++++++++++++++++++++++++++++++++++++
`include "e203_defines.v"
module e203_ifu_litebpu(
// Current PC
input [`E203_PC_SIZE-1:0] pc, // PC value of the instruction being mini-decoded (the instruction just fetched by the IFU)
// The mini-decoded info
input dec_jal, // From the mini-decoder: the instruction is JAL, a direct jump
input dec_jalr, // The instruction is JALR, an indirect jump whose target is computed from a register
input dec_bxx, // The instruction is a conditional branch, which jumps only if its condition holds
input [`E203_XLEN-1:0] dec_bjp_imm, // Immediate (offset) of the branch or jump instruction
input [`E203_RFIDX_WIDTH-1:0] dec_jalr_rs1idx, // Register-file index of the rs1 source operand of JALR
// The IR index and OITF status to be used for checking dependency
input oitf_empty, // The OITF is empty; used to check dependencies between the jump instruction and earlier instructions
input ir_empty, // The IR register is currently empty
input ir_rs1en, // Driven by the dec2ifu_rs1en signal at the IFU boundary; indicates whether the instruction in IR reads rs1
input jalr_rs1idx_cam_irrdidx, // 1 when the rs1 index of the JALR being mini-decoded equals the rd index of the instruction in IR
// Note that the comparison is against the rd of the older instruction held in IR, not the rd of the JALR itself. If they match, the JALR depends on a result that has not been written back yet (a RAW dependency), so rs1 cannot be read from the regfile yet
// The add op to next-pc adder
output bpu_wait, // The BPU asks the IFU to wait until the dependency is resolved before fetching the jump target
output prdt_taken, // Prediction result: 1 means the jump is predicted taken
output [`E203_PC_SIZE-1:0] prdt_pc_add_op1, // Operand 1 of the adder that computes the predicted PC
output [`E203_PC_SIZE-1:0] prdt_pc_add_op2, // Operand 2 of the same adder (the offset)
input dec_i_valid, // The instruction presented to the mini-decoder is valid
// The RS1 to read regfile
output bpu2rf_rs1_ena,// Enable for the regfile's first read port: it loads the rs1 index register, which sits at the same stage as IR, so that rs1 can be read from the regfile
input ir_valid_clr, // The valid bit of IR is being cleared (IR valid is cleared when the instruction is accepted by the EXU stage *or* a flush happens)
input [`E203_XLEN-1:0] rf2bpu_x1, // Value of x1, wired directly from the regfile
input [`E203_XLEN-1:0] rf2bpu_rs1, // Value of rs1 read through the regfile read port
input clk,
input rst_n
);
// BPU of E201 utilize very simple static branch prediction logics
// * JAL: The target address of JAL is calculated based on current PC value
// and offset, and JAL is unconditionally always jump // JAL needs no prediction: it always jumps
// * JALR with rs1 == x0: The target address of JALR is calculated based on
// x0+offset, and JALR is unconditionally always jump // When rs1 of JALR is x0, which is hardwired to 0, the target is just the offset, so no register read is needed
// * JALR with rs1 = x1: The x1 register value is directly wired from regfile
// when the x1 have no dependency with ongoing instructions by checking
// two conditions: // When rs1 of JALR is x1 and both conditions below hold, the x1 value wired directly from the regfile is used, whatever instruction is currently executing
// ** (1) The OTIF in EXU must be empty
// ** (2) The instruction in IR have no x1 as destination register // If the instruction in IR (the older instruction now at EXU) writes x1, the regfile does not hold the new x1 yet, so the directly wired value cannot be used
// * If there is dependency, then hold up IFU until the dependency is cleared // The IFU is stalled until the instruction that writes x1 has completed and the dependency is gone
// * JALR with rs1 != x0 or x1: The target address of JALR need to be resolved
// at EXU stage, hence have to be forced halted, wait the EXU to be
// empty and then read the regfile to grab the value of xN.
// This will exert 1 cycle performance lost for JALR instruction
// * Bxxx: Conditional branch is always predicted as taken if it is backward
// jump, and not-taken if it is forward jump. The target address of JAL
// is calculated based on current PC value and offset
// The JAL and JALR is always jump, bxxx backward is predicted as taken
assign prdt_taken = (dec_jal | dec_jalr | (dec_bxx & dec_bjp_imm[`E203_XLEN-1])); // JAL and JALR always jump; a bxx is predicted taken only if the sign bit of its (signed) immediate is 1, i.e. the offset is negative and the branch jumps backward
// The JALR with rs1 == x1 have dependency or xN have dependency
wire dec_jalr_rs1x0 = (dec_jalr_rs1idx == `E203_RFIDX_WIDTH'd0); // rs1 of JALR is x0
wire dec_jalr_rs1x1 = (dec_jalr_rs1idx == `E203_RFIDX_WIDTH'd1); // rs1 of JALR is x1
wire dec_jalr_rs1xn = (~dec_jalr_rs1x0) & (~dec_jalr_rs1x1); // rs1 of JALR is any other register (xN)
// x1 dependency: the instruction is valid, it is a JALR whose rs1 is x1, and either the OITF is not empty or rs1 of the JALR matches the rd of the instruction in IR
wire jalr_rs1x1_dep = dec_i_valid & dec_jalr & dec_jalr_rs1x1 & ((~oitf_empty) | (jalr_rs1idx_cam_irrdidx)); // rs1 == x1 with a dependency
wire jalr_rs1xn_dep = dec_i_valid & dec_jalr & dec_jalr_rs1xn & ((~oitf_empty) | (~ir_empty)); // rs1 == xN with a dependency: the OITF or the IR is not empty
//
// If only depend to IR stage (OITF is empty), then if IR is under clearing, or
// it does not use RS1 index, then we can also treat it as non-dependency
wire jalr_rs1xn_dep_ir_clr = (jalr_rs1xn_dep & oitf_empty & (~ir_empty)) & (ir_valid_clr | (~ir_rs1en)); // The only dependency is on IR, and IR is being cleared or its instruction does not use rs1
wire rs1xn_rdrf_r; // 1-bit state flag: when 1, the regfile read for a JALR with rs1 == xN is in progress (the book covers this in more detail)
// Set when the flag is not already set (the first read port is free), the current instruction is a valid JALR with rs1 == xN, and it either has no dependency or depends only on an IR that is being cleared
wire rs1xn_rdrf_set = (~rs1xn_rdrf_r) & dec_i_valid & dec_jalr & dec_jalr_rs1xn & ((~jalr_rs1xn_dep) | jalr_rs1xn_dep_ir_clr);
wire rs1xn_rdrf_clr = rs1xn_rdrf_r; // The flag clears itself one cycle after being set
wire rs1xn_rdrf_ena = rs1xn_rdrf_set | rs1xn_rdrf_clr; // The register is updated on either a set or a clear
wire rs1xn_rdrf_nxt = rs1xn_rdrf_set | (~rs1xn_rdrf_clr); // Next value: 1 on a set, 0 on a clear
sirv_gnrl_dfflr #(1) rs1xn_rdrf_dfflrs(rs1xn_rdrf_ena, rs1xn_rdrf_nxt, rs1xn_rdrf_r, clk, rst_n);
assign bpu2rf_rs1_ena = rs1xn_rdrf_set; // Enable for the regfile's first read port: it loads the rs1 index register, which sits at the same stage as IR, so that rs1 can be read from the regfile
assign bpu_wait = jalr_rs1x1_dep | jalr_rs1xn_dep | rs1xn_rdrf_set; // The BPU waits in three cases: an x1 dependency, an xN dependency, or the cycle in which the xN regfile read is issued
assign prdt_pc_add_op1 = (dec_bxx | dec_jal) ? pc[`E203_PC_SIZE-1:0] // Operand 1 of the PC adder: for bxx and JAL it is the current PC
: (dec_jalr & dec_jalr_rs1x0) ? `E203_PC_SIZE'b0 // For JALR with rs1 == x0 it is 0
: (dec_jalr & dec_jalr_rs1x1) ? rf2bpu_x1[`E203_PC_SIZE-1:0] // For JALR with rs1 == x1 it is the x1 value wired directly from the regfile
: rf2bpu_rs1[`E203_PC_SIZE-1:0]; // Otherwise (JALR with rs1 == xN) it is the value read through the regfile read port
assign prdt_pc_add_op2 = dec_bjp_imm[`E203_PC_SIZE-1:0]; // Operand 2 of the PC adder: the immediate offset
endmodule
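
To see the static prediction rule on its own, here is a minimal, self-contained sketch that is not part of the E203 source. The module name static_bp_sketch, its port names, and the fixed 32-bit width are all assumptions made for illustration. Unlike e203_ifu_litebpu, which only outputs the two adder operands, the sketch also contains the adder, and it leaves out all dependency checking (bpu_wait and the regfile read flag).

```verilog
// Hypothetical sketch, NOT the E203 implementation: static prediction only.
// All names and the fixed 32-bit width are assumptions for illustration.
module static_bp_sketch (
  input         is_jal,    // instruction is JAL
  input         is_jalr,   // instruction is JALR
  input         is_bxx,    // instruction is a conditional branch
  input  [31:0] pc,        // PC of the instruction
  input  [31:0] rs1_val,   // value of rs1, used only by JALR
  input  [31:0] imm,       // sign-extended offset
  output        taken,     // predicted taken
  output [31:0] target     // predicted target address
);
  // JAL and JALR always jump; a branch is predicted taken only when its
  // offset is negative (sign bit set), i.e. when it jumps backward.
  assign taken = is_jal | is_jalr | (is_bxx & imm[31]);

  // JALR adds the offset to rs1; JAL and branches add it to the PC.
  wire [31:0] op1 = is_jalr ? rs1_val : pc;
  assign target = op1 + imm;
endmodule

// Small testbench that prints the prediction for four cases.
module static_bp_sketch_tb;
  reg         is_jal, is_jalr, is_bxx;
  reg  [31:0] pc, rs1_val, imm;
  wire        taken;
  wire [31:0] target;

  static_bp_sketch dut (
    .is_jal(is_jal), .is_jalr(is_jalr), .is_bxx(is_bxx),
    .pc(pc), .rs1_val(rs1_val), .imm(imm),
    .taken(taken), .target(target)
  );

  initial begin
    pc = 32'h0000_0100;
    rs1_val = 32'h0000_2000;

    // Backward branch, offset -16: predicted taken
    {is_jal, is_jalr, is_bxx} = 3'b001; imm = 32'hFFFF_FFF0; #1;
    $display("bxx backward: taken=%b target=%h", taken, target);

    // Forward branch, offset +16: predicted not taken
    {is_jal, is_jalr, is_bxx} = 3'b001; imm = 32'h0000_0010; #1;
    $display("bxx forward : taken=%b target=%h", taken, target);

    // JAL, offset +32: always taken, PC-relative target
    {is_jal, is_jalr, is_bxx} = 3'b100; imm = 32'h0000_0020; #1;
    $display("jal         : taken=%b target=%h", taken, target);

    // JALR, offset +8: always taken, target is rs1 + offset
    {is_jal, is_jalr, is_bxx} = 3'b010; imm = 32'h0000_0008; #1;
    $display("jalr        : taken=%b target=%h", taken, target);

    $finish;
  end
endmodule
```

With a simulator such as Icarus Verilog, the testbench should report the backward branch, JAL, and JALR as taken and the forward branch as not taken. In the real module the hard part is deciding when rs1 may be read safely, which is exactly what bpu_wait and the rs1xn_rdrf_* signals handle and what this sketch omits.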