Hummingbird E203 RISC-V processor code reading notes: instruction fetch module 2 (branch prediction), e203_ifu_litebpu.v

tags: verilog  Digital IC front end  cpu design

This article records my progress in learning the Hummingbird E203 RISC-V processor.
This is the third source file that I have studied in detail.

While studying the code, I added detailed comments, based on my own understanding, on the interface of each module and on the key internal signals.
Writing original content takes effort, so please respect the copyright: contact the author before reprinting, credit the source, and link to the original. Thank you.
e203_ifu_litebpu.v

 /*                                                                      
 Copyright 2017 Silicon Integrated Microelectronics, Inc.                
                                                                         
 Licensed under the Apache License, Version 2.0 (the "License");         
 you may not use this file except in compliance with the License.        
 You may obtain a copy of the License at                                 
                                                                         
     http://www.apache.org/licenses/LICENSE-2.0                          
                                                                         
  Unless required by applicable law or agreed to in writing, software    
 distributed under the License is distributed on an "AS IS" BASIS,       
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and     
 limitations under the License.                                          
 */                                                                      
                                                                         
                                                                         
                                                                         
//=====================================================================
//--        _______   ___
//--       (   ____/ /__/
//--        \ \     __
//--     ____\ \   / /
//--    /_______\ /_/   MICROELECTRONICS
//--
//=====================================================================
// Designer   : Bob Hu
//
// Description:
//  The Lite-BPU module to handle very simple branch prediction at IFU
//
// ====================================================================
// +++++++++++++++++++++++++++ My reading notes:
// This module further processes the mini-decode result to predict whether the instruction will jump.
// It uses the simplest static branch prediction: for conditional branches (Bxx), a backward jump is predicted taken and a forward jump is predicted not taken.
// JAL and JALR are unconditional, so nothing needs to be predicted: they always jump, and the IFU simply fetches from the jump target address.
// About PC and IR: the IR holds the instruction that has been fetched successfully, that is, the instruction to be executed by the EXU,
// and the PC is the address of the instruction in the IR.
// +++++++++++++++++++++++++++++++++++++++++
`include "e203_defines.v"

module e203_ifu_litebpu(

  // Current PC
  input  [`E203_PC_SIZE-1:0] pc, //PC of the instruction being mini-decoded (the instruction whose jump is being predicted)

  // The mini-decoded info 
  input  dec_jal,                //From the mini-decode module: the instruction is JAL, a direct jump
  input  dec_jalr,               //The instruction is JALR, an indirect jump whose target is computed from a register
  input  dec_bxx,                //The instruction is a conditional branch (Bxx), which jumps depending on the result of its comparison
  input  [`E203_XLEN-1:0] dec_bjp_imm,  //The immediate (offset) of the jump instruction
  input  [`E203_RFIDX_WIDTH-1:0] dec_jalr_rs1idx,  //The register-file index of the rs1 source operand of JALR

  // The IR index and OITF status to be used for checking dependency
  input  oitf_empty,  //The OITF is empty; used to check for dependencies between the jump instruction and earlier instructions
  input  ir_empty, //The IR register is currently empty
  input  ir_rs1en, //Driven by the dec2ifu_rs1en signal at the input of the instruction fetch module; indicates whether the instruction in the IR uses its rs1 operand
  input  jalr_rs1idx_cam_irrdidx, //Set to 1 when the rs1 index of the JALR instruction equals the rd index of the instruction in the IR
                                  //In that case the instruction in the IR will write the register that the JALR reads, so the JALR depends on it and the register value cannot be used for prediction yet
  // The add op to next-pc adder
  output bpu_wait,    // The BPU has to wait: the IFU is held until the JALR dependency is cleared
  output prdt_taken,  // Prediction result: 1 means the jump is predicted taken
  output [`E203_PC_SIZE-1:0] prdt_pc_add_op1,  //Operand 1 of the adder that computes the target PC (the base: PC, 0, x1 or rs1)
  output [`E203_PC_SIZE-1:0] prdt_pc_add_op2,  //Operand 2 of the adder (the immediate offset)

  input  dec_i_valid, // The instruction being mini-decoded is valid

  // The RS1 to read regfile
  output bpu2rf_rs1_ena,// Enable for the first regfile read port: it loads the rs1 index register, which sits at the same stage as the IR, so that the regfile can be read
  input  ir_valid_clr,  // The IR valid bit is being cleared, either because the EXU stage accepted the instruction or because a flush is happening
  input  [`E203_XLEN-1:0] rf2bpu_x1, // The value of x1, wired directly from the regfile
  input  [`E203_XLEN-1:0] rf2bpu_rs1, // The value of rs1 read back from the regfile read port

  input  clk,
  input  rst_n
  );


  // BPU of E201 utilize very simple static branch prediction logics
  //   * JAL: The target address of JAL is calculated based on current PC value
  //          and offset, and JAL is unconditionally always jump
  //          (My note: JAL needs no prediction, it always jumps.)
  //   * JALR with rs1 == x0: The target address of JALR is calculated based on
  //          x0+offset, and JALR is unconditionally always jump
  //          (My note: x0 is hardwired to 0, so the target is known immediately, as with JAL.)
  //   * JALR with rs1 = x1: The x1 register value is directly wired from regfile
  //          when the x1 have no dependency with ongoing instructions by checking
  //          two conditions:
  //            ** (1) The OITF in EXU must be empty 
  //            ** (2) The instruction in IR have no x1 as destination register
  //          * If there is dependency, then hold up IFU until the dependency is cleared
  //          (My notes: when both conditions hold, the x1 value wired directly from the regfile is used,
  //           no matter which instruction is currently executing.
  //           My current understanding of (2): if the instruction in the IR has x1 as its rd, its result
  //           will change x1, so the value of x1 cannot be taken directly at that point.
  //           If there is a dependency, the IFU is held until it is cleared, because the up-to-date x1 is
  //           an rd result that has not been written back yet. I have not worked out the exact timing.)
  //   * JALR with rs1 != x0 or x1: The target address of JALR need to be resolved
  //          at EXU stage, hence have to be forced halted, wait the EXU to be
  //          empty and then read the regfile to grab the value of xN.
  //          This will exert 1 cycle performance lost for JALR instruction
  //   * Bxxx: Conditional branch is always predicted as taken if it is backward
  //          jump, and not-taken if it is forward jump. The target address of JAL
  //          is calculated based on current PC value and offset

  // The JAL and JALR is always jump, bxxx backward is predicted as taken  
  // My note: for Bxx, the sign bit of the (signed) immediate is checked. A 1 means a negative offset,
  // that is, a backward jump, and only backward Bxx jumps are predicted taken.
  assign prdt_taken   = (dec_jal | dec_jalr | (dec_bxx & dec_bjp_imm[`E203_XLEN-1]));
  // The JALR with rs1 == x1 have dependency or xN have dependency
  wire dec_jalr_rs1x0 = (dec_jalr_rs1idx == `E203_RFIDX_WIDTH'd0);
  wire dec_jalr_rs1x1 = (dec_jalr_rs1idx == `E203_RFIDX_WIDTH'd1);
  wire dec_jalr_rs1xn = (~dec_jalr_rs1x0) & (~dec_jalr_rs1x1);
                        // A valid JALR with rs1 == x1 has a dependency when the OITF is not empty, or when the rd index of the instruction in the IR matches rs1 of the JALR
  wire jalr_rs1x1_dep = dec_i_valid & dec_jalr & dec_jalr_rs1x1 & ((~oitf_empty) | (jalr_rs1idx_cam_irrdidx)); //rs1 == x1 with a dependency
  wire jalr_rs1xn_dep = dec_i_valid & dec_jalr & dec_jalr_rs1xn & ((~oitf_empty) | (~ir_empty));               //rs1 == xN with a dependency
                                                                                            //     
                      // If only depend to IR stage (OITF is empty), then if IR is under clearing, or
                          // it does not use RS1 index, then we can also treat it as non-dependency
  wire jalr_rs1xn_dep_ir_clr = (jalr_rs1xn_dep & oitf_empty & (~ir_empty)) & (ir_valid_clr | (~ir_rs1en)); //

  wire rs1xn_rdrf_r; // See the notes in the book for details on this part. This 1-bit flag means that a JALR with rs1 == xN is currently reading the regfile
                     // Set when the flag is not already set (the first regfile read port is free), the current instruction is a valid JALR with rs1 == xN, and it either has no dependency or the IR is being cleared
  wire rs1xn_rdrf_set = (~rs1xn_rdrf_r) & dec_i_valid & dec_jalr & dec_jalr_rs1xn & ((~jalr_rs1xn_dep) | jalr_rs1xn_dep_ir_clr);
  wire rs1xn_rdrf_clr = rs1xn_rdrf_r;   // Cleared as soon as the flag is 1, so it stays set for a single cycle
  wire rs1xn_rdrf_ena = rs1xn_rdrf_set |   rs1xn_rdrf_clr; // The register is enabled on set (it becomes 1 in the next cycle) and on clear (it is already 1 and returns to 0 in the next cycle)
  wire rs1xn_rdrf_nxt = rs1xn_rdrf_set | (~rs1xn_rdrf_clr);

  sirv_gnrl_dfflr #(1) rs1xn_rdrf_dfflrs(rs1xn_rdrf_ena, rs1xn_rdrf_nxt, rs1xn_rdrf_r, clk, rst_n);

  assign bpu2rf_rs1_ena = rs1xn_rdrf_set; // Enable for the first regfile read port: it loads the rs1 index register, which sits at the same stage as the IR, so that the regfile can be read

  assign bpu_wait = jalr_rs1x1_dep | jalr_rs1xn_dep | rs1xn_rdrf_set; //In any of these three cases the BPU must wait before fetching from the jump target: a dependency is still pending, or the regfile read for xN has only just been started

  assign prdt_pc_add_op1 = (dec_bxx | dec_jal) ? pc[`E203_PC_SIZE-1:0]                   //Operand 1 of the target PC calculation: for Bxx and JAL, which do not need rs1, op1 is the current PC
                         : (dec_jalr & dec_jalr_rs1x0) ? `E203_PC_SIZE'b0                //For JALR with rs1 == x0, op1 is 0
                         : (dec_jalr & dec_jalr_rs1x1) ? rf2bpu_x1[`E203_PC_SIZE-1:0]    //For JALR with rs1 == x1, op1 is the x1 value wired directly from the regfile
                         : rf2bpu_rs1[`E203_PC_SIZE-1:0];                                //For JALR with rs1 == xN, op1 is the value read from the regfile read port

  assign prdt_pc_add_op2 = dec_bjp_imm[`E203_PC_SIZE-1:0];            //Operand 2 of the target PC calculation: the immediate offset

endmodule
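
To check my understanding of the static prediction rule and the operand selection above, here is a minimal stand-alone sketch with a small self-checking testbench. It is not part of the E203 source. The module name static_bpu_sketch, the port rs1_value, the output prdt_pc and the fixed 32-bit widths are my own assumptions. The sketch ignores all dependency handling (OITF, IR, bpu_wait) and puts the adder inside the module, whereas the real module only outputs the two adder operands.

```verilog
// Hypothetical simplified model, NOT the E203 code:
// 32-bit PC and immediate, no dependency checks, adder included for illustration.
module static_bpu_sketch(
  input  [31:0] pc,          // PC of the instruction being predicted
  input         dec_jal,
  input         dec_jalr,
  input         dec_bxx,
  input  [31:0] dec_bjp_imm, // sign-extended jump offset
  input  [31:0] rs1_value,   // assumed to be already available for JALR
  output        prdt_taken,
  output [31:0] prdt_pc
);
  // JAL and JALR always jump; Bxx is predicted taken only for a negative offset
  assign prdt_taken = dec_jal | dec_jalr | (dec_bxx & dec_bjp_imm[31]);

  // Bxx and JAL targets are PC-relative; JALR targets are rs1-relative
  wire [31:0] op1 = (dec_bxx | dec_jal) ? pc : rs1_value;
  assign prdt_pc = op1 + dec_bjp_imm;
endmodule

module static_bpu_sketch_tb;
  reg  [31:0] pc, imm, rs1;
  reg         jal, jalr, bxx;
  wire        taken;
  wire [31:0] target;
  integer     errors;

  static_bpu_sketch dut(
    .pc(pc), .dec_jal(jal), .dec_jalr(jalr), .dec_bxx(bxx),
    .dec_bjp_imm(imm), .rs1_value(rs1),
    .prdt_taken(taken), .prdt_pc(target)
  );

  initial begin
    errors = 0;
    pc  = 32'h0000_1000;
    rs1 = 32'h0000_2000;

    // Backward Bxx (offset -16): predicted taken, target = pc - 16
    jal = 1'b0; jalr = 1'b0; bxx = 1'b1; imm = 32'hFFFF_FFF0; #1;
    if (taken !== 1'b1 || target !== 32'h0000_0FF0) errors = errors + 1;

    // Forward Bxx (offset +16): predicted not taken
    imm = 32'h0000_0010; #1;
    if (taken !== 1'b0) errors = errors + 1;

    // JAL (offset +8): always taken, target = pc + 8
    bxx = 1'b0; jal = 1'b1; imm = 32'h0000_0008; #1;
    if (taken !== 1'b1 || target !== 32'h0000_1008) errors = errors + 1;

    // JALR (offset +4): always taken, target = rs1 + 4
    jal = 1'b0; jalr = 1'b1; imm = 32'h0000_0004; #1;
    if (taken !== 1'b1 || target !== 32'h0000_2004) errors = errors + 1;

    if (errors == 0) $display("all checks passed");
    else             $display("%0d check(s) failed", errors);
    $finish;
  end
endmodule
```

The sketch should run as-is in any Verilog-2001 simulator, since it does not depend on e203_defines.v or sirv_gnrl_dfflr. In the real design the hard part is not this rule but deciding when the rs1 value of a JALR can be trusted, which is what the dependency signals and bpu_wait are for.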
