NVIDIA Interview Question
Software Engineer / Developers
The BTB (branch target buffer) is a special cache that stores the target addresses of the most recently used branches. In simple designs it is small, for example 32 entries. The BTB and the BPB (branch prediction buffer) are both accessed in the instruction fetch (I-fetch) stage, so the predicted direction and the target address are both available in I-fetch, which reduces the branch penalty. Without a BTB, the target address is only available in the decode stage, so a branch that is predicted taken costs a one-cycle delay before the target instruction can be fetched.
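To make the lookup concrete, here is a minimal sketch of a BTB in Python. The direct-mapped organization, the 4-byte instruction alignment, and the names `BranchTargetBuffer`, `lookup`, and `update` are my assumptions for illustration only; real hardware differs in associativity, size, and replacement policy.

```python
class BranchTargetBuffer:
    """Toy direct-mapped BTB: each entry holds (tag, target) for one branch PC."""

    def __init__(self, num_entries=32):
        # 32 entries matches the example size in the answer; it is not a rule.
        self.num_entries = num_entries
        self.entries = [None] * num_entries

    def _index_and_tag(self, pc):
        # Assumption: instructions are 4-byte aligned, so drop the low 2 bits.
        word = pc >> 2
        return word % self.num_entries, word // self.num_entries

    def lookup(self, pc):
        """Called in I-fetch: return the cached target, or None on a miss."""
        index, tag = self._index_and_tag(pc)
        entry = self.entries[index]
        if entry is not None and entry[0] == tag:
            return entry[1]
        return None

    def update(self, pc, target):
        """Called once a taken branch resolves: remember its target."""
        index, tag = self._index_and_tag(pc)
        self.entries[index] = (tag, target)


btb = BranchTargetBuffer()
first = btb.lookup(0x400)     # miss: this branch has not been seen yet
btb.update(0x400, 0x480)      # branch at 0x400 resolved as taken to 0x480
second = btb.lookup(0x400)    # hit: target is available in I-fetch
```

On the second lookup the fetch stage can redirect to the cached target right away instead of waiting for decode to compute it.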
If a branch is mispredicted, the penalty is 2 clock cycles. The branch target buffer helps only when a branch is predicted taken and actually turns out to be taken.
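The cases described in this answer can be summarized in a small function. This is only a sketch using the cycle counts quoted here (2 cycles for a misprediction, 1 cycle for a predicted-taken branch without a BTB entry); the function name `branch_penalty` and its parameters are hypothetical, and it assumes that a BTB hit always returns the correct target.

```python
MISPREDICT_PENALTY = 2   # cycles lost on a wrong prediction (figure from the answer)
NO_BTB_TAKEN_DELAY = 1   # cycles lost waiting for decode to produce the target


def branch_penalty(predicted_taken, actually_taken, btb_hit):
    """Return the stall cycles for one branch under the simple model above."""
    if predicted_taken != actually_taken:
        # Wrong direction: the BTB cannot help, pay the full misprediction cost.
        return MISPREDICT_PENALTY
    if predicted_taken and not btb_hit:
        # Correctly predicted taken, but the target is not known until decode.
        return NO_BTB_TAKEN_DELAY
    # Correct not-taken prediction, or correct taken prediction with a BTB hit.
    return 0


taken_with_btb = branch_penalty(True, True, True)
taken_without_btb = branch_penalty(True, True, False)
mispredicted = branch_penalty(True, False, True)
```

The only case where the BTB changes the result is the correctly predicted taken branch, which is the point made above.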
- XYZ October 28, 2008