Ace's Hardware has posted part 4 of its Secrets of High Performance CPUs article. Pretty heavy material, deeper than a construction pit filled with 100 volumes of Computer Idee. Part 4 explains which techniques are used to make a processor execute as many instructions per clock cycle as possible:
It is clear that designing a CPU which can deal with a lot of instructions in one clock cycle is the best way to get an ultra-fast CPU. In Part 2 we saw that superscalar CPUs have a buffer for the instruction decoders. The superscalar CPU will issue instructions from the decoder to the execution units when they are available. In Part 3 we saw that branches make it difficult to issue many instructions together each clock cycle. We solved that by predicting the branches. So, what's keeping Intel and AMD from building the 8-way CPU?
Once an instruction is decoded, the scheduler has to decide whether or not it will issue this instruction to the execution unit immediately. There are 3 core reasons, besides branches, why an x86 CPU cannot issue a lot of instructions immediately each clock cycle:
1. Dependences
2. The Intel x86 architecture does not have more than 8 general purpose registers
3. Execution units are busy: the right execution unit is not ready.
Let's examine each one by one.
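To make the first and third reasons a bit more concrete, here is a rough toy model in Python. It is not taken from the Ace's Hardware article: the `Instr` tuple, the `cycles_needed` function and the register and unit names are all made up for illustration. It assumes in-order issue, a latency of one cycle for every instruction, and it ignores the register shortage (reason 2) entirely.

```python
from collections import namedtuple

# Hypothetical toy instruction: destination register, source registers,
# and the kind of execution unit it needs.
Instr = namedtuple("Instr", "dest srcs unit")


def cycles_needed(program, width, units):
    """Count the cycles an in-order, `width`-way toy CPU needs to issue `program`.

    `units` maps a unit name to how many of those units exist.
    Assumption: every result becomes available one cycle after issue.
    """
    cycles = 0
    i = 0
    while i < len(program):
        cycles += 1
        written = set()     # registers produced by instructions issued this cycle
        free = dict(units)  # execution units still free this cycle
        issued = 0
        while i < len(program) and issued < width:
            ins = program[i]
            if any(src in written for src in ins.srcs):
                break  # dependence: needs a result that is not ready yet
            if free.get(ins.unit, 0) == 0:
                break  # the right execution unit is busy (or missing)
            free[ins.unit] -= 1
            written.add(ins.dest)
            issued += 1
            i += 1
        if issued == 0:
            raise ValueError(f"no execution unit of type {program[i].unit!r}")
    return cycles


# Four independent adds versus a chain where each add needs the previous result.
independent = [Instr(f"r{n}", (f"in{n}",), "alu") for n in range(1, 5)]
chain = [Instr(f"r{n}", (f"r{n - 1}",), "alu") for n in range(1, 5)]

print(cycles_needed(independent, width=4, units={"alu": 4}))  # 1
print(cycles_needed(chain, width=4, units={"alu": 4}))        # 4
print(cycles_needed(independent, width=4, units={"alu": 2}))  # 2
```

In this sketch the 4-way machine only reaches four instructions per cycle when the instructions are independent and enough units are free. The dependent chain drops back to one instruction per cycle, and halving the number of ALUs doubles the cycle count even for independent code.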