# What is the clock cycle time in a pipelined and non-pipelined processor?

4.8 In this exercise, we examine how pipelining affects the clock cycle time of the processor. Problems in this exercise assume that individual stages of the data path have the following latencies:

| IF | ID | EX | MEM | WB |
|---|---|---|---|---|
| 250 ps | 350 ps | 150 ps | 300 ps | 200 ps |

Also, assume that instructions executed by the processor are broken down as follows:

| ALU | BEQ | LW | SW |
|---|---|---|---|
| 45% | 20% | 20% | 15% |

4.8.1 [5] What is the clock cycle time in a pipelined and non-pipelined processor?

4.8.2 [10] What is the total latency of an LW instruction in a pipelined and non-pipelined processor?

4.8.3 [10] If we can split one stage of the pipelined data path into two new stages, each with half the latency of the original stage, which stage would you split and what is the new clock cycle time of the processor?

4.8.4 [10] Assuming there are no stalls or hazards, what is the utilization of the data memory?

4.8.5 [10] Assuming there are no stalls or hazards, what is the utilization of the write-register port of the Registers unit?

4.8.6 [30] Instead of a single-cycle organization, we can use a multi-cycle organization where each instruction takes multiple cycles but one instruction finishes before another is fetched. In this organization, an instruction only goes through stages it actually needs (e.g., ST only takes 4 cycles because it does not need the WB stage). Compare clock cycle times and execution times with single-cycle, multi-cycle, and pipelined organization.


Solution:

4.8.1 Pipelined: every stage takes a single clock cycle, so the clock cycle must be long enough to accommodate the slowest stage. The slowest stage is Instruction Decode (ID), so the cycle time is 350 ps.

Non-pipelined: each instruction goes through all the stages within one cycle, so the cycle time is the sum of all the stage latencies: 250 + 350 + 150 + 300 + 200 = 1,250 ps.

4.8.2 The LW (load word) instruction uses all 5 stages.

Pipelined: LW takes 5 cycles at 350 ps per cycle (see 4.8.1), so total latency = cycles × clock cycle time = 5 × 350 = 1,750 ps.

Non-pipelined: LW passes through all 5 stages in one long cycle, so total latency = sum of all stages = 250 + 350 + 150 + 300 + 200 = 1,250 ps.

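
The arithmetic in 4.8.1 and 4.8.2 can be checked with a short Python sketch. The function and variable names below are illustrative only, and the mapping of latencies to stage names assumes the usual IF, ID, EX, MEM, WB order of the sum used in the solution.

```python
# Stage latencies in picoseconds (assumed order: IF, ID, EX, MEM, WB).
STAGE_LATENCY_PS = {"IF": 250, "ID": 350, "EX": 150, "MEM": 300, "WB": 200}


def pipelined_cycle_ps(stages):
    """Pipelined cycle time: set by the slowest stage."""
    return max(stages.values())


def single_cycle_ps(stages):
    """Non-pipelined cycle time: one cycle covers every stage."""
    return sum(stages.values())


def lw_latency_pipelined_ps(stages):
    """LW visits every stage, one full clock cycle per stage."""
    return len(stages) * pipelined_cycle_ps(stages)


def lw_latency_single_cycle_ps(stages):
    """LW completes in exactly one long cycle."""
    return single_cycle_ps(stages)


if __name__ == "__main__":
    print("pipelined cycle:", pipelined_cycle_ps(STAGE_LATENCY_PS), "ps")
    print("non-pipelined cycle:", single_cycle_ps(STAGE_LATENCY_PS), "ps")
    print("LW latency (pipelined):", lw_latency_pipelined_ps(STAGE_LATENCY_PS), "ps")
    print("LW latency (non-pipelined):", lw_latency_single_cycle_ps(STAGE_LATENCY_PS), "ps")
```

Note that pipelining raises the latency of a single LW (1,750 ps against 1,250 ps) even though it shortens the clock cycle; the benefit comes from throughput, not from individual instruction latency.
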
4.8.3 Splitting the longest stage is the only way to reduce the cycle time. After the split, the new cycle time is set by the new longest stage.

a. The old longest stage is Instruction Decode (ID), so the old cycle time is 350 ps.

b. After ID is split into two 175 ps stages, the new longest stage is Memory (MEM), so the new cycle time is 300 ps.

4.8.4 Data memory is used only by load word (LW) and store word (SW) instructions. Given that LW makes up 20% of the executed instructions and SW makes up 15%, the utilization of the data memory is 20% + 15% = 35% of the clock cycles.

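
The stage-splitting argument in 4.8.3 can be sketched as follows. This is a minimal illustration, not part of the original solution; the helper name and the "ID1"/"ID2" stage labels are hypothetical.

```python
# Stage latencies in picoseconds (assumed order: IF, ID, EX, MEM, WB).
STAGE_LATENCY_PS = {"IF": 250, "ID": 350, "EX": 150, "MEM": 300, "WB": 200}


def split_longest_stage(stages):
    """Return the name of the longest stage and a new stage table in which
    that stage is replaced by two halves of equal latency."""
    longest = max(stages, key=stages.get)
    new_stages = {}
    for name, latency in stages.items():
        if name == longest:
            # Hypothetical labels for the two new half-latency stages.
            new_stages[name + "1"] = latency / 2
            new_stages[name + "2"] = latency / 2
        else:
            new_stages[name] = latency
    return longest, new_stages


if __name__ == "__main__":
    old_cycle = max(STAGE_LATENCY_PS.values())
    longest, new_stages = split_longest_stage(STAGE_LATENCY_PS)
    new_cycle = max(new_stages.values())
    print(f"split {longest}: cycle time {old_cycle} ps -> {new_cycle} ps")
```

Splitting any stage other than ID would leave the 350 ps stage in place, so the cycle time would not change.
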
4.8.5 The write-register port is used only by ALU and LW instructions. Given that LW makes up 20% of the executed instructions and ALU instructions make up 45%, the utilization of the write-register port is 20% + 45% = 65% of the clock cycles.

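
The two utilization figures (4.8.4 and 4.8.5) are sums over the instruction mix. A minimal sketch, assuming no stalls so that one instruction occupies each resource slot per cycle; the BEQ share of 20% is inferred from the other three shares summing to 80%, and all names are illustrative.

```python
# Instruction mix in percent. BEQ = 20 is inferred so the mix sums to 100.
INSTRUCTION_MIX_PERCENT = {"ALU": 45, "BEQ": 20, "LW": 20, "SW": 15}

# Which instruction classes use each resource.
USES_DATA_MEMORY = {"LW", "SW"}
WRITES_REGISTER = {"ALU", "LW"}


def utilization_percent(mix, users):
    """Percentage of cycles in which a resource is used, with no stalls."""
    return sum(share for op, share in mix.items() if op in users)


if __name__ == "__main__":
    print("data memory:", utilization_percent(INSTRUCTION_MIX_PERCENT, USES_DATA_MEMORY), "%")
    print("write-register port:", utilization_percent(INSTRUCTION_MIX_PERCENT, WRITES_REGISTER), "%")
```
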
4.8.6 The multi-cycle organization has the same clock cycle time as the pipelined organization (350 ps), while the single-cycle organization needs a 1,250 ps cycle. In single-cycle, every instruction takes one (long) clock cycle. In pipelined, a long-running program with no pipeline stalls completes one instruction every cycle. Finally, a multi-cycle organization completes an LW in 5 cycles, an SW in 4 cycles (no WB), an ALU instruction in 4 cycles (no MEM), and a BEQ in 4 cycles (no WB). The speed-up of the pipelined organization over the other two follows from these figures.

Multi-cycle execution time is X times the pipelined execution time, where X = (5 × 20%) + (4 × (45% + 20% + 15%)) = (5 × 0.20) + (4 × 0.80) = 4.2.

Single-cycle execution time is X times the pipelined execution time, where X = (non-pipelined cycle time) / (pipelined cycle time) = 1,250 / 350 ≈ 3.57.
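
The comparison in 4.8.6 can be reproduced with the sketch below. It assumes a long-running program with no stalls, so the pipelined design averages one instruction per cycle; the per-instruction cycle counts are the ones stated in the solution, the BEQ share of 20% is inferred from the rest of the mix, and all names are illustrative.

```python
# Stage latencies in picoseconds (assumed order: IF, ID, EX, MEM, WB).
STAGE_LATENCY_PS = {"IF": 250, "ID": 350, "EX": 150, "MEM": 300, "WB": 200}

# Instruction mix in percent. BEQ = 20 is inferred so the mix sums to 100.
INSTRUCTION_MIX_PERCENT = {"ALU": 45, "BEQ": 20, "LW": 20, "SW": 15}

# Cycles per instruction in the multi-cycle organization (from the solution).
MULTI_CYCLE_CYCLES = {"LW": 5, "SW": 4, "ALU": 4, "BEQ": 4}


def average_time_per_instruction_ps():
    """Average time per instruction for each organization, in picoseconds."""
    short_cycle = max(STAGE_LATENCY_PS.values())  # pipelined and multi-cycle
    long_cycle = sum(STAGE_LATENCY_PS.values())   # single-cycle
    weighted_cycles = sum(
        INSTRUCTION_MIX_PERCENT[op] * MULTI_CYCLE_CYCLES[op]
        for op in INSTRUCTION_MIX_PERCENT
    )  # percent-weighted cycle count; divide by 100 for the average CPI
    return {
        "single-cycle": long_cycle,
        "multi-cycle": weighted_cycles * short_cycle / 100,
        "pipelined": short_cycle,
    }


if __name__ == "__main__":
    times = average_time_per_instruction_ps()
    for name, ps in times.items():
        ratio = ps / times["pipelined"]
        print(f"{name}: {ps:.0f} ps per instruction, {ratio:.2f}x pipelined")
```

With these inputs the multi-cycle organization averages 4.2 × 350 = 1,470 ps per instruction, which is slower than the 1,250 ps single-cycle design; both are well behind the 350 ps per instruction of the stall-free pipeline.
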
