Little's Law, The Latency Effect, and "Ready for..." Queues
Bottlenecks between teams aren't so bad if those teams are still churning through the work they need to do... right? Maybe not.
When your team introduces a "ready for…" queue, beware. As developers pull from the backlog faster than testing can be completed, a ready-for-testing queue builds. It feels faster, but you’re probably adding costs of delay and causing the team to deliver less overall. Little’s Law and the Latency Effect help us understand why.
Little’s Law
Little’s Law is a way to understand the relationship between the following:
- Throughput. This is how many items of work (i.e., user stories) we complete each week.
- Work in progress. This is all work that has been started but not finished, even if it’s blocked waiting for someone or something.
- Cycle time. This is the time it takes to complete a specific item of work from start to finish (deployed to production).
Hopp and Spearman’s book Factory Physics expresses Little’s Law as a relationship between these three measures. Rearranged to solve for cycle time, it reads:
Cycle Time = Work in Progress ÷ Throughput
Consider a team that, at any given moment, has 10 items of work in progress. This team completes 10 items of work per week, which is the team’s throughput.
Little’s Law predicts a cycle time of one week per item of work: 10 ÷ 10 = 1 week.
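As a quick illustration (a minimal sketch, not from the original article), the same arithmetic can be expressed in a few lines of Python:
```python
def cycle_time(wip: float, throughput: float) -> float:
    """Little's Law rearranged: cycle time = work in progress / throughput."""
    return wip / throughput

# The team above: 10 items in progress, completing 10 items per week.
print(cycle_time(wip=10, throughput=10))  # -> 1.0 week per item
```
The same calculation applies as work in progress grows while throughput stays pinned at the bottleneck, as we’ll see below.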
When developers are able to get through the work faster than testing can be completed, this creates spare capacity. A developer may then use that available capacity to take the next item off the backlog. The more work the developers get through, the more features we’ll have, right? Wrong — but why?
Throughput
Firstly, this team’s throughput is constrained by a bottleneck in the process. Several testing activities are happening at the end of each item of work and can’t go any faster without some investment.
Having more work in progress, even in the best-case scenario, won’t result in more being delivered due to this constraint.
Work in Progress
Meanwhile, the developers are getting through five more items per week than can be completed by the testing team.
With 10 items of work in progress with the developers and five in progress with the testers, we now have 15 items of work in progress.
Cycle Time
With 15 items of work in progress and the bottleneck constraining the team’s throughput to 10 items per week, the cycle time increases, introducing costs of delay. Where each item previously took one week from start to finish, it now takes 1.5 weeks: 15 ÷ 10 = 1.5 weeks.
The Ready-for-Testing Queue Emerges
As developers get through more work, testing is still only being completed at the same rate as before. Many teams then introduce a ready-for-testing queue to allow the developers to continue to work independently of the bottleneck. Why should the developers slow down because the testing can’t be completed fast enough? Let’s explore.
Each week, the developers add five more items to the queue. By the end of the third week, we end up with 30 items of work in progress: 15 queueing up for testing in addition to the other 15 items of work in progress.
This pushes the cycle time out even further: 30 ÷ 10 = 3 weeks.
Now the cycle time is at three weeks per work item and the process continues with ever-increasing cycle time.
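A small week-by-week simulation (a sketch assuming the rates described above: developers finishing 15 items per week against a testing throughput of 10, starting from 15 items in progress) shows how the queue compounds:
```python
def simulate_ready_for_testing(weeks: int, starting_wip: int = 15,
                               dev_rate: int = 15, test_throughput: int = 10) -> None:
    """Each week, WIP grows by the gap between the developer and testing rates,
    and Little's Law stretches the cycle time accordingly."""
    wip = starting_wip
    for week in range(1, weeks + 1):
        wip += dev_rate - test_throughput   # five more items queue up each week
        cycle_time = wip / test_throughput  # Little's Law: WIP / throughput
        print(f"Week {week}: WIP = {wip}, cycle time = {cycle_time:.1f} weeks")

simulate_ready_for_testing(weeks=3)
# Week 1: WIP = 20, cycle time = 2.0 weeks
# Week 2: WIP = 25, cycle time = 2.5 weeks
# Week 3: WIP = 30, cycle time = 3.0 weeks
```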
It is for similar reasons that, in some Scrum teams, we see coding happening in one Sprint and testing a Sprint behind. The result is that testing falls further and further behind.
To make things worse, throughput is likely to fall, resulting in fewer features being delivered than before. This is largely due to the Latency Effect.
The Latency Effect
As the cycle time per item of work increases, the feedback loop lengthens. When the team was flowing, the developers received feedback from the testers soon after they completed their work. Now, the latency between finishing development and receiving the testers’ feedback has increased, and by the time it arrives the developer has moved on by several backlog items.
The developer needs to reload the context from days (or even weeks) ago. The further in the past this work is, the longer it takes the developer to reload that context, diagnose the cause of any defect, and incorporate the feedback.
What Can You Do?
Instead of pulling more and more work in, what could the team have done to avoid the ready-for-testing queue? Simple: Use the spare capacity to address why that queue was needed at all.
Are lots of defects being found, causing the testers to spend more time investigating and reproducing them? Can we get the testers involved earlier? Can any predictable tests be discussed, agreed upon, and automated during development so developers can ensure the code passes those before any end-of-process testing happens?
Are the testers manually regression testing every user story, or once per time-box (iteration or Sprint)? Could the developers help by automating more of that? Are the testers having to perform repetitive tasks to get to the part of the user journey they’re actually testing? Can the developers automate that?
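As one illustration of that idea (a hypothetical sketch; the discount rule and function names are invented for this example, not taken from the article), a predictable behaviour agreed with the testers up front can be captured as an automated test that developers run before handing work over, for instance with pytest:
```python
# Hypothetical rule agreed with the testers before development starts:
# "Orders of 100 or more units get a 10% discount; smaller orders pay full price."

def order_total(unit_price: float, quantity: int) -> float:
    """Invented function standing in for the real production code."""
    discount = 0.10 if quantity >= 100 else 0.0
    return unit_price * quantity * (1 - discount)

def test_bulk_orders_get_ten_percent_discount():
    assert order_total(unit_price=2.0, quantity=100) == 180.0

def test_small_orders_pay_full_price():
    assert order_total(unit_price=2.0, quantity=10) == 20.0
```
Tests like these don’t replace the exploratory work the testers do, but they let the predictable checks run during development rather than queueing behind it.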
Addressing any of these issues is likely to speed up your testing activities, opening up the bottleneck. The result is that this increases the throughput of the entire team. Now, taking that next item off the backlog will actually have the desired effect of more features than the team has delivered before.
Arguably, it’s the handoffs that are at the root of the problem. However, these handoffs are a reality for many teams. Little’s Law and the Latency Effect aren’t just restricted to teams with handoffs. Regardless of your process, increasing work in progress beyond the throughput that the bottlenecks in your process allow is a false economy. You’ll feel faster while, at best, delivering at the same rate. With the Latency Effect, you’ll deliver less rather than more.
Published at DZone with permission of Antony Marcano, DZone MVB. See the original article here.