Swarm like bees and control your interruptions to go faster.
This is the final post in a series on limiting what you start to go faster. The other posts in the series are listed below:
- Introduction: Introduces how we can achieve optimal productivity and many other benefits, including happiness, by performing one task to completion before starting a new task.
- Priorities and Simplicity: Illustrates how prioritization and simplicity can limit what you start, provide your teams with focus, and eliminate wasteful development of features your users do not need.
This post outlines two additional Agile patterns that will add further focus to your Agile team. These patterns are Swarming and Interrupt Handling.
Better Together: The Agile Patterns of Swarming and Interrupt Handling
When an Agile team is in a sprint, focus is key to achieving the sprint forecast, which is the definition of a successful sprint. Swarming and Interrupt Handling patterns, when used together, are a powerful combination that helps ensure a team achieves this goal.
The Agile Swarming Pattern
Swarming is best described as one-by-one production of sprint backlog stories. Essentially, during the sprint, the team swarms on one story at a time to completion, working together with a focus on the flow of the highest priority work to a “done” state. This is in direct contrast to focusing on individual worker efficiency, which leads to scrum-fall patterns, siloed team member work, diminished learning, and a high probability of sprint failure. Please read the details of swarming in this prior post: Gain Control By Breaking Dependencies–Task Level Part 2b.
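To make the one-story-at-a-time rule concrete, here is a minimal Python sketch of how a team might pick what to swarm on next. The function name `next_story_to_swarm`, the dictionary fields, and the convention that priority 1 is the highest are all assumptions made for this illustration, not part of any tool.

```python
def next_story_to_swarm(sprint_backlog):
    """Return the single story the whole team should work on next.

    Assumption: each story is a dict with "name", "priority" (1 = highest),
    and "done". The WIP limit of one is expressed by returning exactly one
    story, or None when everything is done.
    """
    open_stories = [story for story in sprint_backlog if not story["done"]]
    if not open_stories:
        return None
    return min(open_stories, key=lambda story: story["priority"])


# Hypothetical sprint backlog.
backlog = [
    {"name": "Checkout", "priority": 2, "done": False},
    {"name": "Login", "priority": 1, "done": True},
    {"name": "Search", "priority": 3, "done": False},
]
current = next_story_to_swarm(backlog)
print(current["name"])
```

The team only asks for the next story once the current one reaches "done", which is what keeps work in progress at one.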
The Agile Interrupt Handling Pattern
A common oversight Agile teams make is assuming the sprint will go as planned. This, in combination with a desire to deliver the most value possible in the sprint time box, leads to filling the sprint to the limit with no excess capacity for unknown or unpredictable events. We will call these unpredictable events interrupts.
Interrupts happen. Some common examples are:
- Automated test failures
- Production incidents
- Environment downtime
- Escaped defects
- Unanticipated implementation complexities
- External requests
When these interrupts occur and no capacity exists for handling the interruption, a team typically takes one of three actions:
- Move to something else: If the interrupt blocks current in-progress work, the team puts that work on hold and starts lower-priority work that is not blocked. This breaks the Swarming WIP limit of one story at a time and delays the delivery of the highest-value story.
- Stop-the-line: The team stops to fix the interrupt. This jeopardizes the sprint goal as the sprint has no capacity for addressing the interruption.
- Ignore it: The team ignores the interrupt and completes the sprint. This can lead to waste as certain interrupts, such as defects, become more difficult to solve and can breed further problems as time progresses.
None of these actions is desirable, as each leads to waste and can result in sprint failure.
For a Product Owner, given interrupts are inevitable and costly, it is a mistake not to have a strategy for dealing with them during the sprint. Teams that plan for interrupts, even if they do not occur, have a higher probability of sprint success.
The Interrupt Handling Pattern is a simple but powerful antidote to the waste and sprint failure caused by interrupts. Figure A outlines the Interrupt Handling pattern:
Step 1. Reserve Capacity
The first step of the Interrupt Handling pattern is to reserve capacity for interrupts during Sprint Planning. This can be achieved by setting aside a certain number of team hours from the sprint capacity or by reserving story points from the average velocity.
Teams often struggle with how much capacity to reserve. A common approach is to reflect on recent sprint interrupts and set the capacity according to recent history. Then, in the sprint retrospective, the team tweaks the capacity as necessary for the next sprint based on evolving interrupt trends.
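The history-based approach can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed formula: the function name `plan_sprint_capacity`, the choice of a simple average over recent sprints, and the sample numbers are all assumptions made for the example.

```python
from statistics import mean


def plan_sprint_capacity(total_hours, recent_interrupt_hours):
    """Split sprint capacity into an interrupt reserve and plannable hours.

    Assumption: the reserve is the simple average of the interrupt hours
    seen in recent sprints. A team could just as well use the median or
    the maximum; the only guidance is to base it on recent history.
    """
    if not recent_interrupt_hours:
        reserve = 0.0  # no history yet; the team has to pick a starting guess
    else:
        reserve = mean(recent_interrupt_hours)
    reserve = min(reserve, total_hours)  # cannot reserve more than exists
    return {"reserve": reserve, "plannable": total_hours - reserve}


# Hypothetical numbers: 300 team hours; the last three sprints
# lost 30, 50, and 40 hours to interrupts.
plan = plan_sprint_capacity(300, [30, 50, 40])
print(plan)
```

In the retrospective, the team would append the latest sprint's interrupt hours to the history and rerun the calculation for the next sprint. The same idea works with story points in place of hours.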
Step 2. Handle the Interrupt
When an interrupt occurs, the team first determines if they have remaining interrupt capacity. Teams often track remaining interrupt capacity on an information radiator, such as a burndown or a simple graphical representation.
If no interrupt capacity remains when an interrupt occurs, the team moves to Step 3 of the pattern—Product Owner Decides at Limit.
If remaining interrupt capacity exists, the team decides how to handle the interrupt based on the team’s working agreement. Two examples of common team working agreements for handling interrupts are below:
- “Stop the line” and pause the current in-progress story to handle the interruption.
- Complete the current in-progress story before addressing the interrupt.
When handling an interrupt, many teams time box the effort they spend on it.
If the interrupt exceeds the time box, the team moves to Step 3 of the Pattern—Product Owner Decides at Limit.
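The Step 2 flow can be written down as a small decision function. The Python sketch below is a simplification under stated assumptions: the name `handle_interrupt` and the outcome labels are invented for the example, capacity is measured in hours, and the effort needed is passed in up front even though a real team only discovers it as it works. The working agreement about when to start on the interrupt is left out.

```python
def handle_interrupt(remaining_capacity, time_box, effort_needed):
    """Return (outcome, remaining_capacity) for one interrupt.

    All values share one unit (hours here). Outcomes:
      "handled"     - resolved within one time box
      "escalate_po" - move to Step 3, Product Owner Decides at Limit
    """
    if remaining_capacity <= 0:
        # No interrupt capacity left: go straight to the Product Owner.
        return "escalate_po", remaining_capacity

    # Spend at most one time box, and never more than what is reserved.
    budget = min(time_box, remaining_capacity)
    spent = min(effort_needed, budget)
    remaining_capacity -= spent

    if spent < effort_needed:
        # The time box (or the reserve) ran out before the fix was done.
        return "escalate_po", remaining_capacity
    return "handled", remaining_capacity


# Hypothetical case: 10 hours reserved, 4-hour time box, 6 hours of work.
outcome, left = handle_interrupt(10, 4, 6)
print(outcome, left)
```

The returned remaining capacity is the number a team would post on its information radiator after each interrupt.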
Step 3. Product Owner (PO) Decides at Limit
If the remaining interrupt capacity is not sufficient to handle an interrupt or if the interrupt time box is consumed without fully handling the interrupt, the Product Owner decides how to handle the interrupt among three options:
- Consume another time box: If one interrupt capacity time box has been consumed and interrupt capacity remains, the Product Owner decides to consume an additional time box from the interrupt capacity.
- Reduce sprint scope: If no interrupt capacity remains, the Product Owner removes scope from the sprint backlog to make room to address the interrupt.
- Postpone the interrupt handling: If the remaining sprint scope is a high enough priority, the Product Owner postpones addressing the interrupt to a later sprint.
In many cases, postponed interrupts are addressed at the beginning of the next sprint if delaying them further would cause a prohibitive amount of waste. For instance, the longer a defect waits, the more effort it tends to take to resolve and the more likely it is to breed further defects. Thus, if no interrupt capacity remains in the current sprint, escaped defects should likely be fixed at the beginning of the next sprint before new features are started.
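The conditions attached to the three Product Owner options can be summarized in code. The Python sketch below only lists which options are on the table; the choice itself stays with the Product Owner. The function name `product_owner_options` and its parameters are hypothetical, chosen for this illustration.

```python
def product_owner_options(time_boxes_used, remaining_capacity,
                          remaining_scope_is_high_priority):
    """List the Step 3 options that apply in the current situation."""
    options = []
    if time_boxes_used >= 1 and remaining_capacity > 0:
        # A time box was consumed, but the reserve is not exhausted.
        options.append("consume another time box")
    if remaining_capacity <= 0:
        # Nothing left in the reserve: room must come from the sprint backlog.
        options.append("reduce sprint scope")
    if remaining_scope_is_high_priority:
        # The planned work matters more than the interrupt right now.
        options.append("postpone the interrupt")
    return options


# Hypothetical case: one time box used, reserve exhausted,
# and the remaining sprint scope is high priority.
print(product_owner_options(1, 0, True))
```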
Step 4. Abort for Overflow
If the team exceeds the interrupt capacity by any amount, the team must abort the sprint and communicate to all stakeholders that the sprint is aborted and that delivery timeline forecasts will be missed.
This should be an explicit policy known within the team and external to the team.
An abort policy is extreme for a reason. It puts positive pressure on the team and external requestors to not exceed the interrupt capacity. Nobody wants to abort a sprint, so awareness of this policy will cause under-the-radar work and favors to dissipate.
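An explicit policy is easier to follow when the check is mechanical. Here is a minimal Python sketch of a tracker that records interrupt hours and flags the abort condition. The class name `InterruptBuffer` and its methods are assumptions for the example; a whiteboard tally would do the same job.

```python
class InterruptBuffer:
    """Track interrupt capacity used against the amount reserved."""

    def __init__(self, reserved_hours):
        self.reserved = reserved_hours
        self.used = 0

    def record(self, hours):
        """Log time spent on an interrupt and return the new status."""
        self.used += hours
        return self.status()

    def remaining(self):
        """Hours still available for interrupts (never negative)."""
        return max(self.reserved - self.used, 0)

    def status(self):
        # Using exactly the reserve is fine; exceeding it by any amount
        # triggers the abort policy.
        return "ABORT" if self.used > self.reserved else "OK"


# Hypothetical sprint with 40 hours reserved for interrupts.
buffer = InterruptBuffer(40)
first = buffer.record(25)
second = buffer.record(15)
third = buffer.record(1)
print(first, second, third)
```

Keeping the remaining figure visible to external requestors as well as the team is what gives the policy its pressure.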
Step 5. Consume Unused Capacity
If the team completes its sprint forecast and there is interrupt capacity remaining, the team decides how to consume the unused capacity. Several options are available for the team:
- Delivering the next prioritized ready item off of the backlog
- Paying back tech debt
- Learning a new skill
- Implementing a team improvement
- Preparing for the next sprint
- Taking a break before the next sprint
Conclusion
The Swarming and Interrupt Handling patterns, when used together, are a powerful force for ensuring sprint success. These patterns allow the team to start less and finish more. Use them and reap the benefits of less waste and increased value, predictability, and teamwork.
Other Posts in the Series
- Limiting What You Start to Go Faster – An Introduction
- Limiting What You Start to Go Faster – Priorities and Simplicity
Todd Lankford unlocks Lean Leverage in organizations to cultivate powerful, engaged product teams who maximize outcomes and impact.
His articles share his experiences and learnings along the way. Join the mailing list to get them in your inbox.