November 2007

Accounting for bugs and rework

How do you handle bugs and rework in a software development kanban system?

One way or another, buggy work-in-process is still in process and counts against the total WIP limit. The only question is which part of the workflow gets stuck with a kanban token for rework.

Bug scenario

Here we have a two-stage work package decomposition. Green tickets represent a “requirement” work item. These could be Use Cases, Functional Requirements,… What matters is that each one represents customer value and seems like a “thing” to analysts, UI designers, testers, and the like.

These are decomposed into smaller work items for developers. Each yellow ticket represents a “feature,” in the spirit of FDD. Each feature will be designed, reviewed, coded, tested, reviewed, and integrated into a development branch. When all of a requirement’s features are complete, they will be rolled back up for review between analysis, testing, and development, merged into the test branch, and approved for functional verification and validation.

We should expect that bugs will be found from time to time, though we hope that this happens with decreasing frequency as a team matures. In this example, we have two requirements in testing, and they have found three bugs so far (blue tickets).

The question is: what do we do with these bugs, now that we have found them? Let’s consider two options: 1) Reinsert defective WIP into an upstream station, 2) Assign bugs to a shared rework station. Each case has pros and cons, which depend on circumstances, like process maturity.

Option 1: reinsert defective WIP upstream

In this scenario, we’ve taken the kanban for work item R5 out of the Test station and placed it back in Development, where it’s treated like any other requirement work item. When it’s done, it will be placed back in the Resolved:Ready queue and retested. It will be subject to all of the usual limits and rules along the way. The kanban is charged to Development, and Test is free to pick up the next thing that appears in their Ready queue.

This is the softer approach. It’s less disruptive and it treats bugs with less urgency. A downside of treating rework like a regular work item is that if Development capacity is full, then the buggy work item will have to be placed in a queue to wait its turn.

The attitude here is that bugs are an expected common cause variation.
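The reinsertion rule can be sketched in a few lines of Python. This is a minimal illustration, not a tool described in the post: the `Station` class, the `reinsert_upstream` helper, the limits of 2, and the items R6 and R7 are all assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Station:
    """A workflow station with a WIP limit (hypothetical model)."""
    name: str
    limit: int
    items: list = field(default_factory=list)

    def has_capacity(self) -> bool:
        return len(self.items) < self.limit


def reinsert_upstream(item, test, development, dev_queue):
    """Option 1: move a defective item from Test back toward Development.

    The token leaves Test immediately, so Test can pull new work.
    If Development is at its limit, the item waits in dev_queue.
    Returns the name of the place where the item landed.
    """
    test.items.remove(item)
    if development.has_capacity():
        development.items.append(item)
        return development.name
    dev_queue.append(item)
    return "Development:Queue"


# Assumed board state: both stations are full when the bug in R5 is found.
test = Station("Test", limit=2, items=["R4", "R5"])
development = Station("Development", limit=2, items=["R6", "R7"])
dev_queue = []

landed = reinsert_upstream("R5", test, development, dev_queue)
test_can_pull = test.has_capacity()  # Test is free to take the next Ready item
```

With Development full, R5 lands in the queue and waits its turn, which is exactly the downside noted above, while Test regains a free slot.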

Option 2: shared rework station

In this scenario, we’re leaving the kanbans for work items R4 and R5 in Test and moving the bug kanbans to a special rework station. Test may continue to work on R4 and R5 while the rework is done, but since Test is at capacity, it can’t pull in any new work. That means that the bugs have to be treated like an expedited request. Otherwise, the system will stall until they are resolved.

This is the harder approach. It treats the bugs like a process failure that must be attended to immediately. The kanban is still allocated to the Test station, and the Rework station does not count against the Development limit. Since the rework station is dedicated, there’s no waiting for a slot to open up in Development. Even so, Development capacity will be reduced, because people will have to give the bugs priority to unblock the system before the regularly scheduled work items can move again.

The attitude here is that bugs are a special cause variation and call for corrective action. This might be the right configuration if a team or project is new, or the team is having an acute quality problem. Once they get the problem under control, they can relax to the reinsertion model.
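The stall that makes the shared rework station urgent can be sketched as follows. This is a minimal illustration under stated assumptions: the `Station` and `ReworkStation` classes, the bug names B1 to B3, the Test limit of 2, and the mapping of bugs to requirements are all hypothetical, since the post only says that three bugs were found across R4 and R5.

```python
from dataclasses import dataclass, field


@dataclass
class Station:
    """A workflow station with a WIP limit (hypothetical model)."""
    name: str
    limit: int
    items: list = field(default_factory=list)

    def has_capacity(self) -> bool:
        return len(self.items) < self.limit


@dataclass
class ReworkStation:
    """Dedicated rework station; not charged against Development's limit."""
    bugs: dict = field(default_factory=dict)  # bug id -> blocked requirement

    def file_bug(self, bug: str, requirement: str) -> None:
        self.bugs[bug] = requirement

    def resolve(self, bug: str) -> None:
        del self.bugs[bug]

    def blocks(self, requirement: str) -> bool:
        return requirement in self.bugs.values()


def finish_testing(requirement, test, rework):
    """Release a requirement from Test only once none of its bugs remain."""
    if rework.blocks(requirement):
        raise RuntimeError(f"{requirement} still has open bugs in rework")
    test.items.remove(requirement)


test = Station("Test", limit=2, items=["R4", "R5"])
rework = ReworkStation()
# Assumed mapping of the three bugs to requirements; the post does not say.
rework.file_bug("B1", "R4")
rework.file_bug("B2", "R5")
rework.file_bug("B3", "R5")

stalled = not test.has_capacity()  # Test holds both tokens, so it cannot pull

rework.resolve("B1")               # expedite the bug blocking R4
finish_testing("R4", test, rework)
unblocked = test.has_capacity()    # a slot opens only once R4 leaves Test
```

Because the tokens stay in Test, nothing new enters Test until a bug is resolved, which is why the bugs behave like expedited requests in this configuration.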



Effective small teams need coordination to make an effective large organization

This is part 2 of 10 of the 10 Pitfalls of Agility on Large Projects. In part 1, we talked about how planning a month or less ahead is not enough on a very large project, and what to do about it.

Here’s some of why people say that long-term, full-detail plans are essential:

  1. You need the detailed work breakdown structure through the end of the project to produce estimates.
  2. And you need estimates to schedule handoffs, deliverables, and other dependencies between teams.
  3. And as things do change, you need a plan through which to communicate those changes.

How can we satisfy these needs while still allowing small teams latitude to adapt and be agile?

Concern #1 is a red herring of sorts. Estimates constructed from a detailed WBS are not the most accurate if you’re in a domain with significant unknowns (like most new product development). In these domains, you’ll get much better estimates from other methods, such as an experienced expert or group of experts using a technique like Wideband Delphi. For more, see a book like McConnell’s Software Estimation: Demystifying the Black Art.

#2 is a dominant concern for organizations with long internal lead times. The motivation and techniques for attacking this problem in other ways are what lean thinking is all about. In short, agile/lean teams are much better equipped to handle changes in other teams’ plans, so they don’t need those plans to be as firm. It’s a self-reinforcing benefit of shorter cycles that pays off in spades. The trick is keeping the peace during the (often long) transition period where an organization has a mix of long-cycle and short-cycle teams. The solutions to #3 below can help during this transition.

Concern #3 speaks to allowing your high and low level plans to evolve as you progress and learn. But how do you keep them in sync?

Top Down and Bottom Up

  • Have top-down goals and priorities that are clear about the customer and business need, but that don’t over-anticipate the technology to best fulfill that need.
  • Be prepared to take top-down input and provide bottom-up feedback as part of your regular planning cycle (e.g. Scrum’s monthly sprint planning).
    • For inputs, the Scrum Product Backlog and processes around it are an effective way to turn top-down priorities into actionable technical workitems.
    • For feedback, provide actuals. In order to keep the trust of the organization, some kind of actuals in terms of feature throughput, earned value, or time data, etc. are essential. If agile teams “go dark” on a large organization, it becomes harder to maintain trust when things go bad (as they invariably will from time to time on a large project).
    • For feedback, provide new estimates on the larger goals, based on this last cycle’s progress on specific workitems. To make this feasible, use a fast group estimation method like planning poker or its elder kin, Wideband Delphi.
  • As the size of the team goes up and dependencies between teams get more tangled, coordination on just a monthly basis isn’t enough. Getting information more frequently than your usual planning cycle (or getting your planning cycle down to one or two week sprints) may become essential. The diagram above says weekly (which might match an org with more than 6-8 Scrum teams).

    This might also be the threshold where project management specialists are called for — don’t distract your project leads with sub-sprint communication and coordination between teams. But also don’t lose Scrum’s designed benefit of protecting teams from constant interrupts — the team controls whether their plan changes within a sprint. Project Managers can help make sure status and communication flow between teams even during a sprint, but they (like all stakeholders) should be prepared to hold new work and priority changes until teams plan their next sprint.
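One way to turn actuals into a revised estimate for a larger goal is a simple throughput forecast. The sketch below is an assumption for illustration, not a method this post prescribes: the `sprints_remaining` function, the use of a plain average, and the sample numbers are all made up, and the units could be features, points, or whatever the team already counts.

```python
import math


def sprints_remaining(remaining_work, recent_actuals):
    """Forecast sprints left from the mean of recent per-sprint actuals.

    remaining_work: estimated size of what is left on the larger goal.
    recent_actuals: completed work per sprint, in the same units.
    """
    if not recent_actuals:
        raise ValueError("need at least one sprint of actuals")
    throughput = sum(recent_actuals) / len(recent_actuals)
    if throughput <= 0:
        raise ValueError("throughput must be positive to forecast")
    # Round up: a partially used sprint is still a sprint on the calendar.
    return math.ceil(remaining_work / throughput)


# Hypothetical numbers: 120 units left, last three sprints delivered 18, 22, 20.
forecast = sprints_remaining(120, [18, 22, 20])
print(forecast)  # 6
```

The remaining-work figure itself would come from the group re-estimation step; the arithmetic only keeps the larger goal's forecast honest against what the team actually delivered.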

If you’re adopting short-cycle methods in a (long-cycle) large organization, what are your pain points that weren’t covered here? And how have you adapted?



Planning a month or less ahead is not enough

This is part 1 of 10 of the 10 Pitfalls of Agility on Large Projects.

One of the most common, valid critiques of agile or lean processes is the time horizon of planning. Scrum focuses on one-month sprints. XP advises shorter iterations (typically 2 weeks). Lean focuses on a single piece (one feature, in the case of design projects), delivered in the shortest possible time.

To many organizations and many people — especially when they first manage projects — these kinds of planning horizons are crazy and negligent. Rather, they strive to plan in as much detail as possible, out to the end of the project. They want to identify critical paths, plan resources, etc. It’s obvious, right? Ah, but I see you’re smiling.

Unfortunately, what you may know (but what isn’t obvious to everyone) is that this over-planning can be disastrous when there is any level of risk or significant unknowns. Almost any non-trivial software development project would fall in this category.

Usually, this highly detailed initial plan falls quickly out of touch with reality, and must be ignored by the team after a certain point. Good project managers will try to adapt the plan, but if they built in too much detail initially, they’ll find keeping it up to date impossible. Either way, this all can be damaging, as now the team often feels like they’re confused and failing, and management or stakeholders can quickly get dangerously out of touch — they’re still looking at and expecting that initial plan.

Beyond that, there are a host of other harms. First among them is that you’re trying hard to lock down your plan at the earliest possible stage of the project, when you have the least understanding of what customers want, what the technology is capable of, and how quickly your team will be able to deliver it. You’ve not explored or mitigated any of the risks yet. You’ve basically committed to be as unresponsive as possible to the new things you learn as the project unfolds.

This is such a common problem — so much pain and so many failed projects could be avoided if it could be solved. And a simple conceptual solution is widely known, but is under-adopted.

It’s called Rolling Wave Planning. And it’s one effective way to unify the worlds of agile/lean and traditional project planning. Here’s a crude diagram illustrating how plans are detailed in the short term, but get progressively more generalized and flexible in the longer term.

Rolling Wave Planning

How does this work?

  • Identify just a few strategic, long-term product line and product goals. If they don’t fit on one side of one sheet of paper, they’re probably too long. These might look 1-2 years out for a large organization.
  • Expand that into a short, prioritized list of near-term problems for your team to solve in the coming year.
  • Bring in your more technical people to produce a short list of functionality the organization is capable of delivering in the coming months to make progress towards solving those problems. There should be lots of room to scale bells and whistles up or back, and especially to make technical choices about how to implement the functionality — you will reap significant efficiencies if your team can adjust as they learn more about the technologies involved and how long things will take to implement.
  • Involve the whole team in detailed, work-item-level planning just a few weeks ahead. If you have a low-risk, well-understood domain, you could choose to approach this as a work breakdown structure with Gantt chart and analysis. If you’re in a higher-risk domain (like new product development), use Scrum-style monthly sprints or, even better, a lean production flow. This is the schedule people can rally around for day to day work.
  • At the end of that shortest planning cycle, percolate your learnings from the small, granular work through to the larger grained goals, then back into your next short-term planning cycle. The key to making this doable is again to not allow too much detail into the larger goals. Keep them high-level, meaningful, and always flexible.

In short, you match the level of detail to (1) how far out in time you’re planning and (2) how risky your domain is.

By doing so, you’ll gain a host of benefits, many of which relate to lean — reducing work in progress, making decisions at the last responsible moment (when you have the most information), pushing responsibility down and empowering the people who are closest to the problem, and generally being open to feedback and agile in response to the changing forces around your project.



10 Pitfalls of Agility on Large Projects

Most of these pitfalls don’t apply to very small projects. They reflect some of the feedback you’d get when trying to drive agility at a large company with a lot of inertia behind existing ways of working (this list was born from experiences trying to drive a mix of Scrum, XP, and lean concepts at Microsoft). They also embody some of the common trade-offs or dualities of projects.

We’ll keep it short (thus perhaps cryptic) in this post — but then each pitfall/solution pair will be expanded upon in future posts. Any pitfalls that simply don’t make sense, or any you’d add to the list?

Pitfall #1: Planning a month or less ahead is not enough.
Use rolling wave planning to create an evolving big picture.

Pitfall #2: Effective small teams need coordination to make an effective large organization.
Combine bottom-up (scrum-style) and top-down (traditional) planning.

Pitfall #3: We can’t afford to trust everyone on larger teams.
Turn up the knob on transparency (especially time and quality data).

Pitfall #4: The customer doesn’t want a release every month.
Release early and often internally, with longer cycles for expanded audiences.

Pitfall #5: Hundreds of people can’t check directly into “main” every day.
Separate dependent sub-projects and use incremental integration with branches.

Pitfall #6: Not all activities are best handled by generalists.
Apply lean techniques to more effectively handle specialization.

Pitfall #7: Our team/management expects to plan, and to execute to plan.
Making firm commitments to something we don’t yet understand is counter-productive. As they come in, actuals have to trump estimates.

Pitfall #8: We are already in the dark. We need more documentation, not less.
On large projects, there are usually reams of wasted documentation. But it may be that “just enough” documentation and status-taking is still a lot.

Pitfall #9: Large teams will reject big changes in how we work.
Start with the way the team works today. Reflect and adapt towards agility.

Pitfall #10: Being agile on a large project is unrealistic and impossible to sustain.
There is no surer strategy for large-scale failure than large projects without empowered teams, short cycles, strong feedback, and a culture which embraces change and adaptation. All we can do is have the patience, persistence, and thoughtfulness to always keep driving in the right direction.



Bucket brigades as an alternative to design-in-process inventories?

A bucket brigade is a self-organizing workflow that strikes an optimum division of labor with minimum work-in-process. The division of labor is softer than a fixed assembly line, while still exploiting comparative advantage. Dare I say this could be an Agile division of labor?

BTW, in an actual bucket brigade, the bucket is the kanban.

More on bucket brigades.


