Feature Crews: kanban systems for software engineering in the large


The Feature Crew process was originally developed at Microsoft for managing the Office development program, and has since spread to other parts of the company. It is an iterative process that resembles other contemporary iterative processes, such as those of the Agile family. Like most things from Microsoft, Feature Crews evolved independently in order to meet the distinct requirements of a sprawling project like Office. Consequently, some of the concerns and practices of the Feature Crew model differ from common Agile practice.

The Feature Crew model is important to the Lean software development discussion because it is another major variety of kanban system, and is probably the most successful application of pull scheduling in software engineering to date.  Feature Crews are a strong and direct expression of the Lean principles of pull, flow, and value.


Value is represented in Feature Crews by the feature. A feature is roughly defined as a unit of customer-valued functionality that can be built and fully integrated within an interval of a few weeks. The feature is the fundamental unit of scheduling in Feature Crews. Features are generally derived from a user-centered analysis practice like Personas/Scenarios, and they should be bundled into customer- and business-valued packages for integration and deployment. Defined in this way, features are roughly equivalent to the Minimum Marketable Feature (MMF) concept. Personally, I define features exactly as MMFs and therefore I define:

The Feature Crew process is a one-piece-flow pull system for Minimum Marketable Features.

Feature Crews

The namesake attribute of a Feature Crew is the cross-functional workcell. A crew should contain most of the capability that it needs to fully specify, design, verify, and integrate a complete product feature. Typically that means a Program Manager (in the Microsoft sense of the title), a handful of developers, and a couple of testers. Depending on the type of feature, there may also be a product designer or other specialist, but we’ll see how such a resource might be shared across workcells.


One of the primary problems that Feature Crews (FC) address is the difficulty of maintaining the integrity of very large code bases under development (imagine 1000 developers coding against a 10,000,000-line system). FC poses the problem as the tension between a) keeping the main branch as current as possible, and b) keeping the main branch as robust as possible. The FC solution is to make each feature an atomic transaction. A feature is either 0% complete or 100% complete, and a feature is not 100% complete until it can be demonstrated that it satisfies the same quality criteria as the rest of the main branch.

Features-in-process are not allowed on the main branch. The FC alternative is branch-by-feature. A crew takes a branch when it takes possession of the feature kanban. The crew is responsible for forward-integrating any changes that are checked into main while their feature is in process. That is, if another crew integrates and breaks your feature-in-process, it’s your responsibility, not theirs. When your feature is finally complete AND you have integrated with all changes on main AND you pass all of the quality gates, THEN you can reverse integrate your feature into the main branch, and everybody else will have to forward integrate your changes.
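The reverse-integration rule above can be read as a three-part guard. Here is a minimal sketch of that guard in Python. It is an illustration only, not a description of any real tooling: the FeatureBranch type, the gate names, the feature name, and the integer revision counter are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical gate names; a real project defines its own quality criteria.
REQUIRED_GATES = ("unit_tests", "security_review", "performance")


@dataclass
class FeatureBranch:
    name: str
    feature_complete: bool = False
    synced_main_revision: int = 0  # last main revision forward-integrated into the branch
    gates_passed: set = field(default_factory=set)


def can_reverse_integrate(branch, main_revision, required_gates=REQUIRED_GATES):
    """Return True only if all three conditions from the text hold."""
    up_to_date = branch.synced_main_revision == main_revision
    gates_ok = all(gate in branch.gates_passed for gate in required_gates)
    return branch.feature_complete and up_to_date and gates_ok


branch = FeatureBranch("spell-check")
branch.feature_complete = True
branch.gates_passed.update(REQUIRED_GATES)
branch.synced_main_revision = 41

# Another crew has since landed revision 42 on main, so this crew must
# forward-integrate again before it is allowed to merge.
blocked = can_reverse_integrate(branch, main_revision=42)

branch.synced_main_revision = 42
allowed = can_reverse_integrate(branch, main_revision=42)
```

The point of the sketch is that the burden sits with the branch: a change landing on main invalidates the "up to date" condition for every feature-in-process, and each crew has to re-establish it before merging.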

From a Lean perspective, even without the scale issue, there is an argument to be made for atomic MMFs and branch-by-feature. MMFs are business-valued. User stories are merely user-valued. Customers demand features. Users merely request them. Allowing features-in-process on the main branch exposes the product to inventory and market risk. A Lean deployment model should be transactional at the granularity of business value, not user value. The purpose of the MMF practice is to bring those values as close together as possible.

One Feature, One Crew

A feature is an atomic customer-valued work item (value). A Feature Crew is a dedicated cross-functional team (flow). We need one more thing to implement pull, a rule: One Feature, One Crew. A Feature Crew works on one and only one feature until the feature is fully integrated into the main branch. When such a feature is finally accepted, then capacity becomes available to begin another feature. Such a rule about the relationship between work-in-process and available resources is a kanban rule, thus Feature Crew is a kanban system, where each token represents the capacity of one workcell and is exchanged for exactly one feature.
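As a rough sketch of the One Feature, One Crew rule, the Python below models each crew as a single kanban token. The CrewBoard class, the crew names, and the feature labels are hypothetical and exist only to make the rule concrete.

```python
from collections import deque


class CrewBoard:
    """One kanban token per crew; a token is exchanged for exactly one feature."""

    def __init__(self, crews, backlog):
        self.assignment = {crew: None for crew in crews}  # None means the token is free
        self.backlog = deque(backlog)
        self.integrated = []

    def pull(self, crew):
        """A free crew pulls the next feature; a busy crew may not."""
        if self.assignment[crew] is not None:
            raise RuntimeError(f"{crew} already holds {self.assignment[crew]}")
        if not self.backlog:
            return None
        self.assignment[crew] = self.backlog.popleft()
        return self.assignment[crew]

    def integrate(self, crew):
        """Feature accepted on main: the crew's capacity becomes available again."""
        feature = self.assignment[crew]
        if feature is None:
            raise RuntimeError(f"{crew} has no feature in process")
        self.assignment[crew] = None
        self.integrated.append(feature)
        return feature

    def work_in_process(self):
        return [f for f in self.assignment.values() if f is not None]


board = CrewBoard(["crew-a", "crew-b"], ["F1", "F2", "F3"])
board.pull("crew-a")
board.pull("crew-b")
```

With two crews, work-in-process can never exceed two features. F3 stays in the backlog until one of the crews integrates, which is exactly the pull behavior the rule is meant to produce.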

What happens to a Feature Crew when its feature is complete? There are a couple of variations. One style keeps the crew intact and reschedules it as a unit. Another returns the crew to a resource pool from which new crews can be allocated. Neither pure strategy is necessary. I like mostly durable teams with light rotation of individuals.

Quality Gates

While things like unit tests and customer acceptance tests are necessary to meet the quality criteria of a multi-million-line codebase, they are certainly not sufficient. A Minimum Marketable Feature, taken as a whole, has properties that are more than the sum of its component tasks. There are certain kinds of design and quality control activities that have a larger natural granularity than user stories.

Quality Gates ensure that all of the systemic work gets done that does not fit naturally into the functional development process. A good deal of security and reliability engineering has nothing to do with the intended functionality of a feature, and everything to do with how components of a system will behave in the presence of other components of that system under a wide range of operating conditions. Quality Gates also facilitate the sharing of specialized resources, like a security engineer or a system architect, that are impractical to include on every crew.

Feature Crew is only a process framework

The Feature Crew model treats both features and workcells essentially as black boxes. Like Scrum, the Feature Crew method is not prescriptive about workflow. What happens inside the crew is somewhat up to the crew. One could imagine that a crew implements a mini-phase/gate (and some do, though we know that isn’t wise). A crew could choose to implement an off-the-shelf process like Cleanroom or XP for their internal workflow, and many do. Some crews will use an ad-hoc local body of practice, or git-r-done cowboy craft. The quality gates set the bar; how you meet the bar is your concern.

Since Feature Crews address a different scale than Scrum, we can even combine Feature Crews with Scrum. If a crew takes on a 6-week feature, that crew could then overlay a 1-week timebox within those 6 weeks and decompose the feature into Scrum-like work items and goals, which are then implemented according to local custom. Again, this is not uncommon practice.

Kanban hierarchy and matrix workcells

Since we already understand Feature Crews as a type of kanban system, and we see how we can overlay Scrum as a secondary planning process, it follows that we can use something like Scrumban as the internal scheduling process. We don’t have to do this, but doing so buys us some symmetry between the layers of the planning hierarchy, and allows us to share metrics, tools, and terminology between workers and management. The cumulative flow report for your team is of a similar kind to the cumulative flow report for the project as a whole.
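To illustrate the symmetry, here is a small sketch in which one function produces the cumulative flow data for both layers. The state names, the snapshot format, and the sample boards are assumptions made up for this example.

```python
from collections import Counter

# Hypothetical column names; any ordered set of workflow states would do.
STATES = ("backlog", "in_process", "done")


def cumulative_flow(snapshots, states=STATES):
    """Count items per state for each (day, {item: state}) snapshot."""
    rows = []
    for day, board in snapshots:
        counts = Counter(board.values())
        rows.append((day, tuple(counts[s] for s in states)))
    return rows


# Project level: the items are features.
project = [
    (1, {"F1": "in_process", "F2": "backlog", "F3": "backlog"}),
    (2, {"F1": "done", "F2": "in_process", "F3": "backlog"}),
]
# Crew level: the items are tasks inside one feature.
crew = [
    (1, {"T1": "in_process", "T2": "backlog"}),
    (2, {"T1": "done", "T2": "done"}),
]

project_rows = cumulative_flow(project)
crew_rows = cumulative_flow(crew)
```

Only the unit of work changes between the two calls. Stacking the per-state counts over time gives the familiar cumulative flow chart at either level.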

Feature Crews are an effective method of capacity calculation, but they are also a blunt instrument for that purpose. Making a team self-contained can result in boom-and-bust duty cycles for individual team members. The beginning of a feature branch may be heavy on analysis and high-level design, and the end of a feature branch may be heavy on system testing and bug fixing. A UI designer might find herself very busy for the first couple of weeks and mostly idle for the last couple of weeks. Such a team may be able to learn a certain amount of task leveling, but only so much before running into other problems.

The complementary dysfunction of individual bursting is poor total resource utilization. If every crew employs their own UI designer at 50% capacity, then you’ve hired twice as many designers as you really need to get the work done. As much as we love flow, that is a high price to pay for it. Introducing a second level of kanban granularity gives us access to a finer set of controls that we can use to dial in a better balance between availability and utilization.

Nested Scrumban is enough to give us a more consistent process between the whole and its parts, but it doesn’t help us directly with our utilization problem. For that, we will appeal to a little bit of queueing theory. For some development functions we may be able to share resources between workcells with the same low delay as more dedicated resources, but at much higher utilization. Software development costs are overwhelmingly dominated by labor costs, so labor utilization deserves at least some of our attention. Lean cares a lot about labor utilization; it just cares about it in the right order.
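One way to make the queueing argument concrete is the textbook M/M/c model (the Erlang C formula). The sketch below compares ten crews that each keep a dedicated designer half busy against a smaller shared pool. Everything numeric here is an assumption chosen for illustration: the request and service rates, the pool sizes, and above all the model itself, which assumes random arrivals, exponentially distributed job sizes, and no cost for switching between crews.

```python
from math import factorial


def mean_wait(arrival_rate, service_rate, servers):
    """Mean time a request waits in queue for an M/M/c system (Erlang C)."""
    offered = arrival_rate / service_rate  # offered load in Erlangs
    rho = offered / servers                # utilization per server
    if rho >= 1:
        raise ValueError("unstable: utilization must be below 100%")
    tail = offered ** servers / factorial(servers) / (1 - rho)
    head = sum(offered ** k / factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)          # probability a request has to wait
    return p_wait / (servers * service_rate - arrival_rate)


SERVICE_RATE = 1.0   # requests one designer finishes per unit time (assumed)
PER_CREW_RATE = 0.5  # requests one crew generates per unit time (assumed)
CREWS = 10

# Ten dedicated designers, each an independent single-server queue at 50%.
dedicated = mean_wait(PER_CREW_RATE, SERVICE_RATE, servers=1)
# The same total demand served by a shared pool at roughly 71% and 83%.
pooled_7 = mean_wait(PER_CREW_RATE * CREWS, SERVICE_RATE, servers=7)
pooled_6 = mean_wait(PER_CREW_RATE * CREWS, SERVICE_RATE, servers=6)
```

Under these assumptions a request to a dedicated designer waits one full service time on average, while the pool of seven waits roughly 0.16 and the pool of six roughly 0.59. Real design work is not this tidy, so treat the numbers as a direction rather than a staffing plan.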

Given these tools, we can design a hybrid feature crew / matrix organization. Some resources can be feature-aligned and dedicated to their workcells. Other resources can be function-aligned and pooled across workcells. A product group with 10 workcells probably doesn’t need 10 security engineers, 10 user researchers, 10 architects, and 10 database administrators. But there is some right ratio of each of these functions to each other, and those ratios can be determined by value stream analysis, theory of constraints, and other heuristics.