May 02, 2016

A Train to SAFety: An ongoing series; Part 2: Types of trains

In the first post of this series, we discussed the concept of a train: a collection of teams working toward related goals that collaborate on understanding goal dependencies and planning their activities.  A secondary characteristic of trains is that all the teams on a train work in a similar fashion, so their progress can be measured with similar metrics.  These trains are a key part of the program management view of the SAFe framework.  However, while a train is a useful metaphor for a program, not all trains are the same.  This post gives examples of types of trains, how they operate, and the factors that drove us to develop each kind.

Scrum Trains are familiar to most people doing agile software development. Cross-functional teams work in short time boxes, estimating effort, writing code and tests simultaneously, and demonstrating functionality to product owners.  The major metrics are team velocity, stories punted from one sprint to the next, defects opened and closed, and code quality.  This is the kind of train agilists are most familiar with, and all other kinds of trains are best understood by how they differ from Scrum Trains.
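As a simplified illustration of these metrics, here is a minimal Python sketch; the sprint figures and field names are made up for the example, not drawn from a real tool:

    # A minimal sketch of Scrum metrics over hypothetical sprint data.
    sprints = [
        {"committed": 30, "completed": 26},  # story points per sprint
        {"committed": 28, "completed": 28},
        {"committed": 32, "completed": 24},
    ]

    # Velocity: average story points completed per sprint.
    velocity = sum(s["completed"] for s in sprints) / len(sprints)

    # Carryover: points committed but not finished, punted to the next sprint.
    carryover = [s["committed"] - s["completed"] for s in sprints]

    print(f"velocity: {velocity:.1f} points/sprint, carryover: {carryover}")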

Implementation Trains are used for teams that set up, configure, and maintain customers on our software.  These teams handle everything from setting up a new customer (user accounts, integration configurations, etc.) to making minor changes to existing customer configurations, so the amount of work per request is highly variable.  Customers also make requests with varying levels of urgency and lead time, which means that first-in/first-out processing of work items won’t fit.

Therefore, a time-boxed approach, where work is broken into pieces small enough to estimate accurately and new functionality is demonstrated regularly, doesn’t fit well.  However, almost all of the work follows a predictable set of steps, so we’re moving these sorts of teams to a Kanban approach.  Performance is measured by throughput, accuracy of estimation, defects opened and closed, and the frequency of hitting work-in-progress constraints.
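To make the Kanban mechanics concrete, here is a minimal Python sketch; the stage names, WIP limits, and counters are assumptions for illustration, not our actual tooling:

    # A sketch of Kanban tracking with work-in-progress (WIP) limits.
    from datetime import datetime

    WIP_LIMITS = {"in_progress": 3, "review": 2}  # assumed limits

    class Board:
        def __init__(self):
            self.stage_of = {}   # work item -> current stage
            self.history = []    # (item, stage, timestamp) transitions
            self.wip_blocks = 0  # pulls refused because a limit was hit

        def move(self, item, stage, when=None):
            """Pull `item` into `stage`, respecting WIP limits."""
            limit = WIP_LIMITS.get(stage)
            occupancy = sum(1 for s in self.stage_of.values() if s == stage)
            if limit is not None and occupancy >= limit:
                self.wip_blocks += 1  # hitting a WIP constraint
                return False
            self.stage_of[item] = stage
            self.history.append((item, stage, when or datetime.now()))
            return True

        def throughput(self, since):
            """Count items that reached 'done' at or after `since`."""
            return sum(1 for _, s, t in self.history
                       if s == "done" and t >= since)

In a sketch like this, the wip_blocks counter is what surfaces the frequency of hitting work-in-progress constraints, and throughput falls out of the transition history.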

This arrangement removes sprint-related time-box constraints that don’t necessarily align with customer implementation schedules.  It also gives feedback about relative team sizes, and warnings about impending work crunches.  The presence of early review states in a Kanban workflow may even give better advance warning of quality issues at the program level than Scrum does.

Maintenance Trains are used for teams that work on legacy code bases in the low-effort, maintenance portion of their lifecycle.  There are usually no major additions to functionality, and not enough work to support even a single person dedicated to the application.  What work there is consists of a slow accumulation of code and configuration changes made to fix defects or adjust to changes in shared services or infrastructure.  Typically, no other teams depend on work done by maintenance teams, and their dependence on other teams is minimal.  So, the principal questions in measuring team performance are: Were all the issues raised in a given release cycle closed? Was the work done effectively? What was its quality?

These are the slow freight trains of the ecosystem.  When a product owner feels that sufficient work has accumulated to warrant a release to production, the team identifies a release date based on the outstanding work and our release calendar.  Then the work is completed, regression testing is performed, and the release goes to production.

There are definitely weaknesses in these trains.  The fact that requirements can sit on a list for a significant period of time, with knowledge of their drivers getting stale, is a source of concern.  The deferral of testing and acceptance until late in a release cycle is also decidedly anti-agile.  However, the total amount of effort on these trains is usually a small fraction of the program’s overall work, and it is well understood by teams that have been working on these systems for years.  So, trading those risks for the flexibility of working on a system in an intermittent, on-demand fashion can be a net positive.

How best to measure these teams?  More by looking at trends than by comparison to an absolute standard.  Looking at time between releases, level of effort per release, testing effort, and defects opened and closed gives visibility into the health of the code base and can indicate when it might be time to initiate a major technical-debt paydown or a rewrite.
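Here is a sketch of what trend-based measurement might look like; the release history and threshold logic below are invented for illustration:

    # Flag a worsening trend across releases of a maintenance train.
    from statistics import mean

    releases = [  # hypothetical: (days since prior release, defects opened)
        (90, 4), (85, 5), (110, 7), (70, 9), (60, 12),
    ]

    def worsening(series, window=3):
        """True if the recent average exceeds the earlier average."""
        recent = mean(series[-window:])
        earlier = mean(series[:-window]) if len(series) > window else recent
        return recent > earlier

    defect_trend = [defects for _, defects in releases]
    if worsening(defect_trend):
        print("Defects per release are trending up; "
              "consider a technical-debt paydown or rewrite.")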

Platform Trains hold teams that serve multiple functions: they build and maintain the SAFe Architectural Runway.   Depending on the organization, this may involve developing shared libraries, frameworks, and services; building proofs of concept; and supporting continuous integration and delivery, orchestration, and provisioning.  The common thread is that this work is all at least one step removed from immediate business value.  The work operates in both pull (e.g., a product team requests a new feature in a shared library) and push (e.g., a new Gradle plugin is released for adoption) modes, which makes delivery a function of a complicated interplay across teams.  The teams also tend to be cross-functional and staffed with people who have multiple responsibilities, and the amount of effort in a given piece of work is highly variable.  All of this makes maintaining a predictable velocity challenging.

The best fit we’ve found for these trains so far is Kanban.  The variability in the size and nature of work items isn’t a great fit, but having work queues with common, high-level steps, having people pull work items from queues, and measuring how long they stay in the queue does seem to give us some ability to improve resource allocation.
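As an illustration of that queue measurement (the item names, stages, and dates below are invented), time-in-queue per stage can be derived from the transition history:

    # A sketch of time-in-queue measurement from stage transitions.
    from collections import defaultdict
    from datetime import datetime

    # (item, stage, entered_at), ordered in time for each item.
    transitions = [
        ("lib-feature", "requested",   datetime(2016, 4, 1)),
        ("lib-feature", "in_progress", datetime(2016, 4, 8)),
        ("lib-feature", "done",        datetime(2016, 4, 15)),
        ("ci-upgrade",  "requested",   datetime(2016, 4, 3)),
        ("ci-upgrade",  "in_progress", datetime(2016, 4, 20)),
    ]

    # Time in a stage is the gap until the item's next transition.
    by_item = defaultdict(list)
    for item, stage, t in transitions:
        by_item[item].append((stage, t))

    dwell = defaultdict(list)
    for steps in by_item.values():
        for (stage, t0), (_, t1) in zip(steps, steps[1:]):
            dwell[stage].append((t1 - t0).days)

    for stage, days in sorted(dwell.items()):
        print(f"{stage}: avg {sum(days) / len(days):.1f} days in queue")

Long dwell times in an early queue like the hypothetical "requested" stage are the kind of signal that suggests a platform team is under-staffed relative to demand.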

Dependencies between families of trains tend to be of manageable complexity.  Requirements flow down from Implementation Trains to Scrum Trains and Maintenance Trains, which in turn feed requirements to Platform Trains.  Work products tend to flow in the opposite direction - from Platform Trains up to Scrum and Maintenance Trains, and from there to Implementation Trains.  Communication of these requirements and products can be handled through the program increment/release planning process.

Matt Kleiderman

Director of Architecture