Chapter 8. Adopt Lean Engineering Practices
10,000 hosts.2 Amazon, of course, is subject to regulations such as Sarbanes-Oxley and PCI-DSS.
A major reason Amazon has invested in this capability is to make it extremely
cheap and low-risk for employees to design and run safe-to-fail online experiments of the type we describe in Chapter 9 to gather data from real users. In
many cases, running an experiment doesn’t require going through a bureaucratic change request process. This gives Amazon’s cross-functional delivery
teams the ability to test out wild ideas—safe in the knowledge that if something goes wrong, the experiment can be turned off with only a tiny percentage
of users impacted for a very short time.
Despite the name, continuous delivery is not about deploying to production
multiple times a day. The goal of continuous delivery is to make it safe and
economic to work in small batches. This in turn leads to shorter lead times,
higher quality, and lower costs. It’s for these reasons that the HP FutureSmart
team rearchitected their firmware from scratch to minimize the lead time
between code check-in and validated, releasable software. Finally, continuous
delivery results in boring, safe, push-button deployments rather than long,
painful ordeals that must be performed outside of business hours.
This chapter is aimed at readers who wish to understand the principles and
practices behind continuous delivery. For those who want just the high-level
picture, we present an executive summary of lean engineering practices in the
next section. Readers may then skip to the final section of this chapter.
The Fundamentals of Continuous Delivery
Continuous delivery is the ability to get changes—experiments, features, configuration changes, bug fixes—into production or into the hands of users safely
and quickly in a sustainable way. Let’s examine each of those requirements.
In order to ensure deployments are safe, we construct a deployment pipeline which subjects each proposed change to a battery of automated tests
of several different types, followed by manual validations such as exploratory testing and usability testing. We then enable push-button deployments
of validated builds to downstream test and staging environments, and ultimately to production, release to manufacturing, or an app store (depending on the type of software). A major goal of the deployment pipeline is to
detect and reject changes that are risky, contain regressions, or take us outside the envelope of acceptable performance. As a byproduct of
implementing a deployment pipeline, we get an audit trail of where each
change has been introduced, what tests have been run against it, which
environments it has passed through, who deployed it, and so forth. This
information is invaluable as evidence for compliance.
2 According to Jon Jenkins’ talk at Velocity 2011, “Velocity Culture (the unmet challenge in
We must constantly monitor and reduce the lead time for getting changes
into the hands of users. Mary and Tom Poppendieck ask, “How long
would it take your organization to deploy a change that involves just one
single line of code?”3 We reduce lead time by working to simplify and
automate the build, deploy, test, and release process. We must be able to
spin up test environments on demand, deploy software packages to them,
and run comprehensive automated tests of several varieties rapidly in parallel on a grid of compute resources. Using this process, it is possible to get
a high level of confidence that our software is releasable. Typically this
involves architecting (or rearchitecting) with testability and deployability
in mind. An important side effect of this work is that the product team can
get rapid feedback on the quality of their work, and problems are found
soon after they are introduced rather than in later integration and testing
phases when they are more expensive to fix.
The point of all this is to make it economically viable to work in small
batches. The reason large batches of work are released infrequently is
because corralling releases is painful and expensive. The mantra of continuous delivery is: “If it hurts, do it more often, and bring the pain forward.”
If integration, testing, and deployment are painful, we should aim to perform them every time anybody checks anything into version control. This
reveals the waste and inefficiency in our delivery process so we can address
it through continuous improvement. However, to make it economic to
work in small batches, we need to invest in extensive test and deployment
automation and an architecture that supports it.
There are two golden rules of continuous delivery that must be followed by everyone:
1. The team is not allowed to say they are “done” with any piece of work
until their code is in trunk on version control and releasable (for hosted
services the bar is even higher—“done” means deployed to production). In
The Lean Startup, Eric Ries argues that for new features that aren’t simple
user requests, the team must also have run experiments on real users to
determine if the feature achieves the desired outcome.
2. The team must prioritize keeping the system in a deployable state over
doing new work. This means that if at any point we are not confident we
can take whatever is on trunk in version control and deliver it to users
through an automated, push-button process, we need to stop working and
fix that problem.4
3 [poppendieck-06], p. 59.
We should emphasize that following these steps consistently will be hard and
require discipline—even for small, experienced teams.
Enforcing Your Definition of “Done”
The HP FutureSmart managers had a simple rule to help enforce these golden
rules. Whenever anybody wanted to demonstrate a new feature (which was
required to be able to declare it “done”), they would ask if the code had been integrated into trunk, and if the new functionality was going to be demonstrated
from a production-like environment by running automated tests. The demonstration could only proceed if the answer was “yes” to both questions.
In Chapter 6 we discussed the enormous increases in quality and productivity, and the
reductions in cost, that the HP FutureSmart team was able to achieve. These
improvements were made possible by the team putting continuous delivery
principles at the heart of their rebuild. The FutureSmart team eliminated the
integration and testing phases from their software development process by
building integration and testing into their daily work. It was also possible to
shift priorities rapidly in response to the changing needs of product marketing:
We know our quality within 24 hours of any fix going into the system…and we can test broadly even for small last-minute fixes to
ensure a bug fix doesn’t cause unexpected failures. Or we can afford to
bring in new features well after we declare “functionality complete”—
or in extreme cases, even after we declare a release candidate.5
Let’s look at the engineering patterns that enabled the HP FutureSmart team to
achieve their eightfold productivity increase.
4 This is the concept of jidoka in the Toyota Production System as applied to software delivery.
5 [gruver], p. 60.
Continuous Integration and Test Automation
In many development teams, it is common for developers to work on long-lived branches in version control. On small, experienced co-located teams this
can be made to work. However, the inevitable outcome of scaling this process
is “integration hell” where teams spend days or weeks integrating and stabilizing these branches to get the code released. The solution is for all developers to
work off trunk and to integrate their work into trunk at least once per day. In
order to be able to do this, developers need to learn how to break down large
pieces of work into small, incremental steps that keep trunk working.
We validate that trunk is working by building the application or service every
time a change to it is made in version control. We also run unit tests against
the latest version of the code, and give the team feedback within a few minutes
if the build or test process fails. The team must then either fix the problem or
—if the problem cannot be fixed in a few minutes—revert the change. Thus we
ensure that our software is always in a working state during the development process.
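The feedback loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used by any team in this chapter: the `Change` type, the step names, the stand-in lambdas, and the ten-minute `fix_budget_minutes` threshold are all hypothetical (the text says only "a few minutes").

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Change:
    """A change checked into trunk (hypothetical type for this sketch)."""
    revision: str
    author: str


# A CI step is a name plus a function that returns True when the step passes.
Step = Tuple[str, Callable[[Change], bool]]


def first_failure(change: Change, steps: Sequence[Step]) -> Optional[str]:
    """Run the steps in order and return the name of the first one that fails."""
    for name, step in steps:
        if not step(change):
            return name
    return None


def next_action(failed_step: Optional[str],
                estimated_fix_minutes: float,
                fix_budget_minutes: float = 10.0) -> str:
    """Decide what the team does next: carry on, fix forward, or revert."""
    if failed_step is None:
        return "trunk is green"
    if estimated_fix_minutes <= fix_budget_minutes:
        return f"fix {failed_step} now"
    return f"revert (failed {failed_step})"


# Stand-in steps: a real build and test run would replace these lambdas.
steps = [
    ("build", lambda change: True),
    ("unit tests", lambda change: change.revision != "r2"),
]
```

Calling `first_failure(Change("r2", "bo"), steps)` reports the failing step, and `next_action` then applies the rule from the text: fix the problem if that is quick, otherwise revert so that trunk stays in a working state.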
Continuous integration is the practice of working in small batches and using
automated tests to detect and reject changes that introduce a regression. It is,
in our opinion, the most important technical practice in the agile canon, and it
forms the foundation of continuous delivery, for which we require in addition
that each change keeps the code on trunk releasable. However, continuous integration can be
hard to adopt for teams that are not used to it.
In our experience, people tend to fall into two camps: those who can’t understand how it is possible (particularly at scale) and those who can’t believe people could work in any other way. We assure you that it is possible, both at
small scale and large scale, whatever your domain.
Let’s first address the scale problem with two examples. First, the HP
FutureSmart case study demonstrates continuous integration being effective
with a distributed team of 400 people working on an embedded system. Second, we’ll note that almost all of Google’s 10,000+ developers distributed over
40 offices work off a single code tree. Everyone working off this tree develops
and releases from trunk, and all builds are created from source. 20 to 60 code
changes are submitted every minute, and 50% of the codebase changes every
month.6 Google engineers have built a powerful continuous integration system
that, in 2012, was running over 4,000 builds and 10 million test suites
(approximately 60 million tests) every day.7
Not only is continuous integration possible on large, distributed teams—it is
the only process that is known to scale effectively without the painful and
unpredictable integration, stabilization, or “hardening” phases associated with
other approaches, such as release trains or feature branches. Continuous delivery is designed to eliminate these activities.
Fundamentals of Test Automation
As can be seen from the Google and HP FutureSmart examples, continuous integration
relies on comprehensive test automation. Test automation is still controversial in some
organizations, but it is impossible to achieve short lead times and high-quality releases
without it. Test automation is an important and complex topic about which many good
books have been written,8 but here are some of the most important points:
• Test automation is emphatically not about reducing the number of testers—but
test automation does change the role and the skills required of testers. Testers
should be focused on exploratory testing and working with developers to create
and curate suites of automated tests, not on manual regression testing.
• It is impossible to evolve high-quality automated test suites unless testers collaborate with developers in person (irrespective of team or reporting structures). Creating maintainable suites of automated tests requires strong knowledge of software development. It also requires that the software be designed with test automation in mind, which is impossible when developers aren’t involved in testing.
• Test automation can become a maintenance nightmare if automated test suites
are not effectively curated. A small number of tests that run fast and reliably
detect bugs is better than a large number of tests that are flaky or constantly broken and which developers do not care about.
• Test automation must be designed with parallelization in mind. Running tests in
parallel enables developers to get fast feedback and prevents bad practices such
as dependencies between tests.
• Automated tests complement other types of testing such as exploratory testing,
usability testing, and security testing. The point of automated testing is to validate
core functionality and detect regressions so we don’t waste time trying to manually test (or deploy) versions of the software that contain serious problems.
• Reliable automated tests require comprehensive configuration and infrastructure
management. It should be possible to create a production-like virtual test environment on demand, either within the continuous integration environment or on
a developer workstation.
• Only spend time and effort on test automation for products or features once they
have been validated. Test automation for experiments is wasteful.
8 We recommend [freeman] and [crispin].
The main objection to continuous integration comes from developers and their
managers. Breaking every new feature or rearchitecting effort into small
steps is harder than completing it in isolation on a branch, and takes longer if
you are not used to the discipline of working in small batches. That means it
may take longer, at first, to declare stories “dev complete.” This may, in turn,
drive the development velocity down and create the impression that the team’s
efficiency has decreased—raising the blood pressure of development managers.
However, we should not be optimizing for the rate at which we declare things
“done” in isolation on a branch. We should optimize for the overall lead time
—the time it takes us to deliver valuable software to users. Optimizing for
“dev complete” time is precisely what causes “integration hell.” A painful and
unpredictable “last mile” of integration and testing, in turn, perpetuates the
long release cycles that are a major factor in project overruns, poor quality
software, higher overall costs, and dissatisfied users.
Are You Really Doing Continuous Integration?
Continuous integration (CI) is hard, and in our experience most teams that say they are
practicing it actually aren’t. Achieving CI is not simply a case of installing and running a
CI tool; it is a mindset. One of our favorite papers on CI discusses how to do it without
any CI tool at all—using just an old workstation, a rubber chicken, and a bell (of course
you’ll need more than that on a large development team, but the principles are the
same at scale).9
To find out if you’re really doing CI, ask your team the following questions:
• Are all the developers on the team checking into trunk (not just merging from
trunk into their branches or working copies) at least once a day? In other words,
are they doing trunk-based development and working in small batches?
• Does every change to trunk kick off a build process, including running a set of
automated tests to detect regressions?
• When the build and test process fails, does the team fix the build within a few
minutes, either by fixing the breakage or by reverting the change that caused the
build to break?
If the answer to any of these questions is “no,” you aren’t practicing continuous integration. In particular, reverting bad changes is an insufficiently practiced technique. At
Google, for example, anyone is empowered to revert a bad change in version control,
even if it was made by someone on a different team: they prioritize keeping the system
working over doing new work.
9 James Shore’s Continuous Integration on a Dollar a Day.
Of course, if you are in the middle of work on a large application and are using lots of branches,
it’s not easy to move to continuous integration. In this situation, the goal should be to
push teams towards working on trunk, starting with the most volatile branches. In one
large organization, it took a year to go from 100 long-lived branches down to about
The Deployment Pipeline
Recall the second golden rule of continuous delivery: we must prioritize keeping the system working over doing new work. Continuous integration is an
important step towards this goal—but, typically, we wouldn’t feel comfortable
exposing to users software that has only passed unit tests.
The job of the deployment pipeline is to evaluate every change made to the system, to detect and reject changes which carry high risks or negatively impact
quality, and to provide the team with timely feedback on their changes so they
can triage problems quickly and cheaply. It takes every check-in to version control, creates packages from that version that are deployable to any environment, and performs a series of tests against that version to detect known
defects and to verify that the important functionality works. If the package
passes these tests, we should feel confident deploying that particular build of
the software. If any stage of the deployment pipeline fails, that version of the
software cannot progress any further, and the engineers must immediately triage to find the source of the problem and fix it.
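The behavior described above (every build runs through a series of stages, a failure stops promotion, and the results form a record) can be sketched as follows. This is a minimal illustration under assumptions of our own: the `Build` type, the stage names, and the stand-in checks are hypothetical, and real pipeline tools offer far more than this.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Build:
    """A package created from one check-in (hypothetical type for this sketch)."""
    revision: str
    audit_trail: List[Tuple[str, str]] = field(default_factory=list)


# A stage is a name plus a check that returns True when the build passes it.
Stage = Tuple[str, Callable[[Build], bool]]


def run_pipeline(build: Build, stages: List[Stage]) -> bool:
    """Run stages in order, record each result, and stop at the first failure."""
    for name, check in stages:
        passed = check(build)
        build.audit_trail.append((name, "passed" if passed else "failed"))
        if not passed:
            return False  # this build cannot progress any further
    return True  # this build is a candidate for push-button deployment


# Stand-in checks; real stages would create packages and run test suites.
stages: List[Stage] = [
    ("unit tests", lambda build: True),
    ("acceptance tests", lambda build: build.revision != "bad"),
    ("performance tests", lambda build: True),
]

good, bad = Build("abc123"), Build("bad")
good_ok = run_pipeline(good, stages)
bad_ok = run_pipeline(bad, stages)
```

Here `bad.audit_trail` ends at the failed acceptance stage, so the later stage never runs for that build, while `good.audit_trail` holds one entry per stage. Keeping the trail on the build itself is one simple way to get the record of which tests ran against which version.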
Even the simplest deployment pipeline, such as that shown in Figure 8-1 (a
more complex deployment pipeline is shown in Figure 8-2), enables members
of the team to perform push-button deployments of builds that have passed CI
to production-like exploratory testing or user acceptance testing environments.
It should be possible to provision test environments and deploy any good CI
build to them using a fully automated process. This same process should be
used to deploy to production.
Figure 8-1. Changes moving through a simple deployment pipeline
The deployment pipeline connects together all the steps required to go from
check-in to deployment to production (or distribution to an app store). It also
connects all the people involved in delivering software—developers, testers,
release engineers, and operations—which makes it an important communication tool.
Figure 8-2. A more complex deployment pipeline
The FutureSmart Deployment Pipeline
The FutureSmart team’s deployment pipeline allows a 400-person distributed team to
integrate 100–150 changes—about 75–100 thousand lines of code—into trunk on
their 10-million-line codebase every day. Each day, the deployment pipeline produces
10–14 good builds of the firmware out of Level 1. All changes—including feature
development and large-scale changes—are made on trunk. Developers commit into
trunk several times every week.
All changes to any system—or the environments it runs in—should be made
through version control and then promoted via the deployment pipeline. That
includes not just source and test code but also database migrations and deployment and provisioning scripts, as well as changes to server, networking, and
The deployment pipeline thus becomes the record of which tests have been run
against a given build and what the results were, what builds have been
deployed to which environments and when, who approved promotion of a particular build and when, what exactly the configuration of every environment is
—indeed the whole lifecycle of code and infrastructure changes as they move
through various environments.
This, in turn, means that a deployment pipeline implementation has several
other important uses besides rejecting high-risk or problematic changes to the system:
• You can gather important information on your delivery process, such as
statistics of the cycle time of changes (the mean, the standard deviation),
and discover the bottlenecks in your process.
• It provides a wealth of information for auditing and compliance purposes.
Auditors love the deployment pipeline because it allows them to track
every detail of exactly which commands were run on which boxes, what
the results were, who approved them and when, and so forth.
• It can form the basis of a lightweight but comprehensive change management process. For example, Australia’s heavily regulated National Broadband Network telco used a deployment pipeline to automatically submit
change management tickets when changes were made to the production
infrastructure, and to automatically update their CMDB when provisioning new systems and performing deployments.10
• It enables team members to perform push-button deployments of the build
of their choice to the environment of their choice. Tools for implementing
deployment pipelines typically allow such approvals to be issued on a
per-environment basis and support workflows around build promotion.
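As an illustration of the first use in the list above, the following Python sketch computes the mean and standard deviation of cycle time and picks out the slowest stage. All timestamps, stage names, and durations are made-up sample data, and the function names are hypothetical; in practice the figures would come from whatever records your pipeline tool keeps.

```python
from datetime import datetime
from statistics import mean, stdev
from typing import Dict, List, Tuple


def cycle_time_hours(checked_in: datetime, deployed: datetime) -> float:
    """Elapsed hours between check-in and deployment of one change."""
    return (deployed - checked_in).total_seconds() / 3600


def slowest_stage(stage_minutes: Dict[str, List[float]]) -> str:
    """Return the stage with the highest mean duration (a likely bottleneck)."""
    return max(stage_minutes, key=lambda stage: mean(stage_minutes[stage]))


# Made-up sample data: (check-in time, deployment time) for three changes.
changes: List[Tuple[datetime, datetime]] = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 13, 0)),
    (datetime(2024, 1, 2, 9, 0), datetime(2024, 1, 2, 15, 0)),
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 17, 0)),
]
hours = [cycle_time_hours(start, end) for start, end in changes]
average_hours = mean(hours)
spread_hours = stdev(hours)  # sample standard deviation

# Made-up per-stage durations in minutes for the same three changes.
stage_minutes = {
    "unit tests": [4.0, 5.0, 6.0],
    "acceptance tests": [40.0, 55.0, 46.0],
    "deploy to staging": [9.0, 11.0, 10.0],
}
bottleneck = slowest_stage(stage_minutes)
```

With this sample data the cycle times are 4, 6, and 8 hours, and the acceptance tests stand out as the slowest stage. Tracking the spread as well as the mean matters because a highly variable cycle time makes delivery dates hard to predict even when the average looks acceptable.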
Continuous Delivery and Change Control
Many enterprises have traditionally used change advisory boards or similar change
control systems as a way to reduce the risk of changes to production environments.
However, the 2014 State of DevOps Report,11 which surveyed over 9,000 individuals
across many industries, discovered that approval processes external to development
teams do little to improve the stability of services (measured in terms of time to restore
service and percentage of failed changes), while acting as a significant drag on
throughput (measured in terms of lead time for changes and change frequency). The
survey compared external change approval processes with peer-review mechanisms
such as pair programming or the use of pull requests. Statistical analysis revealed that
when engineering teams held themselves accountable for the quality of their code
through peer review, lead times and release frequency improved considerably with
negligible impact on system stability. Further data from the report, which supports the
use of the techniques discussed in this chapter, is presented in Chapter 14.
The data suggests that it is time to reconsider the value provided by heavyweight
change control processes. Peer review of code changes combined with a deployment
pipeline provides a powerful, safe, auditable, and high-performance replacement for
external approval of changes. The National Broadband Network case study (referenced
above) shows one method to implement a lightweight change control process which is
compatible with frameworks such as ITIL in a regulated environment. For more on
compliance and risk management, see Chapter 12.
10 See http://puppetlabs.com/blog/a-deployment-pipeline-for-infrastructure/
Implementing continuous delivery requires thinking carefully about systems
architecture and process and doing a certain amount of upfront planning. Any
manual activities which are repeated should be considered potential waste and
thus candidates for simplification and automation. This includes:
It should be possible to create packages from source, deployable to any
environment, in a single step using a script that is stored in version control
and can be run by any developer.
Anybody should be able to self-service a test environment (including network configuration, host configuration, any required software and applications) in a fully automated fashion. This process should also use information and scripts that are kept in version control. Changes to environment
configuration should always be made through version control, and it
should be cheap and painless to kill existing boxes and re-provision from
Anybody should be able to deploy application packages to any environment they have access to using a fully automated process which uses
scripts kept in version control.
It should be possible for any developer to run the complete automated test
suite on their workstation, as well as any selected set of tests. Test suites
should be comprehensive and fast, and contain both unit and acceptance-level tests.
We require, as a foundation for automation, excellent configuration management. In particular, everything required to reproduce your production system
and to build, test, and deploy your services needs to be in version control. That
means not just source code but build, test, and deployment scripts, infrastructure and environment configuration, database schemas and migration scripts,
as well as documentation.