This section presents a set of CD-related patterns and anti-patterns in a logical order, from delivery to deployment, along with some requirements that are external to the automated process but necessary to facilitate its adoption.
Delivery
The delivery phase is an extension of continuous integration: all code changes are automatically deployed to a test environment to qualify the source code and verify the compliance of a version before its deployment into production.
Automate Testing
CD lets developers automate testing beyond just unit tests so they can verify application updates across multiple dimensions before deploying to production. These tests can take different forms:
- UI testing
- Load testing
- Integration testing
- Regression testing
- API reliability testing
The main goal is to help developers thoroughly validate updates and preemptively discover issues to minimize impact on the customer experience. As an important part of the DevOps methodology, the automated testing phase is an excellent opportunity to break down silos by combining the efforts of development, quality assurance (QA), and operations teams to improve the reliability of the deliverable.
Pattern: Automate the verification and validation of software to include unit, component, capacity, functional, and deployment tests
Anti-Pattern: Use manual tests to verify and validate software
Mock the Environment
Mocking means creating a fake version of an external or internal service that can replace the real one so that developers can test the source code faster and more reliably, using unit tests, for example. Mocking is important for ensuring the portability and reproducibility of the CI pipeline. In CD, these same processes can also be used by the operations and QA teams to perform various tests.
Pattern: Run tests the same way on any platform (laptop, on-premises, cloud, container orchestration platform, etc.) to always have the same result with mocked data
Anti-Pattern: Run tests that will potentially fail based on the status of the infrastructure
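As an illustration, here is a minimal sketch of this idea in Python, using the standard library's unittest.mock: a mock stands in for an external payment service so the test result never depends on the state of the infrastructure. The CheckoutService class and its charge call are hypothetical, invented for this example:

```python
from unittest import TestCase, main
from unittest.mock import Mock

class CheckoutService:
    """Business logic under test; depends on an external payment service."""
    def __init__(self, payment_client):
        self.payment_client = payment_client

    def checkout(self, amount):
        # Delegate to the external service and normalize its response.
        response = self.payment_client.charge(amount)
        return "confirmed" if response["status"] == "ok" else "failed"

class CheckoutTest(TestCase):
    def test_checkout_succeeds_with_mocked_payment_service(self):
        # The mock replaces the real payment API: no network, no flakiness.
        fake_client = Mock()
        fake_client.charge.return_value = {"status": "ok"}

        service = CheckoutService(fake_client)
        self.assertEqual(service.checkout(42), "confirmed")
        fake_client.charge.assert_called_once_with(42)

if __name__ == "__main__":
    main()
```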
Define Release Conventions
Convention definitions are necessary in any work involving multiple teams, such as in the case of automated release management processes. Conventions make it possible to easily identify the status of the deliverable in its lifecycle, facilitate the automation of each step, and guide the onboarding of new people.
A deliverable can have several conventions:
- Package naming – An application's name separated by dashes (e.g., kube-prometheus-stack-35.4.2)
- Package suffix:
  - SNAPSHOT – Package deployed in the development environment
  - RC (Release Candidate) – Package deployed in the QA testing environment
  - RELEASE – Package ready for production
- Semantic versioning convention – Standardize the distribution of any artifacts based on code changes
Pattern: Define an enterprise-wide release convention that all development teams follow to standardize artifact management and facilitate automation
Anti-Pattern: Overwrite the previous artifact instead of incrementing the application version
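To make these conventions concrete, below is a minimal Python sketch of a semantic version bump and suffix-based package naming. The helper names, the stage-to-suffix mapping, and the change categories are illustrative assumptions, not prescriptions:

```python
import re

SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")

def bump(version: str, change: str) -> str:
    """Increment a semantic version according to the type of code change."""
    major, minor, patch = map(int, SEMVER.match(version).groups())
    if change == "breaking":      # incompatible API change
        return f"{major + 1}.0.0"
    if change == "feature":       # backward-compatible functionality
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"  # bug fix

def artifact_name(app: str, version: str, stage: str) -> str:
    """Compose a package name following the naming and suffix conventions."""
    suffix = {"dev": "SNAPSHOT", "qa": "RC", "prod": "RELEASE"}[stage]
    return f"{app}-{version}-{suffix}"

print(bump("35.4.2", "feature"))                        # 35.5.0
print(artifact_name("kube-prometheus-stack", "35.5.0", "qa"))
# kube-prometheus-stack-35.5.0-RC
```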
Promote Artifacts
In order to promote artifacts from one environment to another, you must first decouple the application from its configuration. Artifact promotion is also what separates the CI process (build) from the CD process (deploy, run), in keeping with the twelve-factor app principles. Promoting an artifact consists of building the deliverable once, using that same build in all environments, and changing only the configuration.
Pattern: Build once and promote the same artifact from environment to environment, while using different configurations each time
Anti-Pattern: Rebuild the deliverable for each environment
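As a rough illustration of the pattern, the sketch below builds an artifact once, identifies it by an immutable content digest, and deploys that same digest everywhere, varying only the configuration. All names and the per-environment settings are hypothetical:

```python
import hashlib

def build_once(source: bytes) -> str:
    """Build the deliverable a single time; identify it by content digest."""
    return hashlib.sha256(source).hexdigest()

# Per-environment configuration lives outside the artifact.
CONFIGS = {
    "staging": {"db_url": "postgres://staging-db/app", "replicas": 1},
    "production": {"db_url": "postgres://prod-db/app", "replicas": 3},
}

def deploy(digest: str, environment: str) -> None:
    """Deploy the *same* artifact everywhere; only configuration varies."""
    config = CONFIGS[environment]
    print(f"deploying artifact {digest[:12]} to {environment} with {config}")

digest = build_once(b"application source tree")  # CI: build exactly once
deploy(digest, "staging")                        # CD: promote, never rebuild
deploy(digest, "production")
```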
Manage Hotfixes
A hotfix is generally defined as a patch to a live system due to a bug or vulnerability that meets a certain level of risk and severity. Normally, a hotfix is created as an urgent action against problems that need to be fixed immediately and outside of the normal git workflow. As part of a software development cycle, the development team should have a flexible definition of a hotfix and an internal method for determining what qualifies as one. When a critical bug in a production version must be resolved, a hotfix branch may be forked off from the tag on the main branch that marks the production version. That way, the team members can continue working on the development branch while another person prepares a quick production fix.
Pattern: Deploy a hotfix as soon as possible; test the code in staging before moving it to production
Anti-Pattern: Schedule the deployment of a hotfix; test directly in production
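As a minimal sketch of the branching step described above, the following Python snippet drives standard git commands through subprocess to fork a hotfix branch off the tag that marks the current production version. The tag and branch naming scheme is an illustrative assumption:

```python
import subprocess

def create_hotfix_branch(production_tag: str, issue_id: str) -> str:
    """Branch a hotfix off the tag marking the current production version,
    leaving the development branch untouched."""
    branch = f"hotfix/{issue_id}"
    # Fetch tags so the production tag is available locally.
    subprocess.run(["git", "fetch", "--tags"], check=True)
    # Create the hotfix branch directly from the production tag.
    subprocess.run(["git", "checkout", "-b", branch, production_tag], check=True)
    return branch

# Example: patch the version currently tagged v35.4.2 in production.
# create_hotfix_branch("v35.4.2", "BUG-1234")
```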
Deployment
The deployment phase takes the features validated by the QA team in a staging environment and flags them as ready for production — whether for manual or automatic deployment into the production environment.
Automated Deployment
With CD, software is built in a way that enables it to be released into production at any time. To do this, a CD pipeline involves production-type test environments in which a new software version is automatically deployed for testing. Once the code is validated, the CD process requires human intervention to approve production deployments, and then the deployment itself is done by automation.
Pattern: Build your binaries once, while deploying the binaries to multiple target environments as necessary
Anti-Pattern: Build software in every stage of the deployment pipeline
Blue-green deployment is a technique that eliminates downtime and reduces risk by running two identical production environments, called blue and green. At any given time, only one of the environments is live, with the live environment serving all production traffic. For example, let's say blue is currently active and green is inactive. When preparing a new version of software, the deployment and the final stage of testing take place in the environment that is not live: in this example, green. Once the software is deployed and fully tested in green, all the incoming requests can be routed to green instead of blue. Green is now live, and blue is inactive.
When using this technique, if something unexpected happens with the new version on green, the traffic can immediately return to the latest version in the blue environment.
Canary deployment is a strategy of deploying a software version incrementally. The idea is to first deploy the change to a small subset of servers, test it, and then deploy the change to the rest of the servers. The target environment's infrastructure is updated in small phases (e.g., 10% > 25% > 75% > 100%) to limit the impact of downtime on users: If the canary deployment fails, the rest of the users or servers are not impacted.
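The phased rollout can be pictured with a short sketch like the one below, which shifts traffic in the same 10% > 25% > 75% > 100% increments and aborts on an elevated error rate. The traffic-shifting and error-rate helpers are placeholders for whatever load balancer and monitoring APIs are actually in use:

```python
import time

PHASES = [10, 25, 75, 100]          # percentage of traffic on the new version
ERROR_THRESHOLD = 0.01              # abort if more than 1% of requests fail

def set_traffic_split(canary_percent: int) -> None:
    """Route the given share of traffic to the canary (placeholder)."""
    print(f"routing {canary_percent}% of traffic to the new version")

def canary_error_rate() -> float:
    """Fetch the canary's observed error rate (placeholder)."""
    return 0.0

def canary_rollout() -> bool:
    for percent in PHASES:
        set_traffic_split(percent)
        time.sleep(1)               # in practice: soak time between phases
        if canary_error_rate() > ERROR_THRESHOLD:
            set_traffic_split(0)    # return all traffic to the stable version
            return False
    return True                     # 100% of traffic on the new version

canary_rollout()
```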
Feature Flags
Within the context of CD, feature flags allow developers to release software faster with less risk and more control. Feature flags play a key part in CD schemes where features are constantly being deployed but are not necessarily available to all users in production. It's important to treat every change equally, meaning that even in an emergency, developers must avoid making changes or performing other work outside the scope of the CD pipeline. Feature flags support this practice by making it possible to enable a feature when it is needed and disable it when it is not.
Pattern: Deploy new features or services to production but limit access dynamically for testing purposes
Anti-Pattern: Wait until a feature is fully complete before committing the source code
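A minimal feature-flag sketch follows: the new code path ships to production but stays dark until its flag is enabled, and exposure can be limited to a targeted group. The flag store, flag name, and group names are hypothetical:

```python
# Flag state would normally come from a flag service or configuration store.
FLAGS = {
    "new-checkout": {"enabled": True, "allowed_groups": {"beta-testers"}},
}

def is_enabled(flag: str, user_groups: set) -> bool:
    """A feature is visible only if its flag is on and the user is targeted."""
    state = FLAGS.get(flag, {"enabled": False, "allowed_groups": set()})
    return state["enabled"] and bool(state["allowed_groups"] & user_groups)

def render_checkout(user_groups: set) -> str:
    if is_enabled("new-checkout", user_groups):
        return "new checkout flow"    # deployed, but gated behind the flag
    return "legacy checkout flow"

print(render_checkout({"beta-testers"}))  # new checkout flow
print(render_checkout({"customers"}))     # legacy checkout flow
```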
GitOps
GitOps is a methodology that aims to minimize the time and effort spent coordinating between development and operations team members. The main components of the GitOps methodology are:
- A versioning source control tool that acts as a single source of truth for declarative infrastructure and app configurations
- An automated process to match the production environment to the state described in the repository
GitOps improves continuous delivery by empowering the process with verifiable and auditable changes, automated deployments, and rollbacks in case of failure.
Pattern: Write all deployment processes in scripts, check them into the version control system, and run the scripts as part of the single delivery system
Anti-Pattern: Use deployment documentation instead of automation; perform manual deployments or partially manual deployments
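The heart of the methodology can be summarized in a small reconciliation sketch: an agent repeatedly compares the desired state declared in the Git repository with the observed state of the environment and converges the latter toward the former. Every function below is an illustrative placeholder:

```python
def desired_state() -> dict:
    """Read the declarative configuration from the Git repository (placeholder)."""
    return {"app": "web", "image": "web:35.5.0", "replicas": 3}

def observed_state() -> dict:
    """Query the live environment for its current state (placeholder)."""
    return {"app": "web", "image": "web:35.4.2", "replicas": 3}

def apply(state: dict) -> None:
    """Drive the environment toward the desired state (placeholder)."""
    print(f"applying {state}")

def reconcile() -> None:
    desired, observed = desired_state(), observed_state()
    if desired != observed:
        # Git is the single source of truth: the environment is changed,
        # never the other way around.
        apply(desired)

reconcile()
```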
Disposable/Preview Environments
A disposable environment is an on-demand environment based on the automation of cloud infrastructure provisioning, configuration, deployment, and deletion processes. It relies on CD automation to deploy software the same way as done in another environment. This practice can be complex because it requires a good understanding of the application architecture to deploy all its dependencies.
Disposable environments have different roles:
- Create a reliable and repeatable end-to-end test configuration in a similar but separate on-demand environment, such as a production architecture-based disposable environment, to test a new functionality
- Create a dedicated environment for support engineers to replicate customer bugs with exact versions of languages and dependencies
- Start a new stable environment dedicated to product demonstrations for customers
Pattern: Utilize the automated provisioning, scripted deployment, and scripted database patterns; any environment should be capable of terminating and launching at will
Anti-Pattern: Fix environments to "DEV," "QA," or other predetermined environments
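One way to picture a disposable environment is as a context manager that provisions on entry and always deletes on exit, so the environment never outlives its purpose. The provisioning, deployment, and teardown calls below are placeholders for real infrastructure automation:

```python
from contextlib import contextmanager
import uuid

def provision(env_id: str) -> None:
    print(f"provisioning infrastructure for {env_id}")   # placeholder

def deploy_application(env_id: str) -> None:
    print(f"deploying application stack to {env_id}")    # placeholder

def destroy(env_id: str) -> None:
    print(f"deleting every resource of {env_id}")        # placeholder

@contextmanager
def disposable_environment(purpose: str):
    env_id = f"{purpose}-{uuid.uuid4().hex[:8]}"
    provision(env_id)
    deploy_application(env_id)
    try:
        yield env_id
    finally:
        destroy(env_id)  # teardown runs even if the tests fail

# Reproduce a customer bug in an isolated, production-like environment.
with disposable_environment("bug-repro") as env:
    print(f"running end-to-end tests in {env}")
```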
A/B Testing
Experimentation and feature management must work hand in hand. Experimentation is an important way for a company to validate ideas before launching new products. For development teams using CI and CD processes, flags that control the rollout of new experiences mitigate the risk of launching an unproven change to everyone at the same time. By first running an A/B test on a portion of the traffic, developers can test and gradually optimize a new feature. Once the best user experience has been achieved, it can be deployed in a controlled manner across the entire customer base to reduce the risk of technical issues related to the publishing process.
Pattern: Automate tests to verify the behavior of the application; continually run these tests to provide near real-time benchmarking
Anti-Pattern: Never benchmark the performance of new features
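A common implementation detail worth sketching is deterministic bucketing: hashing the user ID assigns each user a stable variant, so a configured share of traffic sees the experiment consistently across sessions. The traffic split value is an arbitrary example:

```python
import hashlib

EXPERIMENT_TRAFFIC_PERCENT = 10   # share of users exposed to variant B

def variant_for(user_id: str) -> str:
    """Deterministically assign a user to a variant; the same user always
    lands in the same bucket, keeping the experiment consistent."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return "B" if bucket < EXPERIMENT_TRAFFIC_PERCENT else "A"

assignments = [variant_for(f"user-{i}") for i in range(1000)]
print(assignments.count("B"), "of 1000 users see the new experience")
```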
Permission Management
Controlling access to automation tools used in a CD pipeline is necessary as these processes provide a measure of stability and resilience for an application, and they often have access to confidential data (e.g., company source code, passwords). Securing their access is essential to avoid any human error due to a lack of attention or malicious behavior that could negatively impact the company.
Pattern: Secure all actions on the deployment job's orchestration platform by defining a role-based access policy to scripts and pipelines
Anti-Pattern: Allow anyone to launch any job at any time without controls or checkpoints
Four-Eyes Principle
The four-eyes principle means that any activity by an individual within the organization must be approved by at least two people. This control mechanism is used to facilitate delegation of authority and increase transparency. The approach ensures process efficiency by allowing rapid decision-making under effective monitoring and control, and it also drives a cultural change that minimizes risk.
Pattern: Define at least two reviewers for each project and apply a rule to obtain at least one approval to deploy new code into production
Anti-Pattern: Bypass peer validation and deploy changes
Rollback
The process of rolling back means returning the system to its last working state. This ensures a system can be restored immediately after a failure, minimizing disruption to the business.
Automate Processes
Designing a CD pipeline requires determining predefined steps to roll back the application manually or automatically after a deployment failure. The rollback process occurs only for components of an application that were not skipped during the last deployment. Combining an AIOps platform with the CI/CD pipeline can form a truly automated deployment pipeline that not only facilitates deployment but can also verify and recover from any anomalies or failures that are observed in a production environment. Reducing downtime, therefore, minimizes the impact on customers.
Pattern: Provide an automated single-command rollback of changes after an unsuccessful deployment
Anti-Pattern: Manually undo changes applied in a recent deployment; shut down production instances while changes are undone
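A single-command rollback presupposes that the pipeline records every release, so the last working version is always known. The sketch below illustrates the idea with an in-memory release history and a placeholder deploy call:

```python
RELEASE_HISTORY = ["web:35.3.0", "web:35.4.2", "web:35.5.0"]  # newest last

def deploy(image: str) -> None:
    print(f"deploying {image}")                # placeholder deployment call

def rollback() -> str:
    """Return production to its last working state with one command."""
    if len(RELEASE_HISTORY) < 2:
        raise RuntimeError("no previous release to roll back to")
    RELEASE_HISTORY.pop()                      # discard the failed release
    previous = RELEASE_HISTORY[-1]
    deploy(previous)                           # redeploy the known-good build
    return previous

print("rolled back to", rollback())            # rolled back to web:35.4.2
```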
Dissociate Configuration from Code
Adding flexibility to the application's management strategy is essential in a world where dynamic platforms (e.g., for container orchestration) are becoming more prominent and widely used. Source code and configuration files are two distinct components and should have their own management strategy. Source code should work the same way in all environments because it is immutable. Configuration files are an external part of the application that matter only during execution and can be overridden before starting the application.
Dissociating configuration files and source code makes it easier to maintain and secure them for many reasons, including:
- Operations can update the contents of a configuration file without changing the application's source code.
- The same artifact can be promoted without having to rebuild it.
- The application is easily portable into a new environment.
- Security teams can better audit the access to sensitive data.
Pattern: Capture changes between environments as configuration information; externalize all variable values from the application configuration into build/deployment-time properties
Anti-Pattern: Hardcode values inside the source code or per target environment
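In twelve-factor style, this usually means reading every environment-specific value from the execution environment at startup, as in the minimal sketch below. The variable names and defaults are illustrative:

```python
import os

class Settings:
    """All environment-specific values come from the execution environment,
    never from the source code or the artifact itself."""
    def __init__(self):
        self.db_url = os.environ.get("APP_DB_URL", "sqlite:///local.db")
        self.log_level = os.environ.get("APP_LOG_LEVEL", "INFO")

settings = Settings()
print(f"connecting to {settings.db_url} at log level {settings.log_level}")

# The same artifact behaves differently per environment:
#   staging:    APP_DB_URL=postgres://staging-db/app python app.py
#   production: APP_DB_URL=postgres://prod-db/app    python app.py
```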
Release Your Data
Databases are the cornerstones of all modern software projects; no project of any scale beyond a prototype can function without some form of a database. Continuous database integration is the integration of the database schema and logical changes in application development efforts. Applying the same principles of integration and deployment patterns to the database allows all database changes to flow through the pipeline of each software version, synchronized with the application code.
The main goal is to keep a release's code aligned with the schema of a database, which is essential when launching a new feature, and even more crucial during a rollback, where backward compatibility must be ensured.
Pattern: Ensure your application is backward and forward compatible with your database so you can deploy and roll back each independently
Anti-Pattern: Keep a strong dependence between the database schema and the application source code, not being able to deploy one without the other
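One widely used way to achieve this compatibility is the expand/contract technique, sketched below against an in-memory SQLite database: the schema is first expanded with an optional column so old and new application versions coexist, and the contraction is deferred to a later release. The table and data are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")
conn.execute("INSERT INTO users (fullname) VALUES ('Ada Lovelace')")

# Expand: add the new column as nullable so the *old* application version
# (which never writes it) keeps working against the new schema.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")

# Backfill: migrate existing rows while both versions are still deployed.
conn.execute("UPDATE users SET email = 'unknown@example.com' WHERE email IS NULL")

# The new version can now read and write the new column; rolling back to the
# old version still works because the column is optional. Contracting the
# schema (dropping 'fullname', adding NOT NULL, etc.) happens in a later
# release, once no deployed version depends on the old shape.
for row in conn.execute("SELECT id, fullname, email FROM users"):
    print(row)
```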
Observability
Observability is now an essential component of any architecture to effectively manage a system, determine whether it is working properly, and decide what needs to be fixed, modified, or improved on any level, CD processes included.
Monitoring
The impact of collecting deployment pipeline metrics is oftentimes minimized or forgotten. DevOps indicators are data points that directly reveal the performance of a development pipeline and help to quickly identify and eliminate any bottlenecks in the CD process. These metrics can be used to track the progress of a DevOps transition or measure the adoption of automated processes and associated tools by development teams.
Four metrics that are important to measure:
- Lead time for changes – The time needed to push new changes to production.
- Deployment frequency – How often builds are deployed to an environment.
- Change failure rate – How many changes result in defects.
- Mean time to recovery – The time required to recover from failure.
All these metrics are essential for measuring the speed at which teams can correct existing bugs, develop new functionality, and deploy to production. Improving them means projects that are better organized and better managed, which in turn strengthens a company's competitive advantage.
Pattern: Deploy software one instance at a time while conducting behavior-driven monitoring; if an error is detected during the incremental deployment, a rollback release is initiated to revert changes
Anti-Pattern: Conduct non-incremental deployments without monitoring
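These four metrics can be derived from the deployment records the pipeline already produces. The sketch below computes them from a hypothetical record structure; the fields and observation window are assumptions for the example:

```python
from datetime import datetime, timedelta

# Illustrative deployment records collected from the pipeline.
deployments = [
    {"committed": datetime(2023, 5, 1, 9), "deployed": datetime(2023, 5, 1, 15),
     "failed": False, "recovered": None},
    {"committed": datetime(2023, 5, 2, 10), "deployed": datetime(2023, 5, 2, 18),
     "failed": True, "recovered": datetime(2023, 5, 2, 19)},
]

# Lead time for changes: commit-to-production duration, averaged.
lead_times = [d["deployed"] - d["committed"] for d in deployments]
lead_time = sum(lead_times, timedelta()) / len(lead_times)

# Deployment frequency over the observation window.
period_days = 7
frequency = len(deployments) / period_days

# Change failure rate: share of deployments that caused a defect.
failures = [d for d in deployments if d["failed"]]
change_failure_rate = len(failures) / len(deployments)

# Mean time to recovery for the failed deployments.
recovery_times = [d["recovered"] - d["deployed"] for d in failures]
mttr = sum(recovery_times, timedelta()) / len(recovery_times)

print(f"lead time for changes: {lead_time}")
print(f"deployment frequency:  {frequency:.2f}/day")
print(f"change failure rate:   {change_failure_rate:.0%}")
print(f"mean time to recovery: {mttr}")
```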
Logging
Every year, companies across industries adopt the DevOps philosophy, but without the necessary data to support their decisions, every deployment can be a risk. One way to mitigate such risk and allow for quick and safe changes — to both overall deployment processes and the software itself — is to enrich the continuous delivery processes with log data coupled with strategic log management.
Logs can offer insight into a specific event in time. Log data can also be used to forensically identify any potential issues proactively before they cause real problems in a system. But logs — especially in modern, hyperconnected, ecosystem-based services — require appropriate optimization to be effective:
- Structure logs – Logging in an unstructured format dramatically increases the complexity of detecting patterns in your logs. Using a JSON format is highly recommended.
- Classify logs – Each log must have an assigned severity level so that a potential source of error can be identified quickly and the event linked to other metrics.
- Create actionable alerts – A correctly formatted log line brings more context to the information than a simple metric, making it easier to interpret an error and the actions needed to resolve it.
- Centralize logs – Forwarding system and application logs onto a single platform has several advantages, including accessibility, readability, and ease of correlation with other events.
Pattern: Define an intelligent alert process based on the CD pipeline logs to notify the team capable of resolving the problem
Anti-Pattern: Leave logs on the servers and wait for someone to notify others of an error present in the pipeline
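A minimal sketch of structured, classified logging with only the Python standard library: each record is emitted as a single JSON line with a severity level, ready to be forwarded to a central platform. The logger name and fields are illustrative:

```python
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    """Render every log record as a single structured JSON line."""
    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,        # severity for classification
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler(sys.stdout)   # stdout for a log forwarder
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("cd-pipeline")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("deployment started")
logger.error("health check failed after deployment")
# {"timestamp": "...", "level": "ERROR", "logger": "cd-pipeline", ...}
```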
Auditing
DevOps principles like CD help teams develop and deploy applications at a higher velocity. While software delivery speed is an advantage, it is equally critical to ensure the software delivery pipeline is compliant with governance policies and industry standards. Audits are intended to ensure the effectiveness of controls put in place (e.g., dependency verification, code quality control). A way to help mitigate the risk of noncompliance is to aggregate data generated and collected throughout CI and CD workflows into easy-to-read reports that address customers' security and auditing requirements.
Pattern: Produce weekly reports on applications created and deployed; schedule a monthly review with development teams to proactively determine impact on security policies
Anti-Pattern: Wait for the annual infrastructure audit to determine if an application is no longer compliant
External Reporting
We often think about observability in technical terms, but external reporting with data that can be understood by all teams and stakeholders, from technical to executive, is a huge added value. For instance, knowing how often a deployment fails for a service, or how many deployments are made during a development cycle or a sprint, are critical pieces of information that can be used to improve the SDLC.
Pattern: Define standard data points and export them for every team/project for delivery to stakeholders
Anti-Pattern: Create custom reports with different datasets for each team
Communication and Documentation
Communication between teams is the key to the success of any project. The transition to the DevOps methodology has proven that working in silos is not an effective method and that team collaboration through clear and transparent communication is necessary.
Generate Release Notes
An easy-to-automate means of communication is the creation of release notes to share information about a deployment with all teams. These release notes can take different formats, such as an email or a chat message. The objective is to promote the dissemination of information to the entire engineering team, both for widespread visibility and transparency and so that everyone can interpret the information and decide whether any action is needed.
The most effective and informative release notes contain details like:
- The name of the application and its new version
- The ticket number(s) attached to this version to identify the work performed
- The release date to correlate with potential anomalies detected following the deployment
- The team leading the project and potentially the developer who carried out the production deployment
- A summarized runbook for rolling back in case of a major incident
Pattern: Automatically notify all engineering team members to coordinate projects and further actions
Anti-Pattern: Wait until an error occurs during a deployment for dev and ops teams to collaborate and identify issues with the latest release currently in production
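Because all the details listed above are available to the pipeline, assembling the note is easy to automate. The sketch below builds a release-note message from illustrative inputs; in practice, the ticket numbers and team would come from the version control system and the ticket tracker:

```python
from datetime import date

def release_notes(app, version, tickets, team, rollback_cmd):
    """Build a release-note message suitable for email or chat."""
    lines = [
        f"Release: {app} {version}",
        f"Date: {date.today().isoformat()}",
        f"Team: {team}",
        "Tickets: " + ", ".join(tickets),
        f"Rollback: {rollback_cmd}",
    ]
    return "\n".join(lines)

print(release_notes(
    app="kube-prometheus-stack",
    version="35.5.0",
    tickets=["PROJ-101", "PROJ-108"],
    team="platform",
    rollback_cmd="deploy kube-prometheus-stack 35.4.2",
))
```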
Perform Root-Cause Analyses
Another important communication format, which aligns with the Site Reliability Engineering (SRE) methodology, is identifying, then reporting on, the root causes of a problem that occurred after a deployment. A root-cause analysis prompts a postmortem, where all teams can join to reflect on and discuss the issue — as well as identify the actions that should be carried out to prevent future occurrences of the problem. Postmortems should then be written up, defining the necessary actions and their respective owners. These tasks must be completed within a specified timeframe after issue identification to ensure that the defect has been fixed before the next deployment.
Common actions in a postmortem:
- Review the outcomes and results
- Identify what went well and what did not
- Give everyone the chance to speak
- Identify post actions and assignees
Pattern: Question everything: ask "why" of every symptom until discovering the root cause
Anti-Pattern: Accept the symptom as the root cause of the problem
Document Workflows
Some means of communication are more appropriate than others depending on the information you want to share — documentation included, which is an important component in the DevOps approach. Disseminating information to a large number of people is crucial to foster the understanding of automation principles and adoption of such practices. We have at our disposal today many tools that allow us to share information according to the target audience:
- Engineers – Often prefer reading deployment scripts, which is not always the simplest approach but is probably the most up-to-date documentation.
- Project managers – Will prefer centralized documentation, such as a wiki, with a certain level of information to interpret and understand different aspects of a pipeline without having to understand the specific technique.
- Managers – Tend to prefer high-level documentation describing the overall process, the catalog of applications, application statuses, and the teams or people attached to the project — all information that will allow them to quickly identify and route any information from other teams regarding their projects.
The main purpose of workflow documentation is to align teams in order to promote mutual assistance and process optimization, which can be achieved with automation (e.g., release notes can be automatically generated based on commit messages and the created tag).
Pattern: Define a standard for documentation for every audience
Anti-Pattern: Do not document deployment and configuration workflows; allow only a few employees to access the catalog of applications and other assets