Maturing an Engineering Organization From DevOps to Platform Team

Learn how platform engineering evolves DevOps by reducing engineers' infrastructure burden through streamlined tools, enhancing productivity, and more.

Rajat Gupta

Sep. 26, 24 · Analysis


The DevOps model broke down the wall between development and production by assigning deployment and production management responsibilities to the application engineers and providing them with infrastructure management tools. This approach expanded engineers' competencies beyond their initial skill sets.

This model helped companies gain velocity as applications weren't passed around from team to team, and owners became responsible from ideation to production. It shortened the development lifecycle and time to deployment, making companies more agile and responsive.

DevOps became the logical recommendation for fast-paced digital transformation, and most engineering organizations built in recent years follow this strategy.

Limitations of DevOps and the Need for Platform Engineering

The DevOps movement had one significant implication: it heightened the cognitive load on all development teams with regard to infrastructure, deployment, and operations. Learning various parts of the stack in depth is a lot of work that every engineer must take on, on top of developing a growing number of complex applications.

This had unintended drawbacks. Spreading engineers thinner on their competencies lowered their productivity on their core applications. All time spent learning infrastructure as code or setting up testing and continuous integration pipelines resulted in less time to work on customer-impacting features.

This further impacted hiring and onboarding, as engineers competent across many parts of the technology stack are rarer than pure application engineers. Ramping up on a whole new stack was significantly heavier than ramping up on a single application. Internal training became an even more critical, yet under-invested, part of an engineering organization.

Finally, while the scope was creeping up for engineers, the operational load could generate stress and burnout on already pressured teams, potentially impacting delivery.

To alleviate these concerns, DevOps had to evolve towards offering turnkey services to engineers: no longer asking them to learn new technologies, while still giving them the same control and ownership of their applications.

Companies must build a golden path of tools and automation from onboarding to production for all engineering teams to thrive. This resulting practice is called platform engineering and is the natural evolution of DevOps within an organization.

What Is Platform Engineering?

Platform engineering is evolving and consolidating multiple disciplines within a standard engineering organization. A platform team's primary role is to deliver ways for product teams to build and deploy their application effortlessly.

It takes its cue from what cloud-native architecture (read: Platform-as-a-Service) offers its customers and builds a replica of that model tailored to a company and its engineering organization. Product teams act as customers of their internal platform, with the platform team serving as an intermediary between development and production.

Thus, by contract, a platform team is also responsible for offering observability as a service to their users. While this can be broken down into separate entities, deployment and monitoring are interlinked and must be provided as a one-stop solution to manage applications, from ideation to production and scaling.

Knowing that each engineering organization has its complexities and particularities, what binds together these services is a cohesive developer experience tailored to the daily life of engineers. Building interfaces (UI: User Interfaces, CLI: Command-line Interfaces, or APIs: Application Programming Interfaces) is the main differentiator and the exciting work of an effective platform team.

This first helps improve team velocity during a standard development lifecycle. It also speeds up onboarding, allowing the company to iterate more quickly on new products and services for its users.

When Does It Make Sense To Build a Platform Team?

Not all engineering organizations need a platform team, as it often depends on the organization's size and growth rate.

As a rule of thumb, it is worth investing in a platform team once you have multiple products (customer-facing or otherwise) being developed concurrently by independent teams. This point usually comes when an engineering organization grows beyond 20 engineers.

Another good indicator is the expected growth rate of the company. If the company plans to hire quickly and develop new products within the next year or so, investing now in platform engineering, with a focus on onboarding and on reducing friction during development, is critical to making the most of a growth period and to alleviating the typical productivity slump that comes with scaling an organization quickly.

An Effort of Standardization

A common challenge that a platform team faces is adoption within the company. As stated earlier, investing in building this team only makes sense once a company has found some velocity and applications are being regularly and continuously deployed to customers. This initial velocity sometimes comes at the price of inconsistent technological choices, deployment strategies, and tooling. Drift between teams is to be expected as the company grows and technological decisions evolve.

Once an organization introduces a platform team, it must decide on the direction of the tools and technologies to be used by the company in the future. While migrations should be kept as few as possible, they are sometimes inevitable to bring an engineering organization to a better place. This means convincing other teams to take time out of their roadmap to adopt new tools that will save time and complexity down the line.

This investment can be a hard sell, as it impacts product roadmaps. To help with buy-in, stating the intention of standardization and adoption of a single platform with shared tools as an organization legitimizes the effort for all.

Product Teams Regain Focus On Their Offering

The first natural objective of building tooling is reducing the friction for developers to build, deploy, and manage their applications. Initially, this can mean giving a team access to any infrastructure it needs to run its service. Eventually, it means removing the need to manage infrastructure at all, so that teams spend most of their time on application development.

It is critical to work with all teams regularly and assess their day-to-day, understanding where and when they spend time on something other than their application.

Highlighting major friction points helps the platform team drive their roadmap. Time-to-build, time-to-deploy, and time-to-test are the main KPIs (Key Performance Indicators) to track. Each minute saved is multiplied across every build, every deployment, and every engineer, so small gains compound into noticeably faster delivery to users.
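As a rough illustration of tracking these KPIs, the sketch below computes the median time-to-build, time-to-test, and time-to-deploy per team from a list of pipeline runs. The `PipelineRun` record and its field names are hypothetical; in practice the durations would come from whatever CI/CD system the organization already uses.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class PipelineRun:
    """One pipeline execution (hypothetical record shape, durations in seconds)."""
    team: str
    build_seconds: float
    test_seconds: float
    deploy_seconds: float


def kpi_medians(runs):
    """Return the median build, test, and deploy durations for each team."""
    # Group runs by the team that owns them.
    by_team = {}
    for run in runs:
        by_team.setdefault(run.team, []).append(run)

    # The median is less sensitive to a single slow run than the mean.
    return {
        team: {
            "time_to_build": median(r.build_seconds for r in team_runs),
            "time_to_test": median(r.test_seconds for r in team_runs),
            "time_to_deploy": median(r.deploy_seconds for r in team_runs),
        }
        for team, team_runs in by_team.items()
    }
```

Comparing these medians between teams, or for one team over time, is one simple way to spot where friction is concentrated.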

Internal Open-Source Culture

By having multiple teams benefit from the same tooling offered by a platform team, there is an opportunity to build an internal open-source culture where users contribute to the tools they use daily. Allowing anyone to look into the source code of their tools and welcoming them to suggest and add any feature that can help in their day-to-day heightens confidence and buy-in into the platform they are using. An ecosystem is only as robust as its users' contributions. Inviting all engineers to sculpt their environment how they want builds a more cohesive and dynamic engineering organization.

This culture often reaches outside the platform scope, building software that benefits any company's applications and products. It further improves the organization's engineering excellence and ability to deliver great products.

Operational Security Benefits

Automating and standardizing infrastructure-related efforts for all teams has one key benefit for your organization: heightened operational security.

A controlled and standardized environment reduces the potential attack surface at the infrastructure level. Application security becomes the de facto focus going forward, since applications are the part that keeps growing and changing as the company continues to deliver products and features to its users.

Cloud Scaling and Cost Control

Having infrastructure normalized between multiple teams and products makes multi-tenancy and shared resources possible. This can materialize as clusters of application-agnostic machines running all applications within a company. It gives the platform team full control over the infrastructure and lets it scale as the whole company needs. It also allows for better efficiency: when every team draws from the same pool of resources, those resources are easier to allocate, and costs are easier to manage and keep under control.
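To make the efficiency argument concrete, here is a minimal sketch comparing how many machines are needed when each team runs on dedicated nodes versus when all teams share one pool. The team names and numbers are made up, and the shared case assumes workloads can be packed perfectly onto nodes, which real schedulers only approximate.

```python
import math


def nodes_dedicated(demand_cpu_by_team, node_cpu):
    """Nodes needed when every team gets its own machines."""
    # Each team rounds up separately, so spare capacity is stranded per team.
    return sum(math.ceil(cpu / node_cpu) for cpu in demand_cpu_by_team.values())


def nodes_shared(demand_cpu_by_team, node_cpu):
    """Nodes needed when all teams share one pool (assumes ideal packing)."""
    # Rounding up happens once, on the combined demand.
    return math.ceil(sum(demand_cpu_by_team.values()) / node_cpu)
```

With hypothetical demands of 5, 3, and 2 CPUs on 8-CPU nodes, dedicated allocation needs three nodes while a shared pool needs two.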

A Golden Path From Onboarding to Production

The main success criterion for a platform team is the ability to take an idea into production with the least friction possible. By showing that any code can be taken to production autonomously (without any manual actions from a controlling team) and quickly (continuous deployment integrated by default, automatic provisioning of domain names and load balancers, etc.), a platform team demonstrates a clear, attractive solution for a company to rely on and grow.
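One way to picture such a golden path is a platform function that turns a minimal service description into a full deployment plan with sensible defaults. The sketch below is purely illustrative: `ServiceSpec`, `golden_path_plan`, the plan keys, and the `example.internal` domain are all hypothetical names, not part of any real platform or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSpec:
    """The only information a product team has to provide (hypothetical)."""
    name: str
    team: str
    port: int = 8080


def golden_path_plan(spec, base_domain="example.internal"):
    """Expand a minimal spec into a deployment plan using platform defaults."""
    return {
        "owner": spec.team,
        # Continuous deployment is on by default, with no manual gate.
        "pipeline": ["build", "test", "deploy"],
        "continuous_deployment": True,
        # Domain name and load balancer are derived, not requested by hand.
        "hostname": f"{spec.name}.{base_domain}",
        "load_balancer": {"target_port": spec.port},
    }
```

The design choice worth noting is that the product team states only what is specific to its service, while everything else is decided once by the platform.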

With automation and intuitive tooling, one can build an engineering organization where a newcomer can join a team and immediately contribute to their codebase. This onboarding time is critical to keeping velocity during growth.

Controlled Innovation

It is important to remember that infrastructure innovation is necessary over the years. While you want strong standards and control, evaluating the possibility of using different types of resources for various use cases is valuable in the long run.

Introducing innovation as an internal platform provider forces you to fully understand the ins and outs of a technology before offering it as a service. Thus, you must be fully competent as an operator before running critical applications on said infrastructure.

Reliability and Incident Management

Standardizing and offering lean services to the engineering team is a forcing function to build up reliability across the board. As observability is factored into a solution already running for other services, the risk of launching a new service is lower than it would be spinning up fresh infrastructure.

Relying on tried-and-true services to build an application also means that incident management is more approachable, as the time to learn and understand said infrastructure is factored into the service offered by a platform team.

Conclusion

The evolution from the DevOps model towards a fully-fledged platform team is an exercise in maturity for any engineering organization to consider. Once a company reaches a critical size, runs a certain number of products, and continuously delivers services and products to its customers with reliability and scale, it needs to invest in a solid foundation for its engineering teams to thrive.

From scaling to security and reliability, building that foundation is a critical decision, made from the top of the company, that improves every part of its technology and sets it up for a stronger future.


Opinions expressed by DZone contributors are their own.
