What Is Platform Engineering? How To Get Started
This guide compares platform engineering with DevOps and SRE, outlines the roles and responsibilities of platform engineers, and walks through the steps to implement platform engineering in an organization.
Platform engineering is the discipline of building and maintaining a self-service platform for developers. The platform provides a set of cloud-native tools and services to help developers deliver applications quickly and efficiently. The goal of platform engineering is to improve developer experience (DX) by standardizing and automating most of the tasks in the software delivery lifecycle (SDLC). Instead of switching context to provision infrastructure, manage security, and learn new tools, developers can focus on coding and delivering the business logic using automated platforms.
Platform engineering has an inward-looking perspective: it focuses on making the organization's developers more productive. Organizations benefit greatly when developers work at their best because it leads to faster release cycles. The platform makes this possible by providing everything developers need to get their code into production, so they do not have to wait on other IT teams for infrastructure and tooling. The self-service platform that makes developers' day-to-day activities easier and more autonomous is called an internal developer platform (IDP).
What Is an Internal Developer Platform (IDP)?
An IDP is a platform comprising self-service cloud-native tools and technologies that developers can use to build, test, deploy, monitor, or do almost anything else related to application development and delivery with as little overhead as possible. Platform engineers or platform teams build it after consulting the developers and understanding their unique challenges and workflows.
After discussing and implementing Kubernetes CI/CD pipelines and GitOps solutions for many large hi-tech enterprises, we realized that a typical IDP consists of the following five pillars:
- CI/CD platforms for automated deployments (Jenkins, Docker Hub, Argo CD, Devtron, Spinnaker)
- Container orchestration platforms for managing containers (Kubernetes, Nomad, Docker Swarm)
- Security management tools for authentication, authorization, and secret management (HashiCorp Vault, AWS Secrets Manager, Okta Identity Cloud)
- Infrastructure as code (IaC) tools for automated infrastructure provisioning (Terraform, Ansible, Chef, AWS CloudFormation)
- Observability stacks for visualizing workloads and applications across all clusters (Devtron Kubernetes dashboard, Prometheus, Grafana, ELK stack)
The platform team designs the IDP so that it is easy for developers to use, with a minimal learning curve. An IDP can help reduce developers' cognitive load and improve DX by automating repetitive tasks, reducing maintenance overhead, and eliminating the need for endless scripting. By providing a self-service platform, it enables development teams to independently manage resources, infrastructure needs, deployments, and rollbacks. This increases developer autonomy and accountability, reduces dependencies, and streamlines the development cycle.
Why Is Platform Engineering Important?
Platform engineering can help organizations reap several internal (developers) and external (end users) benefits:
- Improved developer experience (DX): The plethora of cloud-native tools increases the cognitive load of developers, as it takes a good amount of time to decide which one to use for their specific use cases and master it. Platform engineering solves this and improves DX by providing a simplified, standardized set of tools and services that suit developers’ unique workflows.
- Increased productivity: The IDP provides everything developers need to get their code tested and deployed in a self-service manner. This reduces delays at different stages of the SDLC, such as waiting for someone to provision the infrastructure for a deployment. Platform engineering supports developer productivity by helping developers focus mainly on core development work.
- Standardization by design: In a typical software organization, IT teams use a variety of tooling that varies from team to team. Maintaining and keeping track of things becomes complex in such a situation. Platform engineering solves this by standardizing the tools and services, which also makes bottlenecks easier to resolve because the platform is identical for every developer.
- Faster releases: The platform team keeps developers focused on delivering the business logic by providing toolchains that are easily consumable, reusable, and configurable. Developers become more productive as a result, which shortens time-to-market for features and innovations while keeping releases reliable and secure.
Implementing a successful platform team in an organization and leveraging the above benefits requires following some common principles. Treating the platform as a product is one of them.
Platform as a Product
One of the core principles of platform engineering is productizing the platform. The platform team needs to employ a product management mindset to design and maintain a platform that is not only user-friendly but also meets the expectations and needs of its customers (app developers). It starts with collecting data points about the problems developers have and identifying which areas to address. The goal could be to improve deployment frequency, reduce the change failure rate, improve reliability and security, improve DX, and so on.
It is important to note that building a platform is all about building a core product that solves common challenges most teams have. It is not about solving the problems of a single team but providing the product across multiple teams to solve the same set of problems. For example, if multiple teams require the same piece of infrastructure, it makes sense for the platform team to work on that shared piece and distribute it. This idea of reusing the platform and repeatability is crucial as it allows for standardization, consistency, and scalability in application delivery.
As in product management, the platform team owns the product, chooses certain metrics, and continues taking customer feedback to improve the user experience. The platform's product roadmap evolves with respect to feedback, and it accommodates changing needs and desires of the customers.
Roles and Responsibilities of Platform Engineers
The primary role of a platform engineer is to design and maintain a self-service platform (IDP) and provide platform services for developers. It starts with engaging with the developers and understanding their pain points:
Listen to the Customers
Interview developers and different IT teams to understand their engineering landscape and challenges and to know what they are optimizing for. They may be trying to build an effective CI/CD pipeline or implement better access control, among many other challenges around software delivery.
Prioritize
Identify the challenges most teams share and prioritize solving them over problems that individual teams face. For example, if most teams find it hard to store and retrieve secrets securely, it is ideal to prioritize that problem and solve it for everyone.
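As a rough illustration of this prioritization step, the sketch below tallies how many teams report each challenge and keeps only those shared by more than half of them. The team names, challenge labels, threshold, and the `shared_challenges` helper are all hypothetical and invented for this example; real interview data would need more careful grouping.

```python
from collections import Counter


def shared_challenges(interviews, min_share=0.5):
    """Return (challenge, team_count) pairs reported by more than
    `min_share` of the interviewed teams, most common first."""
    team_count = len(interviews)
    if team_count == 0:
        return []
    # Count each challenge once per team, even if a team lists it twice.
    counts = Counter(
        challenge
        for challenges in interviews.values()
        for challenge in set(challenges)
    )
    return [
        (challenge, n)
        for challenge, n in counts.most_common()
        if n / team_count > min_share
    ]


# Hypothetical interview notes: team -> challenges it reported.
interviews = {
    "payments": ["secrets management", "slow CI"],
    "search": ["secrets management", "environment setup"],
    "mobile": ["secrets management", "slow CI"],
    "data": ["environment setup"],
}

for challenge, n in shared_challenges(interviews):
    print(f"{challenge}: reported by {n} of {len(interviews)} teams")
```

With this sample data only "secrets management" clears the threshold, which matches the example in the text: a problem most teams share is the one to solve first.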
Platform Designing
Design the IDP with the tools required to solve those problems for users, along with documentation that enables developers to self-serve resources and infrastructure. In the above case, adopting a secret management tool would solve the challenges around securely managing secrets. Platform design also includes writing scripts to automate routine development tasks, such as spinning up new environments and provisioning infrastructure, to reduce errors and friction points in the development flow.
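To make the idea of automating a routine task such as spinning up a new environment more concrete, here is a minimal sketch of the first half of that automation: validating a developer's self-service request and expanding it into a complete environment spec. Everything here is an assumption for illustration, including the `EnvironmentRequest` fields, the size table, and the stage names. The sketch does not provision anything; in practice the resulting spec would be handed to whichever IaC tool the platform uses.

```python
from dataclasses import dataclass

# Hypothetical platform defaults; a real platform would define its own.
ALLOWED_SIZES = {"small": 1, "medium": 2, "large": 4}  # size -> replica count
ALLOWED_STAGES = ("dev", "staging", "prod")


@dataclass(frozen=True)
class EnvironmentRequest:
    """What a developer fills in; everything else comes from defaults."""
    team: str
    app: str
    stage: str = "dev"
    size: str = "small"


def build_environment_spec(request):
    """Validate a self-service request and expand it into a full spec."""
    if request.size not in ALLOWED_SIZES:
        raise ValueError(
            f"unknown size {request.size!r}; choose from {sorted(ALLOWED_SIZES)}"
        )
    if request.stage not in ALLOWED_STAGES:
        raise ValueError(
            f"unknown stage {request.stage!r}; choose from {list(ALLOWED_STAGES)}"
        )
    return {
        "name": f"{request.team}-{request.app}-{request.stage}",
        "replicas": ALLOWED_SIZES[request.size],
        "labels": {
            "team": request.team,
            "app": request.app,
            "stage": request.stage,
        },
    }


spec = build_environment_spec(EnvironmentRequest(team="payments", app="checkout"))
print(spec["name"], spec["replicas"])
```

Rejecting unknown sizes and stages up front is one way to reduce errors: a mistyped request fails immediately with a readable message instead of producing a half-configured environment.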
Metrics
Choose specific metrics around the goals to measure the platform's effectiveness. For example, if the goal is to improve DX, the metrics include engagement scores, team feedback, etc. Similarly, the metrics will change if the goal is to reduce the change failure rate or to increase deployment frequency.
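The two delivery metrics mentioned above can be computed from a simple log of deployments. The sketch below uses one common way of defining them: deployment frequency as deployments per day over a period, and change failure rate as the fraction of deployments that caused a failure. The `Deployment` record and the sample data are hypothetical, and a team may well choose different definitions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Deployment:
    day: date
    failed: bool  # True if the change caused a failure in production


def deployment_frequency(deployments, period_days):
    """Average number of deployments per day over the period."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    return len(deployments) / period_days


def change_failure_rate(deployments):
    """Fraction of deployments that failed; 0.0 when there are none."""
    if not deployments:
        return 0.0
    return sum(d.failed for d in deployments) / len(deployments)


# Hypothetical deployment log for one week.
week = [
    Deployment(date(2024, 1, 1), failed=False),
    Deployment(date(2024, 1, 2), failed=True),
    Deployment(date(2024, 1, 4), failed=False),
    Deployment(date(2024, 1, 5), failed=False),
]

print(f"deployments per day: {deployment_frequency(week, 7):.2f}")
print(f"change failure rate: {change_failure_rate(week):.0%}")
```

Tracking the same calculation before and after a platform change gives the team a concrete baseline for judging whether the platform is moving the metric it targeted.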
Gather Feedback and Maintain the Platform
Continue listening to the customers and watch the metrics. Gather user feedback to add new tools to the platform and optimize for a better user experience. This also includes staying up-to-date with emerging tools and technologies in the DevOps and cloud infrastructure space and adopting them if necessary.
It is easy to confuse the roles of a DevOps engineer or SRE with that of a platform engineer since they all manage the underlying infrastructure and support software development teams. Although there are certain overlapping responsibilities between all these roles, each differs from the others with its unique focus.
Platform Engineering vs. DevOps
DevOps is a philosophy that brought a cultural shift to the SDLC to improve software delivery speed and quality. DevOps facilitated collaboration and communication between development and ops teams and accelerated automation to streamline deployments. Platform engineering, a practice rather than a philosophy, can be considered the next iteration of DevOps as it shares some core principles of DevOps: collaboration (with Ops), continuous improvement, and automation.
The daily tasks of a platform team and a DevOps team differ in some respects. DevOps engineers use certain tools and automation to streamline getting code to production, managing it, and observing it using logging and monitoring tools. They mostly work on building an effective CI/CD pipeline. Platform engineers take the tools used by DevOps and integrate them into a shared platform, which different IT teams can use at an enterprise level. This eliminates the need for teams to configure and manage infrastructure and tooling on their own and saves significant time, effort, and resources. Platform engineers also create the documentation and optimize the platform so developers can self-serve the tools and infrastructure in their workflow.
Platform teams are required only in mature companies with many different IT teams using complex tools and infrastructure. In such an engineering landscape, a dedicated platform team naturally becomes necessary to manage the complexity. The platform team builds and manages the infrastructure, helping DevOps speed up continuous delivery. In startups, however, it is common for the DevOps team to perform platform engineering tasks (configuring Terraform, for example).
Platform Engineering vs. SRE
Site reliability engineers (SREs) focus on ensuring the application is reliable, secure, and always available. They work with developers and Ops teams to create systems or infrastructure that support delivering highly reliable applications. SREs also perform capacity planning and infrastructure scaling and manage and respond to incidents so that the platform meets required service level objectives (SLOs). On the other hand, platform engineering manages complex infrastructure and builds an efficient platform for developers to optimize SDLC. While both work on platforms and their roles sound similar, their goals differ.
The major difference between platform engineering and SRE is whom they serve. SREs face end users and ensure the application is reliable and available for them. Platform engineers face internal developers and focus on improving their developer experience. The daily tasks of both teams differ with respect to these goals. Platform engineering provides the underlying infrastructure for rapid application delivery, while SREs do the same to deliver highly reliable and available applications. SREs work more on troubleshooting and incident response, and platform engineers focus on complex infrastructure and enabling developer self-service.
To achieve their respective goals, SREs and platform teams use different tools in their workflows. SREs mostly use monitoring and logging tools like Prometheus or Grafana to detect anomalies in real time and to set automated alerts. Platform teams work with different sets of tools spanning various stages of the software delivery process, such as container orchestration tools, CI/CD pipeline tools, and IaC tools. All in all, SREs and platform teams both work on building a reliable and scalable infrastructure, with different goals but some overlap in the tools they use.
How To Implement Platform Engineering in an Organization
A platform team will not be an immediate requirement in a startup with a few engineers. Once the organization grows to multiple IT teams and starts dealing with complex tooling and infrastructure, it is ideal to have platform engineers to manage the complexity.
Create the Role (Head/VP of Engineering)
Top-level engineering leaders like the VP or Head of Engineering usually create the role of a platform engineer when developers spend more time configuring tools and infrastructure than delivering the business logic. They find that most IT teams are solving the same problems, like spinning up a new environment, which slows the delivery process. The Head of Engineering then defines the scope of platform engineering, identifies the areas of responsibility, and creates the role of a platform engineer or platform team.
Create an Internal Developer Platform (Platform Engineers/Team)
The platform engineer starts by taking an inventory of the infrastructure and tools already used in the organization. They then interview developers to understand their challenges and build the internal developer platform with tools and services that solve problems at an enterprise level. They build the platform so that it is flexible and accommodates different architectures and deployment styles. Platform engineers also create documentation and conduct training sessions to help developers self-serve the platform. It is ideal for platform engineers to have a developer background so they know what it is like to be a developer and understand the challenges better.
Onboard Users (Application Developers)
Once the platform is ready, platform engineers onboard application developers. This requires internal marketing: letting teams know about the platform and what it can solve. The best way to onboard users is to pull them to the platform rather than throw the platform at them. This can be done by starting with a small team and helping them overcome a challenge. For example, help a small team optimize its CI/CD pipeline and provide the best experience possible in the process. Word of mouth from early adopters will have a positive ripple effect throughout the organization, which will help onboard more users to the platform.
Platform engineering does not stop at onboarding the users. It is a continuous process where the platform accommodates emerging tools and technologies and the changing needs and requirements of the users.
Conclusion: Platform Engineering With Open-Source Tools
It is important to select an open-source platform built to give platform engineers a standardized toolchain that helps developers accelerate software delivery. Devtron is one such platform; it helps developers by automating CI/CD, security, and observability across the end-to-end SDLC.
Published at DZone with permission of Jyoti Sahoo. See the original article here.