The digital landscape is moving from monoliths to microservices and cloud-based services. Enterprises need to adapt to a continually changing technical landscape and keep up with that change. Systems are becoming more complex and harder to manage. In this article I will explain some of the newer architectural approaches and trends, and provide insight into AIOps and how it helps to solve this very problem.

Microservices

Microservices architecture is becoming one of the most preferred architectural and development strategies. The advantages of building and running software with this architecture outweigh the disadvantages. Benefits include:

- Small, highly efficient, autonomous teams can deliver new services or features faster. You do not have to wait for a long release cycle; you release features as soon as they are ready.
- Increased productivity and faster delivery.
- An isolated development approach results in services that are independently deployable and testable.
- Tests for a microservice are easier to manage and maintain because the scope is limited to the service's capabilities. Automated unit, regression, and performance testing can be achieved easily.
- Different teams can build microservices using different technologies, so the most suitable technology can be used for each microservice.
- Teams can be spread across geographies.
- Microservices are easier to build and deploy; on container platforms in particular, resource utilization is optimized.
- Microservices are built around a particular business functionality.
- Microservices scale better because they have a very small footprint.

Serverless

Cloud platforms provide many capabilities and tools to work with. Serverless refers to a fully managed system on a cloud platform.
Cloud-based serverless technologies are a big boost for companies, small or large, that want to move small functions or pieces of code to the cloud, such as nano services, asynchronous jobs, scheduled jobs, and integrations between cloud services and on-premises systems. With serverless:

- The underlying infrastructure is fully managed by the cloud platform, resulting in very low maintenance costs.
- Deployed serverless services are highly available.
- It is easier to scale globally with high concurrency.
- The architecture is event-driven, which means a service runs only when a relevant event is triggered.
- Most serverless offerings are pay-for-usage, which results in a low cost for running the services.

Digital Modernization

Utilizing microservices-based architecture and serverless together is what is called Digital Modernization. Container platforms like Kubernetes and OpenShift are the most suitable platforms for hosting microservices. Serverless can be utilized for, but is not limited to, asynchronous processing, scheduled jobs, and ETL jobs.

With the advent of microservices and serverless, different challenges arise, such as:

- Monitoring the high number of microservices
- Identifying the root cause of a failure
- Addressing the failure quickly
- Testing across the various features
- Monitoring end-user conversions
- Adapting to continuous upgrades and changes to the systems

A simple DevOps strategy will not be enough to manage such a complex system. Bringing Artificial Intelligence (AI) coupled with Machine Learning capabilities into DevOps will help to address the new complexities in development, deployment, and production application performance monitoring (APM). AIOps helps to enable autonomous DevOps and offers prescriptive resolutions and self-healing.

AIOps

AIOps brings in four critical features needed for creating highly effective processes and systems:

1. AIOps: Analysis of traffic, logs, and usage with the help of machine learning, anomaly detection and alerting, and reliable root cause analysis
2. Intelligent DevOps: Software quality is improved significantly with AI-driven performance and regression testing
3. Remediation and Self-Healing: Auto-detect issues and alerts, trigger remediation and self-healing, and provide prescriptive automation
4. User Experience: AIOps provides better insights into how the system is used and makes conversions easy to measure

With AIOps, you can test your code for performance and regressions immediately by automatically analyzing the test traffic, and so detect issues early. Integrating DevOps pipelines with a complete AI-based APM solution makes for a powerful AIOps tool. AI-based APM solutions analyze traffic, logs, and resource utilization to detect anomalies. If an anomaly is observed, an alert is triggered. Based on these alerts, teams can build automated scripts for known issues, which are executed as and when the issue occurs. For example:

- Increase disk space or allocate new persistent volumes if a resource is running out of disk space
- Run archival processes when a sudden surge in traffic has increased database table sizes
- Scale up or scale down when memory or CPU usage crosses prescribed thresholds (usually in the absence of an autoscaling setup)
- Auto-remediate a failed deployment by reverting to the older build if the new build failed

A typical workflow for automated remediation will look like this:

[Figure: automated remediation workflow]

How cool would it be if you were able to automate most of the remediation and self-healing!

AIOps Tools

There are many AIOps tools available in the market, including the ones provided by the cloud platforms. They continue to evolve with a better understanding of systems and behaviors, and new features and capabilities are being added. Some of the tools are:

1. Dynatrace: A front runner, identified as one of the leaders by Gartner, Dynatrace is one of the most powerful APM solutions for managing multiple on-premises as well as cloud environments. This product provides very strong AIOps capabilities.
Its root cause analysis tool is one of the best in the market; finding the root cause can be done within minutes. Dynatrace also provides a reference implementation for Autonomous Cloud Management through its framework, Keptn. Its auto-discovery of services is one of a kind and very powerful.
Link: https://www.dynatrace.com
Link: https://keptn.sh

2. Cisco AppDynamics: AppDynamics is an application performance management (APM) and IT operations analytics (ITOA) company based in San Francisco. The company focuses on managing the performance and availability of applications across cloud computing environments as well as inside the data center.
Link: https://www.appdynamics.com/

3. New Relic: New Relic's software analytics product for application performance monitoring (APM) delivers real-time and trending data about your web application's performance and the level of satisfaction that your end users experience. With end-to-end transaction tracing and a variety of color-coded charts and reports, APM visualizes your data down to the deepest code levels.
Link: https://www.newrelic.com

Conclusion

Digital Modernization is here to stay, and CTO groups need to view this strategy holistically and adopt AI-based DevOps, or AIOps. AIOps will bring significant improvements to APM capabilities and will streamline development and testing.

Please share your feedback and views, which will help me improve my writing.

References:

1. https://www.dynatrace.com/gartner-magic-quadrant-application-performance-monitoring-suites/
2. https://en.wikipedia.org/wiki/AppDynamics
3. https://docs.newrelic.com/docs/apm/new-relic-apm/getting-started/introduction-new-relic-apm
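
As a closing illustration of the automated remediation idea from the AIOps section (an alert for a known issue triggers a prepared script), here is a minimal Python sketch. It is not taken from any of the tools above. The `Alert` type, the alert kinds, the threshold values, the `RUNBOOK` mapping, and the rule that unknown alerts are escalated to a person are all assumptions made for illustration. The handlers only describe the action they would take; a real script would call the platform's own tooling instead.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Alert:
    """Hypothetical alert, shaped roughly like what an APM tool might emit."""
    kind: str       # assumed alert kinds: "disk_full", "cpu_usage", "deploy_failed"
    resource: str   # name of the affected service or volume
    value: float    # observed metric value (0.0 when not applicable)


def scaling_decision(cpu_percent: float, low: float = 20.0, high: float = 80.0) -> str:
    """Return 'scale_up', 'scale_down', or 'hold' for a CPU reading.

    The thresholds are illustrative defaults, not recommendations.
    """
    if cpu_percent > high:
        return "scale_up"
    if cpu_percent < low:
        return "scale_down"
    return "hold"


# Each handler returns a description of the remediation instead of
# performing it, so the sketch stays runnable without any platform access.
def expand_disk(alert: Alert) -> str:
    return f"allocate new persistent volume for {alert.resource}"


def adjust_capacity(alert: Alert) -> str:
    return f"{scaling_decision(alert.value)} {alert.resource}"


def roll_back(alert: Alert) -> str:
    return f"revert {alert.resource} to previous build"


# Known issues mapped to their prepared remediation (hypothetical runbook).
RUNBOOK: Dict[str, Callable[[Alert], str]] = {
    "disk_full": expand_disk,
    "cpu_usage": adjust_capacity,
    "deploy_failed": roll_back,
}


def remediate(alerts: List[Alert]) -> List[str]:
    """Run the matching handler for known issues; escalate anything else."""
    actions = []
    for alert in alerts:
        handler = RUNBOOK.get(alert.kind)
        if handler is None:
            # Assumption: issues without a prepared script go to a person.
            actions.append(f"escalate {alert.kind} on {alert.resource}")
        else:
            actions.append(handler(alert))
    return actions
```

Calling `remediate` with a list of `Alert` objects returns one action description per alert. Keeping the mapping from alert kind to handler in a single table means a newly understood issue can be automated by adding one entry, without touching the dispatch logic.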