Unified Observability: Metrics, Logs, and Tracing of App and Database Tiers in a Single Grafana Console
This first post in a series on unified observability with microservices and the Oracle database covers the fundamentals and basic use cases.
This is the first in a series of blogs discussing unified observability with microservices and the Oracle database. Future pieces in this series will go deeper into specific use case scenarios and the best ways to observe and troubleshoot them.
In this first piece I cover the fundamentals and basic use cases, and point the reader to a new Unified Observability in Grafana Workshop, which demonstrates the principles and provides code and configuration examples that I will build on over time.
Data-Driven Microservices and Observability
Modern application development, and microservices architectures in particular, have made observability (metrics, logs, and tracing) a critical requirement for DevOps. The more efficient the observability tools are, the faster users can find and fix problems, and the better the application will run. Currently, there are a number of disparate tools for monitoring, tracing, and logging, which makes correlation across the tools difficult. Furthermore, app and database observability are currently quite different and separate, as are on-prem and cloud-based tools.
CNCF furnishes a number of open standards, such as OpenTelemetry, and there are corresponding open technologies, such as Prometheus, Jaeger, and Grafana. These standards and technologies can be used together to provide unified observability across metrics, logs, and tracing and across application and database tiers.
*Note: I am not using "tiers" in any strict topological sense, but simply to distinguish the application side from the database side. Having said that, I am indeed referring to a Kubernetes-based application tier in the context of this blog.
The following is a diagram of the unified observability architecture:
*Note that nothing here excludes the use of ELK (Elasticsearch, Logstash, and Kibana) or EFK (Elasticsearch, Fluentd, and Kibana) stacks, nor any Oracle database products such as Enterprise Manager, Trace File Analyzer, Application Performance Monitor, etc. Each has its strengths, weaknesses, and use cases. The PLG (Promtail, Loki, Grafana) stack presented here is a solution promoted by Grafana that has become increasingly popular, as it lends itself extremely well to a single-pane-of-glass view of the system.
Metrics, Logs, and Tracing Exporters
Oracle enhances these capabilities by providing exporters from the database. These exporters are all open source and can be found here. They are currently at various stages of development with the metrics exporter being available in the repos mentioned, the log exporter proof of concept available in the workshop mentioned, and the tracing exporter under development. All three exporters are being actively enhanced as are the drivers and database itself to provide highly customizable, seamless end-to-end observability. Again, I will elaborate on these efforts in future blogs of this series.
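To make the idea of a metrics exporter concrete, here is a minimal sketch of turning database query results into the Prometheus text exposition format that Prometheus scrapes. It is an illustration only, not the Oracle exporter's implementation: the metric name `orders_by_status`, the sample rows, and the query in the comment are all assumptions.

```python
from typing import Iterable, Mapping


def escape_label(value: str) -> str:
    """Escape a label value per the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str,
                 samples: Iterable[tuple[Mapping[str, str], float]]) -> str:
    """Render one gauge metric family as Prometheus exposition text."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_text = ",".join(
            f'{key}="{escape_label(val)}"' for key, val in sorted(labels.items())
        )
        # A series without labels is written as the bare metric name.
        series = f"{name}{{{label_text}}}" if label_text else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical rows, standing in for the result of a query such as
# "SELECT status, COUNT(*) FROM orders GROUP BY status".
rows = [("pending", 3), ("completed", 42)]
text = render_gauge(
    "orders_by_status",
    "Number of orders in each state (illustrative).",
    [({"status": status}, float(count)) for status, count in rows],
)
print(text)
```

A real exporter would run the query on each scrape and serve this text over HTTP on a `/metrics` endpoint; both steps are left out here to keep the sketch self-contained.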
Config and Code Examples To Try
The Unified Observability Database Workshop currently contains the following labs.
- Setup and Deploy Application
  - The "GrabDish" application from the Simplifying Data-driven Microservices Workshop is used.
- Enable Observability with Database Metrics, Logs Exporters
  - A simple script sets up everything you need here, and you can refer to the script for details.
  - This will be replaced and automated by the use of an Observability Controller/Operator in the next version of the workshop.
- Configure Alert Channel
  - Shows how to set up Slack (or Prometheus AlertManager, etc.) to receive alerts.
- Diagnosability and Debugging Use Case
  - A basic use case where a failure in the system is shown on the dashboard and fixed.
  - This will be enhanced to demonstrate a more complicated debugging scenario across tiers in the next version of the workshop.
- Performance Monitoring Use Case
  - A basic use case where a drop in performance due to a load test triggers an alert.
  - This will be enhanced to show more of the performance details across tiers in the next version of the workshop.
- Better Understand and Modify Metrics, Logging, and Tracing
  - Basic details on how metrics, logging, and tracing are collected in both app and database tiers, how the OpenTracing ID and Oracle ECID are used to trace across tiers, etc., including source code examples.
  - More details and enhanced use of OpenTelemetry auto-instrumentation, etc. will be included in the next version of the workshop.
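The cross-tier tracing idea in the last lab can be pictured with a small sketch: the application generates one trace ID per request, stamps it on every log line, and hands the same value to the database tier so that database-side records can be matched to it. This is only an assumption-laden illustration using the Python standard library. The logger name, the `traceID=` log format, and the `database_call_tags` helper are hypothetical, and how the ID is actually attached to a database session as an ECID depends on the driver and is not shown.

```python
import contextvars
import io
import logging
import uuid

# Trace ID of the request currently being handled (hypothetical name).
current_trace_id = contextvars.ContextVar("current_trace_id", default="")


class TraceIdFilter(logging.Filter):
    """Copy the current trace ID onto every log record."""

    def filter(self, record):
        record.trace_id = current_trace_id.get()
        return True


def build_logger(stream):
    logger = logging.getLogger("order-service")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s traceID=%(trace_id)s %(message)s"))
    handler.addFilter(TraceIdFilter())
    logger.addHandler(handler)
    return logger


def database_call_tags(trace_id):
    """Values the app would pass along with a database call.

    Assumption: the database tier records this value as its ECID.
    """
    return {"ecid": trace_id}


def place_order(logger, order_id):
    trace_id = uuid.uuid4().hex  # 32 hex characters
    token = current_trace_id.set(trace_id)
    try:
        logger.info("placing order %s", order_id)
        tags = database_call_tags(trace_id)
        logger.info("calling database with ecid=%s", tags["ecid"])
        return trace_id
    finally:
        current_trace_id.reset(token)


stream = io.StringIO()
logger = build_logger(stream)
trace_id = place_order(logger, "66")
print(stream.getvalue())
```

Because both tiers carry the same identifier, a search for one trace ID returns the application log lines and the matching database activity together.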
I will refer you to the workshop for more details but will give a few highlights here.
Sample Grafana Dashboard
The sample dashboard shows microservices information in essentially four quadrants:
- Runtime metrics for the microservices running on Kubernetes such as CPU
- App-specific metrics for the microservices running on Kubernetes such as "PlaceOrder count"
- Runtime metrics for the microservice's database such as connection sessions
- App-specific metrics for the microservice's database such as "Orders in pending state"
Logs to Trace and Trace to Logs
Metrics, logs, and tracing can be viewed side by side in Grafana. The user can drill down from a particular metric to the corresponding logs, and from the logs to the corresponding trace and vice versa. Notice the "Jaeger" link in the following log output. These log-to-trace correlations are provided for both the Kubernetes microservices logs and the Oracle database logs.
The trace below corresponds to the log above (accessed via the "Jaeger" link mentioned) and is shown together with its visual graph analysis. Again, this all occurs within the same Grafana console.
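The mechanics behind a log-to-trace link like this can be sketched in a few lines: a regular expression pulls the trace ID out of a log line, and the ID is substituted into the trace viewer's URL. Grafana's Loki data source offers a similar regex-based feature (derived fields); the code below is only an illustration of the idea, not the workshop's configuration. The `traceID=` log format and the Jaeger address are assumptions.

```python
import re
from typing import Optional

# Assumed log format: the trace ID appears as "traceID=<16 to 32 hex chars>".
TRACE_ID_PATTERN = re.compile(r"traceID=([0-9a-f]{16,32})\b")

# Hypothetical address of a Jaeger UI; adjust for a real deployment.
JAEGER_BASE_URL = "http://localhost:16686"


def trace_link(log_line: str, base_url: str = JAEGER_BASE_URL) -> Optional[str]:
    """Return a Jaeger trace URL for the log line, or None if it has no ID."""
    match = TRACE_ID_PATTERN.search(log_line)
    if match is None:
        return None
    return f"{base_url}/trace/{match.group(1)}"


line = "INFO traceID=4bf92f3577b34da6a3ce929d0e0e4736 placing order 66"
print(trace_link(line))
```

The same pattern works for database log lines as long as they carry the ID in a form the expression can match.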
Conclusion
I have touched upon some of the basics of the unified observability solution being provided for modern application development and data-driven microservices using the Oracle database. I look forward to publishing continued content in this series soon. We are creating some exciting new functionality in this area to help developers and operations have a simpler, more effective, and enjoyable experience with observability.
I appreciate your time reading, and I welcome any feedback or questions you might have.
Opinions expressed by DZone contributors are their own.