Getting Started With Prometheus Workshop: Using Advanced Queries
Learn about using advanced Prometheus queries with PromQL, and expand your query toolbox with more advanced queries for visualizing collected metrics data.
Are you looking to get away from proprietary instrumentation? Are you interested in open-source observability but lack the knowledge to just dive right in?
This workshop is for you, designed to expand your knowledge and understanding of open-source observability tooling that is available to you today.
Dive right into a free, online, self-paced, hands-on workshop introducing you to Prometheus. Prometheus is an open-source systems monitoring and alerting toolkit that enables you to hit the ground running with discovering, collecting, and querying your observability data. Over the course of this workshop, you will learn what Prometheus is and what it is not, install it, start collecting metrics, and learn everything you need to know to become effective at running Prometheus in your observability stack.
Previously, I shared an introduction to Prometheus, installing Prometheus, an introduction to the query language, and exploring basic queries as free online workshop labs. In this article, you'll continue your journey using advanced Prometheus queries with PromQL.
Your learning path dives deeper into using advanced PromQL queries. Note: this article is only a short summary, so please see the complete lab found online here to work through it in its entirety yourself.
The following is a short overview of what is in this specific lab of the workshop. Each lab starts with a goal. In this case, it is fairly simple: this lab takes you deeper into PromQL, expanding your query toolbox with more advanced queries for visualizing collected metrics data.
The lab starts with a review of how you've learned to build and execute basic PromQL queries so far. Up to now, you've done this using the default Prometheus console expression browser and graphs.
For this lab, you'll be diving deeper into PromQL, and, to broaden your knowledge of the tooling available, you'll install, configure, and query using an open-source query tool called PromLens. This is one of the best assistants you can find to help you build and understand what you are querying while seeing the results directly.
Installing PromLens
Your first task is to install PromLens on your machine. PromLens is a standalone tool for learning PromQL and displaying insights into the queries you are running.
To test out your new installation, you dive right into the concept of nested queries. A PromQL expression is often not a single query but a set of nested expressions, each one evaluated and used as an argument or operand by the expression above it in the nested structure. You run examples of nested queries and explore their results using the teaching aspects of PromLens in the explainer tab. When this query is entered:
rate(demo_api_request_duration_seconds_count{job="services"}[5m])
You see the explanation of each part of this query in PromLens like this:
These explanations are extremely valuable when you are first starting out and trying to master a complex functional language like PromQL.
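To make the nesting concrete, here is a rough sketch of how the query above can be read from the inside out, with one extra aggregation layer added on top. The sum by(instance) step is an illustrative addition, not something taken from the lab, and it assumes the usual instance label that Prometheus attaches to scraped targets.

```promql
# Innermost node: an instant vector selector, one series per label combination.
demo_api_request_duration_seconds_count{job="services"}

# Adding a range turns it into a range vector selector (5 minutes of samples per series).
demo_api_request_duration_seconds_count{job="services"}[5m]

# A function call consumes the range vector and returns an instant vector of per-second rates.
rate(demo_api_request_duration_seconds_count{job="services"}[5m])

# An aggregation (illustrative addition) uses that instant vector as its operand.
sum by(instance) (rate(demo_api_request_duration_seconds_count{job="services"}[5m]))
```

Entering each of these in turn is one way to see how the result changes as a layer is added around the previous expression.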
Language Theory
Before diving in further, you explore some of the language theory and definitions that are crucial to learning to use PromQL effectively. There are two different notions of type when talking about querying Prometheus, and it's crucial that you understand the difference:
- Metric type - As reported by a scraped target (counter, gauge, histogram, summary, or untyped)
- Result type - Data type of a PromQL expression (string, scalar, instant vector, or range vector)
PromQL has no concept of metric types. It's only concerned with expression result types. Each PromQL expression has a type, and each function, operator, or other type of operation requires its arguments to be of a certain expression type.
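As a small illustration of the four expression result types, each of the following expressions evaluates to a different type. The metric go_goroutines and the job="services" label are borrowed from the examples in this article; the literal values are arbitrary.

```promql
# Scalar: a single numeric value with no labels.
42

# String: a string literal, mostly used as a parameter value to functions.
"hello o11y"

# Instant vector: one sample per matching series at the evaluation time.
go_goroutines{job="services"}

# Range vector: a window of samples per matching series.
go_goroutines{job="services"}[5m]

# Functions declare the types they accept: avg_over_time() requires a range vector,
# so leaving out the [5m] here would be rejected as a type error.
avg_over_time(go_goroutines{job="services"}[5m])
```

Note that a bare range vector can be shown in a table but not graphed; graphing needs an expression that evaluates to a scalar or an instant vector.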
Not only do the expression types exist, but there are also 10 different node types, which are the types of queries or expressions you can write. Here is the list with details about each one:
- Number literals: 6.45
- String literals: "hello o11y" - Occur infrequently, used as parameter values to functions
- Instant vector selectors: some_metric{job="services"} - Explained previously
- Range vector selectors: some_metric{job="services"}[15m] - Explained previously
- Aggregation: sum by(job) (some_metric) - Allows aggregating over multiple series, always yields an instant vector
- Unary operators: -some_metric - Negates any scalar or instant vector values, returns the same type as it was applied on
- Binary operators: some_metric_1 + some_metric_2 - Returns scalar if both operands are scalar, otherwise vector
- Function calls: rate(some_metric[15m]) - Takes input parameters of varying types, returns varying types
- Sub-queries: (expression)[1d:] - Takes instant vector expression as input, returns a range vector
- Parentheses expressions: (42) - May return string, scalar, instant vector, or range vector, depending on usage
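Several of these node types usually appear together in a single query. The sketch below combines a function call, a range vector selector, and a sub-query; it reuses the demo_api_request_duration_seconds_count metric from earlier in this article, and the 1h range and 1m resolution are arbitrary choices for illustration.

```promql
# rate(...) is a function call over a range vector selector and returns an instant vector.
# The [1h:1m] sub-query re-evaluates that expression every minute over the last hour,
# producing a range vector that max_over_time() can consume.
max_over_time(
  rate(demo_api_request_duration_seconds_count{job="services"}[5m])[1h:1m]
)
```

Leaving out the resolution after the colon, as in (expression)[1d:], makes the sub-query fall back to the default evaluation interval.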
It feels like we are entering the realm of mathematics, and you might even remember some of this theory from your computer science courses at university, no? Don't worry, only the short foundational theory is covered before you jump right back into the hands-on application of it all.
Advanced Topics
You jump right into the more advanced topics: working with histograms and quantiles, calculating latency, aggregating away extra metrics dimensions (cardinality problems), applying filters, creating queries with thresholds for alerting rules, filtering with time series data, filtering with booleans, exploring the set operators available to you (AND, OR, UNLESS), exploring and manipulating metrics with timestamps, setting up detection queries to discover slow batch jobs, setting up a second services demo instance to explore how to query for running instance health in your infrastructure, and learning how to smooth out spiky graphs you generate with complex queries.
The spiky graph was pretty fun to see, so let's slow down here and share the query that generates it:
go_goroutines{job="services"}
Which indeed does produce something pretty ugly:
To make this graph more useful, you smooth it out using averages over time as follows:
avg_over_time(go_goroutines{job="services"}[10m])
Which sorts out the graph into something you can make sense of:
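How smooth the result looks depends mostly on the window you choose. The variations below are a sketch for experimenting with that trade-off; the window sizes are arbitrary and are not taken from the lab.

```promql
# Shorter window: follows the raw signal more closely and smooths less.
avg_over_time(go_goroutines{job="services"}[2m])

# Longer window: a smoother line, but slower to reflect real changes.
avg_over_time(go_goroutines{job="services"}[30m])

# Peak instead of mean over the same window, useful when the spikes themselves matter.
max_over_time(go_goroutines{job="services"}[10m])
```

A longer window trades responsiveness for readability, so it is worth trying a few sizes against your own data before settling on one.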
There is so much you learn in this lab that it does not fit into an article, so make sure to take your time and run through this lab and you'll be running advanced queries to solve all kinds of observability issues!
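To give a flavor of a few of the advanced topics listed earlier, here are some hedged sketches. They are not copied from the lab. The _bucket and _sum series are assumed to exist alongside the _count series used above (as they do for any Prometheus histogram), demo_batch_last_success_timestamp_seconds is an assumed metric name for a batch job, and the thresholds are arbitrary.

```promql
# 90th percentile request latency over the last 5 minutes, aggregated across all dimensions but le.
histogram_quantile(
  0.9,
  sum by(le) (rate(demo_api_request_duration_seconds_bucket{job="services"}[5m]))
)

# Average request latency: total time spent divided by number of requests.
  rate(demo_api_request_duration_seconds_sum{job="services"}[5m])
/
  rate(demo_api_request_duration_seconds_count{job="services"}[5m])

# Threshold filter: keeps only series above the threshold, the shape used in alerting rules.
go_goroutines{job="services"} > 100

# Boolean filter: keeps every series and returns 1 or 0 instead of dropping non-matches.
go_goroutines{job="services"} > bool 100

# Set operator: high goroutine counts, but only for instances that are currently up.
(go_goroutines{job="services"} > 100) and on(instance) (up{job="services"} == 1)

# Instance health: instances of the job whose last scrape failed.
up{job="services"} == 0

# Slow batch job detection (assumed metric name): no success in the last hour.
time() - demo_batch_last_success_timestamp_seconds{job="services"} > 3600
```

Each query stands alone, so you can paste them one at a time into PromLens and adjust the metric names and thresholds to whatever your own targets expose.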
Missed Previous Labs?
This is one lab in the more extensive free online workshop. Feel free to start from the very beginning of this workshop here if you missed anything previously:
You can always proceed at your own pace and return any time you like as you work your way through this workshop. Just stop and later restart Prometheus to pick up where you left off.
Coming Up Next
I'll be taking you through the next lab in this workshop, where you'll start to explore open dashboards and visualization of your metrics data. Stay tuned for more hands-on material to help you with your cloud-native observability journey.
Published at DZone with permission of Eric D. Schabell. See the original article here.