Prometheus: As Simple As Possible

by | 19.09.2021 | Engineering

Distributed systems bring an organisation many benefits, but at the cost of complexity. With the growing adoption of container orchestrators like Kubernetes came a need for monitoring and alerting systems.

One such system is Prometheus, which is famous for being “the Kubernetes monitoring solution.”

This post explores Prometheus at a beginner level, without the intricate details that tend to scare a novice.

Let’s Start!

What is Prometheus?

SoundCloud started developing Prometheus in 2012 after determining that its existing monitoring tools were insufficient for its needs. Prometheus 1.0 was released in 2016, and today the project is fully open source and a graduated CNCF project.

Prometheus is a monitoring and alerting tool built around a time series database. It gathers data from apps and systems and lets you visualise it and set up alerts. We will go deeper into this in a minute, but first: why do we need monitoring at all?

The Story Behind Prometheus and Cloud

Long before you started reading this blog, there lived giant creatures called Monoliths. They were slow and complex. It was hard to understand them and harder to diagnose their problems. Then came an evolution, and micro creatures turned out to be a better option. They brought many benefits and were individually less complex. But isn’t managing hundreds or thousands of creatures more demanding than managing one?

You’re correct!


So, we need someone to orchestrate the herd and someone to watch over it. The orchestrator is Kubernetes, and Prometheus handles the monitoring. Suppose your micro creatures had one telephone in their camp. Prometheus helps you see how many hours they spend talking, whether they can connect to the telephone service and how many times their calls fail. All without going to every individual’s tent and enquiring!

Your micro creatures are microservices, each running in a container (or virtual machine) and performing a job.

Prometheus Architecture

The Prometheus Server consists of 3 components:

Time series database (TSDB)

All metrics are stored in a time series database, which is optimised for time-stamped data: it records how values change over time and lets you query them efficiently. Put simply, it holds every collected metric for you to reference later.

Figure: The architecture of Prometheus and its components (source: Prometheus)

Data Retrieval Worker

It collects metrics from external sources by pulling (scraping) them and stores them in the TSDB.

HTTP Server

It accepts queries against the database and serves the results through an HTTP API and a simple web UI, which helps you explore and visualise your metrics. The UI also shows the loaded configuration and the status of each scrape target, so one central server tells you when it last received data from every target.

How does Prometheus work?

Querying and Scraping

Prometheus regularly scrapes metrics from our apps and services over HTTP. Each target exposes its current metrics on an HTTP endpoint, typically through a client library built into the application. So, in order to collect metrics, you don’t need to install a separate monitoring agent on your physical servers or in your container images.

According to the Prometheus FAQ, this pull-based approach has several advantages over the traditional push-based approach:
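To make the idea of a scrape target concrete, here is a minimal sketch of an app exposing metrics over HTTP, using only the Python standard library. The metric name app_requests_total, the /metrics path and port 8000 are assumptions chosen for illustration; a real application would normally use an official Prometheus client library instead of rendering the text by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory metric for illustration: total handled requests.
REQUESTS_TOTAL = {"value": 0}


def render_metrics() -> str:
    """Render the current metric value in the Prometheus text format."""
    lines = [
        "# HELP app_requests_total Total number of handled requests.",
        "# TYPE app_requests_total counter",
        f"app_requests_total {REQUESTS_TOTAL['value']}",
    ]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the assumed /metrics path is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Serve /metrics until interrupted; a Prometheus server would scrape this URL."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() and opening http://localhost:8000/metrics in a browser shows exactly what a scrape would collect.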

You can run your monitoring on your laptop when developing changes.
You can more easily tell if a target is down.
You can manually go to a target and inspect its health with a web browser.

— Prometheus — FAQ

Service discovery

Prometheus was built from the ground up to work in dynamic environments like Kubernetes and needs very little configuration when first installed. It automatically discovers running services in order to make a “best guess” as to what it should be monitoring.

Figure: Prometheus exporters (source: Devconnected)

Snapshots and Querying

Now that your data has been scraped, what’s next? We want to store the metrics.

Prometheus stores every scraped value as a timestamped sample in its database. You can use PromQL queries in the Prometheus web UI, or other tools like Grafana, to explore how metric data evolves over time.

You can also attach labels to your metrics to organise them. You no longer have to hunt for metric names manually after changes, because you can select metrics by their labels. How cool is that?
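As a rough sketch of selecting metrics by label, the snippet below builds a PromQL selector with equality matchers and turns it into an instant-query URL for the Prometheus HTTP API (/api/v1/query). The metric name http_requests_total, the labels job and status, and the server address http://localhost:9090 are assumptions for illustration.

```python
from urllib.parse import urlencode


def selector(metric: str, **labels: str) -> str:
    """Build a PromQL selector such as name{label="value"} using equality matchers."""
    if not labels:
        return metric
    matchers = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    return f"{metric}{{{matchers}}}"


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL of the Prometheus HTTP API for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Hypothetical metric and labels: all requests of the "api" job that ended in a 500.
expression = selector("http_requests_total", job="api", status="500")
url = build_query_url("http://localhost:9090", expression)
```

Fetching that URL (for example with urllib.request.urlopen) returns the matching series as JSON; the same expression can be pasted into the web UI.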

Different Types of Metrics

A metric is a specific set of data exposed by an endpoint. In the text format that Prometheus scrapes, each metric can be described by HELP and TYPE comment lines, which are explained below.

HELP

When the token is HELP, at least one more token is expected: the metric name. All remaining tokens make up the docstring for that metric name. Only one HELP line may exist for any given metric name.

TYPE

When the token is TYPE, exactly two more tokens are expected. The first is the metric name, and the second specifies what kind of metric it is (counter, gauge, histogram, summary, or untyped). Only one TYPE line is allowed for each metric name, and it must appear before the first sample of that metric. If a metric name does not have a TYPE line, the type is set to untyped.

Metric Types
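The token rules for HELP and TYPE can be sketched as a tiny parser. This is a simplified illustration rather than a complete parser for the exposition format, and the sample metric http_requests_total is made up for the example.

```python
def parse_metadata(text: str) -> dict:
    """Collect the HELP docstring and TYPE qualifier for each metric name."""
    meta = {}
    for line in text.splitlines():
        # Expected tokens: "#", keyword, metric name, and the rest of the line.
        tokens = line.split(None, 3)
        if len(tokens) < 3 or tokens[0] != "#":
            continue  # sample lines and blank lines carry no metadata
        keyword, name = tokens[1], tokens[2]
        rest = tokens[3] if len(tokens) > 3 else ""
        if keyword == "HELP":
            meta.setdefault(name, {})["help"] = rest
        elif keyword == "TYPE":
            meta.setdefault(name, {})["type"] = rest.strip()
    return meta


# Hypothetical scrape output with one counter.
SAMPLE = """\
# HELP http_requests_total Total HTTP requests served.
# TYPE http_requests_total counter
http_requests_total{method="get"} 1027
"""
```

Running parse_metadata(SAMPLE) yields one entry for http_requests_total with its docstring and the type counter; the sample line itself is skipped.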

Counter

A counter is a cumulative value that only ever goes up (or resets to zero when the process restarts). Use it for things like the number of errors or the number of requests. This type is recommended unless your metric value can fluctuate and go down.
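Because a counter never goes down on its own, a drop between two samples can be read as a restart. The sketch below computes the total increase over a list of scraped counter values under that assumption. It is a simplification for illustration, not the exact algorithm PromQL uses, and the sample values are invented.

```python
from typing import Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Total increase across successive counter samples, treating a drop as a reset to zero."""
    total = 0.0
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            # The counter restarted from zero, so everything counted since is new.
            total += current
    return total


# Hypothetical scrapes: the process restarted between the third and fourth sample.
increase = counter_increase([10, 15, 20, 3, 8])
```

Here the increase is 5 + 5 + 3 + 5 = 18, even though the last raw value (8) is lower than the first (10).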

Gauge

It is ideal for metrics that fluctuate, like CPU utilisation.

Histogram

A histogram samples observations (such as request durations) and counts them in configurable buckets. It also exposes the sum of all observed values and the total number of observations. With several series per metric, it is one of the more challenging metric types to read.
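Here is a minimal sketch of the bookkeeping behind a histogram: per-bucket counts, a running sum and a total count, with buckets reported cumulatively by upper bound. The class name and the bucket bounds are assumptions for illustration and do not mirror any client library’s API.

```python
class Histogram:
    """Toy histogram with fixed upper bounds plus an implicit +Inf bucket."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.bucket_counts = [0] * (len(self.bounds) + 1)  # last slot is +Inf
        self.sum = 0.0
        self.count = 0

    def observe(self, value: float) -> None:
        self.sum += value
        self.count += 1
        for index, bound in enumerate(self.bounds):
            if value <= bound:
                self.bucket_counts[index] += 1
                return
        self.bucket_counts[-1] += 1  # larger than every finite bound

    def cumulative(self):
        """Return (upper bound, cumulative count) pairs, smallest bound first."""
        running, result = 0, []
        for bound, n in zip(self.bounds + [float("inf")], self.bucket_counts):
            running += n
            result.append((bound, running))
        return result


# Hypothetical request durations in seconds.
durations = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 0.4, 2.0):
    durations.observe(seconds)
```

Each cumulative pair answers “how many observations were at most this large?”, so the last pair always equals the total count.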

Summary

Like a histogram, a summary samples observations (usually things like request durations and response sizes). It also gives you the total number of observations and the sum of all observed values, and it calculates configurable quantiles over a sliding time window.

Prometheus offers only these four metric types (Counter, Gauge, Histogram and Summary), so choose the one that best fits your purpose.
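The sliding-window idea can be sketched naively by keeping recent raw observations and sorting them on demand. Real client libraries use more efficient streaming estimators, so treat this purely as an illustration; the class name, the 60-second window and the explicit timestamps are assumptions.

```python
import math
from collections import deque


class SlidingSummary:
    """Naive summary: keeps raw observations from the last `window` seconds."""

    def __init__(self, window: float):
        self.window = window
        self.samples = deque()  # (timestamp, value) pairs, oldest first
        self.count = 0          # total observations ever seen
        self.sum = 0.0          # sum of all observed values

    def observe(self, value: float, now: float) -> None:
        self.count += 1
        self.sum += value
        self.samples.append((now, value))

    def quantile(self, q: float, now: float) -> float:
        """Nearest-rank quantile over the observations still inside the window."""
        while self.samples and self.samples[0][0] < now - self.window:
            self.samples.popleft()  # too old, falls out of the window
        values = sorted(value for _, value in self.samples)
        if not values:
            return math.nan
        rank = max(1, math.ceil(q * len(values)))
        return values[rank - 1]


# Hypothetical durations: one old observation, then four recent ones.
summary = SlidingSummary(window=60)
summary.observe(1.0, now=0)
for offset, seconds in enumerate((0.1, 0.2, 0.3, 0.4)):
    summary.observe(seconds, now=100 + offset)
median = summary.quantile(0.5, now=110)
```

The old observation still counts towards count and sum, but it no longer influences the quantiles once it leaves the window.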

Fail Proofs

When “pulling” metrics is not possible (for example, short-lived jobs that will not live long enough to be scraped), Prometheus provides a Pushgateway that allows applications to still push metric data if necessary. Essentially, we get the best of both worlds.
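As a rough sketch, a short-lived job could push its result to a Pushgateway with a plain HTTP PUT to /metrics/job/<job name>. The gateway address http://localhost:9091, the job name nightly_backup and the metric are assumptions for illustration; the request is only built here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request


def build_push_request(gateway_url: str, job: str, body: str) -> Request:
    """Build (but do not send) an HTTP PUT that replaces the pushed metrics of one job."""
    url = f"{gateway_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


# Hypothetical batch job reporting when it last finished successfully.
payload = "backup_last_success_timestamp_seconds 1632009600\n"
request = build_push_request("http://localhost:9091", "nightly_backup", payload)
```

Passing the request to urllib.request.urlopen would send it; Prometheus then scrapes the Pushgateway like any other target.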

Alerting

You can use PromQL, the query language of Prometheus, to retrieve metrics from the database, and the Prometheus web UI, API clients or Grafana to visualise them. Alerting rules are written in PromQL as well. When a rule fires, Alertmanager takes over and sends notifications to external services such as PagerDuty or email.

Final Thoughts

Prometheus is popular, and everyone in the CNCF landscape knows it. Its features make it one of the most promising monitoring systems and a serious alternative to the likes of Amazon CloudWatch, Application Insights and New Relic, which are push-based platforms. That said, Prometheus itself states that pull versus push shouldn’t be a major point when choosing a monitoring system. And even though Prometheus serves as Kubernetes’ best friend, it’s capable of more!

If you want to get started with Prometheus, there is no better place than the First Steps with Prometheus documentation.

Read more of our posts on the CNCF landscape, and feel free to contact us if you need help with your Monitoring Journey!

Happy Learning!
