Difficulty: Moderate
Estimated Time: 10 minutes

Intermediate: Ingesting metrics data from unreachable sources with Thanos Receive

The Thanos project defines a set of components that can be composed together into a highly available metric system with unlimited storage capacity that seamlessly integrates into your existing Prometheus deployments.

In this course you get first-hand experience building and deploying this infrastructure yourself.

In this tutorial, you will learn:

  • How to ingest metrics data from Prometheus instances that are unreachable from your infrastructure.
  • How to set up a Thanos Querier to access this data.
  • How Thanos Receive differs from Thanos Sidecar, and when to use each of them.

This will allow you to set up infrastructure

NOTE: This course uses Docker containers running pre-built, publicly available Thanos, Prometheus, and Minio images.

Prerequisites

Please complete tutorial #1 first: Global View and seamless HA for Prometheus 🤗

Feedback

Do you see a bug or typo in the tutorial, or do you have feedback for us? Let us know on https://github.com/thanos-io/thanos or in the #thanos Slack channel linked on https://thanos.io

Contributed by:

Summary

Congratulations! 🎉🎉🎉 You completed this Thanos Receive tutorial. Let's summarize what we learned:

  • Thanos Receive is a component that implements the Prometheus Remote Write protocol.
  • Prometheus can be configured to remote write its metric data in real time to another server that implements the Remote Write protocol.
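
As a rough illustration of the second point, the sketch below prints the kind of remote_write stanza that would be added to a Prometheus configuration file. The helper name remote_write_stanza is hypothetical. The example URL in the usage note is an assumption: it combines the /api/v1/receive path served by Thanos Receive with its default remote write port, 19291. Your deployment may listen elsewhere.

```shell
# Hypothetical helper: print a remote_write stanza for a Prometheus config file.
# $1 is the URL of a server implementing the Remote Write protocol
# (for example, a Thanos Receive endpoint; the exact address is an assumption).
remote_write_stanza() {
    printf "remote_write:\n  - url: '%s'\n" "$1"
}
```

For example, `remote_write_stanza 'http://127.0.0.1:19291/api/v1/receive'` prints a two-line stanza that could be appended to a configuration file such as prometheus-batcave.yaml.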

See next courses for other tutorials about different deployment models and more advanced features of Thanos!

Further Reading

To learn more about Thanos Receive, check out the following resources:

Problem Statement & Setup

Problem Statement

Let's imagine that you run a company called Wayne Enterprises. This company runs two clusters: Batcave & Batcomputer. Each of these sites runs an instance of Prometheus that collects metrics data from applications and services running there.

However, these sites are special. For security reasons, they do not expose public endpoints to the Prometheus instances running there, and so cannot be accessed directly from other parts of your infrastructure.

As the person responsible for implementing monitoring for these sites, you have two requirements to meet:

  1. Implement a global view of this data. Wayne Enterprises needs to know what is happening in all parts of the company - including secret ones!
  2. The global view must be queryable in near real-time. We can't afford any delay in monitoring these locations!

First, let us set up two Prometheus instances...

Setup

Batcave

Let's use a very simple configuration file that tells Prometheus to scrape its own metrics page every 5 seconds.

global:
  scrape_interval: 5s
  external_labels:
    cluster: batcave
    replica: 0

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['127.0.0.1:9090']

Run the Prometheus instance:

docker run -d --net=host --rm \
    -v /root/editor/prometheus-batcave.yaml:/etc/prometheus/prometheus.yaml \
    -v /root/prometheus-batcave-data:/prometheus \
    -u root \
    --name prometheus-batcave \
    quay.io/prometheus/prometheus:v2.27.0 \
    --config.file=/etc/prometheus/prometheus.yaml \
    --storage.tsdb.path=/prometheus \
    --web.listen-address=:9090 \
    --web.external-url=https://[[HOST_SUBDOMAIN]]-9090-[[KATACODA_HOST]].environments.katacoda.com \
    --web.enable-lifecycle

Verify that prometheus-batcave is running by navigating to the Batcave Prometheus UI.

Why do we enable the web lifecycle flag? By specifying --web.enable-lifecycle, we tell Prometheus to expose the /-/reload HTTP endpoint. This lets us tell Prometheus to dynamically reload its configuration, which will be useful later in this tutorial.
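
As a minimal sketch of what such a reload looks like, the helper below sends an HTTP POST to the /-/reload endpoint. The function name reload_prometheus is hypothetical, and the address assumes an instance listening on 127.0.0.1 as configured in this tutorial (port 9090 for Batcave by default).

```shell
# Hypothetical helper: ask a local Prometheus instance to reload its configuration.
# Only works if the instance was started with --web.enable-lifecycle.
# $1 is the port the instance listens on; defaults to 9090 (Batcave in this tutorial).
reload_prometheus() {
    curl -fsS -X POST "http://127.0.0.1:${1:-9090}/-/reload"
}
```

Calling `reload_prometheus 9090` would ask the Batcave instance to re-read its configuration file without restarting the container.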

Batcomputer

This uses almost exactly the same configuration as above, except that we run the Prometheus instance on port 9091.

global:
  scrape_interval: 5s
  external_labels:
    cluster: batcomputer
    replica: 0

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['127.0.0.1:9091']

Run the Prometheus instance:

docker run -d --net=host --rm \
    -v /root/editor/prometheus-batcomputer.yaml:/etc/prometheus/prometheus.yaml \
    -v /root/prometheus-batcomputer:/prometheus \
    -u root \
    --name prometheus-batcomputer \
    quay.io/prometheus/prometheus:v2.27.0 \
    --config.file=/etc/prometheus/prometheus.yaml \
    --storage.tsdb.path=/prometheus \
    --web.listen-address=:9091 \
    --web.external-url=https://2886795291-9091-elsy05.environments.katacoda.com \
    --web.enable-lifecycle

Verify that prometheus-batcomputer is running by navigating to the Batcomputer Prometheus UI.
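
If you prefer the terminal to the UI, one possible check is to query each instance's /-/ready endpoint, which Prometheus serves once it is ready to handle traffic. This is only a sketch: the helper names prometheus_ready and check_instances are hypothetical, and the ports assume the two instances started above (9090 and 9091 on 127.0.0.1).

```shell
# Hypothetical helper: succeed if the Prometheus instance on port $1 reports ready.
prometheus_ready() {
    curl -fsS "http://127.0.0.1:${1}/-/ready" >/dev/null
}

# Hypothetical helper: report readiness for both instances in this tutorial
# (Batcave on 9090, Batcomputer on 9091).
check_instances() {
    for port in 9090 9091; do
        if prometheus_ready "$port"; then
            echo "port ${port}: ready"
        else
            echo "port ${port}: not ready"
        fi
    done
}
```

Running `check_instances` prints one line per instance, which makes it easy to spot a container that failed to start.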

With these Prometheus instances configured and running, we can now start to architect our global view of all of Wayne Enterprises.
