Thanos working principle and components introduction

This article was last updated on: February 7, 2024

Introduction to Thanos

Thanos is an “open source, highly available Prometheus system with long-term storage capabilities.” It is a CNCF incubating project and is used by many well-known companies.

A key feature of Thanos is “unlimited” storage through the use of object storage such as S3. The object store can be a managed service from a cloud provider or a self-hosted solution such as Ceph, Rook, or MinIO.

How it works

Thanos and Prometheus work side by side, and it is common to start with Prometheus and upgrade to Thanos later.

Thanos is divided into components, each with a single purpose (a typical cloud-native architecture), and the components communicate with each other via gRPC.

Thanos Sidecar

The Thanos Sidecar runs alongside Prometheus (as a sidecar) and uploads Prometheus metrics to object storage every 2 hours. This makes Prometheus almost stateless. Prometheus still keeps the most recent 2 hours of metrics in memory, so you can still lose up to 2 hours of metrics in the event of downtime (this should be handled by your Prometheus setup, using HA/sharding, not by Thanos).

📓 Reference documentation:

Prometheus basic high-availability architecture

The Thanos Sidecar can be deployed easily together with the Prometheus Operator and the kube-prometheus stack. This component acts as a store for Thanos Query.

Thanos Store

Thanos Store acts as a gateway that translates queries into requests against remote object storage. It can also cache some information on local storage. In short, this component lets you query the metrics held in the object store. It acts as a store for Thanos Query.
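To make the gateway role more concrete, here is a minimal sketch of one thing such a component has to do: decide which stored blocks are relevant to a query's time range. The `Block` class and `blocks_for_query` function are hypothetical names used only for illustration; this is a toy model, not the Store's actual implementation.

```python
from dataclasses import dataclass

HOUR = 3_600_000  # one hour in milliseconds


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for a block's metadata in object storage."""
    block_id: str
    min_time: int  # inclusive, milliseconds
    max_time: int  # exclusive, milliseconds


def blocks_for_query(blocks, start, end):
    """Return the blocks whose [min_time, max_time) overlaps [start, end]."""
    return [b for b in blocks if b.min_time <= end and b.max_time > start]


# Three consecutive 2-hour blocks, as an assumed example layout.
blocks = [
    Block("a", 0 * HOUR, 2 * HOUR),
    Block("b", 2 * HOUR, 4 * HOUR),
    Block("c", 4 * HOUR, 6 * HOUR),
]

selected = blocks_for_query(blocks, start=3 * HOUR, end=5 * HOUR)
print([b.block_id for b in selected])
```

Only the blocks that overlap the requested range need to be read, which is why a query over a short window can stay cheap even when years of data sit in the bucket.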

Thanos Compactor

Thanos Compactor is a singleton (it cannot be scaled out) that is responsible for compacting and downsampling the metrics stored in object storage. Downsampling (data aging) is the coarsening of a metric’s granularity over time. For example, you might want to keep your metrics for 2 or 3 years, but you don’t need as many data points for them as for yesterday’s metrics. This is where the Compactor comes in, which saves bytes on object storage and thus costs.
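The idea of downsampling can be sketched in a few lines. The example below groups raw samples into fixed windows and keeps only a handful of aggregates per window. The `downsample` function, the 300-second window, and the sample data are assumptions made for illustration; this is a simplified model of the concept, not the Compactor's actual algorithm.

```python
from collections import defaultdict


def downsample(samples, resolution_s=300):
    """Group (timestamp_s, value) samples into fixed windows of
    resolution_s seconds and keep count/sum/min/max for each window."""
    windows = defaultdict(list)
    for ts, value in samples:
        window_start = ts - ts % resolution_s
        windows[window_start].append(value)
    return {
        start: {
            "count": len(values),
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
        }
        for start, values in sorted(windows.items())
    }


# Assumed raw data: one sample every 15 seconds for 10 minutes.
raw = [(t, float(t // 60)) for t in range(0, 600, 15)]
coarse = downsample(raw)

print(len(raw), "raw samples ->", len(coarse), "windows")
for start, aggregates in coarse.items():
    print(start, aggregates)
```

Keeping aggregates such as min, max, sum, and count (rather than a single averaged value) means that functions over long ranges can still be answered sensibly from the coarse data.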

Thanos Query

Thanos Query is the main component of Thanos and the central point to which PromQL queries are sent. It exposes a Prometheus-compatible endpoint and fans each query out to all “stores”. Keep in mind that a store can be any Thanos component that serves metrics. Thanos Query can also send queries to another Thanos Query (they can be stacked), so a store can be any of the following:

  • Thanos Store
  • Thanos Sidecar
  • Thanos Query
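Because the endpoint is Prometheus-compatible, a client can talk to Thanos Query the same way it would talk to Prometheus. The sketch below only builds the URL for an instant query against the standard `/api/v1/query` path and sends nothing over the network. The host name `thanos-query.example`, the port, and the helper `build_instant_query_url` are hypothetical, and the `dedup` parameter is assumed here to be Thanos's query-time deduplication toggle; check the documentation of your version before relying on it.

```python
from urllib.parse import urlencode


def build_instant_query_url(base_url, promql, dedup=True):
    """Build an instant-query URL for a Prometheus-compatible HTTP API."""
    params = {
        "query": promql,
        # Assumed Thanos-specific parameter; plain Prometheus does not use it.
        "dedup": "true" if dedup else "false",
    }
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


# Hypothetical address of a Thanos Query instance.
url = build_instant_query_url("http://thanos-query.example:10902", 'up{job="node"}')
print(url)
```

Tools that already speak the Prometheus HTTP API, such as Grafana, can therefore usually be pointed at Thanos Query without further changes.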

Thanos Query is also responsible for deduplicating identical metrics that come from different stores or Prometheus instances. For example, if a metric exists both in Prometheus and in object storage, Thanos Query can deduplicate its values. In Prometheus HA setups, deduplication is also based on Prometheus replicas and shards.
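As a rough illustration of replica-based deduplication, the sketch below drops a replica label and merges series that become identical once it is removed. The label name `replica`, the `deduplicate` function, and the "first value seen wins" rule are assumptions chosen for simplicity; Thanos's real deduplication logic is more involved than this toy merge.

```python
def deduplicate(series, replica_label="replica"):
    """Merge series that differ only by the replica label.

    series is a list of (labels_dict, [(timestamp, value), ...]) pairs.
    """
    merged = {}
    for labels, samples in series:
        # Identity of a series once the replica label is ignored.
        key = tuple(sorted((k, v) for k, v in labels.items() if k != replica_label))
        points = merged.setdefault(key, {})
        for ts, value in samples:
            # Keep the first value seen for each timestamp.
            points.setdefault(ts, value)
    return {key: sorted(points.items()) for key, points in merged.items()}


# Two hypothetical HA replicas scraping the same target with overlapping data.
series = [
    ({"__name__": "up", "job": "node", "replica": "a"}, [(0, 1.0), (15, 1.0)]),
    ({"__name__": "up", "job": "node", "replica": "b"}, [(15, 1.0), (30, 1.0)]),
]

result = deduplicate(series)
for key, samples in result.items():
    print(dict(key), samples)
```

The two replica series collapse into a single series that covers the union of their timestamps, which is also how a gap in one replica can be filled by the other.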

Thanos Query Frontend

As its name suggests, Thanos Query Frontend sits in front of Thanos Query. Its goal is to split a large query into multiple smaller queries and to cache the query results (in memory or in Memcached).

There are other components as well, such as Thanos Ruler and, for remote-write setups, Thanos Receiver.
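The splitting and caching idea can be sketched as follows. A long range query is cut into pieces aligned to a fixed interval, and each piece is cached so that a later, overlapping query only hits the backend for pieces it has not seen. `split_range`, `CachingFrontend`, the one-day interval, and `fake_backend` are all hypothetical names and values for illustration; this is a toy model, not the Query Frontend's actual code.

```python
DAY = 86_400  # seconds


def split_range(start, end, interval):
    """Split [start, end) into pieces that end on multiples of interval."""
    pieces = []
    cursor = start
    while cursor < end:
        boundary = min(end, (cursor // interval + 1) * interval)
        pieces.append((cursor, boundary))
        cursor = boundary
    return pieces


class CachingFrontend:
    """Splits range queries and caches the result of each piece in memory."""

    def __init__(self, backend, interval):
        self.backend = backend
        self.interval = interval
        self.cache = {}
        self.backend_calls = 0

    def query_range(self, query, start, end):
        results = []
        for piece in split_range(start, end, self.interval):
            key = (query, piece)
            if key not in self.cache:
                self.backend_calls += 1
                self.cache[key] = self.backend(query, *piece)
            results.extend(self.cache[key])
        return results


def fake_backend(query, start, end):
    # Stand-in for the querier; it just echoes the range it was asked for.
    return [(start, end)]


frontend = CachingFrontend(fake_backend, interval=DAY)
first = frontend.query_range("up", DAY // 2, 3 * DAY)
print(first, frontend.backend_calls)
```

Aligning pieces to interval boundaries is what makes the cache reusable: two queries with different start times still share every full interval between them.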

Thanos deployment architecture

Sidecar mode deployment:

thanos-architecture-deployment-with-sidecar

Receiver mode deployment:

thanos-architecture-deployment-with-receiver


https://e-whisper.com/posts/17917/
Author: east4ming
Posted on: August 12, 2021
Licensed under