Prometheus and VictoriaMetrics: a resilient metrics storage infrastructure

In this article, my colleague Luca Carboni, a DevOps Engineer at Miro's Amsterdam office, explains how our metrics storage infrastructure is set up. All of its components follow the principles of high availability and fault tolerance, each has a clearly defined role, they can store data for a long time, and they are cost-efficient.





The stack in question: Prometheus, Alertmanager, Pushgateway, Blackbox exporter, Grafana, and VictoriaMetrics.





Configuring high availability and fault tolerance for Prometheus

Prometheus federation, Prometheus. , Grafana : , - .





, . , Prometheus , .





, . (prometheus.yml) , . A B .





. IaC ( ) Terraform (CM) Ansible, . , . , .Alertmanager, Pushgateway, Blackbox,





.





Alertmanager , Prometheus Alertmanager, . Alertmanager , : Prometheus A Prometheus B. IaC CM, Alertmanager .





- , . , — Prometheus A Prometheus B .





Pushgateway , . . Pushgateway DNS Failover , ( active/passive). , .





Blackbox Prometheus A Prometheus B.





, Prometheus, Alertmanager, , Pushgateway active/passive Blackbox. .





. VPC (Virtual Private Cloud), , . . , . — . , , .





Prometheus, , . , . . " , ".





VictoriaMetrics

Prometheus . Prometheus , . . 10 . , ? , — . Prometheus , - , .





Cortex, Thanos, M3DB, VictoriaMetrics . Prometheus, — , , — .





, VictoriaMetrics.





VictoriaMetrics is available in two versions: an all-in-one single-node version and a cluster version.





— . (), .





The VictoriaMetrics cluster version consists of three components: vmstorage (stores the data), vminsert (ingests the data), and vmselect (serves queries).





vminsert . , , . vminsert (stateless), , , .





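The key vminsert options are the addresses of the storage nodes (storageNode) and the replication factor (replicationFactor=N, where N is the number of vmstorage nodes that receive a copy of each sample). Who sends data to vminsert? Prometheus, via remote_write.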
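To make the effect of replicationFactor more concrete, here is a minimal sketch of how a stateless ingestion component could pick N storage nodes for a time series. It is an illustration only: the function name, the node addresses, and the SHA-256 based placement are assumptions for this example, and VictoriaMetrics' real node-selection algorithm is not reproduced here.

```python
import hashlib


def pick_storage_nodes(series_key, storage_nodes, replication_factor):
    """Return the nodes that would receive a copy of the given series.

    Hypothetical placement: hash the series key to choose a starting node,
    then take the next replication_factor nodes in ring order.
    """
    if not 1 <= replication_factor <= len(storage_nodes):
        raise ValueError("replication_factor must be between 1 and the number of nodes")
    digest = hashlib.sha256(series_key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(storage_nodes)
    return [
        storage_nodes[(start + i) % len(storage_nodes)]
        for i in range(replication_factor)
    ]


# Hypothetical vmstorage addresses, as they might be passed via storageNode.
nodes = ["vmstorage-1:8400", "vmstorage-2:8400", "vmstorage-3:8400"]
replicas = pick_storage_nodes('http_requests_total{job="api"}', nodes, replication_factor=2)
print(replicas)
```

Because the choice depends only on the series key and the node list, any number of identical ingestion instances would route the same series to the same nodes, which is what allows such a component to stay stateless.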





vmstorage — , VictoriaMetrics. vminsert vmselect, vmstorage (stateful), . vmstorage , (IO latency) (IOPS), , Prometheus.





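Key vmstorage options:

  • storageDataPath: the path on disk where the data is stored;

  • retentionPeriod: how long the data is retained;

  • dedup.minScrapeInterval: the interval used to deduplicate samples.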
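The dedup.minScrapeInterval option matters in this setup because two Prometheus instances (A and B) scrape the same targets and both forward their samples. The sketch below is a simplified model of interval-based deduplication: for each series it keeps one sample per interval, the one with the largest timestamp. The function name, the sample layout, and the exact bucket boundaries are assumptions made for illustration, not VictoriaMetrics internals.

```python
def deduplicate(samples, min_scrape_interval_ms):
    """Keep one sample per series per interval: the one with the largest timestamp.

    samples is an iterable of (series, timestamp_ms, value) tuples.
    """
    kept = {}
    for series, ts_ms, value in samples:
        bucket = (series, ts_ms // min_scrape_interval_ms)
        current = kept.get(bucket)
        # A later timestamp in the same bucket replaces the earlier sample.
        if current is None or ts_ms > current[1]:
            kept[bucket] = (series, ts_ms, value)
    return sorted(kept.values(), key=lambda s: (s[0], s[1]))


# Two hypothetical replicas scrape the same target every 15 s, slightly out of phase.
from_a = [("up{job='api'}", 0, 1.0), ("up{job='api'}", 15_000, 1.0)]
from_b = [("up{job='api'}", 2_000, 1.0), ("up{job='api'}", 17_000, 1.0)]

deduped = deduplicate(from_a + from_b, min_scrape_interval_ms=15_000)
print(deduped)
```

In this model, four incoming samples collapse to two, so dashboards see a single series instead of doubled data points. Setting the interval to the scrape interval of the Prometheus pair is the natural choice under these assumptions.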





vmstorage , replicationFactor, vminsert, (N) .





vmstorage , , , vmstorage .





vmselect . , , . , Prometheus, , . , , Grafana. vminsert, vmselect .





Grafana

Grafana , Prometheus, , VictoriaMetrics. , VictoriaMetrics (MetricsQL) PromQL, Prometheus. Grafana.





Grafana SQLite . SQLite , , . . , PostgreSQL Amazon RDS, Multi-AZ , .





Grafana PostgreSQL. , Grafana . PostgreSQL Grafana, , vendor lock. , .





, Grafana. .





Grafana VictoriaMetrics — , vmselect, — Prometheus . .
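Since vmselect exposes a Prometheus-compatible query API, a client such as Grafana only needs a base URL. As a rough sketch, the snippet below builds an instant-query URL against vmselect. The hostname is hypothetical, and the port (8481) and tenant ID (0) are the commonly documented defaults, assumed here rather than taken from our configuration.

```python
from urllib.parse import urlencode


def vmselect_query_url(base_url, promql, account_id=0):
    """Build a Prometheus-style instant query URL served by vmselect."""
    path = f"/select/{account_id}/prometheus/api/v1/query"
    return f"{base_url.rstrip('/')}{path}?{urlencode({'query': promql})}"


# Hypothetical internal address of the load balancer in front of vmselect.
url = vmselect_query_url("http://vmselect.example.internal:8481", "up == 0")
print(url)
```

In Grafana, the equivalent step would be to add a data source of type Prometheus whose URL ends in `/select/0/prometheus`, again assuming the default tenant.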





***





, , . , vmstorage , Amazon S3.





, . , .





Links:





  • Prometheus — https://prometheus.io/





  • Alertmanager — https://github.com/prometheus/alertmanager





  • Pushgateway — https://github.com/prometheus/pushgateway





  • Blackbox exporter — https://github.com/prometheus/blackbox_exporter





  • https://prometheus.io/docs/instrumenting/exporters/





  • Grafana — https://grafana.com/





  • VictoriaMetrics — https://victoriametrics.com/




















