Monitoring with Prometheus and Grafana: A Complete Guide

In today’s world of complex, dynamic, and distributed systems, monitoring is no longer a nice-to-have. It is a mission-critical function for maintaining the reliability, availability, and performance of modern applications. Two of the most widely used monitoring tools in the DevOps and cloud-native ecosystem are Prometheus and Grafana.

In this blog, we’ll walk you through everything you need to know about setting up and using Prometheus and Grafana for robust monitoring. Whether you are a beginner or looking to enhance your existing stack, this guide is for you.

Table of Contents

  1. Why Monitoring Matters
  2. What is Prometheus?
  3. What is Grafana?
  4. How Prometheus and Grafana Work Together
  5. Setting Up Prometheus
  6. Setting Up Grafana
  7. Connecting Prometheus to Grafana
  8. Creating Dashboards in Grafana
  9. Alerts and Notifications
  10. Use Cases and Best Practices
  11. Conclusion

Why Monitoring Matters

Modern applications are often deployed across microservices, containers, and multiple cloud regions. Without a proper monitoring system:

  • You can’t detect issues before they become outages.
  • You can’t understand performance bottlenecks.
  • You can’t confidently scale or release new features.

Monitoring is a core part of observability: it gives you insight into the health, performance, and behavior of your applications and infrastructure.

What is Prometheus?

Prometheus is an open-source metrics-based monitoring and alerting toolkit, originally developed at SoundCloud and now part of the Cloud Native Computing Foundation (CNCF).

Key Features:

  • Pull-based model over HTTP
  • Powerful query language (PromQL)
  • Time-series data collection
  • Multi-dimensional data model
  • Service discovery (Kubernetes, Consul, etc.)
  • Alerting rules, with notifications handled by Alertmanager (a separate companion component)

Prometheus is ideal for capturing metrics like CPU usage, memory, latency, and request rates from your services.
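To make the multi-dimensional data model concrete, here is a minimal sketch of how one metric name fans out into several time series, one per label combination, in the text format that Prometheus scrapes. The metric name, labels, and values are made up for illustration, and the helper functions are hypothetical rather than part of any Prometheus library.

```python
from typing import Dict, List, Tuple


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(name: str, labels: Dict[str, str], value: float) -> str:
    """Render one sample as: name{label="value",...} value"""
    if not labels:
        return f"{name} {value}"
    pairs = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    return f"{name}{{{pairs}}} {value}"


# Hypothetical request counters: one metric name, three label combinations,
# which Prometheus would store as three separate time series.
samples: List[Tuple[Dict[str, str], float]] = [
    ({"method": "GET", "status": "200"}, 1027),
    ({"method": "GET", "status": "500"}, 3),
    ({"method": "POST", "status": "200"}, 214),
]

lines = ["# TYPE http_requests_total counter"]
lines += [format_sample("http_requests_total", labels, value) for labels, value in samples]
exposition = "\n".join(lines) + "\n"
print(exposition)
```

In PromQL you can then filter or aggregate on any of those labels, which is what makes the model multi-dimensional.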

What is Grafana?

Grafana is an open-source analytics and interactive visualization tool. It helps transform raw data into insightful, beautiful dashboards.

Key Features:

  • Connects with many data sources (Prometheus, InfluxDB, Elasticsearch, etc.)
  • Highly customizable dashboards
  • Rich plugin ecosystem
  • Alerting and annotations
  • Role-based access control (RBAC)

While Prometheus collects the data, Grafana helps visualize it.

How Prometheus and Grafana Work Together

Here’s a simple architecture:

Application/Service → Exporters → Prometheus → Grafana
  • Exporters expose metrics via HTTP.
  • Prometheus scrapes those metrics at intervals and stores them.
  • Grafana queries Prometheus and visualizes the metrics.

Together, they form a complete monitoring pipeline.
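The first hop of that pipeline, an endpoint that exposes metrics over HTTP, can be sketched with nothing but the Python standard library. This is an illustration under assumptions, not a production exporter: the metric name demo_uptime_seconds and port 8000 are made up, and a real service would more likely use an official client library.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def render_metrics() -> str:
    """Build the /metrics response body in the Prometheus text format."""
    uptime = time.time() - START_TIME
    return (
        "# HELP demo_uptime_seconds Seconds since this process started.\n"
        "# TYPE demo_uptime_seconds gauge\n"
        f"demo_uptime_seconds {uptime:.3f}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """Answer GET /metrics with the current metrics, anything else with 404."""

    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, serving /metrics on the given (hypothetical) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()


# To try it, call serve() and add 'localhost:8000' as a scrape target.
```

Prometheus would request this endpoint at every scrape interval and store the returned value as a new sample.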

Setting Up Prometheus

Step 1: Install Prometheus

You can download it from the official site or run via Docker:

docker run -d --name=prometheus -p 9090:9090 \
-v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml \
prom/prometheus

Step 2: Configure Prometheus

Here’s a basic prometheus.yml:

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'node'
    static_configs:
      - targets: ['localhost:9100']

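This configuration tells Prometheus to scrape metrics every 15 seconds from the node_exporter running on localhost:9100. Note that if Prometheus itself runs in a container, as in the Docker command above, localhost refers to that container, so the target must be an address where node_exporter is reachable from inside it.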
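Once Prometheus is running, one way to confirm that the configured target is actually being scraped is its HTTP API endpoint /api/v1/targets. The sketch below assumes Prometheus is reachable at http://localhost:9090; the sample payload is hand-written and trimmed for illustration, not captured output, and the helper names are hypothetical.

```python
import json
from typing import Dict, List
from urllib.request import urlopen


def fetch_targets(base_url: str = "http://localhost:9090") -> dict:
    """Call GET /api/v1/targets and return the decoded JSON response."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=5) as response:
        return json.load(response)


def summarize_targets(payload: dict) -> List[Dict[str, str]]:
    """Reduce the API response to job, scrape URL, and health per target."""
    return [
        {
            "job": target["labels"].get("job", ""),
            "url": target["scrapeUrl"],
            "health": target["health"],
        }
        for target in payload["data"]["activeTargets"]
    ]


# Hand-written, trimmed-down response used so the sketch runs offline.
sample_payload = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "labels": {"job": "node", "instance": "localhost:9100"},
                "scrapeUrl": "http://localhost:9100/metrics",
                "health": "up",
            }
        ]
    },
}

for row in summarize_targets(sample_payload):
    print(f"{row['job']:<10} {row['url']:<35} {row['health']}")
```

Against a live server you would pass fetch_targets() to summarize_targets() instead of the sample payload. A health value other than "up" usually points to a wrong address or an exporter that is not running.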

Step 3: Exporters

Exporters expose metrics for systems that do not provide a Prometheus endpoint themselves. Examples:

  • Node Exporter for Linux server metrics
  • cAdvisor for container metrics
  • Blackbox Exporter for uptime and HTTP checks

Setting Up Grafana

Step 1: Install Grafana

You can install Grafana using Docker:

docker run -d -p 3000:3000 --name=grafana grafana/grafana

Access Grafana at http://localhost:3000 (default credentials: admin/admin).

Step 2: Add Prometheus as a Data Source

  • Go to Configuration → Data Sources → Add Data Source (in recent Grafana versions this lives under Connections → Data sources)
  • Choose Prometheus
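  • Set the URL to http://localhost:9090 (if Grafana runs in a container, localhost refers to that container, so use an address that reaches Prometheus from inside it, such as the Prometheus container name on a shared Docker network)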
  • Save & Test
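The same data source can also be registered without clicking through the UI, using Grafana's HTTP API (POST /api/datasources). The following is a rough sketch that assumes the default admin/admin credentials and the local URLs used in this guide; the helper names are hypothetical, and the actual send is left commented out so the script has no side effects.

```python
import base64
import json
from urllib.request import Request, urlopen


def build_datasource_payload(prometheus_url: str, name: str = "Prometheus") -> dict:
    """JSON body describing a Prometheus data source proxied through Grafana."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",
        "isDefault": True,
    }


def build_request(grafana_url: str, user: str, password: str, payload: dict) -> Request:
    """Prepare an authenticated POST to Grafana's data source endpoint."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


payload = build_datasource_payload("http://localhost:9090")
request = build_request("http://localhost:3000", "admin", "admin", payload)
print(request.get_method(), request.full_url)

# Uncomment to send the request to a running Grafana instance:
# with urlopen(request, timeout=5) as response:
#     print(response.status, response.read().decode())
```

Scripting this step is mainly useful when you recreate environments often and want the data source to be reproducible.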

Connecting Prometheus to Grafana

Once Prometheus is added as a data source:

  • Create a new dashboard
  • Add a new panel
  • Use PromQL to query Prometheus data.
    Example: rate(http_requests_total[1m])
  • Customize visualizations (graph, gauge, bar chart, etc.)
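To build some intuition for what rate(http_requests_total[1m]) returns, here is a simplified sketch of the idea: take the counter samples inside the window, add up the increases (treating any drop as a counter reset), and divide by the elapsed seconds. This is an approximation for illustration only. Prometheus's real rate() also extrapolates toward the window boundaries, so its results can differ slightly. The sample values are made up.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Per-second increase across the samples, handling counter resets."""
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        # A counter only goes up. A lower value means the process restarted,
        # so the whole current value counts as new increase.
        increase += current - previous if current >= previous else current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Hypothetical 1-minute window scraped every 15 seconds, with one restart.
window = [(0, 100), (15, 130), (30, 160), (45, 10), (60, 40)]
print(f"{simple_rate(window):.2f} requests per second")
```

This is also why rate() should be applied to counters such as http_requests_total rather than to gauges: the reset handling only makes sense for values that are expected to increase.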

Creating Dashboards in Grafana

Grafana allows you to:

  • Use pre-built dashboards from the community (via Grafana.com)
  • Create custom dashboards using Prometheus queries
  • Set variables for dynamic filters (like host, job, region)
  • Use templating for multi-use dashboards

Example panel metrics:

Metric                              Description
node_cpu_seconds_total              CPU usage
node_memory_MemAvailable_bytes      Memory available
node_network_receive_bytes_total    Network I/O
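Dashboards are stored as JSON, so they can be generated and version-controlled like any other artifact. The sketch below builds a small dashboard with an instance variable and two panels based on the metrics above. Treat the field names as a simplified subset of the dashboard JSON model: the exact schema varies between Grafana versions, and the dashboard title and the timeseries_panel helper are made up for this example.

```python
import json


def timeseries_panel(panel_id: int, title: str, expr: str, x: int) -> dict:
    """One time-series panel with a single PromQL query."""
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        "gridPos": {"h": 8, "w": 12, "x": x, "y": 0},
        "targets": [{"refId": "A", "expr": expr}],
    }


dashboard = {
    "title": "Node overview (sketch)",
    # A template variable lets one dashboard serve many hosts.
    "templating": {
        "list": [
            {
                "name": "instance",
                "type": "query",
                "query": "label_values(node_cpu_seconds_total, instance)",
            }
        ]
    },
    "panels": [
        timeseries_panel(
            1,
            "CPU busy fraction",
            '1 - avg(rate(node_cpu_seconds_total{mode="idle", instance="$instance"}[5m]))',
            x=0,
        ),
        timeseries_panel(
            2,
            "Memory available",
            'node_memory_MemAvailable_bytes{instance="$instance"}',
            x=12,
        ),
    ],
}

# Wrapper shape used when saving a dashboard through Grafana's HTTP API.
request_body = {"dashboard": dashboard, "overwrite": False}
print(json.dumps(request_body, indent=2))
```

The $instance placeholder is replaced by Grafana with the value selected in the variable dropdown, which is what makes the dashboard reusable across hosts.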

Alerts and Notifications

Both Prometheus and Grafana support alerting.

Prometheus Alertmanager:

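Define alerting rules in a separate rules file and list that file under rule_files in prometheus.yml:

groups:
  - name: example
    rules:
      - alert: HighCPU
        # Fraction of CPU time not spent idle, averaged over all cores.
        expr: 1 - avg(rate(node_cpu_seconds_total{mode="idle"}[1m])) > 0.9
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected"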
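The for: 1m clause is easy to misread, so here is a small simulation of how it behaves: the alert becomes pending the first time the expression is true and only starts firing once it has stayed true for the full duration. This is a simplified model of the rule state machine written for illustration. The function name and the evaluation timeline are hypothetical.

```python
from typing import List, Optional, Tuple


def alert_states(evaluations: List[Tuple[float, bool]], for_seconds: float) -> List[str]:
    """Return inactive, pending, or firing for each rule evaluation.

    evaluations holds (timestamp in seconds, whether the expression was true).
    """
    states: List[str] = []
    active_since: Optional[float] = None
    for timestamp, condition_true in evaluations:
        if not condition_true:
            # Any false evaluation resets the timer.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held = timestamp - active_since
        states.append("firing" if held >= for_seconds else "pending")
    return states


# Hypothetical timeline: evaluated every 15 seconds, with a brief dip at t=45.
timeline = [
    (0, False), (15, True), (30, True), (45, False),
    (60, True), (75, True), (90, True), (105, True), (120, True),
]
for (timestamp, _), state in zip(timeline, alert_states(timeline, for_seconds=60)):
    print(f"t={timestamp:>3}s  {state}")
```

The dip at t=45 resets the timer, which is the point of the clause: short spikes do not page anyone.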

Configure Alertmanager to send notifications to:

  • Email
  • Slack
  • PagerDuty
  • Webhooks
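For the webhook option, Alertmanager sends an HTTP POST with a JSON body describing the grouped alerts. The sketch below turns such a body into short, human-readable messages. The sample payload is hand-written and trimmed to the fields used here, and format_alerts is a hypothetical helper that a receiver of your own might call.

```python
import json
from typing import List


def format_alerts(payload: dict) -> List[str]:
    """Build one line per alert from an Alertmanager webhook payload."""
    messages = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        messages.append(
            "[{status}] {name} ({severity}): {summary}".format(
                status=alert.get("status", "unknown").upper(),
                name=labels.get("alertname", "unnamed"),
                severity=labels.get("severity", "none"),
                summary=annotations.get("summary", ""),
            )
        )
    return messages


# Hand-written, trimmed-down request body for illustration (not real output).
sample_body = json.dumps(
    {
        "status": "firing",
        "alerts": [
            {
                "status": "firing",
                "labels": {"alertname": "HighCPU", "severity": "warning"},
                "annotations": {"summary": "High CPU usage detected"},
            }
        ],
    }
)

for message in format_alerts(json.loads(sample_body)):
    print(message)
```

A real receiver would wrap this in an HTTP handler and forward the messages to a chat tool or ticketing system.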

Grafana Alerts:

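Grafana also has its own alerting. Older versions attach alerts to individual panels. Recent versions manage alert rules centrally, and a rule can still be created from a panel.

  • Create an alert rule on any panel
  • Choose alert conditions (e.g., value > threshold)
  • Set the evaluation frequency and notification channels (called contact points in recent versions)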
  • Integrate with services like Slack, Microsoft Teams, or email

Use Cases and Best Practices

Common Use Cases:

  • Infrastructure Monitoring (CPU, memory, disk, network)
  • Application Monitoring (errors, latency, requests)
  • Kubernetes Monitoring (pods, nodes, containers)
  • SLA & SLO tracking
  • Incident response & RCA

Best Practices:

  • Use labels effectively in Prometheus (e.g., instance, job)
  • Don’t overload Prometheus with too many high-cardinality metrics
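  • Retain only useful data: set a retention period that fits your needs (the --storage.tsdb.retention.time flag), and use recording rules to precompute expensive queries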
  • Use Grafana folders and sharing permissions wisely
  • Always set up alerts for critical metrics
  • Back up Grafana dashboards and Prometheus data regularly
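The warning about high-cardinality metrics is easier to appreciate with a quick calculation. Each distinct combination of label values is its own time series, so the upper bound on series for one metric is the product of the number of values per label. The label names and counts below are made up, and max_series is a hypothetical helper.

```python
from math import prod
from typing import Dict


def max_series(label_value_counts: Dict[str, int]) -> int:
    """Upper bound on time series for one metric: product of values per label."""
    return prod(label_value_counts.values())


# Hypothetical label sets for a request counter.
bounded = {"method": 5, "status": 10, "instance": 20}
unbounded = {**bounded, "user_id": 50_000}

print("bounded labels:  ", max_series(bounded))
print("with user_id:    ", max_series(unbounded))
```

Adding a single unbounded label such as a user ID multiplies the series count by the number of users, which is why identifiers like that generally belong in logs rather than in metric labels.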

Conclusion

Prometheus and Grafana are a common foundation of modern observability stacks. Combined, they provide visibility from raw metrics to dashboards, support proactive alerting, and let you act before problems turn into outages.

By using them effectively, you empower your DevOps and SRE teams to maintain uptime, ensure performance, and deliver reliable software to users.

So if you’re building or operating in the cloud-native world, it’s time to harness the power of Prometheus and Grafana. Monitor smart. Act faster. Stay ahead.

Next Steps

  • Try monitoring your local system with Node Exporter.
  • Deploy Prometheus & Grafana in your Kubernetes cluster.
  • Explore custom exporters for your app metrics.
  • Set up alert rules and test notification systems.
  • Experiment with the wider Grafana ecosystem, such as Loki for logs and Tempo for traces, and with alerting as code.