Observability Stack on K3s: Grafana + Prometheus + Loki + Alloy

Cluster Provisioning and Alert Manager through Grafana
Your cluster is running, your pods are spinning, and you have absolutely no idea what’s happening inside. Welcome to the blind spot of self-hosting. The Grafana-Prometheus-Alloy-Loki stack is your escape from this darkness—a unified observability platform that collects metrics, aggregates logs, and screams at you before your users do.
Prometheus scrapes and stores time-series metrics from your nodes, pods, and applications. Loki does the same for logs, but without indexing the content (making it lightweight and cheap). Alloy acts as the collection agent, shipping logs from your pods to Loki. Grafana ties everything together with dashboards, queries, and alerting rules.
Alloy is Grafana's unified telemetry collector—the successor to Promtail, Grafana Agent, and a dozen other single-purpose collectors. Instead of running separate agents for logs, metrics, and traces, Alloy handles all three through a single binary with a declarative configuration language. It replaced Promtail in early 2024, so if you're following older tutorials that mention Promtail, Alloy is the modern equivalent with broader capabilities.
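To make "declarative configuration" concrete, here is a minimal sketch of an Alloy log pipeline as it could be embedded in the grafana/alloy chart's values. The values path, the Loki URL, and the label mapping are assumptions for illustration, not the exact configuration this guide ships:

alloy:
  configMap:
    content: |
      // Discover pods via the Kubernetes API (a real deployment would
      // typically filter to the local node to avoid duplicate tailing).
      discovery.kubernetes "pods" {
        role = "pod"
      }

      // Map Kubernetes metadata onto Loki labels (namespace, pod).
      discovery.relabel "pods" {
        targets = discovery.kubernetes.pods.targets
        rule {
          source_labels = ["__meta_kubernetes_namespace"]
          target_label  = "namespace"
        }
        rule {
          source_labels = ["__meta_kubernetes_pod_name"]
          target_label  = "pod"
        }
      }

      // Tail the discovered pods' logs and forward them onward.
      loki.source.kubernetes "pods" {
        targets    = discovery.relabel.pods.output
        forward_to = [loki.write.default.receiver]
      }

      // Push everything to the Loki service inside the cluster
      // (the URL is an assumption; it depends on your Loki release).
      loki.write "default" {
        endpoint {
          url = "http://loki-gateway.monitoring.svc/loki/api/v1/push"
        }
      }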
Why not just use a managed observability service? You absolutely could. Datadog, New Relic, Grafana Cloud: they all work beautifully. They also charge per host, per metric, per GB ingested, per alert, and occasionally per dream you had about Kubernetes. A modest 3-node cluster with reasonable log volume can easily cost $200-500/month in managed observability. This stack? Zero. The only cost is your time and the compute resources you're already paying for.
This stack takes 30 minutes to deploy and a lifetime to maintain. ZipOps clusters ship with Prometheus, Loki, and Grafana pre-configured—same stack, zero YAML. If observability is a means to an end, not the end itself, see what we're building.
In production Kubernetes environments, this stack handles millions of metrics and gigabytes of logs daily. It’s what powers observability at companies that decided their cloud bills were getting ridiculous.
Init
What you need
Before diving in, ensure you have:
A running k3s cluster from the Hetzner Terraform setup
kubectl configured against your cluster with a proper kubeconfig file (a quick check is shown below)
[optional] a domain for access without port-forward (strongly recommended)
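A quick way to confirm kubectl is talking to the intended cluster before you start:

kubectl config current-context
kubectl get nodes -o wide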
How the Stack Fits Together
The observability pipeline flows in two parallel streams:
Metrics path: Prometheus scrapes /metrics endpoints from your pods and nodes every 15-30 seconds, stores time-series data locally, and exposes it to Grafana for dashboards and alerting rules.
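If you later want your own application scraped as well, the usual mechanism (assuming the Prometheus here is operator-based, as in kube-prometheus-stack) is a ServiceMonitor. The names and labels below are hypothetical placeholders:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: monitoring
  labels:
    release: prometheus        # must match whatever selector your Prometheus is configured with
spec:
  selector:
    matchLabels:
      app: my-app              # labels on your application's Service
  namespaceSelector:
    matchNames:
      - default
  endpoints:
    - port: metrics            # named port on the Service that exposes /metrics
      path: /metrics
      interval: 30s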
Logs path: Alloy runs as a DaemonSet on every node, tails container logs from /var/log/pods, enriches them with Kubernetes metadata (namespace, pod name, labels), and ships them to Loki. Loki indexes only metadata—not log content—keeping storage costs low.
Grafana sits at the query layer, pulling from both Prometheus and Loki to correlate metrics spikes with log entries. When CPU hits 90%, you can jump directly to logs from that pod at that timestamp.
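Under the hood that only works because Grafana knows about both backends as data sources. The charts normally wire this up through their values, but the provisioned result is roughly this sketch (service names and ports are assumptions that depend on how the releases are configured):

apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus-server.monitoring.svc
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loki-gateway.monitoring.svc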
Apply
We’re deploying four Kubernetes resources: a namespace for isolation, then three HelmChart resources that leverage k3s’s built-in Helm controller. This approach means no local Helm CLI required—just apply the manifests and let k3s handle the rest.
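As a preview of the pattern, a HelmChart resource looks like the sketch below. The chart and values shown are placeholders, not the exact manifests used in this guide; k3s's Helm controller picks the resource up and performs the equivalent of a helm install for you:

apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: kube-prometheus-stack
  namespace: kube-system        # HelmChart resources are conventionally placed here for the k3s controller
spec:
  repo: https://prometheus-community.github.io/helm-charts
  chart: kube-prometheus-stack
  targetNamespace: monitoring   # the actual release is installed into the monitoring namespace
  valuesContent: |-
    # chart values go here exactly as they would in a values.yaml
    grafana:
      adminPassword: change-me  # placeholder for illustration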
Namespace
Every good monitoring stack deserves its own room. This namespace isolates all observability components from your application workloads.
namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
  labels:
    name: monitoring
Nothing to edit here—monitoring is the conventional namespace name that other tools expect. Apply it:
kubectl apply -f namespace.yaml
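A quick check that it landed:

kubectl get namespace monitoring

You should see the namespace listed with status Active.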