
Recently, I was tasked with implementing distributed tracing for a microservices platform running on Azure Kubernetes Service (AKS). The requirements were clear: use Grafana Tempo for trace storage, Azure Blob Storage as the backend for cost-effectiveness, and ensure all connectivity remained private for security compliance. What started as a straightforward Helm deployment turned into a deep dive into Azure Private Link Service, Managed Private Endpoints, and workload identity federation.

The Challenge: Private Connectivity for Observability

Most Tempo tutorials show you how to deploy it with local storage or public endpoints. But in enterprise environments, you often need:

  • Private connectivity between Grafana and Tempo (no internet traffic)
  • Azure Blob Storage backend for scalable, cost-effective trace storage
  • Workload Identity for secure authentication without storing secrets

The tricky part? Making Azure Managed Grafana communicate privately with Tempo running inside AKS, while Tempo itself authenticates to Azure Storage using managed identities.

Architecture Overview

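Here’s what we’re building: Azure Managed Grafana reaches Tempo through a Managed Private Endpoint, which connects to a Private Link Service sitting in front of an internal load balancer in AKS. Tempo, in turn, writes traces to Azure Blob Storage and authenticates with a managed identity through workload identity federation.

The key insight: Azure Managed Grafana can create Managed Private Endpoints that connect to Private Link Services, and AKS can automatically create a Private Link Service for an internal LoadBalancer service.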

Setting Up Azure Infrastructure

Storage Account and Managed Identity

First, we need a storage account and managed identity for Tempo:

Workload Identity Federation

The managed identity needs access to the storage account and federation with our Kubernetes service account:
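
One way to wire this up, sketched with the Azure CLI: grant the identity a blob data role on the storage account, then federate it with the AKS OIDC issuer. The resource names and the choice of the Storage Blob Data Contributor role are assumptions for illustration; the subject string assumes Tempo runs as the `tempo` service account in the `tempo` namespace.

```shell
# Hypothetical names; override them through the environment.
RESOURCE_GROUP="${RESOURCE_GROUP:-rg-observability}"
AKS_NAME="${AKS_NAME:-aks-platform}"
STORAGE_ACCOUNT="${STORAGE_ACCOUNT:-sttempotraces001}"
IDENTITY_NAME="${IDENTITY_NAME:-id-tempo}"

configure_tempo_identity() {
  # Look up the IDs needed for the role assignment and the federation.
  principal_id=$(az identity show --name "$IDENTITY_NAME" \
    --resource-group "$RESOURCE_GROUP" --query principalId --output tsv) || return 1
  storage_id=$(az storage account show --name "$STORAGE_ACCOUNT" \
    --resource-group "$RESOURCE_GROUP" --query id --output tsv) || return 1
  oidc_issuer=$(az aks show --name "$AKS_NAME" \
    --resource-group "$RESOURCE_GROUP" --query oidcIssuerProfile.issuerUrl --output tsv) || return 1

  # Let the identity read and write blobs in the storage account.
  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Storage Blob Data Contributor" \
    --scope "$storage_id" || return 1

  # Trust tokens issued by the cluster for the tempo/tempo service account.
  az identity federated-credential create \
    --name tempo-federated-credential \
    --identity-name "$IDENTITY_NAME" \
    --resource-group "$RESOURCE_GROUP" \
    --issuer "$oidc_issuer" \
    --subject "system:serviceaccount:tempo:tempo" \
    --audiences api://AzureADTokenExchange
}
```

This assumes the cluster already has the OIDC issuer and workload identity features enabled; otherwise the issuer URL query returns nothing.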

Exposing Tempo Through a Private Link Service

Here’s where it gets interesting. We need to annotate Tempo’s Kubernetes service so that AKS automatically creates a Private Link Service for it:

The key annotations here are the azure-pls-* ones. These tell AKS to automatically create a Private Link Service when the LoadBalancer service is provisioned.
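
To make the annotations concrete, here is a minimal sketch that writes a Helm values fragment for the service. It assumes the single-binary `grafana/tempo` chart, where the service is configured under a top-level `service` key; the PLS name `tempo-pls`, the file name, and the `/ready` probe path are my assumptions.

```shell
PLS_NAME="${PLS_NAME:-tempo-pls}"   # hypothetical Private Link Service name

# Write the service section of the values file to the path given as $1.
write_tempo_service_values() {
  cat > "$1" <<EOF
service:
  type: LoadBalancer
  annotations:
    # Keep the load balancer on the virtual network (no public IP).
    service.beta.kubernetes.io/azure-load-balancer-internal: "true"
    # Probe Tempo's readiness endpoint instead of the default path.
    service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: "/ready"
    # Ask the Azure cloud provider to create a Private Link Service.
    service.beta.kubernetes.io/azure-pls-create: "true"
    service.beta.kubernetes.io/azure-pls-name: "${PLS_NAME}"
EOF
}
```

Calling `write_tempo_service_values tempo-values.yaml` produces a file you can pass to Helm. Depending on your subscription setup, you may also need the `azure-pls-visibility` annotation so that the Grafana-managed endpoint is allowed to request a connection.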

Dynamic Helm Deployment

Instead of hardcoding values, I used Helm’s --set parameters to inject environment-specific configuration:

This approach keeps the values file clean and environment-agnostic while injecting the right managed identity and storage configuration at deployment time.
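
A deployment along these lines might look like the sketch below. The chart (`grafana/tempo`), the value paths under `serviceAccount`, `podLabels`, and `tempo.storage.trace`, and all resource names are assumptions about the setup; check them against the chart version you actually use.

```shell
# Hypothetical names; override them through the environment.
RESOURCE_GROUP="${RESOURCE_GROUP:-rg-observability}"
STORAGE_ACCOUNT="${STORAGE_ACCOUNT:-sttempotraces001}"
CONTAINER_NAME="${CONTAINER_NAME:-tempo-traces}"
IDENTITY_NAME="${IDENTITY_NAME:-id-tempo}"
VALUES_FILE="${VALUES_FILE:-tempo-values.yaml}"

deploy_tempo() {
  # Client ID of the managed identity, resolved at deployment time.
  client_id=$(az identity show --name "$IDENTITY_NAME" \
    --resource-group "$RESOURCE_GROUP" --query clientId --output tsv) || return 1

  helm repo add grafana https://grafana.github.io/helm-charts || return 1
  helm repo update || return 1

  # Dots inside annotation and label keys are escaped for Helm's --set parser.
  helm upgrade --install tempo grafana/tempo \
    --namespace tempo --create-namespace \
    --values "$VALUES_FILE" \
    --set-string "serviceAccount.annotations.azure\.workload\.identity/client-id=$client_id" \
    --set-string "podLabels.azure\.workload\.identity/use=true" \
    --set tempo.storage.trace.backend=azure \
    --set "tempo.storage.trace.azure.storage_account_name=$STORAGE_ACCOUNT" \
    --set "tempo.storage.trace.azure.container_name=$CONTAINER_NAME" \
    --set tempo.storage.trace.azure.use_federated_token=true
}
```

With the release named `tempo` in the `tempo` namespace, the chart's service account should line up with the `system:serviceaccount:tempo:tempo` subject used for the federated credential.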

Connecting Grafana via Managed Private Endpoint

Once Tempo is running, we need to connect Azure Managed Grafana to it privately. First, we discover the Private Link Service that AKS created:

When I first tried this, I was surprised to find that the PLS wasn’t immediately available. It takes a few minutes for AKS to provision the Private Link Service after the LoadBalancer service is created.
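
Because of that delay, a small polling loop is more reliable than a single lookup. The sketch below assumes the PLS is named `tempo-pls` (matching the service annotation) and lives in the cluster's node resource group, which is where AKS places it by default; the names, retry count, and sleep interval are placeholders.

```shell
# Hypothetical names and timings; override them through the environment.
RESOURCE_GROUP="${RESOURCE_GROUP:-rg-observability}"
AKS_NAME="${AKS_NAME:-aks-platform}"
PLS_NAME="${PLS_NAME:-tempo-pls}"
MAX_ATTEMPTS="${MAX_ATTEMPTS:-20}"
SLEEP_SECONDS="${SLEEP_SECONDS:-15}"

# Prints the PLS resource ID once it exists; fails after MAX_ATTEMPTS tries.
wait_for_pls() {
  node_rg=$(az aks show --name "$AKS_NAME" --resource-group "$RESOURCE_GROUP" \
    --query nodeResourceGroup --output tsv) || return 1

  attempt=1
  while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
    # The lookup fails until the PLS has been provisioned, so errors are ignored.
    pls_id=$(az network private-link-service show --name "$PLS_NAME" \
      --resource-group "$node_rg" --query id --output tsv 2>/dev/null)
    if [ -n "$pls_id" ]; then
      printf '%s\n' "$pls_id"
      return 0
    fi
    sleep "$SLEEP_SECONDS"
    attempt=$((attempt + 1))
  done

  echo "Private Link Service $PLS_NAME not found after $MAX_ATTEMPTS attempts" >&2
  return 1
}
```

A typical call is `PLS_ID=$(wait_for_pls)`, which leaves the resource ID in a variable for the next step.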

Creating the Managed Private Endpoint

With the PLS resource ID, we can create a Managed Private Endpoint in Grafana:
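
A minimal sketch of that call, assuming the `amg` Azure CLI extension is installed. The Grafana workspace name, its resource group, the endpoint name, and the region are placeholders, and the flags are the ones I would expect from `az grafana mpe create --help`; verify them against your extension version.

```shell
# Hypothetical names; override them through the environment.
GRAFANA_RG="${GRAFANA_RG:-rg-observability}"
GRAFANA_NAME="${GRAFANA_NAME:-grafana-platform}"
MPE_NAME="${MPE_NAME:-tempo-mpe}"
LOCATION="${LOCATION:-uksouth}"

# $1 is the resource ID of the Private Link Service that AKS created.
create_tempo_mpe() {
  pls_id="$1"
  az grafana mpe create \
    --resource-group "$GRAFANA_RG" \
    --workspace-name "$GRAFANA_NAME" \
    --name "$MPE_NAME" \
    --private-link-resource-id "$pls_id" \
    --private-link-resource-region "$LOCATION"
}
```

For example, `create_tempo_mpe "$PLS_ID"` with the ID discovered in the previous step.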

Approving the Connection

The MPE creation triggers a pending connection on the Private Link Service that needs approval:
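
The approval can be scripted roughly as follows: find the first connection still in the Pending state on the PLS, then approve it. The PLS name and the description text are assumptions, and the function expects the node resource group as its argument.

```shell
PLS_NAME="${PLS_NAME:-tempo-pls}"   # hypothetical Private Link Service name

# $1 is the resource group that contains the Private Link Service.
approve_pending_connection() {
  node_rg="$1"

  # Name of the first private endpoint connection that is still pending.
  conn_name=$(az network private-link-service show \
    --name "$PLS_NAME" --resource-group "$node_rg" \
    --query "privateEndpointConnections[?privateLinkServiceConnectionState.status=='Pending'] | [0].name" \
    --output tsv) || return 1

  if [ -z "$conn_name" ]; then
    echo "No pending connection found on $PLS_NAME" >&2
    return 1
  fi

  az network private-link-service connection update \
    --service-name "$PLS_NAME" --resource-group "$node_rg" \
    --name "$conn_name" --connection-status Approved \
    --description "Approved for Azure Managed Grafana"
}
```

If more than one connection is pending, this approves only the first one, so check the list manually in shared environments.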

Creating the Data Source

After approval, refresh the MPE state and get the private IP:
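
A sketch of those two steps, again assuming the `amg` CLI extension. The workspace and endpoint names are placeholders, and port 3200 is an assumption about where Tempo's HTTP API listens in your deployment. The private IP is passed in as an argument rather than parsed, since the exact output field name may differ between extension versions.

```shell
# Hypothetical names; override them through the environment.
GRAFANA_RG="${GRAFANA_RG:-rg-observability}"
GRAFANA_NAME="${GRAFANA_NAME:-grafana-platform}"
MPE_NAME="${MPE_NAME:-tempo-mpe}"
TEMPO_HTTP_PORT="${TEMPO_HTTP_PORT:-3200}"

# Sync the approved state into Grafana, then print the endpoint details.
refresh_and_show_mpe() {
  az grafana mpe refresh \
    --resource-group "$GRAFANA_RG" --workspace-name "$GRAFANA_NAME" || return 1
  az grafana mpe show \
    --resource-group "$GRAFANA_RG" --workspace-name "$GRAFANA_NAME" \
    --name "$MPE_NAME"
}

# $1 is the private IP reported for the managed private endpoint.
create_tempo_datasource() {
  private_ip="$1"
  definition=$(printf '{"name":"Tempo","type":"tempo","access":"proxy","url":"http://%s:%s"}' \
    "$private_ip" "$TEMPO_HTTP_PORT")
  # For data-source commands, --name refers to the Grafana workspace.
  az grafana data-source create \
    --name "$GRAFANA_NAME" --resource-group "$GRAFANA_RG" \
    --definition "$definition"
}
```

Run `refresh_and_show_mpe`, read the private IP from its output, and pass it to `create_tempo_datasource`.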

Lessons Learned and Gotchas

1. Private Link Service Provisioning Takes Time

Don’t expect the PLS to be available immediately after deploying the Tempo service. I found it typically takes 2-5 minutes for AKS to provision it, so poll for it rather than querying once.

2. Service Health Probes Matter

The azure-load-balancer-health-probe-request-path annotation is crucial. Without it, the load balancer health checks fail and the PLS doesn’t work properly.

3. Workload Identity Setup Order

Create the federated identity credential before deploying Tempo. The federation is set up using the expected service account path (system:serviceaccount:tempo:tempo) – the actual Kubernetes service account gets created later during Helm deployment.

4. MPE Refresh is Required

After approving the private endpoint connection, you must run az grafana mpe refresh for Grafana to pick up the approved state and assign the private IP.

5. Values File Cleanliness

Using --set parameters instead of environment-specific values files makes the solution much more maintainable. The values file becomes a template, and the dynamic parts are injected at deployment time.

OpenTelemetry Collector Integration

To complete the setup, configure your OpenTelemetry Collector to send traces to Tempo:

One gotcha I encountered: make sure to set internalTrafficPolicy: "Cluster" in your OpenTelemetry Collector service configuration. Without this, newer versions of the OpenTelemetry Helm chart fail with a template error.
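
Putting both points together, a values fragment for the `opentelemetry-collector` Helm chart could be generated like this. The Tempo endpoint assumes a service named `tempo` in the `tempo` namespace accepting OTLP over gRPC on port 4317 without TLS; the deployment mode and image are also assumptions.

```shell
# Hypothetical in-cluster endpoint; override it through the environment.
TEMPO_OTLP_ENDPOINT="${TEMPO_OTLP_ENDPOINT:-tempo.tempo.svc.cluster.local:4317}"

# Write collector chart values to the path given as $1.
write_otel_values() {
  cat > "$1" <<EOF
mode: deployment
image:
  # Recent chart versions expect the image to be set explicitly.
  repository: otel/opentelemetry-collector-contrib
service:
  # Set explicitly to avoid the chart template error mentioned above.
  internalTrafficPolicy: Cluster
config:
  exporters:
    otlp:
      endpoint: ${TEMPO_OTLP_ENDPOINT}
      tls:
        # Plain gRPC inside the cluster; enable TLS if Tempo is configured for it.
        insecure: true
  service:
    pipelines:
      traces:
        exporters: [otlp]
EOF
}
```

After `write_otel_values otel-values.yaml`, pass the file to the collector's Helm release with `--values`. The chart's default receivers and processors stay in place; only the traces exporter list is overridden.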

Wrapping Up

This setup gives you a production-ready distributed tracing solution with:

  • Secure private connectivity between Grafana and Tempo
  • Scalable Azure Blob Storage backend for trace data
  • No secrets in Kubernetes thanks to workload identity
  • Environment-agnostic configuration via parameterized deployments

The combination of Azure Private Link Service, Managed Private Endpoints, and workload identity might seem complex initially, but it provides the security and scalability needed for enterprise observability platforms.

Next steps you might consider: setting up trace sampling policies, configuring retention policies for cost optimization, and adding alerting based on trace error rates. The Azure CLI scripts can easily be parameterized and automated as part of your infrastructure-as-code pipeline.

The full automation of this setup, from storage account creation to Grafana data source configuration, takes what used to be a multi-hour manual process and reduces it to a few minutes of script execution. That’s the kind of developer experience improvement that makes platform engineering worthwhile.


Pixel Robots.

I’m Richard Hooper, aka Pixel Robots. I started this blog in 2016 for a couple of reasons. The first was simply to have a place to store my step-by-step guides, troubleshooting guides, and ideas about being a sysadmin. The second was to share what I have learned with other people like me. Hopefully, you can find something useful on the site.
