
After writing about Ingress NGINX retiring in March 2026, I recommended a two-phase migration approach: start with the Application Routing add-on for quick stability, then migrate to Envoy Gateway as the long-term strategic choice. Several readers asked for more detail on the Envoy Gateway piece, specifically around installation and the different configuration options for AKS.

I spent the last few weeks deploying Envoy Gateway across different test clusters, trying out the various deployment patterns, and documenting the gotchas I hit. This is that guide.

Why Envoy Gateway?

When I evaluated options after the NGINX retirement announcement, Envoy Gateway stood out for a few reasons. It’s the CNCF’s reference implementation of Gateway API, which means it’s going to stay aligned with where Kubernetes ingress is headed. The architecture is different from traditional ingress controllers in a way that actually makes sense for AKS.

Instead of one big controller with a LoadBalancer service, Envoy Gateway splits things into a control plane and data plane. The control plane (what you install with Helm) just manages configuration. When you create a Gateway resource, it spins up its own data plane with Envoy proxy pods and a dedicated LoadBalancer. This makes it easier to isolate different teams or environments within the same cluster. Your development team can have their own Gateway with a private LoadBalancer while production uses a public one with different scaling characteristics.

Envoy itself is battle-tested. Companies like Lyft, Pinterest, and Microsoft run it at massive scale. The Gateway API layer on top gives you the Kubernetes-native interface without having to write Envoy config directly.

Installing the Control Plane

First, install Envoy Gateway using Helm. This only installs the control plane controller, not any load balancers yet (those come when you create Gateway resources).
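A sketch of what the install might look like using the published OCI chart — the release name and version pin here are illustrative, so substitute the latest release:

```shell
# Install the Envoy Gateway control plane from the OCI Helm registry.
# This creates only the controller deployment; no LoadBalancers yet.
helm install eg oci://docker.io/envoyproxy/gateway-helm \
  --version v1.3.0 \
  --namespace envoy-gateway-system \
  --create-namespace
```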

Wait for it to become available:
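Something like this works, assuming the chart's default deployment name of envoy-gateway:

```shell
# Block until the control plane deployment reports Available
kubectl wait --timeout=5m -n envoy-gateway-system \
  deployment/envoy-gateway --condition=Available
```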

I run two replicas and set a PodDisruptionBudget because I learned the hard way that having a single control plane pod during a node upgrade can cause delays in configuration updates. Not critical, but annoying when you’re trying to roll out changes.
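For reference, the PodDisruptionBudget might look like the sketch below. The selector label is an assumption — check the labels on your control plane pods — and many chart versions expose a deployment.replicas value you can set to 2 at install time; verify both against your installed chart:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: envoy-gateway
  namespace: envoy-gateway-system
spec:
  # Keep at least one control plane pod up during voluntary disruptions
  # (node drains, upgrades)
  minAvailable: 1
  selector:
    matchLabels:
      control-plane: envoy-gateway  # assumed label; confirm with kubectl get pods --show-labels
```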

A note on upgrades: 

Helm doesn’t update CRDs automatically. Before upgrading Envoy Gateway to a new version, you need to pull the new Helm chart and manually apply the CRD YAML files first. Otherwise, the new controller version might fail to reconcile your existing resources. Pull the chart with helm pull oci://docker.io/envoyproxy/gateway-helm --version <version> --untar, then apply the CRDs from the gateway-helm/crds/ directory before running helm upgrade.
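Putting those steps together, an upgrade run might look like this (keep the same `<version>` placeholder for whichever release you are targeting):

```shell
# 1. Pull and unpack the new chart version
helm pull oci://docker.io/envoyproxy/gateway-helm --version <version> --untar

# 2. Apply the new CRDs first, so the controller can reconcile them
#    (add -R if your chart version nests the CRD files in subdirectories)
kubectl apply --server-side -f gateway-helm/crds/

# 3. Then upgrade the control plane itself
helm upgrade eg oci://docker.io/envoyproxy/gateway-helm \
  --version <version> --namespace envoy-gateway-system
```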

Three Configuration Options

Here’s where it gets interesting. You can configure Envoy Gateway for different scenarios by creating GatewayClass resources. I typically use three different configurations depending on the workload.

Before diving into the examples, let me explain the three components you’ll see and how they work together:

EnvoyProxy – This is an Envoy Gateway-specific resource that customizes how the data plane gets deployed. Think of it as the configuration template that defines LoadBalancer annotations (public vs private, Private Link Service), pod settings (termination grace period), and shutdown behavior (drain timeouts). Platform teams create these to define infrastructure patterns.

GatewayClass – This is a cluster-scoped resource from the Gateway API spec. It acts as a template that application teams can reference. GatewayClasses point to an EnvoyProxy resource (via parametersRef) to define what kind of infrastructure gets created. Platform teams create one GatewayClass per deployment pattern (public, private, etc.), and application teams just reference the class name.

Gateway – This is the actual instance that provisions infrastructure. When you create a Gateway resource and reference a GatewayClass, Envoy Gateway automatically creates a Deployment with Envoy proxy pods and a LoadBalancer service in the envoy-gateway-system namespace. Application teams create Gateways in their own namespaces, but the infrastructure lives in the platform namespace.

The layering makes sense once you see it: EnvoyProxy defines “how to deploy”, GatewayClass packages it as a reusable template, and Gateway creates the actual running infrastructure. This separation means platform teams control infrastructure patterns while application teams self-service their own Gateways.

Option 1: Public Internet-Facing

This is the simplest setup for workloads that need to be accessible from the internet. First, create an EnvoyProxy resource with graceful shutdown settings to prevent connection drops during upgrades:
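A minimal sketch of that EnvoyProxy resource, assuming it is named eg-public-config. The shutdown block is part of the EnvoyProxy spec; how the pod's terminationGracePeriodSeconds (300 seconds in my setup) gets set varies by Envoy Gateway version, so check the CRD for your release:

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: eg-public-config
  namespace: envoy-gateway-system
spec:
  # Graceful shutdown: stop accepting new connections, then drain
  # existing ones before the Envoy pod terminates.
  shutdown:
    drainTimeout: 120s
    minDrainDuration: 5s
```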

Create a GatewayClass that uses this configuration:
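Assuming the EnvoyProxy above is named eg-public-config, the GatewayClass would look roughly like this — the controllerName is fixed by Envoy Gateway, while the class name is your choice:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: eg-public
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  # Point at the EnvoyProxy resource that defines the deployment pattern
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: eg-public-config
    namespace: envoy-gateway-system
```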

Then create a Gateway using that class:
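For example (the Gateway name and namespace are illustrative — application teams would create this in their own namespace):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: eg-public
  namespace: default
spec:
  gatewayClassName: eg-public
  listeners:
  - name: http
    protocol: HTTP
    port: 80
```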

Check the external IP:
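Envoy Gateway labels the generated service with the owning Gateway's name, so you can filter on that:

```shell
# The data plane service lives in envoy-gateway-system, not default
kubectl get svc -n envoy-gateway-system \
  -l gateway.envoyproxy.io/owning-gateway-name=eg-public
```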

Notice the service is created in envoy-gateway-system, not in the default namespace where the Gateway resource lives. This confused me at first, but it makes sense for multi-tenancy. The platform team controls the infrastructure namespace, and application teams just create Gateway resources in their own namespaces.

Option 2: Internal Private LoadBalancer

For workloads that should only be accessible within your Azure VNet (like internal APIs or admin interfaces), you need a private LoadBalancer. This EnvoyProxy resource includes both the private LoadBalancer annotation and graceful shutdown settings:
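A sketch of that resource, assuming the name eg-private-config — the internal LoadBalancer annotation is the standard Azure cloud provider one:

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: eg-private-config
  namespace: envoy-gateway-system
spec:
  shutdown:
    drainTimeout: 120s
    minDrainDuration: 5s
  provider:
    type: Kubernetes
    kubernetes:
      envoyService:
        annotations:
          # Ask the Azure cloud provider for an internal LoadBalancer
          service.beta.kubernetes.io/azure-load-balancer-internal: "true"
```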

Create the GatewayClass that references this configuration:
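Same shape as the public class, just pointing at the private EnvoyProxy config (names assumed from the pattern above):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: eg-private
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: eg-private-config
    namespace: envoy-gateway-system
```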

Now create a Gateway using the private class:
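For example:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: eg-private
  namespace: default
spec:
  gatewayClassName: eg-private
  listeners:
  - name: http
    protocol: HTTP
    port: 80
```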

Get the private IP:
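Filtering on the owning-gateway label again:

```shell
# EXTERNAL-IP will show a private address from your AKS subnet
kubectl get svc -n envoy-gateway-system \
  -l gateway.envoyproxy.io/owning-gateway-name=eg-private
```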

This gives you a private IP address within your AKS subnet. You’ll need to test from a VM in the same VNet or through a VPN/bastion connection.

Option 3: Private LoadBalancer with Private Link Service

If you’re using Azure Front Door Premium and want to connect via Private Link, you need to enable Private Link Service on the LoadBalancer. This is my preferred setup for production workloads because it keeps traffic off the public internet entirely. This configuration includes the Private Link Service annotations and graceful shutdown settings:
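A sketch, assuming the name eg-private-pls-config. The azure-pls-* annotations are the standard Azure cloud provider ones, and the PLS name is illustrative; Private Link Service attaches to an internal LoadBalancer frontend, so the internal annotation stays:

```yaml
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: EnvoyProxy
metadata:
  name: eg-private-pls-config
  namespace: envoy-gateway-system
spec:
  shutdown:
    drainTimeout: 120s
    minDrainDuration: 5s
  provider:
    type: Kubernetes
    kubernetes:
      envoyService:
        annotations:
          service.beta.kubernetes.io/azure-load-balancer-internal: "true"
          # Have the cloud provider create a Private Link Service
          # in front of the internal LoadBalancer
          service.beta.kubernetes.io/azure-pls-create: "true"
          service.beta.kubernetes.io/azure-pls-name: "pls-envoy-gateway"  # illustrative name
```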

Create the GatewayClass:
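Following the same pattern:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: eg-private-pls
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
  parametersRef:
    group: gateway.envoyproxy.io
    kind: EnvoyProxy
    name: eg-private-pls-config
    namespace: envoy-gateway-system
```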

Create the Gateway:
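For example:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: eg-private-pls
  namespace: default
spec:
  gatewayClassName: eg-private-pls
  listeners:
  - name: http
    protocol: HTTP
    port: 80
```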

Verify the Private Link Service was created:
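The cloud provider creates the PLS in the cluster's node resource group (the MC_* one), so something like this should list it — the resource group placeholder is yours to fill in:

```shell
# Look for a PLS named after the azure-pls-name annotation
az network private-link-service list \
  --resource-group <node-resource-group> \
  --output table
```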

Important:

Azure creates the Private Link Service automatically, but you need to manually approve the Private Endpoint connection from Front Door before traffic will flow. I wasted 20 minutes troubleshooting connectivity before realizing I needed to approve it in the Azure Portal under the Private Link Service resource. You can also approve it via CLI:
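A sketch of the CLI approval — you can grab the connection ID from the Private Link Service resource, and the description text is free-form:

```shell
# Approve the pending connection from Front Door on the PLS
az network private-endpoint-connection approve \
  --id <private-endpoint-connection-id> \
  --description "Approved for Front Door"
```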

Testing with a Real Application

Let’s deploy a simple application to test the setup. I’ll use the AKS hello-world sample app because it shows useful information about the pod handling the request:
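A sketch of the Deployment and Service using the public AKS sample image (names and replica count are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aks-helloworld
  namespace: default
spec:
  replicas: 2
  selector:
    matchLabels:
      app: aks-helloworld
  template:
    metadata:
      labels:
        app: aks-helloworld
    spec:
      containers:
      - name: aks-helloworld
        image: mcr.microsoft.com/azuredocs/aks-helloworld:v1
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: aks-helloworld
  namespace: default
spec:
  selector:
    app: aks-helloworld
  ports:
  - port: 80
    targetPort: 80
```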

Wait for the pods to be ready:
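Assuming the app label from the manifest above:

```shell
kubectl wait --for=condition=Ready pod \
  -l app=aks-helloworld --timeout=120s
```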

Routing Traffic with HTTPRoute

Now we need to connect the Gateway to your application. This is where HTTPRoute comes in.

HTTPRoute is the Gateway API resource that defines routing rules. While the Gateway handles the infrastructure (LoadBalancer, listeners, ports), HTTPRoute handles the application-level routing. It defines which hostnames to accept, which paths to match, and which backend services to route traffic to. Think of it like the Ingress resource you’re used to, but more expressive and flexible.

The key difference from traditional Ingress is the separation of concerns. Application teams create HTTPRoutes in their own namespaces and reference a Gateway via parentRefs. They don’t need to worry about LoadBalancer configuration or TLS certificates at the infrastructure level. They just define routing logic: “when a request comes in for this hostname and path, send it to this service.”

This is part of what makes Gateway API powerful for multi-tenancy. One Gateway can serve multiple HTTPRoutes from different namespaces and teams. Platform teams control the Gateway (the infrastructure), and application teams control the HTTPRoutes (the routing).

Now create an HTTPRoute to connect your Gateway to the application. Change the parentRefs.name to match whichever Gateway you created (eg-public, eg-private, or eg-private-pls):
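A sketch of that HTTPRoute, assuming the service name from the sample app above:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: aks-helloworld
  namespace: default
spec:
  parentRefs:
  - name: eg-public   # or eg-private / eg-private-pls
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    backendRefs:
    - name: aks-helloworld
      port: 80
```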

Get the Gateway IP and test:
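The assigned address is published on the Gateway's status, so something like this works:

```shell
# Read the address from the Gateway status and send a test request
GATEWAY_IP=$(kubectl get gateway eg-public \
  -o jsonpath='{.status.addresses[0].value}')
curl "http://${GATEWAY_IP}/"
```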

You should see the AKS hello-world web page in the response. If you’re using a private gateway, remember you’ll need to test from within the VNet.

Understanding the Graceful Shutdown Settings

You might have noticed the graceful shutdown configuration I included in all three EnvoyProxy examples above. I learned the importance of this the hard way when I saw 502 errors during a rolling update in my test cluster. Without these settings, Envoy pods get terminated mid-request, which causes connection drops.

Here’s what those settings actually do when a pod needs to shut down:

  1. Kubernetes sends a SIGTERM signal to the Envoy pod
  2. Envoy stops accepting new connections immediately
  3. Envoy waits up to drainTimeout (120 seconds) for existing requests to finish
  4. After minDrainDuration (5 seconds minimum), Envoy can shut down if all connections have closed
  5. Kubernetes waits up to terminationGracePeriodSeconds (300 seconds) before force-killing the pod

The 5-minute termination grace period might seem excessive, but long-lived connections (WebSockets, streaming APIs, file uploads) need time to close cleanly. The 2-minute drain timeout handles most HTTP requests while still being reasonable. The 5-second minimum prevents race conditions where connections close too quickly and clients retry into a shutting-down pod.

You can adjust these values based on your application’s connection patterns. If you only have short-lived HTTP requests, you might reduce the drain timeout. If you have long-running gRPC streams or WebSocket connections, you might increase it. But for most workloads, the values I’ve shown are a solid starting point.

Wrapping Up

That’s Envoy Gateway on AKS. You’ve got three production-ready configuration patterns (public, private, private + PLS), graceful shutdown baked in to prevent connection drops during upgrades, and a working HTTPRoute to see traffic flowing.

The obvious next step is adding HTTPS. Envoy Gateway works well with cert-manager for automatic TLS certificate management via Let’s Encrypt. You add an HTTPS listener to your Gateway, create a Certificate resource, and cert-manager handles the renewal automatically. I’ll probably write a follow-up post on that setup since the integration with Gateway API is cleaner than what we had with Ingress. The cert-manager team published their roadmap for XListenerSet support (experimental in v1.20, released Feb 2026), which will eventually restore the self-service TLS workflow that multi-tenant Ingress users are familiar with. Worth watching if you need per-team TLS management on a shared Gateway.

For more advanced features like rate limiting, authentication policies, or WAF integration with Coraza, the Envoy Gateway documentation has solid examples. The SecurityPolicy and BackendTLSPolicy resources give you fine-grained control over things that required custom annotations or ConfigMap hacks with NGINX.

If you’re coming from my NGINX retirement post, this is the Phase 2 migration I recommended. You can run the Application Routing add-on and Envoy Gateway side-by-side, migrate routes incrementally, and validate everything works before fully switching over. For teams migrating from Ingress, the ingress2gateway tool can convert existing Ingress resources to Gateway API HTTPRoute format as a starting point.

The learning curve is real, especially if you’re used to Ingress. But the architecture makes more sense for multi-team environments, and the Gateway API feels like where Kubernetes ingress should have been from the start. I’ve been running this setup in test clusters for a few weeks now, and it’s solid.


Pixel Robots.

I’m Richard Hooper, aka Pixel Robots. I started this blog in 2016 for a couple of reasons. The first was simply to have a place to store my step-by-step guides, troubleshooting notes, and ideas about being a sysadmin. The second was to share what I’ve learned with other people like me. Hopefully you can find something useful on the site.
