Microsoft protects our AKS clusters by automatically applying OS security and kernel updates to the nodes in the cluster. Some of these updates need a reboot to complete, but Microsoft does not reboot the nodes as part of this patching. You could do an AKS upgrade, which reboots the nodes one at a time and finishes off any pending updates. But what happens when there is no AKS upgrade? kured (KUbernetes REboot Daemon, https://github.com/weaveworks/kured) comes to the rescue. Below I will show you how to set up kured and reboot your nodes.

Note

Kured is an open-source project by Weaveworks. Support for this project in AKS is provided on a best-effort basis. Additional support can be found in the #weave-community Slack channel.

Prerequisites

  • An AKS cluster
  • Azure CLI version 2.0.59 or later
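If you want to check the Azure CLI prerequisite from a script, here is a minimal sketch. The helper name version_at_least is my own and not part of the Azure CLI, and it assumes a sort that supports -V (GNU coreutils does).

```shell
# Hypothetical helper (not part of the Azure CLI):
# succeeds when version $1 is greater than or equal to version $2.
version_at_least() {
  # sort -V orders version strings numerically, so the smaller one comes first.
  # If the smaller one is the required version, the installed one is new enough.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example (needs the Azure CLI; `az version` only exists in newer releases,
# on older ones read the number from `az --version` instead):
#   installed="$(az version --query '"azure-cli"' --output tsv)"
#   version_at_least "$installed" 2.0.59 && echo "Azure CLI is new enough"
```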

Time to deploy kured

Kured is deployed as a DaemonSet. The YAML manifest we are going to use (from their GitHub page) creates a role and cluster role, their bindings, a service account, and the DaemonSet. To deploy kured, make sure you are connected to the AKS cluster you want to install it on, then apply the manifest with kubectl apply.
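As a rough sketch, the deployment could look like the snippet below. The version number and the manifest file name are assumptions based on the naming of the kured release assets, so check the releases page on GitHub for the one you want. The function name deploy_kured and the resource group and cluster names in the comment are placeholders of mine.

```shell
# ASSUMPTION: release version and asset name. Check the kured releases page
# and override KURED_VERSION if you need a different release.
KURED_VERSION="${KURED_VERSION:-1.2.0}"
KURED_MANIFEST_URL="https://github.com/weaveworks/kured/releases/download/${KURED_VERSION}/kured-${KURED_VERSION}-dockerhub.yaml"

# Connect to the cluster first (placeholder names, replace with your own):
#   az aks get-credentials --resource-group myResourceGroup --name myAKSCluster

# Hypothetical wrapper: applies the manifest, which creates the roles,
# bindings, service account and the kured DaemonSet.
deploy_kured() {
  kubectl apply -f "$KURED_MANIFEST_URL"
}
```

Run deploy_kured once you are connected to the right cluster.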

You can configure additional parameters when deploying kured, but that is outside the scope of this guide. You can read more about them in the kured documentation: https://github.com/weaveworks/kured#installation

The reboot process

Every night the nodes check for updates, and when an update needs a reboot the node creates a file called /var/run/reboot-required. The kured DaemonSet runs a pod on each node in your AKS cluster. This pod watches for the existence of that file and, when it appears, starts the reboot process for the node. The process begins by taking a lock through the Kubernetes API, which ensures that only one node is rebooted at a time.
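To make the trigger concrete, here is a small sketch of the check in shell. Kured performs this check itself inside its pod, so you do not need to run anything like this. The helper name reboot_required is my own.

```shell
# Path of the sentinel file that signals a pending reboot.
SENTINEL_FILE="${SENTINEL_FILE:-/var/run/reboot-required}"

# Hypothetical helper: succeeds when the sentinel file exists.
# An optional first argument overrides the path to check.
reboot_required() {
  [ -f "${1:-$SENTINEL_FILE}" ]
}

# Example, run on a node:
#   if reboot_required; then echo "This node is waiting for a reboot"; fi
```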

The node that holds the lock is then cordoned so that no new pods are scheduled onto it, all running pods are drained from it, and it is rebooted.
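If you have never cordoned and drained a node by hand, the sketch below shows roughly what the manual equivalent looks like with kubectl. This only illustrates the step: kured does it for you through the Kubernetes API. The function names are my own.

```shell
# Hypothetical helper: drain_node NODE
# kubectl drain cordons the node first, then evicts its pods.
# --ignore-daemonsets lets the drain proceed even though DaemonSet pods
# (kured itself included) stay on the node.
drain_node() {
  kubectl drain "$1" --ignore-daemonsets
}

# Hypothetical helper: uncordon_node NODE
# Allows pods to be scheduled on the node again once it is back.
uncordon_node() {
  kubectl uncordon "$1"
}
```

For example, drain_node aks-nodepool1-12345678-0 followed later by uncordon_node aks-nodepool1-12345678-0, where the node name is a made-up placeholder.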

You can monitor the process by watching the nodes with kubectl. Drop the --watch flag if you only want to see the status at that moment.
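A minimal sketch of both variants, wrapped in one small function whose name (node_status) is my own:

```shell
# Hypothetical wrapper around `kubectl get nodes`.
# Any extra arguments are passed straight through to kubectl.
node_status() {
  kubectl get nodes "$@"
}

# Stream status changes until you press Ctrl+C:
#   node_status --watch
# Print the current status once:
#   node_status
```

While a node is being rebooted you should see its status change, for example to show that scheduling is disabled, and then return to Ready.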

This image shows a cluster upgrade. My cluster did not need any updates at the time of writing, so I used a cluster upgrade to show you what you would see.

After the updates and reboots have finished, you can use the following command to check the status and patch level of the nodes.
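One way to do this, as a sketch, is to ask kubectl for the wide node listing, which adds the OS-IMAGE and KERNEL-VERSION columns to the usual status output. The function name node_patch_level is my own.

```shell
# Hypothetical wrapper: lists the nodes with the extra columns of the
# wide output, including OS-IMAGE and KERNEL-VERSION.
node_patch_level() {
  kubectl get nodes --output wide
}
```

Compare the KERNEL-VERSION column before and after the reboots to confirm that the nodes picked up the new kernel.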

Now your AKS cluster will reboot its nodes automatically and safely whenever an update requires it. I hope you found this article helpful. If you have any questions, please reach out.

Pixel Robots.

I’m Richard Hooper aka Pixel Robots. I started this blog in 2016 for a couple reasons. The first reason was basically just a place for me to store my step by step guides, troubleshooting guides and just plain ideas about being a sysadmin. The second reason was to share what I have learned and found out with other people like me. Hopefully, you can find something useful on the site.
