OS upgrades using the Avassa Edge platform (including RHEL for Edge session)

In this blog post, I will show how you can upgrade the operating system of an Avassa cluster, with no impact on running applications.

I have lots of edge servers, now what?

Using the Avassa platform typically entails having many computers distributed over many locations, with the computers in each location organized in a cluster. Each computer has an operating system that needs to be patched, at the very least when it is affected by a CVE.

Red Hat provides an OS for edge environments: RHEL for Edge, accompanied by the Red Hat Hybrid Cloud Console. This setup allows an operator to build new OS images and then distribute the updated images to edge computers.

With that in place, Avassa has built-in functions for draining hosts to prepare them for OS upgrades and reboots. Draining a host means that all running applications are migrated to other hosts in the cluster. If the application implements what we call delayed shutdown, it can control when it’s stopped, e.g. after all ongoing requests have been handled.

With delayed shutdown in place, when a host is drained, a new instance of the application is started on a new host. DNS is automatically reconfigured to steer new traffic to the new instance. Once the original instance considers itself to be ready, that instance is stopped. When this is done, you will have an empty host that can be rebooted without affecting the running applications.
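The ordering described above (start a replacement, repoint DNS, then stop the original once it is ready) can be illustrated with a small in-memory model. This is only a sketch to make the sequence concrete: the Cluster class, drain_host function, and ready_to_stop callback are hypothetical names invented for this example and are not part of the Avassa platform or its API.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy in-memory model of one site; NOT the Avassa data model."""
    hosts: dict[str, set[str]]                          # host name -> running app instances
    dns: dict[str, str] = field(default_factory=dict)   # app -> host that receives new traffic


def drain_host(cluster: Cluster, host: str, ready_to_stop) -> list[str]:
    """Drain `host` and return the ordered list of events that took place.

    `ready_to_stop(app)` is a hypothetical stand-in for the application's
    delayed shutdown: it returns True once the old instance may be stopped.
    """
    events = []
    targets = [h for h in cluster.hosts if h != host]
    if not targets:
        raise ValueError("no other host to reschedule onto")
    for app in sorted(cluster.hosts[host]):
        # 1. Start a replacement on the least loaded remaining host.
        target = min(targets, key=lambda h: len(cluster.hosts[h]))
        cluster.hosts[target].add(app)
        events.append(f"start {app} on {target}")
        # 2. Steer new traffic to the replacement.
        cluster.dns[app] = target
        events.append(f"dns {app} -> {target}")
        # 3. Stop the original only once it reports that it is ready.
        #    (Busy-waits for brevity; a real loop would sleep between polls.)
        while not ready_to_stop(app):
            pass
        cluster.hosts[host].discard(app)
        events.append(f"stop {app} on {host}")
    return events
```

The point of the sketch is the ordering: the old instance keeps serving its ongoing requests until the new one is already receiving new traffic, so there is no moment where the application has no instance at all.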

Easy peasy!

Hold on, you say, I only have a single host at every site, which of the below applies to me? Of course, applications cannot be migrated to other hosts during the upgrade, but you can still use e.g. the delayed shutdown feature (to drain the application of traffic), and the OS upgrade itself works equally well for single-host sites.

This article is a summary of a “RHEL Presents” session; the video in its entirety can be found here.

But wait… in this article and in the video we do these steps manually. When you have more than a handful of sites, manual actions quickly become both tedious and error-prone. Both Avassa and Red Hat provide APIs for all actions described below. My recommendation is to fully automate these steps; it always pays off in the long term.
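As a rough idea of what such automation could look like, here is a minimal sketch of a rolling upgrade loop that handles one host at a time. The drain, upgrade, and is_healthy callables are placeholders that you would implement against the Avassa and Red Hat APIs; their names and signatures are assumptions made for this example, not actual API calls.

```python
import time
from typing import Callable, Iterable


def rolling_upgrade(
    hosts: Iterable[str],
    drain: Callable[[str], None],       # placeholder: drain the host via the Avassa API
    upgrade: Callable[[str], None],     # placeholder: trigger the OS update via the Red Hat API
    is_healthy: Callable[[str], bool],  # placeholder: is the host back and ready?
    max_checks: int = 30,
    poll_seconds: float = 10.0,
) -> list[str]:
    """Upgrade hosts one at a time; stop at the first host that does not return."""
    upgraded = []
    for host in hosts:
        drain(host)
        upgrade(host)
        # Poll until the host reports healthy, giving up after max_checks attempts.
        for _ in range(max_checks):
            if is_healthy(host):
                break
            time.sleep(poll_seconds)
        else:
            raise RuntimeError(
                f"{host} did not come back; upgraded so far: {upgraded}"
            )
        upgraded.append(host)
    return upgraded
```

Stopping at the first failure is a deliberate choice in this sketch: if one host does not come back, the remaining hosts in the cluster are left untouched and keep running the applications.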

Meet Avassa drain host and reschedule

To make this effective, you should consider using the delayed shutdown feature.

The video below shows the demo application configuration. Note the configuration of a delayed-shutdown-command. In this demo, the delayed shutdown just sleeps for a while. In a real-world scenario, the delayed-shutdown-command should make sure the container is ready to be stopped.
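To give a feel for what a real-world check might do instead of sleeping, here is a minimal sketch that waits until the application reports no ongoing requests, with a timeout as a safety net. The wait_until_idle function and the in_flight callback are hypothetical; how your application exposes its number of ongoing requests, and how the command is wired into the application configuration, are not covered here.

```python
import time
from typing import Callable


def wait_until_idle(
    in_flight: Callable[[], int],   # hypothetical: returns the number of ongoing requests
    timeout_seconds: float = 60.0,
    poll_seconds: float = 0.5,
) -> bool:
    """Return True once no requests are in flight, False if the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if in_flight() == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
```

The timeout matters: without it, a single stuck request could hold up the drain, and with it the OS upgrade, indefinitely.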

Watch video clip on YouTube.

After deploying this application, it’s time to upgrade the OS. First, we identify which machine this application instance is running on. Selecting the cluster and host, we see the instance listed in the service instances section: visitor-counters-service-1.

Video clip on YouTube.

Knowing that, we can now drain this host of running applications, using a function we call “Drain host & reschedule”. There are other options here as well, e.g. “Drain host, no reschedule”, which just shuts each application down without trying to move it to another host. This can be helpful if the host downtime is short and it’s okay for this part of the application to be offline for that period of time. Maybe you have multiple instances of this app running?

Video clip on YouTube.

This brings the system into the state shown below: a delayed shutdown in progress on the host we will upgrade, and a new replica on another host.

Video clip on YouTube.

Once the application has shut down on the host, we can go to the Red Hat Hybrid Cloud Console and upgrade that host.

Video clip on YouTube.

Once the host is upgrading/restarting, we can see in the Control Tower that the Edge Enforcer is down.

A couple of moments later it’s back and ready for business.

Conclusion

We are certain that with this technique, you can upgrade your entire fleet of hosts without impacting application operations.