In a previous article series, we looked at the difference between central cloud and edge application orchestration. In that series, we defined distributed edge orchestration as:
💡 Distributed edge orchestration = central edge orchestration + local edge orchestration
But before we get any further, let’s take a closer look at what specific terminology is used — should we call it orchestration or management, or both?
- Orchestration coordinates larger tasks to achieve a business goal, like scheduling applications. For example, K8S orchestrates applications within one cluster.
- Management automates “lower-level” operational tasks, such as upgrades, backups, and configuration. Rancher is an example of a management tool that handles the task of running multiple clusters.
A well-functioning management layer is necessary for orchestration. Because both terms carry legacy usage, it is difficult to create strict definitions. However, with these definitions as a starting point, we can now break down the two orchestration layers discussed in previous articles into six distinct layers:
Before elaborating further, let’s take a closer look at the numbered items in the screenshot above:
- Bring your own hardware: For the edge use case, it is essential to be able to bring your own heterogeneous hardware platforms to run your applications at various sites.
- On each edge host, we assume there is an OCI-compliant single-node container management runtime, such as Docker. Containers are and will be the application format for the edge for the foreseeable future.
- Single-site cluster orchestration: Each edge site consists of one or several hosts. These hosts must form a cluster with the primary goal of running local applications at the edge site. Think of this as a K3S cluster or Avassa Edge Enforcer cluster.
- Multi-site cluster management. Each edge site is a local cluster. In the edge use case, this can be thousands of edge sites. These clusters must be managed from a lifecycle perspective, such as monitoring, configuration changes, and upgrades. Most solution vendors have a centralized management system (Rancher, Avassa Control Tower, and so on).
- Edge-native application services. Most edge applications have some software components deployed at edge sites and other components deployed in a central location. This kind of distributed system often requires quite complicated implementation of common application services, such as event streaming, secrets management, and container registries. In addition, the footprint and cost limitations of edge computing make it difficult to build an integrated solution of open-source packages for these services. Therefore, Avassa provides a set of such distributed application services that are custom-built to work together in edge environments.
- Multi-site application orchestration. This is a fundamental piece that is, to some degree, overlooked by infrastructure-centric management solutions. In the end, you want to deploy your applications across edge sites. Applications must be first-class citizens and the site clusters should just work.
Bring your own hardware
For the edge use cases addressed in this article, it is fundamental that the applications run on any hardware your organization owns. The solution needs to support both Intel and ARM architectures and be capable of discovering both the host resources and any attached devices. The application scheduler also needs access to these characteristics so that it can place the applications correctly.
Furthermore, the solution must manage when hosts are added, removed, and swapped on the site. Therefore, to keep up with changing sites, the central management solution needs automatic call-home mechanisms. The solution must also work on hosts ranging from small Raspberry Pis to enterprise-grade servers.
Single-site cluster orchestration
This layer should ensure that site hosts form a cluster that can support running applications. Kubernetes and its incarnations like K3S are the most well-known platforms for creating a single cluster. For the edge use case, the cluster manager at each site should be as autonomous as possible in order to run without stable network connectivity. The solution should be lean on resources in order to fit the footprint requirements of the edge.
Core features for the single-site cluster orchestrator are:
- Local site application orchestration: Schedule applications and their containers on the available hosts, taking host characteristics like disk and CPU usage and available resources like GPUs into account.
- Underlay networking: Connect all hosts on the site over a secure network that can be used as an underlay for application networking.
- Application networking: Set up a dedicated secure network for individual applications, populating DNS for service discovery, and configuring ingress networking.
- State replication: Replicate states such as scheduler state, local configuration, and so on so the cluster can survive host failures.
- Security functions: Secure all data on the network. In the edge use case, hosts are not perimeter-secured, so they might be accessed by unauthorized personnel or even stolen. All application data, therefore, needs to be encrypted. Cryptographic isolation with separate keys must be enforced between tenants and sites. IT also needs to be able to easily block a tenant, site, or host if it is compromised.
- Application and infrastructure monitoring: Provide observability for the infrastructure as well as the applications. This includes health states, synthetic monitoring, logs and metrics, and topology information. It is important that this is provided locally for each site, so that aggregated site context information can be sent further up in the stack. Read more about this in the Edge Observability white paper.
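As an illustration of the first feature in this list, here is a minimal sketch of how a site-local scheduler might pick a host for a container. The `Host` and `Container` types, their fields, and the "most free CPU wins" rule are assumptions made for this example; they are not the API or algorithm of K3S, Avassa, or any other product.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    free_cpu: float            # free CPU cores
    free_disk_gb: float        # free disk space in GB
    devices: set = field(default_factory=set)  # e.g. {"gpu", "camera"}


@dataclass
class Container:
    name: str
    cpu: float                 # CPU cores requested
    disk_gb: float             # disk space requested in GB
    needs: set = field(default_factory=set)    # devices the container requires


def schedule(container, hosts):
    """Return the name of the chosen host, or None if no host fits."""
    # Filter: keep only hosts with enough free resources and all required devices.
    candidates = [
        h for h in hosts
        if h.free_cpu >= container.cpu
        and h.free_disk_gb >= container.disk_gb
        and container.needs <= h.devices
    ]
    if not candidates:
        return None
    # Score: prefer the host with the most free CPU (a deliberately simple rule).
    best = max(candidates, key=lambda h: h.free_cpu)
    # Reserve the resources so the next scheduling decision sees the new state.
    best.free_cpu -= container.cpu
    best.free_disk_gb -= container.disk_gb
    return best.name
```

A real scheduler would also handle rescheduling on host failure and replicate its state across the cluster, as described under state replication above.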
Multi-site cluster management
All of the features described in the previous section addressed cluster orchestration for a single site. In the edge use case, however, you may have thousands of sites, each with a local orchestrator performing single-site cluster orchestration. This means you need to manage thousands of clusters and, using a single console, handle the entire lifecycle of the cluster software itself. This includes upgrades and configuration changes.
The edge use case also comes with other important management requirements:
- Security key rotation needs to happen for the networks on each site.
- Observability data across all sites need to be aggregated so that operations teams receive early insights into any edge site issues.
- Heterogeneous compute platforms must be managed on the site. This also includes addition, removal, and swapping of hardware dynamically.
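To make the observability requirement concrete, the sketch below collapses per-application health states into one state per site and then summarizes the whole fleet. The three state names, the "worst state wins" rule, and the function names are assumptions chosen for illustration only.

```python
from collections import Counter

# Higher number means more severe. These state names are assumptions.
SEVERITY = {"healthy": 0, "degraded": 1, "failed": 2}


def site_summary(app_states):
    """Collapse the application states at one site into a single site state."""
    if not app_states:
        return "healthy"
    # The site is only as healthy as its worst application.
    return max(app_states.values(), key=SEVERITY.__getitem__)


def fleet_overview(sites):
    """Return (count of sites per state, sorted list of sites needing attention)."""
    summaries = {site: site_summary(apps) for site, apps in sites.items()}
    counts = Counter(summaries.values())
    attention = sorted(s for s, state in summaries.items() if state != "healthy")
    return counts, attention
```

With thousands of sites, the operations team would look at the aggregated counts first and drill into the `attention` list only when needed.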
Edge-native application services
There are common software patterns for distributed edge applications: secrets need to be distributed from the central solution to specific edge applications; central and edge applications need a way to subscribe to and publish events and logs; container images need to be stored in site-local registries to provide for long network outages. These patterns must be designed for the distributed edge use case, which puts specific requirements on both small footprint and unreliable networks.
These features must also be aligned with the multi-site application orchestration functions below. For example, you may want a secret distributed to a specific set of sites depending on which applications require these secrets.
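The secret example can be sketched as a simple lookup: given which applications use which secrets and where those applications are deployed, compute the set of sites that should receive a secret. The data shapes and the function name below are hypothetical and only illustrate the idea.

```python
def sites_needing_secret(secret, app_secrets, deployments):
    """Return the set of sites that should receive `secret`.

    app_secrets: maps application name -> set of secret names it requires
    deployments: maps application name -> set of sites it is deployed to
    """
    targets = set()
    for app, secrets in app_secrets.items():
        if secret in secrets:
            # Only sites that actually run a consuming application get the secret.
            targets |= deployments.get(app, set())
    return targets
```

Sites that run no application requiring the secret never receive it, which limits exposure if a host at one of those sites is stolen.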
Multi-site application orchestration
The goal of the top layer is to perform application deployments across the edge sites and maintain the desired states for all these applications. Features mentioned in the previous layers should be almost invisible to the application developer. The value lies in the applications, and in general, an organization should not have to build a platform team to manage the infrastructure.
Modern application teams will have an efficient CI/CD pipeline. The goal of this pipeline should be to deploy applications to targeted edge sites. Location, host, and site characteristics matter in this instance. For example, a placement policy could read something like, “Deploy to sites for customer X, on hosts size medium or larger, with cameras attached.”
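The quoted placement policy can be expressed as data and evaluated against site and host descriptions. The following is a minimal sketch; the label keys, the three size classes, and the dictionary layout are assumptions for this example and do not reflect any particular orchestrator's policy language.

```python
# Ordered from smallest to largest so sizes can be compared by index.
SIZES = ["small", "medium", "large"]

# "Deploy to sites for customer X, on hosts size medium or larger,
# with cameras attached."
policy = {
    "site_labels": {"customer": "X"},
    "min_host_size": "medium",
    "devices": {"camera"},
}


def host_matches(host, min_size, required_devices):
    """True if the host is at least `min_size` and has all required devices."""
    big_enough = SIZES.index(host["size"]) >= SIZES.index(min_size)
    return big_enough and required_devices <= host["devices"]


def place(policy, sites):
    """Return (site name, host name) pairs that satisfy the policy."""
    placements = []
    for site in sites:
        labels = site["labels"]
        # Skip sites whose labels do not match every label in the policy.
        if any(labels.get(k) != v for k, v in policy["site_labels"].items()):
            continue
        for host in site["hosts"]:
            if host_matches(host, policy["min_host_size"], policy["devices"]):
                placements.append((site["name"], host["name"]))
    return placements
```

Because the policy is declarative, new sites that match the labels can be picked up automatically the next time the policy is evaluated.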
The orchestrator should allow for fine-grained control and insight into application status and location. The artifacts for the deployment phase need to be at the application level, which includes an aggregate of containers and their configurations.
Application orchestration also needs strong support for observability. Based on observability functions provided by each site, the central solution should support application-centric functions such as proactive alerts on applications with degrading performance, support for analyzing and fixing the issue, and validating the correction. It also needs to provide a precise mapping between individual applications and the resources on the sites. This is critical to shortening the resolution process by avoiding the blame game between the application and infrastructure teams.
Finally, the application orchestration layer needs to be optimized for swiftly deploying changes to applications with minimal disturbance. When the application team changes application configuration, container versions, and so on, these need to be pushed to the correct sites without hassle.
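One way to think about pushing changes to the correct sites is as a comparison between the desired state and what each site reports. The sketch below assumes a hypothetical model in which each site reports the application version it runs; the function and parameter names are illustrative only.

```python
def plan_rollout(desired_version, placement, reported):
    """Compare desired state with reported state for one application.

    placement: set of sites where the application should run
    reported:  maps site -> application version currently running there
    Returns (sites to update or install, sites to remove the application from).
    """
    # Sites that should run the application but run a different version or none.
    to_update = sorted(s for s in placement if reported.get(s) != desired_version)
    # Sites that still run the application although they are no longer targeted.
    to_remove = sorted(s for s in reported if s not in placement)
    return to_update, to_remove
```

Sites that already run the desired version appear in neither list, so a change disturbs only the sites that actually need it.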
Checklist for your edge application orchestrator. Does it support the following?
If so: yay! You’re set!