October 2023: Feature releases & highlights

TPM/HSM, offline modifications

The October releases deliver heavy-weight features around security and offline capabilities.

Image-download status in Web UI: the Control Tower now shows the progress of image downloads.
An extra layer of edge security with hardware-enabled cryptographic functions: you can now store site keys in TPM and HSM modules if they are available on the site.
Offline operations: you can now modify applications locally on the site while disconnected from the Control Tower. The configuration will be reconciled based on well-defined principles.

Image download status

In last month’s feature highlights, we mentioned the enhanced functionality around image downloads, download progress feedback, and download restore capabilities.

🆕 The Control Tower Web UI now gives clear feedback to the user concerning the image download progress:

Control Tower Web UI showing image download progress with status and percentage complete.

When you select an application, you will be presented with a table of all containers and their download status/progress.

Edge Security: TPM and HSM support

Edge sites and hosts are vulnerable to security threats. Unlike a data center, the edge has no perimeter security. Hosts can be stolen, they might be connected over insecure networks, and an intruder might try to connect malicious hosts to the cluster. Therefore, the edge orchestrator must be able to trust the edge hosts, and the data must be encrypted. Avassa has focused on intrinsic edge site security features since day 1. We have now added enhanced support for hardware cryptographic functions using TPM and HSM.

Let us start by describing how sites are secured.

Avassa sites are by default “sealed” for security reasons. This implies that a sealed site is not functional or trusted: APIs, command lines, etc., are locked down. When the site bootstraps, it generates a site-local seal key needed to enable the site. That key is protected and can be restored to the site.

The bootstrap procedure works like this:

The site gets a token from Control Tower, which allows storing and reading secrets in the Avassa built-in secrets manager.
The site generates a public-private key pair.
The private key is used to encrypt the token.
The token is then split using the Shamir algorithm, and each piece is stored in a controller host at the site.
The public key is used to encrypt the seal key.
Finally, the encrypted seal key is stored in the Control Tower.

Diagram illustrating the site security process with token generation and key encryption.

This means that the token can be recovered if a majority of the controllers are available at the site.
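To make the majority rule concrete, here is a minimal sketch of Shamir secret sharing in Python. It is not the Avassa implementation: the function names, the choice of prime field, and the idea of representing the token as an integer are all assumptions made for illustration. It shows how a secret can be split into one share per controller so that any majority of the shares recovers it.

```python
import secrets

# Assumption for this sketch: all arithmetic is done modulo a Mersenne prime,
# and the secret is an integer smaller than that prime.
PRIME = 2**127 - 1


def majority(n: int) -> int:
    """Smallest number of controllers that forms a majority of n."""
    return n // 2 + 1


def split(secret: int, n: int, k: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares so that any k of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in the range [0, PRIME)")
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    # One share (x, f(x)) per controller, with x = 1..n.
    return [(x, evaluate(x)) for x in range(1, n + 1)]


def recover(shares: list[tuple[int, int]]) -> int:
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With three controllers, split(token, 3, majority(3)) yields three shares, and any two of them passed to recover return the token. A single share on its own does not determine the polynomial, which is why one stolen host is not enough.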

So what happens now when a site/host is restarted? (Note well, it might be a restart after it has been stolen or moved out of the cluster). The site can restore the token if the majority of controllers are available. Then, the scheme above is reversed, and the token is recovered and decrypted. With that token, the site can request the seal key from the Control Tower, which is now decrypted at the site with the private key.

So if a host is stolen and booted, it cannot restore its token since the shares are spread amongst the other controllers.

This scheme enforces a strong level of security. But in some circumstances, you might see a restart of sites without connection to the Control Tower, for example, after a severe power failure accompanied by network issues. You can flag a site to allow for local unseal. In that case, the seal key rather than the token is encrypted and split across the controllers. This means that the site can unseal itself without connection to the Control Tower. This can be considered a security risk if only one host is on the site, since that single host then holds everything needed to unseal.

🆕 TPM and HSM support for cryptographic functions

To provide an extra layer of security, we have added support for hardware cryptographic functions in the form of TPMs (Trusted Platform Modules) and HSMs (Hardware Security Modules). TPMs are chips on the motherboard, whereas HSMs are external modules. The latter are more commonly used in industries with higher security requirements, whereas TPMs are more used in consumer electronics.

The above mechanisms for storing keys and Shamir shares can now be underpinned by TPM and HSM modules on the hosts. This will provide a strong layer of trust regarding host authenticity.

To utilize this feature, you need to enable the Parsec service on the hosts: https://github.com/parallaxsecond/parsec.

Parsec is an open-source initiative to provide a standard API to secure services in a platform-agnostic way.

Read more on how to use TPM and HSM on your sites.

Offline operations

As pointed out above, security is a foundational requirement for edge solutions. Another sometimes neglected characteristic is that sites need to be autonomous. They should not depend on the central component to perform healing actions. In the Avassa solution, the Edge Enforcers on a site are fully autonomous; they take all scheduling and healing actions without interacting with the Control Tower. We have elaborated on that in a previous tech blog.

So even before the 2023 October Releases, you could trust your edge sites. Your deployed applications will heal and migrate if needed in the cluster, even when the site is disconnected. You could also use the Avassa Command line tool at the site to, for example, check application and host status and restart services.

🆕 Offline modifications

We have now added support for local modifications at the site and reconciliation mechanisms when the site is connected to the Control Tower.

This is a way for a local IT operations team to add/modify applications and site configuration without using the Control Tower. This is relevant, for example, if you have an important application upgrade to perform on a site while it is disconnected. The changes are made through supctl or the Avassa API connected to the site. (Note that you can perform site-local modifications irrespective of whether you are connected or not.)

The following modifications can be made:

Applications and application deployments can be created, deleted, and modified.
Site configuration can be modified.
Secrets can also be fully modified.

This covers the main use cases while you are disconnected. Things you will not be able to modify are, for example, tenants, users, and policies.

And what is the conflict resolution principle?

Local modifications stay, irrespective of whether you are connected or not, as long as no changes are made centrally to the same entity.
When the same application, deployment, etc, is modified in the Control Tower, the site’s local entity will be overridden by the Control Tower configuration. There is no merge; the local configuration is deleted and replaced.
A locally modified state flag will be propagated to the Control Tower when local modifications are made.
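The three principles above can be sketched as a small Python model. This is a toy illustration under stated assumptions, not Avassa code: SiteEntity, modify_locally, and reconcile are hypothetical names, configurations are represented as plain strings, and clearing the flag after a central override is our reading of the scenario described later in this post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SiteEntity:
    """Hypothetical stand-in for an application, deployment, etc. on a site."""
    config: str
    locally_modified: bool = False


def modify_locally(entity: SiteEntity, new_config: str) -> SiteEntity:
    """A local change replaces the config and raises the locally modified flag."""
    return SiteEntity(config=new_config, locally_modified=True)


def reconcile(site: SiteEntity,
              central_config: Optional[str],
              central_changed: bool) -> SiteEntity:
    """Apply the conflict resolution principle when the site is connected.

    - If the same entity was changed centrally, the central configuration
      replaces the local one completely (no merge).
    - Otherwise the local modification stays untouched.
    """
    if central_changed and central_config is not None:
        # Assumption: the flag is cleared once the central config takes over.
        return SiteEntity(config=central_config, locally_modified=False)
    return site
```

The design choice worth noting is that there is no field-by-field merge: reconcile either returns the site entity unchanged or replaces it wholesale.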

Another fundamental piece to know before showing some examples is a subtle difference in semantics regarding applications and deployments vs site configuration and secrets.

The sites have a copy of the configuration for site configuration and secrets. This means that they can be modified directly on the site.

Applications deployed from the Control Tower do not have a configuration representation on the site. They result from the scheduler and therefore have a read-only state representation locally at the site. This means that to modify a centrally deployed application while you are disconnected at the site, you must create a site-local “copy” before you can edit.
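The difference between read-only state and editable configuration can be pictured with a small Python model. Everything here is hypothetical and only meant to illustrate the semantics: the Site class, its method names, and the ReadOnlyError exception are not part of any Avassa API.

```python
class ReadOnlyError(Exception):
    """Raised when editing an application that has no site-local configuration."""


class Site:
    def __init__(self) -> None:
        # Application name -> running spec. Always present for deployed apps.
        self.state: dict[str, dict] = {}
        # Application name -> site-local configuration. Only present for
        # applications that were created (or re-created) locally.
        self.config: dict[str, dict] = {}

    def deploy_from_control_tower(self, name: str, spec: dict) -> None:
        # A central deployment only leaves a state representation on the site.
        self.state[name] = dict(spec)

    def create_local(self, name: str, spec: dict) -> None:
        # Creating the application locally gives the site its own configuration.
        self.config[name] = dict(spec)
        self.state[name] = dict(spec)

    def edit(self, name: str, **changes) -> None:
        if name not in self.config:
            raise ReadOnlyError(f"{name}: no site-local configuration to edit")
        self.config[name] = {**self.config[name], **changes}
        self.state[name] = dict(self.config[name])
```

In this model, calling edit on an application that was only deployed centrally fails, while calling create_local first and then edit succeeds. That mirrors the "copy before you edit" rule described above.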

Let us illustrate this with an example. Assume we have three sites, electric-cinema, metrograph, and edge-site2. At time T0 we deploy the application A to all sites. Later in the scenario, we will do local modifications at site edge-site2.

The table below summarizes a sequence of events. (R-O indicates read-only; there is only a state representation of the application)

Time	Connected	Control Tower	Site `edge-site2`	Comment
T0	Yes	A	A `(R – O)`	Application A has been deployed to the site from the Control Tower.
T1	No	A	A `(R – O)`	Connection lost
T2	No	A	A´	To allow for a configuration change of the application, create a local “copy” of application A with the new configuration.
T3	No	A	A´	Locally modified A
T4	No	A	A´ B	Create a completely new application B locally on the site.
T5	Yes	A `(locally-modified)` B `(locally-modified, R – O)`	A´ B	Connection established, and the locally modified flag propagated to Control Tower. Local modifications stay on the site.
T6	Yes	A” B `(locally-modified, R – O)`	A” B	Central modification of A completely overrides the site-local configuration of A.

Now, let us describe and illustrate some of the steps above. First, we will perform the local modifications. To do that, you need to connect supctl to the site. See the detailed steps in our documentation on managing offline sites.

An important prerequisite is that you have local users in your Avassa system. Users that you create in the Control Tower are automatically replicated to all sites.

Screenshot of the Users section in Avassa, showing a list of users with their email and authentication details.

With a local user as above, you can log in locally at the site:

site$ ./supctl do login site-admin@avassa.io

At T3 we have modified the centrally deployed application A; how can you achieve that?

At the site, you will (re-)create the application configuration for A. We create it locally since we need configuration at the site. It is a good habit to have the application and deployment YAML specifications in a site-local repo if you want to be prepared for local modifications.

site$ ./supctl create applications <<EOF
name: a
...application YAML here, same as central YAML in Control Tower...
EOF

Also, create a site local deployment:

site$ ./supctl create application-deployments <<EOF
name: a
application: a
placement:
  match-site-labels: system/name = edge-site2
EOF

What you have now achieved is to make the configuration of application A available at the site. Now, use the supctl edit command to make any changes to your application:

site$ ./supctl edit applications a

To create the brand new application B at the site, you follow the same pattern as above.

The screenshot below shows the Control Tower UI at T5; we have both A and B shown as locally modified.

Control Tower UI showing locally modified applications on edge-site2.

You can see that application A has been modified on edge-site2. If you click the Locally Modified label, you can see the configuration picked from the site. (Assuming we have a connection)

Code snippet showing the application configuration, locally modified on the edge site.

If you try to edit B, you will be informed that there is no configuration in the Control Tower:

Screenshot indicating that there is no application configuration in the Control Tower.

Let us perform some supctl commands to illustrate the above. The following commands are performed from the Control Tower.

First, check a summary of all applications; you can see the locally-modified-sites status:

control-tower$ ./supctl show application-status summary

applications:
  - name: a
    application-deployment: a-dep
    oper-status: running
    selected-sites: 3
    deployed-to-sites: 3
    deploy-failed-to-sites: 0
    locally-modified-sites: 1
    deploying: false
    all-sites-connected: true
  - name: b
    application-deployment: b-dep
    oper-status: running
    selected-sites: 1
    deployed-to-sites: 1
    deploy-failed-to-sites: 0
    locally-modified-sites: 1
    deploying: false
    all-sites-connected: true
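If you want to post-process this summary in a script, a minimal Python sketch could look like the one below. It assumes the YAML output has already been parsed into a dictionary (for instance with a YAML library of your choice); the function name locally_modified_apps is our own, and the sample data simply mirrors a subset of the fields shown above.

```python
def locally_modified_apps(summary: dict) -> list[str]:
    """Return the names of applications modified locally on at least one site."""
    return [
        app["name"]
        for app in summary.get("applications", [])
        if app.get("locally-modified-sites", 0) > 0
    ]


# Sample data mirroring the summary output above (subset of fields).
SUMMARY = {
    "applications": [
        {"name": "a", "application-deployment": "a-dep",
         "selected-sites": 3, "locally-modified-sites": 1},
        {"name": "b", "application-deployment": "b-dep",
         "selected-sites": 1, "locally-modified-sites": 1},
    ]
}

if __name__ == "__main__":
    for name in locally_modified_apps(SUMMARY):
        print(f"application {name} has local modifications")
```

A check like this could be a starting point for alerting on sites that have drifted from the central configuration.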

Check the configuration of applications A and B.

A was deployed from the Control Tower, and therefore we have a configuration centrally:

control-tower$ ./supctl show -c applications a

name: a
version: "1.0"
services:
  - name: a-service
    mode: replicated
    replicas: 1
    share-pid-namespace: false
    containers:
      - name: a-container
        image: registry.gitlab.com/avassa-public/movie-theaters-demo/kettle-popper-manager
        container-log-size: 100 MB
        container-log-archive: false
        mounts: []
        env: {}
        on-mounted-file-change:
          restart: true
on-mutable-variable-change: restart-service-instance

B was only deployed locally at the site, and therefore we have no configuration in the Control Tower.

control-tower$ ./supctl show -c applications b

not found

The application-status summary command above indicated that applications were modified locally. In order to dig into a specific modified application, you check that application as shown below; this will give you details on the local change:

control-tower$ ./supctl show application-status applications b

name: b
sites:
  - name: edge-site2
    application-version: "1.0"
    oper-status: running
    connected: true
    locally-modified:
      time: 2023-11-08T14:54:27.317Z
      application-version: "1.0"
      application-deployment: b-dep

When local modifications are made, a new locally-modified section appears in the application status, as shown above. It lists the configuration changes performed locally.

At T6 in the scenario above, we modified the configuration of application A in the Control Tower. That overwrites the local configuration of A at the site. The state in the UI will now be:

Control Tower UI after modifying application A, showing all sites in running state.

We have not mentioned image management yet. In connected centralized operations, you define the application in the Control Tower, including a URL to the image registry. For most offline scenarios, that will not work. So when you need to upgrade a container version or add a new application, you can add the image locally by using docker load and push to the site.

You can easily verify your site-local registry with:

site$ ./supctl show image-registry repositories 
- name: avassa-public/movie-theaters-demo/kettle-popper-manager
  tags:
    - tag: latest
      digest: sha256:27ca0299125baf78014065152a1bd786573e309f15ae3dea9727239966ee0f4e

Read more on image management in our documentation on offline sites.

Finally, let us show the appearance of a local site configuration change; this could, for example, be modifications of IP pools on the site. The UI also provides a link to view the site’s local configuration. Note that the site configuration is directly editable locally since it is a replicated configuration.

UI showing local site configuration changes, with a link to view the site's local configuration.

To conclude the story on offline modifications, here is a video demonstration.

I think that I am among the few lucky ones who are exploiting complexity. Most people are unhappy with the emergence of complexity, they would prefer it if the world were very simple, but then it would be a doom for a cryptographer like myself.
Adi Shamir, who is also the “S” in RSA.
