Over the Air Updates (OTA)

As part of the Jasmine release we are working on a way to update Linux-based gateways and Zephyr-based single-purpose devices, such as a smart door lock.

Scope for Linux Gateways

The OTA process is restricted to system-level updates only. It does not cover containerized Linux applications that may be running on the gateway. The OTA system offers transactional updates using a pair of A/B partitions holding the immutable system image, with additional data stored in a mutable state partition. The update system obtains the delta update autonomously, using a system timer. In addition, the update agent offers system D-Bus APIs that give integrators precise control.

The update system will depend on a specific partitioning and runtime configuration to separate and manage immutable and mutable parts of the system.

Requirements: System Updates on Linux (#21) · Issues · OSTC / Requirements · GitLab and Transactional Linux-based OS (#158) · Issues · OSTC / Requirements · GitLab

Scope for Zephyr devices

The OTA process communicates with the gateway to obtain a new system image and can install it using either an A/B or a recovery+system partition scheme. The details are yet to be defined.

Requirement: System OTA OS - Zephyr (#16) · Issues · OSTC / Requirements · GitLab

Design

The design is not finalized yet. We will start a public design process here on the forum in the next few days. As the design formalizes, this initial post will be updated to reflect the design.


I’ve created a repository for the Linux Gateway OTA system: OSTC / OHOS / components / SystemOTA · GitLab

Partition Table for Linux Gateways

We’ve been discussing this with @agherzan on and off for the last few weeks and we’ve come up with a partitioning scheme and an associated rationale. I’ll share it here to start the discussion:

In general, we want to use the GPT partitioning scheme. We believe this is compatible with all the hardware we are aiming at and has some important advantages over the MBR partitioning scheme.

GPT also allows us to assign partition identifiers, separate from filesystem identifiers, and use them as a part of the system architecture, making the design more flexible.

1. Boot Partition

On x86 systems this is the EFI System Partition, which mandates FAT. On various non-standards-compliant ARM boards this is almost always FAT as well.

This partition may have to hold the kernel image, so it must be large enough to handle that. Our recommendation is 512 MB.

2. Factory Data Partition

This partition should follow the boot partition. It can be arbitrarily small or large, depending on the needs of a given device. It is never mounted read-write and is never destroyed by the recovery process.

This partition can be used to store device-specific data, such as factory calibration, MAC addresses and anything else that is necessary. It should use an immutable file system, such as squashfs or erofs. It can be mounted from the boot loader, if the boot loader supports it and read access is required - for example, to configure the MAC address.

TBD: GPT UUID for this partition

3. System Root A Partition

This partition holds a raw immutable file system used for booting. It may also contain the kernel image if the boot-loader is able to mount and load the kernel from this file system.

The kernel initrd assembles a file system structure using System Root A, Factory Data Partition and System Data Partition, using symbolic links, bind mounts and overlayfs. The details of that process will be designed and documented separately.

An initial system image is shipped with only the first three partitions. System Root B and System Data are both created on first boot. Removing System Data effectively wipes the device to initial state.

The boot loader is instructed to load the kernel from System Root A or System Root B whenever possible. On devices where this is not possible, the Boot Partition is used instead.

TBD: GPT UUID for this partition

4. System Root B Partition

This partition has exactly the same size as System Root A.

When performing a system update, two cases are possible. Either a complete system image is downloaded to System Root B directly or a delta is downloaded to temporary space and is combined with System Root A to create a new complete system image.

System Root A and B can be used interchangeably here. The initial image ships with System Root A populated, and the first update is written to System Root B. Following a successful commit, the next update will overwrite System Root A.

TBD: GPT UUID for this partition

5. System Data Partition

This partition extends over the rest of the storage device and is created and formatted by systemd from the initrd during first boot.

TBD: GPT UUID for this partition

This partition has the following areas, each stored as a distinct directory hierarchy:

  • System Configuration Data for System Root A
  • System Configuration Data for System Root B
  • Temporary Data and Scratch Space
  • Application Data (for managed apps)

System configuration data (this is expanded later) is kept separately for both system root partitions. This is so that the data can be used to boot back to a previous image if the update is rejected (either by failing to boot or by applications reporting problems).

Temporary data and scratch space is used to store /tmp and /var/tmp. One of those may be implemented with tmpfs if the device has enough memory available. At least /var/tmp should be non-volatile so that we can use it to store a delta image necessary to re-construct the next system image.

Application data is confined to a specific directory on the partition but is not described in detail here. This is designed for managed applications, which will most likely run on top of some kind of sandbox and container engine. System applications, which run without this mechanism, write to regular locations that are re-mapped to the system configuration directory, as we will describe later.

Linux System Configuration Data Mechanism

The key to making a read-only, immutable rootfs useful is to combine it with a mechanism where some modifications are possible and are saved in non-volatile space across reboots. This can include things as simple as the hostname and timezone settings, or as complex as the set of system services, timers and sockets that are enabled, or the set of SSL certificates the device knows about.

The key to making this work is to combine a technical measure that accepts modifications to a given file with an operator that controls a specific set of modifiable files and directories.

Making things writable again

To make a file writable, the immutable system image can contain one of the following objects:

1. A symbolic link pointing to a well-known, stable, writable replacement.

For example, the /etc/hostname file might be a symbolic link to /run/sysota-etc/hostname, where another mechanism persists changes to the hostname and restores them in the early boot process.

Symbolic links may be problematic if the software writing to the file creates a temporary file and performs an atomic rename operation. In that case, the entire directory must be technically writable for this to work without additional patches that must be maintained in the distribution.

2. A bind mount pointing to a well-known, stable, writable replacement.

This is very similar to the symlink approach with the following essential differences:

  • the file cannot be unlinked
  • software which inspects the file type will see a regular file instead
  • it consumes an entry in the mount table

Bind mounts are very flexible but add complexity with regard to mount event propagation and can make the mount table cluttered or convoluted.

3. An overlay filesystem

This approach combines a set of lower directories that are not modified (e.g. one of the System Root partitions) and one upper directory which stores all the writes.

Overlay filesystems are a well-established technology at this time, but they still have some shortcomings. For example, AppArmor is not very compatible with overlayfs. This may impact some of the sandboxing technology.

We could use an overlay on top of strategic places, such as /etc, while other places, such as /var, are handled with a single large bind mount (a sketch of both approaches follows below).

4. A FUSE filesystem

We could mount a custom FUSE filesystem over /etc, which redirects access to specific places. This could be more elegant than a swarm of bind mounts. It would require custom engineering and has a small performance penalty, but for things in /etc I think that would not be a problem.

I would only consider this if we absolutely have to, as it is clearly a large and complex piece to implement if it is meant to manage /etc. We may also use an existing solution if one exists. One main advantage would be relative simplicity for user space, as it would behave like a normal file system and we could flexibly redirect access to read-only files or to writable files, or even synthesize files on the fly.
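Going back to approaches 2 and 3, here is a minimal sketch, in Go, of what an early-boot helper could do to set up a bind mount over a single file and an overlay over /etc. It assumes the golang.org/x/sys/unix package; the paths under /sysdata are hypothetical and the actual initrd assembly will be designed and documented separately.

package main

import (
	"log"

	"golang.org/x/sys/unix"
)

func main() {
	// Approach 2: bind-mount a writable replacement over a single file that
	// ships in the immutable image (the /sysdata paths are hypothetical).
	if err := unix.Mount("/sysdata/config-a/hostname", "/etc/hostname",
		"", unix.MS_BIND, ""); err != nil {
		log.Fatalf("bind mount failed: %v", err)
	}

	// Approach 3: mount an overlay over /etc, with the immutable image as the
	// lower layer and a writable upper/work directory on the state partition.
	opts := "lowerdir=/etc,upperdir=/sysdata/config-a/etc-upper,workdir=/sysdata/config-a/etc-work"
	if err := unix.Mount("overlay", "/etc", "overlay", 0, opts); err != nil {
		log.Fatalf("overlay mount failed: %v", err)
	}
}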

State Operators

State Operators are an idea to manage state in a specific location in the file system. An operator is a program which implements a specific interface and is registered in the central registry as a manager of a specific file or a file hierarchy.

In general, each file and directory in the mutable space is managed by exactly one operator. The system must ensure there is no ambiguity as to which operator is responsible for each file.

Operators give us the flexibility to control what happens to a modification. For example: an operator could parse a modified /etc/hostname, store the actual value in a dedicated location and re-create the file on boot, making the file mutable but also ephemeral. Another operator could simply discard all modifications, re-creating the file based on some data source. This idea allows us to use a regular writable filesystem, ephemeral or not, and have a way to both manage and model the data. This last aspect is important when data changes format and the update system must create a representation that agrees with the software in the root filesystem.

This is why it is essential for operators to be able to both read and write the data. A hypothetical operator could parse /etc/hostname and /etc/timezone and store both in an internal format that is managed by the OTA system. Following that the operator can render the data stored internally into an appropriate format.

Operator API

type StateOperator interface {
  // UnmarshalDirectory inspects the directory alone, and marshals it into internal state.
  // A directory is always unmarshaled before any of the children.
  UnmarshalDirectory(path string) error
  // UnmarshalFile inspects the state of a given file and marshals it into internal state
  UnmarshalFile(path string) error

  // MarshalDirectory creates or re-creates a directory at a given path.
  MarshalDirectory(path string) error
  // MarshalFile creates or re-creates a file at a given path. 
  MarshalFile(path string) error
}

For example, a sample marshaler could handle /etc/hostname and /etc/timezone by storing them in a “registry” (whatever that is). This could be defined declaratively as follows:

operators:
  system-config:
    registry:
      "/etc/hostname": "system.hostname"
      "/etc/timezone": "system.timezone"
locations:
  "/etc/hostname": system-config
  "/etc/timezone": system-config

In code it could look something like this:

reg := NewRegistry("...")
sysConfig := NewRegistryOperator(reg, map[string]string{
   "/etc/hostname": "system.hostname",
   "/etc/timezone": "system.timezone",
})
// Walk all of /etc picking the right operator for each file we've seen.
// Using the locations map, we finally reach /etc/hostname and /etc/timezone
// and those are unmarshaled and stored in the registry.
sysConfig.UnmarshalFile("/etc/hostname")
sysConfig.UnmarshalFile("/etc/timezone")

// The registry can now be saved, on error the data could be discarded.

// /etc/hostname is re-constructed by this specific operator
sysConfig.MarshalFile("/etc/hostname")

Proposed state operators

I think there are three operators we would actually need:

  1. Files that are parsed, stored in the registry and re-created - we can migrate the format
  2. Files that are stashed entirely and not parsed - no way to migrate those
  3. Files that are never stored and are always restored from a reference copy (e.g. system root) - a sketch of this one follows below
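As an illustration, here is a minimal sketch of operator 3 - one that never stores modifications and always restores files from a reference copy. It implements the StateOperator interface above; the RestoreOperator name, the referenceDir field and the restored file modes are all hypothetical.

import (
	"os"
	"path/filepath"
)

// RestoreOperator discards any captured state and always re-creates files
// from a pristine reference copy, e.g. the current System Root.
type RestoreOperator struct {
	referenceDir string // e.g. "/sysroot" (hypothetical)
}

// Nothing is captured: modifications are simply thrown away.
func (o *RestoreOperator) UnmarshalDirectory(path string) error { return nil }
func (o *RestoreOperator) UnmarshalFile(path string) error      { return nil }

// MarshalDirectory re-creates the directory with the permissions of the
// reference copy.
func (o *RestoreOperator) MarshalDirectory(path string) error {
	info, err := os.Stat(filepath.Join(o.referenceDir, path))
	if err != nil {
		return err
	}
	return os.MkdirAll(path, info.Mode().Perm())
}

// MarshalFile overwrites the target with the pristine reference content.
func (o *RestoreOperator) MarshalFile(path string) error {
	ref := filepath.Join(o.referenceDir, path)
	data, err := os.ReadFile(ref)
	if err != nil {
		return err
	}
	info, err := os.Stat(ref)
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, info.Mode().Perm())
}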

Linux Gateway OTA Backend

The OTA system will require a backend capable of responding to requests made by the OTA client running on the gateway. For the Jasmine release cycle our plan is to go with a basic backend that can be served from a dumb HTTP server without a specialized server application.
On the network protocol side, the OTA client can ask a dumb HTTP server, set up as the current OTA server, for available updates in the following manner:

Each device knows which model it belongs to. A model is a structure with several fields: at minimum, the maker (brand name) and model (model name), coupled with the architecture of the primary CPU where the OTA system runs. For example, an armv7 reference transparent gateway from OSTC is a model. The model must contain URL-friendly identifiers; any additional human-readable, localized strings are optional, separate meta-data.

With this information and the URL of the update “server”, the client can form a GET request at /sysota1/$maker/$model/$stream, where $maker is the identifier of the maker (brand name), $model is the identifier of the specific model by that brand, and $stream is the identifier or path of one of the possibly many images available. For example, the transparent gateway may issue GET https://example.org/sysota1/ostc/transparent-gateway/jasmine/stable, subscribing to the stable update stream of the still-future Jasmine release.
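For illustration, a minimal sketch in Go of how a client could form and issue this request; the base URL and identifiers are simply the placeholders from the example above.

package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	base := "https://example.org" // update "server" URL
	maker, model, stream := "ostc", "transparent-gateway", "jasmine/stable"

	// GET /sysota1/$maker/$model/$stream
	url := fmt.Sprintf("%s/sysota1/%s/%s/%s", base, maker, model, stream)
	resp, err := http.Get(url)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Status)
	fmt.Println(string(body)) // the JSON document described below
}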

The response is a JSON document; I’ve crafted a sample below to start the discussion:

{
	"revision": 42,
	"images": [
		{
			"architecture": "x86_64",
			"hashes": {
				"sha256": "..."
			},
			"variants": [
				{
					"type": "squashfs",
					"squashfs": {
						"compression": "zstd"
					},
					"size": 1000000,
					"URLs": [
						"https://...",
						"ipfs://...",
						"magnet://..."
					]
				}, {
					"type": "squashdelta",
					"squashdelta": {
						"compression": "zstd",
						"needs": [41]
					},
					"size": 10000,
					"URLs": [
						"https://...",
						"ipfs://...",
						"magnet://..."
					]
				}
			]
		}
	]
}

Some highlights:

  • each revision has possibly multiple images
    • images encode the architecture
    • a fleet of devices using software built for different architectures will appear to be on the same revision
  • each image has an extensible set of hashes
    • we can add hashes and signatures over time
  • each image has possibly multiple variants
    • the image may be encoded in different ways (xz vs zstd)
    • the image may be delta-encoded (xdelta or squashdelta)
    • each image variant has known size
  • each image variant has possibly many URLs
    • e.g. include both direct download via HTTP redirector and a peer-to-peer URL

Using this response the device can decide how to obtain an image (a parsing and selection sketch follows this list):

  • get a delta image if one is available
  • get a variant that the kernel can mount
  • (maybe) get a debug variant with extra symbols
  • (maybe) use peer-to-peer network to obtain the image
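To make the client side tangible, here is a minimal sketch of Go types that mirror the sample document, plus a naive selection helper. The field names follow the JSON above; the selection policy (prefer a delta that applies on top of the installed revision, otherwise take a full squashfs image) is only an illustration.

import "encoding/json"

type UpdateResponse struct {
	Revision int     `json:"revision"`
	Images   []Image `json:"images"`
}

type Image struct {
	Architecture string            `json:"architecture"`
	Hashes       map[string]string `json:"hashes"`
	Variants     []Variant         `json:"variants"`
}

type Variant struct {
	Type        string       `json:"type"`
	SquashFS    *SquashFS    `json:"squashfs,omitempty"`
	SquashDelta *SquashDelta `json:"squashdelta,omitempty"`
	Size        int64        `json:"size"`
	URLs        []string     `json:"URLs"`
}

type SquashFS struct {
	Compression string `json:"compression"`
}

type SquashDelta struct {
	Compression string `json:"compression"`
	Needs       []int  `json:"needs"`
}

func parseResponse(body []byte) (*UpdateResponse, error) {
	var r UpdateResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, err
	}
	return &r, nil
}

// pickVariant prefers a delta that can be applied on top of the currently
// installed revision and falls back to a complete squashfs image.
func pickVariant(img Image, installedRevision int) *Variant {
	for i, v := range img.Variants {
		if v.Type == "squashdelta" && v.SquashDelta != nil {
			for _, rev := range v.SquashDelta.Needs {
				if rev == installedRevision {
					return &img.Variants[i]
				}
			}
		}
	}
	for i, v := range img.Variants {
		if v.Type == "squashfs" {
			return &img.Variants[i]
		}
	}
	return nil
}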

Two more sample responses:

The server is overloaded and asks the device to back off before retrying (720 seconds in this sample). The HTTP status code could be 429. The server could use this to shed load from compliant devices.

{
   "error": "server-busy",
   "try-again-in": 720
}

The server is generating a delta based on request history. Irrelevant parts have been omitted; note the lack of URLs and size for the delta, and the eta field, which suggests that the delta may be available in an hour.

{
	"revision": 42,
	"images": [
		{
			"variants": [
				{
					"type": "squashdelta",
					"squashdelta": {
						"compression": "zstd",
						"needs": [41]
					},
                    "eta": 3600
				}
			]
		}
	]
}

SystemOTA development

I’ve started sending some initial patches for scaffolding to OSTC / OHOS / components / SystemOTA · GitLab - specifically Add scaffolding for sysota/sysotad (!1) · Merge requests · OSTC / OHOS / components / SystemOTA · GitLab

SystemOTA DBus Interface

I’ve been exploring the D-Bus interface for the system OTA service. I don’t have
a full design yet but I have some notes that are worth sharing already. I have
split the API into things that are easy and non-controversial, things that feel
okay but have more possible ways of being implemented and, lastly, things I’m not
convinced about myself yet.

Let me break this down into specifics. Everything here is listed under the bus
name dev.ostc.sysota1 - anticipating the move to a dev.ostc domain name. This
may move to NewCo later on.

High-level configuration

The part I’m most sure about is the set of const properties that show the
maker, model and subscribed update channel. You may notice that there’s no update
server or any kind of trust model exposed at this level. This is on purpose. The
exact way we find the update server is something I would like to have as an
implementation detail until we have some more time to explore this. We may end
up with a single update server that’s just a simple URL to set. We may end up
with something that’s baked into the model definition and is not exposed as an
API. We may have a range of URLs as well. I don’t know yet.

Name Type Signature Result/Value Flags
dev.ostc.sysota1.Service interface - - -
.Maker property s “SECO” const
.Model property s “Dev Kit 2000” const
.Channel property s “/latest” emits-change

The first two are informative only and cannot be changed from the API without a
re-model operation, which is not in scope. The third one can be set to a custom
string that describes the update channel - whatever those are. At this time I
would not specify this in more detail. Later on we may want to offer some
standardized property for this as well, or leave it entirely open and editable.
This will mainly depend on the development of the server-side parts.

Setting the channel to an incorrect value may simply fail (e.g. we can do an
online check and revert).

Requesting updates

The first thing I wanted to do is to try to download some fake files from a
local https server. This would let me explore the server side API a little, as
well as to try to handle some basic cases (downloads, resuming, etc) and start
to have a conversation about this API and about initial security aspects with
Marta.

I want to have visibility of long-running operations, like downloads, so that
they can be observed, paused or resumed, if possible, or cancelled. Having a
percentage or ETA (estimated time of arrival) as properties that emit changes,
would also allow us to build nice experiences on top, either with command line
tools that feel nice but more importantly as good APIs for customer-specific
device agents that interact with our OTA service.

There are two complications: transport methods - e.g. https:// or magnet:// -
and the selected image variant. Recall that the draft server proposal had the
idea that a given channel has a single specific revision with many variants. On
the one hand, it would be nice if the default API was simple and the device had
enough smarts to pick the right image (e.g. not picking a debug image) and
variant (e.g. picking the supported delta algorithm or honouring kernel
requirements, like compression type).

Transport methods are a bit more complicated. We may default to https and park
the rest for a while, but assuming we may end up downloading images over
something else as well (e.g. Windows-like local-network peer-to-peer or
something entirely different), having a way to pick the desired image via a hint
to the Update method might be a good approach. On the other hand, methods that
take “key-value hints” and maybe do something with them are often an API smell.

My current proposal is as follows; note that less important properties (ETA) can
be added later.

Note that the Update method takes a map of key-value pairs as argument. This
is the hint set I mentioned before. It can be empty for a full-auto mode, where
our logic decides what to get. This would crucially allow both the most
simple device agent that just calls Update every time it gets a message over
MQTT and a more sophisticated agent that has all the rich logic to make informed
choices.

Name Type Signature Result/Value Flags
dev.ostc.sysota1.Service interface - - -
.Update method a{sv} o -
dev.ostc.sysota1.Operation interface - - -
.Type property s update const
.State property s paused emits-change
.CompletedPercentage property d 60 emits-change
.EstimatedTimeLeft property i 3600 emits-change
.Cancel method
.Pause method
.Resume method

The Update method is fire-and-forget. It would handle everything, from
downloading, through applying the delta, to configuring the boot loader and
rebooting the device.
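For illustration, here is a minimal sketch of a device agent calling Update over D-Bus using the github.com/godbus/dbus/v5 package. Only the bus name, interfaces, method name and signature come from the proposal above; the object path /dev/ostc/sysota1 and the empty hint map (full-auto mode) are assumptions.

package main

import (
	"fmt"
	"log"

	"github.com/godbus/dbus/v5"
)

func main() {
	conn, err := dbus.SystemBus()
	if err != nil {
		log.Fatal(err)
	}

	// The object path is a guess; the proposal only fixes the bus name so far.
	svc := conn.Object("dev.ostc.sysota1", "/dev/ostc/sysota1")

	// Empty hints: full-auto mode, the service picks the image and variant.
	hints := map[string]dbus.Variant{}

	var op dbus.ObjectPath
	if err := svc.Call("dev.ostc.sysota1.Service.Update", 0, hints).Store(&op); err != nil {
		log.Fatal(err)
	}
	fmt.Println("update operation:", op)

	// The returned object exposes State, CompletedPercentage, etc. and can be
	// paused, resumed or cancelled via the Operation interface.
	operation := conn.Object("dev.ostc.sysota1", op)
	state, err := operation.GetProperty("dev.ostc.sysota1.Operation.State")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("state:", state.Value())
}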

If we want to offer a more controlled update experience, e.g. sitting on a
pending update but rebooting separately, we could offer another method, e.g.
PrepareUpdate, that returns a pair of objects - the Operation and a Trigger that
has a RebootAndApply method, for example.

Channel map

The channel map is the set of available channels and their properties, as
advertised by the update server. Having a method that asks the server about the
channel map may give us, at least, the list of channels (here described as an
a-rray of s-trings - D-Bus signature as).

Name Type Signature Result/Value Flags
dev.ostc.sysota1.Service interface - - -
.UpdateChannelMap method s as -

This could be done differently. It could return ao instead - an array of
object paths - with individual channels exposed as distinct objects with a set
of properties. This would let us expose channel properties, like the revision
published there, information about the latest update or an indication that a
channel is closed. That last thing would allow us to look at the other available
channels and automatically switch to the one with the highest preference
priority, e.g. going from a temporary channel back to a stable channel, or
moving away from a beta channel for a specific release to the released version
instead.

This may also allow us to have methods on individual channels, e.g. Subscribe.

Name Type Signature Result/Value Flags
dev.ostc.sysota1.Service interface - - -
.UpdateChannelMap method s as -
dev.ostc.sysota1.Channel interface - - -
.PublishedRevision property i 1234 emits-change
.UpdatedOn property t 2021-05-12T20:17:13Z emits-change
.ClosedOn property t 0 emits-change
.PreferencePriority property i 100 emits-change
.Subscribe method - - -

That’s it for the moment. I will pause for a while to let everyone catch up and
express their opinions.

I’ve landed the initial scaffolding, having seen no objections over the past few days and I’ve opened Add scaffolding for the D-Bus service (!2) · Merge requests · OSTC / OHOS / components / SystemOTA · GitLab which adds a lot more interesting parts.

First of all, we have a working D-Bus service which can be installed on a running system. The service is bus-activated and shuts down when unused. Right now it doesn’t do anything useful, but it exposes three properties and two methods, as described above. I’ve changed some of the names, having played with this a little:

Name Type Signature Result/Value Flags
dev.ostc.sysota1.Service interface - - -
.Maker property s “SECO” const
.Model property s “Dev Kit 2000” const
.Stream property s “/latest” emits-change
.UpdateStreams method - - -
.UpdateDevice method a{sv} o -

There’s also some initial structure around testing. The two most notable parts are tests involving D-Bus, allowing us to unit-test the D-Bus interactions without invoking the real OTA logic, as well as integration tests implemented with spread. Those verify that the OTA service can be built, installed and actually used on a given system, using busctl as the command line interface.

There are a lot of holes, but I wanted to share this since I promised to time-box the exercise and because it’s much easier to iterate on top of a slice of cheese than on an empty plate. :slight_smile:

That’s it for now.

Why not consider updating bootloaders, firmware, the secure world and containers from the beginning? It would be good to have a unified solution for all those needs, to avoid a different set of tools for different stages.

Because that’s just a lot more complex and we won’t get the design right.

Firmware, at least on x86, is now largely managed with the firmware update manager (fwupd) and LVFS. That’s a separate beast that has a lot of complexity already. Application containers are another story, where each vendor has a custom API that’s, again, much more complex than what we’ve drafted.

Keeping those separate will not be conceptually pretty, but it makes it practical for us to develop an image-based update system for the OS without getting lost in the complex application space.

I think there are some typos in this part:

unliked → unlinked
sype → type


How about I make you a forum moderator so you can just fix those as you see them? :slight_smile:

I’m a little split on that idea. On the one hand it is good to model the stored data, but on the other hand it adds complexity, like the need to update the operators and to implement the data parsing in the same way in both the actual service using it and in the operator. That might lead to bugs.

In addition, as noted earlier, this proposal does not handle external applications; applications will need to have their own migration path for their configuration.

If implemented with operators, I think it requires a strict set of tests verifying the operators.

A related question is the frequency of updates. I think we should not assume anything there; it should depend on the device manufacturer. Would the operator model work with frequent updates (e.g. once per week)?

State Operators need to manage only what is provided in the OS image. Applications are separate, and will have state that is equally separate from the OS state - e.g. when managed by docker or snapd.

This complexity is really needed to avoid the complexity of the alternative solution. The only other way that I’ve experienced so far is to have shell scripts perform this “migration”. Usually this is a one-way process and is not tested nearly as much in practice. For reference, see the countless bugs in Debian maintainer scripts on updates.

State Operators help to reduce complexity in the following ways:

  • The behaviour of a state operator is separate from using it for a specific purpose and can be tested in isolation.
  • The application of state operators to specific file system objects is declarative and can be machine-verified - for example, that each file is managed by exactly one operator, or that a persistent operator is not used in a sub-directory managed by a discard operator.
  • Having the range of available operators (format-aware, format-agnostic and special, e.g. discarding) allows us to define how state is managed in our system image at a very high level.

We can be very conservative and start with a totally volatile image where most of the state is lost, apart from the host name, time zone and machine ID (and SSH keys for developer images). This allows us to keep updating to any future images. The key part is that we get to decide what happens with state. Without this layer we will have to define how to move from one image to another using imperative logic, or hope that userspace never breaks compatibility.

State Operators are executed on each boot and shutdown. Update frequency is not the relevant input here. The relevant input is whether the update brings a state change that needs to be managed: e.g. a change in the fundamental configuration of network settings, a major change to a library that may have local caches/state in non-volatile parts of the file system, etc.

A daily update process with back-ported security fixes requires no changes to any operators. A typical cycle-sized update, e.g. once a year, will most likely require us to look for incompatibilities and decide how to address them with state operators.

Comments on this whole post:

  • It should be clear that the requests are made over HTTPS.
  • The resources file (JSON) should at least be signed, IMO.
  • Not sure if the size field is needed; either a hash (we’re not talking MD5, are we?) or a signature will work as verification. The client can also request the file size when downloading.
  • Implementation detail: the “try-again” and “eta” handling should rather use a random back-off. Otherwise it can easily create a DoS.
  • All of this update metadata should be generated by the user, so there will be a tool to prepare this whole structure and customize it to the product needs, right?
  • The device itself will have properties allowing it to choose the right image (e.g. processor version, image location, encrypted version only…).
  • What about preparing an image for a specific serial number range (e.g. a special update for a particular customer)?
  • What about deploying images to devices over time (e.g. if there are many of them and the server could not handle all updates at the same time)?

Which applications will have access to this D-Bus interface? I can imagine a few things going wrong, like a malicious app sending ‘Pause’ all the time. Or the device running on battery and being low on charge - in that case an update is not a safe thing to do. Or the device doing some important task, where rebooting JUST NOW is a very bad thing to do.

You are right. I was also thinking of offering certificate pinning for the device → meta-data server path. The device → binary-server/CDN path is probably fine without cert pinning, as the device will have to verify signatures anyway.

I didn’t want to touch this yet because that opens up the whole security topic. I agree we need to provide meta-data integrity and authenticity. If the JSON responses are coming from a meta-data server with pinned certificates then we could ship them unsigned. If we have no cert pinning around then we can add separate signatures, and we have a large set of possible choices - ranging from GPG, and the myriad of options there, all the way to JSON web signatures.

The size field is relevant so that we can avoid considering updates that cannot fit the available slots (for complete images) or temporary storage (for deltas). Is there a specific concern, or are you just wondering if we can avoid it?

That’s a good point. There’s a related topic: devices should randomize the timing of requests to the meta-data server in the first place. Based on the server response we can either (if a smart server is around) add a random offset ourselves, or ask the device to always do it. It should be added to the specification.
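A minimal sketch of what the randomized back-off could look like on the device, assuming the try-again-in value from the sample response above; the jitter factor and fallback are arbitrary.

import (
	"math/rand"
	"time"
)

// retryDelay turns the server-provided try-again-in hint (in seconds) into a
// jittered delay so that a fleet of devices does not retry in lockstep.
func retryDelay(tryAgainIn int) time.Duration {
	if tryAgainIn <= 0 {
		tryAgainIn = 60 // arbitrary fallback when the server gives no hint
	}
	base := time.Duration(tryAgainIn) * time.Second
	// Wait at least as long as the server asked, plus up to 50% extra.
	return base + time.Duration(rand.Int63n(int64(base)/2))
}

// Usage: time.Sleep(retryDelay(720))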

Yes. We will start with a tool that can generate the skeleton and modify it accordingly, e.g. by adding images or producing deltas. Over time we could implement some of this as a proper cloud service but it won’t affect the protocol much, I hope.

I have the following idea about this. The meta-data server will expose all of the properties of the available images and the SystemOTA service will make that data available to the device. In the most basic case, the device will just update to any compatible image - this has to be codified exactly. In a more advanced case, the SystemOTA service will be used not as a standalone component but will be driven by a product-specific device agent. The agent can then have any logic to choose appropriate image.

I expect that most customers will have their own agent or will want to develop an agent to manage other aspects of the product. This design lets them control the OTA process without having to fork the software to add anything custom.

The unspoken assumption about the maker and model is that all devices with a given (maker, model) are identical from the point of view of the update system and can all use one image. Having said that, there’s some wiggle room. A device agent can set the stream to something custom, for example hotfix/bug-123. This decision can use anything the agent has access to as input.

I didn’t define a device-specific serial number as a part of the OTA system. We may end up defining that ourselves as another component of the architecture. Handling serial numbers is pretty tricky, as virtual machines will be cloned and white-box vendors will re-model a given generic device into a specific customer device (and perhaps add a sticker). I would like to push back the problem of handling serial identity for as long as we can, without impacting the correctness of the system.

This is very interesting. There are two ways to think about this. Either the meta-data server can lie to some devices, and tell them there is no update, or it can always tell the truth and then rely on the devices doing the right thing not to overload the server.

If a customer has a device agent the agent can handle the entire schedule, randomization and progressive roll-out. In the cases without an agent, you will need something smart on the cloud side and I think we won’t be able to reach that point in Jasmine yet.

I wonder if there’s some clever math that would let us define a serial number for the purpose of the OTA update, where the number is a large random number that does not leak the population size and, together with something non-unique provided by the meta-data server, lets the device decide whether it belongs to the subset that should get the update.
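Purely as an illustration of one such scheme (not a committed design): the device holds a large random OTA identifier, the meta-data server publishes only a non-unique salt and a roll-out fraction, and the device hashes the two together to decide locally whether it is in the cohort. All names here are invented.

import (
	"crypto/sha256"
	"encoding/binary"
)

// inRollout decides locally whether this device takes part in a progressive
// roll-out. otaID is a large random per-device number; salt and fraction
// (0.0 to 1.0) are non-unique values published by the meta-data server.
func inRollout(otaID uint64, salt string, fraction float64) bool {
	h := sha256.New()
	_ = binary.Write(h, binary.BigEndian, otaID)
	h.Write([]byte(salt))
	digest := h.Sum(nil)
	// Map the first 8 bytes of the digest onto [0, 1) and compare.
	bucket := float64(binary.BigEndian.Uint64(digest[:8])) / float64(^uint64(0))
	return bucket < fraction
}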

I’ve made some more progress on the scaffolding of the OTA service and have merged it to main. I will look at one of the other tasks related to the OTA stack - the bootloader protocol - for the next few days.

I left some comments related to the design of the D-Bus API in GitLab, but the tl;dr version is that I want to sketch some more specific ways the system will create, remember and forget about operations. Operations will expose activities that the OTA service is doing in response to an update request. Those will range from downloading files, through applying deltas, to doing back-ups. I want to avoid some of the complexity I experienced in the past while preserving the useful aspects. I will write some more about this in the coming days.

Here is an update on the direction we took with the partition table activity for the OS.

Disk Partition Table

Overview

The OS defines the partitions included as part of the Linux-based distro as
follows:

  • boot

    • filesystem label: x-boot (partition name when relevant)

    • It provides the boot artefacts required by the assumptions of the underlying
      bootloader. It is device-specific both in terms of filesystem and content.

  • roota

    • filesystem label: x-roota (partition name when relevant)

    • It provides the root filesystem hierarchy.

    • Filesystem type, configuration and structure are device-independent.

    • This partition is the only one provided with a redundant counterpart (see
      below).

  • rootb

    • filesystem label: x-rootb (partition name when relevant)

    • It provides a redundant root filesystem hierarchy used as part of the
      system update strategies.

    • Filesystem type, configuration and structure are device-independent.

  • devicedata

    • filesystem label: x-devicedata (partition name when relevant)

    • Device-specific data meant to be preserved over system reset (factory
      reset).

    • The runtime will treat this data as strictly read-only.

  • sysdata

    • filesystem label: x-sysdata (partition name when relevant)

    • This partition holds the system state needed to treat the root filesystem
      as a read-only asset.

    • It ties closely into the system update strategies.

    • Data is kept over system updates (subject to state transition hooks) but
      discarded over factory reset.

  • appdata

    • filesystem label: x-appdata (partition name when relevant)

    • This partition provides application data storage.

    • Data is kept over system updates (subject to state transition hooks) but
      discarded over factory reset.

The build system tries to unify the partition layout as much as possible,
leaving upper layers (for example, the system update layers) with as few
deviations to deal with as feasible. This means that the filesystem labels and
partition names above can be assumed by the OS components.

Partition Table

The OS will support both MBR and GPT as partition table types. In this way, the
OS can achieve more extensive device support.

The OS assumes a GPT disk layout as follows:

  • 4MiB is left untouched at the start of the disk (to accommodate for
    hardware-specific requirements).

  • All partitions are aligned to 4MiB.

  • The filesystem labels and partition names are as described above.

On the MBR side, the disk layout is similar to GPT. The design mainly works
around the four-primary-partition limitation:

  • 4MiB is left untouched at the start of the disk (to accommodate for
    hardware-specific requirements).

  • All partitions are aligned to 4MiB.

  • The filesystem labels and partition names are as described above.

  • The 4th partition is defined as extended and contains all the data partitions
    (devicedata, sysdata and appdata).