The moment Container Linux was updated and almost broke our fleet

The value proposition offered by Container Linux by CoreOS (Red Hat / IBM) is its ability to perform automated operating system updates thanks to its read-only active/passive /usr mount point, the update_engine, and Omaha. This philosophy (“secure the Internet“) allows system administrators to stop worrying about low-level security patches and helps define a clear separation of concerns between operations and applications. In practice, the update_engine periodically pings the configured CoreUpdate server, and upon finding a new available version of Container Linux, downloads and installs it to the passive partition, which is then marked as the active partition for the next boot. We note that CoreUpdate does not send updates to all the online nodes at once, but spreads their release over several hours/days. After that, a few strategies exist to apply the update, the most common being locksmith‘s etcd-lock which restarts up to n servers simultaneously. The second most frequently […]