2020–09–14:
Putting 13 PinePhone distributions on an 8 GiB uSD card
This is a summary of how I made a 5 GiB multi-boot image with 13
distributions that easily fits on an 8 GiB SD card and can be
used to test-drive most of the OSes that are available for the PinePhone today.
Starting with a 32GiB image and an old-school approach
At first I did the obvious, and started with traditional partitioning.
I created 10 partitions on a 32GiB SD card, and was barely able to fit
9 Linux distributions into that image. Each partition needed its own
filesystem and some free space to allow for proper operation of the OS. It was
also hard to manage. Each time I needed to change anything, I had to figure out
which partition held which OS, mount it, fix things and unmount it again.
It was not good.
Switching filesystems and sharing free space
The breakthrough came with the realization that I can mount a btrfs subvolume
as if it were a filesystem in its own right. This allowed me to have just a
single partition with a btrfs filesystem on the SD card, and have each
distribution contained in its own subvolume. This way, all distributions could
share the same pool of free space for their data. There was no longer any
need to plan partition sizes for each distro. This was the first major win.
Subvolumes are also much easier to manage than partitions. Need a new
subvolume? Just create one and copy some files to it. Don't need the subvolume
anymore? Just delete it.
Subvolumes appear as normal directories in the filesystem when the root
subvolume is mounted, so one mount operation is all that's needed to have
access to files of all included Linux distributions.
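The subvolume workflow described above can be sketched as a handful of
commands. This is only an illustration: the device name, mount points and
distro name are hypothetical, and the function just builds the command lines
(it does not run them). The commands themselves (btrfs subvolume create,
btrfs subvolume delete, and the subvol= / subvolid= mount options) are
standard btrfs tooling.

```python
import shlex


def subvolume_commands(device, top_mount, distro, distro_mount):
    """Build the commands for managing one distro subvolume.

    All paths and names passed in are hypothetical examples.
    """
    subvol_path = f"{top_mount}/{distro}"
    return {
        # Mount the top-level subvolume (id 5): every distro subvolume then
        # shows up as an ordinary directory under top_mount.
        "mount_top": ["mount", "-o", "subvolid=5", device, top_mount],
        # Add a new distro: create a subvolume, then copy files into it.
        "create": ["btrfs", "subvolume", "create", subvol_path],
        # Mount a single subvolume as if it were its own filesystem.
        "mount_one": ["mount", "-o", f"subvol={distro}", device, distro_mount],
        # Drop a distro that is no longer needed.
        "delete": ["btrfs", "subvolume", "delete", subvol_path],
    }


if __name__ == "__main__":
    cmds = subvolume_commands(
        "/dev/mmcblk0p2", "/mnt/multi", "ubuntu-touch", "/mnt/ut"
    )
    for step, argv in cmds.items():
        print(f"{step}: {shlex.join(argv)}")
```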
I also used another feature of subvolumes, snapshotting, to debug startup
issues in some distributions. It's possible to create a snapshot of the current
state of any of the distributions individually, and restore it in the future.
Ubuntu Touch had me scratching my head when it failed on the first boot
but worked on the second one. So I made a snapshot of the initial state of the
filesystem, and kept coming back to it and tweaking it until I found the
reason. (One of the boot scripts checked for the presence of a file named
userdata/.writable_image and, if it was missing, tried to reboot
the system and failed, crashing the lightdm process.)
In fact, the latest multi-boot image keeps the initial state of all included
distributions in a snapshot, so it's possible to selectively restore the state
of any of the distributions to the original.
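A snapshot and restore cycle like the one described above could look roughly
as follows. The directory layout (a "snapshots" directory next to the distro
subvolumes) is an assumption made for this sketch, not necessarily how the
image is organized. Restoring here means deleting the live subvolume and
taking a writable snapshot of the saved read-only one.

```python
def snapshot_commands(top_mount, distro, snap_dir="snapshots"):
    """Build commands to save and restore one distro's state.

    The snap_dir layout is hypothetical; the directory must already exist.
    """
    live = f"{top_mount}/{distro}"
    saved = f"{top_mount}/{snap_dir}/{distro}"
    return {
        # -r makes the saved snapshot read-only, so it can't be changed
        # by accident.
        "save": [["btrfs", "subvolume", "snapshot", "-r", live, saved]],
        # Restore: drop the current state, then take a writable snapshot
        # of the saved state under the original name.
        "restore": [
            ["btrfs", "subvolume", "delete", live],
            ["btrfs", "subvolume", "snapshot", saved, live],
        ],
    }


if __name__ == "__main__":
    for step, commands in snapshot_commands("/mnt/multi", "ubuntu-touch").items():
        for argv in commands:
            print(step, "->", " ".join(argv))
```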
Compression
Looking at btrfs, I found that it supports transparent compression for the
stored files. Enabling zstd compression and extracting the rootfs tarballs of
9 distributions resulted in a filesystem that used 5.8 GiB of space with a
compression ratio of 50%. This was exciting, because it looked like an
8 GiB SD card image would be possible.
Using compression is also good for another reason. SD card access on the
PinePhone is limited to 24MiB/s max. The 4-core Cortex-A53 CPU can easily
decompress data faster than that, so reading compressed data from the card
and decompressing it is faster than reading the same data uncompressed.
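The numbers above can be turned into a quick back-of-the-envelope
calculation. This sketch assumes that "50%" means compressed size divided by
uncompressed size, and that decompression is never the bottleneck; the
decompression speed used below is a placeholder, not a measured value.

```python
def uncompressed_size(on_disk_gib, ratio):
    """Size of the data before compression (ratio = compressed / original)."""
    return on_disk_gib / ratio


def effective_read_speed(card_mib_s, ratio, decompress_mib_s):
    """Uncompressed data delivered per second when reading compressed data.

    The card delivers card_mib_s of compressed data, which expands to
    card_mib_s / ratio. The CPU's decompression speed caps the result.
    """
    return min(card_mib_s / ratio, decompress_mib_s)


if __name__ == "__main__":
    # 5.8 GiB on disk at a 50% ratio.
    print(uncompressed_size(5.8, 0.5), "GiB before compression")
    # 24 MiB/s card; 200 MiB/s decompression is an assumed placeholder.
    print(effective_read_speed(24, 0.5, 200), "MiB/s effective")
```

Under these assumptions the 9 distributions hold about 11.6 GiB of
uncompressed data, and the card effectively delivers up to 48 MiB/s of it.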
Some interesting features of COW filesystems
So far the optimization steps were fairly trivial. Just enabling features of
an existing filesystem and using them in a smart way.
Getting from a 5.8GiB image with 9 Linux distributions to a 5 GiB image with
13 Linux distributions was harder. The distributions were likely to have
a lot of files in common. I ran a simple test, and found that there was room
for further savings in the range of 12–13% if I could make the filesystem
share data for duplicate files among the distributions.
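The post doesn't say how that test was done. One plausible way to estimate
the savings, shown here purely as an assumption, is to hash the content of
every regular file across the extracted trees and count the bytes that belong
to second and later copies. The function name and the choice of SHA-256 are
mine.

```python
import hashlib
import os


def duplicate_savings(roots):
    """Return (duplicate_bytes, total_bytes) over all regular files in roots."""
    seen = set()  # content digests encountered so far
    total = duplicate = 0
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                # Only regular files carry data worth sharing.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                digest = hashlib.sha256()
                size = 0
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        digest.update(chunk)
                        size += len(chunk)
                total += size
                key = digest.digest()
                if key in seen:
                    duplicate += size  # a later copy: could be shared
                else:
                    seen.add(key)
    return duplicate, total
```

Dividing the first number by the second gives the fraction of uncompressed
file data that deduplication could save. It ignores compression and
filesystem overhead, so it is only a rough indicator.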
COW filesystems, like btrfs, can sometimes share file data among
multiple files without resorting to hardlinking. In fact, there's now a generic
VFS-level Linux API for asking a filesystem to share file data between files,
if the filesystem supports this feature.
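The API in question is the FICLONE ioctl, which the post names further down.
A minimal sketch of calling it from Python on Linux follows. The request
number is taken from the definition in linux/fs.h; the helper name and the
fallback to a plain copy are my own additions for illustration.

```python
import fcntl
import shutil

# FICLONE is _IOW(0x94, 9, int) in <linux/fs.h>.
FICLONE = 0x40049409


def clone_or_copy(src_path, dst_path):
    """Make dst share src's data if the filesystem can; else copy.

    Returns True if the data was cloned, False if it was copied.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Ask the filesystem to make dst reference src's extents.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            # Not supported here (e.g. ext4, tmpfs, or different
            # filesystems): fall back to an ordinary copy.
            shutil.copyfileobj(src, dst)
            return False
```

Unlike a hardlink, the two files stay independent: writing to one of them
makes the filesystem allocate new blocks for the changed parts only.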
Side note: Hardlinking would not be a great solution for reducing file data
duplication in the image anyway, because it would mean that if a user
changed a file in Ubuntu Touch, the change might become visible in other
distros that share that file.
The next hurdle was figuring out how to effectively use this API in my
specific scenario. There are tools that scan an existing filesystem,
search for duplicates and use this API to remove them. Then it would be
possible to scrub the filesystem, discard empty space, and compress the
resulting block device image for easy re-distribution. I checked a bunch of
those tools, and they seemed exceedingly complicated, or buggy when used on
btrfs subvolumes.
Instead I decided to write an extraction
tool using libarchive that takes multiple tarballs as input (one per
Linux distribution) and extracts them while keeping track of the content of
already extracted files, using the above-mentioned FICLONE API to share data
whenever it finds a file that was already extracted before. It does deduplication
on the fly, so there's no need to apply any further cleanup steps to the
filesystem afterwards. This approach to deduplication is also very fast and
efficient, because there's no extra IO necessary.
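To make the idea concrete, here is a much simplified sketch of such an
extractor. It is not the actual tool: it uses Python's tarfile module instead
of libarchive, assumes SHA-256 as the way to track file content, handles only
regular files (no directories' metadata, symlinks, devices, ownership or
modes), and trusts the tarballs' paths. All names are hypothetical.

```python
import fcntl
import hashlib
import os
import tarfile

FICLONE = 0x40049409  # _IOW(0x94, 9, int) from <linux/fs.h>


def try_clone(src_path, dst_file):
    """Try to make dst_file share the data of src_path via FICLONE."""
    try:
        with open(src_path, "rb") as src:
            fcntl.ioctl(dst_file.fileno(), FICLONE, src.fileno())
        return True
    except OSError:
        return False  # filesystem can't clone; caller writes the data


def extract_dedup(tarballs, dest_root):
    """Extract {name: tar_path} into dest_root/name, sharing duplicates.

    Returns (files_written, files_cloned).
    """
    first_copy = {}  # content digest -> path of the first extracted copy
    written = cloned = 0
    for name, tar_path in tarballs.items():
        with tarfile.open(tar_path) as tar:
            for member in tar:
                if not member.isfile():
                    continue  # sketch: regular files only
                data = tar.extractfile(member).read()
                digest = hashlib.sha256(data).digest()
                out = os.path.join(dest_root, name, member.name)
                os.makedirs(os.path.dirname(out), exist_ok=True)
                with open(out, "wb") as dst:
                    earlier = first_copy.get(digest)
                    if earlier and data and try_clone(earlier, dst):
                        cloned += 1  # no data written for this file
                    else:
                        dst.write(data)
                        first_copy.setdefault(digest, out)
                written += 1
    return written, cloned
```

Because the content is hashed while it is being read from the tarball, a
duplicate is detected before anything is written, which is where the "no
extra IO" property comes from.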
At this point I tried to add 4 more Linux distributions to the image.
I did this just to stress test the new extraction tool. In particular, I added
several variants of postmarketOS, which were bound to have a lot of files
in common.
The resulting image had the same size as the previously made 9-distribution
variant.
Final space savings
The final opportunity for size optimization was a cheap one. Remove files
that are not used/needed. I didn't want to tweak the distributions too much, so
that they stay as close to their official state as possible. I had to make one
change, though. I was not able to use the distributions' own official kernels,
because none of them had a btrfs driver built in. I also was not very fond of
supporting the outdated EOLed kernels many of the distributions use, or of
re-building them. So I used my own kernel and made all the distributions share it.
Due to this I was able to remove the modules, firmware, and kernels that
the distributions package themselves. With 13 distributions this led to another
0.8GiB of saved space.
The result is a 5 GiB 13-distro image that easily fits on an 8 GiB SD card
and can be used to test-drive most of the OSes that are available for
the PinePhone today.
The image also serves as a demo of the GUI bootloader I wrote for the PinePhone.