megi's PinePhone Development Log


2020-09-14: Putting 13 PinePhone distributions on an 8 GiB uSD card

This is a summary of how I made a 5 GiB multi-boot image with 13 distributions that easily fits on an 8 GiB SD card and can be used to test-drive most of the OSes available for the PinePhone today.

Starting with a 32 GiB image and an old-school approach

At first I did the obvious thing and started with traditional partitioning. I created 10 partitions on a 32 GiB SD card and was barely able to fit 9 Linux distributions into that image. Each partition needed its own filesystem and some free space for the OS to operate properly. It was also hard to manage. Each time I needed to change anything, I had to figure out which partition held which OS, mount it, fix things, and unmount it again.

It was not good.

Switching filesystems and sharing free space

The breakthrough came with the realization that I can mount a btrfs subvolume as if it were a filesystem in its own right. This allowed me to have just a single partition with a btrfs filesystem on the SD card, and have each distribution contained in its own subvolume. This way, all distributions could share the same pool of free space for their data. There was no longer any need to plan partition sizes for each distro. This was the first major win.

Subvolumes are also much easier to manage than partitions. Need a new subvolume? Just create one and copy some files to it. Don't need the subvolume anymore? Just delete it.

Subvolumes appear as normal directories in the filesystem when the root subvolume is mounted, so one mount operation is all that's needed to have access to files of all included Linux distributions.
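As a minimal sketch of the subvolume workflow described above, the helpers below only build the command lines for creating, deleting, and mounting a subvolume. The device name, mount points, and subvolume names are hypothetical placeholders, and actually running the commands needs root and a btrfs filesystem.

```python
import shlex


def create_subvolume_cmd(fs_root, name):
    """Command that creates subvolume `name` inside a mounted btrfs filesystem."""
    return ["btrfs", "subvolume", "create", f"{fs_root}/{name}"]


def delete_subvolume_cmd(fs_root, name):
    """Command that deletes subvolume `name` again."""
    return ["btrfs", "subvolume", "delete", f"{fs_root}/{name}"]


def mount_subvolume_cmd(device, name, mountpoint):
    """Command that mounts one subvolume as if it were a filesystem of its own."""
    return ["mount", "-t", "btrfs", "-o", f"subvol={name}", device, mountpoint]


if __name__ == "__main__":
    # Hypothetical device and paths, printed rather than executed.
    for cmd in (
        create_subvolume_cmd("/mnt/sd", "arch"),
        mount_subvolume_cmd("/dev/mmcblk0p1", "arch", "/mnt/arch"),
        delete_subvolume_cmd("/mnt/sd", "arch"),
    ):
        print(shlex.join(cmd))
```

Passing the resulting lists to `subprocess.run(cmd, check=True)` would execute them. Building the lists separately keeps the sketch safe to run on any machine.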

I also used another feature of subvolumes to debug startup issues in some distributions: snapshotting. It's possible to create a snapshot of the current state of any of the distributions individually, and restore it in the future. Ubuntu Touch had me scratching my head when it failed to boot the first time, but booted fine the second time. So I made a snapshot of the initial state of the filesystem, and kept coming back to it and tweaking it until I found the reason. (One of the boot scripts checked for the presence of a file named userdata/.writable_image, and if it was missing, it tried to reboot the system and failed, crashing the lightdm process.)

In fact, the latest multi-boot image keeps the initial state of all included distributions in a snapshot, so it's possible to selectively restore the state of any of the distributions to the original.
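The snapshot-and-restore cycle can be sketched the same way. This is an illustration under assumptions, not the layout used in the actual image: the `snapshots` directory name and the paths are hypothetical, and "restore" here means deleting the current subvolume and taking a writable snapshot of the saved one in its place.

```python
def snapshot_cmd(source, dest, readonly=True):
    """Command that snapshots subvolume `source` into `dest`."""
    cmd = ["btrfs", "subvolume", "snapshot"]
    if readonly:
        cmd.append("-r")  # keep the saved initial state immutable
    return cmd + [source, dest]


def restore_cmds(fs_root, distro, snap_dir="snapshots"):
    """Commands that reset `distro` to its saved state (assumed layout)."""
    live = f"{fs_root}/{distro}"
    saved = f"{fs_root}/{snap_dir}/{distro}"
    return [
        ["btrfs", "subvolume", "delete", live],
        # A writable snapshot of the saved state becomes the new live subvolume.
        snapshot_cmd(saved, live, readonly=False),
    ]


if __name__ == "__main__":
    print(snapshot_cmd("/mnt/sd/ut", "/mnt/sd/snapshots/ut"))
    for cmd in restore_cmds("/mnt/sd", "ut"):
        print(cmd)
```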

Compression

Looking at btrfs, I found that it supports transparent compression of stored files. Enabling zstd compression and extracting the rootfs tarballs of 9 distributions resulted in a filesystem that used 5.8 GiB of space, with a compression ratio of 50%. This was exciting, because it looked like an 8 GiB SD card image would be possible.

Using compression is also good for another reason. SD card access on the PinePhone is limited to 24 MiB/s at most. At this speed, the 4-core Cortex-A53 CPU can easily keep up with decompression, so fewer bytes have to cross the slow SD interface and loading data from the SD card gets faster.
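The arithmetic behind these two paragraphs can be written down as a back-of-the-envelope sketch. It assumes that "compression ratio of 50%" means compressed size divided by uncompressed size, and the decompression throughput used in the example is a made-up placeholder, not a measurement.

```python
def uncompressed_size(compressed_gib, ratio):
    """Size of the data before compression, given compressed/uncompressed ratio."""
    return compressed_gib / ratio


def effective_read_speed(card_mib_s, ratio, decompress_mib_s):
    """Uncompressed MiB/s delivered to applications.

    The card delivers compressed bytes, so each MiB read expands to 1/ratio MiB,
    unless the CPU cannot decompress that fast.
    """
    return min(card_mib_s / ratio, decompress_mib_s)


if __name__ == "__main__":
    # 5.8 GiB on disk at a 50% ratio implies about 11.6 GiB of file data.
    print(uncompressed_size(5.8, 0.5))
    # 24 MiB/s card, 50% ratio, hypothetical 200 MiB/s decompression speed.
    print(effective_read_speed(24, 0.5, 200))
```

As long as decompression is faster than the expanded card rate, the card stays the bottleneck and the effective read speed roughly doubles at a 50% ratio.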

Some interesting features of COW filesystems

So far the optimization steps were fairly trivial: just enabling features of an existing filesystem and using them in a smart way.

Getting from a 5.8 GiB image with 9 Linux distributions to a 5 GiB image with 13 Linux distributions was harder. It was likely that those distributions had a lot of files in common. I ran a simple test and found that there was room for further savings in the range of 12–13% if I could make the filesystem share data for duplicate files among the distributions.
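The post does not say how the test was done. One way to get such an estimate, sketched here as an assumption, is to hash every regular file in the extracted root filesystems and count the bytes of files whose content was already seen. The directory names in the usage example are hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks to keep memory use flat."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def duplicate_savings(roots):
    """Return (total_bytes, duplicate_bytes) over all regular files in `roots`."""
    seen = set()
    total = duplicate = 0
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Skip symlinks, device nodes, sockets and other special files.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                size = os.path.getsize(path)
                total += size
                key = (size, file_digest(path))
                if key in seen:
                    duplicate += size  # content already stored once
                else:
                    seen.add(key)
    return total, duplicate


if __name__ == "__main__":
    total, dup = duplicate_savings(["rootfs/arch", "rootfs/pmos"])
    if total:
        print(f"{dup / total:.1%} of {total} bytes are duplicates")
```

This counts uncompressed bytes and whole-file matches only, so it is a rough estimate of what deduplication on a compressed filesystem would save.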

COW filesystems, like btrfs, can sometimes share file data among multiple files without resorting to hardlinking. In fact, there's now a generic VFS-level Linux API that asks a filesystem to share file data between files, if it supports this feature.

Side note: Hardlinking would not be a great solution for reducing file data duplication in the image anyway, because it would mean that if a user changed a file in Ubuntu Touch, the change might be visible in other distros that share that file.

The next hurdle was figuring out how to use this API effectively in my specific scenario. There are tools that scan an existing filesystem, search for duplicates, and use this API to remove them. Then it would be possible to scrub the filesystem, discard empty space, and compress the resulting block device image for easy redistribution. I checked a bunch of those tools, and they seemed overly complicated, or buggy when used on btrfs subvolumes.
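For illustration, one such call is the FICLONE ioctl, which makes a destination file share all data blocks of a source file. The sketch below invokes it from Python. The request number is the value of `_IOW(0x94, 9, int)` on x86 and ARM Linux (an assumption for other architectures), and the fallback to a plain copy is an addition for filesystems without reflink support.

```python
import fcntl
import shutil

# FICLONE from <linux/fs.h>: _IOW(0x94, 9, int) on x86 and ARM Linux.
FICLONE = 0x40049409


def clone_or_copy(src_path, dst_path):
    """Make dst_path share data with src_path; return True if it was cloned."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # The ioctl is issued on the destination, with the source fd as argument.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            # Filesystem cannot share extents: fall back to copying the bytes.
            shutil.copyfileobj(src, dst)
            return False
```

Unlike a hardlink, the two files stay independent: writing to one triggers copy-on-write and leaves the other untouched.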

Instead I decided to write an extraction tool using libarchive that takes multiple tarballs as input (one per Linux distribution) and extracts them while keeping track of the content of already extracted files. Whenever it finds a file that was already extracted before, it uses the FICLONE ioctl (the data-sharing API mentioned above) to share the data instead of writing it again. It does deduplication on the fly, so there's no need to apply any further cleanup steps to the filesystem afterwards. This approach to deduplication is also very fast and efficient, because there's no extra IO necessary.

At this point I tried to add 4 more Linux distributions to the image. I did this just to stress test the new extraction tool. In particular, I added several variants of postmarketOS, which were bound to have a lot of files in common.

The resulting image had the same size as the previous 9-distribution variant.
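The idea can be sketched in a few lines. This is not the author's tool: it uses Python's `tarfile` instead of libarchive, identifies content by SHA-256 (an assumption, the post does not say how content is tracked), handles regular files only, and ignores ownership, permissions, and path sanitizing, so it is only suitable for trusted tarballs.

```python
import fcntl
import hashlib
import os
import shutil
import tarfile

FICLONE = 0x40049409  # _IOW(0x94, 9, int) on x86 and ARM Linux


def write_shared(original, target):
    """Create target sharing data with original; plain copy if FICLONE fails."""
    with open(original, "rb") as src, open(target, "wb") as dst:
        try:
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
        except OSError:
            shutil.copyfileobj(src, dst)


def extract_dedup(tarballs):
    """Extract (tar_path, dest_dir) pairs; return (files_seen, files_shared)."""
    first_copy = {}  # content digest -> path of the first extracted copy
    files = shared = 0
    for tar_path, dest in tarballs:
        with tarfile.open(tar_path) as tar:
            for member in tar:
                if not member.isfile():
                    continue  # directories, links and devices omitted in this sketch
                target = os.path.join(dest, member.name)
                os.makedirs(os.path.dirname(target), exist_ok=True)
                data = tar.extractfile(member).read()
                digest = hashlib.sha256(data).digest()
                files += 1
                original = first_copy.get(digest)
                if original is not None:
                    # Same content was extracted before: share it, write nothing new.
                    write_shared(original, target)
                    shared += 1
                else:
                    with open(target, "wb") as out:
                        out.write(data)
                    first_copy[digest] = target
    return files, shared
```

Calling `extract_dedup([("arch.tar", "/mnt/sd/arch"), ("pmos.tar", "/mnt/sd/pmos")])` with hypothetical paths would extract both trees and report how many files were shared. Because duplicates are detected while the archive is being read, their data is never written a second time.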

Final space savings

The final opportunity for size optimization was a cheap one: remove files that are not used or needed. I didn't want to tweak the distributions too much, so that they stay as close to their official state as possible. I had to make one change, though. I was not able to use the distributions' own official kernels, because none of them had a btrfs driver built in. I also was not very fond of supporting the outdated EOLed kernels many of the distributions use, or of rebuilding them. So I used my own kernel and made all the distributions share it.

Due to this I was able to remove the modules, firmware, and kernels that the distributions package themselves. With 13 distributions this led to another 0.8 GiB of saved space.

The result is a 5 GiB 13-distro image that easily fits on an 8 GiB SD card and can be used to test-drive most of the OSes that are available for the PinePhone today.

The image also serves as a demo of the GUI bootloader I wrote for the PinePhone.
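One way to drop such files during extraction is a path filter, sketched below. The directory list is an assumption based on common Linux layouts, not the list actually used for the image, and the filter keeps the directories themselves so that only their contents are skipped.

```python
from pathlib import PurePosixPath

# Assumed locations of distribution kernels, modules and firmware.
KERNEL_PAYLOAD_DIRS = (
    "boot",
    "lib/modules",
    "lib/firmware",
    "usr/lib/modules",
    "usr/lib/firmware",
)


def is_kernel_payload(member_name):
    """True if a rootfs path lies inside one of the kernel payload directories."""
    # Normalize "./lib/x" and "/lib/x" to "lib/x".
    parts = [p for p in PurePosixPath(member_name).parts if p != "/"]
    rel = "/".join(parts)
    return any(rel.startswith(d + "/") for d in KERNEL_PAYLOAD_DIRS)


if __name__ == "__main__":
    for name in ("./lib/modules/5.9.0/kernel/fs/btrfs/btrfs.ko", "usr/bin/bootctl"):
        print(name, is_kernel_payload(name))
```

An extractor would call `is_kernel_payload(member.name)` for each archive member and skip those for which it returns true.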