install: Enable installing to multi device parents #1911
jmarrero merged 8 commits into bootc-dev:main
Code Review
This pull request successfully enables installing to multi-device parent filesystems, such as LVM spanning multiple disks. It correctly discovers all parent devices and, for bootupd/GRUB, installs the bootloader to all devices with an ESP partition. For bootloaders that only support single-device configurations like systemd-boot and zipl, the implementation correctly defaults to using the first available device. The changes are well-architected, adapting data structures and logic to handle multiple devices. A new, thorough integration test validates both single and dual ESP scenarios. Overall, this is a solid enhancement with good error handling and logging. I have one suggestion to further improve the robustness of ESP detection.
Waiting to merge until the patch release goes out.
# See https://tmt.readthedocs.io/en/stable/stories/features.html#reboot-during-test
match $env.TMT_REBOOT_COUNT? {
    null | "0" => test_single_esp,
    "1" => { test_dual_esp; test_three_devices_partial_esp; tmt-reboot },
How about adding another test scenario, such as RAID1 using the whole disks (see coreos/bootupd#1059)? For example:
root@localhost-live:/home/fedora# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
vda 253:0 0 30G 0 disk
└─md126 9:126 0 30G 0 raid1
├─md126p1 259:0 0 477M 0 part
├─md126p2 259:1 0 954M 0 part /boot
└─md126p3 259:2 0 28.6G 0 part /
vdb 253:16 0 30G 0 disk
└─md126 9:126 0 30G 0 raid1
├─md126p1 259:0 0 477M 0 part
├─md126p2 259:1 0 954M 0 part /boot
└─md126p3 259:2 0 28.6G 0 part /
To create the RAID1 array, use this command:
sudo mdadm -CR /dev/md126 -e 1 -l1 -n 2 /dev/loop0 /dev/loop1 --assume-clean
/// files before reinstallation. On multi-device setups only the first
/// ESP is mounted and cleaned; stale files on additional ESPs are left
/// in place (bootupd will overwrite them during installation).
// TODO: clean all ESPs on multi-device setups
Hmm, yeah, maybe we should move this logic into bootupd.
    *) copr_distro="centos-stream" ;;
esac
# Update bootc from rhcontainerbot copr; the new bootupd
# requires a newer bootc than what ships in some base images.
But wait, we're building this as part of our CI here, right?
Yeah, this is definitely suboptimal. It is in place to unblock the bootupd install, which fails when the bootc version is <= 1.14.1. I might be able to move the bootupd copr install to later in the Dockerfile, after the new bootc build is installed...
OK, I pushed a few cleanups here. When you get back tomorrow, can you take a look, @ckyrouac?
LGTM. Fixed a doc string bug and removed the composefs commit.
Three issues prevented a successful Anaconda-based bootc install onto a
RAID 1 array with per-disk ESP partitions:

1. Bind mount failure during ostreecontainer payload install:
   Anaconda mounts the root FS at /mnt/sysimage and then tries to
   bind-mount /mnt/sysimage/boot/efi2 and /boot/efi3 into /mnt/sysroot.
   However, it does not create the mount point directories for the
   secondary ESPs before the bind mount, causing:

     mount: /mnt/sysroot/boot/efi2: special device
     /mnt/sysimage/boot/efi2 does not exist.

   Fix: add a %pre-install script that creates the directories after
   the root FS is mounted but before the payload runs.

2. Empty secondary ESPs after installation:
   Anaconda only installs the bootloader (shim, GRUB, grub.cfg) to the
   primary ESP at /boot/efi. The secondary ESPs (efi2, efi3) are
   formatted and mounted but receive no bootloader files. Additionally,
   the 'noauto' fsoption causes Anaconda to unmount them before %post.

   Fix: add a %post --nochroot script that explicitly mounts each
   secondary ESP by device path and copies the full EFI directory tree
   from the primary ESP.

3. GRUB cannot find the kernel on the RAID root filesystem:
   The GRUB EFI binary shipped with CentOS Stream 10 does not include
   the mdraid1x module. GRUB's grub.cfg uses 'search --fs-uuid' to
   locate the root filesystem and load the kernel from the ostree
   deployment path, but it cannot access the md array.

   Fix: add a separate /boot partition (ext4, 1G) on the primary disk.
   GRUB can read ext4 natively without mdraid support. The kernel and
   initramfs live on /boot, and the initramfs handles RAID assembly
   for the root filesystem during boot.

Note: /boot is only on vda (not mirrored), so only disk 1 can boot
independently. Full boot redundancy from any disk would require either
whole-disk RAID (where GRUB reads from the raw mirror) or a GRUB build
with mdraid1x support. This is a known limitation tracked in:
- bootc-dev/bootc#1911
- coreos/bootupd#1077
Needs a rebase 🏄 otherwise looks sane to me!
Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Signed-off-by: Colin Walters <walters@verbum.org>
The composefs BLS and UKI boot setup paths called find_partition_of_esp() directly on the device, which fails when the root filesystem is on an LVM logical volume (the ESP is on the parent disk, not the LV). The store module had the same issue via require_single_root() + find_partition_of_esp().

Switch all call sites to find_colocated_esps() which walks up to the physical disk(s) via find_all_roots() before searching for the ESP, consistent with what install_systemd_boot and mount_esp_part already do.

Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Signed-off-by: Colin Walters <walters@verbum.org>
The test was using `get_target_image` which returns the upstream `docker://quay.io/centos-bootc/centos-bootc:stream10`. On composefs+grub variants provisioned with an updated bootupd from copr, the upstream image has stock bootupd with incompatible EFI update metadata, causing the install to fail with "Failed to find EFI update metadata".

Switch to using `containers-storage:localhost/bootc` (the locally-built image), matching the pattern used by test-32, test-37, and test-38. The locally-built image has the updated bootupd with compatible metadata.

Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Signed-off-by: Colin Walters <walters@verbum.org>
The initial change to use locally-built images had two additional issues
on composefs:
1. containers-storage: transport fails on composefs's read-only root
with "mkdir /.local: read-only file system". Fix by exporting the
image to an OCI layout directory on writable /var/tmp instead.
2. run_install() was masking /sysroot/ostree and removing bootupd update
metadata, which composefs needs for bootloader installation and boot
binaries. Fix by making run_install() skip these ostree-specific
workarounds on composefs systems.
Note: the composefs install-outside-container code path still has a
separate bug ("Shared boot binaries not found" in boot.rs:745) that
needs fixing in the Rust code.
Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Signed-off-by: Colin Walters <walters@verbum.org>
The multi-device ESP test creates ESP partitions and expects bootupd to install a UEFI bootloader. On BIOS-booted systems, bootupd instead tries to install GRUB for i386-pc, which requires a BIOS Boot Partition and fails.

The test plan already requests UEFI provisioning via the hardware hint, but Testing Farm does not always honor this on CentOS Stream x86_64. Add a runtime check for /sys/firmware/efi so the test skips gracefully on BIOS hosts rather than failing.

Assisted-by: Claude Code (Opus 4.6)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Signed-off-by: Colin Walters <walters@verbum.org>
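The runtime check described above could look roughly like the following. This is a minimal sketch in Rust rather than the test's own scripting language; `is_uefi_booted` and `esp_test_action` are hypothetical names, and the `sysfs_root` parameter is an assumption added only so the check can be pointed at a scratch directory instead of the real `/sys`.

```rust
use std::path::Path;

/// Returns true when a `firmware/efi` directory exists under `sysfs_root`.
/// On a running Linux system the caller would pass "/sys"; the parameter is
/// an assumption of this sketch so the check can be exercised elsewhere.
fn is_uefi_booted(sysfs_root: &Path) -> bool {
    sysfs_root.join("firmware/efi").is_dir()
}

/// Decide whether the multi-device ESP test should run or be skipped.
/// (Hypothetical helper; the real test makes this decision in its script.)
fn esp_test_action(sysfs_root: &Path) -> &'static str {
    if is_uefi_booted(sysfs_root) {
        "run"
    } else {
        // BIOS-booted host: bootupd would attempt an i386-pc GRUB install,
        // so the test is skipped rather than allowed to fail.
        "skip"
    }
}
```

Skipping on a missing directory, instead of trusting the provisioning hint, keeps the outcome tied to how the host actually booted.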
Extract the repeated PATH environment variable string into a set_default_path() method on BwrapCmd. The bwrap environment may not have a complete PATH, causing tools like bootupctl or sfdisk to not be found. This consolidates the workaround into one place.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
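To illustrate the shape of this refactor, here is a minimal sketch. The `BwrapCmd` below is a hypothetical stand-in, not the real type from the codebase: its `env` field, `setenv()`, `env_args()`, and the `DEFAULT_PATH` value are all assumptions, since the actual PATH string and builder API are not shown in this conversation. Only the method name `set_default_path()` comes from the commit message.

```rust
/// Hypothetical stand-in for the real `BwrapCmd`; only the pieces needed to
/// illustrate the refactor are modelled.
#[derive(Debug, Default)]
struct BwrapCmd {
    env: Vec<(String, String)>,
}

/// Assumed value: the PATH string used by the actual code is not shown here.
const DEFAULT_PATH: &str = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin";

impl BwrapCmd {
    fn new() -> Self {
        Self::default()
    }

    /// Set an environment variable, replacing any earlier value for `key`.
    fn setenv(mut self, key: &str, value: &str) -> Self {
        self.env.retain(|(k, _)| k != key);
        self.env.push((key.to_string(), value.to_string()));
        self
    }

    /// The single place that knows the fallback PATH, so call sites no
    /// longer repeat the literal string.
    fn set_default_path(self) -> Self {
        self.setenv("PATH", DEFAULT_PATH)
    }

    /// Render the environment as `--setenv KEY VALUE` arguments for bwrap.
    fn env_args(&self) -> Vec<String> {
        let mut args = Vec::new();
        for (k, v) in &self.env {
            args.push("--setenv".to_string());
            args.push(k.clone());
            args.push(v.clone());
        }
        args
    }
}
```

A call site would then read `BwrapCmd::new().set_default_path()` instead of spelling out the PATH literal each time.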
Several improvements to ESP partition discovery:

Add find_partition_of_esp_optional() returning Result<Option<&Device>> to cleanly separate three outcomes: found, absent, and genuinely unexpected errors (like unsupported partition table types). The existing find_partition_of_esp() is now a thin wrapper that converts None to Err.

Add find_first_colocated_esp() helper to replace a 10-line pattern that was repeated verbatim 5 times across boot.rs and store/mod.rs.

Deduplicate roots in find_all_roots() using a seen-set: in complex topologies like multipath, multiple parent branches can converge on the same physical disk.

find_colocated_esps() now uses the optional variant to properly propagate real errors while treating absence normally.

Also extract the match-on-if-else in setup_composefs_bls_boot into a let binding for readability.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Signed-off-by: Chris Kyrouac <ckyrouac@redhat.com>
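The three-outcome split and the seen-set deduplication can be sketched as follows. This is not the code from the PR: the `Device` struct, its fields, `dedup_roots()`, and the `String` error type are simplified assumptions, the function signatures are guesses at the general shape, and the sketch only understands GPT (the real code may accept other partition table types).

```rust
use std::collections::HashSet;

/// Simplified, hypothetical device model; the real `Device` type differs.
#[derive(Debug, Clone, PartialEq)]
struct Device {
    name: String,
    /// Partition table type of a disk, e.g. "gpt"; None if there is none.
    pttype: Option<String>,
    /// GPT partition type GUID, set on partitions only.
    parttype: Option<String>,
    children: Vec<Device>,
}

/// GPT type GUID of an EFI System Partition.
const ESP_GUID: &str = "c12a7328-f81f-11d2-ba4b-00a0c93ec93b";

/// Ok(Some) = found, Ok(None) = absent, Err = genuinely unexpected.
fn find_partition_of_esp_optional(disk: &Device) -> Result<Option<&Device>, String> {
    match disk.pttype.as_deref() {
        Some("gpt") => Ok(disk.children.iter().find(|p| {
            p.parttype
                .as_deref()
                .map(|t| t.eq_ignore_ascii_case(ESP_GUID))
                .unwrap_or(false)
        })),
        // Assumption of this sketch: anything other than GPT is an error.
        Some(other) => Err(format!("unsupported partition table type: {other}")),
        None => Ok(None),
    }
}

/// Thin wrapper that turns "absent" into an error.
fn find_partition_of_esp(disk: &Device) -> Result<&Device, String> {
    find_partition_of_esp_optional(disk)?
        .ok_or_else(|| format!("no ESP partition found on {}", disk.name))
}

/// Drop repeated roots, keeping first-seen order (hypothetical helper).
fn dedup_roots<'a>(roots: impl IntoIterator<Item = &'a Device>) -> Vec<&'a Device> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut out = Vec::new();
    for d in roots {
        // `insert` returns false when the name was already present.
        if seen.insert(d.name.as_str()) {
            out.push(d);
        }
    }
    out
}

/// Collect ESPs across all root disks: absence is skipped, errors propagate.
fn find_colocated_esps<'a>(roots: &[&'a Device]) -> Result<Vec<&'a Device>, String> {
    let mut esps = Vec::new();
    for disk in dedup_roots(roots.iter().copied()) {
        if let Some(esp) = find_partition_of_esp_optional(disk)? {
            esps.push(esp);
        }
    }
    Ok(esps)
}
```

Returning `Result<Option<_>>` lets callers that tolerate a missing ESP use `?` for real failures while matching on `None` for the expected "no ESP on this disk" case.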
The no-ESP test only checked for a non-zero exit code, which would also pass if podman itself failed for unrelated reasons. Check that the output contains "ESP" to confirm the right failure mode.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
s390x builds have been broken for a while too, and we know about the zipl/systemd-boot limitations now. Maybe we should add those to the docs in a follow-up.
When the root filesystem spans multiple backing devices (e.g., LVM across multiple disks), discover all parent devices and find ESP partitions on each. For bootupd/GRUB, install the bootloader to all devices with an ESP partition, enabling boot from any disk in a multi-disk setup. systemd-boot and zipl only support single-device configurations.
This adds a new integration test validating both single-ESP and dual-ESP multi-device scenarios.
Fixes: #481
Assisted-by: Claude Code (Opus 4.5)
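
The per-bootloader behavior in the description can be summarized in a small sketch. Everything here is hypothetical and for illustration only: the `Bootloader` enum, the `Disk` struct, and `install_targets()` are not names from the PR. The choice of "first disk with an ESP" for systemd-boot and "first disk" for zipl is an assumption based on the review's statement that single-device bootloaders default to the first available device.

```rust
/// Hypothetical bootloader selector for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Bootloader {
    Grub,
    SystemdBoot,
    Zipl,
}

/// Hypothetical view of one parent disk of the root filesystem.
struct Disk {
    name: &'static str,
    has_esp: bool,
}

/// Pick the disks that receive a bootloader install.
fn install_targets(bootloader: Bootloader, disks: &[Disk]) -> Vec<&str> {
    match bootloader {
        // bootupd/GRUB: every parent disk that carries an ESP.
        Bootloader::Grub => disks.iter().filter(|d| d.has_esp).map(|d| d.name).collect(),
        // Single-device only; assumed here to mean the first disk with an ESP.
        Bootloader::SystemdBoot => disks
            .iter()
            .filter(|d| d.has_esp)
            .map(|d| d.name)
            .take(1)
            .collect(),
        // Single-device only; assumed here to mean the first parent disk.
        Bootloader::Zipl => disks.iter().map(|d| d.name).take(1).collect(),
    }
}
```

With three parent disks where only the second and third have an ESP, GRUB would target both of those, while the single-device bootloaders each get exactly one target.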