extend-filesystems: Fix race condition by not using lsblk #132

Merged
chewi merged 1 commit into flatcar-master from chewi/extend-fs-race
Jun 12, 2025
@chewi chewi commented Jun 12, 2025

extend-filesystems: Fix race condition by not using lsblk

lsblk relies on udev, which is inherently racy, especially when the filesystem has just been mounted. It has been observed that the FSTYPE field is sometimes not populated, preventing the filesystem from being resized, and causing tests involving the dev container (which is large) to fail.

Use findmnt instead, which gets its information directly from the kernel. It also natively supports filtering by filesystem type and mount option. It doesn't fetch PARTTYPE, but we can safely get that from cgpt.
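As a rough illustration of the approach described above (not the actual code from this PR), the two lookups might be sketched as follows. The helper names, the default filesystem types, and the `rw` option filter are assumptions made for the example. By default findmnt reads the kernel's mount table from /proc/self/mountinfo, so no udev data is involved.

```shell
#!/bin/sh
# Hypothetical helper: list mounted filesystems that are candidates for
# resizing. The type list and the "rw" filter are illustrative assumptions.
list_resizable_mounts() {
    findmnt --noheadings --raw \
        --output SOURCE,TARGET,FSTYPE \
        --types "${1:-ext4,btrfs,xfs}" \
        --options rw
}

# Hypothetical helper: print the partition type GUID of partition number $2
# on disk $1, using cgpt rather than lsblk's PARTTYPE column.
partition_type() {
    cgpt show -i "$2" -t "$1"
}
```

For example, `list_resizable_mounts ext4` would print one `SOURCE TARGET FSTYPE` line per read-write ext4 mount, and `partition_type /dev/sda 9` would print the type GUID for that partition (the device and partition number here are placeholders).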

This could be quite an important fix. We have only seen test failures so far, perhaps because the race only triggers when the system is under load, but I see no reason why it wouldn't happen in production. It may have only started breaking recently, following an update to the kernel, udev, or a related component.

How to use

qemu-img create -f qcow2 -o backing_file=flatcar_production_qemu_uefi_image.img,backing_fmt=qcow2,lazy_refcounts=on,size=16106127360 my_flatcar.img
./flatcar_production_qemu_uefi.sh -I my_flatcar.img,snapshot=on 

Check that the extend-filesystems.service unit works successfully and that / is actually resized.
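One possible way to script that check inside the booted VM is sketched below. The function name and the size threshold are assumptions for illustration; pick a threshold that is larger than the root filesystem of the unmodified image but smaller than the enlarged disk.

```shell
#!/bin/sh
# Hypothetical check: succeed only if extend-filesystems.service did not fail
# and / is at least $1 bytes in size.
check_root_resized() {
    min_bytes=$1
    # Result is "success" unless the last run failed. It is also "success"
    # if the unit never ran, which is why the size check below matters.
    result=$(systemctl show --property=Result --value extend-filesystems.service)
    if [ "$result" != success ]; then
        echo "extend-filesystems.service result: $result" >&2
        return 1
    fi
    # Size of the root filesystem in bytes, as reported by findmnt.
    size=$(findmnt --noheadings --raw --bytes --output SIZE /)
    [ "$size" -ge "$min_bytes" ]
}
```

Calling `check_root_resized 10000000000` would then return success only when the unit did not fail and / is at least roughly 10 GB, a threshold chosen arbitrarily here for the 16106127360-byte image created above.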

Testing done

CI has passed. I previously forced the test to fail so that it would retry and I could see the output. Once I had applied the fix, it would successfully resize every time, whereas before it would fail most of the time.

@chewi chewi requested a review from a team June 12, 2025 09:05
@chewi chewi self-assigned this Jun 12, 2025
@chewi chewi force-pushed the chewi/extend-fs-race branch from 557815c to 0397383 Compare June 12, 2025 12:51
@chewi chewi merged commit dd9cbe4 into flatcar-master Jun 12, 2025
@chewi chewi deleted the chewi/extend-fs-race branch June 12, 2025 12:59
chewi added a commit to flatcar/scripts that referenced this pull request Jun 12, 2025
An associated Kola fix is included.

Closes: flatcar/init#132
Closes: flatcar/Flatcar#296
Signed-off-by: James Le Cuirot <jlecuirot@microsoft.com>