From: phoebe-lab local fix
Subject: [PATCH] builds.sr.ht-images ubuntu/genimg: fix nbd settle race + host chown + networkd match

Four independent fixes for the upstream `images/ubuntu/genimg` script as shipped in the `builds.sr.ht-images` Alpine package (last verified against 0.103.12-r0). Each bug breaks image generation, first-boot networking, or in-guest name resolution on any host that doesn't happen to satisfy its hidden assumptions.

1) After `qemu-nbd --connect`, the kernel hasn't picked up the device size yet, so the immediate `dd if=mbr.bin of=/dev/nbd0` fails with "No space left on device". The sibling `debian/genimg` already handles this with a partprobe loop; the ubuntu variant is missing it. This is a verbatim copy of the debian fix.

2) `chown build:build /mnt/home/build/.gitconfig` runs on the *host*, not in the chroot. There's no host user named `build` on a typical Arch/Debian builder VM, so the script aborts at the very end, after 5+ min of debootstrap work. Routing the chown through `run_root` (the existing chroot helper used everywhere else in the file) puts it where the user actually exists.

3) `/etc/systemd/network/25-ens3.network` matches only `Name=ens3`. udev derives the predictable interface name from PCI topology, and on QEMU machine types newer than upstream's tested combo (e.g. modern noble + virtio on i440fx slot 3) the interface comes up as `enp0s3` or similar. The file then matches nothing, no DHCP runs, the guest has no IP, and the builds-worker spins for 2 min on `Waiting for guest to settle` before giving up. Widen the match to any ethernet name (`Name=en*`); the image only ever has one NIC, so there's no risk of binding the wrong one. Rename the file to drop the stale interface-name hint.

4) Hard-coded public DNS (`8.8.8.8` / `8.8.4.4`) in the guest's `/etc/resolv.conf` makes the VM resolve internal hostnames (e.g. `git.srht.bigb.es`) to their *public* address.
On a self-hosted forge whose public IP routes back through the user's own ISP, the worker container can't NAT-hairpin to itself and every `git clone` from a build dies with "Couldn't connect to server". The fix is to use QEMU SLIRP's built-in DNS proxy (`10.0.2.3`) in the *guest*: SLIRP forwards each query through the host process's `/etc/resolv.conf`, which inside the worker container is Docker's embedded resolver, which knows the LAN DNS, so internal hostnames resolve to the LAN IP and the path stays entirely on-LAN.

Subtle constraint: the script's `/mnt/etc/resolv.conf` is dual-purpose: `chroot`'d `apt-get` reads it during debootstrap, *and* the same file is what the booted guest uses. SLIRP's `10.0.2.3` only exists *inside* a QEMU SLIRP guest; on the image-builder host it's unroutable, so swapping the file early breaks `apt-get install linux-image-generic` with "Temporary failure resolving 'archive.ubuntu.com'". Leave the public DNS in place for the build, then overwrite `/mnt/etc/resolv.conf` with `nameserver 10.0.2.3` at the end of the script, after all apt operations and immediately before `sync`. This way the build still works and the booted guest gets the SLIRP DNS proxy.

Apply when refreshing the apk recipe tree on the image-builder host:

    cd /var/lib/images
    patch -p1 < builds-images-ubuntu-genimg.patch

--- a/ubuntu/genimg
+++ b/ubuntu/genimg
@@ -35,6 +35,7 @@
 qemu-img create -f qcow2 $arch/root.img.qcow2 32G
 modprobe nbd max_part=16
 qemu-nbd --connect=/dev/nbd0 $arch/root.img.qcow2
+for i in 1 2 3 4 5; do sleep 0.$i; partprobe /dev/nbd0 && break; done
 trap cleanup EXIT
 
 if [ "$arch" = "amd64" ]
@@ -85,9 +86,9 @@
 rm -f /mnt/etc/resolv.conf
 echo 'nameserver 8.8.8.8' >/mnt/etc/resolv.conf
 echo 'nameserver 8.8.4.4' >>/mnt/etc/resolv.conf
-cat >/mnt/etc/systemd/network/25-ens3.network </mnt/etc/systemd/network/25-ethernet.network </mnt/etc/resolv.conf sync
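
A quick post-apply sanity check can catch a half-applied patch before spending another 5+ min in debootstrap. The sketch below is only an illustration: `check_genimg_patch` is a hypothetical helper, not part of `builds.sr.ht-images`, and the grep patterns assume the patched lines look the way the description above says they do (in particular, that the original host-side chown started at the beginning of a line).

```sh
#!/bin/sh
# check_genimg_patch: hypothetical helper, not shipped with builds.sr.ht-images.
# Returns 0 if the given genimg script appears to carry all four fixes.
check_genimg_patch() {
    script=$1
    rc=0

    # Fix 1: partprobe settle loop after qemu-nbd --connect.
    grep -q 'partprobe /dev/nbd0' "$script" ||
        { echo "missing: nbd settle loop"; rc=1; }

    # Fix 2: no bare host-side chown on the chroot's .gitconfig
    # (assumes the unpatched line starts at column 0, possibly indented).
    if grep -q '^[[:space:]]*chown build:build /mnt/home/build/' "$script"; then
        echo "present: host-side chown"; rc=1
    fi

    # Fix 3: widened networkd match.
    grep -qF 'Name=en*' "$script" ||
        { echo "missing: Name=en* match"; rc=1; }

    # Fix 4: the SLIRP resolver must be written after the last apt-get call,
    # because 10.0.2.3 is unreachable from the image-builder host.
    last_apt=$(grep -n 'apt-get' "$script" | tail -n 1 | cut -d: -f1)
    slirp=$(grep -nF 'nameserver 10.0.2.3' "$script" | tail -n 1 | cut -d: -f1)
    if [ -z "$slirp" ]; then
        echo "missing: nameserver 10.0.2.3"; rc=1
    elif [ "$slirp" -le "${last_apt:-0}" ]; then
        echo "ordering: nameserver 10.0.2.3 written before last apt-get"; rc=1
    fi

    return $rc
}
```

With the patch applied as described above, `check_genimg_patch /var/lib/images/ubuntu/genimg` should print nothing and return 0; any line of output names the fix that did not land.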