Installation errors out when installing in a chroot
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
kdump-tools (Ubuntu) | Fix Released | Undecided | dann frazier |
Jammy | Fix Released | Undecided | dann frazier |
Mantic | Fix Released | Undecided | dann frazier |
Noble | Fix Released | Undecided | dann frazier |
linux-nvidia (Ubuntu) | Invalid | Undecided | Unassigned |
Jammy | Invalid | Undecided | Unassigned |
Mantic | Invalid | Undecided | Unassigned |
Noble | Invalid | Undecided | Unassigned |
Bug Description
[Impact]
When installing in a chroot environment, the kdump-tools kernel hook will fail. This breaks certain OS image creation tools, such as BCM (which I assume is NVIDIA Base Command Manager from context). While BCM's image generation tool requires some additional tweaks to consume this, the proposed fix is necessary, and should be sufficient for other tools that use chroots.
[Test Plan]
1) debootstrap a chroot environment for the target Ubuntu release
2) Install kdump-tools within that environment
3) Install a kernel package and make sure it succeeds.
4) Do the same steps in a virtual machine, but with `ischroot` symlinked to /bin/true at install time. This should simulate a system that was just installed with an image prepared as above.
5) Then restore ischroot, and reboot.
6) Confirm the systemd service does generate an initramfs at boot, and
7) that a crash dump can be triggered afterwards.
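Steps 1-3 above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function name `chroot_test_plan`, the `RUN` variable, and the choice of `linux-image-generic` as the kernel package are illustrative and not part of the test plan itself. By default the commands are only printed. Depending on the release, the chroot may also need /proc and /sys mounted and an `apt-get update` before the installs succeed.

```shell
#!/bin/sh
# Sketch of test plan steps 1-3; illustrative only.
# RUN defaults to "echo", so the commands are printed rather than executed.
# Set RUN=sudo to really run them (needs root and network access).
RUN="${RUN:-echo}"

# chroot_test_plan SUITE TARGET
#   SUITE:  Ubuntu release codename, e.g. jammy
#   TARGET: directory for the chroot (hypothetical path, pick your own)
chroot_test_plan() {
    suite="$1"
    target="$2"
    # 1) debootstrap a chroot environment for the target release
    $RUN debootstrap "$suite" "$target"
    # 2) install kdump-tools within that environment
    $RUN chroot "$target" apt-get install -y kdump-tools
    # 3) install a kernel package; linux-image-generic is an assumed choice
    $RUN chroot "$target" apt-get install -y linux-image-generic
}
```

Calling `chroot_test_plan jammy /tmp/kdump-chroot` prints the three commands for review; with `RUN=sudo` it executes them. Before the fix, step 3 is where the kdump-tools hook is expected to fail.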
[Where Problems Could Occur]
The solution we implemented in noble was simply to detect a chroot environment with ischroot, and disable initramfs generation. The systemd service will detect that an initrd is missing, and generate one on first boot. A source of potential problems is if that does not happen for some reason, resulting in a kdump-tools service failure on first boot.
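The behaviour described above can be sketched roughly as follows. This is an illustrative shell sketch, not the actual kdump-tools hook or service code: the function names `kdump_hook_action` and `kdump_initrd_missing` and the overridable `ISCHROOT` and `KDUMP_DIR` variables are hypothetical. The `/var/lib/kdump/initrd.img-<version>` location is taken from the log output in this report, and `ischroot` (from debianutils) exits 0 when it detects a chroot.

```shell
#!/bin/sh
# Illustrative sketch only; the names below are hypothetical, not kdump-tools code.

# Overridable so the logic can be exercised without a real chroot.
ISCHROOT="${ISCHROOT:-ischroot}"
KDUMP_DIR="${KDUMP_DIR:-/var/lib/kdump}"

# Kernel hook side: print "skip" inside a chroot, "generate" otherwise.
# ischroot exits 0 in a chroot, 1 outside one, and 2 if it cannot tell;
# anything other than 0 is treated here as "not a chroot".
kdump_hook_action() {
    if "$ISCHROOT" >/dev/null 2>&1; then
        echo skip
    else
        echo generate
    fi
}

# First-boot side: succeed (exit 0) when no kdump initrd exists yet
# for the given kernel version, i.e. one still has to be generated.
kdump_initrd_missing() {
    version="$1"
    [ ! -e "$KDUMP_DIR/initrd.img-$version" ]
}
```

The failure mode named above corresponds to the case where the hook skipped generation in the chroot but nothing acts on `kdump_initrd_missing` being true at first boot.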
Changed in linux-nvidia (Ubuntu): | |
status: | New → Invalid |
Changed in kdump-tools (Ubuntu): | |
status: | New → Triaged |
assignee: | nobody → dann frazier (dannf) |
Changed in linux-nvidia (Ubuntu Mantic): | |
status: | New → Invalid |
Changed in linux-nvidia (Ubuntu Jammy): | |
status: | New → Invalid |
Changed in kdump-tools (Ubuntu Mantic): | |
assignee: | nobody → dann frazier (dannf) |
Changed in kdump-tools (Ubuntu Jammy): | |
assignee: | nobody → dann frazier (dannf) |
Changed in kdump-tools (Ubuntu Mantic): | |
status: | New → In Progress |
Changed in kdump-tools (Ubuntu Jammy): | |
status: | New → In Progress |
description: | updated |
The complete steps to reproduce, with the resulting errors, are the following:
Steps to reproduce:
1. Create a directory on a BCM head node or any Linux server, say /cm/images/dgx-h100-image
2. Download and unpack the DGX-OS base tar archive into the above directory
$ wget http://bright-dev.nvidia.com/base-distributions/x86_64/dgx-os/dgx-os-6.1-trd4/DGXOS-6.1.0-DGX-H100.tar.gz -P /tmp
$ cd /cm/images/dgx-h100-image
$ tar -xvzf /tmp/DGXOS-6.1.0-DGX-H100.tar.gz
3. Install the latest kernel after chroot-ing to /cm/images/dgx-h100-image
$ cm-chroot-sw-img /cm/images/dgx-h100-image (Note: cm-chroot-sw-img is just a wrapper around chroot that additionally takes care of mounting the virtual filesystems such as /proc /sys etc. in the target directory)
$ apt-get update
$ apt-get install linux-image-5.15.0-1040-nvidia
Setting up linux-image-5.15.0-1040-nvidia (5.15.0-1040.40) ...
Processing triggers for linux-image-5.15.0-1040-nvidia (5.15.0-1040.40) ...
/etc/kernel/postinst.d/dkms:
* dkms: running auto installation service for kernel 5.15.0-1040-nvidia
...done.
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-5.15.0-1040-nvidia
cryptsetup: WARNING: Couldn't determine root device
W: Couldn't identify type of root file system for fsck hook
cp: cannot stat '/etc/iscsi/initiatorname.iscsi': No such file or directory
/etc/kernel/postinst.d/kdump-tools:
kdump-tools: Generating /var/lib/kdump/initrd.img-5.15.0-1040-nvidia
mkinitramfs: failed to determine device for /
mkinitramfs: workaround is MODULES=most, check:
grep -r MODULES /var/lib/kdump/initramfs-tools
Error please report bug on initramfs-tools
Include the output of 'mount' and 'cat /proc/mounts'
update-initramfs: failed for /var/lib/kdump/initrd.img-5.15.0-1040-nvidia with 1.
run-parts: /etc/kernel/postinst.d/kdump-tools exited with return code 1
dpkg: error processing package linux-image-5.15.0-1040-nvidia (--configure):
installed linux-image-5.15.0-1040-nvidia package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
linux-image-5.15.0-1040-nvidia
E: Sub-process /usr/bin/dpkg returned an error code (1)