Installation errors out when installing in a chroot

Bug #2043059 reported by Brad Figg
This bug affects 1 person
Affects                 Status         Importance   Assigned to    Milestone
kdump-tools (Ubuntu)    Fix Released   Undecided    dann frazier
  Jammy                 Fix Released   Undecided    dann frazier
  Mantic                Fix Released   Undecided    dann frazier
  Noble                 Fix Released   Undecided    dann frazier
linux-nvidia (Ubuntu)   Invalid        Undecided    Unassigned
  Jammy                 Invalid        Undecided    Unassigned
  Mantic                Invalid        Undecided    Unassigned
  Noble                 Invalid        Undecided    Unassigned

Bug Description

[Impact]
When installing in a chroot environment, the kdump-tools kernel hook will fail. This breaks certain OS image creation tools, such as BCM (which I assume is NVIDIA Base Command Manager from context). While BCM's image generation tool requires some additional tweaks to consume this, the proposed fix is necessary, and should be sufficient for other tools that use chroots.

[Test Plan]
1) debootstrap a chroot environment for the target Ubuntu release
2) Install kdump-tools within that environment
3) Install a kernel package and make sure it succeeds.
4) Do the same steps in a virtual machine, but with `ischroot` symlinked to /bin/true at install time. This should simulate a system that was just installed with an image prepared as above.
5) Then restore ischroot, and reboot.
6) Confirm the systemd service does generate an initramfs at boot, and
7) that a crash dump can be triggered afterwards.
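
Steps 1-3 of the test plan could be scripted roughly as below. This is only a sketch: `run`, `chroot_install_test` and the `DRY_RUN` variable are hypothetical names, `linux-image-generic` is an assumed kernel package, and a real run needs root and may also need /proc, /sys and /dev mounted inside the chroot.

```shell
#!/bin/sh
# Sketch of test plan steps 1-3. run(), chroot_install_test() and DRY_RUN
# are illustrative names, not part of kdump-tools or debootstrap.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

chroot_install_test() {
    release=$1                          # e.g. jammy
    target=$2                           # directory for the new chroot
    kernel=${3:-linux-image-generic}    # assumed kernel package name

    # 1) create the chroot, 2) install kdump-tools, 3) install a kernel
    run debootstrap "$release" "$target" &&
    run chroot "$target" apt-get update &&
    run chroot "$target" env DEBIAN_FRONTEND=noninteractive \
        apt-get install -y kdump-tools &&
    run chroot "$target" env DEBIAN_FRONTEND=noninteractive \
        apt-get install -y "$kernel"
}
```

Calling `DRY_RUN=1 chroot_install_test jammy /srv/chroot/jammy` only prints the four commands, which is a cheap way to review them before running them as root.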

[Where Problems Could Occur]
The solution we implemented in noble was simply to detect a chroot environment with ischroot, and disable initramfs generation. The systemd service will detect that an initrd is missing, and generate one on first boot. A source of potential problems is if that does not happen for some reason, resulting in a kdump-tools service failure on first boot.
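
For illustration, the guard described above might look something like the following. It is a sketch, not the code that was uploaded: `in_chroot`, `maybe_generate_initrd` and the `ISCHROOT_CMD` override are hypothetical names. It relies on `ischroot` exiting 0 inside a chroot; any other exit status (including 2, "detection not possible") is treated here as "not a chroot".

```shell
#!/bin/sh
# Hypothetical sketch of a chroot guard for a kernel postinst hook.
# ISCHROOT_CMD is an illustrative override used for testing only;
# it is not a kdump-tools setting.

in_chroot() {
    # ischroot exits 0 when running in a chroot.
    "${ISCHROOT_CMD:-ischroot}" >/dev/null 2>&1
}

maybe_generate_initrd() {
    if in_chroot; then
        # Leave initramfs generation to the service on first boot.
        echo "chroot detected: skipping initramfs generation"
        return 0
    fi
    echo "generating initramfs"
    # A real hook would build the kdump initramfs at this point.
}
```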

Sam Tannous (stannous) wrote :

The complete recreate steps with errors are the following:

Steps to reproduce:
1. Create a directory on a BCM head node or any Linux server, say /cm/images/dgx-h100-image

2. Download and unpack the DGX-OS base tar archive into the above directory

$ wget http://bright-dev.nvidia.com/base-distributions/x86_64/dgx-os/dgx-os-6.1-trd4/DGXOS-6.1.0-DGX-H100.tar.gz -P /tmp

$ cd /cm/images/dgx-h100-image

$ tar -xvzf /tmp/DGXOS-6.1.0-DGX-H100.tar.gz

3. Install the latest kernel after chroot-ing to /cm/images/dgx-h100-image

$ cm-chroot-sw-img /cm/images/dgx-h100-image (Note: cm-chroot-sw-img is just a wrapper around chroot that additionally takes care of mounting the virtual filesystems such as /proc /sys etc. in the target directory)

$ apt-get update
$ apt-get install linux-image-5.15.0-1040-nvidia
Setting up linux-image-5.15.0-1040-nvidia (5.15.0-1040.40) ...
Processing triggers for linux-image-5.15.0-1040-nvidia (5.15.0-1040.40) ...
/etc/kernel/postinst.d/dkms:
 * dkms: running auto installation service for kernel 5.15.0-1040-nvidia
   ...done.
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-5.15.0-1040-nvidia
cryptsetup: WARNING: Couldn't determine root device
W: Couldn't identify type of root file system for fsck hook
cp: cannot stat '/etc/iscsi/initiatorname.iscsi': No such file or directory
/etc/kernel/postinst.d/kdump-tools:
kdump-tools: Generating /var/lib/kdump/initrd.img-5.15.0-1040-nvidia
mkinitramfs: failed to determine device for /
mkinitramfs: workaround is MODULES=most, check:
grep -r MODULES /var/lib/kdump/initramfs-tools

Error please report bug on initramfs-tools
Include the output of 'mount' and 'cat /proc/mounts'
update-initramfs: failed for /var/lib/kdump/initrd.img-5.15.0-1040-nvidia with 1.
run-parts: /etc/kernel/postinst.d/kdump-tools exited with return code 1
dpkg: error processing package linux-image-5.15.0-1040-nvidia (--configure):
 installed linux-image-5.15.0-1040-nvidia package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
 linux-image-5.15.0-1040-nvidia
E: Sub-process /usr/bin/dpkg returned an error code (1)

Jamie Nguyen (jamien) wrote :

I believe the key to the reproducer is having the kdump-tools package installed in the chroot. The kdump-tools package provides a /etc/kernel/postinst.d/kdump-tools script, which has the following snippet:

if test "${MODULES-most}" = most; then
    # Switch from "most" to "dep" to reduce the size of the initramfs.
    # "netboot" and "list" are expected to be already small enough.
    KDUMP_MODULES=dep
fi

# We need a modified copy of initramfs-tools directory
# with MODULES=dep in initramfs.conf
if [ ! -d "$kdumpdir" ];then
    mkdir -p "$kdumpdir"
fi
# Force re-creation of $kdumpdir/initramfs-tools
# in case the source has changed since last time
# we ran
if [ -d "$kdumpdir/initramfs-tools" ];then
    rm -Rf $kdumpdir/initramfs-tools
fi
cp -pr /etc/initramfs-tools "$kdumpdir"

initramfsdir="$kdumpdir/initramfs-tools"

if test -n "${KDUMP_MODULES-}" -a "${KDUMP_MODULES-}" != "${MODULES}"; then
    mkdir -p "$initramfsdir/conf.d"
    echo "MODULES=${KDUMP_MODULES}" > "$initramfsdir/conf.d/zzz-kdump"
fi

This changes the initramfs-tools default from "MODULES=most" to "MODULES=dep" before generating the initrd. "MODULES=dep" fails in a chroot because there's no way to map '/' back to the block device that it resides on.
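
To see which MODULES value the kdump copy of the configuration ends up with, something like the helper below could be used. It is a simplified sketch: `effective_modules` is a hypothetical name, and it only sources initramfs.conf and conf.d/* from the given directory, which is an assumption about how the settings are layered rather than a copy of what mkinitramfs does.

```shell
#!/bin/sh
# Hypothetical helper: print the MODULES value that results from sourcing
# initramfs.conf and then conf.d/* in a given initramfs-tools directory.
effective_modules() {
    confdir=${1:-/var/lib/kdump/initramfs-tools}
    (
        # Subshell, so sourced settings do not leak into the caller.
        MODULES=most   # assumed default when no file sets it
        for f in "$confdir/initramfs.conf" "$confdir"/conf.d/*; do
            if [ -f "$f" ]; then
                . "$f"
            fi
        done
        echo "$MODULES"
    )
}
```

With the zzz-kdump file from the snippet above in place, the later conf.d entry wins and the function reports `dep`.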

Ian May (ian-may) wrote :

I don't appear to have access to the image file used in the reproducer.
http://bright-dev.nvidia.com/base-distributions/x86_64/dgx-os/dgx-os-6.1-trd4/DGXOS-6.1.0-DGX-H100.tar.gz

So instead I'm using the following image for reproducing.
https://cloud-images.ubuntu.com/jammy/20231027/jammy-server-cloudimg-amd64.tar.gz

The error indicates to me that it can't find the root device. If I don't bind mount /dev into my image, I'm able to recreate the error with both linux-generic and linux-nvidia. With the host /dev mounted into the chroot both kernels are able to call mkinitramfs successfully.

Can you confirm that 'cm-chroot-sw-img' is mounting /dev?
mount | grep /cm/images/dgx-h100-image/dev

If we are lucky and it happens to not be mounted could you try the following:
sudo mount --bind /dev /cm/images/dgx-h100-image/dev
sudo chroot /cm/images/dgx-h100-image
/etc/kernel/postinst.d/kdump-tools 5.15.0-1040-nvidia

If /dev is correctly mounted and the problem persists, I'll probably need a way to get that image tar to investigate further.
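
The mount check above can also be done without grep-ing by eye. A small sketch, where `is_mount_point` is a hypothetical helper that looks for an exact mount point in /proc/mounts (or in another file with the same layout):

```shell
#!/bin/sh
# Hypothetical helper: succeed if PATH appears as a mount point (field 2)
# in a mount table formatted like /proc/mounts.
is_mount_point() {
    path=$1
    table=${2:-/proc/mounts}
    awk -v p="$path" '$2 == p { found = 1 } END { exit !found }' "$table"
}
```

For example, `is_mount_point /cm/images/dgx-h100-image/dev || echo "/dev is not mounted in the image"`.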

Thanks,
Ian

Sam Tannous (stannous) wrote :

Hello Ian,

It looks like we are mounting /dev. Here is how /dev is mounted when you use the cm-chroot-sw-img command:

jd-b91-s15sp4-11-07:~ # mount | grep image
udev on /cm/images/default-image/dev type devtmpfs (rw,relatime,size=4096k,nr_inodes=1048576,mode=755,inode64)
devpts on /cm/images/default-image/dev/pts type devpts (rw,relatime,gid=5,mode=620,ptmxmode=000)
proc on /cm/images/default-image/proc type proc (rw,relatime)
sysfs on /cm/images/default-image/sys type sysfs (rw,relatime)
tmpfs on /cm/images/default-image/run type tmpfs (rw,relatime,inode64)

It looks like the tar file DGXOS-6.1.0-DGX-H100.tar.gz is too large to upload here.
If you have an alternate location, I can upload it there.

--Sam

Sam Tannous (stannous) wrote :

Hello Ian,

I've uploaded the cm-chroot-sw-img script here.

Please let me know if you need more info.

--Sam

dann frazier (dannf) wrote :

Thanks for providing that script, Sam. Jamie has nailed the problem in Comment #2. A *seemingly* obvious solution is to add code to the /etc/kernel/postinst.d/kdump-tools hook that detects when it is running in a chroot and, if so, exits before trying to make an initramfs. When the real system boots, the kdump-tools service should detect that no initramfs exists for the running kernel, and it can generate a proper one at that time.
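
The first-boot side of that idea amounts to a check along these lines. This is a sketch only: `kdump_initrd_missing` is a hypothetical name, and the /var/lib/kdump/initrd.img-<version> path is taken from the hook output earlier in this report.

```shell
#!/bin/sh
# Hypothetical check: succeed when no kdump initrd exists for a kernel.
kdump_initrd_missing() {
    kver=${1:-$(uname -r)}          # default: the running kernel
    kdumpdir=${2:-/var/lib/kdump}   # where the hook writes its initrd
    [ ! -e "$kdumpdir/initrd.img-$kver" ]
}
```

A service could call `kdump_initrd_missing` at start-up and regenerate the initrd only when it returns success.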

There are 2 tools in Debian/Ubuntu that packages use to detect if you are running in a chroot - `ischroot` and `systemd-detect-virt --chroot`. So I asked the team to test this out:
  https://salsa.debian.org/dannf/kdump-tools/-/commit/2b70360d6aeaa0874e2cbd917f39e3fcaa3f56be

This seems to do the right thing, but it appears not to work with your cm-chroot-sw-img script, which fools both chroot detection tools. This is due to the use of `unshare --mount-proc`. I believe that causes unshare to place everything in a new mount namespace, whereas these tools rely on pid 1's "/" being different from the chroot'd "/" to detect the chroot.
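
The comparison described here can be approximated in shell as below. This is a rough sketch of the idea, not the code of either tool: `same_root` is a hypothetical name, it assumes GNU `stat`, and reading /proc/1/root normally requires root.

```shell
#!/bin/sh
# Hypothetical sketch: compare device and inode of two directories.
# With the defaults it compares our "/" with pid 1's "/".
# Returns 0 if they are the same, 1 if they differ, 2 if stat fails.
same_root() {
    ours=${1:-/}
    init=${2:-/proc/1/root}
    a=$(stat -L -c '%d:%i' "$ours") || return 2
    b=$(stat -L -c '%d:%i' "$init") || return 2
    [ "$a" = "$b" ]
}
```

A result of 1 would suggest a chroot. If pid 1 as seen from inside the wrapper shares our "/", the result is 0 and the chroot goes undetected.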

If you'd be able to remove --mount-proc, then we can look into adding such code.

An alternative option is to add a diversion for /etc/kernel/postinst.d/kdump-tools before installing packages in your chroot:

$ sudo dpkg-divert --divert /etc/kernel/postinst.d/kdump-tools.disabled \
                   --rename /etc/kernel/postinst.d/kdump-tools

run-parts won't exec hooks with a "." in the name, so this will prevent the hook from firing until the diversion is later removed. You can add this diversion whether or not kdump-tools is installed. If the diversion is in place, it will rename the file for you automatically when you do install kdump-tools.
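
As a quick illustration of that naming rule, the sketch below mimics run-parts' default filter (names made up only of ASCII letters, digits, underscores and hyphens). `run_parts_accepts` is a hypothetical helper, and it does not cover the --lsbsysinit or --regex modes.

```shell
#!/bin/sh
# Hypothetical helper: succeed if run-parts' default naming rule would
# accept the given file name (letters, digits, underscore, hyphen only).
run_parts_accepts() {
    case $1 in
        ''|*[!A-Za-z0-9_-]*) return 1 ;;   # empty, or contains another character
        *) return 0 ;;
    esac
}
```

Here `run_parts_accepts kdump-tools` succeeds, while `run_parts_accepts kdump-tools.disabled` fails because of the ".".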

Once you are done installing packages, you can then remove the diversion to re-enable the hook:

$ sudo dpkg-divert --rename --remove /etc/kernel/postinst.d/kdump-tools
Removing 'local diversion of /etc/kernel/postinst.d/kdump-tools to /etc/kernel/postinst.d/kdump-tools.disabled'

Sam Tannous (stannous) wrote :

Thanks, Dann,

This dpkg-divert won't work unless we first remove the package this "clashes" with (nvidia-crashdump had already diverted this file). Not sure if this is really much better than just removing kdump-tools (and dependencies) and then reinstalling them later.
We're still looking into removing the `unshare --mount-proc`. Perhaps this would be a better long term solution.

dann frazier (dannf)
Changed in linux-nvidia (Ubuntu):
status: New → Invalid
Changed in kdump-tools (Ubuntu):
status: New → Triaged
assignee: nobody → dann frazier (dannf)
dann frazier (dannf) wrote :

hey Sam,

I looked into ways to try to teach the chroot detection tools how to detect a chroot in a new pid namespace (which is actually the problem, not the mount namespace), but I didn't find a good solution there.

However, it occurs to me that you can simply override the answer:

ln -sf ../../bin/true ${SW_IMG}/usr/local/bin/ischroot
echo 'DPkg::Path "/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin";' > /etc/apt/apt.conf.d/99-usr-local-dpkg-path
unshare -p -u -f --mount-proc="${SW_IMG}/proc" chroot "${SW_IMG}" "${CHROOT_SCRIPT}"
EXIT_CODE=$?
rm ${SW_IMG}/etc/apt/apt.conf.d/99-usr-local-dpkg-path
rm ${SW_IMG}/usr/local/bin/ischroot

That would have the added benefit of solving the problem for other/future packages as well.

If we were to update kdump-tools to detect a chroot as described in comment #7 (test build in https://launchpad.net/~dannf/+archive/ubuntu/lp2043059 ), could you update your script to do the above?

Sam Tannous (stannous) wrote :

Hey Dann,

This seems to work ok if I make the changes in our script.
(one minor change in the line below to add ${SW_IMG})

    echo 'DPkg::Path "/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin";' > ${SW_IMG}/etc/apt/apt.conf.d/99-usr-local-dpkg-path
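
Combining this corrected line with the earlier snippet, the override could be wrapped roughly as follows. It is a sketch under assumptions: `with_ischroot_override` is a hypothetical name, the `mkdir -p` is added here for safety, and the real script would pass its `unshare ... chroot ...` invocation as the command.

```shell
#!/bin/sh
# Hypothetical wrapper: force ischroot to report "in a chroot" inside an
# image while a command runs, then remove the override again.
with_ischroot_override() {
    img=$1
    shift
    conf="$img/etc/apt/apt.conf.d/99-usr-local-dpkg-path"
    link="$img/usr/local/bin/ischroot"

    mkdir -p "$img/usr/local/bin" "$img/etc/apt/apt.conf.d"
    ln -sf ../../bin/true "$link"
    echo 'DPkg::Path "/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin";' > "$conf"

    # Run the caller's command and remember its exit status.
    rc=0
    "$@" || rc=$?

    rm -f "$conf" "$link"
    return "$rc"
}
```

Usage would be along the lines of `with_ischroot_override "${SW_IMG}" unshare -p -u -f --mount-proc="${SW_IMG}/proc" chroot "${SW_IMG}" "${CHROOT_SCRIPT}"`.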

Let me propose this script change to the owners and I'll get back to you.

(wouldn't the check 'unshare -U true' be a better check than ischroot since this would work with any script in any linux kernel (post 2013)?)

Thanks,
Sam

dann frazier (dannf) wrote :

Cool, thanks for testing Sam. Apologies for the missing ${SW_IMG}.

And thanks for the 'unshare -U true' suggestion. I gather your point is that 'unshare -U' will fail in any chroot environment. I'm not an expert on namespaces but, from what I can tell, that does seem to be true in all modern kernels as you say. However, 'unshare -U' can fail for other reasons too - say, if the admin set /proc/sys/user/max_user_namespaces to 0, or when running a kernel with CONFIG_USER_NS=n. If we assume all failures mean we are in a chroot, then we risk not configuring kdump in such environments.
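
One of those other causes can at least be checked directly. A small sketch, where `userns_limit_is_zero` is a hypothetical helper; it only covers the sysctl case and says nothing about CONFIG_USER_NS=n or other failure modes.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the user namespace limit is 0, in which
# case a failing 'unshare -U' says nothing about being in a chroot.
# Returns 2 if the limit file cannot be read.
userns_limit_is_zero() {
    limit_file=${1:-/proc/sys/user/max_user_namespaces}
    [ -r "$limit_file" ] || return 2
    [ "$(cat "$limit_file")" -eq 0 ]
}
```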

Sam Tannous (stannous) wrote :

You're correct. It can fail for other reasons. I don't think we could check the errno, could we? It would be 1, "Operation not permitted".

If not, I think we can live with the last suggestion.

Thanks,

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package kdump-tools - 1:1.8.2ubuntu1

---------------
kdump-tools (1:1.8.2ubuntu1) noble; urgency=medium

  * Merge from Debian unstable. Remaining changes:
    - debian/control/tests: sleep while waiting for systemd service.
    - Bump amd64 crashkernel from 384M-:128M to 512M-:192M.
    - Add a systemd-resolved service dependency in order kdump-tools is able
      to resolve DNS when in kdump boot.
    - Update default s390x crashkernel.
      - Install the updated zipl.conf with ucf, so users will be able to
        decide whether to pick any crashkernel changes.

kdump-tools (1:1.8.2) unstable; urgency=medium

  * Disable the initramfs generation in our kernel-postinst hook when
    we detect we are running in a chroot. LP: #2043059.

 -- dann frazier <email address hidden> Fri, 15 Dec 2023 15:06:42 -0700

Changed in kdump-tools (Ubuntu):
status: Triaged → Fix Released
dann frazier (dannf) wrote : Re: [Bug 2043059] Re: Installation errors out when installing in a chroot

On Fri, Dec 15, 2023 at 3:55 PM Sam Tannous <email address hidden> wrote:
>
> You're correct. It can fail for other reasons. I don't think we could check the errno, could we? It would be 1, "Operation not permitted".

The exit code is not documented in the manpage, and it appears to exit
1 with other failure types:

root@dannf-kdump-chroot:~# unshare -U true
unshare: unshare failed: No space left on device
root@dannf-kdump-chroot:~# echo $?
1

I also suspect this is not the only situation that could return -EPERM.

> If not, I think we can live with the last suggestion.

ACK. As you can see above, I've uploaded that change to noble, and
will begin the SRU process back to jammy.

  -dann

dann frazier (dannf)
Changed in linux-nvidia (Ubuntu Mantic):
status: New → Invalid
Changed in linux-nvidia (Ubuntu Jammy):
status: New → Invalid
Changed in kdump-tools (Ubuntu Mantic):
assignee: nobody → dann frazier (dannf)
Changed in kdump-tools (Ubuntu Jammy):
assignee: nobody → dann frazier (dannf)
Changed in kdump-tools (Ubuntu Mantic):
status: New → In Progress
Changed in kdump-tools (Ubuntu Jammy):
status: New → In Progress
dann frazier (dannf)
description: updated
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Brad, or anyone else affected,

Accepted kdump-tools into mantic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/kdump-tools/1:1.8.1ubuntu1.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-mantic to verification-done-mantic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-mantic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in kdump-tools (Ubuntu Mantic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-mantic
Changed in kdump-tools (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed-jammy
Timo Aaltonen (tjaalton) wrote :

Hello Brad, or anyone else affected,

Accepted kdump-tools into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/kdump-tools/1:1.6.10ubuntu2.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

dann frazier (dannf) wrote :

# verification steps 1-3: installation in a chroot

I set up 2 chroots - one w/ the normal sources.list, the other w/ proposed enabled, and did a side by side install test.

$ for chroot in mantic mantic-proposed jammy jammy-proposed; do sudo chroot $chroot apt list kdump-tools; done
Listing... Done
kdump-tools/mantic,now 1:1.8.1ubuntu1 amd64 [installed]
Listing... Done
kdump-tools/mantic-proposed,now 1:1.8.1ubuntu1.1 amd64 [installed]
N: There is 1 additional version. Please use the '-a' switch to see it
Listing... Done
kdump-tools/jammy-updates,now 1:1.6.10ubuntu2.1 amd64 [installed]
N: There is 1 additional version. Please use the '-a' switch to see it
Listing... Done
kdump-tools/jammy-proposed,now 1:1.6.10ubuntu2.2 amd64 [installed]
N: There are 2 additional versions. Please use the '-a' switch to see them.

## mantic

### first w/ only updates enabled

dannf@lakitu:~$ sudo chroot mantic-proposed apt install kdump-tools
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
kdump-tools is already the newest version (1:1.8.1ubuntu1).
0 upgraded, 0 newly installed, 0 to remove and 40 not upgraded.
1 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n]
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
 LANGUAGE = (unset),
 LC_ALL = (unset),
 LANG = "en_US.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
locale: Cannot set LC_CTYPE to default locale: No such file or directory
locale: Cannot set LC_MESSAGES to default locale: No such file or directory
locale: Cannot set LC_ALL to default locale: No such file or directory
E: Can not write log (Is /dev/pts mounted?) - posix_openpt (19: No such device)
Setting up linux-image-6.5.0-15-generic (6.5.0-15.15) ...
Processing triggers for linux-image-6.5.0-15-generic (6.5.0-15.15) ...
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-6.5.0-15-generic
/etc/kernel/postinst.d/kdump-tools:
kdump-tools: Generating /var/lib/kdump/initrd.img-6.5.0-15-generic
mkinitramfs: failed to determine device for /
mkinitramfs: workaround is MODULES=most, check:
grep -r MODULES /var/lib/kdump/initramfs-tools

Error please report bug on initramfs-tools
Include the output of 'mount' and 'cat /proc/mounts'
update-initramfs: failed for /var/lib/kdump/initrd.img-6.5.0-15-generic with 1.
run-parts: /etc/kernel/postinst.d/kdump-tools exited with return code 1
dpkg: error processing package linux-image-6.5.0-15-generic (--configure):
 installed linux-image-6.5.0-15-generic package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
 linux-image-6.5.0-15-generic
E: Sub-process /usr/bin/dpkg returned an error code (1)

### and now w/ proposed

$ sudo chroot mantic-proposed apt install kdump-tools -t mantic-proposed
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages will be upgraded:
  kdump-tools
1 upgraded, 0 newly installed, 0 to remove and 50 not upg...


dann frazier (dannf) wrote :
tags: added: verification-done verification-done-jammy verification-done-mantic
removed: verification-needed verification-needed-jammy verification-needed-mantic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package kdump-tools - 1:1.8.1ubuntu1.1

---------------
kdump-tools (1:1.8.1ubuntu1.1) mantic; urgency=medium

  * Disable the initramfs generation in our kernel-postinst hook when
    we detect we are running in a chroot. LP: #2043059.

 -- dann frazier <email address hidden> Thu, 01 Feb 2024 07:33:05 -0700

Changed in kdump-tools (Ubuntu Mantic):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for kdump-tools has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package kdump-tools - 1:1.6.10ubuntu2.2

---------------
kdump-tools (1:1.6.10ubuntu2.2) jammy; urgency=medium

  * Disable the initramfs generation in our kernel-postinst hook when
    we detect we are running in a chroot. LP: #2043059.

 -- dann frazier <email address hidden> Thu, 01 Feb 2024 07:34:55 -0700

Changed in kdump-tools (Ubuntu Jammy):
status: Fix Committed → Fix Released