Infinite-loops in fsck when booting with damaged /

Bug #501801 reported by Chris Halse Rogers
This bug affects 16 people
Affects            Status        Importance  Assigned to                      Milestone
mountall (Ubuntu)  Fix Released  High        Scott James Remnant (Canonical)
Lucid              Fix Released  High        Scott James Remnant (Canonical)

Bug Description

Binary package hint: upstart

This might actually need to be filed against mountall, but I'll start it here.

If the root filesystem is dirty in such a way that fsck returns "filesystem needs repair, please re-run interactively", upstart will infinite-loop trying to mount /, having fsck fail, and then trying again.

ProblemType: Bug
Architecture: amd64
Date: Thu Dec 31 12:27:59 2009
DistroRelease: Ubuntu 10.04
Package: upstart 0.6.3-11
ProcEnviron:
 LANGUAGE=en_AU.UTF-8
 PATH=(custom, user)
 LANG=en_AU.UTF-8
 SHELL=/bin/zsh
ProcVersionSignature: Ubuntu 2.6.32-9.13-generic
SourcePackage: upstart
Tags: lucid
Uname: Linux 2.6.32-9-generic x86_64

Chris Halse Rogers (raof) wrote :
Scott James Remnant (Canonical) (canonical-scott) wrote :

mountall, but already fixed

affects: upstart (Ubuntu) → mountall (Ubuntu)
Changed in mountall (Ubuntu):
status: New → Fix Released
Chris Jones (cmsj) wrote :

I ran into this on a user's lucid laptop this morning, so re-opening

Changed in mountall (Ubuntu):
status: Fix Released → Confirmed
Scott James Remnant (Canonical) (canonical-scott) wrote : Re: [Bug 501801] Re: Infinite-loops in fsck when booting with damaged /

On Fri, 2010-04-09 at 10:30 +0000, Chris Jones wrote:

> I ran into this on a user's lucid laptop this morning, so re-opening
>
Could you run mountall --debug and capture the output? (It should end up
in /var/log/boot.log now.)

 status incomplete

Scott
--
Scott James Remnant
<email address hidden>

Changed in mountall (Ubuntu):
status: Confirmed → Incomplete
Matthew Barker (matthew-barker) wrote :

On 09/04/10 13:05, Scott James Remnant wrote:
> mountall --debug

Log attached.

--

Matthew Barker
Corporate Services, Canonical Ltd.

Tel +44(0)20 7630 2492 | Mob +44(0)7809 389 876
21-24 Millbank | Floor 27 Millbank Tower | London | SW1P 4QP

ubuntu.com | canonical.com

Matt Zimmerman (mdz) wrote :

I saw this happen on Matt Barker's ThinkPad X200, though it definitely wasn't running the latest Lucid bits at the time. I'll check which mountall version he was running. Which one is believed to fix this problem?

Matt Zimmerman (mdz)
Changed in mountall (Ubuntu):
status: Incomplete → Triaged
importance: Undecided → High
Changed in mountall (Ubuntu):
assignee: nobody → Scott James Remnant (scott)
Oliver Grawert (ogra) wrote :

Here is a log from a BeagleBoard, as requested.

Scott James Remnant (Canonical) (canonical-scott) wrote :

Ok, this one is quite easy.

The fsck fails and, because it opened the block device for writing, udev tells mountall to try the device again, which spawns fsck once more.

We shouldn't spawn fsck or mount while mnt->error is set, and the Plymouth key-handling function should clear the error before re-running fsck with the fix option.
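
The guard described in this comment can be sketched in C roughly as follows. This is only an illustration, not mountall's actual code: the `error` field mirrors the `mnt->error` named above, but the `Mount` layout and the functions `spawn_fsck`, `try_fsck`, `on_udev_change` and `on_user_fix` are hypothetical names, and the fsck stub simply pretends that only a fixing run succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for mountall's per-filesystem record. */
typedef struct {
    const char *mountpoint;
    bool error;     /* set when fsck failed and the user has not responded yet */
    int fsck_runs;  /* how many times fsck has been spawned (for illustration) */
} Mount;

/* Stub for spawning fsck. In this toy model, only a run with fix enabled
 * succeeds; a real implementation would fork and inspect the exit status. */
bool spawn_fsck(Mount *mnt, bool fix)
{
    mnt->fsck_runs++;
    return fix;
}

/* Guarded entry point: never spawn fsck while an error is still pending. */
bool try_fsck(Mount *mnt, bool fix)
{
    assert(mnt != NULL);
    if (mnt->error)
        return false;          /* uncleared error: do nothing, break the loop */
    if (!spawn_fsck(mnt, fix)) {
        mnt->error = true;     /* remember the failure until the user reacts */
        return false;
    }
    return true;
}

/* udev reported a change on the block device (e.g. after fsck closed it). */
void on_udev_change(Mount *mnt)
{
    try_fsck(mnt, false);      /* ignored while mnt->error is set */
}

/* The user asked for a repair at the boot prompt: clear the error first,
 * then run fsck again with fixing enabled. */
bool on_user_fix(Mount *mnt)
{
    mnt->error = false;
    return try_fsck(mnt, true);
}
```

With a fresh record, a failing `try_fsck(&root, false)` sets `error`; any number of subsequent `on_udev_change(&root)` calls leave `fsck_runs` at 1, and only `on_user_fix(&root)` triggers a second run.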

Changed in mountall (Ubuntu):
status: Triaged → Fix Committed
milestone: none → ubuntu-10.04
Paul Sladen (sladen) wrote :

Scott: does that fix the -server case where plymouth was never present/installed/enabled in the first place?

Scott James Remnant (Canonical) (canonical-scott) wrote :

On Fri, 2010-04-16 at 22:14 +0000, Paul Sladen wrote:

> Scott: does that fix the -server case where plymouth was never
> present/installed/enabled in the first place?
>
mountall Depends: plymouth

So this is an impossible situation to get into.

Scott
--
Scott James Remnant
<email address hidden>

Scott James Remnant (Canonical) (canonical-scott) wrote :

I've uploaded a new version of mountall (2.13~ppa1) to https://launchpad.net/~scott/+archive/ppa

Please test and see whether it solves this problem.

Thanks

Thaylin (thaylinsgate) wrote :

I'm seeing the same issue. What are my options now that this is happening?

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package mountall - 2.13

---------------
mountall (2.13) lucid; urgency=low

  [ Scott James Remnant ]
  * Once a mountpoint has been skipped, don't try and mount it again
    (unless the udev device actually shows up). LP: #553290.
  * Skipping a filesystem means we should also skip anything that depends
    on that (ie. skip /usr/local when skipping /usr).
  * Don't skip filesystems due to timeout when Plymouth not available.

  * Don't run mount, swapon or fsck while there's an uncleared error on
    the filesystem. LP: #501801.

  * Don't display the filesystem check message when an fsck completes
    without needing to check the filesystem. LP: #564434.
 -- Steve Langasek <email address hidden> Mon, 19 Apr 2010 00:15:58 -0700
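
The changelog rule that skipping a filesystem also skips everything depending on it (skip /usr/local when skipping /usr) can be illustrated with a small C sketch. It assumes that "depends on" means only "is mounted below"; mountall's real dependency tracking may consider more than path prefixes, and `FsEntry`, `is_below` and `skip_with_dependents` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: one mountpoint and whether it has been skipped. */
typedef struct {
    const char *mountpoint;
    bool skipped;
} FsEntry;

/* True when path lies strictly below dir: "/usr/local" is below "/usr",
 * but "/usrlocal" and "/usr" itself are not. */
bool is_below(const char *path, const char *dir)
{
    size_t n = strlen(dir);
    if (strncmp(path, dir, n) != 0)
        return false;
    if (n > 0 && dir[n - 1] == '/')   /* dir is "/" or ends with a slash */
        return path[n] != '\0';
    return path[n] == '/';
}

/* Mark mountpoint and everything mounted below it as skipped.
 * Returns the number of entries that were newly marked. */
size_t skip_with_dependents(FsEntry *tab, size_t count, const char *mountpoint)
{
    size_t newly_skipped = 0;
    assert(tab != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        bool hit = strcmp(tab[i].mountpoint, mountpoint) == 0
                || is_below(tab[i].mountpoint, mountpoint);
        if (hit && !tab[i].skipped) {
            tab[i].skipped = true;
            newly_skipped++;
        }
    }
    return newly_skipped;
}
```

The separator check in `is_below` is what keeps an unrelated mountpoint such as /usrlocal from being skipped along with /usr.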

Changed in mountall (Ubuntu Lucid):
status: Fix Committed → Fix Released
Björn Schließmann (b-schliessmann) wrote :

Bug is not fixed for me in mountall 2.13. Issue I had yesterday:

- Resuming the system failed, so I had to reboot
- Linux' pm_trace is active so system date is set to some remote future date
- While rebooting (without "splash" in kernel command line), root is mounted; fsck finds unexpected inconsistency
- System hangs. Pressing ESC repeats the last point. Booting not possible.

Booting Linux with "single" in kernel command line leads to the exact same result. Pardon my expression, but this is most ridiculous. I had to make a bootable USB FD and fsck my root manually.

Scott James Remnant (Canonical) (canonical-scott) wrote :

On Sat, 2010-04-24 at 22:30 +0000, Björn Schließmann wrote:

> Bug is not fixed for me in mountall 2.13. Issue I had yesterday:
>
> - Resuming the system failed, so I had to reboot
> - Linux' pm_trace is active so system date is set to some remote future date
> - While rebooting (without "splash" in kernel command line), root is mounted; fsck finds unexpected inconsistency
> - System hangs. Pressing ESC repeats the last point. Booting not possible.
>
> Booting Linux with "single" in kernel command line leads to the exact
> same result. Pardon my expression, but this is most ridiculous. I had to
> make a bootable USB FD and fsck my root manually.
>
Then you are most likely experiencing a different problem than the
original reporter. Please file a new bug with "ubuntu-bug mountall",
and describe in detail what you see on the screen during the boot.

Scott
--
Scott James Remnant
<email address hidden>

Mirko Nasato (mirko-nasato) wrote :

Similar problem here: planned disk checks at boot get stuck in an infinite loop on my T61. The progress % displayed on the boot screen (is that what "plymouth" is?) keeps increasing, but at a slower and slower pace.

Hitting ESC shows the fsck output, and it keeps re-checking the same partition (/dev/sda1)! Seems like it just doesn't realize it's just been checked. See attached picture.

Should I open a new issue? There are a bunch of similar issues that have been closed as duplicate of this one.

Romano Giannetti (romano-giannetti) wrote :

Me too. Not fixed, it seems.

Grr... the upgrade from Karmic to Lucid is turning out to be the most painful ever.

Mirko Nasato (mirko-nasato) wrote :

Opened bug #580060 because this one is marked as fixed and I have a feeling it won't be reopened.
