entering 'reboot' at the rescue console brings up gdm

Bug #66002 reported by Reinhard Tartler
Affects            Status         Importance   Assigned to   Milestone
upstart (Fedora)   Won't Fix      Medium
upstart (Ubuntu)   Fix Released   High         Unassigned

Bug Description

Situation: Being dropped to a shell because of an fsck.

There I type 'reboot' to reboot the machine. The result is that gdm is started instead of the machine being rebooted.

System: Freshly installed edgy from daily 13/10/2006.

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Known, use "reboot -f"

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Caused by the fact that the rcS job is still running: reboot issues the "shutdown" event, but that won't actually complete until rcS and then rc2 have finished ... you'll probably find that the moment it hits gdm, it starts shutting down again?

Changed in upstart:
assignee: nobody → keybuk
importance: Undecided → Low
status: Unconfirmed → Confirmed
Revision history for this message
Reinhard Tartler (siretart) wrote :

No, I had to select 'reboot' at the gdm prompt.

Perhaps I was too fast pressing reboot, however (30 secs).

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Oh, right, the gettys would be running and thus save the system from shutdown -- ho-hum :)

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Actually, sorry, scratch that last -- the reboot event can be issued when processes are in start

Changed in upstart:
importance: Low → High
Revision history for this message
Tormod Volden (tormodvolden) wrote :

Seeing the same in Gutsy. It does not start to shut down after starting gdm.

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Need to figure out what's causing this

Changed in upstart:
assignee: keybuk → nobody
Revision history for this message
Krzysztof Janowski (n00bystance-gmail) wrote :

Isn't it printed on the screen after dropping to maintenance console, that after doing maintenance you have to press Control+D to reboot?

Revision history for this message
Josef Wolf (jw-raven) wrote :

No, this is not printed. And in fact, ctrl-D won't reboot, either. There is really, really no way to properly reboot at this stage.

"reboot -f" is _not_ a proper workaround, since it will not umount/sync filesystems properly.

BTW: This bug still exists in 8.04rc. :-(

Revision history for this message
Tormod Volden (tormodvolden) wrote :

Still there in Ubuntu 8.04.1. I think it's especially bad because fsck tells me to reboot (I guess that's recommended after root file system repair) and when I type "reboot" it starts up everything (which presumably is not recommended, for the same reasons).

Revision history for this message
In , Hans (hans-redhat-bugs) wrote :

Description of problem:
When the fsck of a network-based block device (nbd/iscsi) fails, the netfs
initscript drops the user to a root shell to repair things and then issues
"shutdown -r now".

upstart seems to continue entering the runlevel it was busy entering, despite the reboot command having been given, while at the same time (in parallel??) starting a runlevel switch to the new (reboot) runlevel.

Most noticeable here is that upstart spews messages like:
-process tty1 died, respawning
-process prefdm died, respawning
(not the exact text but like that)

while it is busy shutting down services for the reboot.

Revision history for this message
In , Bill (bill-redhat-bugs) wrote :

Hm, looking at a tty, for example:

start on stopped rc2
start on stopped rc3
start on stopped rc4
start on started prefdm

stop on runlevel 0
stop on runlevel 1
stop on runlevel 6

So, it may start quickly (on the stop of the runlevel) but then stop again (on the start of runlevel 0/1/6)
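
The interplay of these stanzas can be sketched as a small shell model. This is only an illustration and not Upstart code: the function name tty_job_reacts and the event strings passed to it are hypothetical, while the start and stop conditions are copied from the job listed above.

```shell
#!/bin/sh
# Toy model of the tty job's start/stop stanzas (not real Upstart code).
# tty_job_reacts is a hypothetical helper: it prints how the job would
# react to a single event string.
tty_job_reacts() {
    case "$1" in
        "stopped rc2"|"stopped rc3"|"stopped rc4"|"started prefdm")
            echo start ;;
        "runlevel 0"|"runlevel 1"|"runlevel 6")
            echo stop ;;
        *)
            echo ignore ;;
    esac
}

# One possible ordering: the rc3 job stops, then the reboot runlevel arrives.
for event in "stopped rc3" "runlevel 6"; do
    printf '%s -> %s\n' "$event" "$(tty_job_reacts "$event")"
done
```

In this model the getty is started by the first event and stopped again by the second, which would match the "died, respawning" noise seen during shutdown.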

Revision history for this message
In , Bug (bug-redhat-bugs) wrote :

This bug appears to have been reported against 'rawhide' during the Fedora 10 development cycle.
Changing version to '10'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Revision history for this message
Josef Wolf (jw-raven) wrote :

In Ubuntu 8.10, the wording has been changed to reflect the actual behavior. It now reads "press ctrl-D to resume boot". I don't think this is a proper solution to the problem. IMHO there _has_ to be a way to properly shut down without continuing the boot. E.g. one might want to make a copy of the corrupted filesystem before trying to repair it. Continuing the boot is the worst thing one can do in such a situation.

Revision history for this message
Paul O'Malley (ompaul-deactivatedaccount) wrote :

On 8.04.2, remote use of reboot -f leaves the user in server mode, having to make a journey to the on/off button or find someone who can.

Not great when it is a test firewall, but better than it being in production.

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Fedora bug looks like the same thing, just expressed differently

Changed in upstart (Fedora):
status: Unknown → In Progress
Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Ok, so here's my current working theory on this bug.

When you are in single-user mode, the rcS-default job is running: this is what runs "sulogin". Once "sulogin" finishes, this then runs "telinit" to switch to the default runlevel.

If you type "reboot" inside the "sulogin" shell, that will run "shutdown" which will send a runlevel event to change the runlevel (to 6).

Now, what's supposed to happen is this:

 * the rcS-sulogin job is stopped (stop on runlevel)
 * the rcS-sulogin script, sulogin, etc. are sent the TERM signal and die
 * the rc6 job is started (start on runlevel 6)
 * shutdown begins

When I try it, that's what happens.

Now, what I think *MIGHT* happen for some people is:

 * the rcS-sulogin job is stopped (stop on runlevel)
 * the sulogin script gets sent the TERM signal
 * the rcS-sulogin script continues, and gets to run "telinit 2"
 * the rcS-sulogin script gets sent the TERM signal
 * _but_ telinit 2 sends the "runlevel 2" event
 * the rc6 job is stopped (stop on runlevel [!6])
 * the rc2 job is started (start on runlevel 2)
 * normal boot begins

So this would be a race condition.
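
The two orderings above can be compared with a toy shell model. It is a sketch under a simplifying assumption, namely that the most recent runlevel event decides which rc job ends up running; final_rc_job is a hypothetical name and none of this is Upstart code.

```shell
#!/bin/sh
# Toy model of the suspected race (not Upstart code).
# final_rc_job is a hypothetical helper: given the runlevel events in the
# order they are emitted, it prints the rc job left running at the end.
final_rc_job() {
    job=""
    for level in "$@"; do
        # Each runlevel event stops the rc job of any other runlevel and
        # starts the job matching the new runlevel.
        job="rc$level"
    done
    echo "$job"
}

final_rc_job 6      # expected path: only shutdown's "runlevel 6" event
final_rc_job 6 2    # suspected race: "telinit 2" fires after that event
```

The second call models the case where the rcS-sulogin script reaches telinit before it is killed.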

0.3 gives us an easy way to fix this: the job stop cause is in $UPSTART_EVENT for the post-stop script, so all we need to do is move the telinit into post-stop and only run it if $UPSTART_EVENT != runlevel.

Interestingly, this is impossible with 0.5 because it doesn't reveal why a process was stopped, and this makes me realise that we do need that functionality.

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Though also thinking about it, we really want the rc-default job to be run - and therefore the stopping event to know why it stopped ;)

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Moving the telinit code into post-stop and checking $UPSTART_EVENT looks like it does the trick

Changed in upstart (Ubuntu):
status: Confirmed → In Progress
Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

Actually, we have to check $1 as well - $UPSTART_EVENT is not reset if the stop command is by exit, so it still contains "runlevel" because that's what started the job.

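        # Stopped due to a runlevel change (to anything other than S):
        # bail out so that telinit is not run.
        if [ "${UPSTART_EVENT}" = "runlevel" -a "$1" != "S" ]
        then
            exit 100
        fi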

Seems right
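
To try the check outside Upstart, it can be wrapped in a function. This is a sketch only: post_stop_decision is a hypothetical name, its two parameters stand in for $UPSTART_EVENT and $1 as seen by the job script, and it echoes the decision instead of calling telinit.

```shell
#!/bin/sh
# Sketch of the post-stop decision as a plain function (not the job itself).
# post_stop_decision is a hypothetical helper; it only prints what the
# post-stop script would do.
post_stop_decision() {
    stop_event="$1"    # stands in for $UPSTART_EVENT
    runlevel="$2"      # stands in for $1 inside the job script
    if [ "$stop_event" = "runlevel" ] && [ "$runlevel" != "S" ]; then
        echo "skip telinit"
    else
        echo "run telinit"
    fi
}

post_stop_decision runlevel 6   # stopped by reboot/shutdown
post_stop_decision runlevel S   # sulogin exited; event is still "runlevel S"
```

The second case is the reason for the extra test on $1: after a plain exit the event variable still holds "runlevel", so the event name alone cannot distinguish the two situations.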

Revision history for this message
Scott James Remnant (Canonical) (canonical-scott) wrote :

100 being for debugging, obv

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package upstart - 0.3.10-2

---------------
upstart (0.3.10-2) karmic; urgency=low

  * debian/upstart.postinst: Use telinit u to re-exec, rather than
    kill just in case it's not Upstart that's running. LP: #92177.
  * debian/event.d/system-services/tty*: Run getty in 8-bit clean
    mode. LP: #273189.
  * debian/event.d/upstart-compat-sysv/rc-default:
    - Don't use grep -w, instead split on $IFS and iterate. LP: #385911.
    - Check for any valid runlevel, not just S. LP: #85014.
    - Make console owner, since it may spawn sulogin.
  * debian/event.d/upstart-compat-sysv/rcS:
    - Spawn sulogin if given -b or "emergency". LP: #193810.
  * debian/event.d/upstart-compat-sysv/rcS:
    - Make console owner. LP: #211402.
  * debian/event.d/upstart-compat-sysv/rcS-sulogin:
    - Place the telinit code in post-stop, checking $UPSTART_EVENT first so
      we don't change the runlevel if we were stopped due to a runlevel
      change. LP: #66002.

 -- Scott James Remnant <email address hidden> Thu, 18 Jun 2009 16:19:34 +0100

Changed in upstart (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
In , Bug (bug-redhat-bugs) wrote :

This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 10 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

The process we are following is described here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Revision history for this message
In , Bug (bug-redhat-bugs) wrote :

Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

Changed in upstart (Fedora):
status: In Progress → Won't Fix
Changed in upstart (Fedora):
importance: Unknown → Medium