Kernel Oops with Maverick in ahci_stop_engine

Bug #658560 reported by Gonzhauser
This bug affects 3 people
Affects: linux (Ubuntu)
Status: Triaged
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

This is a Dell Precision T3500 with BIOS A03.
No further information at this point, since the system doesn't boot.

Gonzhauser (gonzhauser) wrote :
Leann Ogasawara (leannogasawara) wrote :

I built a test kernel for a different bug, but it includes the patches from the upstream Bugzilla report you noted here:

https://bugs.edge.launchpad.net/ubuntu/+source/linux/+bug/647043/comments/14

Give it a test and let us know your results. Thanks.

Changed in linux (Ubuntu):
status: New → Incomplete
Avi Carmi (avi-carmi) wrote :

Dell T3500

worked just fine with 10.04

upgraded to 10.10 using update manager, and it won't boot. I get white text on black background, not even sure what I am looking at... dropped me into some kind of shell?

tried booting from the 10.10 install CD (which I burned to install on another machine); it also won't boot off the install CD

get the Ubuntu logo, with dots underneath, then it again drops me into a shell.

back to booting from disk, but this time choosing the older kernel on the boot menu, and it booted up just fine. and seems to work fine.

not an expert, not sure what's wrong, and how to report this in more details.

Gonzhauser (gonzhauser) wrote : Re: [Bug 658560] Re: Kernel Oops with Maverick in ahci_stop_engine

Try the above kernel and then update your BIOS. I am going to do this
tomorrow. I can tell you then how it worked.
It seems to be a BIOS problem.

Avi Carmi (avi-carmi) wrote :

Thank you for your help.

I believe that I have the latest BIOS. No I don't, upgrading to A08 dated 9/17/10 right now.

I am not sure how to change kernels, do I use the update manager?

I have only two kernels in the boot choice (plus memtest and the old Windows partition), and only the older kernel is bootable.

I am afraid that if I add another kernel, the older (working one) will be deleted, and I'll end up with two "bad" kernels

Is there a way to keep more than two kernels?

-avi

Avi Carmi (avi-carmi) wrote :

FYI:

found this workaround in another thread: https://bugs.launchpad.net/ubuntu/+bug/659149

Following this link: http://lkml.org/lkml/2010/6/16/323 I found a temporary solution: inserting pci=nocrs in the kernel line in grub.
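
For reference, a rough sketch of making the pci=nocrs workaround permanent on a GRUB 2 system such as Maverick, assuming the kernel options live in the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub (the Ubuntu default). The function name add_kernel_param is made up for illustration; it only edits the file it is given, so it can be tried on a copy first. For a one-off test, the same parameter can instead be typed at the end of the linux line after pressing e in the GRUB menu.

```shell
#!/bin/sh
# Sketch only: append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT.
# add_kernel_param is a hypothetical helper, not an Ubuntu or GRUB tool.
# Usage: add_kernel_param FILE PARAM
add_kernel_param() {
    file=$1
    param=$2

    # Leave the file alone if the parameter is already on that line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi

    # Insert the parameter just before the closing quote of the line.
    # If the file has no such line, sed changes nothing.
    sed -i "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 ${param}\"/" "$file"
}
```

Run as root, it would be used as `add_kernel_param /etc/default/grub pci=nocrs`, followed by `update-grub` to regenerate the boot menu.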

Gonzhauser (gonzhauser) wrote :

Ok, I tested two things.

1. The kernel from comment #2 works.
2. Upgrading to the latest BIOS provided by Dell (A08, from A03) doesn't fix the stock kernel.

So:
The kernel from #2 works with BIOS A03 and A08; the stock Maverick kernel doesn't work with either BIOS.

Should I
a) communicate this upstream,
b) file a bug at DELL?

Thanks for the realtime solution. :)

g

Gonzhauser (gonzhauser) wrote :

@Avi:

Download the deb packages from #2, reboot into your old kernel, install the new packages with "sudo aptitude install *.deb" and don't worry about removing old kernels; everything is preserved. Boot into your shiny new 2.6.35 kernel that actually boots. :)
Well, at least it worked for me on my T3500.

g
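
For anyone repeating the install step above, here is a minimal sketch, assuming the test kernel .deb files were all downloaded into one directory. It uses dpkg -i, the usual low-level tool for installing local package files. The names install_debs and DRY_RUN are hypothetical, chosen only for this example.

```shell
#!/bin/sh
# Sketch only: install every .deb found in a directory with dpkg.
# install_debs and DRY_RUN are hypothetical names for this example.
# With DRY_RUN=1 the command is printed instead of being run.
install_debs() {
    dir=$1

    # Expand the glob into the function's positional parameters.
    set -- "$dir"/*.deb

    # If nothing matched, the pattern is left unexpanded and does not exist.
    if [ ! -e "$1" ]; then
        echo "no .deb files in $dir" >&2
        return 1
    fi

    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "dpkg -i $*"
    else
        dpkg -i "$@"    # needs root
    fi
}
```

Calling `DRY_RUN=1 install_debs ~/Downloads` (the path is just an example) shows what would be installed; dropping DRY_RUN and running as root performs the install.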

Gonzhauser (gonzhauser) wrote :

@Leann: Your kernel is installed as linux-image-2.6.35-22-generic.
Does this mean it's going to be overwritten by the next round of
kernel updates when running update-manager?

g

Avi Carmi (avi-carmi) wrote :

as mentioned earlier, the A08 BIOS update did not resolve the problem.

pci=nocrs does work with the 10.10 Ubuntu kernel and A08 BIOS (did not try the older kernel with A08)

did not yet try the test patched kernel from #2, since I can boot with pci=nocrs and I need to get some real work done...

-avi

jstammi (johannes-stamminger) wrote :

I just ran into this problem, and I can confirm post #6: adding pci=nocrs to grub.conf solves the problem. Another workaround is to boot an older kernel version, some 2.6.34.xxx.

https://bugzilla.kernel.org/show_bug.cgi?id=16228 says that a patch fixing this is available.

It also says that no BIOS available for the Dell T3500 resolves the problem.

So IMHO, for Ubuntu either pci=nocrs should go into grub.conf (for the Dell T3500) or, preferably, the mentioned patch should be included ... ?

Leann Ogasawara (leannogasawara) wrote :

@Gonzhauser, thanks for trying the test kernel noted in comment #2 and confirming it fixes the issue. The next round of kernel updates would supersede the test kernel. However, it appears the next official Maverick kernel update will be an ABI bump, so you would have the option of selecting the older test kernel or the newer updated kernel.

Getting these patches accepted as a Maverick stable release update (SRU) will be a bit difficult, i.e. they don't exactly qualify for an SRU. Even Linus himself has refused to accept these into the upstream 2.6.36 kernel, as noted in https://patchwork.kernel.org/patch/189182/:

"I definitely am not going to pull this series for 2.6.36.

I could possibly take this first one that only prepares for the real
change and doesn't actually change anything in itself, but switching
around the order of allocations after -rc5 would be crazy. Yes, it may
help some people, but we have absolutely no idea who it could hurt. So
the whole thing is definitely something for the merge window (and
preferably pretty early there too)"

I have to agree that this is a very risky change with a high chance of regression, which is something one always tries to avoid when introducing changes/fixes to a released kernel. I still need to discuss this with the rest of the Ubuntu Kernel SRU team, but we may just have to wait for this to resolve itself in Natty.

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Incomplete → Triaged
Gonzhauser (gonzhauser) wrote :

On Wed, Oct 13, 2010 at 6:29 PM, Leann Ogasawara
<email address hidden> wrote:
> @Gonzhauser, thanks for trying the test kernel noted in comment #2 and
> confirming it fixes the issue.  The next round of kernel updates would
> supersede the test kernel.  However, it appears the next official
> Maverick kernel update will be an ABI bump, so you would have the option
> of selecting the older test kernel or the newer updated kernel.

I installed your kernel again after running update-manager and it
still works but I am waiting for the day when I will forget... :(
I am willing to bisect this but you/Red Hat/Dell/Linux
Foundation/whoever should seriously consider adding a T3500 to your
test configurations. ;)

g

Gonzhauser (gonzhauser) wrote :

For the record: I found the "Lock Version" option in Synaptic. Everything
should be fine.

Bjorn Helgaas (bjorn-helgaas) wrote :

This is a duplicate of bug #653238, not of bug #647043.

Bug #653238 was marked a duplicate of bug #647043, but that is incorrect.
