[Needs AGPMode quirk] DRI lockup with AGP on ati (mobility radeon rv350)

Bug #141551 reported by encoded
Affects: xserver-xorg-video-ati (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Bryce Harrington

Bug Description

Binary package hint: xserver-xorg-video-ati

X locks up when using the "ati" driver with my laptop. The screen is black, and it does not respond to mouse or keyboard input.

However, the machine is not totally frozen. Processes continue to run in the background (like audio playing), and while the keyboard doesn't respond, pressing the power button does trigger a normal shutdown.

I have been able to get into X using the "vesa" driver, but 1024x768 on a 1280x800 widescreen display is not fun.

I originally believed this to be the same as bug #132716, but have since been informed that it is different.

In the process of trying to solve this bug, I have tried:
  * running X with no xorg.conf (same result)
  * running dpkg-reconfigure -p high xserver-xorg (same result)
  * setting various combinations of MonitorLayout (see bug #132716, same result)
  * running xrandr -d :0 --output LVDS --mode 0x48 (same result, with caveats, see below)
  * running xset --display :0 dpms force on (same result, with caveats, see below)

The caveats mentioned are that, unlike the poster of bug #132716, I cannot Ctrl-Alt-Fx out of X, so I tried these with various bash or cron tricks. As a result, I am not entirely certain that they ran successfully.

Steps to reproduce:
  1. make sure your xorg.conf is using the "ati" driver
  2. start X.
  3. rinse, repeat :)

Hardware:
  * Sony VAIO S-170 w/ ATI Mobility Radeon 9700 (RV350 M10)

encoded (encoded) wrote :
encoded (encoded) wrote :
description: updated
Tormod Volden (tormodvolden) wrote :

Can you please also attach the output from:
 xrandr --verbose -q -d :0

encoded (encoded) wrote :

Gladly. This is the output when logged into X using the "vesa" display driver. I can't do the same when using the "ati" driver obviously.

Tormod Volden (tormodvolden) wrote :

Can you try your bash and cron tricks and get one from the "ati" driver?

encoded (encoded) wrote :

Just tried all my bash/cron tricks again. Unfortunately none of them work. :(

I tried setting my .xinitrc to something like this:

exec xterm -display :0 -e 'xrandr --verbose -q -d :0 > xrandr.txt'

Which worked when I had the driver set to "vesa", but when I tried it with "ati", it must have locked up before it could run xrandr.

I also tried some stuff with things like:

(startx &); sleep 5; xrandr --verbose -q -d :0 > xrandr.txt

Last thing I tried was setting a crontab entry like so:

xrandr -d :0 -q --verbose > xrandr.txt

Unfortunately, none of these things work when I have the "ati" driver set. I know they're all a bit hokey, but they were all I could think of. If you know any other tricks that might let me get the information you're looking for, I'll be glad to try them.

Timo Aaltonen (tjaalton) wrote :

Can you attach the log file? Apparently it's not that simple since the machine normally boots into X, but after a crash you can boot in recovery mode and copy the log somewhere.

Changed in xserver-xorg-video-ati:
importance: Undecided → High
status: New → Incomplete
encoded (encoded) wrote :

No problem, here's the log. I did a chmod a-x /etc/init.d/gdm a long time ago.

encoded (encoded) wrote :

Well, xserver-xorg-video-ati version 1:6.7.193-1ubuntu1 didn't fix my problem. I have since done some more tests and have learned that adding the following section to my xorg.conf allows me to load X with the "ati" driver.

Section "Module"
        Disable "glx"
        Disable "dri"
EndSection

I will attach the requested output of xrandr --verbose -q -d :0 with the ati driver shortly.

encoded (encoded) wrote :
Tormod Volden (tormodvolden) wrote :

It's not clear to me whether the X server crashes/locks up or if it just does not enable the right mode/output. Would you be able to ssh in from another machine and see what the X server is up to (CPU usage, tracing etc)?
See also https://wiki.ubuntu.com/DebuggingXorg

encoded (encoded) wrote :

Yes, will do so ASAP (tonight hopefully).

Thanks for the link to the debugging tips!

Thanks for your efforts, I appreciate it.

encoded (encoded) wrote :

SSH'ing in works fine.

top says:

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13080 root 25 0 155m 7384 3508 R 98.8 1.4 10:53.84 Xorg

I tried to run xrandr via SSH, but after 2 minutes, it had still not produced any output, hence I assumed it wouldn't, and killed it. Should I wait longer?

I've just installed the debugging symbols, so I should have a trace soon.

encoded (encoded) wrote :

strace of a lockup. Captured with the following script (included in the hopes that it might help someone else):

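#!/bin/bash
# Start gdm and attach strace to Xorg as soon as the process appears.

if [ "$EUID" -ne 0 ]; then
    echo "You must be root to run this script." >&2
    exit 1
fi

PROCESS=Xorg

# Already running as root, so no sudo is needed here.
sh /etc/init.d/gdm start

# Poll until the X server process exists.
until pgrep "$PROCESS" > /dev/null 2>&1; do
    sleep 0.001
done

# Attach to the oldest matching process in case pgrep finds several.
strace -o strace.txt -p "$(pgrep -o "$PROCESS")"

# end script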

The script was needed because if I didn't start the strace right away, it seemed I didn't get any output at all.

PLEASE NOTE: The strace.txt file has had the last 15,000 lines or so clipped, since they were all the same.
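
A long run of identical trace lines can also be collapsed instead of clipped by hand. The following is a minimal sketch, assuming the repeated lines are byte-for-byte identical (which holds only when strace is run without timestamps); the function name collapse_repeats is made up for illustration.

```shell
# Collapse each run of identical adjacent lines into "<count> <line>".
# uniq -c pads the count with leading spaces, which sed strips again.
collapse_repeats() {
    uniq -c | sed 's/^ *//'
}

# Hypothetical usage on the trace from this comment:
#   collapse_repeats < strace.txt > strace-collapsed.txt
```

This keeps the count of repetitions in the attachment, which a manual cut loses.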

encoded (encoded) wrote :

Xorg log for the strace'd run.

Tormod Volden (tormodvolden) wrote : Re: dri lockup on ati (mobility radeon rv350)

Thanks for the info, excellent debugging. The X server is spinning in a loop and munching CPU. The repeating ioctl(7, 0x6444, 0) is probably a DRI request (from Xorg.0.log we can see file descriptor 7 is /dev/dri/card0 - you can verify it with ls -l /proc/`pgrep Xorg`/fd ).
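
To see why 0x6444 looks like a DRM request, the number can be split into the generic Linux ioctl fields. This is a minimal sketch, assuming the usual _IOC layout (nr in bits 0-7, type in bits 8-15, size in bits 16-29, direction in bits 30-31); the function name decode_ioctl is made up for illustration.

```shell
# Split an ioctl request number into its generic _IOC fields.
# Assumed layout: nr = bits 0-7, type = bits 8-15,
# size = bits 16-29, dir = bits 30-31.
decode_ioctl() {
    local req=$(( $1 ))
    local nr=$(( req & 0xff ))
    local type=$(( (req >> 8) & 0xff ))
    local size=$(( (req >> 16) & 0x3fff ))
    local dir=$(( (req >> 30) & 0x3 ))
    printf 'dir=%d size=%d type=0x%02x nr=0x%02x\n' "$dir" "$size" "$type" "$nr"
}

decode_ioctl 0x6444   # prints: dir=0 size=0 type=0x64 nr=0x44
```

Type 0x64 is ASCII 'd', the letter the DRM subsystem uses for its ioctls, and nr 0x44 falls in the range from 0x40 upwards that DRM reserves for driver-specific commands. Which radeon command number 4 maps to should be checked against the radeon DRM header for the kernel in use.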

Can you try to get a stack trace with gdb? You might want to install debug/symbol packages for xserver-xorg-core, libgl1-mesa-dri and libgl1-mesa-glx.

Tormod Volden (tormodvolden) wrote :

You can also try if
 Option "BusType" "PCI"
helps. It would be a better workaround than disabling DRI.

Tormod Volden (tormodvolden) wrote :

More tips:
 - Try adjusting the AGP aperture in the BIOS.
 - Turn on the debug option on drm (see updated wiki page).
 - Check dmesg (please attach it).

encoded (encoded) wrote :

One thing I've noticed: /dev/dri doesn't exist. Is that supposed to be created on the fly?

Will try to fulfill all of your requests ASAP. But, if you don't hear from me for a while, it just means I didn't get a chance to do these things before heading out of town for the weekend.

Thanks for your help

Tormod Volden (tormodvolden) wrote :

Yes, /dev/dri/ is created by the drm kernel module, which will be loaded on demand, i.e. when glx and dri initialize, IIRC. So if you disable dri, you won't see it.

encoded (encoded) wrote :

* Option "BusType" "PCI" -- works
* dmesg (from locked up server) -- attached

Unfortunately, this is all I have time for right now. More ASAP.

mabovo (mabovo) wrote :

Compiz works fine on my Radeon 9600 (RV350) with the ati driver, but if I just run the compiz command in a terminal, everything gets messed up: X asks for XGL and Compiz stops working.

mabovo (mabovo) wrote :

The fact that I can't use the keyboard when X shows a black screen is probably related to a bug in the XKB package.
Every time GNOME starts, a window shows up with a message that the keyboard layout configuration in X is different from the configuration in GNOME.
I tried to configure it directly from the menu panel option System > Preferences > Keyboard and got the following error message:
 "Error trying to configure XKB. It can be a bug in the libxklavier library or a bug in the X server (utilities xkbcomp and xmodmap), or an X server with an incompatible libxkbfile implementation."

The output for XKB configuration is:

marcos@autoxtreme:~$ xprop -root | grep XKB
_XKB_RULES_NAMES_BACKUP(STRING) = "xorg", "abnt2", "br", "", ""
_XKB_RULES_NAMES(STRING) = "xorg", "abnt2", "br", "", "grp:alts_toggle"
marcos@autoxtreme:~$ gconftool-2 -R /desktop/gnome/peripherals/keyboard/kbd
 layouts = [br]
 model = abnt2
 options = [grp grp:alts_toggle]
 overrideSettings = true
marcos@autoxtreme:~$

And here is the xorg.conf configuration for the keyboard:
Section "InputDevice"
 Identifier "Generic Keyboard"
 Driver "kbd"
 Option "CoreKeyboard"
 Option "XkbRules" "xorg"
 Option "XkbModel" "abnt2"
 Option "XkbLayout" "br"
EndSection

Tormod Volden (tormodvolden) wrote :

mabovo, that sounds like bug #67188.

mabovo (mabovo) wrote :

Thanks for pointing that out. I will follow this regression on Gutsy.

encoded (encoded) wrote :

I don't have an AGP Aperture Size option in my laptop's BIOS. :/

In order to turn on DRM debugging, I first had to do "sudo modprobe drm", otherwise the "debug" file mentioned on the wiki page didn't exist.

I assume you want my kern.log for that chunk of time, so I'm attaching it. I've cut it at about 500 lines, since it appears to repeat after that point. I am keeping the whole thing though, just in case, and can attach it (gzipped of course) if you need it.

Thanks again Tormod!

gammell (gammell) wrote : Re: DRI lockup with AGP on ati (mobility radeon rv350)

I believe I have this same bug on a Thinkpad R51 with a Radeon RV250 (Mobility FireGL9000). The screen shuts off when gdm/X starts (either from boot or by manually calling "/etc/init.d/gdm start" from recovery mode), and I can't Ctrl+Alt+Backspace or Ctrl+Alt+F# out of or around it. It started when I upgraded from Feisty to the Gutsy beta last night. I'm now running xserver-xorg-video-ati-dbg_6.7.194-1ubuntu1tv6_i386.deb as posted by Tormod Volden in bug #132716, but there is no change (as expected if it's a different bug). In attempting multiple fixes I've noticed that the Xorg log is *very* slow in filling: I've let it sit on a black screen for 10-15 minutes and still found the log file cut off mid-line, never in the same place but always around the "(II) RADEON(0): Port0:" lines. I'm going to let it 'run' for an hour or so to make sure I capture the whole log and will post it here. Please advise if this is yet another different bug. Thanks.

gammell (gammell) wrote :

Oops, missed a post. I just tried disabling DRI and GLX in the Module section as suggested by encoded, and it 'fixes' things for me too so that I can get into X, so I'm fairly sure I have the same bug. I will run the strace and post the results and the matching Xorg.0.log shortly. Is there anything else I can supply to help?

Changed in xserver-xorg-video-ati:
status: Incomplete → Confirmed
Tormod Volden (tormodvolden) wrote :

gammell, it would be nice if you can repeat what encoded has been doing and verify that you have the exact same issue (same things in strace and kern.log).

gammell (gammell) wrote :

I ran strace and let it sit for a couple of hours. When it's locked up I can't ssh in, and any open connections time out. However, it looks like I had a different issue: down near the bottom of the log (which it had never reached before), there was a warning about using the AGPFastWrite option. I've now disabled it (I don't know why I had it on in the first place) and everything works. All I did was comment out the option in the Device section, but since you don't have it set, encoded, maybe you should try setting it to false? It's probably not related though; my strace and Xorg log looked fairly different. I'll post them if anyone still wants them. Sorry to get things 'off-topic'.

Tormod Volden (tormodvolden) wrote :

AGPFastWrite defaults to off, for good reasons. It will only be turned on if the user forces it on in the xorg.conf.

waxhell (waxhell) wrote :

I can verify this bug on Feisty with an ATI Radeon 8500DV. In addition, it persisted when I moved to Gutsy (Beta and now release).

However, I do have something odd that somewhat differs with the previously reported issues. Approximately one in every two to three boots, X will lock up upon boot with the same symptoms. The other times, X boots normally without issues.

Option "BusType" "PCI" fixed this issue.

I may have a related or unrelated issue where X.org takes up all of my CPU power in Gutsy -- Bug #150863. Disabling DRI fixes that separate issue. I didn't have this problem in Feisty.

Bryce Harrington (bryce) wrote :

I don't know if this will fix this specific issue, but I've backported a bunch of r3xx patches from upstream. Please test this .deb and report what you find:

http://people.ubuntu.com/~bryce/Testing/ati/

mart1n (martin-paleis-oosteinde) wrote :

Hi Bryce,
do your fixes run on Hardy, and do they support the ATI Radeon 8500DV All-in-Wonder?

Anyway, I'm running an ATI Radeon 8500DV as well on an older system (AMD2200+), 1 in 4 times the X-server boots in the standard config. When I add
 Option "BusType" "PCI"
it boots always, but graphics hogs the CPU, all runs very very slow.
If I disable either
        Disable "glx"
or
        Disable "dri"
or both, it always boots, is quick, but I lose all nice compiz features. I read a tweak on the web, to add
 Section "DRI"
  Mode 0666
 EndSection
and have both glx and dri enabled/loaded, it works, the X-server always boots, all nice Compiz features work, but all is very very slow, CPU is high for any window move, etc. More or less a similar result as when specifying the BusType.
I tried changing the DRI mode number, but have no clue what to, so no result.

Question: is there an overview of DRI mode values, which may improve this?
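
For context on the question above: the Mode value in Section "DRI" is an octal Unix permission for the DRI device nodes (who may open /dev/dri/*), so changing it is not expected to affect rendering speed. The following minimal sketch only shows how such an octal value reads as permission bits; the function name mode_to_rwx is made up for illustration.

```shell
# Render an octal permission value such as 0666 as an rwx string.
mode_to_rwx() {
    local mode=$(( 0$1 & 0777 ))   # leading 0 forces octal interpretation
    local out="" s bits
    for s in 6 3 0; do             # owner, group, other
        bits=$(( (mode >> s) & 7 ))
        if [ $(( bits & 4 )) -ne 0 ]; then out="${out}r"; else out="${out}-"; fi
        if [ $(( bits & 2 )) -ne 0 ]; then out="${out}w"; else out="${out}-"; fi
        if [ $(( bits & 1 )) -ne 0 ]; then out="${out}x"; else out="${out}-"; fi
    done
    echo "$out"
}

mode_to_rwx 0666   # prints: rw-rw-rw-
```

So 0666 simply lets every user read and write the device node, while 0660 would restrict it to the owner and group.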

Bryce Harrington (bryce) wrote :

Would you mind testing one of the Intrepid alphas and see if it's still an issue with the newest -ati and xserver included in it? If so, we probably should just forward this issue upstream.

Changed in xserver-xorg-video-ati:
status: Confirmed → Incomplete
Joe (fullmitten) wrote :

I think I may have the same problem with Intrepid. I get random lockups within about 20 minutes using Ubuntu or Kubuntu 8.10. I can still SSH in to the box and don't see any errors in the logs. I just swapped out my Radeon 8500 for an ancient ATI Rage 128 and I've gotten a good 2.5 hours of uptime, which led me to suspect the driver and to this bug.
Gigabyte 7ZXE Mobo
AMD Athlon XP 2000+
1024 MB RAM
ATI Radeon 8500
SMC1255TX PCI Ethernet Adapter
WDC WD2500JB, ATA Hard drive
SONY DVD RW DRU-720A, ATAPI CD/DVD-ROM
SONY CD-RW CRX320E, ATAPI CD/DVD-ROM

I was going to try fglrx and see if it makes a difference.

Brian Murray (brian-murray) wrote : Ubuntu needs you!

Thanks for taking the time to report this bug and helping to make Ubuntu better. In the development cycle for Intrepid there have been some vast improvements in the open source ati video driver and we could use your help testing them. Could you please download the latest Alpha CD image of Intrepid and test this particular bug just using the Live CD? You can find the latest image at http://www.ubuntu.com/testing . Your testing can help make Ubuntu and the open source ati driver even better! Thanks in advance.

Bryce Harrington (bryce)
Changed in xserver-xorg-video-ati:
status: Incomplete → New
status: New → Incomplete
Bryce Harrington (bryce) wrote : Re: DRI lockup with AGP on ati (mobility radeon rv350)

Based on the descriptions here - particularly the fact that it goes away with DRI disabled - this sounds a LOT like the AGPMode settings are incorrect. Please test this by adding this option to your /etc/X11/xorg.conf's device section:

Section "Device"
   ...
   Option "AGPMode" "2"
EndSection

Valid values of AGPMode are 1, 2, 4, and 8. 4 is a typical default but some hardware needs a different value. AGPMode only comes into play with DRI enabled - which is why the hangs go away with DRI disabled. Also, it can be triggered by various things - starting a 3D game, a suspend/resume cycle, starting up compiz, or even just booting up.

Anyway, if you find an AGPMode value that makes the issues go away, we may be able to put in a 'quirk' so the Xserver always uses that value for your particular combination of hardware. To help us do this, please test each of the 4 values and provide the following data:

  * AGPMode value(s) that work
  * Make/Model of laptop or motherboard
  * Output of lspci -vvnn
  * Is the system all factory hardware, or have any parts been replaced?
  * Is there an AGP Mode in the system BIOS?
    - If so, is it set to the factory default?
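
Testing each of the four values means editing the Device section four times. The following minimal sketch automates that edit; it assumes the keywords are written exactly as in the example above (Section "Device", Option "AGPMode", EndSection, case-sensitive), and the function name set_agpmode and the output file names are made up for illustration.

```shell
# Read an xorg.conf on stdin and write it to stdout with
# Option "AGPMode" set to $1 in every Device section.
set_agpmode() {
    case "$1" in
        1|2|4|8) ;;
        *) echo "AGPMode must be 1, 2, 4 or 8" >&2; return 1 ;;
    esac
    awk -v mode="$1" '
        /^[ \t]*Section[ \t]+"Device"/ { in_dev = 1 }
        # Drop any existing AGPMode line inside the Device section.
        in_dev && /^[ \t]*Option[ \t]+"AGPMode"/ { next }
        # Add the new setting just before the section closes.
        in_dev && /^[ \t]*EndSection/ {
            printf "\tOption \"AGPMode\" \"%s\"\n", mode
            in_dev = 0
        }
        { print }
    '
}

# Hypothetical usage, writing one candidate file per value:
#   for m in 1 2 4 8; do
#       set_agpmode "$m" < /etc/X11/xorg.conf > "xorg.conf.agp$m"
#   done
```

Writing candidates to separate files rather than editing /etc/X11/xorg.conf in place leaves the working configuration untouched if a value causes a hang.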

Joe (fullmitten) wrote :

Bryce's suggestion works for me. I put a fresh install of Ubuntu 8.04.1 on my computer (homebuilt with a Gigabyte 7ZXE mobo), added an AGPMode of 2 to xorg.conf, and took the "fail-safe default" values in the BIOS, which set the AGP speed to 2x. No mysterious hangs in 9 hours of uptime last night.
I tried setting the BIOS to "optimized default" values, which sets the AGP speed to 4x, and changing AGPMode in xorg.conf to match. After about 2 hours the computer hung. I SSH'd in and Xorg was maxing out the CPU, just like the previous hangs and as encoded describes in his comments above.
Just changing the BIOS AGP speed to 2x and leaving the rest of the "optimized defaults" in place, I've had two sessions today of at least 2 hours each without any problems.
This computer has no problem running with the AGP speed set to 4x in the BIOS when running Feisty (see Bug #227882), if that makes any difference.

Tormod Volden (tormodvolden) wrote :

Joe, can you provide an Xorg.0.log from Feisty (e.g. from a live CD run)? I have seen some other reports where an AGP speed worked before and now no longer does, so I wonder if it's a driver problem and not just incompatible settings.

Joe (fullmitten) wrote :

Here's the xorg.conf and Xorg.0.log from my Feisty hard disk install. As I mentioned in the other bug, I get months of uptime from this install.

Tormod Volden (tormodvolden) wrote :

Thanks. So in Feisty with -ati 6.6.3 you were running at AGP 1x (the "01" in 0x1f000201 below):
(II) RADEON(0): [agp] Mode 0x1f000201 [AGP 0x1106/0x0305; Card 0x1002/0x514c]
I suppose you had the BIOS set to "optimized" 4x at that time?

Tormod Volden (tormodvolden) wrote :

Sorry, scratch my last question, you just wrote that you used 4x in BIOS :) What does that line look like in Intrepid with default settings (no settings in xorg.conf)? I can't find your Intrepid logs. Please attach logs as is, no archives.

Joe (fullmitten) wrote :

I posted logs of my experience with Intrepid Alpha 5 over here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/227806/comments/9

Joe (fullmitten) wrote :

By "no archives", I take it you want them unzipped. . .

Joe (fullmitten) wrote :
Tormod Volden (tormodvolden) wrote :

Thanks. Intrepid uses 4x like set in BIOS ("07" is 4+2+1):
(==) RADEON(0): Using AGP 4x
(II) RADEON(0): [agp] Mode 0x1f000207 [AGP 0x1106/0x0305; Card 0x1002/0x514c]
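
The reading of the low byte used in this thread ("01" as 1x, "07" as 4+2+1) can be sketched as a small decoder. It assumes AGP 2.0 semantics, where bits 0, 1 and 2 of the mode word stand for 1x, 2x and 4x; under AGP 3.0 signalling the low bits have a different meaning, so this would not apply there. The function name agp_rates is made up for illustration.

```shell
# List the AGP rate bits set in a mode word from the "[agp] Mode" log line.
# Assumes AGP 2.0 encoding: bit 0 = 1x, bit 1 = 2x, bit 2 = 4x.
agp_rates() {
    local mode=$(( $1 ))
    local rates=""
    if [ $(( mode & 0x1 )) -ne 0 ]; then rates="$rates 1x"; fi
    if [ $(( mode & 0x2 )) -ne 0 ]; then rates="$rates 2x"; fi
    if [ $(( mode & 0x4 )) -ne 0 ]; then rates="$rates 4x"; fi
    echo "${rates# }"
}

agp_rates 0x1f000201   # prints: 1x
agp_rates 0x1f000207   # prints: 1x 2x 4x
```

With several bits set, the driver picks one of the listed rates, which matches the "Using AGP 4x" line next to the 0x1f000207 mode.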

Bryce Harrington (bryce) wrote :

Tormod, do you think I should add a quirk for this one as well?

Tormod Volden (tormodvolden) wrote :

Bryce, yes that sounds reasonable, although the data is only from one single user. Anyway it might be difficult to get confirmed testing from several users having the exact same hardware combination.

I think the original report is getting forgotten here. encoded, do you have any luck with AGPMode settings?

Bryce Harrington (bryce)
Changed in xserver-xorg-video-ati:
assignee: nobody → bryceharrington
johnlu (juanluperez) wrote :

I have this same problem. I've just tested with the Ubuntu 8.10 beta. The same problem happens with 8.04, but there is no problem with 7.10. Right now I'm using the xorg.conf recovered from my 7.10 installation and the problem persists.

So I think some change between 7.10 and 8.04 caused the ATI Radeon Mobility 9700 (RV350 M10) to stop working.

Joe (fullmitten) wrote :
Bryce Harrington (bryce)
Changed in xserver-xorg-video-ati:
status: Incomplete → Triaged
johnlu (juanluperez) wrote :

I can't try right now. I'll try tomorrow.
I'll put up all that info. Thanks!

johnlu (juanluperez) wrote :

  * AGPMode value(s) that work: Only agpmode "1" (both ati and radeon driver)
  * Make/Model of laptop or motherboard: ASUS A6G, Chipset Intel 855GME supposed to work at 4x with this ATI Radeon Mobility 9700
  * Is the system all factory hardware, or have any parts been replaced? All factory hw.
  * Is there an AGP Mode in the system BIOS? No.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xserver-xorg-video-ati - 1:6.9.0+git20081003.f9826a56-0ubuntu4

---------------
xserver-xorg-video-ati (1:6.9.0+git20081003.f9826a56-0ubuntu4) jaunty; urgency=low

  * 100_quirk_system.patch: Add three more quirks for AGPMode issues
    (LP: #296617, #141551)

 -- Bryce Harrington <email address hidden> Mon, 24 Nov 2008 20:32:11 -0800

Changed in xserver-xorg-video-ati:
status: Triaged → Fix Released