sata_via hard resetting link / freeze: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6

Bug #422994 reported by JTZ
This bug affects 17 people
Affects          Status         Importance   Assigned to      Milestone
linux (Ubuntu)   Invalid        Undecided    Surbhi Palande
Lucid            Fix Released   Medium       Steve Conklin

Bug Description

[as this is my first ubuntu bug report, sorry for all beginner's mistakes :-]
Hopefully you see in the attached logs that I have two Samsung SATA drives with which I encounter the same problem:
Aug 31 21:30:02 server01 kernel: [699319.083097] ata1.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Aug 31 21:30:02 server01 kernel: [699319.083195] ata1.00: BMDMA stat 0x5
Aug 31 21:30:02 server01 kernel: [699319.083243] ata1: SError: { UnrecovData Proto TrStaTrns }
Aug 31 21:30:02 server01 kernel: [699319.083314] ata1.00: cmd 25/00:00:0d:b6:6c/00:01:1a:00:00/e0 tag 0 dma 131072 in
Aug 31 21:30:02 server01 kernel: [699319.083326] res 51/84:4f:be:b6:6c/84:00:1a:00:00/e0 Emask 0x12 (ATA bus error)
Aug 31 21:30:02 server01 kernel: [699319.083439] ata1.00: status: { DRDY ERR }
Aug 31 21:30:02 server01 kernel: [699319.083480] ata1.00: error: { ICRC ABRT }
Aug 31 21:30:02 server01 kernel: [699319.083544] ata1: hard resetting link
Aug 31 21:30:02 server01 kernel: [699319.401970] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Aug 31 21:30:02 server01 kernel: [699319.421707] ata1.00: configured for UDMA/133
Aug 31 21:30:02 server01 kernel: [699319.421777] ata1: EH complete
This happens when copying a lot of files and leads to a freeze with the need to reboot.
I have had a look into https://bugs.launchpad.net/ubuntu/intrepid/+source/linux/+bug/263160 but I suppose in my kernel 2.6.27-14.39-generic the described patch should be already included?
If I can produce any output or perform some tests I'm of course willing to help...
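
For anyone reading these log lines: the hex value after SErr is a bitmask, and the names inside "SError: { ... }" are its decoded bits. The following is a minimal sketch of that decoding, assuming the standard SATA SError register layout; the table and the function name decode_serror are illustrative and not taken from the kernel source.

```python
# Bit positions are assumed to follow the SATA SError register layout; the
# short names mirror the labels that show up in "SError: { ... }" log lines.
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all known bits set in an SErr value, low bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # Value taken from the log in this report.
    print(decode_serror(0x1000500))
```

With the value from this report, the sketch yields the same three names the kernel printed (UnrecovData, Proto, TrStaTrns), which is a quick way to check that two reports really show the same link error.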

ProblemType: Bug
Architecture: i386
CurrentDmesg:
 [ 87.886361] eth1: no IPv6 routers present
 [ 87.915168] tun: Universal TUN/TAP device driver, 1.6
 [ 87.915202] tun: (C) 1999-2004 Max Krasnyansky <email address hidden>
 [ 87.926802] tun0: Disabled Privacy Extensions
 [ 104.149886] warning: `vdr-kbd' uses 32-bit capabilities (legacy support in use)
DistroRelease: Ubuntu 8.10
HalComputerInfo: Error: command ['lshal', '-u', '/org/freedesktop/Hal/devices/computer'] failed with exit code 1: error: dbus_bus_get: org.freedesktop.DBus.Error.FileNotFound: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory

LsUsb: Bus 001 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Package: linux-image-2.6.27-14-generic 2.6.27-14.39
ProcCmdLine: root=UUID=be4630a3-c4e8-4c47-a960-416a770ac10d ro quiet splash
ProcEnviron:
 LANGUAGE=de_DE:de:en_GB:en
 PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
 LANG=de_DE.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.27-14.39-generic
SourcePackage: linux

JTZ (jtz) wrote :

Referring to bug#263160 #17 I get the impression that the VIA driver might be the problem.

JTZ (jtz)
summary: - atax.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
+ sata_via hard resetting link / freeze: exception Emask 0x12 SAct 0x0
+ SErr 0x1000500 action 0x6
JTZ (jtz) wrote :

Hello Stefan, trying to get forward with this I assign this bug to you and hope you are one of the right persons... (sorry, if not)

Changed in linux (Ubuntu):
assignee: nobody → Stefan Bader (stefan-bader-canonical)
Stefan Bader (smb) wrote :

Usually you should not subscribe anybody to bugs. People would pick up bugs as they have time and assigning one might have the opposite effect of getting someone to look at it. That said and as I have looked at it now... ;-)
Usually it is a good help to take a recent kernel from https://wiki.ubuntu.com/KernelTeam/MainlineBuilds and install it in parallel. Intrepid (8.10) also has already been succeeded by Jaunty. Have you already tried running from a Jaunty live CD? Just to rule out that this actually is fixed already.

JTZ (jtz) wrote :

Stefan, thanks for your answer. I can now confirm that the problem remains with kernel 2.6.31-020631-generic. I tried to copy the problematic folder tree twice: the first time there was only a freeze and no entry in the syslog; the second time, after the freeze, I saw the known kernel entry:
Sep 19 22:07:01 server01 kernel: [86345.385651] ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Sep 19 22:07:01 server01 kernel: [86345.385718] ata3.00: BMDMA stat 0x5
Sep 19 22:07:01 server01 kernel: [86345.385759] ata3: SError: { UnrecovData Proto TrStaTrns }
Sep 19 22:07:01 server01 kernel: [86345.385841] ata3.00: cmd 25/00:60:bd:ad:41/00:00:1a:00:00/e0 tag 0 dma 49152 in
Sep 19 22:07:01 server01 kernel: [86345.385852] res 51/84:1f:fe:ad:41/84:00:1a:00:00/e0 Emask 0x12 (ATA bus error)
Sep 19 22:07:01 server01 kernel: [86345.385939] ata3.00: status: { DRDY ERR }
Sep 19 22:07:01 server01 kernel: [86345.385987] ata3.00: error: { ICRC ABRT }
Sep 19 22:07:01 server01 kernel: [86345.386068] ata3: hard resetting link
Sep 19 22:07:01 server01 kernel: [86345.702931] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep 19 22:07:01 server01 kernel: [86345.736757] ata3.00: configured for UDMA/133
Sep 19 22:07:01 server01 kernel: [86345.736821] ata3: EH complete
If you have any other test scenario for me I will of course try it. Thanks for your help.
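
The "res 51/84:..." line carries the raw ATA status and error registers, which the kernel then spells out as "status: { DRDY ERR }" and "error: { ICRC ABRT }". Here is a rough sketch of that mapping, assuming the usual ATA register bit assignments. The function name decode_res and the two tables are made up for illustration, and bits not listed in the tables are simply skipped.

```python
import re

# Assumed ATA status/error register bits (mask -> label as seen in the logs).
STATUS_BITS = {0x80: "Busy", 0x40: "DRDY", 0x20: "DF", 0x08: "DRQ", 0x01: "ERR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def decode_res(line):
    """Extract status/error from a 'res ss/ee:...' log line and name the set bits."""
    match = re.search(r"res ([0-9a-f]{2})/([0-9a-f]{2}):", line)
    if match is None:
        raise ValueError("no 'res' taskfile found in line")
    status = int(match.group(1), 16)
    error = int(match.group(2), 16)
    status_names = [name for mask, name in STATUS_BITS.items() if status & mask]
    error_names = [name for mask, name in ERROR_BITS.items() if error & mask]
    return status_names, error_names


if __name__ == "__main__":
    sample = "res 51/84:1f:fe:ad:41/84:00:1a:00:00/e0 Emask 0x12 (ATA bus error)"
    print(decode_res(sample))
```

ICRC (interface CRC error) together with ABRT points at the transfer between drive and controller rather than at the disk surface, which fits the "ATA bus error" label in the same line.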

Stefano Palazzo (stefano-palazzo) wrote :

I was struggling with this problem for the last few days on Ubuntu and Fedora. Once I removed my "cheap and cheerful" CD burner the problem was gone. However, I maintain it's a software bug, since the drive worked perfectly well with kernels <= 2.6.19, i.e. without libata handling PATA devices. Hope this helps in some way.

JTZ (jtz) wrote :

Thanks for your comment, Stefano. For me this would probably mean moving the system from my good old 10GB PATA disk to one of the two 500GB SATA disks, so that no PATA device is left in the system, right?

jonaz__ (jonaz-86) wrote :

I can confirm this on 9.10 (kernel 2.6.31-14-server)

Nov 1 16:44:37 bulan kernel: [ 9203.231942] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x400000 action 0x6
Nov 1 16:44:37 bulan kernel: [ 9203.231945] ata6.00: BMDMA stat 0x5
Nov 1 16:44:37 bulan kernel: [ 9203.231948] ata6: SError: { Handshk }
Nov 1 16:44:37 bulan kernel: [ 9203.231956] ata6.00: cmd 35/00:00:3f:f0:7e/00:01:36:00:00/e0 tag 0 dma 131072 out
Nov 1 16:44:37 bulan kernel: [ 9203.231957] res 51/84:ff:40:f0:7e/84:00:36:00:00/e0 Emask 0x10 (ATA bus error)
Nov 1 16:44:37 bulan kernel: [ 9203.231960] ata6.00: status: { DRDY ERR }
Nov 1 16:44:37 bulan kernel: [ 9203.231962] ata6.00: error: { ICRC ABRT }
Nov 1 16:44:37 bulan kernel: [ 9203.231966] ata6: hard resetting link
Nov 1 16:44:37 bulan kernel: [ 9203.231969] ata6: nv: skipping hardreset on occupied port
Nov 1 16:44:37 bulan kernel: [ 9203.390033] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Nov 1 16:44:37 bulan kernel: [ 9203.450386] ata6.00: configured for UDMA/33
Nov 1 16:44:37 bulan kernel: [ 9203.450396] ata6: EH complete

I get A LOT of these while trying to rebuild my RAID5 array, which ALWAYS worked fine on Ubuntu 9.04.

Please tell me what more information you need!
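
The "SATA link up ... (SStatus 123 SControl 300)" line can be cross-checked by hand, since both values are hex registers split into 4-bit fields. Below is a small sketch, assuming the usual SATA layout (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11); the helper names are hypothetical.

```python
# Assumed meaning of the SPD field in SStatus: negotiated link generation.
SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps"}


def split_fields(value):
    """Split an SStatus/SControl value into its (DET, SPD, IPM) nibbles."""
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return det, spd, ipm


def link_speed(sstatus_hex):
    """Map the hex string printed after 'SStatus' to a link speed label."""
    _, spd, _ = split_fields(int(sstatus_hex, 16))
    return SPEEDS.get(spd, "unknown")


if __name__ == "__main__":
    print(link_speed("123"), split_fields(0x300))  # values from the log above
    print(link_speed("113"), split_fields(0x310))  # values from the original report
```

In SControl the SPD field is a speed limit rather than a negotiated speed, so 300 reads as "no limit requested" and 310 as "limited to the first generation", under the same layout assumption.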

jonaz__ (jonaz-86) wrote :

I downloaded 2.6.31.5 from kernel.org, compiled it, and now everything is fine!

I guess the stock Ubuntu Karmic kernel is broken :(

jonaz__ (jonaz-86) wrote :

I take back my comment. I still see the errors, but not as frequently.
(is there a way to edit/delete comments?)

Surbhi Palande (csurbhi) wrote :

@JTZ, thanks for your logs and error report. Can you please:

1) Also attach the same output for kernel 2.6.31 (or Karmic)?
2) Paste a longer dmesg excerpt, i.e. including the messages before the error message.
3) Also try this:
    a) create the file /etc/modprobe.d/options (if it does not exist)
    b) add this entry to it:
        options libata udma/100
    c) *copy* your current initramfs somewhere safe (you will need it if the new one fails to boot)
    d) run the following command to update your existing initramfs:
        update-initramfs -u
    e) reboot, then attach the output of hdparm -I /dev/sda here.
    f) test whether your data copy works at this new speed.

Surbhi Palande (csurbhi) wrote :

@JTZ, one more thing: please try step 3 only on 2.6.27. However, if you do move to 2.6.31 (which we strongly encourage), we will let you know what to do once you attach the log files here.

Floris (fdirkzwager) wrote :

Hello, I do not want to hijack the thread but I think I am seeing the same :
ata3.00: status: { DRDY ERR }
[ 2444.768392] ata3: hard resetting link
[ 2445.252019] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[ 2445.280061] ata3.00: configured for UDMA/100
[ 2445.280598] ata3: EH complete
[ 2866.980021] ata3.00: qc timeout (cmd 0xa0)
[ 2866.980033] ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 2866.980093] ata3.00: irq_stat 0x40000001
[ 2866.980152] ata3.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[ 2866.980153] cdb 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2866.980155] res 51/20:03:00:00:00/00:00:00:00:00/a0 Emask 0x5 (timeout)

I have attached the complete log. The above error keeps repeating itself and never goes away. I tried "sudo touch /forcefsck" but to no avail. Again, sorry if this is not relevant.

xamul (luigi-zanderighi) wrote :

Hi, I see the same error.

Posted comments on bug #397096, similar?
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/397096

Stefan Bader (smb)
Changed in linux (Ubuntu):
assignee: Stefan Bader (stefan-bader-canonical) → Surbhi Palande (csurbhi)
JTZ (jtz) wrote :

@Surbhi, (3) (the initrd update) failed: with the updated initramfs the system no longer finds the boot disk (get disk by UUID failed). I will go on with (1) and (2).

JTZ (jtz) wrote :

2.6.31 dmesg

JTZ (jtz) wrote :

@Surbhi: I reproduced the error again under 2.6.31, here is kernel.log which is the direct extension of the already attached dmesg. Is this sufficient?
Nov 21 21:15:50 server01 kernel: [ 82.251186] warning: `vdr-kbd' uses 32-bit capabilities (legacy support in use)
Nov 21 22:57:45 server01 kernel: [ 6197.950235] ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Nov 21 22:57:45 server01 kernel: [ 6197.950303] ata3.00: BMDMA stat 0x5
Nov 21 22:57:45 server01 kernel: [ 6197.950343] ata3: SError: { UnrecovData Proto TrStaTrns }
Nov 21 22:57:45 server01 kernel: [ 6197.950425] ata3.00: cmd 25/00:60:05:7a:41/00:00:1a:00:00/e0 tag 0 dma 49152 in
Nov 21 22:57:45 server01 kernel: [ 6197.950437] res 51/84:3f:26:7a:41/84:00:1a:00:00/e0 Emask 0x12 (ATA bus error)
Nov 21 22:57:45 server01 kernel: [ 6197.950524] ata3.00: status: { DRDY ERR }
Nov 21 22:57:45 server01 kernel: [ 6197.950559] ata3.00: error: { ICRC ABRT }
Nov 21 22:57:45 server01 kernel: [ 6197.950643] ata3: hard resetting link
Nov 21 22:57:46 server01 kernel: [ 6198.268380] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Nov 21 22:57:46 server01 kernel: [ 6198.284901] ata3.00: configured for UDMA/133
Nov 21 22:57:46 server01 kernel: [ 6198.284974] ata3: EH complete
Nov 21 22:59:28 server01 kernel: [ 6300.824367] Clocksource tsc unstable (delta = 259972416 ns)
Nov 21 23:01:49 server01 kernel: [ 6442.008484] ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Nov 21 23:01:49 server01 kernel: [ 6442.008562] ata3.00: BMDMA stat 0x5
Nov 21 23:01:49 server01 kernel: [ 6442.008612] ata3: SError: { UnrecovData Proto TrStaTrns }
Nov 21 23:01:49 server01 kernel: [ 6442.008704] ata3.00: cmd 25/00:68:85:33:e1/00:00:1a:00:00/e0 tag 0 dma 53248 in
Nov 21 23:01:49 server01 kernel: [ 6442.008715] res 51/84:57:96:33:e1/84:00:1a:00:00/e0 Emask 0x12 (ATA bus error)
Nov 21 23:01:49 server01 kernel: [ 6442.008839] ata3.00: status: { DRDY ERR }
Nov 21 23:01:49 server01 kernel: [ 6442.008884] ata3.00: error: { ICRC ABRT }
Nov 21 23:01:49 server01 kernel: [ 6442.008964] ata3: hard resetting link
Nov 21 23:01:50 server01 kernel: [ 6442.328497] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Nov 21 23:01:50 server01 kernel: [ 6442.345046] ata3.00: configured for UDMA/133
Nov 21 23:01:50 server01 kernel: [ 6442.345110] ata3: EH complete
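
The "cmd 25/00:60:..." lines in this log can be unpacked into the failing command, its start sector and its length, which helps to see whether the errors cluster in one disk region. This is only a sketch, assuming the field order command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, a 48-bit command such as 0x25 (READ DMA EXT) and 512-byte sectors; decode_cmd is a hypothetical helper.

```python
import re


def decode_cmd(line):
    """Decode the taskfile printed after 'cmd ' in a libata error line."""
    match = re.search(r"cmd ([0-9a-f/:]+)", line)
    if match is None:
        raise ValueError("no 'cmd' taskfile found in line")
    fields = [int(x, 16) for x in re.split(r"[/:]", match.group(1))]
    if len(fields) != 12:
        raise ValueError("expected 12 taskfile bytes, got %d" % len(fields))
    (command, _feature, nsect, lbal, lbam, lbah,
     _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah, _device) = fields
    # 48-bit LBA: the "hob" bytes hold the upper half of the address.
    lba = (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
           | lbah << 16 | lbam << 8 | lbal)
    # For 48-bit commands a sector count of 0 is assumed to mean 65536.
    sectors = (hob_nsect << 8 | nsect) or 65536
    return {"command": command, "lba": lba, "sectors": sectors,
            "bytes": sectors * 512}


if __name__ == "__main__":
    sample = "ata3.00: cmd 25/00:60:05:7a:41/00:00:1a:00:00/e0 tag 0 dma 49152 in"
    print(decode_cmd(sample))
```

The computed byte count can be compared with the "dma NNNN" figure on the same log line as a sanity check of the field order.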

Andrew Meyer (agmlego) wrote :

I can confirm this issue on a system running 2.6.31.16. The steps laid out in #12 do not have any effect at stopping the issue on 2.6.31.16.

Andrew Meyer (agmlego) wrote :

Machine in question is running Karmic Server. Machine is a Gateway E-4200, motherboard model E139761. Operating system is on sda, an IDE drive on the motherboard primary controller. Issue is most easily reproduced by moving large amounts of data from sdc (SATA drive on the VIA controller) to sdb (another SATA drive on the VIA controller). Attached are lspci output and output from hdparm -I on the three drives.

Changed in linux (Ubuntu):
status: New → Confirmed
Antoine (ve-antoine) wrote :

Got the same here. See attachment.

JHF2442 (a-launchpad-joel-hatsch-net) wrote :

Same here :
- 10.04 LUCID !!!
- at the same time went from 80GB SATA I to 640 GB SATA II drive
- didn't notice this with Karmic
- messages come when simply starting up firefox, which I don't really consider as moving lots of data (as in #26)
- step #12 didn't help

Could someone state if the bug implies data integrity issues ? From my understanding, it's only a matter of performance, bus is not responding OK and is being reset, but afterwards access is completed OK. Confirm ?

# hdparm /dev/sda

/dev/sda:
 multcount = 16 (on)
 IO_support = 0 (default)
 readonly = 0 (off)
 readahead = 256 (on)
 geometry = 77825/255/63, sectors = 1250263728, start = 0

JTZ (jtz) wrote :

Originally, this bug was about a system freeze caused by the SATA problems. Some people may see similar error messages in their log files, but what matters most to me is, of course, that my system crashes completely and needs a cold start when these SATA problems happen.

NeoTrantor (neotrantor) wrote :

Same here.
I'm also running 10.04 Lucid. My HDD is a Western Digital WD5000AACS 500 GB, the mainboard is an Abit AN78HD. I'm not sure if this occurred before updating to Lucid.

Attached some logs (hdparm -I /dev/sda ; dmesg ; lspci -vv ; /proc/interrupts).

Peter Haight (peterh-sapros) wrote :

I took the default Lucid kernel, applied this patch http://www.spinics.net/lists/linux-ide/msg37898.html, built it and this fixed this problem for me.

Stefan Bader (smb) wrote : Re: [Bug 422994] Re: sata_via hard resetting link / freeze: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6

On 07/30/2010 06:23 AM, Peter Haight wrote:
> http://www.spinics.net/lists/linux-ide/msg37898.html

Just a note that this patch is part of 2.6.32.16 and will be part of one of the next Lucid updates.

Cruncher (ubuntu-wkresse) wrote :

Confirming problem with WD20EARS+VIA6421.
Confirming patch is working for me.

@JHF2442 From my understanding the bus keeps resetting because of incoming data overflow, but there is no data corruption. Writing is not affected (so writes run at full speed), since the problem is the WD disk sending too much data after the controller requests a pause (think of it as the fast SATA disk flooding the slow PCI bus; think DoS attack ;-) ).
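
To put rough numbers on the "fast SATA disk, slow PCI bus" picture: the sketch below compares theoretical peak rates only. It assumes the card sits in a conventional 32-bit, 33 MHz PCI slot and that the link runs at 1.5 Gbps with 8b/10b encoding; neither figure is a measurement from this thread, and the constant and function names are illustrative.

```python
# Assumptions (not measured in this thread): conventional 32-bit/33 MHz PCI,
# SATA link at 1.5 Gbps using 8b/10b line encoding.
PCI_BUS_WIDTH_BYTES = 4
PCI_CLOCK_HZ = 33_333_333
SATA_LINE_RATE_BPS = 1_500_000_000


def pci_peak_mb_per_s():
    """Theoretical peak of the shared PCI bus in MB/s (10^6 bytes)."""
    return PCI_BUS_WIDTH_BYTES * PCI_CLOCK_HZ / 1e6


def sata_payload_mb_per_s():
    """Payload rate of the SATA link: 8 data bits per 10 line bits, 8 bits per byte."""
    return SATA_LINE_RATE_BPS * 8 / 10 / 8 / 1e6


if __name__ == "__main__":
    print("PCI peak:     %.1f MB/s" % pci_peak_mb_per_s())
    print("SATA payload: %.1f MB/s" % sata_payload_mb_per_s())
```

Under these assumptions the link can deliver slightly more than the bus can drain even on paper, and the PCI figure is shared with every other device on the bus, so the controller has to be able to make the drive pause.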

@JTZ after thinking about your comment, I did remember I had two freezes under heavy disk access load, resulting in kernel panics (caps lock+scroll lock flashing). However, in general the problem was only visible in really slow read transfer rates.

Thanks in advance for inclusion in Lucid updates.

Steve Conklin (sconklin) wrote :

SRU Justification

Impact: The upstream process for stable tree updates is quite similar in scope to the Ubuntu SRU process, e.g., each patch has to demonstrably fix a bug, and each patch is vetted by upstream by originating either directly from Linus' tree or in a minimally backported form of that patch.

The 2.6.32.21 upstream stable patch set is now available and contains fixes for this problem. It should be included in the Ubuntu kernel as well.

Related commits in the Lucid repo are:

93b1530 sata_via: magic vt6421 fix for transmission problems w/ WD drives

Changed in linux (Ubuntu Lucid):
status: New → Fix Committed
importance: Undecided → Medium
assignee: nobody → Steve Conklin (sconklin)
milestone: none → lucid-updates
Changed in linux (Ubuntu):
status: Confirmed → Invalid
Martin Pitt (pitti) wrote : Please test proposed package

Accepted linux into lucid-proposed, the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you in advance!

tags: added: verification-needed
software-schlosser (software-schlosser) wrote :

Hi,

unfortunately the kernel update didn't fix the Problem for me. However, after running my HTPC for a few hours it seems that they occur less often but that's just a subjective estimation.

software-schlosser (software-schlosser) wrote :

Well, I was wrong. It doesn't occur less often :(

Stefan Bader (smb) wrote :

The patch will only make a difference for a specific VIA chip (vt6421 / device id 0x3249). So if the mainboard uses a different chipset, this does not help and should be reported as a separate bug then.
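
One way to check for that device id is to look at the first row of a raw PCI config-space dump (the kind of hex listing that lspci can print and that appears further down in this thread). A minimal sketch follows, assuming the standard little-endian layout with the vendor id in bytes 0-1 and the device id in bytes 2-3. 0x1106 is VIA's PCI vendor id, and the function names are hypothetical.

```python
VIA_VENDOR_ID = 0x1106      # PCI vendor id of VIA Technologies
VT6421_DEVICE_ID = 0x3249   # device id named in the comment above


def pci_ids(first_row):
    """Return (vendor, device) from a dump row such as '00: 06 11 49 32 ...'."""
    _offset, _sep, rest = first_row.partition(":")
    data = [int(byte, 16) for byte in rest.split()]
    if len(data) < 4:
        raise ValueError("need at least four config-space bytes")
    vendor = data[0] | data[1] << 8   # little-endian 16-bit value
    device = data[2] | data[3] << 8
    return vendor, device


def is_vt6421(first_row):
    """True if the row identifies the controller the patch applies to."""
    return pci_ids(first_row) == (VIA_VENDOR_ID, VT6421_DEVICE_ID)


if __name__ == "__main__":
    row = "00: 06 11 49 32 07 00 90 02 50 00 04 01 00 20 00 00"
    print([hex(v) for v in pci_ids(row)], is_vt6421(row))
```

A controller that does not match, for example an nVidia SATA controller, would need its own bug report as described above.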

software-schlosser (software-schlosser) wrote :

Ok, so I guess this should be fixed for my "Abit AN78HD" too (HDD is Western Digital WD5000AACS 500 GB).

lspci says: "IDE interface: nVidia Corporation MCP78S [GeForce 8200] SATA Controller (non-AHCI mode) (rev a2)"
The corresponding module should be "pata_amd" I think.
Taking a quick look at the module's source, it seems the above patch won't work. And I don't have any knowledge of disk controller hardware, so I can't fix it myself.

Stefan Bader (smb) wrote :

Can anybody affected by the bug (and having the right hw combination) please give feedback whether the current proposed kernel does resolve the issue? Thanks.

Antoine (ve-antoine) wrote :

Ok, I've done it (on a production server, don't blame me ;-) )

I've this chipset :

01:07.0 RAID bus controller: VIA Technologies, Inc. VT6421 IDE RAID Controller (rev 50)
 Subsystem: VIA Technologies, Inc. VT6421 IDE RAID Controller
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
 Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 32
 Interrupt: pin A routed to IRQ 19
 Region 0: I/O ports at a400 [size=16]
 Region 1: I/O ports at a800 [size=16]
 Region 2: I/O ports at ac00 [size=16]
 Region 3: I/O ports at b000 [size=16]
 Region 4: I/O ports at b400 [size=32]
 Region 5: I/O ports at b800 [size=256]
 [virtual] Expansion ROM at 300a0000 [disabled] [size=64K]
 Capabilities: <access denied>
 Kernel driver in use: sata_via
 Kernel modules: sata_via

This hardrive on the SATA connector :

        Model Family: Western Digital Caviar Second Generation Serial ATA family
        Device Model: WDC WD5000AAKS-00V1A0
        Serial Number: WD-WCAWF2903335
        Firmware Version: 05.01D05
        User Capacity: 500 107 862 016 bytes
        Device is: In smartctl database [for details use: -P show]
        ATA Version is: 8
        ATA Standard is: Exact ATA specification draft version not indicated
        Local Time is: Tue Sep 14 12:39:19 2010 CEST
        SMART support is: Available - device has SMART capability.
        SMART support is: Enabled

Now this kernel : 2.6.32-25-generic #43-Ubuntu SMP Wed Sep 1 09:46:39 UTC 2010 i686 GNU/Linux

And it *seems* to not have the usually annoying messages like « ata3.00: cmd 25/00:60:bd:ad:41/00:00:1a:00:00/e0 tag 0 dma 49152 i ... blablabla »

It also seems to have improved the drive speed :
BEFORE :
/dev/sdc:
 Timing cached reads: 718 MB in 2.00 seconds = 359.02 MB/sec
 Timing buffered disk reads: 20 MB in 3.61 seconds = 5.55 MB/sec
AFTER :
/dev/sdc:
 Timing cached reads: 754 MB in 2.00 seconds = 376.38 MB/sec
 Timing buffered disk reads: 168 MB in 3.01 seconds = 55.81 MB/sec

Wow ! 10x faster !
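
The "10x" figure can be reproduced from the two "Timing buffered disk reads" lines above. Here is a small sketch that pulls the MB/sec value out of each line and divides them; the regular expression and the helper name mb_per_sec are illustrative and only assume the line format shown in this comment.

```python
import re


def mb_per_sec(line):
    """Extract the trailing '= N MB/sec' figure from an hdparm timing line."""
    match = re.search(r"=\s*([\d.]+)\s*MB/sec", line)
    if match is None:
        raise ValueError("no MB/sec figure found in line")
    return float(match.group(1))


if __name__ == "__main__":
    before = " Timing buffered disk reads: 20 MB in 3.61 seconds = 5.55 MB/sec"
    after = " Timing buffered disk reads: 168 MB in 3.01 seconds = 55.81 MB/sec"
    ratio = mb_per_sec(after) / mb_per_sec(before)
    print("speed-up: %.1fx" % ratio)
```

This only compares two single runs, so it says nothing about which kernel change caused the difference.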

Thanks a lot for the fix, guys !

Stefan Bader (smb) wrote :

Antoine, thanks for testing and: no, no. no. I never suggested doing that on production machines. ;-) But again thanks for verifying. Not sure the improved speed can be "blamed" to that patch or to some other but good to hear that as well. :)

tags: added: verification-done
removed: verification-needed
jts (jeremysilva) wrote :

Same issue with the VT6421 on Samsung drives.

lspci: 02:0a.0 RAID bus controller: VIA Technologies, Inc. VT6421 IDE RAID Controller (rev 50)
Drives are new 1.0 TB SAMSUNG HD103SJ 32M 7K drives.

Khalid Qasrawi (khalid) wrote :

I'm responding to Stefan's request on 2010-09-14 for feedback on the proposed kernel. My first assumption here is that it is 2.6.35.

I have been using the stock releases of Lucid Lynx (10.04 and 10.04.1) and getting the SATA errors detailed below. After installing the 2.6.35 kernel, the errors have disappeared from syslog. I can't comment on any performance changes.

Kernel version:-
2.6.35-7-generic-pae

Controller details:-

00:0a.0 RAID bus controller: VIA Technologies, Inc. VT6421 IDE RAID Controller (rev 50)
 Subsystem: VIA Technologies, Inc. VT6421 IDE RAID Controller
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
 Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 32
 Interrupt: pin A routed to IRQ 18
 Region 0: I/O ports at bc00 [size=16]
 Region 1: I/O ports at b800 [size=16]
 Region 2: I/O ports at b400 [size=16]
 Region 3: I/O ports at b000 [size=16]
 Region 4: I/O ports at ac00 [size=32]
 Region 5: I/O ports at a800 [size=256]
 Expansion ROM at f7fe0000 [disabled] [size=64K]
 Capabilities: <access denied>
 Kernel driver in use: sata_via
 Kernel modules: sata_via
00: 06 11 49 32 07 00 90 02 50 00 04 01 00 20 00 00
10: 01 bc 00 00 01 b8 00 00 01 b4 00 00 01 b0 00 00
20: 01 ac 00 00 01 a8 00 00 00 00 00 00 06 11 49 32
30: 00 00 fe f7 e0 00 00 00 00 00 00 00 05 01 00 00

Disk details:-

ATA device, with non-removable media
 Model Number: SAMSUNG HD103SJ
 Serial Number: S246J90Z106350
 Firmware Revision: 1AJ100E4
 Transport: Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6
Standards:
 Used: unknown (minor revision code 0x0028)
 Supported: 8 7 6 5
 Likely used: 8
Configuration:
 Logical max current
 cylinders 16383 16383
 heads 16 16
 sectors/track 63 63
 --
 CHS current addressable sectors: 16514064
 LBA user addressable sectors: 268435455
 LBA48 user addressable sectors: 1953525168
 Logical Sector size: 512 bytes
 Physical Sector size: 512 bytes
 device size with M = 1024*1024: 953869 MBytes
 device size with M = 1000*1000: 1000204 MBytes (1000 GB)
 cache/buffer size = unknown
 Form Factor: 3.5 inch
 Nominal Media Rotation Rate: 7200

Error message before kernel upgrade:-
Sep 12 09:52:45 it2 kernel: [644545.934682] ata3.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Sep 12 09:52:45 it2 kernel: [644545.935748] ata3.00: BMDMA stat 0x5
Sep 12 09:52:45 it2 kernel: [644545.936779] ata3: SError: { UnrecovData Proto TrStaTrns }
Sep 12 09:52:45 it2 kernel: [644545.937835] ata3.00: failed command: READ DMA EXT
Sep 12 09:52:45 it2 kernel: [644545.939072] ata3.00: cmd 25/00:00:3f:01:80/00:01:1d:00:00/e0 tag 0 dma 131072 in
Sep 12 09:52:45 it2 kernel: [644545.939074] res 51/84:3f:00:00:00/84:02:00:00:00/e0 Emask 0x12 (ATA bus error)
Sep 12 09:52:45 it2 kernel: [644545.941281] ata3.00: status: { DRDY ERR }
Sep 12 09:52:45 it2 kernel: [644545.942423] ata3.00: error: { ICRC ABRT }
Sep 12 09:52:45 it2 kernel: [644545.943505] ata3: hard resetting link
Sep 12 09:52:45 it2 kern...


Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.32-25.44

---------------
linux (2.6.32-25.44) lucid-proposed; urgency=low

  [ Brad Figg ]

  * SAUCE: (no-up) Modularize vesafb -- fix initialization
    - LP: #611471

  [ Stefan Bader ]

  * Revert "SAUCE: sync before umount to reduce time taken by ext4 umount"
    - LP: #543617, #585092

  [ Steve Conklin ]

  * Revert "SAUCE: tulip: Let dmfe handle davicom on non-sparc"
    - LP: #607824

  [ Tim Gardner ]

  * [Config] Added ums-cypress to udeb
    - LP: #576066

  [ Upstream Kernel Changes ]

  * Revert "PCI quirk: Disable MSI on VIA K8T890 systems"
    - LP: #607824
  * Revert "PCI quirks: disable msi on AMD rs4xx internal gfx bridges"
    - LP: #607824
  * Revert "(pre-stable) Input: psmouse - reset all types of mice before
    reconnecting"
    - LP: #607824
  * Revert "jbd: jbd-debug and jbd2-debug should be writable"
    - LP: #607824
  * Revert "ext4: Make fsync sync new parent directories in no-journal
    mode"
    - LP: #615548
  * Revert "ext4: Fix compat EXT4_IOC_ADD_GROUP"
    - LP: #615548
  * Revert "ext4: Conditionally define compat ioctl numbers"
    - LP: #615548
  * Revert "ext4: restart ext4_ext_remove_space() after transaction
    restart"
    - LP: #615548
  * Revert "ext4: Clear the EXT4_EOFBLOCKS_FL flag only when warranted"
    - LP: #615548
  * Revert "ext4: Avoid crashing on NULL ptr dereference on a filesystem
    error"
    - LP: #615548
  * Revert "ext4: Use bitops to read/modify i_flags in struct
    ext4_inode_info"
    - LP: #615548
  * Revert "ext4: Show journal_checksum option"
    - LP: #615548
  * Revert "ext4: check for a good block group before loading buddy pages"
    - LP: #615548
  * Revert "ext4: Prevent creation of files larger than RLIMIT_FSIZE using
    fallocate"
    - LP: #615548
  * Revert "ext4: Remove extraneous newlines in ext4_msg() calls"
    - LP: #615548
  * Revert "ext4: init statistics after journal recovery"
    - LP: #615548
  * Revert "ext4: clean up inode bitmaps manipulation in ext4_free_inode"
    - LP: #615548
  * Revert "ext4: Do not zero out uninitialized extents beyond i_size"
    - LP: #615548
  * Revert "ext4: don't scan/accumulate more pages than mballoc will
    allocate"
    - LP: #615548
  * Revert "ext4: stop issuing discards if not supported by device"
    - LP: #615548
  * Revert "ext4: check s_log_groups_per_flex in online resize code"
    - LP: #615548
  * Revert "ext4: fix quota accounting in case of fallocate"
    - LP: #615548
  * Revert "ext4: allow defrag (EXT4_IOC_MOVE_EXT) in 32bit compat mode"
    - LP: #615548
  * Revert "ext4: rename ext4_mb_release_desc() to ext4_mb_unload_buddy()"
    - LP: #615548
  * Revert "ext4: Remove unnecessary call to ext4_get_group_desc() in
    mballoc"
    - LP: #615548
  * Revert "ext4: fix memory leaks in error path handling of
    ext4_ext_zeroout()"
    - LP: #615548
  * Revert "ext4: check missed return value in ext4_sync_file()"
    - LP: #615548
  * Revert "ext4: Issue the discard operation *before* releasing the blocks
    to be reused"
    - LP: #615548
  * Revert "ext4: Fix buffer head leaks after calls to
    ext4_get_inode_loc()"
    - LP: #615548
  * Revert "ex...

Changed in linux (Ubuntu Lucid):
status: Fix Committed → Fix Released
Colin (duke267) wrote :

I have the same error with VIA VT6421 chipset on 2.6.32-25-generic #45-Ubuntu SMP

Joseph P (joseph13) wrote :

I'm using a VT6421-based SATA PCI card with two Western Digital drives. Supposedly the communication issues between the card and the drives have been solved, as mentioned above. I have installed kernel 2.6.37 on Ubuntu Maverick 10.10 and have also tried Ubuntu Natty Alpha 1, and am still having issues. Any help?
(See my post http://ubuntuforums.org/showthread.php?t=1665984)

Stefan Bader (smb) wrote :

Joseph, it would be good if you could open a new bug report with the ubuntu-bug command, to keep your issue separate from this more generic one, which has been resolved by the upstream changes. Yours is very likely something that needs its own resolution.

Rudi Servo (rudiservo) wrote :

Hi, I've been looking into this bug. What I have been able to conclude is that the problem apparently depends on who assembles the controller card. I have a PCI card with the VT6421A chip that has no problems at all with WD and Samsung HDDs. The same chip on a card from another budget manufacturer, on the other hand, gives me this bug not only on the Samsung drive but also on the WD drives. Both cards are detected identically, as a sata_via controller, VT6421 rev 50.

I don't know how to correct the problem, but I can offer access to my home server with the card installed and two WD Caviar 500 GB HDDs attached to it, if any developer wants to use it for testing. Alternatively, I can send a spare card of the same model as the one that shows the bug.

arnuschky (abrutschy) wrote :

Same here, on a Dell PowerEdge 2800. I am using:

10:01.0 RAID bus controller: VIA Technologies, Inc. VT6421 IDE RAID Controller (rev 50)
 Subsystem: VIA Technologies, Inc. VT6421 IDE RAID Controller
 Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
 Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
 Latency: 32
 Interrupt: pin A routed to IRQ 16
 Region 0: I/O ports at 8cf0 [size=16]
 Region 1: I/O ports at 8cd0 [size=16]
 Region 2: I/O ports at 8cb0 [size=16]
 Region 3: I/O ports at 8c90 [size=16]
 Region 4: I/O ports at 8c60 [size=32]
 Region 5: I/O ports at 8800 [size=256]
 Expansion ROM at df200000 [disabled] [size=64K]
 Capabilities: [e0] Power Management version 2
  Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
  Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
 Kernel driver in use: sata_via
 Kernel modules: sata_via

and I am seeing the same errors. I don't have any disks attached, but a (new) dvd drive. System crashes each time on any access to this drive. This is reproducible. Kernel is 2.6.38-10-server #46-Ubuntu SMP x86_64

Did anyone open a new bug for this as advised by Stefan?
