Lucid kernel is missing a large number of important ext4 bug fixes

Bug #588069 reported by Theodore Ts'o
This bug affects 4 people
Affects         Status        Importance  Assigned to  Milestone
linux (Ubuntu)  Fix Released  High        Brad Figg
Lucid           Fix Released  High        Brad Figg

Bug Description

SRU Justification

Impact:
The following patches are heading to the stable maintainers' tree and will eventually get pulled into the
Lucid tree. The earlier we can get these in, the fewer issues our users will encounter. We will be getting
these through stable-tree updates; we would just be pulling them in a little earlier.

The following patches should really be applied to the Lucid 2.6.32-22.33 kernel:

ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/ext4-patches/ext4-patches-for-2.6.32.11--14.tar.gz

There is a git tree you may find useful here:

git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git lucid-2.6.32-ext4

It fixes a very large number of ext4 bugs, including (but not limited to):

http://bugzilla.kernel.org/show_bug.cgi?id=14286
http://bugzilla.kernel.org/show_bug.cgi?id=14936
http://bugzilla.kernel.org/show_bug.cgi?id=15420
http://bugzilla.kernel.org/show_bug.cgi?id=15579
http://bugzilla.kernel.org/show_bug.cgi?id=13549
http://bugzilla.kernel.org/show_bug.cgi?id=15742
http://bugzilla.kernel.org/show_bug.cgi?id=15768
http://bugzilla.kernel.org/show_bug.cgi?id=15792
http://bugzilla.kernel.org/show_bug.cgi?id=15827

This patch has been submitted upstream, but I'm not sure if Greg K-H will be doing another 2.6.32.y kernel release.

Testing:
   TBD

Joel Ebel (jbebel)
tags: added: glucid
Jeremy Foshee (jeremyfoshee) wrote :

Hi Ted,
   Thanks for bringing these to our attention. I'll make sure this gets the attention it needs.

~JFo

Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → High
tags: added: kernel-fs kernel-needs-review
Andy Whitcroft (apw) wrote :

@Ted -- thanks for the heads up, always appreciated.

tags: added: kernel-candidate kernel-reviewed
removed: kernel-needs-review
Theodore Ts'o (tytso) wrote :

Is your next backport kernel for Lucid going to be for 2.6.34? (as I seem to see from the ubuntu-lucid/lts-backport-maverick branch)

If so, I have a different set of patches that has been submitted for 2.6.34.y for ext4 which I can make available to you....

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
assignee: nobody → Brad Figg (brad-figg)
Brad Figg (brad-figg) wrote :

@Ted

It is our understanding that 2.6.32 is a long-term stable support tree for Greg KH. Have you sent these patches to the stable maintainers list? I am working on them for our review, independent of that.

Theodore Ts'o (tytso) wrote :

Yes, they have been sent to Greg K-H. There are actually a few more patches I will be sending both to Greg and to you guys in the next few days, BTW.

Brad Figg (brad-figg) wrote :

@Ted

Thanks, look forward to more when you have them ready.

Brad Figg (brad-figg)
description: updated
Andy Whitcroft (apw)
tags: removed: kernel-candidate
Brad Figg (brad-figg) wrote :

For anyone interested, I have test kernels with these patches applied in my ppa. Also, I've run the "check" test from xfstests-dev and it does run better on a kernel with the patches applied than one without them.

https://launchpad.net/~brad-figg/+archive/ppa

Brad Figg (brad-figg) wrote :

The patches have been reviewed and approved for SRU.

Changed in linux (Ubuntu):
status: Triaged → In Progress
Pete Graner (pgraner) wrote :

Per discussion, here is the panic I hit yesterday and again today; I will also add hw info. This occurred after about an hour of multiple rsyncs pounding the filesystem (ext4 on a USB 1TB disk). Will also test Brad's kernel to see if it gets any better.

Jun 30 23:46:12 frylock kernel: [114450.175420] BUG: unable to handle kernel NULL pointer dereference at 0000000c
Jun 30 23:46:12 frylock kernel: [114450.175425] IP: [<c01d9f13>] do_writepages+0x13/0x40
Jun 30 23:46:12 frylock kernel: [114450.175431] *pdpt = 000000000088f001 *pde = 0000000000000000
Jun 30 23:46:12 frylock kernel: [114450.175434] Oops: 0000 [#1] SMP
Jun 30 23:46:12 frylock kernel: [114450.175437] last sysfs file: /sys/devices/pci0000:00/0000:00:1a.7/usb1/1-5/1-5:1.0/host0/target0:0:0/0:0:0:0/block/sde/sde1/stat
Jun 30 23:46:12 frylock kernel: [114450.175439] Modules linked in: raid0 binfmt_misc ppdev nfsd exportfs nfs lockd nfs_acl auth_rpcgss sunrpc xt_TCPMSS xt_limit xt_tcpudp nf_nat_irc nf_nat_ftp snd_hda_codec_intelhdmi ipt_LOG ipt_MASQUERADE snd_hda_codec_realtek xt_DSCP ipt_REJECT nf_conntrack_irc nf_conntrack_ftp snd_hda_intel xt_state iptable_nat snd_hda_codec nf_nat snd_hwdep nf_conntrack_ipv4 nf_conntrack snd_pcm_oss snd_mixer_oss nf_defrag_ipv4 snd_pcm fbcon tileblit snd_seq_dummy iptable_mangle font snd_seq_oss iptable_filter bitblit ip_tables snd_seq_midi softcursor snd_rawmidi x_tables snd_seq_midi_event snd_seq vga16fb vgastate snd_timer snd_seq_device snd i915 drm_kms_helper soundcore snd_page_alloc drm i2c_algo_bit video intel_agp agpgart output lp parport hid_belkin usbhid hid usb_storage r8169 mii ahci e1000e
Jun 30 23:46:12 frylock kernel: [114450.175485]
Jun 30 23:46:12 frylock kernel: [114450.175487] Pid: 4448, comm: flush-9:0 Not tainted (2.6.32-22-generic-pae #36-Ubuntu) Product Name To Be Filled By O.E.M.
Jun 30 23:46:12 frylock kernel: [114450.175489] EIP: 0060:[<c01d9f13>] EFLAGS: 00010206 CPU: 0
Jun 30 23:46:12 frylock kernel: [114450.175492] EIP is at do_writepages+0x13/0x40
Jun 30 23:46:12 frylock kernel: [114450.175493] EAX: d3d31d98 EBX: d3d31d08 ECX: 00000000 EDX: e034bf10
Jun 30 23:46:12 frylock kernel: [114450.175495] ESI: 00000007 EDI: f69f90e0 EBP: e034be90 ESP: e034be90
Jun 30 23:46:12 frylock kernel: [114450.175497] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Jun 30 23:46:12 frylock kernel: [114450.175499] Process flush-9:0 (pid: 4448, ti=e034a000 task=f6a4b340 task.ti=e034a000)
Jun 30 23:46:12 frylock kernel: [114450.175500] Stack:
Jun 30 23:46:12 frylock kernel: [114450.175502] e034beb0 c022f895 d3d31d98 00000000 e034bf10 d3d31d08 00000000 f69f90e0
Jun 30 23:46:12 frylock kernel: [114450.175506] <0> e034bef8 c0230399 e034bee0 01b372c7 f69f9108 00000000 00000000 e034bf10
Jun 30 23:46:12 frylock kernel: [114450.175511] <0> f69f90f8 f69f9100 00000000 f69f90e0 dfedc7b0 d78ba8f0 eed5cc00 e034bf68
Jun 30 23:46:12 frylock kernel: [114450.175517] Call Trace:
Jun 30 23:46:12 frylock kernel: [114450.175521] [<c022f895>] ? writeback_single_inode+0xd5/0x380
Jun 30 23:46:12 frylock kernel: [114450.175524] [<c0230399>] ? writeback_inodes_wb+0x389/0x530
Jun 30 23:46:12 frylock kernel: [114450.175527] [<c0230637>] ? wb_...


Brad Figg (brad-figg) wrote :

@pgraner,

Please open a new bug on your issue. This bug is being used as a tracking bug for an SRU.

Thanks,
Brad

Pete Graner (pgraner) wrote :

@Brad, was told to put it here by apw & smb :-) go yell at them

The test kernel from comment #7 seems to have fixed the panic above. It has been running over 4 hours now with multiple high speed transfers. Previously it would panic after about 30 mins.

Steve Conklin (sconklin) wrote :

SRU Justification:

    Impact:
    This is a set of ext4 patches which are highly recommended from upstream, and which fix some observed problems.

    Fix:
    Apply the patch set

    Testcase:
    Testing using xfstests indicates that some tests were failing before application of the patches, and they now pass. It has also been confirmed to resolve a server hang issue, as documented above.

Changed in linux (Ubuntu):
status: In Progress → Fix Committed
milestone: none → lucid-updates
Colin Watson (cjwatson) wrote : Please test proposed package

Accepted linux into lucid-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in linux (Ubuntu Lucid):
status: New → Fix Committed
tags: added: verification-needed
Andy Whitcroft (apw)
Changed in linux (Ubuntu Lucid):
assignee: nobody → Andy Whitcroft (apw)
assignee: Andy Whitcroft (apw) → Brad Figg (brad-figg)
importance: Undecided → Medium
importance: Medium → High
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu Lucid):
milestone: none → lucid-updates
Changed in linux (Ubuntu):
milestone: lucid-updates → none
Brad Figg (brad-figg) wrote :

I have run the xfstests "check" and "stress" tests against the shipping kernel and then again against the test kernel with these patches applied. On the shipping kernel, the "check" tests did not run to completion and there was an error or two. On the test kernel, they ran to completion with no errors. Stress ran to completion with no errors as well.

Martin Pitt (pitti)
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.32-24.38

---------------
linux (2.6.32-24.38) lucid-proposed; urgency=low

  [ Keng-Yu Lin ]

  * SAUCE: dell-laptop: fire SMI when toggling hardware killswitch
    (revised)
    - LP: #590607

  [ Upstream Kernel Changes ]

  * sfc: Wait at most 10ms for the MC to finish reading out MAC statistics
    - LP: #590783
  * sfc: Always close net device at the end of a disabling reset
    - LP: #590783
  * sfc: Change falcon_probe_board() to fail for unsupported boards
    - LP: #590783
  * ext4: Fix potential quota deadlock
    - LP: #588069
  * jbd: jbd-debug and jbd2-debug should be writable
    - LP: #588069
  * ext4: replace BUG() with return -EIO in ext4_ext_get_blocks
    - LP: #588069
  * ext4, jbd2: Add barriers for file systems with exernal journals
    - LP: #588069
  * ext4: Eliminate potential double free on error path
    - LP: #588069
  * ext4: return correct wbc.nr_to_write in ext4_da_writepages
    - LP: #588069
  * ext4: Ensure zeroout blocks have no dirty metadata
    - LP: #588069
  * ext4: Patch up how we claim metadata blocks for quota purposes
    - LP: #588069
  * ext4: Fix accounting of reserved metadata blocks
    - LP: #588069
  * ext4: Calculate metadata requirements more accurately
    - LP: #588069
  * ext4: Handle -EDQUOT error on write
    - LP: #588069
  * ext4: Fix quota accounting error with fallocate
    - LP: #588069
  * ext4: Drop EXT4_GET_BLOCKS_UPDATE_RESERVE_SPACE flag
    - LP: #588069
  * ext4: Use bitops to read/modify EXT4_I(inode)->i_state
    - LP: #588069
  * ext4: Fix BUG_ON at fs/buffer.c:652 in no journal mode
    - LP: #588069
  * ext4: Add flag to files with blocks intentionally past EOF
    - LP: #588069
  * ext4: Fix fencepost error in chosing choosing group vs file
    preallocation.
    - LP: #588069
  * ext4: fix error handling in migrate
    - LP: #588069
  * ext4: explicitly remove inode from orphan list after failed direct io
    - LP: #588069
  * ext4: Handle non empty on-disk orphan link
    - LP: #588069
  * ext4: make "offset" consistent in ext4_check_dir_entry()
    - LP: #588069
  * ext4: Fix insertion point of extent in mext_insert_across_blocks()
    - LP: #588069
  * ext4: Fix the NULL reference in double_down_write_data_sem()
    - LP: #588069
  * ext4: Code cleanup for EXT4_IOC_MOVE_EXT ioctl
    - LP: #588069
  * ext4: Fix estimate of # of blocks needed to write indirect-mapped files
    - LP: #588069
  * ext4: Fixed inode allocator to correctly track a flex_bg's used_dirs
    - LP: #588069
  * ext4: Fix possible lost inode write in no journal mode
    - LP: #588069
  * ext4: Fix buffer head leaks after calls to ext4_get_inode_loc()
    - LP: #588069
  * ext4: Issue the discard operation *before* releasing the blocks to be
    reused
    - LP: #588069
  * ext4: check missed return value in ext4_sync_file()
    - LP: #588069
  * ext4: fix memory leaks in error path handling of ext4_ext_zeroout()
    - LP: #588069
  * ext4: Remove unnecessary call to ext4_get_group_desc() in mballoc
    - LP: #588069
  * ext4: rename ext4_mb_release_desc() to ext4_mb_unload_buddy()
    - LP: #588069
  * ext4: allow defrag (EXT4_IOC_MOVE_EXT) in 32b...


Changed in linux (Ubuntu Lucid):
status: Fix Committed → Fix Released
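
For anyone wanting to check which changelog items in an upload belong to this SRU, here is a rough sketch (not part of the original report) that pulls out the entries tagged with a given LP bug number. It assumes the layout quoted in the changelog above: an item starts with "* ", may wrap onto continuation lines, and is followed by a "- LP: #NNNNNN" line. The function name entries_for_bug and the sample text are made up for illustration.

```python
import re


def entries_for_bug(changelog, bug):
    """Return the titles of changelog items tagged with '- LP: #<bug>'."""
    titles = []
    current = None      # title fragments of the item being read
    matched = False     # True once the current item has been recorded
    for raw in changelog.splitlines():
        line = raw.strip()
        if line.startswith("* "):
            # A new item begins; drop the leading "* " marker.
            current = [line[2:]]
            matched = False
        elif line.startswith("- LP:"):
            bugs = {int(n) for n in re.findall(r"#(\d+)", line)}
            if current is not None and bug in bugs and not matched:
                titles.append(" ".join(current))
                matched = True
        elif line and current is not None and not line.startswith("["):
            # Wrapped continuation of the item title.
            current.append(line)
    return titles


# Hypothetical sample, abbreviated from the changelog format above.
sample = """\
  [ Upstream Kernel Changes ]

  * sfc: Always close net device at the end of a disabling reset
    - LP: #590783
  * ext4: Fix potential quota deadlock
    - LP: #588069
  * ext4: Issue the discard operation *before* releasing the blocks to be
    reused
    - LP: #588069
"""

for title in entries_for_bug(sample, 588069):
    print(title)
```

Run against the full changelog text, this would list only the ext4/jbd items carried by this bug and skip the unrelated sfc and dell-laptop entries.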