nfs client broken since 2.6.28-2-generic upgrade

Bug #306016 reported by Ambricka
This bug affects 6 people
Affects             Status        Importance  Assigned to     Milestone
Linux               Invalid       Medium
linux (Ubuntu)      Fix Released  High        Andy Whitcroft
nfs-utils (Ubuntu)  Invalid       High        Unassigned

Bug Description

Binary package hint: nfs-common

Description: Ubuntu jaunty (development branch)
Release: 9.04

After the latest kernel upgrade I can't mount NFS shares anymore (only the client was upgraded, not the server).
The mount command just seems to hang, but it can be interrupted with Ctrl-C.

/var/log/syslog spews out this:

Dec 7 18:15:56 nattbrygga kernel: [28315.080038] rpcbind: server localhost not responding, timed out
Dec 7 18:15:56 nattbrygga kernel: [28315.080076] RPC: failed to contact local rpcbind server (errno 5).
Dec 7 18:16:26 nattbrygga kernel: [28345.080049] rpcbind: server localhost not responding, timed out
Dec 7 18:16:26 nattbrygga kernel: [28345.080092] RPC: failed to contact local rpcbind server (errno 5).
Dec 7 18:16:56 nattbrygga kernel: [28375.080039] rpcbind: server localhost not responding, timed out
Dec 7 18:16:56 nattbrygga kernel: [28375.080078] RPC: failed to contact local rpcbind server (errno 5).

Changed in nfs-utils:
status: New → Confirmed
Changed in nfs-utils:
importance: Undecided → High
Revision history for this message
Peter Cordes (peter-cordes) wrote :

This does look like the same bug as 306343, which I reported. I didn't see this one because I was only searching in linux, not nfs-common.

Changed in linux:
status: New → Confirmed
Changed in linux:
status: Confirmed → Invalid
Revision history for this message
Simon L'nu (simon-lnu) wrote :

Hi,

I've been seeing this in Ubuntu, but not Debian (I triple-boot between Debian and Ubuntu almost evenly, and the unnamed obsolete OS rarely). In Debian, they mount instantly with the same exact mount options in fstab, whereas in Jaunty they don't, and I see the same errors as the original reporter sees.

Revision history for this message
Kees Cook (kees) wrote :

Why is this certain to be an nfs-utils bug? There is no newer upstream version, and if "nolock" works, that seems like the kernel broke something to me.

Changed in linux:
status: Invalid → Confirmed
importance: Undecided → High
Revision history for this message
Simon L'nu (simon-lnu) wrote : Re: [Bug 306016] Re: nfs client broken since 2.6.28-2-generic upgrade

2008/12/19 Kees Cook <email address hidden>:
> Why is this certain to be an nfs-utils bug? There is no newer upstream
> version, and if "nolock" works, that seems like the kernel broke
> something to me.
>

that's a very good point ;). i think you're right too, tbh.

Changed in linux:
status: Unknown → Confirmed
Revision history for this message
tdflanders (thomasdelbeke) wrote :

Hi there,

It works fine for me now. Can you try:

root# apt-get install libntfs-3g-dev libntfs-3g31 libntfs-dev libntfs10 ; apt-get build-dep libntfs-3g-dev libntfs-3g31 libntfs-dev libntfs10 ; dpkg-reconfigure libntfs-3g-dev libntfs-3g31 libntfs-dev libntfs10 ; apt-get install nfs-common ; apt-get build-dep nfs-common ; dpkg-reconfigure nfs-common

After that you can create a mount point for your ntfs share: System > Administration > NTFS Configuration Tool

Enable internal and external devices.

After that, try and go to /media and right-click your nfs-share. Now create a mount point for this too. After that it just worked for me (before I even installed nfs-common!).

Anyway,

that is just what I experienced.

Cheers,

Thomas

Revision history for this message
tdflanders (thomasdelbeke) wrote :

see:

 Bug #308537:

Revision history for this message
Peter Cordes (peter-cordes) wrote :

On Sat, Dec 20, 2008 at 04:16:55AM -0000, tdflanders wrote:
>
> Hi there,
>
> It works fine for me now. Can you try:
>
> root# apt-get install libntfs-3g-dev libntfs-3g31 libntfs-dev libntfs10

 Are you confusing NFS with NTFS? They're completely different
things... If you aren't, then the NFS problems probably aren't
specific to exporting NTFS filesystems. I mostly use XFS (which, like
ext3, has support in the Linux kernel, not through FUSE like NTFS 3g).

 Anyway, this NTFS stuff seems to be coming out of nowhere, maybe
since your bug is marked as a duplicate of 306016, and I haven't read
most of the info on bug 308537. If you're now talking about a
different bug that's specific to NTFS, unmark that as a duplicate or
something.

> --
> nfs client broken since 2.6.28-2-generic upgrade
> https://bugs.launchpad.net/bugs/306016
> You received this bug notification because you are a direct subscriber
> of the bug.

--
#define X(x,y) x##y
Peter Cordes ; e-mail: X(peter@cor , des.ca)

"The gods confound the man who first found out how to distinguish the hours!
 Confound him, too, who in this place set up a sundial, to cut and hack
 my day so wretchedly into small pieces!" -- Plautus, 200 BC

Revision history for this message
Noel J. Bergman (noeljb) wrote :

Peter,

Bug 308537 is solely about an install problem where both Thomas and I encountered errors when simply installing nfs-common on Jaunty. I agree with you that I don't see any reason why NTFS would have any effect on NFS, and my volumes are either ext3 or xfs.

Thomas, what mount command(s) are you using? I have tried:

  mount -t nfs host:/path /mount-point
  mount -t nfs -o nolock host:/path /mount-point
  mount -t nfs4 host:/path /mount-point

Only the middle one works.
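
For shares mounted from /etc/fstab, the same workaround can be written as a mount option. This is only a sketch, reusing the placeholder host:/path and /mount-point from the commands above. Note that with nolock the NLM locking protocol is not used, so file locks only provide exclusion between applications on the same client:

```
# <file system>  <mount point>  <type>  <options>  <dump>  <pass>
host:/path       /mount-point   nfs     nolock     0       0
```

This only sidesteps the lock manager registration; it is not a fix for the underlying problem.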

Revision history for this message
3vi1 (launchpad-net-eternaldusk) wrote :

I can concur: the volume on which I now see the problem is not NTFS either.

Revision history for this message
David Erosa (erosa) wrote :

I confirm both the bug and the "-o nolock" workaround on a "fresh" Jaunty installation (as far as jaunty can be fresh installed...)

Andy Whitcroft (apw)
Changed in linux:
assignee: nobody → apw
status: Confirmed → In Progress
Kees Cook (kees)
Changed in nfs-utils:
status: Confirmed → Invalid
Changed in linux:
status: Confirmed → Invalid
Revision history for this message
Andy Whitcroft (apw) wrote :

The upstream maintainer indicates that this is a mismatch between our kernel and userspace. Specifically, we are using portmapper in userspace but have the kernel side of rpcbind enabled. I spun a patch to disable CONFIG_SUNRPC_REGISTER_V4 and submitted it. Building some test kernels now.
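
To see which side of this mismatch a given kernel is on, one could inspect its build config. A minimal sketch, assuming the config is shipped as /boot/config-<release> (the usual Debian/Ubuntu layout); the function name sunrpc_v4_state is made up for illustration:

```shell
#!/bin/sh
# Report how CONFIG_SUNRPC_REGISTER_V4 is set in a kernel config file.
# Usage: sunrpc_v4_state [config-file]
# With no argument, the config of the running kernel is checked
# (assumed to live at /boot/config-<kernel release>).
sunrpc_v4_state() {
    cfg="${1:-/boot/config-$(uname -r)}"
    if [ ! -r "$cfg" ]; then
        # Config file missing or unreadable: we cannot tell.
        echo "unknown"
        return 2
    fi
    # An enabled option appears as "CONFIG_...=y"; a disabled one is
    # either absent or written as "# CONFIG_... is not set".
    if grep -q '^CONFIG_SUNRPC_REGISTER_V4=y' "$cfg"; then
        echo "enabled"
    else
        echo "disabled"
    fi
}
```

Calling sunrpc_v4_state with no argument checks the running kernel. A kernel that reports "enabled" while userspace still runs the legacy portmapper would match the mismatch described above.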

Revision history for this message
Noel J. Bergman (noeljb) wrote :

Thanks, Andy and Kees. Question: having read the discussion at the upstream bug, I'll ask the obvious: why is Jaunty using the "legacy Linux portmapper daemon" instead of rpcbind?

Revision history for this message
Andy Whitcroft (apw) wrote :

The new mode is still marked experimental (depending on CONFIG_EXPERIMENTAL), so it's not deemed ready for mainstream use. We should have it disabled for the Jaunty cycle at least. It is likely something to be considered for the K cycle.

Revision history for this message
Andy Whitcroft (apw) wrote :

I have built some test kernels with this config change applied. These are based on the current jaunty kernel. If you could test these kernels and report back here, that would be great. These kernels are available at the URL below:

    http://people.ubuntu.com/~apw/lp306016/

Revision history for this message
Kees Cook (kees) wrote :

I booted the test kernel and it fixed the problem for me. Thanks very much!

Revision history for this message
Noel J. Bergman (noeljb) wrote :

Andy, I also booted the test kernel, and can report that all of:

  mount -t nfs -o nolock noel-ubuntu.local:/ /tmp/nfsshare/
  mount -t nfs noel-ubuntu.local:/ /tmp/nfsshare/
  mount -t nfs4 -orsize=32768,wsize=32768 noel-ubuntu.local:/ /tmp/nfsshare/

work just fine. So that's all good news, and thank you for it. :-)

HOWEVER ... unrelated to this bug, but a concern about the kernel you built. Did you do anything else different from a stock Ubuntu Jaunty kernel? Because if I boot to the previous kernel, Thinkfinger works, but it does not work with this kernel. Very reproducible, depending upon which kernel I boot. If this is something you want to follow up, feel free to open a new bug (or I can) and I'll pitch in with my details. If this is not unexpected for this test build, I'll chalk it up to a quirk of the build, and wait for a released kernel package set.

Revision history for this message
Simon L'nu (simon-lnu) wrote :

worked wonderfully, thank you very much :D.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.28-4.5

---------------
linux (2.6.28-4.5) jaunty; urgency=low

  [ Andy Whitcroft ]

  * clean up module dependancy information on package removal/purge
    - LP: #300773

  [ Tim Gardner ]

  * Update iscsitarget to 0.4.17
  * Build in ext{234}
  * Build in Crypto modules AES, CBC, ECB
  * Build in ACPI AC,BATTERY,BUTTON,FAN,PCI_SLOT,PROCESSOR,SBS,THERMAL,WMI
  * Build in AGP intel,via,sis,ali,amd,amd64,efficeon,nvidia,sworks
  * Build in ata,dev_dm,dev_loop,dev_md,dev_sd,dev_sr
  * Build in BT l2cap,rfcomm,sco
  * Reduce CONFIG_LEGACY_PTY_COUNT to 0
  * Build in CDROM_PKTCDVD and CHR_DEV_SG
  * Build in CPU_FREQ
    GOV_CONSERVATIVE,GOV_ONDEMAND,GOV_POWERSAVE,GOV_USERSPACE,STAT,TABLE
  * Build in DM CRYPT,MIRROR,MULTIPATH,SNAPSHOT
  * Build in DRM
  * Build in HID
  * Build in HOTPLUG PCI,PCIE
  * Build in I2C
  * Build in IEEE1394 OHCI1394
  * Build in INPUT EVDEV
  * Build in IPV6
  * Build in MMC
  * Build in PACKET
  * Enable both IEEE1394 (Firewire) stacks as modules
    - LP: #276463
  * Disable SUNRPC_REGISTER_V4
    - LP: #306016
  * Enable dm-raid4-5
    - LP: #309378
  * Build in PPP
  * Build in RFKILL
  * Build in USB SERIAL

  [ Upstream Kernel Changes ]

  * Rebased to v2.6.28

 -- Tim Gardner <email address hidden> Thu, 18 Dec 2008 21:18:44 -0700

Changed in linux:
status: In Progress → Fix Released
Revision history for this message
Noel J. Bergman (noeljb) wrote :

Andy, that bug I reported is present in the released kernel. I have initiated Bug 311732 for it.

Revision history for this message
Michael Shadle (mshadle) wrote :

This has been an issue for me for a long time using intrepid too.

I have not upgraded to jaunty yet.

Kernel versions on only *one* machine seem to not start nlockmgr.

broken machine:

[root@lvs01 etc]# rpcinfo -p
   program vers proto port
    100000 2 tcp 111 portmapper
    100024 1 udp 64054 status
    100024 1 tcp 49410 status
    100000 2 udp 111 portmapper

good machine:

[root@web02 default]# rpcinfo -p
   program vers proto port
    100000 2 tcp 111 portmapper
    100000 2 udp 111 portmapper
    100024 1 udp 26775 status
    100024 1 tcp 20961 status
    100021 1 tcp 17290 nlockmgr
    100021 3 tcp 17290 nlockmgr
    100021 4 tcp 17290 nlockmgr

same identical configs in /etc that matter, same everything. only one machine out of 6 exhibited this behavior.

looks like adding "nolock" fixed this. but i was hoping the 2.6.28 update in intrepid would fix it and it still hasn't. the last stable kernel i had was 2.6.24-16-server. since then every kernel has had this issue just on this one box (all identically configured, identical dpkg lists even - i dpkg -l and md5 them and i keep them identical...)

PLEASE have this fixed asap and if possible push it back to intrepid until jaunty becomes the new production.
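
The difference between the two listings above is the missing nlockmgr entry (RPC program 100021). A small sketch of how that check could be scripted; the function name has_nlockmgr is hypothetical, and it only inspects `rpcinfo -p` style text on stdin:

```shell
# Succeed (exit 0) if the rpcinfo -p listing on stdin contains an entry
# for RPC program 100021 (nlockmgr); fail (exit 1) otherwise.
has_nlockmgr() {
    awk '$1 == 100021 { found = 1 } END { exit !found }'
}
```

For example, `rpcinfo -p | has_nlockmgr || echo "nlockmgr not registered"` would flag a machine in the "broken" state shown above.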

Revision history for this message
Peter Cordes (peter-cordes) wrote : Re: [Bug 306016] Re: nfs client broken since 2.6.28-2-generic upgrade

On Thu, Mar 26, 2009 at 07:38:36PM -0000, mike503 wrote:
> This has been an issue for me for a long time using intrepid too.

 You're the only one hitting this on Intrepid. Unless you're using a
Jaunty kernel on your Intrepid system. It's been fixed in Jaunty for
months. The only way you could be having this exact bug is if you are
using an out-of-date Jaunty kernel.

If your NFS is borked, and you're using
linux-image-2.6.27-14-server ver 2.6.27-14.30 (the latest Intrepid) or
linux-image-2.6.28-11-server ver 2.6.28-11.37 (the latest Jaunty)
then your problem lies elsewhere.

 NFS has worked fine for me with Intrepid, and with Jaunty after this
was fixed in linux (2.6.28-4.5).

Revision history for this message
Michael Shadle (mshadle) wrote :

I'm using the latest Intrepid packages of everything. NFS seems to work great on everything but one box and one box only, still. Using the "nolock" mount flag workaround appeared to fix it at first, but wound up freezing up again after a while.

I had to go back to using 2.6.24-16-server on that one machine and it's okay. Any of the newer kernels don't work.

I have not tried Jaunty kernels yet. I did try linux-image-2.6.28-2-server_2.6.28-2.3_amd64.deb which might be one of the earlier jaunty ones a few months ago with no luck either.

Same fstab, same /etc/default config files, same dpkg list (identical md5 hashes), same mount strings, same network settings, etc.

Odd that I can't get a newer kernel on this one single machine to work. I'm hoping when Jaunty becomes stable I can upgrade everything and be okay.

Revision history for this message
buntunub (mckisick) wrote :

Oh. I just noticed this --

nepomukservices[4039]: segfault at 8 ip 00007fcbbf332cfe sp 00007fffc7935b50 error 4 in libQtCore.so.4.5.0[7fcbbf2c9000+240000]

And that does happen every boot, but I guess thats yet another bug to follow up on with Jaunty..

Anyway, I have googled this nfs issue quite a lot, and from what I have seen, upstream doesn't seem to put much stock in it, as it appears to be some type of callback issue that is harmless as far as the protocol goes. That's above my head, so I am really clueless here, but considering this issue never cropped up with Fedora10 when I had that loaded on this same client, nor with Hardy on the same machine, one would have to assume that whatever was done with the Jaunty kernel is causing this problem. I also disabled the firewall, and modified the fstab with "clientaddr=192.168.11.3", thinking THAT might be causing a problem, but no joy!

Revision history for this message
buntunub (mckisick) wrote :

Sorry, posted on wrong bug.

Revision history for this message
James Sparenberg (james-linuxrebel) wrote :

First, thanks for fixing this. Second, I've found a workaround for the bug: if you install nfs-kernel-server, you can mount and unmount NFS shares without a problem. It seems whatever is missing from kernel 2.6.28-15 is in the nfs-kernel-server package.

Hope this helps someone trying to solve an immediate problem. --thanks again for the work.

Changed in linux:
importance: Unknown → Medium