NetBSD Problem Report #15463

Received: (qmail 624 invoked from network); 2 Feb 2002 17:39:43 -0000
Message-Id: <200202021739.g12HdfA07242@cleopatra.empire.pick.ucam.org>
Date: Sat, 2 Feb 2002 17:39:41 GMT
From: matthew@debian.org
Reply-To: matthew@debian.org
To: gnats-bugs@gnats.netbsd.org
Subject: NFS client bug - incorrect caching of file-handles 
X-Send-Pr-Version: 3.95

>Number:         15463
>Category:       kern
>Synopsis:       nfs client incorrectly caches file handles, and erroniously returns EINVAL
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Feb 02 17:40:01 +0000 2002
>Closed-Date:    
>Last-Modified:  Wed May 18 13:05:01 +0000 2005
>Originator:     Matthew Vernon
>Release:        NetBSD 1.5.1
>Organization:
The Debian project. Also, the University of Cambridge
>Environment:
nfs server is a Linux user-land server (Debian GNU/Linux testing)
The FS was stressed, as software was being built on the nfs-mounted partition
on which the fs-errors occured
System: NetBSD cleopatra 1.5.2 NetBSD 1.5.2 (GENERIC) #3: Sat Aug 18 23:37:05 CEST 2001 he@hamster.urc.uninett.no:/usr/src/sys/arch/i386/compile/GENERIC i386


>Description:
This problem came to light as I was trying to build .debs on an nfs-mounted
partition; apparantly-random errors (symlinks not being recognised as such,
spurious EISDIRs and so on) were being reported by netbsd, which did not 
match the state of the fs (when checked on the server, or using a different 
client). Accordingly, I ran tcpdump, mounted the partition, started building
some software, and waited for things to break. Much of the analysis of the 
resulting tcpdump data was done by Ian Jackson:

Here is my analysis, as requested.  I looked at the output of
 /usr/sbin/tcpdump -uvvvvs500 -r ethcapture not port 500
and found that the value of the relevant NFS filehandle,
 7A7F03700650E7626DA575000000000000000000000000000000000000000000
which I searched for.

Summary: the client seems to have cached the fact that the fh used to
refer to a symlink and keeps trying to readlink it despite several
lookup and getattr calls and EINVALs from readlink telling it that
it's a file.


The detailed sequence of events is as follows:

At 14:35:45.097718 the client looks up the name `Makefile' in some
directory or other and gets given the fh in question; it's told it's a
symlink.  The client removes it straight away by making an NFS remove
call on the name `Makefile' in that directory.

At 14:36:28.799930 the client creates a file `md2_dgst.o' in some
other directory.  Evidently the server reuses the original inode
number at this point, because the client gets the fh in question back
from the create call.

At 14:36:28.859574 the client calls getattr on the fh; it's told the
it's a file.

At 14:36:28.879670 the client looks up `md2_dgst.o' again and again
gets given the same fh, with a note saying that it's a file.  But, the
client tries to call readlink on it, which fails with EINVAL.

At 14:37:22.586938 the client does the same again - looks it up, is
given the fh and told it's a file, makes a failing call to readlink.

At 14:39:49.724363 the client calls getattr on the fh.  It again gets
told it's a file.  It then calls readlink straight away, which fails.

At 14:39:51.425057 the client calls readlink again, which fails.
I think this corresponds to the error message on the client:
 cat: /debian/import/openssl-0.9.6c/crypto/md2/md2_dgst.o: Invalid

You say that now, nearly an hour later, it's still broken.  The
conclusion I come to is that NetBSD has improperly cached the fact
that the fh used to refer to a symlink, despite much time passing and
much evidence to the contrary.

Relevant tcpdump bits below.

Ian.

14:35:45.096982 cleopatra.empire.pick.ucam.org.2482867553 > ming.empire.pick.ucam.org.nfs: 132 lookup fh Unknown/797F03700550E7626DA500000000000000000000000000000000000000000000 "Makefile" (ttl 64, id 50606, len 160)
14:35:45.097718 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482867553: reply ok 128 lookup fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 LNK 120777 ids 4015/500 sz 12 nlink 1 rdev 847 fsid 1 nodeid 70037f7a a/m/ctime 1012487679.000000 1012487679.000000 1012487679.000000  (ttl 64, id 43090, len 156)
14:35:45.098680 cleopatra.empire.pick.ucam.org.2482867554 > ming.empire.pick.ucam.org.nfs: 132 lookup fh Unknown/797F03700550E7626DA500000000000000000000000000000000000000000000 "Makefile" (ttl 64, id 50607, len 160)
14:35:45.099028 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482867554: reply ok 128 lookup fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 LNK 120777 ids 4015/500 sz 12 nlink 1 rdev 847 fsid 1 nodeid 7
14:35:45.099729 cleopatra.empire.pick.ucam.org.2482867555 > ming.empire.pick.ucam.org.nfs: 132 remove fh Unknown/797F03700550E7626DA500000000000000000000000000000000000000000000 "Makefile" (ttl 64, id 50608, len 160)
14:35:45.100131 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482867555: reply ok 28 remove (ttl 64, id 43092, len 56)

14:36:28.796326 cleopatra.empire.pick.ucam.org.2482873186 > ming.empire.pick.ucam.org.nfs: 136 lookup fh Unknown/797F03700550E7626DA500000000000000000000000000000000000000000000 "md2_dgst.o" (ttl 64, id 56494, len 164)
14:36:28.796860 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482873186: reply ok 28 lookup ERROR: No such file or directory (ttl 64, id 49113, len 56)
14:36:28.799265 cleopatra.empire.pick.ucam.org.2482873187 > ming.empire.pick.ucam.org.nfs: 168 create fh Unknown/797F03700550E7626DA500000000000000000000000000000000000000000000 "md2_dgst.o" (ttl 64, id 56495, len 196)
14:36:28.799930 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482873187: reply ok 128 create fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 REG 100644 ids 0/0 sz 0 nlink 1 rdev 201 fsid 1 nodeid 70037f7a a/m/ctime 1012487788.000000 1012487788.000000 1012487788.000000  (ttl 64, id 49114, len 156)

14:36:28.859574 cleopatra.empire.pick.ucam.org.2482873191 > ming.empire.pick.ucam.org.nfs: 120 getattr fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 (ttl 64, id 56499, len 148)
14:36:28.860327 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482873191: reply ok 96 getattr REG 100644 ids 0/0 sz 0  (ttl 64, id 49118, len 124)

14:36:28.879209 cleopatra.empire.pick.ucam.org.2482873196 > ming.empire.pick.ucam.org.nfs: 136 lookup fh Unknown/797F03700550E7626DA500000000000000000000000000000000000000000000 "md2_dgst.o" (ttl 64, id 56510, len 164)
14:36:28.879670 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482873196: reply ok 128 lookup fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 REG 100644 ids 0/0 sz 0 nlink 1 rdev 201 fsid 1 nodeid 70037f7a a/m/ctime 1012487788.000000 1012487788.000000 1012487788.000000  (ttl 64, id 49123, len 156)
14:36:28.881779 cleopatra.empire.pick.ucam.org.2482873197 > ming.empire.pick.ucam.org.nfs: 120 readlink fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 (ttl 64, id 56511, len 148)
14:36:28.882129 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482873197: reply ok 28 readlink ERROR: Invalid argument (ttl 64, id 49125, len 56)

14:37:22.586569 cleopatra.empire.pick.ucam.org.2482873261 > ming.empire.pick.ucam.org.nfs: 136 lookup fh Unknown/797F03700550E7626DA500000000000000000000000000000000000000000000 "md2_dgst.o" (ttl 64, id 56636, len 164)
14:37:22.586938 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482873261: reply ok 128 lookup fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 REG 100644 ids 0/0 sz 0 nlink 1 rdev 201 fsid 1 nodeid 70037f7a a/m/ctime 1012487788.000000 1012487788.000000 1012487788.000000  (ttl 64, id 49411, len 156)
14:37:22.587631 cleopatra.empire.pick.ucam.org.2482873262 > ming.empire.pick.ucam.org.nfs: 120 readlink fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 (ttl 64, id 56637, len 148)
14:37:22.587912 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482873262: reply ok 28 readlink ERROR: Invalid argument (ttl 64, id 49412, len 56)

14:39:49.724363 cleopatra.empire.pick.ucam.org.2482873296 > ming.empire.pick.ucam.org.nfs: 120 getattr fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 (ttl 64, id 56770, len 148)
14:39:49.724635 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482873296: reply ok 96 getattr REG 100644 ids 0/0 sz 0  (ttl 64, id 50150, len 124)
14:39:49.725293 cleopatra.empire.pick.ucam.org.2482873297 > ming.empire.pick.ucam.org.nfs: 120 readlink fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 (ttl 64, id 56771, len 148)
14:39:49.725590 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482873297: reply ok 28 readlink ERROR: Invalid argument (ttl 64, id 50151, len 56)
14:39:51.425057 cleopatra.empire.pick.ucam.org.2482873298 > ming.empire.pick.ucam.org.nfs: 120 readlink fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 (ttl 64, id 56775, len 148)
14:39:51.425452 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482873298: reply ok 28 readlink ERROR: Invalid argument (ttl 64, id 50160, len 56)

>How-To-Repeat:
nfs-mount a partition from a linux server, and do something that involves a lot 
of creating and deleting files - such as building a large source package 
>Fix:
As noted above, the behaviour with regard to file-handles the client has cached
from nfs servers should be changed to behave more intelligently if the fh 
appears to no longer be what the client thought it was
>Release-Note:
>Audit-Trail:

From: Jaromir Dolecek <jdolecek@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc:  
Subject: Re: kern/15463: nfs client incorrectly caches file handles, and erroniously
 returns EINVAL
Date: Sat, 2 Feb 2002 22:39:19 +0100 (CET)

 Hi,
 just as a proof of concept, can you confirm that
 you can repeat the problem if you force the NetBSD client
 to use NFSv2 (i.e. mount with -2 option)?

 Also, you seem to suggest that a link is created, then
 removed and then a file is created in the erratic session.
 However, the tcpdump includes removal of 'Makefile'
 and lookup/create is for 'md2_dgst.o'.

 It's my guts feeling that if the symlink is removed
 and replaced with some other object, the file handle
 should change, or at least the server should tell the client
 the file handle is no longer valid. gettattr call only
 is there to get file size/mdate/cdate, the client won't
 change it's information about file type if the type
 suddenly changes, that is not expected to happen.

 I'm not a NFS expert tho, so maybe I'm way off.

 Jaromir
 -- 
 Jaromir Dolecek <jdolecek@NetBSD.org> http://www.NetBSD.org/Ports/i386/ps2.html
 -=  Those who would give up liberty for a little temporary safety deserve  =-
 -=  neither liberty nor safety, and will lose both.  -- Benjamin Franklin  =-

From: David Laight <david@l8s.co.uk>
To: "tech-net@netbsd.org" <tech-net@netbsd.org>
Cc: gnats-bugs@gnats.netbsd.org
Subject: Re: kern/15463: NFS client bug - incorrect caching of file-handles
Date: Mon, 04 Feb 2002 18:19:42 +0000

 matthew@debian.org wrote:

 > nfs server is a Linux user-land server (Debian GNU/Linux testing)

 Is this NFS over UDP or NFS over TCP?
 If TCP, Does the server do ANY validation of file handle?  (ie check
 that it was given out over the same TCP connection - is this actually
 allowed?)

 There seem to be 2 problems here:

 1) The NetBSD NFS client is failing to update information returned when
 it gets a 'duplicate' file handle.  This shouldn't really ever happen
 though!

 2) The NFS server is reusing file handles.  I'm not exactly sure what
 the lifetime of a file handle is, but I think the NFS (over UDP)
 protocol requires it to be 'for ever'.  This is the more serious bug!

 For instance, consider following senario:

 1) system A looks up file 'F1' and is given file handle 'H-AF1', the
 file
    is 'opened' by an application.
 2) system B deletes file 'F1'
 3) Any access to file 'F1' from system A would now fail ESTALE
 4) File 'F2' is created, reusing the inode number that F1 had.
 5) system A tries to write to file F1 - should fail ESTALE

 Now, if he NFS server is assuming that the inode number a is 'good
 enough' file handle it will write to (corrupt) file F2.  If it is only
 willing to access files it knows the mapping for, then maybe system 'B'
 doing a lookup on file F2 will generate the handle 'H-AF1' (now mapping
 to file F2) so that a write to F1 will succeed (corrupting file F2). 
 Maybe it needs system A to access file F2.  In any case this replication
 of file handles is a serious bug in the server code.

 I don't know what the NFS server is trying to use as a file handle?  If
 it can't access the inode generation number (which has probably been
 'clobbered' from the stat response for 'security' reasons), then it must
 allocate a random number and keep state somewhere.

 	David

 =====

 I've cut the bumf, just leaving the trace of the NFS responses for the
 two 'files' - the filehandle is being reused by the server!

 Response for lookup of Makefile:
 > 14:35:45.097718 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482867553: reply ok 128 lookup fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 LNK 120777 ids 4015/500 sz 12 nlink 1 rdev 847 fsid 1 nodeid 70037f7a a/m/ctime 1012487679.000000 1012487679.000000 1012487679.000000  (ttl 64, id 43090, len 156)

 'Makefile' deleted (using NFS, but that doesn't acually matter!)

 Response for lookup of md2_dgst.o:
 > 14:36:28.799930 ming.empire.pick.ucam.org.nfs > cleopatra.empire.pick.ucam.org.2482873187: reply ok 128 create fh Unknown/7A7F03700650E7626DA575000000000000000000000000000000000000000000 REG 100644 ids 0/0 sz 0 nlink 1 rdev 201 fsid 1 nodeid 70037f7a a/m/ctime 1012487788.000000 1012487788.000000 1012487788.000000  (ttl 64, id 49114, len 156)

From: Olaf Seibert <rhialto@polderland.nl>
To: David Laight <david@l8s.co.uk>
Cc: "tech-net@netbsd.org" <tech-net@netbsd.org>, gnats-bugs@gnats.netbsd.org
Subject: Re: kern/15463: NFS client bug - incorrect caching of file-handles
Date: Mon, 4 Feb 2002 21:33:56 +0100

 A long time ago I also discussed a problem where the . and .. entries of
 a directory would seem to be the same inode, and other obvious wrong
 inodes were present. (I cannot find it in my archives though) This was
 with a commercial NFS server running on WNT called DiskShare from
 Intergraph. Looking at recent discussions, it might well have been a
 filehandle re-using problem, since the server was known to be buggy
 anyway: if you did "mv foo Foo", your file foo would disappear...
 But I am currently in no position to verify this theory.

 -Olaf.
 -- 
 ___ Olaf 'Rhialto' Seibert - rhialto@polder --They that can give up essential 
 \X/ land.nl --liberty to purchase a little temporary safety
 --------------deserve neither liberty or safety. - Benjamin Franklin, 1759
 ---People will accept your ideas much more readily if you tell them that
 ---Benjamin Franklin said it first. - Unknown

From: der Mouse <mouse@Rodents.Montreal.QC.CA>
To: tech-net@netbsd.org, gnats-bugs@gnats.netbsd.org
Cc:  
Subject: Re: kern/15463: NFS client bug - incorrect caching of file-handles
Date: Mon, 4 Feb 2002 16:50:29 -0500 (EST)

 > A long time ago I also discussed a problem where the . and .. entries
 > of a directory would seem to be the same inode, and other obvious
 > wrong inodes were present.

 This reminds me of a run-in I had with some commercial NFS-server box.
 I think it was a Network Appliance and the problem was related to their
 snapshot directories, but can't recall details (and I've long since
 left the environment where it occurred).  I think it was handing out
 non-unique file IDs within a single mount (though with different
 filehandles).  Confused userland programs that expected <st_dev,st_ino>
 pairs to be unique.

 /~\ The ASCII				der Mouse
 \ / Ribbon Campaign
  X  Against HTML	       mouse@rodents.montreal.qc.ca
 / \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
From: pancake@phreaker.net
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/15463
Date: Wed, 18 May 2005 15:04:01 +0200 (CEST)

 >Submitter-Id:	net
 >Originator:	
 >Organization:

 >Confidential:	no
 >Synopsis:	Problem persists on -current
 >Severity:	serious
 >Priority:	high
 >Category:	kern
 >Class:		sw-bug
 >Release:	NetBSD 3.99.3


 >Environment:


 System: NetBSD xenun 3.99.3 NetBSD 3.99.3 (XenU) #13: Wed May 18 02:13:20 CEST 2005 juan@Nocturno:/home/juan/build/obj/sys/arch/i386/compile/XenU i386
 Architecture: i386
 Machine: i386
 >Description:
 	This problem is not yet fixed.

 >How-To-Repeat:

 >Fix:


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.