NetBSD Problem Report #47739

From wiz@yt.nih.at  Sat Apr 13 10:43:44 2013
Return-Path: <wiz@yt.nih.at>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	by www.NetBSD.org (Postfix) with ESMTP id 78D4863F415
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 13 Apr 2013 10:43:44 +0000 (UTC)
Message-Id: <20130413104335.B8EBE2AC736@yt.nih.at>
Date: Sat, 13 Apr 2013 12:43:35 +0200 (CEST)
From: Thomas Klausner <wiz@NetBSD.org>
Reply-To: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Subject: tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
X-Send-Pr-Version: 3.95

>Number:         47739
>Category:       kern
>Synopsis:       tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    rmind
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Apr 13 10:45:00 +0000 2013
>Closed-Date:    Fri Nov 08 15:55:26 +0000 2013
>Last-Modified:  Fri Nov 08 15:55:26 +0000 2013
>Originator:     Thomas Klausner
>Release:        NetBSD 6.99.19
>Organization:
Curiosity is the very basis of education and if you tell me that 
curiosity killed the cat, I say only that the cat died nobly.
- Arnold Edinborough
>Environment:


System: NetBSD yt.nih.at 6.99.19 NetBSD 6.99.19 (KVOTHE) #0: Sun Apr 7 19:52:05 CEST 2013 wiz@yt.nih.at:/archive/foreign/src/sys/arch/amd64/compile/obj/KVOTHE amd64
Architecture: x86_64
Machine: amd64
>Description:
My machine just panicked while running X, so there is no backtrace.
savecore reported:
Checking for core dump...
savecore: kvm_read: invalid translation (invalid level 2 PDE)
savecore: reboot after panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL || tmpfs_dircookie((node)->tn_spec.tn_dir.tn_readdir_lastp) == (node)->tn_spec.tn_dir.tn_readdir_lastn" failed: file "/archive/foreign/src/sys/fs/tmpfs/tmpfs_subr.c", line 610
savecore: system went down at Sat Apr 13 12:15:32 2013

savecore: writing compressed core to /var/crash/netbsd.12.core.gz

>How-To-Repeat:
Run a bulk build inside tmpfs, get unlucky.
Previously happened on March 3, so it's not exactly common.
>Fix:
Not known.

>Release-Note:

>Audit-Trail:
From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/47739
Date: Wed, 8 May 2013 11:30:25 +0100

 Just got a working core file:               

 (gdb) bt                                     
 #0  0xffffffff804e40de in cpu_reboot (howto=260, bootstr=<optimized out>)
     at ../../../../arch/amd64/amd64/machdep.c:705
 #1  0xffffffff80696bad in vpanic (            
     fmt=0xffffffff8096f688 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ", ap=0xfffffe811b1769d0) at ../../../../kern/subr_prf.c:284
 #2  0xffffffff80834368 in kern_assert (fmt=<unavailable>)
     at ../../../../../../lib/libkern/kern_assert.c:50
 #3  0xffffffff806de80c in VP_TO_TMPFS_DIR (vp=<optimized out>)
     at ../../../../fs/tmpfs/tmpfs.h:357        
 #4  tmpfs_readdir (v=<optimized out>) at ../../../../fs/tmpfs/tmpfs_vnops.c:938 
 #5  0xffffffff807daa33 in VOP_READDIR (vp=0xfffffe81591b9be8,
     uio=<optimized out>, cred=<optimized out>, eofflag=<optimized out>,
     cookies=<optimized out>, ncookies=<optimized out>)  
     at ../../../../kern/vnode_if.c:952               
 #6  0xffffffff807c507b in vn_readdir (fp=0xfffffe811bc1c940,
     bf=0x7f7ff770b000 <Address 0x7f7ff770b000 out of bounds>, segflg=0,
     count=<optimized out>, done=0xfffffe811b176bec, l=0xfffffe81105f3040,
     cookies=0x0, ncookies=0x0) at ../../../../kern/vfs_vnops.c:470
 #7  0xffffffff807c07c1 in sys___getdents30 (l=0xfffffe81105f3040,
     uap=0xfffffe811b176c80, retval=0xfffffe811b176c30)
     at ../../../../kern/vfs_syscalls.c:4611             
 #8  0xffffffff806affe4 in sy_call (rval=0xfffffe811b176c30,
     uap=0xfffffe811b176c80, l=0xfffffe81105f3040, sy=0xffffffff80c99460)

 dmesg is full of

 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON SYSCALL 16445 -151703264 EXIT 0 7
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0   
 ...
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 sys___getdents30() at netbsd:sys___getdents30+0x76
 WARNING: SPL NOT LOWERED ON SYSCALL 24678 -1 EXIT f7b2b400 6
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0 
 ...

 (gdb) frame 3
 #3  0xffffffff806de80c in VP_TO_TMPFS_DIR (vp=<optimized out>)
     at ../../../../fs/tmpfs/tmpfs.h:357
 357             TMPFS_VALIDATE_DIR(node);
 (gdb) list
 352     VP_TO_TMPFS_DIR(vnode_t *vp)
 353     {
 354             tmpfs_node_t *node = vp->v_data;
 355
 356             KASSERT(node != NULL);
 357             TMPFS_VALIDATE_DIR(node);
 358             return node;
 359     }
 360
 361     #endif /* defined(_KERNEL) */
 (gdb) frame 5
 #5  0xffffffff807daa33 in VOP_READDIR (vp=0xfffffe81591b9be8, 
     uio=<optimized out>, cred=<optimized out>, eofflag=<optimized out>, 
     cookies=<optimized out>, ncookies=<optimized out>)
     at ../../../../kern/vnode_if.c:952
 952             error = (VCALL(vp, VOFFSET(vop_readdir), &a));
 (gdb) list
 947             a.a_eofflag = eofflag;
 948             a.a_cookies = cookies;
 949             a.a_ncookies = ncookies;
 950             mpsafe = (vp->v_vflag & VV_MPSAFE);
 951             if (!mpsafe) { KERNEL_LOCK(1, curlwp); }
 952             error = (VCALL(vp, VOFFSET(vop_readdir), &a));
 953             if (!mpsafe) { KERNEL_UNLOCK_ONE(curlwp); }
 954             return error;
 955     }
 956
 (gdb) print *vp
 $1 = {v_uobj = {vmobjlock = 0xfffffe8197f5fac0, pgops = 0xffffffff80957c80, 
     memq = {tqh_first = 0x0, tqh_last = 0xfffffe81591b9bf8}, uo_npages = 0, 
     uo_refs = 1, rb_tree = {rbt_root = 0x0, rbt_ops = 0xffffffff80957a60, 
       rbt_minmax = {0x0, 0x0}}, uo_ubc = {lh_first = 0x0}}, v_cv = {
     cv_opaque = {0x0, 0xfffffe81591b9c38, 0xffffffff809d8ec4}}, 
   v_size = 16720, v_writesize = 16720, v_iflag = 0, v_vflag = 16, v_uflag = 0, 
   v_numoutput = 0, v_writecount = 0, v_holdcnt = 0, v_synclist_slot = 0, 
   v_mount = 0xfffffe8110c43008, v_op = 0xfffffe821db1a748, v_freelist = {
     tqe_next = 0xfffffe8198503148, tqe_prev = 0xfffffe81985037b8}, 
   v_freelisthd = 0x0, v_mntvnodes = {tqe_next = 0xfffffe81591b9620, 
     tqe_prev = 0xfffffe81591b9dd0}, v_cleanblkhd = {lh_first = 0x0}, 
   v_dirtyblkhd = {lh_first = 0x0}, v_synclist = {tqe_next = 0x0, 
     tqe_prev = 0x0}, v_dnclist = {lh_first = 0xfffffe81386c0c00}, v_nclist = {
     lh_first = 0xfffffe813d1b3e40}, v_un = {vu_mountedhere = 0x0, 
     vu_socket = 0x0, vu_specnode = 0x0, vu_fifoinfo = 0x0, vu_ractx = 0x0}, 
   v_type = VDIR, v_tag = VT_TMPFS, v_lock = {rw_owner = 64}, 
   v_data = 0xfffffe81face0660, v_klist = {slh_first = 0x0}}

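 So v_data is not NULL here, which means KASSERT(node != NULL) passed
 and the failing check is the one inside TMPFS_VALIDATE_DIR.

 In frame 4, vp is already optimized out.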

From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Mon, 22 Jul 2013 19:42:17 +0200

 These are highly reproducible.

 Whenever I do a bulk build from scratch, I usually reboot at least
 once due to it.

 Can someone please take a look?
  Thomas

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Wed, 24 Jul 2013 14:22:54 +0000

 I don't see any obvious mismanagement of the tn_readdir fields, so
 this is probably some other kind of corruption.

 Wild guess: tmpfs_rmdir is missing cache_purge(vp), and something is
 trying to list a directory that just got rmdir'd.

 I'm puzzled by the `WARNING: SPL NOT LOWERED' messages in prlw1's
 dmesg.  The syscalls it reports (chroot and recv) don't seem to me to
 be related to tmpfs.  wiz, do you see those messages too, or do you
 have DIAGNOSTIC disabled, or are they a red herring for the tmpfs
 issue?

From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: Taylor R Campbell <riastradh@NetBSD.org>
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Wed, 24 Jul 2013 19:02:45 +0200

 Thanks for looking at this!

 On Wed, Jul 24, 2013 at 02:25:00PM +0000, Taylor R Campbell wrote:
 >  I don't see any obvious mismanagement of the tn_readdir fields, so
 >  this is probably some other kind of corruption.

 Ok.

 >  Wild guess: tmpfs_rmdir is missing cache_purge(vp), and something is
 >  trying to list a directory that just got rmdir'd.

 Could be. Bulk builds do lots of directory creations and removals. I'm
 not sure where the listing process would come from, but since the build
 is parallel, perhaps it's a bug in particular packages that are not
 really safe for parallel builds?

 >  I'm puzzled by the `WARNING: SPL NOT LOWERED' messages in prlw1's
 >  dmesg.  The syscalls it reports (chroot and recv) don't seem to me to
 >  be related to tmpfs.  wiz, do you see those messages too, or do you
 >  have DIAGNOSTIC disabled, or are they a red herring for the tmpfs
 >  issue?

 My dmesg from today still starts with the end of the last panic:

 OT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYvSnC_ArLeLa d1d0i4r (4)  EaXtI T 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 120 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 netbsd:vn_readdir+0x21b
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 128 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 sys___getdents30() at WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 126 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 netbsd:sys___getdents30+0x60
 WARNING: SPL NOT LOWERED ON SYSCALL 132 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 105 4 EXIT 7fc4 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIsTy s7cfael0l (6)
  at WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 117 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
 netbsd:syscall+0xb5
 --- syscall (number 390) ---
 7f7ff6d09d3a:
 cpu10: End traceback...W
 ARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
 WARNING: SPL NOT LOWERED ON SYSCALL 191 4 EXIT 7fe0 6

 dumping to dev 168,8 (offset=8, size=8380291):


 I do have DIAGNOSTIC enabled in this kernel.
  Thomas

From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: Thomas Klausner <wiz@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Thu, 8 Aug 2013 21:27:34 +0100

 Thomas Klausner <wiz@NetBSD.org> wrote:
 > The following reply was made to PR kern/47739; it has been noted by GNATS.
 > 
 > From: Thomas Klausner <wiz@NetBSD.org>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 >  "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
 > Date: Mon, 22 Jul 2013 19:42:17 +0200
 > 
 >  These are highly reproducible.
 >  
 >  Whenever I do a bulk build from scratch, I usually reboot at least
 >  once due to it.
 >  
 >  Can someone please take a look?
 >   Thomas

 Most likely this is due to tmpfs_dircookie() truncation here:

 http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88

 It is wrong and there are other PRs because of it.  Thomas, you can test
 this by replacing the function body with the following:

 	return (off_t)(uintptr_t)de;

 This breaks linux32 compat, but we really just need to decide how we want
 to fix it (it would be good to avoid penalising the native code, but better
 to penalise than fail).

 -- 
 Mindaugas
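
 As an illustration of the truncation described above, the following
 userspace sketch models the cookie expression from tmpfs.h (the one
 removed by the diff later in this PR) and shows two distinct
 directory-entry addresses yielding the same cookie. The demo_* names
 and both addresses are made up for the example; they are not taken
 from the kernel sources or from a core dump.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the old cookie: drop bit 0, keep the next 31 bits, discard
 * everything from bit 32 upwards.  The kernel code shifts a signed
 * off_t; after the 0x7FFFFFFF mask the result is the same as with the
 * unsigned shift used here.
 * (Hypothetical name; the kernel function is tmpfs_dircookie().)
 */
static int64_t
demo_truncated_cookie(uint64_t de_addr)
{
	return (int64_t)((de_addr >> 1) & 0x7FFFFFFF);
}

/* Model of the suggested test change: use the whole address. */
static int64_t
demo_full_cookie(uint64_t de_addr)
{
	return (int64_t)de_addr;
}

/* Self-check with two made-up addresses that differ only in bit 32. */
static void
demo_selfcheck(void)
{
	const uint64_t a = 0xfffffe8100001000ULL;
	const uint64_t b = 0xfffffe8200001000ULL;

	/* Distinct entries, identical truncated cookie. */
	assert(a != b);
	assert(demo_truncated_cookie(a) == demo_truncated_cookie(b));

	/* The full-width cookie keeps them apart. */
	assert(demo_full_cookie(a) != demo_full_cookie(b));
}
```

 Calling demo_selfcheck() from a small main() is enough to try this.
 Any two addresses that agree in bits 1 to 31 collide in the same way,
 which is plausible on an amd64 machine with more than 4 GB of kernel
 address space in use.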

From: Thomas Klausner <wiz@NetBSD.org>
To: Mindaugas Rasiukevicius <rmind@netbsd.org>
Cc: NetBSD bugtracking <gnats-bugs@NetBSD.org>
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Tue, 20 Aug 2013 07:32:56 +0200

 On Thu, Aug 08, 2013 at 09:27:34PM +0100, Mindaugas Rasiukevicius wrote:
 > Thomas Klausner <wiz@NetBSD.org> wrote:
 > > The following reply was made to PR kern/47739; it has been noted by GNATS.
 > > 
 > > From: Thomas Klausner <wiz@NetBSD.org>
 > > To: gnats-bugs@NetBSD.org
 > > Cc: 
 > > Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 > >  "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
 > > Date: Mon, 22 Jul 2013 19:42:17 +0200
 > > 
 > >  These are highly reproducible.
 > >  
 > >  Whenever I do a bulk build from scratch, I usually reboot at least
 > >  once due to it.
 > >  
 > >  Can someone please take a look?
 > >   Thomas
 > 
 > Most likely this is due to tmpfs_dircookie() truncation here:
 > 
 > http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
 > 
 > It is wrong and there are other PRs because of it.  Thomas, you can test
 > this by replacing the function body with the following:
 > 
 > 	return (off_t)(uintptr_t)de;
 > 
 > This breaks linux32 compat, but we really just need to decide how we want
 > to fix it (it would be good to avoid penalising the native code, but better
 > to penalise than fail).

 It survived a bit longer, but rebooted last night.

 Checking for core dump...
 savecore: reboot after panic: WTAA RLRNNOINWGEGR :ES DPS LP OLNN O NTTOW RALTAROP NW LIEEONRGIEETRD E  DO OS0NNP
  LTT RNAPOP T E EXLXIOIWT E 6R6A E 0D0N

  IONNG :T RSAPPL  ENXOITT  L6O WW0E
 RENDI NOGN:  TSRPALP  NEOXTI WTLA OR6WNEIR0NEGD:  OSNP LT RNAOPT  ELXOIWTE R6E D0 
 ON TRAPA RENXIINTG :6  S0P
 L NOT LOWERED ON TRAP EXIT 6A R0N
 IWNAGR
 savecore: system went down at Tue Aug 20 00:39:50 2013

 savecore: writing compressed core to /var/crash/netbsd.42.core.gz

 (gdb) target kvm netbsd.core
 #0  0xffffffff805e3299 in cpu_reboot ()
 (gdb) bt
 #0  0xffffffff805e3299 in cpu_reboot ()
 #1  0xffffffff807e92be in vpanic ()
 #2  0xffffffff8098cc7a in kern_assert ()
 #3  0xffffffff80834a6b in tmpfs_readdir ()
 #4  0xffffffff80926743 in VOP_READDIR ()
 #5  0xffffffff809092fb in vn_readdir ()
 #6  0xffffffff809048e0 in sys___getdents30 ()
 #7  0xffffffff80807225 in syscall ()
 #8  0xffffffff801006a1 in Xsyscall ()
 #9  0x000000000000000a in ?? ()
 #10 0x00007f7ff7b3f000 in ?? ()
 #11 0x0000000000001000 in ?? ()
 #12 0x00007f7ff710c84a in ?? ()
 #13 0x0000000000000ff0 in ?? ()
 #14 0x00007f7ff7bb82d5 in ?? ()
 #15 0x0000000000000000 in ?? ()



  Thomas

From: Thomas Klausner <wiz@NetBSD.org>
To: NetBSD bugtracking <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Mon, 2 Sep 2013 09:00:18 +0200

 Completely unscientifically: I feel that I see fewer panics with the
 patch applied.

 I did have one again last night; dmesg after boot starts with:

  EXIT 6 0
 WARNIWNAGR:N INSGP LS PNWLOA TRN NOLITNO GLWO WESERPRELEDD  OONNNO  TTT RRLAAOPPW E EREXEIT D6  O0X
 I nTTe RtA6P d :Ev
 XW_IArTeRa6N dI0iN
 rG:+ 0SxP2L1 bN
 OT LOWERED ONW ATRRNAIPN GE:X ISTP L6W  AN0O
 N LIONWGE:R ESDP LON  NTWROAATRP  NLEIOXWIGTE: R6  E0PDL  NOON LO WTERRAEPW  AEOXNN IITTRN AG6P:   0ES
 XPILT  N6O T0 
 LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 W0
 ARNING: WSARPNLI NNGOW:T A SNLPIOLN WNE:O RTSE PDLL O OWNNET  RTLERDAW PEORNE EDT XOANPI TRAP  TE XXII6TT  60 6
 0 
 0
 WARNING: SPL WNAORTN ILNOGWW:AE RRSENDIL  ONNNGO :TT  RSALPPOL WE EXNRIETTD   LO6NO WTE
 RAP RXIET D6  W0OANR NTIRNAG:P  SEPXLW ANRTO IT6 G L:0O 
 WSPELR ENDT  OLNO WTERARPE D ONE XTIWRTAAPR 6NE I0XNI
 GT: 6  S0P
 L NOT LOWWEARRENDI NG:W NSA PRTNLI RNNAGOPT   SLEPOXWIE RTNE ODT6   O0ONW 
 ERRAEPD  EOXNI TT R6A P0 
 EXIT 6 0
 WARNWIANRGN:IA NRSGN:PI NLSG:P  LS PNNLOO TNT O LTLO OWWOEWERERREDE DOD NO OTNR  ATTPRR AAEXIPPT  E E6I X0TI
  T6  06
  0
 WARNING: SPLWNAWORATN NILINOGWWG:EA: RR SNEPIDPLN  GONN:OO  TTTSRP LALLOPO W WENEERORXETDI   T 6 O0NLO 
 OSNWY SECTARRLELAD P 1OE1XN8WI  TAT R26NR  IE0NX
 PIGT  :Ef fXSfIPfTLd  9Na6O0 T  60
 OWERED ON TRAP EXIT 6W A0R
 NING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWEARRENDI NOGN :TR AS EXPILT  6WN A0OR
 TN ILNOG: WSEPRLE DN OTO LONW EARRERNA PNO GN:E  TXSRIPALPT   NEO6X  0TL 
 O6W E0R
 ED ON TRAP EXIT 6 0
 WARNING: SWPALR NNIWONAGT:  NLISOPWGLE :RN EOSDP  LLO ONWN OETRR EALDPO  WOENR XETIDRT AP EXOINT   6T6 R00A

 P EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT W6A RN0I
 NG: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNWNAGR:W IASRGP:NL ISNNPOLT :N  OLTOS WPLEORWEEDR  ENOON  TOT NR LATOPWREAEXPRI TEXIT 6  0D
   O0N
  TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WWAARRNWNIAINRNNGIG:N G S:SP PLSL P LNN OONTOT TL LOOOWWEEERREEEDDD  OOONNN T TRRTAAPRP EXIT 6  E0X
 PI TE X6I T0 
 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXWITA R6N I0N
 G: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON SYSCALL 22711 -1 EXIT f7b28400 6
 WARNING: SPL NOT LOWERED ON SYSCALL 0 5 EXIT WffAfRfNaIfN8G0:  6S
 PL NOT LOWERED ON TRAP EXITW A6RN IN0GW:A
 RSNPILN GN:O TS PLLO WNEORTE DL OOWNE RTERDA PO NE XSIYTS C6A LWLA R1N54I64N -G1: E XSITP Lf bN2O8TW4 A0LR0ON WI6N
 EGR: ESDP LO NNO T LTORWAERPE D EOXNI TT RA6P  EXIT0 
 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWWAERRNEID NOGN:  TSRALP  NEOXTT  6L O0W
 ERED ON TRAP EXIT 6 W
 ARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWWEREAD RONN ITRNP GE:X ITS P6L  0
 NOT LOWERED ON TRAP EXITW AR6N I0sNW
 yGAs:R_N _I_SgeP:tL  dSNeOPTtLsL 3ON0WO(E) R LEaODtW  EON RTERDA PO NE XSIYTS C6A L0L
  17186W -A1R5N1I71N2G5W6:8  RSENPXILI GTN: O 0 ST7P
 L  NLOTO WLEORWEEDRE DO NON  TTRRAAPPWE AEIRTXN II6TN  06:  SP0L
  NOT LOWERED ON TRAP EXIT W6A R0N
 ING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWEREWDA RONNI TNAGP :E XSIWPATLR N I6NN OG0T
 :  LSOPWLE RNOTE DL OWOWANER RNTEIRNDAG P:O  NSX PITTR  A6NP O 0TE
  XLIOTW E6R E0D
  ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED AORNNWITANRRGAN:PI  NESGXPI:LT   N6SOP0TL
   LONWERETD  LOO TWRWEARAPER DNE XIONTNG  :6T R0SA
 PPL  NOET XLOIWTE RE6D  O0N
  TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6W WAA0RR
 NNIINNGG: :S PSLP LN ONTO TL OLWOEWnREeERtDE sDd :NOsN y TsSR_Y_AS_PCg eAEWtLXdALeTR n 9Nt67 23I050N
  G+-0x16:50 1
 S7P1A2LR5 N6NI8N GET:X I TLS OP0WL E 7RN
 OTE LDO WEORNE DT ROANP  TERXWAIAPT RE NX6II TN0 G6: 
  0S
 PL NOT LOWERED ON TRAP EXIT 6 0
 WARNIWNAGR: NSIPL NNOTG L:O WSERPELD  ONN OTRTAP  LEOXIWTE 6R E0D
  WOANR NTIRNAGP:W  ASERPXNLII TNN GOW:6ATR N S0LP
 NLWGE :RN EODPT L OLNNO OWTTER RALEPOD W EEORXWINATE D T 6N AN0P I
 SNEYGS:ICTA  LS6L  10L
 5  NWEXOIRTN  I4NL0GW OA:7W R
 ENRPILEN GDN: O TSOPLONLW ENRTOERDT A OPNL  OWEWRAEARIPETI D EN6 X: SOPINLT   T0NR
 6OA TP0  L
 EOXWIETR E6D  0O
 N SYASCRLWNAI RN8N9G4N9:G  :4S  PESLXPI LTN WONA7TOf dNLf IO LNW6GO
 E:WR ESEPDRL E DON OONTN   LTOTRWRAEPAR EED X EIOXTI T 6 T06R
  A0P 
 EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWEREWDA RONINN G: STPRLA PNO TE LXOIWETRED  ON T R0P 
 EXIT 6 0
 WARNING: SPL NOT LOWERED ON SYSCALL W2A8R8N4WI2N A:- R1SNPIL1 N7NG1OT: 5 L6SO8W EELRX D ON TRIATPN  O0E TX7 I
 LTO W6E R0E
 D ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED OW ATRRNAIPN GE:X ISTP L6  N0O
 T LOWERED OWNA RTNRIANPG :E XSIPTL  6N O0T
  LOWERED ON SYSCALL 80 12 EXIT ffffd9a0 6
 WARNING: SPL WOATR NLIOWE:R ESDP LO NN OTTR ALPO WEEXRIETD  6O N0 
 TRAP EXITW 6 0
 ARNING: SPWLA RNNOINTG :L WOAPWLRE NRNIEONTGD :  LOSOPWNLE  ETDNR OAOTN  T LEOAPW IETRXE ITD  66O  N0 
 0T
 RAP EXIT 6 0
 WARNING: SPL NOWTA RLNOIWNEGR:E DS WPOANLR  NNTOIRN GLP:O  EESRPEILDT   ON6  OT0R
 A P LEOXWIETR E6D  0O
 N TRAP EXIT 6 0W
 AARRNNIINNGG::  SSPPLL  NNOOTT  LLOOWWEERREEDD  OONN  TTRRAAPP  EEXXIITT  66 W 00

 ARNING: SPL NOT LOWEREDWWAARORNNNIIN NGT:GR :S PSPLP  LNE ONXTOI TTL OL6OW WE0RE
 ERDE DO NO NT RTARPA PE XEIXTI T6  60 
 0
 WARNING: SPL NWOATR NLIONWG:E RSEPDL  ONONT  LOWTERRAEPD  WOEAXR INTTIRNA GP6: E0XSI
 PTL  6 N0O
 T LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT W6A RNI0NA
 GR:N ISNPGL:  NSOPTL  LNOOWTE RLEODW EORNE DT ROANP  TERXAIPT  E6X I0T
  6 0
 WARNING: SPL NOT LOWERED WOANANR INTNIRGNAGP:S  PESLX PINLTO  TN6 LTO  W0LE
 RWEEDR EODN  OTNR ATPR AEPX IETX I6T  06
  0
 WARNING: SPL NOT LOWERWEADR NOINN GT:R SAPPL  WEARXONITT NL GO:W E6S PE0LD
  ONNO TT RLAOPWERED  EOXNI TT R6A P0 
 EWXAIRT N6 I0N
 G: SPWLAR NINNOGT:  LSWPOALW ENNORIEN GD: OOWSNEP LRT ERNDAO PTO  NLEOXTWIRTA RPE6D   E0OXN
 I TRATP  6E X0I
 T 6 0
 WARNING: SPWL AORTN ILONWGEWR:AER DNSIPONLG :  TNRAPTPL   LENOXOWITTE  LRO6EW DR 
 EODN  OTNR ATR AEPX IETX IT 66  00

 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNINGW:A RSIPNGL:W ASRNNPOILN TGN :OLT O SLWPOLEW REEOREDD  LOONNW  ETTRRAAPD P EO XNEI XTT IRAP EXI6T  T06 
  60 
 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 7 0
 WAWRANRINNIGN:G :S PWSLAP RNN OINTNOG T:L  OLSWOPEWLRE ERNEODT   OLONON WT ETARRPEA DPE  XOIEN TT XRIATP  60E 
 X0IT
  6 0
 WARNING: SPL NOT LOWERED ONW ATRRNAIPN WE:AX RISTP I6N  0GN
 :O TS PLLO WNEORTEWDAL ROONNWI ENTRR:AE PDS  POXLNI  TNTOR T6A  PL 0OE
 WXERIETD  6O N0 
 TRAP EXIT W6A R0N
 ING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WAWRANRINNIGN:G :WS APSRLNL I NNNOGOTT   LPLOLW ONEWORETER DE LDOO NW OTNRR ATDR  OAENP TRAXPI ETE XX6I I0T
  66  00

 WARNING: SPL NOT WLAORWNEIRNEGW:D ASOPRNLN  INRNAGPT:   ESXPOILTW  EN6OR TE0 
 LDO WOENR ETDR AOPN  EWTXRIRTN I6 N EG0X:
 I STP L6  N0O
 T LOWERED ON TRAP EXIT W6A R0N
 ING: SPL NOT LOWERED ONWTARRANPI WNAXGRI:NT I SN6PG L:0  
 NSOPLT  NLOTO WLEWERREEDD  OONN  TTRRAPA PE IT E6X 0I
 T 6 0
 WARNING: SPL NOT LOWERED ON WTRAARPN IENXGI:T  6S P0L
  NOT WLAORWNEIRNEWGDA :RO NNSI PNLGR: A NPSO PTLE  XLNIOTTW  E6LRO EWD0 EO
 N ETDR AOPN  ETXRIATP  6E X0I
 WTA R6N I0N
 G: SPL NOT LWOAWRENIRNEGD:  OSNP LT RNAWOPAT R ENLIXONWIGET:R  E6SD P 0LO 
 NN OTTR ALPO WEEXRIETD O6N  0T
 RAP EXIT 6 0
 WARNING: SPL NOT LOWWEARRENDI NOGN:  TSRPALP  NEOXTI TL O6W E0R
 ED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING:W ASNPILN GN:O TS PLLO WNEORTE DL OOWWNAE RRTNEIADNP G O:E X SIPRTL A 6PN OE0XT
 IT  6L O0W
 ERED OWNA RTNIRNAGP:  ESXPLI TN OT6  LO0WE
 RED ON TRAP EXIT s6y s0c
 all() at WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NWOATR NLIONWGE:R ESDP LONNO TT RLAOWPE REEXDI OTN  6T RA0PW
 AERXNIITN6G :0 
 SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TWRAAPR NEIXINT G6:  0S
 PL NOT LOWERED ON TRAP EXWIATR N6I N0G
 : SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING:W ASRPNLI NNGO:T  SLPOLW ENROT ELDO WOENR ETDR AOPN  ETXRIATP  6X0I
 T 6 0
 WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
 WARNING: SPL NOT LOWEREDW WAORAN ITNNRGIA:P  :SEP XLS IPNTLO  TN6O LT0 OL
 WOEWREERDE DO NO NS YTSRCAAPL LE X6I1T  162  0E
 XIT ffWfAfRdN9IaN0G :6W 
 ASRPNLI NNGO:T S PLL ONWOETR ELDO WOENRE DT ROANP  TERXAIPT  E6X IT0 6
  0
 WARNING: SPL NOT LOWERED ON TRAP EWWXAAIRRTNNIIN6GN :G0 :
  SPLL  NNOOTT  LLOOWWEERREEDD  OONN  STYRSACPA LELX IT0  60  0E
 XIT 0 6
 WARNING: SPL NOT LOWERED ON TRAP EWXWAIATRR NN6II NN0GG
 :  SSPPLL  NNOOT TL OWLAEORRWNEEIDRN EGOD:   OTSRNP PL REANPIO TTE  XL6IOnTe0Et 
 bR6sEd D:0 s
 OyNs cTaRlAlP+ 0ExXbI5W
 A R6N I-N
 -G s: SsPcLa lNlO TW( nAuLRONbWIeErNR WG3EA9:D0  N)OSINP  N-LT- :R-
  SPALNP  ONEOTXfT I LfTfLO OWd6WE0 aR06E4EDa: D
 O NON c TpTuR1AW1AAP:R  NE IXEnNXdI GTt r :6a  cS e0P0aL
 c kN.O.T. 
 LOWERED ON TRAP EXIT 6 0

 So perhaps there are two issues and the suggested patch addresses one
 of them?
  Thomas

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
	netbsd-bugs@netbsd.org, Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Mon, 7 Oct 2013 07:42:23 +0000

 On Thu, Aug 08, 2013 at 08:30:01PM +0000, Mindaugas Rasiukevicius wrote:
  >  >  Can someone please take a look?

 almost certainly the same as 47480 (and 41068)

  >  >   Thomas
  >  
  >  Most likely this is due to tmpfs_dircookie() truncation here:
  >  
  >  http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
  >  
  >  It is wrong and there are other PRs because of it.  Thomas, you can test
  >  this by replacing the function body with the following:
  >  
  >  	return (off_t)(uintptr_t)de;
  >  
  >  This breaks linux32 compat, but we really just need to decide how we want
  >  to fix it (it would be good to avoid penalising the native code, but better
  >  to penalise than fail).

 Four and a half years ago (in PR 41068) I asked why tmpfs does this
 nonsense instead of just assigning sequence numbers to each node.
 Nobody has ever managed to come up with a coherent justification, just
 FUD.

 -- 
 David A. Holland
 dholland@netbsd.org
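
 The sequence-number alternative mentioned above could look roughly
 like the sketch below: each entry receives a number from a
 per-directory counter when it is attached, and that number, not the
 entry's address, serves as the readdir cookie. This is a userspace
 illustration only. Every name in it (demo_dir, demo_dirent,
 DEMO_SEQ_FIRST and so on) is hypothetical, counter wraparound is not
 handled, and it is not claimed to be how tmpfs was eventually fixed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: values below this are left free for ".", ".." and EOF. */
#define DEMO_SEQ_FIRST	3u

struct demo_dirent {
	uint32_t		 seq;	/* stable cookie for this entry */
	const char		*name;
	struct demo_dirent	*next;
};

struct demo_dir {
	uint32_t		 next_seq;	/* next number to hand out */
	struct demo_dirent	*head;
	struct demo_dirent	*tail;
};

static void
demo_dir_init(struct demo_dir *d)
{
	d->next_seq = DEMO_SEQ_FIRST;
	d->head = d->tail = NULL;
}

/* Append an entry; the list stays sorted by seq because seq only grows. */
static void
demo_dir_attach(struct demo_dir *d, struct demo_dirent *de, const char *name)
{
	assert(d->next_seq != UINT32_MAX);	/* wraparound not handled */
	de->seq = d->next_seq++;
	de->name = name;
	de->next = NULL;
	if (d->tail != NULL)
		d->tail->next = de;
	else
		d->head = de;
	d->tail = de;
}

/* Unlink an entry; its number is never reused in this sketch. */
static void
demo_dir_detach(struct demo_dir *d, struct demo_dirent *de)
{
	struct demo_dirent **pp, *prev = NULL;

	for (pp = &d->head; *pp != NULL; prev = *pp, pp = &(*pp)->next) {
		if (*pp == de) {
			*pp = de->next;
			if (d->tail == de)
				d->tail = prev;
			return;
		}
	}
}

/* Resume a listing: first entry whose seq is >= cookie, or NULL at EOF. */
static struct demo_dirent *
demo_dir_lookup(const struct demo_dir *d, uint32_t cookie)
{
	struct demo_dirent *de;

	for (de = d->head; de != NULL; de = de->next)
		if (de->seq >= cookie)
			return de;
	return NULL;
}
```

 With this layout, resuming from a cookie whose entry has since been
 removed lands on the next surviving entry, because the lookup compares
 numbers instead of relying on a remembered pointer. The linear scan is
 the price; a real implementation would likely want a cached position
 or a better index.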

From: Thomas Klausner <wiz@NetBSD.org>
To: NetBSD bugtracking <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Tue, 8 Oct 2013 00:34:07 +0200

 --IrhDeMKUP4DT/M7F
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: inline

 On Mon, Oct 07, 2013 at 07:42:23AM +0000, David Holland wrote:
 > On Thu, Aug 08, 2013 at 08:30:01PM +0000, Mindaugas Rasiukevicius wrote:
 >  >  >  Can someone please take a look?
 > 
 > almost certainly the same as 47480 (and 41068)
 > 
 >  >  >   Thomas
 >  >  
 >  >  Most likely this is due to tmpfs_dircookie() truncation here:
 >  >  
 >  >  http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
 >  >  
 >  >  It is wrong and there are other PRs because of it.  Thomas, you can test
 >  >  this by replacing the function body with the following:
 >  >  
 >  >  	return (off_t)(uintptr_t)de;
 >  >  
 >  >  This breaks linux32 compat, but we really just need to decide how we want
 >  >  to fix it (it would be good to avoid penalising the native code, but better
 >  >  to penalise than fail).
 > 
 > Four and a half years ago (in PR 41068) I asked why tmpfs does this
 > nonsense instead of just assigning sequence numbers to each node.
 > Nobody has ever managed to come up with a coherent justification, just
 > FUD.

 Thanks for taking a look.

 I've been using rmind's patch (attached) for some weeks now, and it
 has definitely reduced my bulk build panics.

 I still get them sometimes, so there must be a second problem.
  Thomas

 --IrhDeMKUP4DT/M7F
 Content-Type: text/plain; charset=us-ascii
 Content-Disposition: attachment; filename="rmind.diff"

 Index: tmpfs.h
 ===================================================================
 RCS file: /cvsroot/src/sys/fs/tmpfs/tmpfs.h,v
 retrieving revision 1.45
 diff -u -r1.45 tmpfs.h
 --- tmpfs.h	27 Sep 2011 01:10:43 -0000	1.45
 +++ tmpfs.h	7 Oct 2013 22:32:14 -0000
 @@ -87,14 +87,7 @@
  static inline off_t
  tmpfs_dircookie(tmpfs_dirent_t *de)
  {
 -	off_t cookie;
 -
 -	cookie = ((off_t)(uintptr_t)de >> 1) & 0x7FFFFFFF;
 -	KASSERT(cookie != TMPFS_DIRCOOKIE_DOT);
 -	KASSERT(cookie != TMPFS_DIRCOOKIE_DOTDOT);
 -	KASSERT(cookie != TMPFS_DIRCOOKIE_EOF);
 -
 -	return cookie;
 +	return (off_t)(uintptr_t)de;
  }
  #endif



From: David Laight <david@l8s.co.uk>
To: David Holland <dholland-bugs@netbsd.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Wed, 9 Oct 2013 21:27:38 +0100

 On Mon, Oct 07, 2013 at 07:42:23AM +0000, David Holland wrote:
 > On Thu, Aug 08, 2013 at 08:30:01PM +0000, Mindaugas Rasiukevicius wrote:
 >  >  >  Can someone please take a look?
 > 
 > almost certainly the same as 47480 (and 41068)
 > 
 >  >  >   Thomas
 >  >  
 >  >  Most likely this is due to tmpfs_dircookie() truncation here:
 >  >  
 >  >  http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
 >  >  
 >  >  It is wrong and there are other PRs because of it.  Thomas, you can test
 >  >  this by replacing the function body with the following:
 >  >  
 >  >  	return (off_t)(uintptr_t)de;
 >  >  
 >  >  This breaks linux32 compat, but we really just need to decide how we want
 >  >  to fix it (it would be good to avoid penalising the native code, but better
 >  >  to penalise than fail).
 > 
 > Four and a half years ago (in PR 41068) I asked why tmpfs does this
 > nonsense instead of just assigning sequence numbers to each node.
 > Nobody has ever managed to come up with a coherent justification, just
 > FUD.

 If it assigned sequence numbers it would have to check for already used
 values once it had created 2^32 entries (assuming it needs to generate
 32bit offsets).

 Something based on the algorithm used to look up process ids might be
 better.

 	David

 -- 
 David Laight: david@l8s.co.uk

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Fri, 11 Oct 2013 08:03:11 +0000

 On Wed, Oct 09, 2013 at 09:27:38PM +0100, David Laight wrote:
  > > Four and a half years ago (in PR 41068) I asked why tmpfs does this
  > > nonsense instead of just assigning sequence numbers to each node.
  > > Nobody has ever managed to come up with a coherent justification, just
  > > FUD.
  > 
  > If it assigned sequence numbers it would have to check for already used
  > values once it had created 2^32 entries (assuming it needs to generate
  > 32bit offsets).

 ... so once you've cycled 2^32 entries through a directory, which is
 in general "never", you compact it. Big deal...
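
 The compaction step mentioned above could look roughly like the following sketch. It assumes the entries are already held in ascending sequence order and models them as a plain array; SEQ_MIN and compact_seqs() are hypothetical names, not anything from the tmpfs sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEQ_MIN 2u	/* hypothetical: first number after "." and ".." */

/*
 * Renumber a directory's entries, given in ascending sequence order,
 * as SEQ_MIN, SEQ_MIN + 1, ...  Relative order is preserved.
 * Returns the next free sequence number.
 */
static uint32_t
compact_seqs(uint32_t *seqs, size_t n)
{
	uint32_t next = SEQ_MIN;

	for (size_t i = 0; i < n; i++) {
		/* Holds for sorted, distinct input that starts at SEQ_MIN. */
		assert(seqs[i] >= next);
		seqs[i] = next++;
	}
	return next;
}
```

 One thing a real implementation would probably have to consider is a reader that still holds an offset from before the compaction, since the entry behind that number changes.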

 -- 
 David A. Holland
 dholland@netbsd.org

From: Dennis Ferguson <dennis.c.ferguson@gmail.com>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org,
 Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Fri, 11 Oct 2013 17:26:48 -0400

 On 9 Oct, 2013, at 16:20 , David Laight <david@l8s.co.uk> wrote:
 > The following reply was made to PR kern/47739; it has been noted by GNATS.
 > 
 > From: David Laight <david@l8s.co.uk>
 > To: David Holland <dholland-bugs@netbsd.org>
 > Cc: gnats-bugs@NetBSD.org
 > Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
 > Date: Wed, 9 Oct 2013 21:27:38 +0100
 > 
 > On Mon, Oct 07, 2013 at 07:42:23AM +0000, David Holland wrote:
 >> On Thu, Aug 08, 2013 at 08:30:01PM +0000, Mindaugas Rasiukevicius wrote:
 >>>> Can someone please take a look?
 >> 
 >> almost certainly the same as 47480 (and 41068)
 >> 
 >>>>  Thomas
 >>> 
 >>> Most likely this is due to tmpfs_dircookie() truncation here:
 >>> 
 >>> http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
 >>> 
 >>> It is wrong and there are other PRs because of it.  Thomas, you can test
 >>> this by replacing the function body with the following:
 >>> 
 >>> 	return (off_t)(uintptr_t)de;
 >>> 
 >>> This breaks linux32 compat, but we really just need to decide how we want
 >>> to fix it (it would be good to avoid penalising the native code, but better
 >>> to penalise than fail).
 >> 
 >> Four and a half years ago (in PR 41068) I asked why tmpfs does this
 >> nonsense instead of just assigning sequence numbers to each node.
 >> Nobody has ever managed to come up with a coherent justification, just
 >> FUD.
 > 
 > If it assigned sequence numbers it would have to check for already used
 > values once it had created 2^32 entries (assuming it needs to generate
 > 32bit offsets).
 > 
 > Something based on the algorithm used to look up process ids might be
 > better.

 Assuming my mail client doesn't ruin it I've attached a patch which adds
 a sequence number to each tmpfs directory entry.  The sequence numbers
 are kept sorted in the directory entry list TAILQ order because it
 doesn't cost anything much to do that in the current code and having a
 file offset with actual ordering semantics fixes some things that the
 current code can do wrong when directories are being read, like the
 EINVAL error that getdents(2) says only NFS file systems are supposed to
 return but which appears in the code here too.
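
 As a rough illustration of why ordering helps: in the pre-patch code, tmpfs_dir_getdents() returns EINVAL when the exact-match cookie lookup finds nothing, for example because the entry behind the saved offset was removed between two reads. With sorted sequence numbers a reader can instead resume at the first entry whose number is not below the saved offset. The sketch below models a directory as a sorted array; find_exact() and find_next() are hypothetical stand-ins, not the kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Exact-match lookup: fails if the entry with this cookie was removed. */
static ptrdiff_t
find_exact(const uint32_t *seqs, size_t n, uint32_t cookie)
{
	for (size_t i = 0; i < n; i++) {
		if (seqs[i] == cookie)
			return (ptrdiff_t)i;
	}
	return -1;
}

/* Ordered lookup: first entry with sequence >= cookie, or -1 at the end. */
static ptrdiff_t
find_next(const uint32_t *seqs, size_t n, uint32_t cookie)
{
	for (size_t i = 0; i < n; i++) {
		assert(i == 0 || seqs[i - 1] < seqs[i]);	/* sorted input */
		if (seqs[i] >= cookie)
			return (ptrdiff_t)i;
	}
	return -1;
}
```

 With entries numbered 2, 3, 5, 6 (entry 4 having been removed), a reader resuming at offset 4 gets no match from the exact lookup but lands on entry 5 with the ordered one.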

 The sequence algorithm is minimally simple.  It tracks the last entry
 added and attempts to add the next entry after it with a sequence number
 1 higher, incrementing past anything already using the sequence number
 until it finds a free spot and wrapping back to the start when it gets
 to the end of the sequence space.  If the last entry added is removed it
 backs up and reuses the numbers right away.  The first time through the
 sequence space it never has to skip anything, but you are right that
 subsequent passes through it cost more.  On the other hand, the total
 additional cost for a subsequent full pass through the sequence space is
 about the same as a single, unsuccessful name search in the directory
 with the O(n) search algorithm it uses now.  If it does a name search
 like that before adding a new directory entry (I'm not positive it does
 since I don't understand the name caching, but caches often can't help
 with names that don't exist) the amortised necessary cost increase would
 be in the fractional parts per billion.  The absolute amount of work
 this involves is tiny if the number of directory entries is small
 compared to the sequence space.  My guess is that the sequence numbers
 may not matter in this case; if there are performance issues, the O(n)
 name search is the long pole in the tent.
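
 The allocation policy described above can be sketched in userland as follows. This is a simplified model, not the patch itself: it tracks used numbers in a small bitmap rather than in the sorted TAILQ, and SEQ_MIN, SEQ_EOF, struct seqdir, seq_alloc() and seq_free() are all hypothetical names with a deliberately tiny sequence space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEQ_MIN	2u	/* hypothetical: first usable number */
#define SEQ_EOF	10u	/* hypothetical: numbers in [SEQ_MIN, SEQ_EOF) are usable */

struct seqdir {
	bool	 used[SEQ_EOF];	/* used[s] is true when s is taken */
	uint32_t last;		/* last number handed out, 0 if none */
};

/*
 * Hand out the next free number after the last one, skipping numbers
 * already in use and wrapping at the end of the space.
 * Returns 0 when every number is in use.
 */
static uint32_t
seq_alloc(struct seqdir *d)
{
	uint32_t s = (d->last == 0) ? SEQ_MIN : d->last + 1;

	for (uint32_t tries = 0; tries < SEQ_EOF - SEQ_MIN; tries++, s++) {
		if (s >= SEQ_EOF)
			s = SEQ_MIN;	/* wrap to the start of the space */
		if (!d->used[s]) {
			d->used[s] = true;
			d->last = s;
			return s;
		}
	}
	return 0;
}

/* Release a number; if it was the last one added, back up. */
static void
seq_free(struct seqdir *d, uint32_t s)
{
	assert(s >= SEQ_MIN && s < SEQ_EOF && d->used[s]);
	d->used[s] = false;
	if (d->last == s) {
		/* Back up to the nearest number still in use, if any. */
		uint32_t p = s;

		while (p > SEQ_MIN && !d->used[p - 1])
			p--;
		d->last = (p == SEQ_MIN) ? 0 : p - 1;
	}
}
```

 On the first pass through the space the loop in seq_alloc() never iterates more than once; only after a wrap does it have to step over numbers that are still in use.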

 It does read sequence numbers out of the bracketing directory entries
 during name adds when it could instead easily cache them in the
 directory inode.  I didn't do that because I didn't want to increase the
 inode size, and I suspect the directory entries are being read by a name
 search anyway so the data it needs is likely to be in the processor
 cache already.  If that isn't true this could be easily fixed.

 This was done after I got a crash like this PR while trying to see if I
 could speed up builds this way.  I ran it for a while with
 TMPFS_DIRCOOKIE_EOF set much smaller; it is actually quite difficult to
 get the 2^31 sequence space to roll over under normal use (it takes a
 long time for a C program explicitly designed to make it roll over to
 accomplish that).  I didn't like the memory address cookie thing because
 it was too easy to think up things it could do that are wrong, even if
 they probably wouldn't happen.  The code below may not be bug free, but
 it does the same few things over and over without the possibility of
 exceptions that are out of your control so it might be possible to get
 the bugs out of it.

 Dennis Ferguson


 Index: sys/fs/tmpfs/tmpfs.h
 ===================================================================
 RCS file: /cvsroot/src/sys/fs/tmpfs/tmpfs.h,v
 retrieving revision 1.45
 diff -u -r1.45 tmpfs.h
 --- sys/fs/tmpfs/tmpfs.h	27 Sep 2011 01:10:43 -0000	1.45
 +++ sys/fs/tmpfs/tmpfs.h	11 Oct 2013 18:37:27 -0000
 @@ -44,6 +44,11 @@
  #include <sys/vnode.h>
  
  /*
 + * Type of a directory entry sequence number.  See TMPFS_DIRCOOKIE_EOF.
 + */
 +typedef	uint32_t	tmpfs_seq_t;
 +
 +/*
   * Internal representation of a tmpfs directory entry.
   *
   * All fields are protected by vnode lock.
 @@ -54,9 +59,12 @@
  	/* Pointer to the inode this entry refers to. */
  	struct tmpfs_node *		td_node;
  
 +	/* Sequence in directory */
 +	tmpfs_seq_t			td_seq;
 +
  	/* Name and its length. */
 -	char *				td_name;
  	uint16_t			td_namelen;
 +	char *				td_name;
  } tmpfs_dirent_t;
  
  TAILQ_HEAD(tmpfs_dir, tmpfs_dirent);
 @@ -67,36 +75,29 @@
  /* Validate maximum td_namelen length. */
  CTASSERT(TMPFS_MAXNAMLEN < UINT16_MAX);
  
 -#define	TMPFS_DIRCOOKIE_DOT	0
 -#define	TMPFS_DIRCOOKIE_DOTDOT	1
 -#define	TMPFS_DIRCOOKIE_EOF	2
 -
  /*
 - * Each entry in a directory has a cookie that identifies it.  Cookies
 - * supersede offsets within directories, as tmpfs has no offsets as such.
 + * Each entry in a directory has a sequence number that identifies it.
 + * The directory entries are kept sorted into ascending sequence order.
 + * This acts as a stand-in for offset; it provides ordering for sequential
 + * directory reads.
   *
 - * The '.', '..' and the end of directory markers have fixed cookies,
 - * which cannot collide with the cookies generated by other entries.
 + * The '.' and '..' entries have fixed sequence numbers (and no actual
 + * directory entries).  Directory entries have a sequence number greater
 + * than those but less than TMPFS_DIRCOOKIE_EOF.  The latter is set
 + * to 2^31-1 to avoid Linux compat problems, see PR32034, and the type
 + * of tmpfs_seq_t is set to uint32_t to match.  For no Linux compatibility
 + * and huge directories make tmpfs_seq_t an off_t and _EOF a much larger
 + * number.
   *
 - * The cookies for the other entries are generated based on the memory
 - * address of their representative meta-data structure.
 - *
 - * XXX: Truncating directory cookies to 31 bits now - workaround for
 - * problem with Linux compat, see PR/32034.
 + * We call the sequence numbers "cookies" since the old code used the name
 + * and they are used to fill in the cookie fields for NFS directory reads.
   */
 -static inline off_t
 -tmpfs_dircookie(tmpfs_dirent_t *de)
 -{
 -	off_t cookie;
 -
 -	cookie = ((off_t)(uintptr_t)de >> 1) & 0x7FFFFFFF;
 -	KASSERT(cookie != TMPFS_DIRCOOKIE_DOT);
 -	KASSERT(cookie != TMPFS_DIRCOOKIE_DOTDOT);
 -	KASSERT(cookie != TMPFS_DIRCOOKIE_EOF);
 +#define	TMPFS_DIRCOOKIE_DOT	0
 +#define	TMPFS_DIRCOOKIE_DOTDOT	1
 +#define	TMPFS_DIRCOOKIE_MIN	2		/* min td_seq */
 +#define	TMPFS_DIRCOOKIE_EOF	0x7fffffff	/* max td_seq */
  
 -	return cookie;
 -}
 -#endif
 +#endif	/* defined(_KERNEL) */
  
  /*
   * Internal representation of a tmpfs file system node -- inode.
 @@ -169,12 +170,14 @@
  			/* List of directory entries. */
  			struct tmpfs_dir	tn_dir;
  
 +			/* Pointer to insertion point for new entries */
 +			struct tmpfs_dirent *	tn_insert;
 +
  			/*
 -			 * Number and pointer of the last directory entry
 +			 * Pointer to the last directory entry
  			 * returned by the readdir(3) operation.
  			 */
 -			off_t			tn_readdir_lastn;
 -			struct tmpfs_dirent *	tn_readdir_lastp;
 +			struct tmpfs_dirent *	tn_readdir_last;
  		} tn_dir;
  
  		/* Type case: VLNK. */
 @@ -278,7 +281,7 @@
  
  int		tmpfs_dir_getdotdent(tmpfs_node_t *, struct uio *);
  int		tmpfs_dir_getdotdotdent(tmpfs_node_t *, struct uio *);
 -tmpfs_dirent_t *tmpfs_dir_lookupbycookie(tmpfs_node_t *, off_t);
 +tmpfs_dirent_t *tmpfs_dir_getnext(tmpfs_node_t *, off_t);
  int		tmpfs_dir_getdents(tmpfs_node_t *, struct uio *, off_t *);
  
  int		tmpfs_reg_resize(vnode_t *, off_t);
 @@ -324,9 +327,6 @@
  #define TMPFS_VALIDATE_DIR(node) \
      KASSERT((node)->tn_type == VDIR); \
      KASSERT((node)->tn_size % sizeof(tmpfs_dirent_t) == 0); \
 -    KASSERT((node)->tn_spec.tn_dir.tn_readdir_lastp == NULL || \
 -        tmpfs_dircookie((node)->tn_spec.tn_dir.tn_readdir_lastp) == \
 -        (node)->tn_spec.tn_dir.tn_readdir_lastn);
  
  /*
   * Memory management stuff.
 Index: sys/fs/tmpfs/tmpfs_subr.c
 ===================================================================
 RCS file: /cvsroot/src/sys/fs/tmpfs/tmpfs_subr.c,v
 retrieving revision 1.80
 diff -u -r1.80 tmpfs_subr.c
 --- sys/fs/tmpfs/tmpfs_subr.c	4 Oct 2013 15:14:11 -0000	1.80
 +++ sys/fs/tmpfs/tmpfs_subr.c	11 Oct 2013 18:37:27 -0000
 @@ -155,8 +155,8 @@
  		/* Directory. */
  		TAILQ_INIT(&nnode->tn_spec.tn_dir.tn_dir);
  		nnode->tn_spec.tn_dir.tn_parent = NULL;
 -		nnode->tn_spec.tn_dir.tn_readdir_lastn = 0;
 -		nnode->tn_spec.tn_dir.tn_readdir_lastp = NULL;
 +		nnode->tn_spec.tn_dir.tn_insert = NULL;
 +		nnode->tn_spec.tn_dir.tn_readdir_last = NULL;
  
  		/* Extra link count for the virtual '.' entry. */
  		nnode->tn_links++;
 @@ -439,6 +439,53 @@
  }
  
  /*
 + * tmpfs_dir_newinsert: find a spot in the directory where
 + * there is free sequence number space so that nodes can be
 + * inserted.
 + *
 + * => Scan the directory entries forward from the current
 + *    insert point looking for an unused sequence number.
 + * => Return the predecessor node to the unused spot.  This
 + *    might be NULL if TMPFS_DIRCOOKIE_MIN is not in use.
 + * => Stop searching as soon as we find even a single empty
 + *    spot.  If our sequence space is size S and the number
 + *    of directory entries is N then the cost of doing (S - N)
 + *    tmpfs_dir_attach() operations will be scanning N directory
 + *    entries (or about the same as two name lookups given the
 + *    O(N) algorithm used here).
 + */
 +static tmpfs_dirent_t *
 +tmpfs_dir_newinsert(tmpfs_node_t *node)
 +{
 +	tmpfs_dirent_t *de, *de_next, *de_first;
 +	tmpfs_seq_t seq, seq_next;
 +
 +	/* If we're called there's no space here, so seq 1 past first */
 +	de_first = node->tn_spec.tn_dir.tn_insert;
 +	seq_next = de_first->td_seq + 1;
 +	de_next = TAILQ_NEXT(de_first, td_entries);
 +
 +	do {
 +		KASSERT(de_next != de_first);	/* all seqno's used?!? */
 +		de = de_next;
 +		if (de == NULL) {
 +			seq = TMPFS_DIRCOOKIE_MIN + 1;
 +			de_next = TAILQ_FIRST(&node->tn_spec.tn_dir.tn_dir);
 +		} else {
 +			seq = seq_next;
 +			de_next = TAILQ_NEXT(de, td_entries);
 +		}
 +		if (de_next == NULL) {
 +			seq_next = TMPFS_DIRCOOKIE_EOF - 1;
 +		} else {
 +			seq_next = de_next->td_seq;
 +		}
 +	} while (seq >= seq_next);
 +
 +	return de;
 +}
 +
 +/*
   * tmpfs_dir_attach: associate directory entry with a specified inode,
   * and attach the entry into the directory, specified by vnode.
   *
 @@ -451,6 +498,8 @@
  tmpfs_dir_attach(vnode_t *dvp, tmpfs_dirent_t *de, tmpfs_node_t *node)
  {
  	tmpfs_node_t *dnode = VP_TO_TMPFS_DIR(dvp);
 +	tmpfs_dirent_t *de_prev, *de_next;
 +	tmpfs_seq_t new_seq;
  	int events = NOTE_WRITE;
  
  	KASSERT(VOP_ISLOCKED(dvp));
 @@ -465,8 +514,35 @@
  		node->tn_dirent_hint = de;
  	}
  
 -	/* Insert the entry to the directory (parent of inode). */
 -	TAILQ_INSERT_TAIL(&dnode->tn_spec.tn_dir.tn_dir, de, td_entries);
 +	/* Find the insertion point.  Make sure we have a sequence space. */
 +	de_prev = dnode->tn_spec.tn_dir.tn_insert;
 +	if (de_prev == NULL) {
 +		new_seq = TMPFS_DIRCOOKIE_MIN;
 +	} else {
 +		new_seq = de_prev->td_seq + 1;
 +		de_next = TAILQ_NEXT(de_prev, td_entries);
 +		if (new_seq >= TMPFS_DIRCOOKIE_EOF ||
 +		    (de_next != NULL && new_seq >= de_next->td_seq)) {
 +			de_prev = tmpfs_dir_newinsert(dnode);
 +			if (de_prev == NULL) {
 +				new_seq = TMPFS_DIRCOOKIE_MIN;
 +			} else {
 +				new_seq = de_prev->td_seq + 1;
 +			}
 +		}
 +	}
 +
 +	/* Insert the entry into directory after de_prev (or at head) */
 +	de->td_seq = new_seq;
 +	if (de_prev) {
 +		TAILQ_INSERT_AFTER(&dnode->tn_spec.tn_dir.tn_dir,
 +				   de_prev, de, td_entries);
 +	} else {
 +		TAILQ_INSERT_HEAD(&dnode->tn_spec.tn_dir.tn_dir,
 +				  de, td_entries);
 +	}
 +	dnode->tn_spec.tn_dir.tn_insert = de;
 +
  	dnode->tn_size += sizeof(tmpfs_dirent_t);
  	dnode->tn_status |= TMPFS_NODE_STATUSALL;
  	uvm_vnp_setsize(dvp, dnode->tn_size);
 @@ -528,9 +604,12 @@
  	}
  
  	/* Remove the entry from the directory. */
 -	if (dnode->tn_spec.tn_dir.tn_readdir_lastp == de) {
 -		dnode->tn_spec.tn_dir.tn_readdir_lastn = 0;
 -		dnode->tn_spec.tn_dir.tn_readdir_lastp = NULL;
 +	if (dnode->tn_spec.tn_dir.tn_readdir_last == de) {
 +		dnode->tn_spec.tn_dir.tn_readdir_last = NULL;
 +	}
 +	if (dnode->tn_spec.tn_dir.tn_insert == de) {
 +		dnode->tn_spec.tn_dir.tn_insert =
 +		    TAILQ_PREV(de, tmpfs_dir, td_entries);
  	}
  	TAILQ_REMOVE(&dnode->tn_spec.tn_dir.tn_dir, de, td_entries);
  
 @@ -620,7 +699,7 @@
  	else {
  		error = uiomove(dentp, dentp->d_reclen, uio);
  		if (error == 0)
 -			uio->uio_offset = TMPFS_DIRCOOKIE_DOTDOT;
 +			uio->uio_offset = TMPFS_DIRCOOKIE_DOT + 1;
  	}
  	node->tn_status |= TMPFS_NODE_ACCESSED;
  	kmem_free(dentp, sizeof(struct dirent));
 @@ -654,13 +733,7 @@
  	else {
  		error = uiomove(dentp, dentp->d_reclen, uio);
  		if (error == 0) {
 -			tmpfs_dirent_t *de;
 -
 -			de = TAILQ_FIRST(&node->tn_spec.tn_dir.tn_dir);
 -			if (de == NULL)
 -				uio->uio_offset = TMPFS_DIRCOOKIE_EOF;
 -			else
 -				uio->uio_offset = tmpfs_dircookie(de);
 +			uio->uio_offset = TMPFS_DIRCOOKIE_DOTDOT + 1;
  		}
  	}
  	node->tn_status |= TMPFS_NODE_ACCESSED;
 @@ -669,23 +742,31 @@
  }
  
  /*
 - * tmpfs_dir_lookupbycookie: lookup a directory entry by associated cookie.
 + * tmpfs_dir_getnext: find an entry with a sequence >= cookie
   */
  tmpfs_dirent_t *
 -tmpfs_dir_lookupbycookie(tmpfs_node_t *node, off_t cookie)
 +tmpfs_dir_getnext(tmpfs_node_t *node, off_t cookie)
  {
  	tmpfs_dirent_t *de;
 +	tmpfs_seq_t next_seq;
  
  	KASSERT(VOP_ISLOCKED(node->tn_vnode));
  
 -	if (cookie == node->tn_spec.tn_dir.tn_readdir_lastn &&
 -	    node->tn_spec.tn_dir.tn_readdir_lastp != NULL) {
 -		return node->tn_spec.tn_dir.tn_readdir_lastp;
 +	if (cookie >= TMPFS_DIRCOOKIE_EOF || cookie < TMPFS_DIRCOOKIE_DOT) {
 +		return NULL;
  	}
 -	TAILQ_FOREACH(de, &node->tn_spec.tn_dir.tn_dir, td_entries) {
 -		if (tmpfs_dircookie(de) == cookie) {
 +	next_seq = (tmpfs_seq_t) cookie;
 +
 +	de = node->tn_spec.tn_dir.tn_readdir_last;
 +	if (de == NULL || de->td_seq > next_seq) {
 +		de = TAILQ_FIRST(&node->tn_spec.tn_dir.tn_dir);
 +	}
 +
 +	while (de) {
 +		if (de->td_seq >= next_seq) {
  			break;
  		}
 +		de = TAILQ_NEXT(de, td_entries);
  	}
  	return de;
  }
 @@ -699,7 +780,7 @@
  int
  tmpfs_dir_getdents(tmpfs_node_t *node, struct uio *uio, off_t *cntp)
  {
 -	tmpfs_dirent_t *de;
 +	tmpfs_dirent_t *de, *last_de;
  	struct dirent *dentp;
  	off_t startcookie;
  	int error;
 @@ -715,17 +796,14 @@
  	startcookie = uio->uio_offset;
  	KASSERT(startcookie != TMPFS_DIRCOOKIE_DOT);
  	KASSERT(startcookie != TMPFS_DIRCOOKIE_DOTDOT);
 -	if (startcookie == TMPFS_DIRCOOKIE_EOF) {
 -		return 0;
 -	} else {
 -		de = tmpfs_dir_lookupbycookie(node, startcookie);
 -	}
 +	de = tmpfs_dir_getnext(node, startcookie);
  	if (de == NULL) {
 -		return EINVAL;
 +		return 0;
  	}
 +	last_de = NULL;		/* track last entry written */
  
  	/*
 -	 * Read as much entries as possible; i.e., until we reach the end
 +	 * Read as many entries as possible; i.e., until we reach the end
  	 * of the directory or we exhaust uio space.
  	 */
  	dentp = kmem_alloc(sizeof(struct dirent), KM_SLEEP);
 @@ -783,20 +861,26 @@
  		 * advance pointers.
  		 */
  		error = uiomove(dentp, dentp->d_reclen, uio);
 +		if (error != 0) {
 +			break;
 +		}
  
 +		/*
 +		 * At this point we have successfully written de.  Keep
 +		 * track of the last successfully written entry in last_de.
 +		 */
  		(*cntp)++;
 -		de = TAILQ_NEXT(de, td_entries);
 -	} while (error == 0 && uio->uio_resid > 0 && de != NULL);
 +		last_de = de;
 +	} while (uio->uio_resid > 0 &&
 +		 (de = TAILQ_NEXT(de, td_entries)) != NULL);
  
  	/* Update the offset and cache. */
  	if (de == NULL) {
  		uio->uio_offset = TMPFS_DIRCOOKIE_EOF;
 -		node->tn_spec.tn_dir.tn_readdir_lastn = 0;
 -		node->tn_spec.tn_dir.tn_readdir_lastp = NULL;
 -	} else {
 -		node->tn_spec.tn_dir.tn_readdir_lastn = uio->uio_offset =
 -		    tmpfs_dircookie(de);
 -		node->tn_spec.tn_dir.tn_readdir_lastp = de;
 +		node->tn_spec.tn_dir.tn_readdir_last = NULL;
 +	} else if (last_de) {
 +		uio->uio_offset = last_de->td_seq + 1;
 +		node->tn_spec.tn_dir.tn_readdir_last = last_de;
  	}
  	node->tn_status |= TMPFS_NODE_ACCESSED;
  	kmem_free(dentp, sizeof(struct dirent));
 Index: sys/fs/tmpfs/tmpfs_vnops.c
 ===================================================================
 RCS file: /cvsroot/src/sys/fs/tmpfs/tmpfs_vnops.c,v
 retrieving revision 1.103
 diff -u -r1.103 tmpfs_vnops.c
 --- sys/fs/tmpfs/tmpfs_vnops.c	4 Oct 2013 15:14:11 -0000	1.103
 +++ sys/fs/tmpfs/tmpfs_vnops.c	11 Oct 2013 18:37:27 -0000
 @@ -995,21 +995,21 @@
  	*ncookies = cnt;
  
  	for (i = 0; i < cnt; i++) {
 -		KASSERT(off != TMPFS_DIRCOOKIE_EOF);
 +		KASSERT(off < uio->uio_offset);
  		if (off != TMPFS_DIRCOOKIE_DOT) {
  			if (off == TMPFS_DIRCOOKIE_DOTDOT) {
  				de = TAILQ_FIRST(&node->tn_spec.tn_dir.tn_dir);
  			} else if (de != NULL) {
  				de = TAILQ_NEXT(de, td_entries);
  			} else {
 -				de = tmpfs_dir_lookupbycookie(node, off);
 +				de = tmpfs_dir_getnext(node, off);
  				KASSERT(de != NULL);
  				de = TAILQ_NEXT(de, td_entries);
  			}
  			if (de == NULL) {
 -				off = TMPFS_DIRCOOKIE_EOF;
 +				off = uio->uio_offset;
  			} else {
 -				off = tmpfs_dircookie(de);
 +				off = de->td_seq;
  			}
  		} else {
  			off = TMPFS_DIRCOOKIE_DOTDOT;

From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: Dennis Ferguson <dennis.c.ferguson@gmail.com>, Thomas Klausner
 <wiz@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Sun, 27 Oct 2013 16:03:40 +0000

 Dennis Ferguson <dennis.c.ferguson@gmail.com> wrote:
 > >>>> <...>
 > >>> 
 > >>> Most likely this is due to tmpfs_dircookie() truncation here:
 > >>> 
 > >>> http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
 > >>> 
 > >>> It is wrong and there are other PRs because of it.  Thomas, you can
 > >>> test this by replacing the function body with the following:
 > >>> 
 > >>> 	return (off_t)(uintptr_t)de;
 > >>> 
 > >>> This breaks linux32 compat, but we really just need to decide how we
 > >>> want to fix it (it would be good to avoid penalising the native code,
 > >>> but better to penalise than fail).
 > >> 
 > >> <...>
 > 
 > Assuming my mail client doesn't ruin it I've attached a patch which adds
 > a sequence number to each tmpfs directory entry.  The sequence numbers
 > are kept sorted in the directory entry list TAILQ order because it doesn't
 > cost anything much to do that in the current code and having a file
 > offset with actual ordering semantics fixes some things that the current
 > code can do wrong when directories are being read, like the EINVAL error
 > that getdents(2) says only NFS file systems are supposed to return but
 > which appears in the code here too.

 I think there is a better way.  Here is the patch:

 http://www.netbsd.org/~rmind/tmpfs_readdir_fixes.diff

  tmpfs.h        |   86 +++++--------
  tmpfs_rename.c |   14 +-
  tmpfs_subr.c   |  373 ++++++++++++++++++++++++++++++++-------------------------
  tmpfs_vfsops.c |   39 +++--
  tmpfs_vnops.c  |   75 ++++-------
  5 files changed, 309 insertions(+), 278 deletions(-)

 It also fixes tmpfs_unmount() and the net diff seems better. :)

 Thomas, can you try this patch on your build-hammer-machine?

 Thanks.

 -- 
 Mindaugas

From: Thomas Klausner <wiz@NetBSD.org>
To: Mindaugas Rasiukevicius <rmind@netbsd.org>
Cc: Dennis Ferguson <dennis.c.ferguson@gmail.com>,
	NetBSD bugtracking <gnats-bugs@NetBSD.org>
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
 "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Mon, 28 Oct 2013 09:58:31 +0100

 Hi Mindaugas!

 Thanks for the patch!

 On Sun, Oct 27, 2013 at 04:03:40PM +0000, Mindaugas Rasiukevicius wrote:
 > Thomas, can you try this patch on your build-hammer-machine?

 I've built a new kernel with it and started a bulk build. I'll let you
 know if it reboots :)
  Thomas

From: "Mindaugas Rasiukevicius" <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/47739 CVS commit: src/sys/fs/tmpfs
Date: Fri, 8 Nov 2013 15:44:23 +0000

 Module Name:	src
 Committed By:	rmind
 Date:		Fri Nov  8 15:44:23 UTC 2013

 Modified Files:
 	src/sys/fs/tmpfs: tmpfs.h tmpfs_rename.c tmpfs_subr.c tmpfs_vfsops.c
 	    tmpfs_vnops.c

 Log Message:
 tmpfs: replace the broken tmpfs_dircookie() logic which uses the node
 address truncated to 31 bits (required for 32-bit readdir compatibility,
 e.g. linux32).  Instead, assign 2^31 range using the following logic:
 - The first half of the 2^31 is assigned incrementally (the fast path).
 - When exceeded, use the second half of 2^31, but manage with vmem(9).

 It will require 2 billion files per-directory to trigger vmem(9) usage.
 Also, while here, add some fixes for tmpfs_unmount().

 Should fix PR/47739, PR/47480, PR/46088 and PR/41068.
 Thanks to wiz@ for stress testing.
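
 For readers who want a feel for the two-tier scheme the log message describes, here is a heavily simplified userland sketch. It is not the committed code: the range is shrunk to a handful of numbers, a linear-scan bitmap stands in for vmem(9), and SEQ_START, SEQ_HALF, SEQ_END, struct seqalloc, seq_get() and seq_put() are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SEQ_START	2u	/* hypothetical: 0 and 1 stand for "." and ".." */
#define SEQ_HALF	6u	/* hypothetical split point of the range */
#define SEQ_END		10u	/* hypothetical end of the range */

struct seqalloc {
	uint32_t next;			/* fast-path counter; 0 means unset */
	bool	 pool[SEQ_END];		/* slow path: stand-in for vmem(9) */
};

/* Returns a sequence number, or 0 when the range is exhausted. */
static uint32_t
seq_get(struct seqalloc *a)
{
	if (a->next == 0)
		a->next = SEQ_START;

	/* Fast path: the first half is handed out incrementally. */
	if (a->next < SEQ_HALF)
		return a->next++;

	/* Slow path: the second half is managed by an allocator. */
	for (uint32_t s = SEQ_HALF; s < SEQ_END; s++) {
		if (!a->pool[s]) {
			a->pool[s] = true;
			return s;
		}
	}
	return 0;
}

/* Release a number; only second-half numbers go back to the pool. */
static void
seq_put(struct seqalloc *a, uint32_t s)
{
	assert(s >= SEQ_START && s < SEQ_END);
	if (s >= SEQ_HALF) {
		assert(a->pool[s]);
		a->pool[s] = false;
	}
	/* First-half numbers are simply not reused in this sketch. */
}
```

 The point of the split is that the common case costs a single increment, while reuse of freed numbers only has to be tracked once the incremental half has been used up.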


 To generate a diff of this commit:
 cvs rdiff -u -r1.45 -r1.46 src/sys/fs/tmpfs/tmpfs.h
 cvs rdiff -u -r1.4 -r1.5 src/sys/fs/tmpfs/tmpfs_rename.c
 cvs rdiff -u -r1.82 -r1.83 src/sys/fs/tmpfs/tmpfs_subr.c
 cvs rdiff -u -r1.52 -r1.53 src/sys/fs/tmpfs/tmpfs_vfsops.c
 cvs rdiff -u -r1.105 -r1.106 src/sys/fs/tmpfs/tmpfs_vnops.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Responsible-Changed-From-To: kern-bug-people->rmind
Responsible-Changed-By: rmind@NetBSD.org
Responsible-Changed-When: Fri, 08 Nov 2013 15:55:26 +0000
Responsible-Changed-Why:


State-Changed-From-To: open->closed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Fri, 08 Nov 2013 15:55:26 +0000
State-Changed-Why:
Should be fixed in -current.  Please let us know if you ever see a
similar problem again.


>Unformatted:
