NetBSD Problem Report #47739
From wiz@yt.nih.at Sat Apr 13 10:43:44 2013
Return-Path: <wiz@yt.nih.at>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
by www.NetBSD.org (Postfix) with ESMTP id 78D4863F415
for <gnats-bugs@gnats.NetBSD.org>; Sat, 13 Apr 2013 10:43:44 +0000 (UTC)
Message-Id: <20130413104335.B8EBE2AC736@yt.nih.at>
Date: Sat, 13 Apr 2013 12:43:35 +0200 (CEST)
From: Thomas Klausner <wiz@NetBSD.org>
Reply-To: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Subject: tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
X-Send-Pr-Version: 3.95
>Number: 47739
>Category: kern
>Synopsis: tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: rmind
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Apr 13 10:45:00 +0000 2013
>Closed-Date: Fri Nov 08 15:55:26 +0000 2013
>Last-Modified: Fri Nov 08 15:55:26 +0000 2013
>Originator: Thomas Klausner
>Release: NetBSD 6.99.19
>Organization:
Curiosity is the very basis of education and if you tell me that
curiosity killed the cat, I say only that the cat died nobly.
- Arnold Edinborough
>Environment:
System: NetBSD yt.nih.at 6.99.19 NetBSD 6.99.19 (KVOTHE) #0: Sun Apr 7 19:52:05 CEST 2013 wiz@yt.nih.at:/archive/foreign/src/sys/arch/amd64/compile/obj/KVOTHE amd64
Architecture: x86_64
Machine: amd64
>Description:
My machine just panicked from X, so no backtrace.
savecore reported:
Checking for core dump...
savecore: kvm_read: invalid translation (invalid level 2 PDE)
savecore: reboot after panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL || tmpfs_dircookie((node)->tn_spec.tn_dir.tn_readdir_lastp) == (node)->tn_spec.tn_dir.tn_readdir_lastn" failed: file "/archive/foreign/src/sys/fs/tmpfs/tmpfs_subr.c", line 610
savecore: system went down at Sat Apr 13 12:15:32 2013
savecore: writing compressed core to /var/crash/netbsd.12.core.gz
>How-To-Repeat:
Run a bulk build inside tmpfs, get unlucky.
Previously happened on March 3, so it's not exactly common.
>Fix:
Not known.
>Release-Note:
>Audit-Trail:
From: Patrick Welche <prlw1@cam.ac.uk>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/47739
Date: Wed, 8 May 2013 11:30:25 +0100
Just got a working core file:
(gdb) bt
#0 0xffffffff804e40de in cpu_reboot (howto=260, bootstr=<optimized out>)
at ../../../../arch/amd64/amd64/machdep.c:705
#1 0xffffffff80696bad in vpanic (
fmt=0xffffffff8096f688 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ",
ap=0xfffffe811b1769d0) at ../../../../kern/subr_prf.c:284
#2 0xffffffff80834368 in kern_assert (fmt=<unavailable>)
at ../../../../../../lib/libkern/kern_assert.c:50
#3 0xffffffff806de80c in VP_TO_TMPFS_DIR (vp=<optimized out>)
at ../../../../fs/tmpfs/tmpfs.h:357
#4 tmpfs_readdir (v=<optimized out>) at ../../../../fs/tmpfs/tmpfs_vnops.c:938
#5 0xffffffff807daa33 in VOP_READDIR (vp=0xfffffe81591b9be8,
uio=<optimized out>, cred=<optimized out>, eofflag=<optimized out>,
cookies=<optimized out>, ncookies=<optimized out>)
at ../../../../kern/vnode_if.c:952
#6 0xffffffff807c507b in vn_readdir (fp=0xfffffe811bc1c940,
bf=0x7f7ff770b000 <Address 0x7f7ff770b000 out of bounds>, segflg=0,
count=<optimized out>, done=0xfffffe811b176bec, l=0xfffffe81105f3040,
cookies=0x0, ncookies=0x0) at ../../../../kern/vfs_vnops.c:470
#7 0xffffffff807c07c1 in sys___getdents30 (l=0xfffffe81105f3040,
uap=0xfffffe811b176c80, retval=0xfffffe811b176c30)
at ../../../../kern/vfs_syscalls.c:4611
#8 0xffffffff806affe4 in sy_call (rval=0xfffffe811b176c30,
uap=0xfffffe811b176c80, l=0xfffffe81105f3040, sy=0xffffffff80c99460)
dmesg is full of
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON SYSCALL 16445 -151703264 EXIT 0 7
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
...
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
sys___getdents30() at netbsd:sys___getdents30+0x76
WARNING: SPL NOT LOWERED ON SYSCALL 24678 -1 EXIT f7b2b400 6
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
...
(gdb) frame 3
#3 0xffffffff806de80c in VP_TO_TMPFS_DIR (vp=<optimized out>)
at ../../../../fs/tmpfs/tmpfs.h:357
357 TMPFS_VALIDATE_DIR(node);
(gdb) list
352 VP_TO_TMPFS_DIR(vnode_t *vp)
353 {
354 tmpfs_node_t *node = vp->v_data;
355
356 KASSERT(node != NULL);
357 TMPFS_VALIDATE_DIR(node);
358 return node;
359 }
360
361 #endif /* defined(_KERNEL) */
(gdb) frame 5
#5 0xffffffff807daa33 in VOP_READDIR (vp=0xfffffe81591b9be8,
uio=<optimized out>, cred=<optimized out>, eofflag=<optimized out>,
cookies=<optimized out>, ncookies=<optimized out>)
at ../../../../kern/vnode_if.c:952
952 error = (VCALL(vp, VOFFSET(vop_readdir), &a));
(gdb) list
947 a.a_eofflag = eofflag;
948 a.a_cookies = cookies;
949 a.a_ncookies = ncookies;
950 mpsafe = (vp->v_vflag & VV_MPSAFE);
951 if (!mpsafe) { KERNEL_LOCK(1, curlwp); }
952 error = (VCALL(vp, VOFFSET(vop_readdir), &a));
953 if (!mpsafe) { KERNEL_UNLOCK_ONE(curlwp); }
954 return error;
955 }
956
(gdb) print *vp
$1 = {v_uobj = {vmobjlock = 0xfffffe8197f5fac0, pgops = 0xffffffff80957c80,
memq = {tqh_first = 0x0, tqh_last = 0xfffffe81591b9bf8}, uo_npages = 0,
uo_refs = 1, rb_tree = {rbt_root = 0x0, rbt_ops = 0xffffffff80957a60,
rbt_minmax = {0x0, 0x0}}, uo_ubc = {lh_first = 0x0}}, v_cv = {
cv_opaque = {0x0, 0xfffffe81591b9c38, 0xffffffff809d8ec4}},
v_size = 16720, v_writesize = 16720, v_iflag = 0, v_vflag = 16, v_uflag = 0,
v_numoutput = 0, v_writecount = 0, v_holdcnt = 0, v_synclist_slot = 0,
v_mount = 0xfffffe8110c43008, v_op = 0xfffffe821db1a748, v_freelist = {
tqe_next = 0xfffffe8198503148, tqe_prev = 0xfffffe81985037b8},
v_freelisthd = 0x0, v_mntvnodes = {tqe_next = 0xfffffe81591b9620,
tqe_prev = 0xfffffe81591b9dd0}, v_cleanblkhd = {lh_first = 0x0},
v_dirtyblkhd = {lh_first = 0x0}, v_synclist = {tqe_next = 0x0,
tqe_prev = 0x0}, v_dnclist = {lh_first = 0xfffffe81386c0c00}, v_nclist = {
lh_first = 0xfffffe813d1b3e40}, v_un = {vu_mountedhere = 0x0,
vu_socket = 0x0, vu_specnode = 0x0, vu_fifoinfo = 0x0, vu_ractx = 0x0},
v_type = VDIR, v_tag = VT_TMPFS, v_lock = {rw_owner = 64},
v_data = 0xfffffe81face0660, v_klist = {slh_first = 0x0}}
so v_data is not 0 here...
frame 4, vp is already optimized out...
From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
"(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Mon, 22 Jul 2013 19:42:17 +0200
These are highly reproducible.
Whenever I do a bulk build from scratch, I usually reboot at least
once due to it.
Can someone please take a look?
Thomas
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Wed, 24 Jul 2013 14:22:54 +0000
I don't see any obvious mismanagement of the tn_readdir fields, so
this is probably some other kind of corruption.
Wild guess: tmpfs_rmdir is missing cache_purge(vp), and something is
trying to list a directory that just got rmdir'd.
I'm puzzled by the `WARNING: SPL NOT LOWERED' messages in prlw1's
dmesg. The syscalls it reports (chroot and recv) don't seem to me to
be related to tmpfs. wiz, do you see those messages too, or do you
have DIAGNOSTIC disabled, or are they a red herring for the tmpfs
issue?
From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: Taylor R Campbell <riastradh@NetBSD.org>
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
"(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Wed, 24 Jul 2013 19:02:45 +0200
Thanks for looking at this!
On Wed, Jul 24, 2013 at 02:25:00PM +0000, Taylor R Campbell wrote:
> I don't see any obvious mismanagement of the tn_readdir fields, so
> this is probably some other kind of corruption.
Ok.
> Wild guess: tmpfs_rmdir is missing cache_purge(vp), and something is
> trying to list a directory that just got rmdir'd.
Could be. bulkbuilds do lots of directory creations and removals. Not
sure where the listing process would come from, but since it's
parallel, perhaps it's a bug in particular packages that are not
really parallelbuild-safe?
> I'm puzzled by the `WARNING: SPL NOT LOWERED' messages in prlw1's
> dmesg. The syscalls it reports (chroot and recv) don't seem to me to
> be related to tmpfs. wiz, do you see those messages too, or do you
> have DIAGNOSTIC disabled, or are they a red herring for the tmpfs
> issue?
My dmesg from today still starts with the end of the last panic:
OT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYvSnC_ArLeLa d1d0i4r (4) EaXtI T 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 120 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
netbsd:vn_readdir+0x21b
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 128 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
sys___getdents30() at WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 126 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
netbsd:sys___getdents30+0x60
WARNING: SPL NOT LOWERED ON SYSCALL 132 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 105 4 EXIT 7fc4 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIsTy s7cfael0l (6)
at WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 117 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
WARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 104 4 EXIT 7fe0 6
netbsd:syscall+0xb5
--- syscall (number 390) ---
7f7ff6d09d3a:
cpu10: End traceback...W
ARNING: SPL NOT LOWERED ON SYSCALL 1 5 EXIT 40 7
WARNING: SPL NOT LOWERED ON SYSCALL 191 4 EXIT 7fe0 6
dumping to dev 168,8 (offset=8, size=8380291):
I do have DIAGNOSTIC enabled in this kernel.
Thomas
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: Thomas Klausner <wiz@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
"(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Thu, 8 Aug 2013 21:27:34 +0100
Thomas Klausner <wiz@NetBSD.org> wrote:
> The following reply was made to PR kern/47739; it has been noted by GNATS.
>
> From: Thomas Klausner <wiz@NetBSD.org>
> To: gnats-bugs@NetBSD.org
> Cc:
> Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
> "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
> Date: Mon, 22 Jul 2013 19:42:17 +0200
>
> These are highly reproducible.
>
> Whenever I do a bulk build from scratch, I usually reboot at least
> once due to it.
>
> Can someone please take a look?
> Thomas
Most likely this is due to tmpfs_dircookie() truncation here:
http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
It is wrong and there are other PRs because of it. Thomas, you can test
this by replacing the function body with the following:
return (off_t)(uintptr_t)de;
This breaks linux32 compat, but we really just need to decide how we want
to fix it (it would be good to avoid penalising the native code, but better
to penalise than fail).
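To illustrate the truncation described above, here is a minimal user-space sketch, not the kernel code itself. It reuses the bit selection of the old tmpfs_dircookie() body (drop bit 0, keep the next 31 bits) and compares it with the full-pointer variant. The names cookie_t, truncated_cookie and full_cookie are hypothetical, and addresses are passed as plain 64-bit integers so the sketch also builds on 32-bit hosts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's off_t (64 bits on amd64). */
typedef int64_t cookie_t;

/*
 * Same bit selection as the old tmpfs_dircookie() body:
 * shift out bit 0, then keep only the low 31 bits.
 * Every address bit from bit 32 upwards is discarded.
 */
static cookie_t
truncated_cookie(uint64_t de_addr)
{
	return (cookie_t)((de_addr >> 1) & 0x7FFFFFFF);
}

/* The test replacement suggested in this PR: keep the whole pointer value. */
static cookie_t
full_cookie(uint64_t de_addr)
{
	return (cookie_t)de_addr;
}
```

Two made-up addresses that differ only above bit 31, for example 0xfffffe81face0660 and 0xfffffe82face0660, both map to 0x7d670330 under truncated_cookie(), while full_cookie() keeps them distinct. Whether such a collision is what actually trips the assertion in this PR is not established by the sketch; it only shows that distinct entries can share a truncated cookie.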
--
Mindaugas
From: Thomas Klausner <wiz@NetBSD.org>
To: Mindaugas Rasiukevicius <rmind@netbsd.org>
Cc: NetBSD bugtracking <gnats-bugs@NetBSD.org>
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
"(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Tue, 20 Aug 2013 07:32:56 +0200
On Thu, Aug 08, 2013 at 09:27:34PM +0100, Mindaugas Rasiukevicius wrote:
> Thomas Klausner <wiz@NetBSD.org> wrote:
> > The following reply was made to PR kern/47739; it has been noted by GNATS.
> >
> > From: Thomas Klausner <wiz@NetBSD.org>
> > To: gnats-bugs@NetBSD.org
> > Cc:
> > Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
> > "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
> > Date: Mon, 22 Jul 2013 19:42:17 +0200
> >
> > These are highly reproducible.
> >
> > Whenever I do a bulk build from scratch, I usually reboot at least
> > once due to it.
> >
> > Can someone please take a look?
> > Thomas
>
> Most likely this is due to tmpfs_dircookie() truncation here:
>
> http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
>
> It is wrong and there are other PRs because of it. Thomas, you can test
> this by replacing the function body with the following:
>
> return (off_t)(uintptr_t)de;
>
> This breaks linux32 compat, but we really just need to decide how we want
> to fix it (it would be good to avoid penalising the native code, but better
> to penalise than fail).
It survived a bit longer, but rebooted last night.
Checking for core dump...
savecore: reboot after panic: WTAA RLRNNOINWGEGR :ES DPS LP OLNN O NTTOW RALTAROP NW LIEEONRGIEETRD E DO OS0NNP
LTT RNAPOP T E EXLXIOIWT E 6R6A E 0D0N
IONNG :T RSAPPL ENXOITT L6O WW0E
RENDI NOGN: TSRPALP NEOXTI WTLA OR6WNEIR0NEGD: OSNP LT RNAOPT ELXOIWTE R6E D0
ON TRAPA RENXIINTG :6 S0P
L NOT LOWERED ON TRAP EXIT 6A R0N
IWNAGR
savecore: system went down at Tue Aug 20 00:39:50 2013
savecore: writing compressed core to /var/crash/netbsd.42.core.gz
(gdb) target kvm netbsd.core
#0 0xffffffff805e3299 in cpu_reboot ()
(gdb) bt
#0 0xffffffff805e3299 in cpu_reboot ()
#1 0xffffffff807e92be in vpanic ()
#2 0xffffffff8098cc7a in kern_assert ()
#3 0xffffffff80834a6b in tmpfs_readdir ()
#4 0xffffffff80926743 in VOP_READDIR ()
#5 0xffffffff809092fb in vn_readdir ()
#6 0xffffffff809048e0 in sys___getdents30 ()
#7 0xffffffff80807225 in syscall ()
#8 0xffffffff801006a1 in Xsyscall ()
#9 0x000000000000000a in ?? ()
#10 0x00007f7ff7b3f000 in ?? ()
#11 0x0000000000001000 in ?? ()
#12 0x00007f7ff710c84a in ?? ()
#13 0x0000000000000ff0 in ?? ()
#14 0x00007f7ff7bb82d5 in ?? ()
#15 0x0000000000000000 in ?? ()
Thomas
From: Thomas Klausner <wiz@NetBSD.org>
To: NetBSD bugtracking <gnats-bugs@NetBSD.org>
Cc:
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
"(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Mon, 2 Sep 2013 09:00:18 +0200
Completely unscientifically: I feel that with the patch applied I see
fewer panics.
I did have one last night again, dmesg after boot starts with:
EXIT 6 0
WARNIWNAGR:N INSGP LS PNWLOA TRN NOLITNO GLWO WESERPRELEDD OONNNO TTT RRLAAOPPW E EREXEIT D6 O0X
I nTTe RtA6P d :Ev
XW_IArTeRa6N dI0iN
rG:+ 0SxP2L1 bN
OT LOWERED ONW ATRRNAIPN GE:X ISTP L6W AN0O
N LIONWGE:R ESDP LON NTWROAATRP NLEIOXWIGTE: R6 E0PDL NOON LO WTERRAEPW AEOXNN IITTRN AG6P: 0ES
XPILT N6O T0
LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 W0
ARNING: WSARPNLI NNGOW:T A SNLPIOLN WNE:O RTSE PDLL O OWNNET RTLERDAW PEORNE EDT XOANPI TRAP TE XXII6TT 60 6
0
0
WARNING: SPL WNAORTN ILNOGWW:AE RRSENDIL ONNNGO :TT RSALPPOL WE EXNRIETTD LO6NO WTE
RAP RXIET D6 W0OANR NTIRNAG:P SEPXLW ANRTO IT6 G L:0O
WSPELR ENDT OLNO WTERARPE D ONE XTIWRTAAPR 6NE I0XNI
GT: 6 S0P
L NOT LOWWEARRENDI NG:W NSA PRTNLI RNNAGOPT SLEPOXWIE RTNE ODT6 O0ONW
ERRAEPD EOXNI TT R6A P0
EXIT 6 0
WARNWIANRGN:IA NRSGN:PI NLSG:P LS PNNLOO TNT O LTLO OWWOEWERERREDE DOD NO OTNR ATTPRR AAEXIPPT E E6I X0TI
T6 06
0
WARNING: SPLWNAWORATN NILINOGWWG:EA: RR SNEPIDPLN GONN:OO TTTSRP LALLOPO W WENEERORXETDI T 6 O0NLO
OSNWY SECTARRLELAD P 1OE1XN8WI TAT R26NR IE0NX
PIGT :Ef fXSfIPfTLd 9Na6O0 T 60
OWERED ON TRAP EXIT 6W A0R
NING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWEARRENDI NOGN :TR AS EXPILT 6WN A0OR
TN ILNOG: WSEPRLE DN OTO LONW EARRERNA PNO GN:E TXSRIPALPT NEO6X 0TL
O6W E0R
ED ON TRAP EXIT 6 0
WARNING: SWPALR NNIWONAGT: NLISOPWGLE :RN EOSDP LLO ONWN OETRR EALDPO WOENR XETIDRT AP EXOINT 6T6 R00A
P EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT W6A RN0I
NG: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNWNAGR:W IASRGP:NL ISNNPOLT :N OLTOS WPLEORWEEDR ENOON TOT NR LATOPWREAEXPRI TEXIT 6 0D
O0N
TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WWAARRNWNIAINRNNGIG:N G S:SP PLSL P LNN OONTOT TL LOOOWWEEERREEEDDD OOONNN T TRRTAAPRP EXIT 6 E0X
PI TE X6I T0
6 0
WARNING: SPL NOT LOWERED ON TRAP EXWITA R6N I0N
G: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON SYSCALL 22711 -1 EXIT f7b28400 6
WARNING: SPL NOT LOWERED ON SYSCALL 0 5 EXIT WffAfRfNaIfN8G0: 6S
PL NOT LOWERED ON TRAP EXITW A6RN IN0GW:A
RSNPILN GN:O TS PLLO WNEORTE DL OOWNE RTERDA PO NE XSIYTS C6A LWLA R1N54I64N -G1: E XSITP Lf bN2O8TW4 A0LR0ON WI6N
EGR: ESDP LO NNO T LTORWAERPE D EOXNI TT RA6P EXIT0
6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWWAERRNEID NOGN: TSRALP NEOXTT 6L O0W
ERED ON TRAP EXIT 6 W
ARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWWEREAD RONN ITRNP GE:X ITS P6L 0
NOT LOWERED ON TRAP EXITW AR6N I0sNW
yGAs:R_N _I_SgeP:tL dSNeOPTtLsL 3ON0WO(E) R LEaODtW EON RTERDA PO NE XSIYTS C6A L0L
17186W -A1R5N1I71N2G5W6:8 RSENPXILI GTN: O 0 ST7P
L NLOTO WLEORWEEDRE DO NON TTRRAAPPWE AEIRTXN II6TN 06: SP0L
NOT LOWERED ON TRAP EXIT W6A R0N
ING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWEREWDA RONNI TNAGP :E XSIWPATLR N I6NN OG0T
: LSOPWLE RNOTE DL OWOWANER RNTEIRNDAG P:O NSX PITTR A6NP O 0TE
XLIOTW E6R E0D
ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED AORNNWITANRRGAN:PI NESGXPI:LT N6SOP0TL
LONWERETD LOO TWRWEARAPER DNE XIONTNG :6T R0SA
PPL NOET XLOIWTE RE6D O0N
TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6W WAA0RR
NNIINNGG: :S PSLP LN ONTO TL OLWOEWnREeERtDE sDd :NOsN y TsSR_Y_AS_PCg eAEWtLXdALeTR n 9Nt67 23I050N
G+-0x16:50 1
S7P1A2LR5 N6NI8N GET:X I TLS OP0WL E 7RN
OTE LDO WEORNE DT ROANP TERXWAIAPT RE NX6II TN0 G6:
0S
PL NOT LOWERED ON TRAP EXIT 6 0
WARNIWNAGR: NSIPL NNOTG L:O WSERPELD ONN OTRTAP LEOXIWTE 6R E0D
WOANR NTIRNAGP:W ASERPXNLII TNN GOW:6ATR N S0LP
NLWGE :RN EODPT L OLNNO OWTTER RALEPOD W EEORXWINATE D T 6N AN0P I
SNEYGS:ICTA LS6L 10L
5 NWEXOIRTN I4NL0GW OA:7W R
ENRPILEN GDN: O TSOPLONLW ENRTOERDT A OPNL OWEWRAEARIPETI D EN6 X: SOPINLT T0NR
6OA TP0 L
EOXWIETR E6D 0O
N SYASCRLWNAI RN8N9G4N9:G :4S PESLXPI LTN WONA7TOf dNLf IO LNW6GO
E:WR ESEPDRL E DON OONTN LTOTRWRAEPAR EED X EIOXTI T 6 T06R
A0P
EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWEREWDA RONINN G: STPRLA PNO TE LXOIWETRED ON T R0P
EXIT 6 0
WARNING: SPL NOT LOWERED ON SYSCALL W2A8R8N4WI2N A:- R1SNPIL1 N7NG1OT: 5 L6SO8W EELRX D ON TRIATPN O0E TX7 I
LTO W6E R0E
D ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED OW ATRRNAIPN GE:X ISTP L6 N0O
T LOWERED OWNA RTNRIANPG :E XSIPTL 6N O0T
LOWERED ON SYSCALL 80 12 EXIT ffffd9a0 6
WARNING: SPL WOATR NLIOWE:R ESDP LO NN OTTR ALPO WEEXRIETD 6O N0
TRAP EXITW 6 0
ARNING: SPWLA RNNOINTG :L WOAPWLRE NRNIEONTGD : LOSOPWNLE ETDNR OAOTN T LEOAPW IETRXE ITD 66O N0
0T
RAP EXIT 6 0
WARNING: SPL NOWTA RLNOIWNEGR:E DS WPOANLR NNTOIRN GLP:O EESRPEILDT ON6 OT0R
A P LEOXWIETR E6D 0O
N TRAP EXIT 6 0W
AARRNNIINNGG:: SSPPLL NNOOTT LLOOWWEERREEDD OONN TTRRAAPP EEXXIITT 66 W 00
ARNING: SPL NOT LOWEREDWWAARORNNNIIN NGT:GR :S PSPLP LNE ONXTOI TTL OL6OW WE0RE
ERDE DO NO NT RTARPA PE XEIXTI T6 60
0
WARNING: SPL NWOATR NLIONWG:E RSEPDL ONONT LOWTERRAEPD WOEAXR INTTIRNA GP6: E0XSI
PTL 6 N0O
T LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT W6A RNI0NA
GR:N ISNPGL: NSOPTL LNOOWTE RLEODW EORNE DT ROANP TERXAIPT E6X I0T
6 0
WARNING: SPL NOT LOWERED WOANANR INTNIRGNAGP:S PESLX PINLTO TN6 LTO W0LE
RWEEDR EODN OTNR ATPR AEPX IETX I6T 06
0
WARNING: SPL NOT LOWERWEADR NOINN GT:R SAPPL WEARXONITT NL GO:W E6S PE0LD
ONNO TT RLAOPWERED EOXNI TT R6A P0
EWXAIRT N6 I0N
G: SPWLAR NINNOGT: LSWPOALW ENNORIEN GD: OOWSNEP LRT ERNDAO PTO NLEOXTWIRTA RPE6D E0OXN
I TRATP 6E X0I
T 6 0
WARNING: SPWL AORTN ILONWGEWR:AER DNSIPONLG : TNRAPTPL LENOXOWITTE LRO6EW DR
EODN OTNR ATR AEPX IETX IT 66 00
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNINGW:A RSIPNGL:W ASRNNPOILN TGN :OLT O SLWPOLEW REEOREDD LOONNW ETTRRAAPD P EO XNEI XTT IRAP EXI6T T06
60
0
WARNING: SPL NOT LOWERED ON TRAP EXIT 7 0
WAWRANRINNIGN:G :S PWSLAP RNN OINTNOG T:L OLSWOPEWLRE ERNEODT OLONON WT ETARRPEA DPE XOIEN TT XRIATP 60E
X0IT
6 0
WARNING: SPL NOT LOWERED ONW ATRRNAIPN WE:AX RISTP I6N 0GN
:O TS PLLO WNEORTEWDAL ROONNWI ENTRR:AE PDS POXLNI TNTOR T6A PL 0OE
WXERIETD 6O N0
TRAP EXIT W6A R0N
ING: SPL NOT LOWERED ON TRAP EXIT 6 0
WAWRANRINNIGN:G :WS APSRLNL I NNNOGOTT LPLOLW ONEWORETER DE LDOO NW OTNRR ATDR OAENP TRAXPI ETE XX6I I0T
66 00
WARNING: SPL NOT WLAORWNEIRNEGW:D ASOPRNLN INRNAGPT: ESXPOILTW EN6OR TE0
LDO WOENR ETDR AOPN EWTXRIRTN I6 N EG0X:
I STP L6 N0O
T LOWERED ON TRAP EXIT W6A R0N
ING: SPL NOT LOWERED ONWTARRANPI WNAXGRI:NT I SN6PG L:0
NSOPLT NLOTO WLEWERREEDD OONN TTRRAPA PE IT E6X 0I
T 6 0
WARNING: SPL NOT LOWERED ON WTRAARPN IENXGI:T 6S P0L
NOT WLAORWNEIRNEWGDA :RO NNSI PNLGR: A NPSO PTLE XLNIOTTW E6LRO EWD0 EO
N ETDR AOPN ETXRIATP 6E X0I
WTA R6N I0N
G: SPL NOT LWOAWRENIRNEGD: OSNP LT RNAWOPAT R ENLIXONWIGET:R E6SD P 0LO
NN OTTR ALPO WEEXRIETD O6N 0T
RAP EXIT 6 0
WARNING: SPL NOT LOWWEARRENDI NOGN: TSRPALP NEOXTI TL O6W E0R
ED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING:W ASNPILN GN:O TS PLLO WNEORTE DL OOWWNAE RRTNEIADNP G O:E X SIPRTL A 6PN OE0XT
IT 6L O0W
ERED OWNA RTNIRNAGP: ESXPLI TN OT6 LO0WE
RED ON TRAP EXIT s6y s0c
all() at WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NWOATR NLIONWGE:R ESDP LONNO TT RLAOWPE REEXDI OTN 6T RA0PW
AERXNIITN6G :0
SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TWRAAPR NEIXINT G6: 0S
PL NOT LOWERED ON TRAP EXWIATR N6I N0G
: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING:W ASRPNLI NNGO:T SLPOLW ENROT ELDO WOENR ETDR AOPN ETXRIATP 6X0I
T 6 0
WARNING: SPL NOT LOWERED ON TRAP EXIT 6 0
WARNING: SPL NOT LOWEREDW WAORAN ITNNRGIA:P :SEP XLS IPNTLO TN6O LT0 OL
WOEWREERDE DO NO NS YTSRCAAPL LE X6I1T 162 0E
XIT ffWfAfRdN9IaN0G :6W
ASRPNLI NNGO:T S PLL ONWOETR ELDO WOENRE DT ROANP TERXAIPT E6X IT0 6
0
WARNING: SPL NOT LOWERED ON TRAP EWWXAAIRRTNNIIN6GN :G0 :
SPLL NNOOTT LLOOWWEERREEDD OONN STYRSACPA LELX IT0 60 0E
XIT 0 6
WARNING: SPL NOT LOWERED ON TRAP EWXWAIATRR NN6II NN0GG
: SSPPLL NNOOT TL OWLAEORRWNEEIDRN EGOD: OTSRNP PL REANPIO TTE XL6IOnTe0Et
bR6sEd D:0 s
OyNs cTaRlAlP+ 0ExXbI5W
A R6N I-N
-G s: SsPcLa lNlO TW( nAuLRONbWIeErNR WG3EA9:D0 N)OSINP N-LT- :R-
SPALNP ONEOTXfT I LfTfLO OWd6WE0 aR06E4EDa: D
O NON c TpTuR1AW1AAP:R NE IXEnNXdI GTt r :6a cS e0P0aL
c kN.O.T.
LOWERED ON TRAP EXIT 6 0
So perhaps there are two issues and the suggested patch addresses one
of them?
Thomas
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
"(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Mon, 7 Oct 2013 07:42:23 +0000
On Thu, Aug 08, 2013 at 08:30:01PM +0000, Mindaugas Rasiukevicius wrote:
> > Can someone please take a look?
almost certainly the same as 47480 (and 41068)
> > Thomas
>
> Most likely this is due to tmpfs_dircookie() truncation here:
>
> http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
>
> It is wrong and there are other PRs because of it. Thomas, you can test
> this by replacing the function body with the following:
>
> return (off_t)(uintptr_t)de;
>
> This breaks linux32 compat, but we really just need to decide how we want
> to fix it (it would be good to avoid penalising the native code, but better
> to penalise than fail).
Four and a half years ago (in PR 41068) I asked why tmpfs does this
nonsense instead of just assigning sequence numbers to each node.
Nobody has ever managed to come up with a coherent justification, just
FUD.
--
David A. Holland
dholland@netbsd.org
From: Thomas Klausner <wiz@NetBSD.org>
To: NetBSD bugtracking <gnats-bugs@NetBSD.org>
Cc:
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
"(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Tue, 8 Oct 2013 00:34:07 +0200
--IrhDeMKUP4DT/M7F
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Mon, Oct 07, 2013 at 07:42:23AM +0000, David Holland wrote:
> On Thu, Aug 08, 2013 at 08:30:01PM +0000, Mindaugas Rasiukevicius wrote:
> > > Can someone please take a look?
>
> almost certainly the same as 47480 (and 41068)
>
> > > Thomas
> >
> > Most likely this is due to tmpfs_dircookie() truncation here:
> >
> > http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
> >
> > It is wrong and there are other PRs because of it. Thomas, you can test
> > this by replacing the function body with the following:
> >
> > return (off_t)(uintptr_t)de;
> >
> > This breaks linux32 compat, but we really just need to decide how we want
> > to fix it (it would be good to avoid penalising the native code, but better
> > to penalise than fail).
>
> Four and a half years ago (in PR 41068) I asked why tmpfs does this
> nonsense instead of just assigning sequence numbers to each node.
> Nobody has ever managed to come up with a coherent justification, just
> FUD.
Thanks for taking a look.
I've been using rmind's patch (attached) for some weeks now, and it
has definitely reduced my bulk build panics.
I still get them sometimes, so there must be a second problem.
Thomas
--IrhDeMKUP4DT/M7F
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="rmind.diff"
Index: tmpfs.h
===================================================================
RCS file: /cvsroot/src/sys/fs/tmpfs/tmpfs.h,v
retrieving revision 1.45
diff -u -r1.45 tmpfs.h
--- tmpfs.h 27 Sep 2011 01:10:43 -0000 1.45
+++ tmpfs.h 7 Oct 2013 22:32:14 -0000
@@ -87,14 +87,7 @@
 static inline off_t
 tmpfs_dircookie(tmpfs_dirent_t *de)
 {
-	off_t cookie;
-
-	cookie = ((off_t)(uintptr_t)de >> 1) & 0x7FFFFFFF;
-	KASSERT(cookie != TMPFS_DIRCOOKIE_DOT);
-	KASSERT(cookie != TMPFS_DIRCOOKIE_DOTDOT);
-	KASSERT(cookie != TMPFS_DIRCOOKIE_EOF);
-
-	return cookie;
+	return (off_t)(uintptr_t)de;
 }
 
 #endif
--IrhDeMKUP4DT/M7F--
From: David Laight <david@l8s.co.uk>
To: David Holland <dholland-bugs@netbsd.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Wed, 9 Oct 2013 21:27:38 +0100
On Mon, Oct 07, 2013 at 07:42:23AM +0000, David Holland wrote:
> On Thu, Aug 08, 2013 at 08:30:01PM +0000, Mindaugas Rasiukevicius wrote:
> > > Can someone please take a look?
>
> almost certainly the same as 47480 (and 41068)
>
> > > Thomas
> >
> > Most likely this is due to tmpfs_dircookie() truncation here:
> >
> > http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
> >
> > It is wrong and there are other PRs because of it. Thomas, you can test
> > this by replacing the function body with the following:
> >
> > return (off_t)(uintptr_t)de;
> >
> > This breaks linux32 compat, but we really just need to decide how we want
> > to fix it (it would be good to avoid penalising the native code, but better
> > to penalise than fail).
>
> Four and a half years ago (in PR 41068) I asked why tmpfs does this
> nonsense instead of just assigning sequence numbers to each node.
> Nobody has ever managed to come up with a coherent justification, just
> FUD.
If it assigned sequence numbers it would have to check for already used
values once it had created 2^32 entries (assuming it needs to generate
32bit offsets).
Something based on the algorithm used to look up process ids might be
better.
David
--
David Laight: david@l8s.co.uk
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
"(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Fri, 11 Oct 2013 08:03:11 +0000
On Wed, Oct 09, 2013 at 09:27:38PM +0100, David Laight wrote:
> > Four and a half years ago (in PR 41068) I asked why tmpfs does this
> > nonsense instead of just assigning sequence numbers to each node.
> > Nobody has ever managed to come up with a coherent justification, just
> > FUD.
>
> If it assigned sequence numbers it would have to check for already used
> values once it had created 2^32 entries (assuming it needs to generate
> 32bit offsets).
... so once you've cycled 2^32 entries through a directory, which is
in general "never", you compact it. Big deal...
--
David A. Holland
dholland@netbsd.org
From: Dennis Ferguson <dennis.c.ferguson@gmail.com>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org,
gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org,
Thomas Klausner <wiz@NetBSD.org>
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Fri, 11 Oct 2013 17:26:48 -0400
On 9 Oct, 2013, at 16:20 , David Laight <david@l8s.co.uk> wrote:
> The following reply was made to PR kern/47739; it has been noted by GNATS.
>
> From: David Laight <david@l8s.co.uk>
> To: David Holland <dholland-bugs@netbsd.org>
> Cc: gnats-bugs@NetBSD.org
> Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion "(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
> Date: Wed, 9 Oct 2013 21:27:38 +0100
>
> On Mon, Oct 07, 2013 at 07:42:23AM +0000, David Holland wrote:
>> On Thu, Aug 08, 2013 at 08:30:01PM +0000, Mindaugas Rasiukevicius wrote:
>>>> Can someone please take a look?
>>
>> almost certainly the same as 47480 (and 41068)
>>
>>>> Thomas
>>>
>>> Most likely this is due to tmpfs_dircookie() truncation here:
>>>
>>> http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
>>>
>>> It is wrong and there are other PRs because of it. Thomas, you can test
>>> this by replacing the function body with the following:
>>>
>>> return (off_t)(uintptr_t)de;
>>>
>>> This breaks linux32 compat, but we really just need to decide how we want
>>> to fix it (it would be good to avoid penalising the native code, but better
>>> to penalise than fail).
>>
>> Four and a half years ago (in PR 41068) I asked why tmpfs does this
>> nonsense instead of just assigning sequence numbers to each node.
>> Nobody has ever managed to come up with a coherent justification, just
>> FUD.
>
> If it assigned sequence numbers it would have to check for already used
> values once it had created 2^32 entries (assuming it needs to generate
> 32bit offsets).
>
> Something based on the algorithm used to look up process ids might be
> better.
Assuming my mail client doesn't ruin it I've attached a patch which adds
a sequence number to each tmpfs directory entry. The sequence numbers
are kept sorted in the directory entry list TAILQ order because it doesn't
cost anything much to do that in the current code and having a file offset
with actual ordering semantics fixes some things that the current code can
do wrong when directories are being read, like the EINVAL error that
getdents(2) says only NFS file systems are supposed to return but which
appears in the code here too.
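The difference between the two lookup styles can be sketched in userland as
follows. This is only an illustration under simplifying assumptions: the
directory is modelled as a sorted array of sequence numbers rather than a
TAILQ, and the names lookup_exact() and lookup_next() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Exact-match style: index of the entry equal to cookie, or -1 if none. */
static int
lookup_exact(const uint32_t *seq, size_t n, uint32_t cookie)
{
	for (size_t i = 0; i < n; i++)
		if (seq[i] == cookie)
			return (int)i;
	return -1;	/* nothing to resume from; a caller would have to fail */
}

/* Ordered style: index of the first entry >= cookie, or -1 at the end. */
static int
lookup_next(const uint32_t *seq, size_t n, uint32_t cookie)
{
	for (size_t i = 0; i < n; i++)
		if (seq[i] >= cookie)
			return (int)i;
	return -1;	/* past the last entry: end of directory */
}
```

With entries 2, 3, 5 and 6 (entry 4 removed between two reads), resuming at
cookie 4 finds nothing by exact match but continues at entry 5 with the
ordered lookup.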
The sequence algorithm is minimally simple. It tracks the last entry added
and attempts to add the next entry after it with a sequence number 1 higher,
incrementing past anything already using the sequence number until it finds a
free spot and wrapping to zero when it gets to the end of the sequence space.
If the last entry added is removed it backs up and reuses the numbers right
away. The first time through the sequence space it never has to skip anything,
but you are right that subsequent passes through it cost more. On the other
hand, the total additional cost for a subsequent full pass through the sequence
space is about the same as a single, unsuccessful name search in the directory
with the O(n) search algorithm it uses now. If it does a name search like that
before adding a new directory entry (I'm not positive it does since I
don't understand the name caching, but caches often can't help with names
that don't exist) the amortised necessary cost increase would be in the
fractional parts per billion. The absolute amount of work this involves
is tiny if the number of directory entries is small compared to the
sequence space. My guess is that the sequence numbers may not matter in
this case, if there are performance issues the O(n) name search is the
long pole in the tent.
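A rough userland sketch of the allocation idea described above, under loudly
stated assumptions: a small sorted array stands in for the TAILQ, the names
seqdir, seqdir_add() and seqdir_remove() are hypothetical, the sequence space
is shrunk so that wrap-around is easy to exercise, and wrapping restarts at
the first usable number. It is not the patch below, only the same technique
in miniature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEQ_MIN	2u	/* 0 and 1 stand for '.' and '..' */
#define DIR_CAP	16	/* hypothetical capacity of the toy directory */

struct seqdir {
	uint32_t seq[DIR_CAP];	/* numbers in use, kept in ascending order */
	size_t	 n;		/* how many are in use */
	size_t	 hint;		/* index where the next insert is tried first */
	uint32_t eof;		/* exclusive upper bound of the sequence space */
};

/* Allocate the next free number; returns 0 when nothing is free. */
static uint32_t
seqdir_add(struct seqdir *d)
{
	size_t i = d->hint;
	uint32_t cand = (i == 0) ? SEQ_MIN : d->seq[i - 1] + 1;

	if (d->n == DIR_CAP || d->n == d->eof - SEQ_MIN)
		return 0;
	for (;;) {
		if (cand >= d->eof) {		/* end of space: wrap around */
			cand = SEQ_MIN;
			i = 0;
		}
		if (i < d->n && d->seq[i] == cand) {	/* taken: skip past it */
			cand++;
			i++;
			continue;
		}
		break;				/* cand is free, goes at index i */
	}
	memmove(&d->seq[i + 1], &d->seq[i], (d->n - i) * sizeof(d->seq[0]));
	d->seq[i] = cand;
	d->n++;
	d->hint = i + 1;
	return cand;
}

/* Remove a number; if it was the last one added, the hint backs up. */
static void
seqdir_remove(struct seqdir *d, uint32_t s)
{
	for (size_t j = 0; j < d->n; j++) {
		if (d->seq[j] != s)
			continue;
		memmove(&d->seq[j], &d->seq[j + 1],
		    (d->n - j - 1) * sizeof(d->seq[0]));
		d->n--;
		if (j < d->hint)
			d->hint--;
		return;
	}
}
```

On the first pass every add takes the fast path (one comparison); only after
the space has wrapped does the loop have to step over numbers still in use.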
It does read sequence numbers out of the bracketing directory entries during
name adds when it could instead easily cache them in the directory inode. I
didn't do that because I didn't want to increase the inode size, and I suspect
the directory entries are being read by a name search anyway so the data it
needs is likely to be in the processor cache already. If that isn't true this
could be easily fixed.
This was done after I got a crash like this PR while trying to see if I
could speed up builds this way. I ran it for a while with TMPFS_DIRCOOKIE_EOF
set much smaller, it is actually quite difficult to get the 2^31
sequence space to roll over under normal use (it takes a long time for
a C program explicitly designed to make it roll over to accomplish that).
I didn't like the memory address cookie thing because it was too easy to
think up things it could do that are wrong, even if they probably wouldn't
happen. The code below may not be bug free, but it does the same few
things over and over without the possibility of exceptions that are out
of your control so it might be possible to get the bugs out of it.
Dennis Ferguson
Index: sys/fs/tmpfs/tmpfs.h
===================================================================
RCS file: /cvsroot/src/sys/fs/tmpfs/tmpfs.h,v
retrieving revision 1.45
diff -u -r1.45 tmpfs.h
--- sys/fs/tmpfs/tmpfs.h 27 Sep 2011 01:10:43 -0000 1.45
+++ sys/fs/tmpfs/tmpfs.h 11 Oct 2013 18:37:27 -0000
@@ -44,6 +44,11 @@
#include <sys/vnode.h>

/*
+ * Type of a directory entry sequence number. See TMPFS_DIRCOOKIE_EOF.
+ */
+typedef uint32_t tmpfs_seq_t;
+
+/*
* Internal representation of a tmpfs directory entry.
*
* All fields are protected by vnode lock.
@@ -54,9 +59,12 @@
/* Pointer to the inode this entry refers to. */
struct tmpfs_node * td_node;

+ /* Sequence in directory */
+ tmpfs_seq_t td_seq;
+
/* Name and its length. */
- char * td_name;
uint16_t td_namelen;
+ char * td_name;
} tmpfs_dirent_t;

TAILQ_HEAD(tmpfs_dir, tmpfs_dirent);
@@ -67,36 +75,29 @@
/* Validate maximum td_namelen length. */
CTASSERT(TMPFS_MAXNAMLEN < UINT16_MAX);

-#define TMPFS_DIRCOOKIE_DOT 0
-#define TMPFS_DIRCOOKIE_DOTDOT 1
-#define TMPFS_DIRCOOKIE_EOF 2
-
/*
- * Each entry in a directory has a cookie that identifies it. Cookies
- * supersede offsets within directories, as tmpfs has no offsets as such.
+ * Each entry in a directory has a sequence number that identifies it.
+ * The directory entries are kept sorted into ascending sequence order.
+ * This acts as a stand-in for offset; it provides ordering for sequential
+ * directory reads.
*
- * The '.', '..' and the end of directory markers have fixed cookies,
- * which cannot collide with the cookies generated by other entries.
+ * The '.' and '..' entries have fixed sequence numbers (and no actual
+ * directory entries). Directory entries have a sequence number greater
+ * than those but less than TMPFS_DIRCOOKIE_EOF. The latter is set
+ * to 2^31-1 to avoid Linux compat problems, see PR32034, and the type
+ * of tmpfs_seq_t is set to uint32_t to match. For no Linux compatibility
+ * and huge directories make tmpfs_seq_t an off_t and _EOF a much larger
+ * number.
*
- * The cookies for the other entries are generated based on the memory
- * address of their representative meta-data structure.
- *
- * XXX: Truncating directory cookies to 31 bits now - workaround for
- * problem with Linux compat, see PR/32034.
+ * We call the sequence numbers "cookies" since the old code used the name
+ * and they are used to fill in the cookie fields for NFS directory reads.
*/
-static inline off_t
-tmpfs_dircookie(tmpfs_dirent_t *de)
-{
- off_t cookie;
-
- cookie = ((off_t)(uintptr_t)de >> 1) & 0x7FFFFFFF;
- KASSERT(cookie != TMPFS_DIRCOOKIE_DOT);
- KASSERT(cookie != TMPFS_DIRCOOKIE_DOTDOT);
- KASSERT(cookie != TMPFS_DIRCOOKIE_EOF);
+#define TMPFS_DIRCOOKIE_DOT 0
+#define TMPFS_DIRCOOKIE_DOTDOT 1
+#define TMPFS_DIRCOOKIE_MIN 2 /* min td_seq */
+#define TMPFS_DIRCOOKIE_EOF 0x7fffffff /* max td_seq */

- return cookie;
-}
-#endif
+#endif /* defined(_KERNEL) */

/*
* Internal representation of a tmpfs file system node -- inode.
@@ -169,12 +170,14 @@
/* List of directory entries. */
struct tmpfs_dir tn_dir;

+ /* Pointer to insertion point for new entries */
+ struct tmpfs_dirent * tn_insert;
+
/*
- * Number and pointer of the last directory entry
+ * Pointer to the last directory entry
* returned by the readdir(3) operation.
*/
- off_t tn_readdir_lastn;
- struct tmpfs_dirent * tn_readdir_lastp;
+ struct tmpfs_dirent * tn_readdir_last;
} tn_dir;

/* Type case: VLNK. */
@@ -278,7 +281,7 @@

int tmpfs_dir_getdotdent(tmpfs_node_t *, struct uio *);
int tmpfs_dir_getdotdotdent(tmpfs_node_t *, struct uio *);
-tmpfs_dirent_t *tmpfs_dir_lookupbycookie(tmpfs_node_t *, off_t);
+tmpfs_dirent_t *tmpfs_dir_getnext(tmpfs_node_t *, off_t);
int tmpfs_dir_getdents(tmpfs_node_t *, struct uio *, off_t *);

int tmpfs_reg_resize(vnode_t *, off_t);
@@ -324,9 +327,6 @@
#define TMPFS_VALIDATE_DIR(node) \
KASSERT((node)->tn_type == VDIR); \
KASSERT((node)->tn_size % sizeof(tmpfs_dirent_t) == 0); \
- KASSERT((node)->tn_spec.tn_dir.tn_readdir_lastp == NULL || \
- tmpfs_dircookie((node)->tn_spec.tn_dir.tn_readdir_lastp) == \
- (node)->tn_spec.tn_dir.tn_readdir_lastn);

/*
* Memory management stuff.
Index: sys/fs/tmpfs/tmpfs_subr.c
===================================================================
RCS file: /cvsroot/src/sys/fs/tmpfs/tmpfs_subr.c,v
retrieving revision 1.80
diff -u -r1.80 tmpfs_subr.c
--- sys/fs/tmpfs/tmpfs_subr.c 4 Oct 2013 15:14:11 -0000 1.80
+++ sys/fs/tmpfs/tmpfs_subr.c 11 Oct 2013 18:37:27 -0000
@@ -155,8 +155,8 @@
/* Directory. */
TAILQ_INIT(&nnode->tn_spec.tn_dir.tn_dir);
nnode->tn_spec.tn_dir.tn_parent = NULL;
- nnode->tn_spec.tn_dir.tn_readdir_lastn = 0;
- nnode->tn_spec.tn_dir.tn_readdir_lastp = NULL;
+ nnode->tn_spec.tn_dir.tn_insert = NULL;
+ nnode->tn_spec.tn_dir.tn_readdir_last = NULL;

/* Extra link count for the virtual '.' entry. */
nnode->tn_links++;
@@ -439,6 +439,53 @@
}

/*
+ * tmpfs_dir_newinsert: find a spot in the directory where
+ * there is free sequence number space so that nodes can be
+ * inserted.
+ *
+ * => Scan the directory entries forward from the current
+ * insert point looking for an unused sequence number.
+ * => Return the predecessor node to the unused spot. This
+ * might be NULL if TMPFS_DIRCOOKIE_MIN is not in use.
+ * => Stop searching as soon as we find even a single empty
+ * spot. If our sequence space is size S and the number
+ * of directory entries is N then the cost of doing (S - N)
+ * tmpfs_dir_attach() operations will be scanning N directory
+ * entries (or about the same as two name lookups given the
+ * O(N) algorithm used here).
+ */
+static tmpfs_dirent_t *
+tmpfs_dir_newinsert(tmpfs_node_t *node)
+{
+ tmpfs_dirent_t *de, *de_next, *de_first;
+ tmpfs_seq_t seq, seq_next;
+
+ /* If we're called there's no space here, so seq 1 past first */
+ de_first = node->tn_spec.tn_dir.tn_insert;
+ seq_next = de_first->td_seq + 1;
+ de_next = TAILQ_NEXT(de_first, td_entries);
+
+ do {
+ KASSERT(de_next != de_first); /* all seqno's used?!? */
+ de = de_next;
+ if (de == NULL) {
+ seq = TMPFS_DIRCOOKIE_MIN + 1;
+ de_next = TAILQ_FIRST(&node->tn_spec.tn_dir.tn_dir);
+ } else {
+ seq = seq_next;
+ de_next = TAILQ_NEXT(de, td_entries);
+ }
+ if (de_next == NULL) {
+ seq_next = TMPFS_DIRCOOKIE_EOF - 1;
+ } else {
+ seq_next = de_next->td_seq;
+ }
+ } while (seq >= seq_next);
+
+ return de;
+}
+
+/*
* tmpfs_dir_attach: associate directory entry with a specified inode,
* and attach the entry into the directory, specified by vnode.
*
@@ -451,6 +498,8 @@
tmpfs_dir_attach(vnode_t *dvp, tmpfs_dirent_t *de, tmpfs_node_t *node)
{
tmpfs_node_t *dnode = VP_TO_TMPFS_DIR(dvp);
+ tmpfs_dirent_t *de_prev, *de_next;
+ tmpfs_seq_t new_seq;
int events = NOTE_WRITE;

KASSERT(VOP_ISLOCKED(dvp));
@@ -465,8 +514,35 @@
node->tn_dirent_hint = de;
}

- /* Insert the entry to the directory (parent of inode). */
- TAILQ_INSERT_TAIL(&dnode->tn_spec.tn_dir.tn_dir, de, td_entries);
+ /* Find the insertion point. Make sure we have a sequence space. */
+ de_prev = dnode->tn_spec.tn_dir.tn_insert;
+ if (de_prev == NULL) {
+ new_seq = TMPFS_DIRCOOKIE_MIN;
+ } else {
+ new_seq = de_prev->td_seq + 1;
+ de_next = TAILQ_NEXT(de_prev, td_entries);
+ if (new_seq >= TMPFS_DIRCOOKIE_EOF ||
+ (de_next != NULL && new_seq >= de_next->td_seq)) {
+ de_prev = tmpfs_dir_newinsert(dnode);
+ if (de_prev == NULL) {
+ new_seq = TMPFS_DIRCOOKIE_MIN;
+ } else {
+ new_seq = de_prev->td_seq + 1;
+ }
+ }
+ }
+
+ /* Insert the entry into directory after de_prev (or at head) */
+ de->td_seq = new_seq;
+ if (de_prev) {
+ TAILQ_INSERT_AFTER(&dnode->tn_spec.tn_dir.tn_dir,
+ de_prev, de, td_entries);
+ } else {
+ TAILQ_INSERT_HEAD(&dnode->tn_spec.tn_dir.tn_dir,
+ de, td_entries);
+ }
+ dnode->tn_spec.tn_dir.tn_insert = de;
+
dnode->tn_size += sizeof(tmpfs_dirent_t);
dnode->tn_status |= TMPFS_NODE_STATUSALL;
uvm_vnp_setsize(dvp, dnode->tn_size);
@@ -528,9 +604,12 @@
}

/* Remove the entry from the directory. */
- if (dnode->tn_spec.tn_dir.tn_readdir_lastp == de) {
- dnode->tn_spec.tn_dir.tn_readdir_lastn = 0;
- dnode->tn_spec.tn_dir.tn_readdir_lastp = NULL;
+ if (dnode->tn_spec.tn_dir.tn_readdir_last == de) {
+ dnode->tn_spec.tn_dir.tn_readdir_last = NULL;
+ }
+ if (dnode->tn_spec.tn_dir.tn_insert == de) {
+ dnode->tn_spec.tn_dir.tn_insert =
+ TAILQ_PREV(de, tmpfs_dir, td_entries);
}
TAILQ_REMOVE(&dnode->tn_spec.tn_dir.tn_dir, de, td_entries);

@@ -620,7 +699,7 @@
else {
error = uiomove(dentp, dentp->d_reclen, uio);
if (error == 0)
- uio->uio_offset = TMPFS_DIRCOOKIE_DOTDOT;
+ uio->uio_offset = TMPFS_DIRCOOKIE_DOT + 1;
}
node->tn_status |= TMPFS_NODE_ACCESSED;
kmem_free(dentp, sizeof(struct dirent));
@@ -654,13 +733,7 @@
else {
error = uiomove(dentp, dentp->d_reclen, uio);
if (error == 0) {
- tmpfs_dirent_t *de;
-
- de = TAILQ_FIRST(&node->tn_spec.tn_dir.tn_dir);
- if (de == NULL)
- uio->uio_offset = TMPFS_DIRCOOKIE_EOF;
- else
- uio->uio_offset = tmpfs_dircookie(de);
+ uio->uio_offset = TMPFS_DIRCOOKIE_DOTDOT + 1;
}
}
node->tn_status |= TMPFS_NODE_ACCESSED;
@@ -669,23 +742,31 @@
}

/*
- * tmpfs_dir_lookupbycookie: lookup a directory entry by associated cookie.
+ * tmpfs_dir_getnext: find an entry with a sequence >= cookie
*/
tmpfs_dirent_t *
-tmpfs_dir_lookupbycookie(tmpfs_node_t *node, off_t cookie)
+tmpfs_dir_getnext(tmpfs_node_t *node, off_t cookie)
{
tmpfs_dirent_t *de;
+ tmpfs_seq_t next_seq;

KASSERT(VOP_ISLOCKED(node->tn_vnode));

- if (cookie == node->tn_spec.tn_dir.tn_readdir_lastn &&
- node->tn_spec.tn_dir.tn_readdir_lastp != NULL) {
- return node->tn_spec.tn_dir.tn_readdir_lastp;
+ if (cookie >= TMPFS_DIRCOOKIE_EOF || cookie < TMPFS_DIRCOOKIE_DOT) {
+ return NULL;
}
- TAILQ_FOREACH(de, &node->tn_spec.tn_dir.tn_dir, td_entries) {
- if (tmpfs_dircookie(de) == cookie) {
+ next_seq = (tmpfs_seq_t) cookie;
+
+ de = node->tn_spec.tn_dir.tn_readdir_last;
+ if (de == NULL || de->td_seq > next_seq) {
+ de = TAILQ_FIRST(&node->tn_spec.tn_dir.tn_dir);
+ }
+
+ while (de) {
+ if (de->td_seq >= next_seq) {
break;
}
+ de = TAILQ_NEXT(de, td_entries);
}
return de;
}
@@ -699,7 +780,7 @@
int
tmpfs_dir_getdents(tmpfs_node_t *node, struct uio *uio, off_t *cntp)
{
- tmpfs_dirent_t *de;
+ tmpfs_dirent_t *de, *last_de;
struct dirent *dentp;
off_t startcookie;
int error;
@@ -715,17 +796,14 @@
startcookie = uio->uio_offset;
KASSERT(startcookie != TMPFS_DIRCOOKIE_DOT);
KASSERT(startcookie != TMPFS_DIRCOOKIE_DOTDOT);
- if (startcookie == TMPFS_DIRCOOKIE_EOF) {
- return 0;
- } else {
- de = tmpfs_dir_lookupbycookie(node, startcookie);
- }
+ de = tmpfs_dir_getnext(node, startcookie);
if (de == NULL) {
- return EINVAL;
+ return 0;
}
+ last_de = NULL; /* track last entry written */

/*
- * Read as much entries as possible; i.e., until we reach the end
+ * Read as many entries as possible; i.e., until we reach the end
* of the directory or we exhaust uio space.
*/
dentp = kmem_alloc(sizeof(struct dirent), KM_SLEEP);
@@ -783,20 +861,26 @@
* advance pointers.
*/
error = uiomove(dentp, dentp->d_reclen, uio);
+ if (error != 0) {
+ break;
+ }

+ /*
+ * At this point we have successfully written de. Keep
+ * track the last successfully written entry in last_de.
+ */
(*cntp)++;
- de = TAILQ_NEXT(de, td_entries);
- } while (error == 0 && uio->uio_resid > 0 && de != NULL);
+ last_de = de;
+ } while (uio->uio_resid > 0 &&
+ (de = TAILQ_NEXT(de, td_entries)) != NULL);

/* Update the offset and cache. */
if (de == NULL) {
uio->uio_offset = TMPFS_DIRCOOKIE_EOF;
- node->tn_spec.tn_dir.tn_readdir_lastn = 0;
- node->tn_spec.tn_dir.tn_readdir_lastp = NULL;
- } else {
- node->tn_spec.tn_dir.tn_readdir_lastn = uio->uio_offset =
- tmpfs_dircookie(de);
- node->tn_spec.tn_dir.tn_readdir_lastp = de;
+ node->tn_spec.tn_dir.tn_readdir_last = NULL;
+ } else if (last_de) {
+ uio->uio_offset = last_de->td_seq + 1;
+ node->tn_spec.tn_dir.tn_readdir_last = last_de;
}
node->tn_status |= TMPFS_NODE_ACCESSED;
kmem_free(dentp, sizeof(struct dirent));
Index: sys/fs/tmpfs/tmpfs_vnops.c
===================================================================
RCS file: /cvsroot/src/sys/fs/tmpfs/tmpfs_vnops.c,v
retrieving revision 1.103
diff -u -r1.103 tmpfs_vnops.c
--- sys/fs/tmpfs/tmpfs_vnops.c 4 Oct 2013 15:14:11 -0000 1.103
+++ sys/fs/tmpfs/tmpfs_vnops.c 11 Oct 2013 18:37:27 -0000
@@ -995,21 +995,21 @@
*ncookies = cnt;

for (i = 0; i < cnt; i++) {
- KASSERT(off != TMPFS_DIRCOOKIE_EOF);
+ KASSERT(off < uio->uio_offset);
if (off != TMPFS_DIRCOOKIE_DOT) {
if (off == TMPFS_DIRCOOKIE_DOTDOT) {
de = TAILQ_FIRST(&node->tn_spec.tn_dir.tn_dir);
} else if (de != NULL) {
de = TAILQ_NEXT(de, td_entries);
} else {
- de = tmpfs_dir_lookupbycookie(node, off);
+ de = tmpfs_dir_getnext(node, off);
KASSERT(de != NULL);
de = TAILQ_NEXT(de, td_entries);
}
if (de == NULL) {
- off = TMPFS_DIRCOOKIE_EOF;
+ off = uio->uio_offset;
} else {
- off = tmpfs_dircookie(de);
+ off = de->td_seq;
}
} else {
off = TMPFS_DIRCOOKIE_DOTDOT;
From: Mindaugas Rasiukevicius <rmind@netbsd.org>
To: Dennis Ferguson <dennis.c.ferguson@gmail.com>, Thomas Klausner
<wiz@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
"(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Sun, 27 Oct 2013 16:03:40 +0000
Dennis Ferguson <dennis.c.ferguson@gmail.com> wrote:
> >>>> <...>
> >>>
> >>> Most likely this is due to tmpfs_dircookie() truncation here:
> >>>
> >>> http://nxr.netbsd.org/source/xref/src/sys/fs/tmpfs/tmpfs.h#88
> >>>
> >>> It is wrong and there are other PRs because of it. Thomas, you can
> >>> test this by replacing the function body with the following:
> >>>
> >>> return (off_t)(uintptr_t)de;
> >>>
> >>> This breaks linux32 compat, but we really just need to decide how we
> >>> want to fix it (it would be good to avoid penalising the native code,
> >>> but better to penalise than fail).
> >>
> >> <...>
>
> Assuming my mail client doesn't ruin it I've attached a patch which adds
> a sequence number to each tmpfs directory entry. The sequence numbers
> are kept sorted in the directory entry list TAILQ order because it doesn't
> cost anything much to do that in the current code and having a file
> offset with actual ordering semantics fixes some things that the current
> code can do wrong when directories are being read, like the EINVAL error
> that getdents(2) says only NFS file systems are supposed to return but
> which appears in the code here too.
I think there is a better way. Here is the patch:
http://www.netbsd.org/~rmind/tmpfs_readdir_fixes.diff
tmpfs.h | 86 +++++--------
tmpfs_rename.c | 14 +-
tmpfs_subr.c | 373 ++++++++++++++++++++++++++++++++-------------------------
tmpfs_vfsops.c | 39 +++--
tmpfs_vnops.c | 75 ++++-------
5 files changed, 309 insertions(+), 278 deletions(-)
It also fixes tmpfs_unmount() and the net diff seems better. :)
Thomas, can you try this patch on your build-hammer-machine?
Thanks.
--
Mindaugas
From: Thomas Klausner <wiz@NetBSD.org>
To: Mindaugas Rasiukevicius <rmind@netbsd.org>
Cc: Dennis Ferguson <dennis.c.ferguson@gmail.com>,
NetBSD bugtracking <gnats-bugs@NetBSD.org>
Subject: Re: kern/47739: tmpfs panic: kernel diagnostic assertion
"(node)->tn_spec.tn_dir.tn_readdir_lastp == NULL..."
Date: Mon, 28 Oct 2013 09:58:31 +0100
Hi Mindaugas!
Thanks for the patch!
On Sun, Oct 27, 2013 at 04:03:40PM +0000, Mindaugas Rasiukevicius wrote:
> Thomas, can you try this patch on your build-hammer-machine?
I've built a new kernel with it and started a bulk build. I'll let you
know if it reboots :)
Thomas
From: "Mindaugas Rasiukevicius" <rmind@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/47739 CVS commit: src/sys/fs/tmpfs
Date: Fri, 8 Nov 2013 15:44:23 +0000
Module Name: src
Committed By: rmind
Date: Fri Nov 8 15:44:23 UTC 2013
Modified Files:
src/sys/fs/tmpfs: tmpfs.h tmpfs_rename.c tmpfs_subr.c tmpfs_vfsops.c
tmpfs_vnops.c
Log Message:
tmpfs: replace the broken tmpfs_dircookie() logic which uses the node
address truncated to 31 bits (required for 32-bit readdir compatibility,
e.g. linux32). Instead, assign 2^31 range using the following logic:
- The first half of the 2^31 is assigned incrementally (the fast path).
- When exceeded, use the second half of 2^31, but manage with vmem(9).
It will require 2 billion files per-directory to trigger vmem(9) usage.
Also, while here, add some fixes for tmpfs_unmount().
Should fix PR/47739, PR/47480, PR/46088 and PR/41068.
Thanks to wiz@ for stress testing.
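
The two-range scheme described in the log message can be sketched in userland
as follows. This is not the committed code: the names seqalloc, seqalloc_init(),
seqalloc_get() and seqalloc_put() are hypothetical, the range is scaled down to
16 numbers so that a test can exhaust the first half, and a plain byte array
stands in for the vmem(9) arena that manages the second half.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SEQ_START	2u		/* 0 and 1 stand for '.' and '..' */
#define SEQ_HALF	8u		/* scaled-down end of the incremental half */
#define SEQ_END		(2u * SEQ_HALF)	/* scaled-down end of the whole range */

struct seqalloc {
	uint32_t next;				/* next number of the first half */
	uint8_t	 used[SEQ_END - SEQ_HALF];	/* stand-in for a vmem(9) arena */
};

static void
seqalloc_init(struct seqalloc *a)
{
	memset(a, 0, sizeof(*a));
	a->next = SEQ_START;
}

/* Returns a fresh number, or 0 when the whole range is exhausted. */
static uint32_t
seqalloc_get(struct seqalloc *a)
{
	if (a->next < SEQ_HALF)
		return a->next++;		/* fast path: just increment */

	/* Slow path: first free slot of the managed second half. */
	for (uint32_t i = 0; i < SEQ_END - SEQ_HALF; i++) {
		if (!a->used[i]) {
			a->used[i] = 1;
			return SEQ_HALF + i;
		}
	}
	return 0;
}

/* Release a number; in this sketch only the second half is recycled. */
static void
seqalloc_put(struct seqalloc *a, uint32_t seq)
{
	if (seq >= SEQ_HALF && seq < SEQ_END)
		a->used[seq - SEQ_HALF] = 0;
}
```

The point of the split is that the common case costs one comparison and one
increment, while the bookkeeping needed for reuse is only paid by a directory
that has run through the whole first half.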
To generate a diff of this commit:
cvs rdiff -u -r1.45 -r1.46 src/sys/fs/tmpfs/tmpfs.h
cvs rdiff -u -r1.4 -r1.5 src/sys/fs/tmpfs/tmpfs_rename.c
cvs rdiff -u -r1.82 -r1.83 src/sys/fs/tmpfs/tmpfs_subr.c
cvs rdiff -u -r1.52 -r1.53 src/sys/fs/tmpfs/tmpfs_vfsops.c
cvs rdiff -u -r1.105 -r1.106 src/sys/fs/tmpfs/tmpfs_vnops.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
Responsible-Changed-From-To: kern-bug-people->rmind
Responsible-Changed-By: rmind@NetBSD.org
Responsible-Changed-When: Fri, 08 Nov 2013 15:55:26 +0000
Responsible-Changed-Why:
State-Changed-From-To: open->closed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Fri, 08 Nov 2013 15:55:26 +0000
State-Changed-Why:
Should be fixed in -current. Please let us know if you ever see a
similar problem again.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.