NetBSD Problem Report #42093
From njoly@lanfeust.sis.pasteur.fr Sat Sep 19 09:18:24 2009
Return-Path: <njoly@lanfeust.sis.pasteur.fr>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id 085BD63BC1D
for <gnats-bugs@gnats.NetBSD.org>; Sat, 19 Sep 2009 09:18:24 +0000 (UTC)
Message-Id: <20090919091820.F04EFDC9B9@lanfeust.sis.pasteur.fr>
Date: Sat, 19 Sep 2009 11:18:20 +0200 (CEST)
From: njoly@pasteur.fr
Reply-To: njoly@pasteur.fr
To: gnats-bugs@gnats.NetBSD.org
Subject: rump rename race testcase crashes
X-Send-Pr-Version: 3.95
>Number: 42093
>Category: kern
>Synopsis: rump rename race testcase crashes
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: lib-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Sep 19 09:20:00 +0000 2009
>Closed-Date: Mon Jul 18 15:56:16 +0000 2011
>Last-Modified: Mon Jul 18 15:56:16 +0000 2011
>Originator: Nicolas Joly
>Release: NetBSD 5.99.18
>Organization:
Insitut Pasteur
>Environment:
System: NetBSD lanfeust.sis.pasteur.fr 5.99.18 NetBSD 5.99.18 (LANFEUST) #0: Sat Sep 19 01:00:25 CEST 2009 njoly@lanfeust.sis.pasteur.fr:/local/src/NetBSD/obj.amd64/sys/arch/amd64/compile/LANFEUST amd64
Architecture: x86_64
Machine: amd64
>Description:
Noticed that ffs/t_renamerace ATF testcase sometimes crash. Trying to debug it,
i made the following rump small code which do crash too.
The crashes do only appear when logging is enabled (MNT_LOG), and are not so uncommom.
On 10 runs; i get 6 success, 3 crashes with `Resource deadlock avoided' and 1
with `segmentation fault'.
njoly@lanfeust [rump/ffs]> make ffs_rename
cc -g -O0 -Wall -Werror -lrumpfs_ffs -lrumpvfs -o ffs_rename ffs_rename.c
I currently do see 2 different symptoms when running it:
1)
njoly@lanfeust [rump/ffs]> ./ffs_rename
panic: rumpuser fatal failure 11 (Resource deadlock avoided)
zsh: abort (core dumped) ./ffs_rename
njoly@lanfeust [rump/ffs]> gdb ffs_rename ffs_rename.core
GNU gdb 6.5
[...]
Core was generated by `ffs_rename'.
Program terminated with signal 6, Aborted.
#0 0x00007f7ffd7e0c0a in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0 0x00007f7ffd7e0c0a in _lwp_kill () from /usr/lib/libc.so.12
#1 0x00007f7ffd7e057e in abort () at /local/src/NetBSD/src/lib/libc/stdlib/abort.c:74
#2 0x00007f7ffd103662 in rumpuser_rw_enter (rw=0x7f7ffb2011a0, iswrite=<value optimized out>)
at /local/src/NetBSD/src/lib/librumpuser/../../sys/rump/librump/rumpuser/rumpuser_pth.c:353
#3 0x00007f7ffda2ab6a in vlockmgr (vl=0x7f7ffb20f2a8, flags=0)
at /local/src/NetBSD/src/lib/librumpvfs/../../sys/rump/../kern/vfs_subr.c:2832
#4 0x00007f7ffd538d29 in VOP_LOCK (vp=0x7f7ffb20f1a0, flags=6)
at /local/src/NetBSD/src/lib/librump/../../sys/rump/../kern/vnode_if.c:1380
#5 0x00007f7ffda24c38 in vn_lock (vp=0x7f7ffb20f1a0, flags=2)
at /local/src/NetBSD/src/lib/librumpvfs/../../sys/rump/../kern/vfs_vnops.c:765
#6 0x00007f7ffda2ec4c in vget (vp=0x7f7ffb20f1a0, flags=65538)
at /local/src/NetBSD/src/lib/librumpvfs/../../sys/rump/../kern/vfs_subr.c:1310
#7 0x00007f7ffdc26bee in ufs_ihashget (dev=65024, inum=5, flags=2)
at /local/src/NetBSD/src/sys/rump/fs/lib/libffs/../../../../ufs/ufs/ufs_ihash.c:153
#8 0x00007f7ffdc0eeaf in ffs_vget (mp=0x7f7ffc6b3000, ino=5, vpp=0x7f7ffb7ff8b0)
at /local/src/NetBSD/src/sys/rump/fs/lib/libffs/../../../../ufs/ffs/ffs_vfsops.c:1717
#9 0x00007f7ffdc0c07a in ufs_lookup (v=<value optimized out>)
at /local/src/NetBSD/src/sys/rump/fs/lib/libffs/../../../../ufs/ufs/ufs_lookup.c:541
#10 0x00007f7ffd539c71 in VOP_LOOKUP (dvp=0x7f7ffc6ad400, vpp=0x6, cnp=<value optimized out>)
at /local/src/NetBSD/src/lib/librump/../../sys/rump/../kern/vnode_if.c:130
#11 0x00007f7ffdc17599 in wapbl_ufs_rename (v=<value optimized out>)
at /local/src/NetBSD/src/sys/rump/fs/lib/libffs/../../../../ufs/ufs/ufs_wapbl.c:358
#12 0x00007f7ffdc14403 in ufs_rename (v=<value optimized out>)
at /local/src/NetBSD/src/sys/rump/fs/lib/libffs/../../../../ufs/ufs/ufs_vnops.c:976
#13 0x00007f7ffd539074 in VOP_RENAME (fdvp=0x7f7ffc6ad400, fvp=0x6, fcnp=<value optimized out>,
tdvp=0x7f7ffd7e0c0a, tvp=0x1, tcnp=0x8080808080808080)
at /local/src/NetBSD/src/lib/librump/../../sys/rump/../kern/vnode_if.c:998
#14 0x00007f7ffda25ef6 in do_sys_rename (from=<value optimized out>, to=0x4011ba "/mnt/rename2.test",
seg=UIO_USERSPACE, retain=0)
at /local/src/NetBSD/src/lib/librumpvfs/../../sys/rump/../kern/vfs_syscalls.c:3395
#15 0x00007f7ffd552a94 in rump_sys_rename (from=<value optimized out>, to=<value optimized out>)
at /local/src/NetBSD/src/lib/librump/../../sys/rump/librump/rumpkern/rump_syscalls.c:878
#16 0x0000000000400e86 in func2 (arg=0x0) at ffs_rename.c:37
#17 0x00007f7ffd30a812 in pthread__create_tramp (cookie=<value optimized out>)
at /local/src/NetBSD/src/lib/libpthread/pthread.c:476
#18 0x00007f7ffd76e690 in ___lwp_park50 () from /usr/lib/libc.so.12
#19 0x00007f7ffb800000 in ?? ()
#20 0x0000000111110001 in ?? ()
#21 0x0000000033330003 in ?? ()
#22 0x0000000000000000 in ?? ()
2)
njoly@lanfeust [rump/ffs]> ./ffs_rename
zsh: segmentation fault (core dumped) ./ffs_rename
njoly@lanfeust [rump/ffs]> gdb ffs_rename ffs_rename.core
GNU gdb 6.5
[...]
Core was generated by `ffs_rename'.
Program terminated with signal 11, Segmentation fault.
#0 0x00007f7ffdc0b198 in ufs_makedirentry (ip=0x7f7ffb416100,
cnp=0x7f7ffb7ffb70, newdirp=0x7f7ffb51a040)
at /local/src/NetBSD/src/sys/rump/fs/lib/libffs/../../../../ufs/ufs/ufs_lookup.c:735
735 if (FSFMT(ITOV(ip)))
(gdb) bt
#0 0x00007f7ffdc0b198 in ufs_makedirentry (ip=0x7f7ffb416100,
cnp=0x7f7ffb7ffb70, newdirp=0x7f7ffb51a040)
at /local/src/NetBSD/src/sys/rump/fs/lib/libffs/../../../../ufs/ufs/ufs_lookup.c:735
#1 0x00007f7ffdc17f3b in wapbl_ufs_rename (v=<value optimized out>)
at /local/src/NetBSD/src/sys/rump/fs/lib/libffs/../../../../ufs/ufs/ufs_wapbl.c:443
#2 0x00007f7ffdc14403 in ufs_rename (v=<value optimized out>)
at /local/src/NetBSD/src/sys/rump/fs/lib/libffs/../../../../ufs/ufs/ufs_vnops.c:976
#3 0x00007f7ffd539074 in VOP_RENAME (fdvp=0x7f7ffc6ad400, fvp=0x7f7ffb501811,
fcnp=<value optimized out>, tdvp=0x0, tvp=0x4, tcnp=0xe2)
at /local/src/NetBSD/src/lib/librump/../../sys/rump/../kern/vnode_if.c:998
#4 0x00007f7ffda25ef6 in do_sys_rename (from=<value optimized out>,
to=0x4011ba "/mnt/rename2.test", seg=UIO_USERSPACE, retain=0)
at /local/src/NetBSD/src/lib/librumpvfs/../../sys/rump/../kern/vfs_syscalls.c:3395
#5 0x00007f7ffd552a94 in rump_sys_rename (from=<value optimized out>,
to=<value optimized out>)
at /local/src/NetBSD/src/lib/librump/../../sys/rump/librump/rumpkern/rump_syscalls.c:878
#6 0x0000000000400e86 in func2 (arg=0x0) at ffs_rename.c:37
#7 0x00007f7ffd30a812 in pthread__create_tramp (cookie=<value optimized out>)
at /local/src/NetBSD/src/lib/libpthread/pthread.c:476
#8 0x00007f7ffd76e690 in ___lwp_park50 () from /usr/lib/libc.so.12
#9 0x00007f7ffb800000 in ?? ()
#10 0x0000000111110001 in ?? ()
#11 0x0000000033330003 in ?? ()
#12 0x0000000000000000 in ()
(gdb) p ip
$1 = (struct inode *) 0x7f7ffb416100
(gdb) p ip->i_vnode
$2 = (struct vnode *) 0x7f7ffb40f070
(gdb) p ip->i_vnode->v_mount
$3 = (struct mount *) 0x0
(gdb) p ip->i_vnode->v_mount->mnt_iflag
Cannot access memory at address 0x64
njoly@lanfeust [rump/ffs]> cat ffs_rename.c
#include <rump/rump_syscalls.h>
#include <rump/rump.h>
#include <ufs/ufs/ufsmount.h>
#include <err.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#define IMAGE "image.ffs"
#define DEVICE "/dev/device.ffs"
#define MNTDIR "/mnt"
#define FILE1 MNTDIR "/rename1.test"
#define FILE2 MNTDIR "/rename2.test"
static int quit = 0;
static void *func1(void *arg) {
int fd;
while (quit != 1) {
fd = rump_sys_open(FILE1, O_WRONLY|O_CREAT|O_TRUNC, 0666);
rump_sys_unlink(FILE1);
rump_sys_close(fd); }
return NULL; }
static void *func2(void *arg) {
while (quit != 1) {
rump_sys_rename(FILE1, FILE2); }
return NULL; }
int main() {
int res;
pthread_t th1, th2;
struct ufs_args args;
(void)unlink(IMAGE);
res = system("newfs -F -s 10000 " IMAGE " >/dev/null");
if (res == -1)
err(1, "system failed");
res = rump_init();
if (res != 0)
err(1, "rump_init failed");
res = rump_etfs_register(DEVICE, IMAGE, RUMP_ETFS_BLK);
if (res != 0)
err(1, "rump_etfs_register failed");
res = rump_sys_mkdir(MNTDIR, 0777);
if (res != 0)
err(1, "rump_sys_mkdir failed");
memset(&args, 0x0, sizeof(args));
args.fspec = DEVICE;
res = rump_sys_mount(MOUNT_FFS, MNTDIR, MNT_LOG, &args, sizeof(args));
if (res == -1)
err(1, "rump_sys_mount failed");
res = pthread_create(&th1, NULL, func1, NULL);
if (res != 0)
errx(1, "pthread_create failed");
res = pthread_create(&th2, NULL, func2, NULL);
if (res != 0)
errx(1, "pthread_create failed");
sleep(10);
quit = 1;
res = pthread_join(th2, NULL);
if (res != 0)
errx(1, "pthread_join failed");
res = pthread_join(th1, NULL);
if (res != 0)
errx(1, "pthread_join failed");
res = rump_sys_unmount(MNTDIR, 0);
if (res == -1)
err(1, "rump_sys_unmount failed");
res = rump_etfs_remove(DEVICE);
if (res == -1)
err(1, "rump_etfs_remove failed");
res = unlink(IMAGE);
if (res == -1)
err(1, "unlink failed");
return 0; }
>How-To-Repeat:
Compile the provided testcase, run it multiple times and see it fail.
>Fix:
>Release-Note:
>Audit-Trail:
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@NetBSD.org
Cc: lib-bug-people@NetBSD.org, gnats-admin@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: lib/42093: rump rename race testcase crashes
Date: Sat, 19 Sep 2009 23:17:36 +0200
On Sat, Sep 19, 2009 at 09:20:01AM +0000, njoly@pasteur.fr wrote:
> System: NetBSD lanfeust.sis.pasteur.fr 5.99.18 NetBSD 5.99.18 (LANFEUST) #0: Sat Sep 19 01:00:25 CEST 2009 njoly@lanfeust.sis.pasteur.fr:/local/src/NetBSD/obj.amd64/sys/arch/amd64/compile/LANFEUST amd64
> Architecture: x86_64
> Machine: amd64
> >Description:
> Noticed that ffs/t_renamerace ATF testcase sometimes crash. Trying to debug it,
> i made the following rump small code which do crash too.
>
> The crashes do only appear when logging is enabled (MNT_LOG), and are not so uncommom.
> On 10 runs; i get 6 success, 3 crashes with `Resource deadlock avoided' and 1
> with `segmentation fault'.
Could it be related to kern/41417, which have been investigated to be
a deadlock in WAPBL's copy of rename ?
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Antti Kantee <pooka@cs.hut.fi>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/42093
Date: Sun, 20 Sep 2009 11:58:50 +0300
It might or might not be related to kern/41417. I think there are
actually two bugs here per the two different failure modes reported:
a) an error branch locking error, where a vnode is not unlocked
properly
b) a reference couting bug which causes use-after-free
(the test case vnodes are recycled immediately due to them
having a 0 link count)
There might one or more cases of "a", and 41417 might be one of them.
[i changed the PR from lib to kern]
From: "David A. Holland" <dholland@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/42093 CVS commit: src/sys/ufs/ufs
Date: Sun, 17 Jul 2011 22:07:59 +0000
Module Name: src
Committed By: dholland
Date: Sun Jul 17 22:07:59 UTC 2011
Modified Files:
src/sys/ufs/ufs: ufs_extern.h ufs_lookup.c ufs_wapbl.c
Log Message:
Provide correct locking for ufs_wapbl_rename. Note that this does not
fix the non-wapbl rename; that will be coming soon. This patch also
leaves a lot of the older locking-related code around in #if 0 blocks,
and there's a lot of leftover redundant logic. All that will be going
away later.
Relates to at least these PRs:
PR kern/24887
PR kern/41417
PR kern/42093
PR kern/43626
and possibly others.
To generate a diff of this commit:
cvs rdiff -u -r1.65 -r1.66 src/sys/ufs/ufs/ufs_extern.h
cvs rdiff -u -r1.110 -r1.111 src/sys/ufs/ufs/ufs_lookup.c
cvs rdiff -u -r1.16 -r1.17 src/sys/ufs/ufs/ufs_wapbl.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->feedback
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 18 Jul 2011 06:48:05 +0000
State-Changed-Why:
fixed, please test
From: Nicolas Joly <njoly@pasteur.fr>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/42093 (rump rename race testcase crashes)
Date: Mon, 18 Jul 2011 17:19:02 +0200
On Mon, Jul 18, 2011 at 06:48:09AM +0000, dholland@NetBSD.org wrote:
> Synopsis: rump rename race testcase crashes
>
> State-Changed-From-To: open->feedback
> State-Changed-By: dholland@NetBSD.org
> State-Changed-When: Mon, 18 Jul 2011 06:48:05 +0000
> State-Changed-Why:
> fixed, please test
I tested it again, and cannot make it fail anymore (with or without
wapbl). Works fine for more than 100s, where it died previously within
a few seconds.
Great job ! Thanks.
--
Nicolas Joly
Projects and Developments in Bioinformatics
Institut Pasteur, Paris.
State-Changed-From-To: feedback->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Mon, 18 Jul 2011 15:56:16 +0000
State-Changed-Why:
confirmed, thanks :-)
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.