NetBSD Problem Report #53884
From gson@gson.org Wed Jan 16 11:39:42 2019
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id D56607A1C1
for <gnats-bugs@gnats.NetBSD.org>; Wed, 16 Jan 2019 11:39:42 +0000 (UTC)
Message-Id: <20190116113936.A98439892D7@guava.gson.org>
Date: Wed, 16 Jan 2019 13:39:36 +0200 (EET)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: fs/vfs/t_rmdirrace:lfs_race test case fails randomly on real hardware
X-Send-Pr-Version: 3.95
>Number: 53884
>Category: kern
>Synopsis: fs/vfs/t_rmdirrace:lfs_race test case fails randomly on real hardware
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: dholland
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Jan 16 11:40:00 +0000 2019
>Closed-Date: Fri Jan 25 09:19:25 +0000 2019
>Last-Modified: Fri Jan 25 09:19:25 +0000 2019
>Originator: Andreas Gustafsson
>Release: NetBSD-current, source date >= 2015.10.13.20.49.39
>Organization:
>Environment:
System: NetBSD
Architecture: x86_64
Machine: amd64
>Description:
When the ATF tests are run on real amd64 hardware (an 8-core HP DL360
G7), the lfs_race test case of the fs/vfs/t_rmdirrace test program
fails randomly, in about half of the runs, with
panic: kernel diagnostic assertion "fs->lfs_cleaner_thread == curlwp" failed
Log output from the latest failure:
http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2019/2019.01.15.14.23.56/test.html#fs_vfs_t_rmdirrace_lfs_race
The first recorded failure was on 2015.10.13.20.49.39:
http://www.gson.org/netbsd/bugs/build/amd64-baremetal/2015/2015.10.13.20.49.39/test.html#fs_vfs_t_rmdirrace_lfs_race
The test case is not failing on the qemu-based TNF testbed, which
emulates a uniprocessor system.
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->dholland
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Fri, 18 Jan 2019 10:12:42 +0000
Responsible-Changed-Why:
lfs
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53884: fs/vfs/t_rmdirrace:lfs_race test case fails randomly
on real hardware
Date: Fri, 18 Jan 2019 10:23:36 +0000
On Wed, Jan 16, 2019 at 11:40:00AM +0000, Andreas Gustafsson wrote:
> panic: kernel diagnostic assertion "fs->lfs_cleaner_thread == curlwp" failed
The only way this should be possible is to have two cleaners running
at once, which... I wouldn't put it past the rump test framework, I
guess, but it doesn't seem that likely, and it also shouldn't vary with
the number of CPUs on the system. Or so one would think.
Is it possible to check for extra lfs_cleanerd processes the next time
this happens?
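(For instance, something like

    ps -glaxw | grep '[l]fs_cleanerd'

should do; the bracketed first letter keeps grep from matching itself.)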
--
David A. Holland
dholland@netbsd.org
From: Andreas Gustafsson <gson@gson.org>
To: David Holland <dholland-bugs@netbsd.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/53884: fs/vfs/t_rmdirrace:lfs_race test case fails randomly
on real hardware
Date: Fri, 18 Jan 2019 12:58:51 +0200
David,
You wrote:
> > panic: kernel diagnostic assertion "fs->lfs_cleaner_thread == curlwp" failed
>
> The only way this should be possible is to have two cleaners running
> at once, which... I wouldn't put it past the rump test framework, I
> guess, but it doesn't seem that likely, and it also shouldn't vary with
> the number of CPUs on the system. Or so one would think.
>
> Is it possible to check for extra lfs_cleanerd processes the next time
> this happens?
There are no lfs_cleanerd processes running after atf-run has exited,
if that's what you mean. The "anita test" command runs "ps -glaxw"
after the tests, prefixing each line with "ps-post-test", and there
were no unexpected processes shown in the console log from the last
run where the lfs_race test case failed:
ps-post-test UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
ps-post-test 0 0 0 0 0 0 0 126868 - OKl ? 0:33.26 [system]
ps-post-test 0 1 0 76 85 0 12104 1316 wait Ss ? 0:00.09 init
ps-post-test 0 167 1 0 85 0 24188 2136 kqueue Ss ? 0:00.11 /usr/sbin/syslogd -s
ps-post-test 0 306 1 0 85 0 12124 1136 kqueue Is ? 0:00.00 /usr/sbin/powerd
ps-post-test 0 586 1 0 85 0 51068 2492 kqueue Is ? 0:00.02 /usr/libexec/postfix/master -w
ps-post-test 12 588 586 0 85 0 51584 3780 kqueue I ? 0:00.02 pickup -l -t unix -u
ps-post-test 12 595 586 0 85 0 49828 3808 kqueue I ? 0:00.01 qmgr -l -t unix -u
ps-post-test 0 596 1 0 85 0 14232 1200 kqueue Is ? 0:00.00 /usr/sbin/inetd -l
ps-post-test 0 607 1 0 85 0 12132 1376 nanoslp Is ? 0:00.01 /usr/sbin/cron
ps-post-test 0 77 536 2728 85 0 7876 1180 pipe_rd S+ tty00 0:00.00 sed s/^/ps-post-test /
ps-post-test 0 536 633 2728 85 0 12380 1732 wait S tty00 0:00.03 /bin/sh
ps-post-test 0 633 1 60 85 0 66104 4164 wait Is tty00 0:00.02 login
ps-post-test 0 23032 536 2728 43 0 12188 1244 - O+ tty00 0:00.00 ps -glaxw
ps-post-test 0 465 1 60 85 0 12128 1296 ttyraw Is+ ttyE1 0:00.00 /usr/libexec/getty Pc ttyE1
ps-post-test 0 634 1 60 85 0 12260 1296 ttyraw Is+ ttyE2 0:00.00 /usr/libexec/getty Pc ttyE2
ps-post-test 0 527 1 60 85 0 12128 1292 ttyraw Is+ ttyE3 0:00.00 /usr/libexec/getty Pc ttyE3
If you would like to modify the kernel or test to provide more
debugging information and don't want to commit the changes, you can
send me a patch.
This may be unrelated, but I see that on the qemu-based TNF testbed
(where the test passes), lfs_cleanerd sometimes logs the error message
"couldn't open device rrumpfs for reading":
babylon5.netbsd.org$ cd /bracket/amd64/results/
babylon5.netbsd.org$ zgrep 'lfs_race:' 2019/*/test.log.gz
2019/2019.01.01.10.09.26/test.log.gz: lfs_race: [25.983353s] Passed.
2019/2019.01.01.14.01.46/test.log.gz: lfs_race: [25.433668s] Passed.
2019/2019.01.02.03.04.26/test.log.gz: lfs_race: [25.344210s] Passed.
2019/2019.01.02.09.04.09/test.log.gz: lfs_race: [25.431315s] Passed.
2019/2019.01.02.22.58.44/test.log.gz: lfs_race: Jan 3 22:27:14 lfs_cleanerd[27084]: couldn't open device rrumpfs for reading
2019/2019.01.03.15.33.06/test.log.gz: lfs_race: [25.992201s] Passed.
2019/2019.01.04.10.25.39/test.log.gz: lfs_race: [24.285902s] Passed.
2019/2019.01.05.05.40.00/test.log.gz: lfs_race: [25.598234s] Passed.
2019/2019.01.05.10.51.06/test.log.gz: lfs_race: Jan 6 05:42:53 lfs_cleanerd[27176]: couldn't open device rrumpfs for reading
2019/2019.01.06.11.20.53/test.log.gz: lfs_race: [26.268811s] Passed.
2019/2019.01.06.18.56.52/test.log.gz: lfs_race: Jan 9 14:56:42 lfs_cleanerd[26388]: couldn't open device rrumpfs for reading
2019/2019.01.07.05.01.10/test.log.gz: lfs_race: [25.582419s] Passed.
2019/2019.01.08.06.29.35/test.log.gz: lfs_race: [23.349863s] Passed.
2019/2019.01.09.04.02.26/test.log.gz: lfs_race: [23.915846s] Passed.
2019/2019.01.10.19.00.17/test.log.gz: lfs_race: [24.829028s] Passed.
2019/2019.01.11.02.44.49/test.log.gz: lfs_race: [25.661166s] Passed.
2019/2019.01.11.08.30.19/test.log.gz: lfs_race: [23.367517s] Passed.
2019/2019.01.11.15.43.51/test.log.gz: lfs_race: [25.536107s] Passed.
2019/2019.01.11.23.10.41/test.log.gz: lfs_race: [25.789131s] Passed.
2019/2019.01.12.17.25.09/test.log.gz: lfs_race: [25.472970s] Passed.
2019/2019.01.13.06.59.15/test.log.gz: lfs_race: [26.537625s] Passed.
2019/2019.01.13.10.01.07/test.log.gz: lfs_race: [25.320557s] Passed.
2019/2019.01.13.10.43.22/test.log.gz: lfs_race: Jan 15 05:31:26 lfs_cleanerd[5727]: couldn't open device rrumpfs for reading
2019/2019.01.13.16.48.51/test.log.gz: lfs_race: Jan 16 04:26:48 lfs_cleanerd[20726]: couldn't open device rrumpfs for reading
2019/2019.01.14.03.30.25/test.log.gz: lfs_race: [24.650734s] Passed.
2019/2019.01.14.21.29.56/test.log.gz: lfs_race: [23.333059s] Passed.
2019/2019.01.15.14.23.56/test.log.gz: lfs_race: [25.056744s] Passed.
2019/2019.01.16.08.32.24/test.log.gz: lfs_race: [26.300848s] Passed.
2019/2019.01.16.13.54.17/test.log.gz: lfs_race: [25.688629s] Passed.
--
Andreas Gustafsson, gson@gson.org
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53884: fs/vfs/t_rmdirrace:lfs_race test case fails randomly
on real hardware
Date: Fri, 18 Jan 2019 12:12:33 +0100
On Fri, Jan 18, 2019 at 11:00:02AM +0000, Andreas Gustafsson wrote:
> 2019/2019.01.06.18.56.52/test.log.gz: lfs_race: Jan 9 14:56:42 lfs_cleanerd[26388]: couldn't open device rrumpfs for reading
I see those as well on the console on all my test runs.
But I am confused: are we talking about a cleaner process inside rump,
and would the host ps show that?
Martin
From: Andreas Gustafsson <gson@gson.org>
To: Martin Husemann <martin@duskware.de>, dholland@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/53884: fs/vfs/t_rmdirrace:lfs_race test case fails randomly
on real hardware
Date: Fri, 18 Jan 2019 18:41:33 +0200
Martin wrote:
> But I am confused: are we talking about a cleaner process inside rump,
> and would the host ps show that?
Good question. I had a look at the code, and apparently the test is running
entirely within the rump kernel process and creating a cleaner thread rather
than a cleaner process, so it would not show up in the host ps.
See lfs_fstest_mount() and cleaner() in src/tests/fs/common/fstest_lfs.c.
I also managed to reproduce the failure on 8.0 by manually running just the
t_rmdirrace test program rather than the full ATF test suite, as in
cd /usr/tests/fs/vfs
atf-run t_rmdirrace | tee /tmp/log.tps | atf-report
It failed on the 3rd or 4th run.
I also tried
cd /usr/tests/fs/vfs
chown -R gson .
./t_rmdirrace lfs_race
and it failed the first time, leaving a t_rmdirrace.core that I can
examine using gdb, with symbols even. What should I look for?
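(I'm examining the core with something along the lines of

    gdb ./t_rmdirrace t_rmdirrace.core
    (gdb) bt
    (gdb) frame <n>
    (gdb) print <expression>

so pointers to specific frames or expressions to print are welcome.)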
--
Andreas Gustafsson, gson@gson.org
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53884: fs/vfs/t_rmdirrace:lfs_race test case fails randomly
on real hardware
Date: Fri, 18 Jan 2019 17:48:49 +0000
On Fri, Jan 18, 2019 at 04:45:01PM +0000, Andreas Gustafsson wrote:
> and it failed the first time, leaving a t_rmdirrace.core that I can
> examine using gdb, with symbols even. What should I look for?
At one place deep inside, the code wants to know if it's running in
the cleaner; so in lfs_bmapv, which is one of the cleaner syscalls that
nothing else should get to, it does the assign-and-assert that fails.
So the first thing to check is: is fs->lfs_cleaner_thread some other
lwp (if so, what, and what's that one doing, and how did the current
one that asserted get to lfs_bmapv?) or is it trash?
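(For concreteness, the check amounts to the following; this is a
paraphrase of the pattern, not the literal lfs_syscalls.c code:

    if (fs->lfs_cleaner_thread == NULL)
            fs->lfs_cleaner_thread = curlwp;   /* first cleaner call records its lwp */
    KASSERT(fs->lfs_cleaner_thread == curlwp); /* later calls must be the same lwp */
)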
If it's trash, it would be handy to know whether it starts out as
trash (the structure it's in is kmem_zalloc'd, but maybe that's borked
in rump) or gets overwritten later, or any other info that can be
discovered...
--
David A. Holland
dholland@netbsd.org
From: Andreas Gustafsson <gson@gson.org>
To: David Holland <dholland-bugs@netbsd.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/53884: fs/vfs/t_rmdirrace:lfs_race test case fails randomly
on real hardware
Date: Fri, 18 Jan 2019 20:01:57 +0200
David Holland wrote:
> So the first thing to check is: is fs->lfs_cleaner_thread some other
> lwp (if so, what, and what's that one doing, and how did the current
> one that asserted get to lfs_bmapv?) or is it trash?
(gdb) print fs->lfs_cleaner_thread
$7 = (struct lwp *) 0x78282b5b0000
(gdb) print *fs->lfs_cleaner_thread
$8 = {l_runq = {tqe_next = 0x0, tqe_prev = 0x0}, l_sched = {info = 0x0, timeslice = 0},
l_cpu = 0x7828394fc9c0 <rump_bootcpu>, l_mutex = 0x7828394fc988 <rumpns_unruntime_lock>,
l_ctxswtch = 0, l_addr = 0x0, l_md = {md_regs = 0x0, md_gc_pmap = 0x0, md_gc_ptp = 0x0,
md_flags = 0, md_astpending = 0}, l_flag = 1048576, l_stat = 2, l_rtime = {sec = 0, frac = 0},
l_stime = {sec = 0, frac = 0}, l_swtime = 0, l_rticks = 0, l_rticksum = 0, l_slpticks = 0,
l_slpticksum = 0, l_biglocks = 0, l_class = 0, l_kpriority = 0, l_kpribase = 0, l_priority = 0,
l_inheritedprio = 0, l_protectprio = 0, l_auxprio = 0, l_protectdepth = 0, l_pi_lenders = {
slh_first = 0x0}, l_ncsw = 5, l_nivcsw = 0, l_cpticks = 0, l_pctcpu = 0, l_estcpu = 0,
l_psid = 0, l_target_cpu = 0x7828394fc9c0 <rump_bootcpu>, l_lwpctl = 0x0, l_lcpage = 0x0,
l_affinity = 0x0, l_ts = 0x0, l_syncobj = 0x0, l_sleepchain = {tqe_next = 0x0, tqe_prev = 0x0},
l_wchan = 0x0, l_wmesg = 0x0, l_sleepq = 0x0, l_sleeperr = 0, l_slptime = 0, l_timeout_ch = {
_c_store = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}}, l_emap_gen = 0, l_waitcv = {
cv_opaque = {0x0, 0x0, 0x0}}, l_list = {le_next = 0x78283c505000, le_prev = 0x78282b5b09b0},
l_ctxlink = 0x0, l_proc = 0x78283d59bb40, l_sibling = {le_next = 0x78283d598000,
le_prev = 0x78282b5b09d0}, l_waiter = 0, l_waitingfor = 0, l_prflag = 0, l_refcnt = 0,
l_lid = 83289, l_name = 0x0, l_selflag = 0, l_selwait = {slh_first = 0x0}, l_selret = 0,
l_selrec = 0, l_selcluster = 0x0, l_selbits = 0x0, l_selni = 0, l_sigrestore = 0,
l_sigwaitset = {__bits = {0, 0, 0, 0}}, l_sigcv = {cv_opaque = {0x0, 0x0, 0x0}},
l_sigwaited = 0x0, l_sigpendset = 0x0, l_sigwaiter = {le_next = 0x0, le_prev = 0x0}, l_sigstk = {
ss_sp = 0x0, ss_size = 0, ss_flags = 0}, l_sigmask = {__bits = {0, 0, 0, 0}}, l_sigpend = {
sp_info = {tqh_first = 0x0, tqh_last = 0x0}, sp_set = {__bits = {0, 0, 0, 0}}},
l_sigoldmask = {__bits = {0, 0, 0, 0}}, l_specdataref = {specdataref_container = 0x0,
specdataref_lock = {u = {mtxa_owner = 132113907130496}}}, l_ktrcsw = {tv_sec = 0,
tv_nsec = 0}, l_private = 0x0, l_switchto = 0x0, l_cred = 0x78283e948f00,
l_fd = 0x78283d59d9c0, l_emuldata = 0x0, l_cv_signalled = 0, l_shlocks = 0, l_exlocks = 0,
l_unused = 0, l_blcnt = 0, l_nopreempt = 0, l_dopreempt = 0, l_pflag = 0, l_dupfd = 0,
l_sysent = 0x0, l_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0,
tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0,
ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0,
ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0}, l_pfailtime = 0, l_pfailaddr = 0,
l_pfaillock = 0, l_ld_locks = {tqh_first = 0x0, tqh_last = 0x78282b5b03f8}, l_tcgen = 0,
l_syscall_time = 0, l_syscall_counter = 0x0, l_dtrace = 0x0}
I have so far been unable to figure out how to print the value that
curlwp refers to, because curlwp is a macro that expands into a
function call that disappears into a maze of #ifdefs.
--
Andreas Gustafsson, gson@gson.org
From: Andreas Gustafsson <gson@gson.org>
To: David Holland <dholland-bugs@netbsd.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/53884: fs/vfs/t_rmdirrace:lfs_race test case fails randomly
on real hardware
Date: Fri, 18 Jan 2019 20:19:26 +0200
> and how did the current one that asserted get to lfs_bmapv?
(gdb) where
#0 0x00007828372fe4aa in _lwp_kill () from /usr/lib/libc.so.12
#1 0x00007828372fdd52 in abort () at /usr/src/lib/libc/stdlib/abort.c:74
#2 0x0000782838e079dd in rumpuser_exit (rv=rv@entry=-1) at /usr/src/lib/librumpuser/rumpuser.c:236
#3 0x00007828392a1245 in cpu_reboot (howto=<optimized out>, bootstr=<optimized out>)
at /usr/src/lib/librump/../../sys/rump/librump/rumpkern/emul.c:385
#4 0x0000782839264b00 in vpanic (fmt=0x78283ce445e8 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ", ap=0x78282caef5c8)
at /usr/src/lib/librump/../../sys/rump/../kern/subr_prf.c:342
#5 0x000078283924ec4d in kern_assert (fmt=<optimized out>) at /usr/src/lib/librump/../../sys/rump/../lib/libkern/kern_assert.c:51
#6 0x000078283ce33d23 in lfs_bmapv (l=<optimized out>, fsidp=<optimized out>, blkiov=0x78283e538148, blkcnt=6)
at /usr/src/sys/rump/fs/lib/liblfs/../../../../ufs/lfs/lfs_syscalls.c:662
#7 0x000078283ce21106 in lfs_fcntl (v=0x78282caef750) at /usr/src/sys/rump/fs/lib/liblfs/../../../../ufs/lfs/lfs_vnops.c:2000
#8 0x00007828392582aa in VOP_FCNTL (vp=0x78283d5a6d28, command=<optimized out>, data=<optimized out>, fflag=<optimized out>,
cred=<optimized out>) at /usr/src/lib/librump/../../sys/rump/../kern/vnode_if.c:648
#9 0x0000782839248ead in fcntl_forfs (fd=<optimized out>, arg=<optimized out>, cmd=-1341076464, fp=0x78283d59ed80)
at /usr/src/lib/librump/../../sys/rump/../kern/sys_descrip.c:221
#10 sys_fcntl (l=<optimized out>, uap=<optimized out>, retval=<optimized out>)
at /usr/src/lib/librump/../../sys/rump/../kern/sys_descrip.c:376
#11 0x00007828392a1f2f in sy_call (rval=0x78282caef8f0, uap=0x78282caef900, l=0x78282b5bb800, sy=<optimized out>)
at /usr/src/lib/librump/../../sys/rump/../sys/syscallvar.h:65
#12 sy_invoke (code=92, rval=0x78282caef8f0, uap=0x78282caef900, l=0x78282b5bb800, sy=<optimized out>)
at /usr/src/lib/librump/../../sys/rump/../sys/syscallvar.h:94
#13 rump_syscall (num=num@entry=92, data=data@entry=0x78282caef900, dlen=dlen@entry=24, retval=retval@entry=0x78282caef8f0)
at /usr/src/lib/librump/../../sys/rump/librump/rumpkern/rump.c:762
#14 0x0000782839296e81 in rump___sysimpl_fcntl (fd=<optimized out>, cmd=<optimized out>, arg=<optimized out>)
at /usr/src/lib/librump/../../sys/rump/librump/rumpkern/rump_syscalls.c:1292
#15 0x000000007720d6af in toss_old_blocks () at /usr/src/tests/fs/vfs/t_rmdirrace.c:106
#16 0x000000007720e73b in lfs_cleaner_main () at /usr/src/tests/fs/vfs/t_rmdirrace.c:106
#17 0x000000007720a001 in ?? () at /usr/src/tests/fs/vfs/t_rmdirrace.c:106
#18 0x0000782838a0b8b5 in pthread__create_tramp (cookie=0x78283b9d4000) at /usr/src/lib/libpthread/pthread.c:593
#19 0x0000782837281b60 in ?? () from /usr/lib/libc.so.12
Backtrace stopped: Cannot access memory at address 0x78282caf0000
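(For comparison, selecting one of the sy_call/sy_invoke frames and
printing its l argument, e.g.

    (gdb) frame 12
    (gdb) print l

should show the curlwp of the asserting syscall; in the backtrace above
it is 0x78282b5bb800, which differs from the fs->lfs_cleaner_thread
value of 0x78282b5b0000 printed earlier.)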
--
Andreas Gustafsson, gson@gson.org
From: Andreas Gustafsson <gson@gson.org>
To: dholland@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/53884: fs/vfs/t_rmdirrace:lfs_race test case fails randomly on real hardware
Date: Sun, 20 Jan 2019 14:21:19 +0200
I can make the test fail reliably by increasing the sleep(10) in
src/tests/fs/vfs/t_rmdirrace.c to sleep(100).
My current theory is that the test is falling victim to the
following rump behavior documented in rump_lwproc(3):
An implicit thread context is created every time the rump kernel
is entered and disbanded upon exit.
and this comment in rump_schedule():
/*
* If there is no dedicated lwp, allocate a temp one and
* set it to be free'd upon unschedule(). Use lwp0 context
* for reserving the necessary resources. Don't optimize
* for this case -- anyone who cares about performance will
* start a real thread.
*/
That is, every call to lfs_bmapv() from the cleaner thread gets a new
rump kernel lwp object (despite being made from the same host pthread),
so the second call will always fail the assertion that the lwp is the
same as in the first call.
I'm now testing a patch that adds a call to rump_pub_lwproc_newlwp()
from the cleaner thread, like those that already exist in the worker
threads of the tmpfs test in tests/fs/tmpfs/t_renamerace.c.
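The shape of the change, sketched from the rump_lwproc(3) API (the pid
argument, the cleaner's arguments, and error handling are simplified
here; the actual patch to fstest_lfs.c is the authoritative version):

    static void *
    cleaner(void *arg)
    {
            /* Bind this host pthread to one persistent rump kernel
             * lwp, so that every rump syscall the cleaner makes runs
             * as the same curlwp instead of getting a fresh implicit
             * one per call. */
            rump_pub_lwproc_newlwp(pid);

            lfs_cleaner_main(argc, argv);      /* clean until unmount */

            /* Release the lwp when the cleaner is done. */
            rump_pub_lwproc_releaselwp();
            return NULL;
    }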
--
Andreas Gustafsson, gson@gson.org
From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/53884 CVS commit: src/tests/fs/common
Date: Sun, 20 Jan 2019 14:50:58 +0000
Module Name: src
Committed By: gson
Date: Sun Jan 20 14:50:58 UTC 2019
Modified Files:
src/tests/fs/common: fstest_lfs.c
Log Message:
Call rump_pub_lwproc_newlwp() in the lfs cleaner thread to make its
rump system calls all use the same lwp. Hopefully this will fix
PR kern/53884. Also call rump_pub_lwproc_releaselwp() at the end.
To generate a diff of this commit:
cvs rdiff -u -r1.5 -r1.6 src/tests/fs/common/fstest_lfs.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53884: fs/vfs/t_rmdirrace:lfs_race test case fails randomly
on real hardware
Date: Sun, 20 Jan 2019 19:15:30 +0000
On Sun, Jan 20, 2019 at 12:25:01PM +0000, Andreas Gustafsson wrote:
> My current theory is that the test is falling victim to the
> following rump behavior documented in rump_lwproc(3):
>
> An implicit thread context is created every time the rump kernel
> is entered and disbanded upon exit.
That's very likely.
--
David A. Holland
dholland@netbsd.org
State-Changed-From-To: open->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Fri, 25 Jan 2019 09:19:25 +0000
State-Changed-Why:
No failures in the last 18 runs. Thanks to dholland for the analysis.
>Unformatted: