NetBSD Problem Report #53202
From gson@gson.org Sun Apr 22 17:10:20 2018
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 3E1917A1CF
for <gnats-bugs@gnats.NetBSD.org>; Sun, 22 Apr 2018 17:10:20 +0000 (UTC)
Message-Id: <20180422171013.5A09C989378@guava.gson.org>
Date: Sun, 22 Apr 2018 20:10:13 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: Kernel hangs running t_ptrace_wait:resume1 test
X-Send-Pr-Version: 3.95
>Number: 53202
>Category: kern
>Synopsis: Kernel hangs running t_ptrace_wait:resume1 test
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Apr 22 17:15:00 +0000 2018
>Closed-Date: Mon May 14 19:19:39 +0000 2018
>Last-Modified: Mon May 14 19:19:39 +0000 2018
>Originator: Andreas Gustafsson
>Release: NetBSD-current, source date >= 2017.12.02.22.51.22
>Organization:
>Environment:
System: NetBSD
Architecture: i386
Machine: i386
>Description:
Since the commit of kern_lwp.c 1.191, executing the resume1 test case
of the t_ptrace_wait test can cause the operating system to hang. The
problem was first reported in
http://mail-index.netbsd.org/current-users/2017/12/04/msg032841.html
but at that time it was not yet clear whether just the test framework
was hanging, or the kernel; it has now become clear that it is in fact
the kernel.
The problem is also mentioned in passing in comments to PR 51995, but
since the subject of that PR is "ptrace(2) PT_RESUME is not reliable"
rather than the more serious issue of the kernel hanging, I'm filing
this separate PR about the latter issue.
The problem occurs on multiple architectures under both qemu (i386,
amd64, sparc) and gxemul (hpcmips). I have also reproduced it on
physical amd64 hardware after disabling all but one CPU using cpuctl.
I currently have a hung i386 system attached to a remote kernel
debugger under qemu using the procedure documented at
https://wiki.netbsd.org/kernel_debugging_with_qemu/
The backtrace varies as the kernel appears to be in a loop
involving multiple functions, but here's a typical one:
(gdb) bt
#0 sleepq_remove (sq=0xc20d8b98, l=0xc20e3aa0)
at /usr/src/sys/kern/kern_sleepq.c:137
#1 0xc0bd5e0d in sleepq_unsleep (l=0xc20e3aa0, cleanup=true)
at /usr/src/sys/kern/kern_sleepq.c:347
#2 0xc0b97b90 in cv_unsleep (l=0xc20e3aa0, cleanup=true)
at /usr/src/sys/kern/kern_condvar.c:227
#3 0xc0bb37bb in lwp_unsleep (l=0xc20e3aa0, cleanup=true)
at /usr/src/sys/kern/kern_lwp.c:1526
#4 0xc0bd5ad6 in sleepq_block (timo=0, catch_p=true)
at /usr/src/sys/kern/kern_sleepq.c:259
#5 0xc0b97c9d in cv_wait_sig (cv=0xc20d8b98, mtx=0xc2674800)
at /usr/src/sys/kern/kern_condvar.c:272
#6 0xc0bb18a6 in lwp_wait (l=0xc20e3aa0, lid=0, departed=0x0, exiting=true)
at /usr/src/sys/kern/kern_lwp.c:648
#7 0xc0ba810b in exit_lwps (l=0xc20e3aa0) at /usr/src/sys/kern/kern_exit.c:636
#8 0xc0ba7478 in exit1 (l=0xc20e3aa0, exitcode=0, signo=1)
at /usr/src/sys/kern/kern_exit.c:223
#9 0xc0bd4d70 in sigexit (l=0xc20e3aa0, signo=1)
at /usr/src/sys/kern/kern_sig.c:2106
#10 0xc0bd4554 in postsig (signo=1) at /usr/src/sys/kern/kern_sig.c:1904
#11 0xc0bb3880 in lwp_userret (l=0xc20e3aa0)
at /usr/src/sys/kern/kern_lwp.c:1562
#12 0xc0169046 in mi_userret (l=0xc20e3aa0) at /usr/src/sys/sys/userret.h:94
#13 0xc01690c1 in userret (l=0xc20e3aa0) at ./machine/userret.h:80
#14 0xc01692af in syscall (frame=0xc9398fa8)
at /usr/src/sys/arch/x86/x86/syscall.c:168
#15 0xc01006a9 in Xsyscall ()
(gdb)
If I place a breakpoint on line 637 of kern_exit.c, it gets hit
repeatedly, but a breakpoint on line 641 is never hit:
(gdb) l
632 * behind us or there may even be new LWPs created. Therefore, a
633 * full retry is required on error.
634 */
635 while (p->p_nlwps > 1) {
636 if (lwp_wait(l, 0, NULL, true)) {
637 goto retry;
638 }
639 }
640
641 KERNEL_LOCK(nlocks, l);
(gdb)
The lwp_wait() call is returning -3 = ERESTART, which originates in
sleepq_sigtoerror():
#0 sleepq_sigtoerror (l=0xc20e3aa0, sig=9)
at /usr/src/sys/kern/kern_sleepq.c:400
#1 0xc0bd5bd5 in sleepq_block (timo=0, catch_p=true)
at /usr/src/sys/kern/kern_sleepq.c:293
#2 0xc0b97c9d in cv_wait_sig (cv=0xc20d8b98, mtx=0xc2674800)
at /usr/src/sys/kern/kern_condvar.c:272
#3 0xc0bb18a6 in lwp_wait (l=0xc20e3aa0, lid=0, departed=0x0, exiting=true)
at /usr/src/sys/kern/kern_lwp.c:648
#4 0xc0ba810b in exit_lwps (l=0xc20e3aa0) at /usr/src/sys/kern/kern_exit.c:636
#5 0xc0ba7478 in exit1 (l=0xc20e3aa0, exitcode=0, signo=1)
at /usr/src/sys/kern/kern_exit.c:223
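The mapping performed in sleepq_sigtoerror() can be sketched in
userland C as follows. This is only a minimal model, not the kernel
source: the names model_sigtoerror and MODEL_ERESTART are hypothetical,
the value -3 is taken from the "-3 = ERESTART" observation above, and
the EINTR branch is an assumption based on the usual SA_RESTART
semantics (only the ERESTART branch is confirmed by the traces in this
PR).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>

/*
 * Assumption: the kernel-internal ERESTART value is -3, as observed in
 * this PR.  A separate name is used so that it cannot clash with any
 * ERESTART the host's <errno.h> may define.
 */
#define MODEL_ERESTART (-3)

/*
 * Hypothetical model: given the sa_flags of the action installed for
 * the interrupting signal, return the error an interrupted sleep
 * would report.
 */
int
model_sigtoerror(int sa_flags)
{
	if ((sa_flags & SA_RESTART) == 0)
		return EINTR;		/* assumed: no restart requested */
	return MODEL_ERESTART;		/* the branch seen in this PR */
}
```

In this model, model_sigtoerror(SA_RESTART) yields MODEL_ERESTART,
which corresponds to the nonzero lwp_wait() return value that sends
exit_lwps() back to "retry".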
If there are other gdb commands I can run to help debug this, let me
know.
See also PR 52892, "Tests hang on MIPS", for another problem that
appeared with the same commit.
>How-To-Repeat:
Because the test case that triggers the bug has been disabled, and
another change has been committed that also keeps the test case from
triggering the bug, you need to apply the following two patches to
reproduce the issue in -current as of source date 2018.04.19.21.21.44:
--- src/tests/lib/libc/sys/t_ptrace_wait.c.orig 2018-04-15 03:19:23.000000000 +0300
+++ src/tests/lib/libc/sys/t_ptrace_wait.c 2018-04-17 10:26:17.000000000 +0300
@@ -6444,7 +6444,6 @@
atf_tc_expect_fail("PR kern/51995");
// Hangs with qemu
- ATF_REQUIRE(0 && "In order to get reliable failure, abort");
SYSCALL_REQUIRE(msg_open(&fds) == 0);
Index: src/tests/lib/libc/sys/msg.h
diff -c src/tests/lib/libc/sys/msg.h:1.2 src/tests/lib/libc/sys/msg.h:1.1
*** src/tests/lib/libc/sys/msg.h:1.2 Tue Mar 13 16:45:36 2018
--- src/tests/lib/libc/sys/msg.h Mon Apr 3 00:44:00 2017
***************
*** 70,76 ****
CLOSEFD(fds->cfd[1]);
CLOSEFD(fds->pfd[0]);
! // printf("Send %s\n", info);
rv = write(fds->pfd[1], msg, len);
if (rv != (ssize_t)len)
return 1;
--- 70,76 ----
CLOSEFD(fds->cfd[1]);
CLOSEFD(fds->pfd[0]);
! printf("Send %s\n", info);
rv = write(fds->pfd[1], msg, len);
if (rv != (ssize_t)len)
return 1;
***************
*** 88,94 ****
CLOSEFD(fds->pfd[1]);
CLOSEFD(fds->cfd[0]);
! // printf("Send %s\n", info);
rv = write(fds->cfd[1], msg, len);
if (rv != (ssize_t)len)
return 1;
--- 88,94 ----
CLOSEFD(fds->pfd[1]);
CLOSEFD(fds->cfd[0]);
! printf("Send %s\n", info);
rv = write(fds->cfd[1], msg, len);
if (rv != (ssize_t)len)
return 1;
***************
*** 106,112 ****
CLOSEFD(fds->pfd[1]);
CLOSEFD(fds->cfd[0]);
! // printf("Wait %s\n", info);
rv = read(fds->pfd[0], msg, len);
if (rv != (ssize_t)len)
return 1;
--- 106,112 ----
CLOSEFD(fds->pfd[1]);
CLOSEFD(fds->cfd[0]);
! printf("Wait %s\n", info);
rv = read(fds->pfd[0], msg, len);
if (rv != (ssize_t)len)
return 1;
***************
*** 124,130 ****
CLOSEFD(fds->cfd[1]);
CLOSEFD(fds->pfd[0]);
! // printf("Wait %s\n", info);
rv = read(fds->cfd[0], msg, len);
if (rv != (ssize_t)len)
return 1;
--- 124,130 ----
CLOSEFD(fds->cfd[1]);
CLOSEFD(fds->pfd[0]);
! printf("Wait %s\n", info);
rv = read(fds->cfd[0], msg, len);
if (rv != (ssize_t)len)
return 1;
Then build an i386 release, boot it in qemu, and run the commands
# cd /usr/tests/lib/libc/sys/
# atf-run ./t_ptrace_wait4 >log 2>&1 &
Within a few minutes, the system becomes unresponsive. You may still
get a new shell prompt if you hit enter, but attempting to run "ls"
will hang.
>Fix:
Since the problem appeared with kern_lwp.c 1.191, the obvious fix
would be to revert that commit (and its pullup).
>Release-Note:
>Audit-Trail:
From: Andreas Gustafsson <gson@gson.org>
To: christos@NetBSD.org, gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53202: Kernel hangs running t_ptrace_wait:resume1 test
Date: Mon, 23 Apr 2018 18:16:25 +0300
My analysis of this bug is that the kernel ends up in a loop where the
current lwp is waiting for another lwp to exit, but the loop never
yields the CPU by calling mi_switch(), so the other lwp never gets a
chance to run and exit, unless we happen to be running on a
multiprocessor.
Before kern_lwp.c 1.191, this worked because sleepq_block() was called
with catch_p=false, so the "early" flag in sleepq_block() was
false and sleepq_block() called mi_switch() in the "else" clause of
"if (early) ...". Now catch_p=true, "early" is true, and mi_switch()
is never called.
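The analysis above can be illustrated with a toy single-CPU model in
plain C. It is a sketch under stated assumptions, not kernel code:
all toy_* names are hypothetical, the sibling LWP is assumed to need
exactly one time slice to exit, a signal is assumed to stay pending on
the waiting LWP, and the switch_on_early flag is an invented knob for
experimenting with what happens if the "early" path did give up the
CPU.

```c
#include <stdbool.h>

#define MODEL_ERESTART (-3)	/* assumed value, as reported in this PR */

/* Toy state: one exiting LWP waiting for a single sibling to exit. */
struct toy_cpu {
	int nlwps;		/* LWPs left, in the role of p->p_nlwps */
	bool signal_pending;	/* a signal is pending on the waiter */
	unsigned switches;	/* number of times the CPU was handed over */
};

/* Hand the CPU to the sibling, which uses its time slice to exit. */
static void
toy_switch(struct toy_cpu *c)
{
	c->switches++;
	if (c->nlwps > 1)
		c->nlwps--;
}

/*
 * Model of the wait: with catch_p set and a signal pending, the
 * "early" path returns an error without switching, unless the
 * hypothetical switch_on_early knob is set.
 */
static int
toy_wait(struct toy_cpu *c, bool catch_p, bool switch_on_early)
{
	bool early = catch_p && c->signal_pending;

	if (early) {
		if (switch_on_early)
			toy_switch(c);
		return MODEL_ERESTART;
	}
	toy_switch(c);		/* the normal path gives up the CPU */
	return 0;
}

/*
 * Model of the retry loop in exit_lwps().  Returns the number of
 * iterations needed for the sibling to exit, or -1 if the loop is
 * still spinning after `limit` iterations.
 */
int
toy_exit_lwps(bool catch_p, bool switch_on_early, int limit)
{
	struct toy_cpu c = { .nlwps = 2, .signal_pending = true, .switches = 0 };
	int iterations = 0;

	while (c.nlwps > 1) {
		if (iterations++ >= limit)
			return -1;
		(void)toy_wait(&c, catch_p, switch_on_early);
	}
	return iterations;
}
```

In this model, toy_exit_lwps(false, false, 1000) finishes after one
iteration, while toy_exit_lwps(true, false, 1000) exhausts the limit
because the sibling never gets to run. The real kernel has no such
limit, which matches the observed hang on a uniprocessor.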
Below is a gdb transcript from single stepping around the entire loop,
beginning and ending at "goto retry".
Breakpoint 1, exit_lwps (l=0xc1f9b020) at /usr/src/sys/kern/kern_exit.c:637
637 goto retry;
(gdb) n
610 KASSERT(mutex_owned(p->p_lock));
(gdb)
616 LIST_FOREACH(l2, &p->p_lwps, l_sibling) {
(gdb)
617 if (l2 == l)
(gdb)
618 continue;
(gdb)
616 LIST_FOREACH(l2, &p->p_lwps, l_sibling) {
(gdb)
617 if (l2 == l)
(gdb)
619 lwp_lock(l2);
(gdb) n
620 l2->l_flag |= LW_WEXIT;
(gdb) n
621 if ((l2->l_stat == LSSLEEP && (l2->l_flag & LW_SINTR)) ||
(gdb) n
622 l2->l_stat == LSSUSPENDED || l2->l_stat == LSSTOP) {
(gdb) print /x l2->l_stat
$1 = 0x2
(gdb) print /x l2->l_flag
$2 = 0x1100000
(gdb) n
621 if ((l2->l_stat == LSSLEEP && (l2->l_flag & LW_SINTR)) ||
(gdb)
622 l2->l_stat == LSSUSPENDED || l2->l_stat == LSSTOP) {
(gdb)
627 lwp_unlock(l2);
(gdb)
616 LIST_FOREACH(l2, &p->p_lwps, l_sibling) {
(gdb)
635 while (p->p_nlwps > 1) {
(gdb)
636 if (lwp_wait(l, 0, NULL, true)) {
(gdb) s
lwp_wait (l=0xc1f9b020, lid=0, departed=0x0, exiting=true)
at /usr/src/sys/kern/kern_lwp.c:531
531 const lwpid_t curlid = l->l_lid;
(gdb) n
532 proc_t *p = l->l_proc;
(gdb)
536 KASSERT(mutex_owned(p->p_lock));
(gdb)
538 p->p_nlwpwait++;
(gdb) n
539 l->l_waitingfor = lid;
(gdb) print p->p_nlwpwait
$3 = 1
(gdb) n
550 if ((p->p_sflag & PS_WCORE) != 0) {
(gdb) n
560 while ((l2 = p->p_zomblwp) != NULL) {
(gdb)
571 nfound = 0;
(gdb)
572 error = 0;
(gdb)
573 LIST_FOREACH(l2, &p->p_lwps, l_sibling) {
(gdb)
583 if (l2->l_lid == lid && l2->l_waitingfor == curlid) {
(gdb)
587 if (l2 == l)
(gdb)
588 continue;
(gdb)
573 LIST_FOREACH(l2, &p->p_lwps, l_sibling) {
(gdb)
583 if (l2->l_lid == lid && l2->l_waitingfor == curlid) {
(gdb)
587 if (l2 == l)
(gdb)
589 if ((l2->l_prflag & LPR_DETACHED) != 0) {
(gdb)
593 if (lid != 0) {
(gdb)
602 } else if (l2->l_waiter != 0) {
(gdb)
612 nfound++;
(gdb)
615 if (l2->l_stat != LSZOMB)
(gdb)
616 continue;
(gdb)
573 LIST_FOREACH(l2, &p->p_lwps, l_sibling) {
(gdb)
635 if (error != 0)
(gdb)
637 if (nfound == 0) {
(gdb)
646 if (exiting) {
(gdb)
647 KASSERT(p->p_nlwps > 1);
(gdb)
648 error = cv_wait_sig(&p->p_lwpcv, p->p_lock);
(gdb) s
cv_wait_sig (cv=0xc20d6d80, mtx=0xc2604180)
at /usr/src/sys/kern/kern_condvar.c:266
266 lwp_t *l = curlwp;
(gdb) n
269 KASSERT(mutex_owned(mtx));
(gdb)
271 cv_enter(cv, mtx, l);
(gdb)
272 error = sleepq_block(0, true);
(gdb) s
sleepq_block (timo=0, catch_p=true) at /usr/src/sys/kern/kern_sleepq.c:235
235 int error = 0, sig;
(gdb) n
237 lwp_t *l = curlwp;
(gdb)
238 bool early = false;
(gdb)
239 int biglocks = l->l_biglocks;
(gdb)
241 ktrcsw(1, 0);
(gdb)
247 if (catch_p) {
(gdb)
248 l->l_flag |= LW_SINTR;
(gdb)
249 if ((l->l_flag & (LW_CANCELLED|LW_WEXIT|LW_WCORE)) != 0) {
(gdb)
253 } else if ((l->l_flag & LW_PENDSIG) != 0 && sigispending(l, 0))
(gdb)
254 early = true;
(gdb)
257 if (early) {
(gdb)
259 lwp_unsleep(l, true);
(gdb) s
lwp_unsleep (l=0xc1f9b020, cleanup=true) at /usr/src/sys/kern/kern_lwp.c:1525
1525 KASSERT(mutex_owned(l->l_mutex));
(gdb) n
1526 (*l->l_syncobj->sobj_unsleep)(l, cleanup);
(gdb) s
1527 }
(gdb) s
sleepq_block (timo=0, catch_p=true) at /usr/src/sys/kern/kern_sleepq.c:277
277 if (catch_p && error == 0) {
(gdb) print l->l_syncobj->sobj_unsleep
$4 = (void (*)(struct lwp *, _Bool)) 0xc0bdb63f <sched_unsleep>
(gdb) s
278 p = l->l_proc;
(gdb) s
279 if ((l->l_flag & (LW_CANCELLED | LW_WEXIT | LW_WCORE)) != 0)
(gdb) s
281 else if ((l->l_flag & LW_PENDSIG) != 0) {
(gdb) s
289 mutex_enter(p->p_lock);
(gdb) n
290 if (((sig = sigispending(l, 0)) != 0 &&
(gdb) n
291 (sigprop[sig] & SA_STOP) == 0) ||
(gdb) n
290 if (((sig = sigispending(l, 0)) != 0 &&
(gdb) n
293 error = sleepq_sigtoerror(l, sig);
(gdb) s
sleepq_sigtoerror (l=0xc1f9b020, sig=9) at /usr/src/sys/kern/kern_sleepq.c:387
387 struct proc *p = l->l_proc;
(gdb) n
390 KASSERT(mutex_owned(p->p_lock));
(gdb)
395 if ((SIGACTION(p, sig).sa_flags & SA_RESTART) == 0)
(gdb)
398 error = ERESTART;
(gdb)
400 return error;
(gdb)
401 }
(gdb)
sleepq_block (timo=0, catch_p=true) at /usr/src/sys/kern/kern_sleepq.c:294
294 mutex_exit(p->p_lock);
(gdb)
298 ktrcsw(0, 0);
(gdb) n
299 if (__predict_false(biglocks != 0)) {
(gdb) n
302 return error;
(gdb)
303 }
(gdb)
cv_wait_sig (cv=0xc20d6d80, mtx=0xc2604180)
at /usr/src/sys/kern/kern_condvar.c:273
273 return cv_exit(cv, mtx, l, error);
(gdb) n
274 }
(gdb) n
lwp_wait (l=0xc1f9b020, lid=0, departed=0x0, exiting=true)
at /usr/src/sys/kern/kern_lwp.c:649
649 if (error == 0)
(gdb) n
651 break;
(gdb)
685 if (lid != 0) {
(gdb)
694 p->p_nlwpwait--;
(gdb)
695 l->l_waitingfor = 0;
(gdb)
696 cv_broadcast(&p->p_lwpcv);
(gdb) n
698 return error;
(gdb)
699 }
(gdb)
Breakpoint 1, exit_lwps (l=0xc1f9b020) at /usr/src/sys/kern/kern_exit.c:637
637 goto retry;
(gdb)
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/53202 CVS commit: src/sys/kern
Date: Mon, 23 Apr 2018 11:51:00 -0400
Module Name: src
Committed By: christos
Date: Mon Apr 23 15:51:00 UTC 2018
Modified Files:
src/sys/kern: kern_lwp.c
Log Message:
PR/kern/53202: Kernel hangs running t_ptrace_wait:resume1 test, revert
previous.
To generate a diff of this commit:
cvs rdiff -u -r1.191 -r1.192 src/sys/kern/kern_lwp.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53202: Kernel hangs running t_ptrace_wait:resume1 test
Date: Mon, 23 Apr 2018 19:28:41 +0000
Does inserting a call to yield() after lwp_unsleep in sleepq_block
change anything, if we restore the use of cv_wait_sig in lwp_wait?
From: Andreas Gustafsson <gson@gson.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/53202: Kernel hangs running t_ptrace_wait:resume1 test
Date: Wed, 25 Apr 2018 10:32:39 +0300
Taylor R Campbell wrote:
> Does inserting a call to yield() after lwp_unsleep in sleepq_block
> change anything,
I have now run a test with the following patch, plus the two patches
given in the original PR submission:
Index: src/sys/kern/kern_sleepq.c
===================================================================
RCS file: /bracket/repo/src/sys/kern/kern_sleepq.c,v
retrieving revision 1.51
diff -u -r1.51 kern_sleepq.c
--- src/sys/kern/kern_sleepq.c 3 Jul 2016 14:24:58 -0000 1.51
+++ src/sys/kern/kern_sleepq.c 24 Apr 2018 17:39:03 -0000
@@ -257,6 +257,7 @@
if (early) {
/* lwp_unsleep() will release the lock */
lwp_unsleep(l, true);
+ yield();
} else {
if (timo) {
callout_schedule(&l->l_timeout_ch, timo);
> if we restore the use of cv_wait_sig in lwp_wait?
Instead of applying a fourth patch to -current to restore the use of
cv_wait_sig, I applied the above three patches to sources from CVS
date 2018.04.15.00.19.23, when cv_wait_sig was still being used.
I then built and tested an i386 debug build on my own testbed.
The resume1 tests failed with timeouts, but the test suite now ran to
completion without hanging, and the other tests that failed are ones
I have also seen failing in other test runs around the same source
date:
kernel/t_timeleft:timeleft__lwp_park
lib/libc/sys/t_ptrace_wait:resume1
lib/libc/sys/t_ptrace_wait3:resume1
lib/libc/sys/t_ptrace_wait4:resume1
lib/libc/sys/t_ptrace_wait6:resume1
lib/libc/sys/t_ptrace_waitid:resume1
lib/libc/sys/t_ptrace_waitpid:resume1
lib/librumphijack/t_tcpip:nfs_autoload
usr.bin/cc/t_asan_poison:poison
usr.bin/cc/t_asan_poison:poison_pic
usr.bin/cc/t_asan_poison:poison_profile
usr.bin/c++/t_asan_poison:poison
usr.bin/c++/t_asan_poison:poison_pic
usr.bin/c++/t_asan_poison:poison_profile
--
Andreas Gustafsson, gson@gson.org
From: Andreas Gustafsson <gson@gson.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, christos@NetBSD.org
Subject: Re: kern/53202: Kernel hangs running t_ptrace_wait:resume1 test
Date: Thu, 26 Apr 2018 19:11:33 +0300
Taylor,
Earlier, you asked:
> Does inserting a call to yield() after lwp_unsleep in sleepq_block
> change anything, if we restore the use of cv_wait_sig in lwp_wait?
Since this seems to be working, do you think it should be committed?
--
Andreas Gustafsson, gson@gson.org
From: Taylor R Campbell <campbell@mumble.net>
To: Andreas Gustafsson <gson@gson.org>
Cc: gnats-bugs@NetBSD.org, christos@NetBSD.org, rmind@NetBSD.org
Subject: Re: kern/53202: Kernel hangs running t_ptrace_wait:resume1 test
Date: Thu, 26 Apr 2018 16:54:01 +0000
> Date: Thu, 26 Apr 2018 19:11:33 +0300
> From: Andreas Gustafsson <gson@gson.org>
>
> Earlier, you asked:
> > Does inserting a call to yield() after lwp_unsleep in sleepq_block
> > change anything, if we restore the use of cv_wait_sig in lwp_wait?
>
> Since this seems to be working, do you think it should be committed?
Maybe. My understanding of the constraints inside the sleepq and
scheduler logic is limited.
It may be more prudent to find why the change to cv_wait_sig worked
around whatever the root cause of the problem with Go was, but I don't
have any brilliant ideas about that.
In particular, it smells like there is a missing wakeup in the lwp
exit logic, perhaps owing to an obscure case of a signal delivery that
happens to cause cv_wait_sig to return early. But exactly where the
wakeup needs to happen is unclear. The condition that the relevant
cv_wait is waiting for in the `exiting' case of lwp_wait is not
obvious -- there's a whole string of things whose change might trigger
it.
From: Andreas Gustafsson <gson@gson.org>
To: christos@NetBSD.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/53202: Kernel hangs running t_ptrace_wait:resume1 test
Date: Fri, 27 Apr 2018 15:55:19 +0300
Taylor R Campbell wrote:
> It may be more prudent to find why the change to cv_wait_sig worked
> around whatever the root cause of the problem with Go was, but I don't
> have any brilliant ideas about that.
>
> In particular, it smells like there is a missing wakeup in the lwp
> exit logic, perhaps owing to an obscure case of a signal delivery that
> happens to cause cv_wait_sig to return early. But exactly where the
> wakeup needs to happen is unclear. The condition that the relevant
> cv_wait is waiting for in the `exiting' case of lwp_wait is not
> obvious -- there's a whole string of things whose change might trigger
> it.
Christos - is there a PR for the problem with Go that was the
motivation for kern_lwp.c 1.191?
--
Andreas Gustafsson, gson@gson.org
State-Changed-From-To: open->needs-pullups
State-Changed-By: gson@NetBSD.org
State-Changed-When: Wed, 02 May 2018 13:49:01 +0000
State-Changed-Why:
kern_lwp.c 1.192 works and should be pulled up
State-Changed-From-To: needs-pullups->pending-pullups
State-Changed-By: gson@NetBSD.org
State-Changed-When: Fri, 04 May 2018 07:15:13 +0000
State-Changed-Why:
Pullup request submitted.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/53202 CVS commit: [netbsd-8] src/sys/kern
Date: Mon, 14 May 2018 19:11:22 +0000
Module Name: src
Committed By: martin
Date: Mon May 14 19:11:21 UTC 2018
Modified Files:
src/sys/kern [netbsd-8]: kern_lwp.c
Log Message:
Pull up following revision(s) (requested by gson in ticket #805):
sys/kern/kern_lwp.c: revision 1.192
PR/kern/53202: Kernel hangs running t_ptrace_wait:resume1 test, revert
previous.
To generate a diff of this commit:
cvs rdiff -u -r1.189.2.1 -r1.189.2.2 src/sys/kern/kern_lwp.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: pending-pullups->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Mon, 14 May 2018 19:19:39 +0000
State-Changed-Why:
Pullup done.
>Unformatted:
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.