NetBSD Problem Report #57705
From www@netbsd.org Sun Nov 19 00:29:42 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id CA9681A9238
for <gnats-bugs@gnats.NetBSD.org>; Sun, 19 Nov 2023 00:29:41 +0000 (UTC)
Message-Id: <20231119002940.6B7B61A9239@mollari.NetBSD.org>
Date: Sun, 19 Nov 2023 00:29:40 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: rump doesn't have a way to interrupt rump_sys_* by a signal
X-Send-Pr-Version: www-1.0
>Number: 57705
>Category: kern
>Synopsis: rump doesn't have a way to interrupt rump_sys_* by a signal
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Nov 19 00:30:00 +0000 2023
>Last-Modified: Sun Nov 26 12:35:03 +0000 2023
>Originator: Taylor R Campbell
>Release: current
>Organization:
The RumpBSD Signallation
>Environment:
>Description:
It is important for some automatic tests (like the one I drafted for PR kern/57703, https://gnats.NetBSD.org/57703) to test interrupting a blocking system call with a signal, so that cv_wait_sig (or legacy tsleep) will return with EINTR.
However, as far as I can tell, there's no way to do that with rump at the moment. Simply delivering a signal to the process or pthread doesn't do anything -- the underlying pthread_cond_wait inside the rump cv_wait_sig won't wake on signal, and there's no logic in it to trigger a wakeup.
>How-To-Repeat:
Write a test of interrupting a blocking syscall with a signal.
>Fix:
Yes, please!
Probably just a matter of creating a path in locks.c docvwait to check for pending signals and set l->l_sched.info = cv under a lock, so that another thread can deliver a `signal' by setting pending signals and cv_broadcasting l->l_sched.info.
>Audit-Trail:
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: netbsd-bugs@NetBSD.org
Subject: Re: kern/57705: rump doesn't have a way to interrupt rump_sys_* by a signal
Date: Sun, 19 Nov 2023 04:02:37 +0000
This is a multi-part message in MIME format.
--=_UO/DuVWULAgwAF6z8mwyycn6j85jhood
The attached patch implements a rump signal model,
RUMP_SIGMODEL_RECORD_WAKEUP, for use with rump_boot_setsigmodel, that
makes rump signal delivery record the signal in the process and wake
up any cv_wait_sig or cv_timedwait_sig in progress. And it adds a
function rump_lwp_kill(l, signo) to simulate signal delivery.
Combined with RUMP_SIGMODEL_RECORD_WAKEUP, rump_lwp_kill will cause
pending cv_wait_sig or cv_timedwait_sig to wake, and cause subsequent
cv_wait_sig or cv_timedwait_sig to fail with EINTR without sleeping.
This is a new signal model so it won't break any existing code
(although in principle the extra logic in cv_wait_sig and
cv_timedwait_sig could break code because it's not conditional on
RUMP_SIGMODEL_RECORD_WAKEUP).
It's a little bodgy:
1. It takes proc_lock in any cv_wait_sig or cv_timedwait_sig call --
perhaps we can relax this to use a narrower-scoped lock, like
l->l_mutex (if that works), so it doesn't become a point of
contention in all rump code.
2. It delivers the signal to all lwps in a process, not just the
requested one -- should be a simple matter of programming to make
it work for just the requested one, but this serves my testing
needs for now.
3. Nothing clears the signal, so subsequent cv_wait_sig or
cv_timedwait_sig will continue to fail with EINTR without sleeping.
Either the caller of rump_sys_* will need to do something to clear
the pending signal (i.e., pretend to `handle' it), or we'll need to
invent some way for rump_sys_* itself to clear it.
But this serves my needs for the moment, which is to test the fix for
the PR kern/57703 eventfd panic without crashing my development
laptop.
--=_UO/DuVWULAgwAF6z8mwyycn6j85jhood
Content-Type: text/plain; charset="ISO-8859-1"; name="pr57705-rumpsignalwakeup"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="pr57705-rumpsignalwakeup.patch"
From ec313e6d40f530da7a2fd5a823c1c71e99e36fc0 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell <riastradh@NetBSD.org>
Date: Sun, 19 Nov 2023 01:14:21 +0000
Subject: [PATCH] rump: New mechanism to interrupt rump_sys_* as if by a
signal.
PR kern/57705
---
sys/rump/include/rump/rump.h | 5 ++-
sys/rump/librump/rumpkern/locks.c | 52 ++++++++++++++++++++----
sys/rump/librump/rumpkern/signals.c | 28 +++++++++++++
tests/rump/rumpkern/t_signals.c | 62 +++++++++++++++++++++++++++++
4 files changed, 139 insertions(+), 8 deletions(-)
diff --git a/sys/rump/include/rump/rump.h b/sys/rump/include/rump/rump.h
index a4679d225fe7..48d1b794c95e 100644
--- a/sys/rump/include/rump/rump.h
+++ b/sys/rump/include/rump/rump.h
@@ -64,7 +64,8 @@ enum rump_sigmodel {
RUMP_SIGMODEL_IGNORE,
RUMP_SIGMODEL__HOST_NOTANYMORE,
RUMP_SIGMODEL_RAISE,
- RUMP_SIGMODEL_RECORD
+ RUMP_SIGMODEL_RECORD,
+ RUMP_SIGMODEL_RECORD_WAKEUP,
};
=20
/* flags to rump_lwproc_rfork */
@@ -124,6 +125,8 @@ int rump_init_server(const char *);
int rump_daemonize_done(int);
#define RUMP_DAEMONIZE_SUCCESS 0
=20
+void rump_lwp_kill(struct lwp *, int);
+
#ifndef _KERNEL
#include <rump/rumpkern_if_pub.h>
#include <rump/rumpvfs_if_pub.h>
diff --git a/sys/rump/librump/rumpkern/locks.c b/sys/rump/librump/rumpkern/=
locks.c
index 1b60b8e70706..bb3931708102 100644
--- a/sys/rump/librump/rumpkern/locks.c
+++ b/sys/rump/librump/rumpkern/locks.c
@@ -378,7 +378,7 @@ cv_destroy(kcondvar_t *cv)
}
=20
static int
-docvwait(kcondvar_t *cv, kmutex_t *mtx, struct timespec *ts)
+docvwait(kcondvar_t *cv, kmutex_t *mtx, struct timespec *ts, int catch)
{
struct lwp *l =3D curlwp;
int rv;
@@ -395,9 +395,21 @@ docvwait(kcondvar_t *cv, kmutex_t *mtx, struct timespe=
c *ts)
=20
UNLOCKED(mtx, false);
=20
- l->l_sched.info =3D cv;
rv =3D 0;
- if (ts) {
+
+ if (catch) {
+ if (mtx !=3D &proc_lock)
+ mutex_enter(&proc_lock);
+ }
+ l->l_sched.info =3D cv;
+ if (catch) {
+ rv =3D sigispending(curlwp, 0);
+ if (mtx !=3D &proc_lock)
+ mutex_exit(&proc_lock);
+ }
+ if (rv) {
+ /* nothing */
+ } else if (ts) {
if (rumpuser_cv_timedwait(RUMPCV(cv), RUMPMTX(mtx),
ts->tv_sec, ts->tv_nsec))
rv =3D EWOULDBLOCK;
@@ -430,7 +442,15 @@ docvwait(kcondvar_t *cv, kmutex_t *mtx, struct timespe=
c *ts)
mutex_enter(mtx);
rv =3D EINTR;
}
+ if (catch) {
+ if (mtx !=3D &proc_lock)
+ mutex_enter(&proc_lock);
+ }
l->l_sched.info =3D NULL;
+ if (catch) {
+ if (mtx !=3D &proc_lock)
+ mutex_exit(&proc_lock);
+ }
=20
return rv;
}
@@ -441,7 +461,7 @@ cv_wait(kcondvar_t *cv, kmutex_t *mtx)
=20
if (__predict_false(rump_threads =3D=3D 0))
panic("cv_wait without threads");
- (void) docvwait(cv, mtx, NULL);
+ (void) docvwait(cv, mtx, NULL, /*catch*/0);
}
=20
int
@@ -450,7 +470,7 @@ cv_wait_sig(kcondvar_t *cv, kmutex_t *mtx)
=20
if (__predict_false(rump_threads =3D=3D 0))
panic("cv_wait without threads");
- return docvwait(cv, mtx, NULL);
+ return docvwait(cv, mtx, NULL, /*catch*/1);
}
=20
int
@@ -460,17 +480,35 @@ cv_timedwait(kcondvar_t *cv, kmutex_t *mtx, int ticks)
extern int hz;
int rv;
=20
+ if (ticks =3D=3D 0) {
+ cv_wait(cv, mtx);
+ rv =3D 0;
+ } else {
+ ts.tv_sec =3D ticks / hz;
+ ts.tv_nsec =3D (ticks % hz) * (1000000000/hz);
+ rv =3D docvwait(cv, mtx, &ts, /*catch*/0);
+ }
+
+ return rv;
+}
+
+int
+cv_timedwait_sig(kcondvar_t *cv, kmutex_t *mtx, int ticks)
+{
+ struct timespec ts;
+ extern int hz;
+ int rv;
+
if (ticks =3D=3D 0) {
rv =3D cv_wait_sig(cv, mtx);
} else {
ts.tv_sec =3D ticks / hz;
ts.tv_nsec =3D (ticks % hz) * (1000000000/hz);
- rv =3D docvwait(cv, mtx, &ts);
+ rv =3D docvwait(cv, mtx, &ts, /*catch*/1);
}
=20
return rv;
}
-__strong_alias(cv_timedwait_sig,cv_timedwait);
=20
void
cv_signal(kcondvar_t *cv)
diff --git a/sys/rump/librump/rumpkern/signals.c b/sys/rump/librump/rumpker=
n/signals.c
index 2eb06f188bf5..556678e08b4d 100644
--- a/sys/rump/librump/rumpkern/signals.c
+++ b/sys/rump/librump/rumpkern/signals.c
@@ -105,6 +105,22 @@ rumpsig_record(struct proc *p, int signo)
}
}
=20
+static void
+rumpsig_record_wakeup(struct proc *p, int signo)
+{
+ struct lwp *l;
+
+ if (!sigismember(&p->p_sigctx.ps_sigignore, signo)) {
+ sigaddset(&p->p_sigpend.sp_set, signo);
+ mutex_enter(p->p_lock);
+ LIST_FOREACH(l, &p->p_lwps, l_sibling) {
+ if (l->l_sched.info)
+ cv_broadcast(l->l_sched.info);
+ }
+ mutex_exit(p->p_lock);
+ }
+}
+
typedef void (*rumpsig_fn)(struct proc *, int);
=20
static rumpsig_fn rumpsig =3D rumpsig_raise;
@@ -135,6 +151,9 @@ rump_boot_setsigmodel(enum rump_sigmodel model)
case RUMP_SIGMODEL_RECORD:
rumpsig =3D rumpsig_record;
break;
+ case RUMP_SIGMODEL_RECORD_WAKEUP:
+ rumpsig =3D rumpsig_record_wakeup;
+ break;
=20
/* for compat, though I doubt anyone is using it */
case RUMP_SIGMODEL__HOST_NOTANYMORE:
@@ -143,6 +162,15 @@ rump_boot_setsigmodel(enum rump_sigmodel model)
}
}
=20
+void
+rump_lwp_kill(struct lwp *l, int signo)
+{
+
+ mutex_enter(&proc_lock);
+ psignal(l->l_proc, signo);
+ mutex_exit(&proc_lock);
+}
+
void
psignal(struct proc *p, int sig)
{
diff --git a/tests/rump/rumpkern/t_signals.c b/tests/rump/rumpkern/t_signal=
s.c
index ba0c0ea36fb2..1f7712d29eee 100644
--- a/tests/rump/rumpkern/t_signals.c
+++ b/tests/rump/rumpkern/t_signals.c
@@ -32,11 +32,13 @@
=20
#include <atf-c.h>
#include <errno.h>
+#include <pthread.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>
=20
#include <rump/rump.h>
+#include <rump/rump_syscalls.h>
=20
#include "../kernspace/kernspace.h"
#include "h_macros.h"
@@ -62,6 +64,13 @@ ATF_TC_HEAD(sigpanic, tc)
atf_tc_set_md_var(tc, "descr", "RUMP_SIGMODEL_PANIC");
}
=20
+ATF_TC(sigrecordwakeup);
+ATF_TC_HEAD(sigrecordwakeup, tc)
+{
+
+ atf_tc_set_md_var(tc, "descr", "RUMP_SIGMODEL_RECORD_WAKEUP");
+}
+
static volatile sig_atomic_t sigcnt;
static void
thehand(int sig)
@@ -116,12 +125,65 @@ ATF_TC_BODY(sigpanic, tc)
}
}
=20
+struct sigrecordwakeupctx {
+ pid_t pid;
+ struct lwp *lwp;
+};
+
+static void *
+signalme(void *cookie)
+{
+ struct sigrecordwakeupctx *C =3D cookie;
+
+ RZ(rump_pub_lwproc_newlwp(C->pid));
+
+ sleep(1);
+ rump_lwp_kill(C->lwp, SIGUSR1);
+
+ return NULL;
+}
+
+ATF_TC_BODY(sigrecordwakeup, tc)
+{
+ struct sigrecordwakeupctx ctx, *C =3D &ctx;
+ pthread_t t;
+ int fd[2];
+ char c;
+ ssize_t nwrit;
+ int error;
+
+ rump_boot_setsigmodel(RUMP_SIGMODEL_RECORD_WAKEUP);
+ rump_init();
+
+ memset(C, 0, sizeof(*C));
+
+ RZ(rump_pub_lwproc_rfork(RUMP_RFCFDG));
+ RL(C->pid =3D rump_sys_getpid());
+ RZ(rump_pub_lwproc_newlwp(C->pid));
+ C->lwp =3D rump_pub_lwproc_curlwp();
+
+ RL(rump_sys_pipe(fd));
+
+ RZ(pthread_create(&t, NULL, &signalme, C));
+
+ alarm(2);
+ nwrit =3D rump_sys_read(fd[0], &c, sizeof(c));
+ ATF_REQUIRE_EQ_MSG(nwrit, -1, "rump_sys_read: %zd", nwrit);
+ error =3D errno;
+ ATF_REQUIRE_EQ_MSG(error, EINTR, "errno=3D%d (%s)",
+ error, strerror(error));
+
+ RZ(pthread_join(t, NULL));
+ alarm(0);
+}
+
ATF_TP_ADD_TCS(tp)
{
=20
ATF_TP_ADD_TC(tp, sigraise);
ATF_TP_ADD_TC(tp, sigignore);
ATF_TP_ADD_TC(tp, sigpanic);
+ ATF_TP_ADD_TC(tp, sigrecordwakeup);
=20
return atf_no_error();
}
--=_UO/DuVWULAgwAF6z8mwyycn6j85jhood--
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: netbsd-bugs@NetBSD.org
Subject: Re: kern/57705: rump doesn't have a way to interrupt rump_sys_* by a signal
Date: Sun, 19 Nov 2023 04:09:30 +0000
Oops... that patch isn't going to work reliably: potential lock order
reversal and races. Need to find another approach -- probably have to
arrange psignal to take the lock that's serializing cv_wait_sig, which
means finding some way to safely expose that lock through
l->l_sched.info.
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57705 CVS commit: src/sys/kern
Date: Sun, 19 Nov 2023 04:13:38 +0000
Module Name: src
Committed By: riastradh
Date: Sun Nov 19 04:13:38 UTC 2023
Modified Files:
src/sys/kern: sys_eventfd.c
Log Message:
eventfd(2): Omit needless micro-optimization causing PR kern/57703.
Unfortunately, owing to PR kern/57705 and PR misc/57706, it isn't
convenient to flip the xfail switch on a test for this bug. So we'll
do that separately. (But I did verify that a rumpified version of
the test postd to PR kern/57703 failed without this change, and
passed with this change.)
PR kern/57703
XXX pullup-10
To generate a diff of this commit:
cvs rdiff -u -r1.9 -r1.10 src/sys/kern/sys_eventfd.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57705 CVS commit: [netbsd-10] src/sys/kern
Date: Sun, 26 Nov 2023 12:33:19 +0000
Module Name: src
Committed By: bouyer
Date: Sun Nov 26 12:33:19 UTC 2023
Modified Files:
src/sys/kern [netbsd-10]: sys_eventfd.c
Log Message:
Pull up following revision(s) (requested by riastradh in ticket #468):
sys/kern/sys_eventfd.c: revision 1.10
eventfd(2): Omit needless micro-optimization causing PR kern/57703.
Unfortunately, owing to PR kern/57705 and PR misc/57706, it isn't
convenient to flip the xfail switch on a test for this bug. So we'll
do that separately. (But I did verify that a rumpified version of
the test postd to PR kern/57703 failed without this change, and
passed with this change.)
PR kern/57703
XXX pullup-10
To generate a diff of this commit:
cvs rdiff -u -r1.9 -r1.9.4.1 src/sys/kern/sys_eventfd.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
(Contact us)
$NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.