NetBSD Problem Report #57705

From www@netbsd.org  Sun Nov 19 00:29:42 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id CA9681A9238
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 19 Nov 2023 00:29:41 +0000 (UTC)
Message-Id: <20231119002940.6B7B61A9239@mollari.NetBSD.org>
Date: Sun, 19 Nov 2023 00:29:40 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: rump doesn't have a way to interrupt rump_sys_* by a signal
X-Send-Pr-Version: www-1.0

>Number:         57705
>Category:       kern
>Synopsis:       rump doesn't have a way to interrupt rump_sys_* by a signal
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Nov 19 00:30:00 +0000 2023
>Last-Modified:  Sun Nov 26 12:35:03 +0000 2023
>Originator:     Taylor R Campbell
>Release:        current
>Organization:
The RumpBSD Signallation
>Environment:
>Description:
It is important for some automatic tests (like the one I drafted for PR kern/57703, https://gnats.NetBSD.org/57703) to test interrupting a blocking system call with a signal, so that cv_wait_sig (or legacy tsleep) will return with EINTR.

However, as far as I can tell, there's no way to do that with rump at the moment.  Simply delivering a signal to the process or pthread doesn't do anything -- the underlying pthread_cond_wait inside the rump cv_wait_sig won't wake on signal, and there's no logic in it to trigger a wakeup.
>How-To-Repeat:
Write a test of interrupting a blocking syscall with a signal.
>Fix:
Yes, please!

Probably just a matter of creating a path in locks.c docvwait to check for pending signals and set l->l_sched.info = cv under a lock, so that another thread can deliver a `signal' by setting pending signals and cv_broadcasting l->l_sched.info.

>Audit-Trail:
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: netbsd-bugs@NetBSD.org
Subject: Re: kern/57705: rump doesn't have a way to interrupt rump_sys_* by a signal
Date: Sun, 19 Nov 2023 04:02:37 +0000

 This is a multi-part message in MIME format.
 --=_UO/DuVWULAgwAF6z8mwyycn6j85jhood

 The attached patch implements a rump signal model,
 RUMP_SIGMODEL_RECORD_WAKEUP, for use with rump_boot_setsigmodel, that
 makes rump signal delivery record the signal in the process and wake
 up any cv_wait_sig or cv_timedwait_sig in progress.  And it adds a
 function rump_lwp_kill(l, signo) to simulate signal delivery.

 Combined with RUMP_SIGMODEL_RECORD_WAKEUP, rump_lwp_kill will cause
 pending cv_wait_sig or cv_timedwait_sig to wake, and cause subsequent
 cv_wait_sig or cv_timedwait_sig to fail with EINTR without sleeping.
 This is a new signal model so it won't break any existing code
 (although in principle the extra logic in cv_wait_sig and
 cv_timedwait_sig could break code because it's not conditional on
 RUMP_SIGMODEL_RECORD_WAKEUP).

 It's a little bodgy:

 1. It takes proc_lock in any cv_wait_sig or cv_timedwait_sig call --
    perhaps we can relax this to use a narrower-scoped lock, like
    l->l_mutex (if that works), so it doesn't become a point of
    contention in all rump code.

 2. It delivers the signal to all lwps in a process, not just the
    requested one -- should be a simple matter of programming to make
    it work for just the requested one, but this serves my testing
    needs for now.

 3. Nothing clears the signal, so subsequent cv_wait_sig or
    cv_timedwait_sig will continue to fail with EINTR without sleeping.
    Either the caller of rump_sys_* will need to do something to clear
    the pending signal (i.e., pretend to `handle' it), or we'll need to
    invent some way for rump_sys_* itself to clear it.

 But this serves my needs for the moment, which is to test the fix for
 the PR kern/57703 eventfd panic without crashing my development
 laptop.

 --=_UO/DuVWULAgwAF6z8mwyycn6j85jhood
 Content-Type: text/plain; charset="ISO-8859-1"; name="pr57705-rumpsignalwakeup"
 Content-Transfer-Encoding: quoted-printable
 Content-Disposition: attachment; filename="pr57705-rumpsignalwakeup.patch"

 From ec313e6d40f530da7a2fd5a823c1c71e99e36fc0 Mon Sep 17 00:00:00 2001
 From: Taylor R Campbell <riastradh@NetBSD.org>
 Date: Sun, 19 Nov 2023 01:14:21 +0000
 Subject: [PATCH] rump: New mechanism to interrupt rump_sys_* as if by a
  signal.

 PR kern/57705
 ---
  sys/rump/include/rump/rump.h        |  5 ++-
  sys/rump/librump/rumpkern/locks.c   | 52 ++++++++++++++++++++----
  sys/rump/librump/rumpkern/signals.c | 28 +++++++++++++
  tests/rump/rumpkern/t_signals.c     | 62 +++++++++++++++++++++++++++++
  4 files changed, 139 insertions(+), 8 deletions(-)

 diff --git a/sys/rump/include/rump/rump.h b/sys/rump/include/rump/rump.h
 index a4679d225fe7..48d1b794c95e 100644
 --- a/sys/rump/include/rump/rump.h
 +++ b/sys/rump/include/rump/rump.h
 @@ -64,7 +64,8 @@ enum rump_sigmodel {
  	RUMP_SIGMODEL_IGNORE,
  	RUMP_SIGMODEL__HOST_NOTANYMORE,
  	RUMP_SIGMODEL_RAISE,
 -	RUMP_SIGMODEL_RECORD
 +	RUMP_SIGMODEL_RECORD,
 +	RUMP_SIGMODEL_RECORD_WAKEUP,
  };
 =20
  /* flags to rump_lwproc_rfork */
 @@ -124,6 +125,8 @@ int	rump_init_server(const char *);
  int	rump_daemonize_done(int);
  #define RUMP_DAEMONIZE_SUCCESS 0
 =20
 +void	rump_lwp_kill(struct lwp *, int);
 +
  #ifndef _KERNEL
  #include <rump/rumpkern_if_pub.h>
  #include <rump/rumpvfs_if_pub.h>
 diff --git a/sys/rump/librump/rumpkern/locks.c b/sys/rump/librump/rumpkern/=
 locks.c
 index 1b60b8e70706..bb3931708102 100644
 --- a/sys/rump/librump/rumpkern/locks.c
 +++ b/sys/rump/librump/rumpkern/locks.c
 @@ -378,7 +378,7 @@ cv_destroy(kcondvar_t *cv)
  }
 =20
  static int
 -docvwait(kcondvar_t *cv, kmutex_t *mtx, struct timespec *ts)
 +docvwait(kcondvar_t *cv, kmutex_t *mtx, struct timespec *ts, int catch)
  {
  	struct lwp *l =3D curlwp;
  	int rv;
 @@ -395,9 +395,21 @@ docvwait(kcondvar_t *cv, kmutex_t *mtx, struct timespe=
 c *ts)
 =20
  	UNLOCKED(mtx, false);
 =20
 -	l->l_sched.info =3D cv;
  	rv =3D 0;
 -	if (ts) {
 +
 +	if (catch) {
 +		if (mtx !=3D &proc_lock)
 +			mutex_enter(&proc_lock);
 +	}
 +	l->l_sched.info =3D cv;
 +	if (catch) {
 +		rv =3D sigispending(curlwp, 0);
 +		if (mtx !=3D &proc_lock)
 +			mutex_exit(&proc_lock);
 +	}
 +	if (rv) {
 +		/* nothing */
 +	} else if (ts) {
  		if (rumpuser_cv_timedwait(RUMPCV(cv), RUMPMTX(mtx),
  		    ts->tv_sec, ts->tv_nsec))
  			rv =3D EWOULDBLOCK;
 @@ -430,7 +442,15 @@ docvwait(kcondvar_t *cv, kmutex_t *mtx, struct timespe=
 c *ts)
  		mutex_enter(mtx);
  		rv =3D EINTR;
  	}
 +	if (catch) {
 +		if (mtx !=3D &proc_lock)
 +			mutex_enter(&proc_lock);
 +	}
  	l->l_sched.info =3D NULL;
 +	if (catch) {
 +		if (mtx !=3D &proc_lock)
 +			mutex_exit(&proc_lock);
 +	}
 =20
  	return rv;
  }
 @@ -441,7 +461,7 @@ cv_wait(kcondvar_t *cv, kmutex_t *mtx)
 =20
  	if (__predict_false(rump_threads =3D=3D 0))
  		panic("cv_wait without threads");
 -	(void) docvwait(cv, mtx, NULL);
 +	(void) docvwait(cv, mtx, NULL, /*catch*/0);
  }
 =20
  int
 @@ -450,7 +470,7 @@ cv_wait_sig(kcondvar_t *cv, kmutex_t *mtx)
 =20
  	if (__predict_false(rump_threads =3D=3D 0))
  		panic("cv_wait without threads");
 -	return docvwait(cv, mtx, NULL);
 +	return docvwait(cv, mtx, NULL, /*catch*/1);
  }
 =20
  int
 @@ -460,17 +480,35 @@ cv_timedwait(kcondvar_t *cv, kmutex_t *mtx, int ticks)
  	extern int hz;
  	int rv;
 =20
 +	if (ticks =3D=3D 0) {
 +		cv_wait(cv, mtx);
 +		rv =3D 0;
 +	} else {
 +		ts.tv_sec =3D ticks / hz;
 +		ts.tv_nsec =3D (ticks % hz) * (1000000000/hz);
 +		rv =3D docvwait(cv, mtx, &ts, /*catch*/0);
 +	}
 +
 +	return rv;
 +}
 +
 +int
 +cv_timedwait_sig(kcondvar_t *cv, kmutex_t *mtx, int ticks)
 +{
 +	struct timespec ts;
 +	extern int hz;
 +	int rv;
 +
  	if (ticks =3D=3D 0) {
  		rv =3D cv_wait_sig(cv, mtx);
  	} else {
  		ts.tv_sec =3D ticks / hz;
  		ts.tv_nsec =3D (ticks % hz) * (1000000000/hz);
 -		rv =3D docvwait(cv, mtx, &ts);
 +		rv =3D docvwait(cv, mtx, &ts, /*catch*/1);
  	}
 =20
  	return rv;
  }
 -__strong_alias(cv_timedwait_sig,cv_timedwait);
 =20
  void
  cv_signal(kcondvar_t *cv)
 diff --git a/sys/rump/librump/rumpkern/signals.c b/sys/rump/librump/rumpker=
 n/signals.c
 index 2eb06f188bf5..556678e08b4d 100644
 --- a/sys/rump/librump/rumpkern/signals.c
 +++ b/sys/rump/librump/rumpkern/signals.c
 @@ -105,6 +105,22 @@ rumpsig_record(struct proc *p, int signo)
  	}
  }
 =20
 +static void
 +rumpsig_record_wakeup(struct proc *p, int signo)
 +{
 +	struct lwp *l;
 +
 +	if (!sigismember(&p->p_sigctx.ps_sigignore, signo)) {
 +		sigaddset(&p->p_sigpend.sp_set, signo);
 +		mutex_enter(p->p_lock);
 +		LIST_FOREACH(l, &p->p_lwps, l_sibling) {
 +			if (l->l_sched.info)
 +				cv_broadcast(l->l_sched.info);
 +		}
 +		mutex_exit(p->p_lock);
 +	}
 +}
 +
  typedef void (*rumpsig_fn)(struct proc *, int);
 =20
  static rumpsig_fn rumpsig =3D rumpsig_raise;
 @@ -135,6 +151,9 @@ rump_boot_setsigmodel(enum rump_sigmodel model)
  	case RUMP_SIGMODEL_RECORD:
  		rumpsig =3D rumpsig_record;
  		break;
 +	case RUMP_SIGMODEL_RECORD_WAKEUP:
 +		rumpsig =3D rumpsig_record_wakeup;
 +		break;
 =20
  	/* for compat, though I doubt anyone is using it */
  	case RUMP_SIGMODEL__HOST_NOTANYMORE:
 @@ -143,6 +162,15 @@ rump_boot_setsigmodel(enum rump_sigmodel model)
  	}
  }
 =20
 +void
 +rump_lwp_kill(struct lwp *l, int signo)
 +{
 +
 +	mutex_enter(&proc_lock);
 +	psignal(l->l_proc, signo);
 +	mutex_exit(&proc_lock);
 +}
 +
  void
  psignal(struct proc *p, int sig)
  {
 diff --git a/tests/rump/rumpkern/t_signals.c b/tests/rump/rumpkern/t_signal=
 s.c
 index ba0c0ea36fb2..1f7712d29eee 100644
 --- a/tests/rump/rumpkern/t_signals.c
 +++ b/tests/rump/rumpkern/t_signals.c
 @@ -32,11 +32,13 @@
 =20
  #include <atf-c.h>
  #include <errno.h>
 +#include <pthread.h>
  #include <string.h>
  #include <signal.h>
  #include <unistd.h>
 =20
  #include <rump/rump.h>
 +#include <rump/rump_syscalls.h>
 =20
  #include "../kernspace/kernspace.h"
  #include "h_macros.h"
 @@ -62,6 +64,13 @@ ATF_TC_HEAD(sigpanic, tc)
  	atf_tc_set_md_var(tc, "descr", "RUMP_SIGMODEL_PANIC");
  }
 =20
 +ATF_TC(sigrecordwakeup);
 +ATF_TC_HEAD(sigrecordwakeup, tc)
 +{
 +
 +	atf_tc_set_md_var(tc, "descr", "RUMP_SIGMODEL_RECORD_WAKEUP");
 +}
 +
  static volatile sig_atomic_t sigcnt;
  static void
  thehand(int sig)
 @@ -116,12 +125,65 @@ ATF_TC_BODY(sigpanic, tc)
  	}
  }
 =20
 +struct sigrecordwakeupctx {
 +	pid_t pid;
 +	struct lwp *lwp;
 +};
 +
 +static void *
 +signalme(void *cookie)
 +{
 +	struct sigrecordwakeupctx *C =3D cookie;
 +
 +	RZ(rump_pub_lwproc_newlwp(C->pid));
 +
 +	sleep(1);
 +	rump_lwp_kill(C->lwp, SIGUSR1);
 +
 +	return NULL;
 +}
 +
 +ATF_TC_BODY(sigrecordwakeup, tc)
 +{
 +	struct sigrecordwakeupctx ctx, *C =3D &ctx;
 +	pthread_t t;
 +	int fd[2];
 +	char c;
 +	ssize_t nwrit;
 +	int error;
 +
 +	rump_boot_setsigmodel(RUMP_SIGMODEL_RECORD_WAKEUP);
 +	rump_init();
 +
 +	memset(C, 0, sizeof(*C));
 +
 +	RZ(rump_pub_lwproc_rfork(RUMP_RFCFDG));
 +	RL(C->pid =3D rump_sys_getpid());
 +	RZ(rump_pub_lwproc_newlwp(C->pid));
 +	C->lwp =3D rump_pub_lwproc_curlwp();
 +
 +	RL(rump_sys_pipe(fd));
 +
 +	RZ(pthread_create(&t, NULL, &signalme, C));
 +
 +	alarm(2);
 +	nwrit =3D rump_sys_read(fd[0], &c, sizeof(c));
 +	ATF_REQUIRE_EQ_MSG(nwrit, -1, "rump_sys_read: %zd", nwrit);
 +	error =3D errno;
 +	ATF_REQUIRE_EQ_MSG(error, EINTR, "errno=3D%d (%s)",
 +	    error, strerror(error));
 +
 +	RZ(pthread_join(t, NULL));
 +	alarm(0);
 +}
 +
  ATF_TP_ADD_TCS(tp)
  {
 =20
  	ATF_TP_ADD_TC(tp, sigraise);
  	ATF_TP_ADD_TC(tp, sigignore);
  	ATF_TP_ADD_TC(tp, sigpanic);
 +	ATF_TP_ADD_TC(tp, sigrecordwakeup);
 =20
  	return atf_no_error();
  }

 --=_UO/DuVWULAgwAF6z8mwyycn6j85jhood--

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: netbsd-bugs@NetBSD.org
Subject: Re: kern/57705: rump doesn't have a way to interrupt rump_sys_* by a signal
Date: Sun, 19 Nov 2023 04:09:30 +0000

 Oops... that patch isn't going to work reliably: potential lock order
 reversal and races.  Need to find another approach -- probably have to
 arrange psignal to take the lock that's serializing cv_wait_sig, which
 means finding some way to safely expose that lock through
 l->l_sched.info.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57705 CVS commit: src/sys/kern
Date: Sun, 19 Nov 2023 04:13:38 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sun Nov 19 04:13:38 UTC 2023

 Modified Files:
 	src/sys/kern: sys_eventfd.c

 Log Message:
 eventfd(2): Omit needless micro-optimization causing PR kern/57703.

 Unfortunately, owing to PR kern/57705 and PR misc/57706, it isn't
 convenient to flip the xfail switch on a test for this bug.  So we'll
 do that separately.  (But I did verify that a rumpified version of
 the test postd to PR kern/57703 failed without this change, and
 passed with this change.)

 PR kern/57703

 XXX pullup-10


 To generate a diff of this commit:
 cvs rdiff -u -r1.9 -r1.10 src/sys/kern/sys_eventfd.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57705 CVS commit: [netbsd-10] src/sys/kern
Date: Sun, 26 Nov 2023 12:33:19 +0000

 Module Name:	src
 Committed By:	bouyer
 Date:		Sun Nov 26 12:33:19 UTC 2023

 Modified Files:
 	src/sys/kern [netbsd-10]: sys_eventfd.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #468):
 	sys/kern/sys_eventfd.c: revision 1.10
 eventfd(2): Omit needless micro-optimization causing PR kern/57703.
 Unfortunately, owing to PR kern/57705 and PR misc/57706, it isn't
 convenient to flip the xfail switch on a test for this bug.  So we'll
 do that separately.  (But I did verify that a rumpified version of
 the test postd to PR kern/57703 failed without this change, and
 passed with this change.)
 PR kern/57703
 XXX pullup-10


 To generate a diff of this commit:
 cvs rdiff -u -r1.9 -r1.9.4.1 src/sys/kern/sys_eventfd.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.