NetBSD Problem Report #57659
From www@netbsd.org Sun Oct 15 11:03:46 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 3033D1A9238
for <gnats-bugs@gnats.NetBSD.org>; Sun, 15 Oct 2023 11:03:46 +0000 (UTC)
Message-Id: <20231015110344.530F71A923A@mollari.NetBSD.org>
Date: Sun, 15 Oct 2023 11:03:44 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: closing pipe writefd fails to wake concurrent write on same writefd
X-Send-Pr-Version: www-1.0
>Number: 57659
>Category: kern
>Synopsis: closing pipe writefd fails to wake concurrent write on same writefd
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Oct 15 11:05:00 +0000 2023
>Last-Modified: Sat Nov 18 19:50:01 +0000 2023
>Originator: Taylor R Campbell
>Release: current
>Organization:
The NetBSD Foundation
>Environment:
>Description:
If a write system call on a pipe's writefd is blocking because the pipe is full, a concurrent close system call must wake the write system call and cause it to fail with EBADF.
But it doesn't; the write system call seems to hang indefinitely.
This happens before and after sys_pipe.c 1.165:
https://mail-index.netbsd.org/source-changes/2023/10/13/msg148107.html
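For contrast, the non-concurrent case is not in question: a write issued after the descriptor has already been closed fails immediately with EBADF. The following is only a minimal sketch of that baseline, not part of the original report; the function name errno_of_write_after_close is made up for illustration.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Illustrative helper (hypothetical name, not from the report).
 * Close the write end of a fresh pipe, then write to it, and return
 * the errno of the failed write.  Returns 0 if the write unexpectedly
 * succeeded and -1 if setting up the pipe failed.
 */
int
errno_of_write_after_close(void)
{
	int p[2];
	char c = 0;
	ssize_t n;
	int error;

	if (pipe(p) == -1)
		return -1;
	if (close(p[1]) == -1) {
		(void)close(p[0]);
		return -1;
	}

	/* p[1] no longer names an open file, so this must not block. */
	n = write(p[1], &c, 1);
	error = (n == -1) ? errno : 0;

	(void)close(p[0]);
	return error;
}
```

In a single-threaded program this is expected to return EBADF. The report concerns the harder case where the close races with a write that is already asleep in the kernel.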
>How-To-Repeat:
#include <assert.h>
#include <err.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

pthread_barrier_t barrier;
int fd;

static void
waitforbarrier(const char *caller)
{
	int error;

	error = pthread_barrier_wait(&barrier);
	switch (error) {
	case 0:
	case PTHREAD_BARRIER_SERIAL_THREAD:
		break;
	default:
		errc(1, error, "%s: pthread_barrier_wait", caller);
	}
}

static void *
start(void *cookie)
{
	static const char buf[1024*1024]; /* XXX >BIG_PIPE_SIZE */
	ssize_t nwrit;

	waitforbarrier("user");
	nwrit = write(fd, buf, sizeof(buf));
	if (nwrit != -1) /* buffer filled, try more */
		nwrit = write(fd, buf, sizeof(buf));
	assert(nwrit == -1);
	assert(errno == EBADF);
	return NULL;
}

int
main(void)
{
	int p[2];
	pthread_t t;
	int error;

	if (pipe(p) == -1)
		err(1, "pipe");
	fd = p[1];
	error = pthread_barrier_init(&barrier, NULL, 2);
	if (error)
		errc(1, error, "pthread_barrier_init");
	error = pthread_create(&t, NULL, &start, NULL);
	if (error)
		errc(1, error, "pthread_create");
	waitforbarrier("closer");
	sleep(1);
	alarm(1);
	if (close(fd) == -1)
		err(1, "close");
	error = pthread_join(t, NULL);
	if (error)
		errc(1, error, "pthread_join");
	return 0;
}
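The 1 MiB buffer in the program above appears to be sized so that a single write exceeds the pipe's capacity (see the XXX comment). As a rough sketch of an alternative that does not depend on a hard-coded size, the pipe could be filled in non-blocking mode first, so that the next blocking write is certain to sleep. The helper name fill_pipe is hypothetical and is not part of the committed test.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Illustrative helper (hypothetical name, not from the report).
 * Fill the write end of a pipe without blocking and return the number
 * of bytes it accepted, or -1 on error.  The descriptor's original
 * file status flags are restored before returning.
 */
ssize_t
fill_pipe(int wfd)
{
	static const char buf[4096];
	size_t chunk = sizeof(buf);
	ssize_t total = 0, n;
	int flags;

	if ((flags = fcntl(wfd, F_GETFL)) == -1)
		return -1;
	if (fcntl(wfd, F_SETFL, flags | O_NONBLOCK) == -1)
		return -1;

	/*
	 * Write until the pipe refuses more data.  On EAGAIN, retry
	 * with a smaller chunk so that any remaining space is used up;
	 * stop once even a one-byte write would block.
	 */
	while (chunk > 0) {
		n = write(wfd, buf, chunk);
		if (n > 0) {
			total += n;
		} else if (n == -1 && errno == EAGAIN) {
			chunk /= 2;
		} else if (n == -1 && errno == EINTR) {
			continue;
		} else {
			total = -1;
			break;
		}
	}

	if (fcntl(wfd, F_SETFL, flags) == -1)
		return -1;
	return total;
}
```

Under this assumption, the writer thread would call fill_pipe(fd) before the barrier and then issue a single blocking write, which removes the need for the "try more" second write.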
>Fix:
Yes, please!
>Audit-Trail:
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: netbsd-bugs@NetBSD.org
Subject: Re: kern/57659: closing pipe writefd fails to wake concurrent write on same writefd
Date: Sun, 15 Oct 2023 13:16:10 +0000
Correction: not sure if this is broken in HEAD in a hard kernel, but
the whole cv_fdrestart mechanism is completely broken in a rump kernel
so my tests are inconclusive. (Tested on an earlier current hard
kernel and a todayish rump kernel.)
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57659 CVS commit: src
Date: Sun, 15 Oct 2023 13:22:52 +0000
Module Name: src
Committed By: riastradh
Date: Sun Oct 15 13:22:52 UTC 2023
Modified Files:
src/distrib/sets/lists/debug: mi
src/distrib/sets/lists/tests: mi
src/tests/kernel: Makefile
Added Files:
src/tests/kernel: t_fdrestart.c
Log Message:
t_fdrestart: New test of closing fd with another thread in I/O on it.
Adapted from regress/sys/kern/dislodgefd.
PR kern/57659
To generate a diff of this commit:
cvs rdiff -u -r1.418 -r1.419 src/distrib/sets/lists/debug/mi
cvs rdiff -u -r1.1293 -r1.1294 src/distrib/sets/lists/tests/mi
cvs rdiff -u -r1.75 -r1.76 src/tests/kernel/Makefile
cvs rdiff -u -r0 -r1.1 src/tests/kernel/t_fdrestart.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57659 CVS commit: src/tests/kernel
Date: Sun, 15 Oct 2023 14:30:52 +0000
Module Name: src
Committed By: riastradh
Date: Sun Oct 15 14:30:52 UTC 2023
Modified Files:
src/tests/kernel: t_fdrestart.c
Log Message:
t_fdrestart: Verify rump_sys_write failed second time around.
PR kern/57659
To generate a diff of this commit:
cvs rdiff -u -r1.1 -r1.2 src/tests/kernel/t_fdrestart.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57659 CVS commit: src/tests/kernel
Date: Sat, 18 Nov 2023 19:46:55 +0000
Module Name: src
Committed By: riastradh
Date: Sat Nov 18 19:46:55 UTC 2023
Modified Files:
src/tests/kernel: t_fdrestart.c
Log Message:
t_fdrestart: Mark some tests no longer xfail.
Backing out ad's changes last month seemed to fix the symptoms
(although I'm pretty sure this logic is still broken, more to come).
PR kern/57659
To generate a diff of this commit:
cvs rdiff -u -r1.3 -r1.4 src/tests/kernel/t_fdrestart.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.