NetBSD Problem Report #57659

From www@netbsd.org  Sun Oct 15 11:03:46 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 3033D1A9238
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 15 Oct 2023 11:03:46 +0000 (UTC)
Message-Id: <20231015110344.530F71A923A@mollari.NetBSD.org>
Date: Sun, 15 Oct 2023 11:03:44 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: closing pipe writefd fails to wake concurrent write on same writefd
X-Send-Pr-Version: www-1.0

>Number:         57659
>Category:       kern
>Synopsis:       closing pipe writefd fails to wake concurrent write on same writefd
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Oct 15 11:05:00 +0000 2023
>Last-Modified:  Sat Nov 18 19:50:01 +0000 2023
>Originator:     Taylor R Campbell
>Release:        current
>Organization:
The NetBSD Foundation
>Environment:
>Description:
If a write system call on a pipe's writefd is blocking because the pipe is full, a concurrent close system call must wake the write system call and cause it to fail with EBADF.

But it doesn't; the write system call seems to hang indefinitely.

This happens before and after sys_pipe.c 1.165:

https://mail-index.netbsd.org/source-changes/2023/10/13/msg148107.html
>How-To-Repeat:
#include <assert.h>
#include <err.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

pthread_barrier_t barrier;
int fd;

static void
waitforbarrier(const char *caller)
{
	int error;

	error = pthread_barrier_wait(&barrier);
	switch (error) {
	case 0:
	case PTHREAD_BARRIER_SERIAL_THREAD:
		break;
	default:
		errc(1, error, "%s: pthread_barrier_wait", caller);
	}
}

static void *
start(void *cookie)
{
	static const char buf[1024*1024]; /* XXX >BIG_PIPE_SIZE */
	ssize_t nwrit;

	waitforbarrier("user");

	nwrit = write(fd, buf, sizeof(buf));
	if (nwrit != -1)    /* buffer filled, try more */
		nwrit = write(fd, buf, sizeof(buf));
	assert(nwrit == -1);
	assert(errno == EBADF);

	return NULL;
}

int
main(void)
{
	int p[2];
	pthread_t t;
	int error;

	if (pipe(p) == -1)
		err(1, "pipe");
	fd = p[1];

	error = pthread_barrier_init(&barrier, NULL, 2);
	if (error)
		errc(1, error, "pthread_barrier_init");
	error = pthread_create(&t, NULL, &start, NULL);
	if (error)
		errc(1, error, "pthread_create");
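	waitforbarrier("closer");
	sleep(1);	/* give the writer time to block in write() */
	alarm(1);	/* SIGALRM kills the process if the writer never wakes */
	if (close(fd) == -1)
		err(1, "close");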
	error = pthread_join(t, NULL);
	if (error)
		errc(1, error, "pthread_join");

	return 0;
}
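
The program above relies on a 1 MiB buffer being larger than the pipe's capacity to get the writer to block. A possible way to establish the "pipe is full" precondition explicitly is sketched below. This is only an illustration, not part of the original reproducer: fill_pipe is a hypothetical helper name, and the sketch assumes that a nonblocking write on a full pipe fails with EAGAIN.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * fill_pipe (hypothetical helper): write zeros to the pipe behind wfd
 * in nonblocking mode until no more bytes fit, then restore the
 * original file status flags.  Returns the number of bytes written,
 * or -1 on error.
 */
ssize_t
fill_pipe(int wfd)
{
	static const char buf[4096];
	size_t chunk = sizeof(buf);
	ssize_t total = 0, n;
	int flags;

	if ((flags = fcntl(wfd, F_GETFL)) == -1)
		return -1;
	if (fcntl(wfd, F_SETFL, flags | O_NONBLOCK) == -1)
		return -1;

	while (chunk > 0) {
		n = write(wfd, buf, chunk);
		if (n > 0) {
			total += n;
		} else if (n == -1 && errno == EAGAIN) {
			/* No room for this chunk; try a smaller one. */
			chunk /= 2;
		} else if (n == -1 && errno == EINTR) {
			continue;
		} else {
			total = -1;
			break;
		}
	}

	/* Back to the caller's (normally blocking) mode. */
	if (fcntl(wfd, F_SETFL, flags) == -1)
		return -1;

	return total;
}
```

If main called fill_pipe(p[1]) before pthread_create, the thread's first write should block without transferring any data, so the "buffer filled, try more" retry would not be needed. O_NONBLOCK is a property of the open file description, so the helper should only be used before any other thread is doing I/O on the pipe.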

>Fix:
Yes, please!

>Audit-Trail:
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: netbsd-bugs@NetBSD.org
Subject: Re: kern/57659: closing pipe writefd fails to wake concurrent write on same writefd
Date: Sun, 15 Oct 2023 13:16:10 +0000

 Correction: not sure if this is broken in HEAD in a hard kernel, but
 the whole cv_fdrestart mechanism is completely broken in a rump kernel
 so my tests are inconclusive.  (Tested on an earlier current hard
 kernel and a todayish rump kernel.)

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57659 CVS commit: src
Date: Sun, 15 Oct 2023 13:22:52 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sun Oct 15 13:22:52 UTC 2023

 Modified Files:
 	src/distrib/sets/lists/debug: mi
 	src/distrib/sets/lists/tests: mi
 	src/tests/kernel: Makefile
 Added Files:
 	src/tests/kernel: t_fdrestart.c

 Log Message:
 t_fdrestart: New test of closing fd with another thread in I/O on it.

 Adapted from regress/sys/kern/dislodgefd.

 PR kern/57659


 To generate a diff of this commit:
 cvs rdiff -u -r1.418 -r1.419 src/distrib/sets/lists/debug/mi
 cvs rdiff -u -r1.1293 -r1.1294 src/distrib/sets/lists/tests/mi
 cvs rdiff -u -r1.75 -r1.76 src/tests/kernel/Makefile
 cvs rdiff -u -r0 -r1.1 src/tests/kernel/t_fdrestart.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57659 CVS commit: src/tests/kernel
Date: Sun, 15 Oct 2023 14:30:52 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sun Oct 15 14:30:52 UTC 2023

 Modified Files:
 	src/tests/kernel: t_fdrestart.c

 Log Message:
 t_fdrestart: Verify rump_sys_write failed second time around.

 PR kern/57659


 To generate a diff of this commit:
 cvs rdiff -u -r1.1 -r1.2 src/tests/kernel/t_fdrestart.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57659 CVS commit: src/tests/kernel
Date: Sat, 18 Nov 2023 19:46:55 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Nov 18 19:46:55 UTC 2023

 Modified Files:
 	src/tests/kernel: t_fdrestart.c

 Log Message:
 t_fdrestart: Mark some tests no longer xfail.

 Backing out ad's changes last month seemed to fix the symptoms
 (although I'm pretty sure this logic is still broken, more to come).

 PR kern/57659


 To generate a diff of this commit:
 cvs rdiff -u -r1.3 -r1.4 src/tests/kernel/t_fdrestart.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.