NetBSD Problem Report #58929

From www@netbsd.org  Sat Dec 21 21:04:02 2024
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
	 client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 3BB071A923A
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 21 Dec 2024 21:04:02 +0000 (UTC)
Message-Id: <20241221210400.6F0471A923B@mollari.NetBSD.org>
Date: Sat, 21 Dec 2024 21:04:00 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: POSIX.1-2024 compliance: posix_close, POSIX_CLOSE_RESTART
X-Send-Pr-Version: www-1.0

>Number:         58929
>Category:       kern
>Synopsis:       POSIX.1-2024 compliance: posix_close, POSIX_CLOSE_RESTART
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Dec 21 21:05:00 +0000 2024
>Last-Modified:  Sun Dec 22 04:55:01 +0000 2024
>Originator:     Taylor R Campbell
>Release:        
>Organization:
The CloseBSD Fdation
>Environment:
>Description:
Through POSIX.1-2008, the state of the descriptor after the close() function _fails_ has been specified to be indeterminate:

> If close() is interrupted by a signal that is to be caught, it shall return -1 with errno set to [EINTR] and the state of fildes is unspecified. If an I/O error occurred while reading from or writing to the file system during close(), it may return -1 with errno set to [EIO]; if this error is returned, the state of fildes is unspecified.
> 
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/close.html

On every OS I checked (NetBSD, FreeBSD, OpenBSD, Linux, illumos), the semantics is that after close(), the descriptor is closed, unconditionally, whether close() succeeded or failed -- even if this involves doing some I/O that may be interrupted by a signal which causes EINTR to come flying out of close().

Rather than simply endorse the reasonable semantics that probably every reasonable OS on the planet already implements, POSIX.1-2024 changed that language to:

> If close() is interrupted by a signal that is to be caught, then it is unspecified whether it returns -1 with errno set to [EINTR] and fildes remaining open, or returns -1 with errno set to [EINPROGRESS] and fildes being closed, or returns 0 to indicate successful completion; except that if POSIX_CLOSE_RESTART is defined as 0, then the option of returning -1 with errno set to [EINTR] and fildes remaining open shall not occur.
>
> [...]
>
> The posix_close() function shall be equivalent to the close() function, except with the modifications based on the flag argument as described below. If flag is 0, then posix_close() shall not return -1 with errno set to [EINTR], which implies that fildes will always be closed (except for [EBADF], where fildes was invalid). If flag includes POSIX_CLOSE_RESTART and POSIX_CLOSE_RESTART is defined as a non-zero value, and posix_close() is interrupted by a signal that is to be caught, then posix_close() may return -1 with errno set to [EINTR], in which case fildes shall be left open; however, it is unspecified whether fildes can subsequently be passed to any function except close() or posix_close() without error. If flag is invalid, posix_close() may fail with errno set to [EINVAL], but shall otherwise behave as if flag had been 0 and close fildes.
>
> [...]
>
> If POSIX_CLOSE_RESTART is zero, the close() function shall not return an [EINTR] error.
>
> https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/functions/posix_close.html

This probably caters only to such obscure and likely indefensible niche proprietary operating systems that I doubt whether many applications in the real world will ever call posix_close(), especially given the mental gymnastics needed to understand the language describing it.  But it's there in POSIX so it would be reasonable for us to implement it, or if we don't, then or at least to document why we don't.
>How-To-Repeat:
read POSIX.1-2024 and take some aspirin for the headache it will cause
>Fix:
Since close() (a) may return EINTR, and (b) is unconditionally final, i.e., cannot be restarted even if it returns EINTR, we cannot simply #define POSIX_CLOSE_RESTART 0.

So, perhaps we can do this:

#define POSIX_CLOSE_RESTART 1

int
posix_close(int fildes, int flag)
{

	if (close(fildes) == -1 && errno != EINTR)
		return -1;

	case (flag) {
	case 0:
	case POSIX_CLOSE_RESTART:
		break;
	default:
		errno = EINVAL;
		return -1;
	}

	return 0;
}

This meets the following criteria:

> If flag is 0, then posix_close() shall not return -1 with errno set to [EINTR], which implies that fildes will always be closed (except for [EBADF], where fildes was invalid).

This posix_close() definition does not fail with EINTR.  fildes is always closed.

> If flag includes POSIX_CLOSE_RESTART and POSIX_CLOSE_RESTART is defined as a non-zero value, and posix_close() is interrupted by a signal that is to be caught, then posix_close() may return -1 with errno set to [EINTR], in which case fildes shall be left open; [...]

POSIX_CLOSE_RESTART is defined as a non-zero value, so this clause applies.  And while this clause says posix_close() MAY fail with EINTR, it does not say posix_close() MUST fail with EINTR if interrupted by a signal.  This posix_close() definition succeeds (returns 0) if interrupted by a signal, so it meets the requirement by never failing with EINTR and never leaving fildes open.

> If flag is invalid, posix_close() may fail with errno set to [EINVAL], but shall otherwise behave as if flag had been 0 and close fildes.

This checks flag for validity _after_ close(), so fildes is closed even if flag is invalid.  (EBADF and other possible close() failures like EIO take precedence over EINVAL.)
This posix_close() definition closes fildes even if the flag is invalid.

>Audit-Trail:
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Cc: 
Subject: Re: kern/58929: POSIX.1-2024 compliance: posix_close, POSIX_CLOSE_RESTART
Date: Sat, 21 Dec 2024 22:07:19 +0000

 Correction: Our current close() -- and probably all existing close()
 implementations outside buggy proprietary Unix variants -- is not
 compliant with POSIX.1-2024, because if some underlying I/O is
 interrupted by a signal it can fail with EINTR even though the
 descriptor is unconditionally closed.  That is now forbidden (emphasis
 added):

 > If close() is interrupted by a signal that is to be caught, then it
 > is unspecified whether it returns -1 with errno set to [EINTR] and
 > fildes _remaining open_, or returns -1 with errno set to
 > [EINPROGRESS] and fildes being closed, or returns 0 to indicate
 > successful completion[...]
 > 
 > RATIONALE
 > 
 > [...] Note that the standard requires that close() and posix_close()
 > _must leave fildes open after [EINTR]_ (in the cases where [EINTR]
 > is permitted)

 So, in order to deal with POSIX.1-2024 having addressed the problem of
 its own failure to adequately specify semantics by breaking
 compatibility with every major OS on the planet, it looks like we'll
 have to do some symbol magic under _POSIX_C_SOURCE -- something like
 this:

 /* include/unistd.h */

 #if (_POSIX_C_SOURCE >= 202408L) && !defined(_NETBSD_SOURCE)
 int	close(int) __RENAME(__posixly_correct_close);
 int	posix_close(int, int);
 #define	POSIX_CLOSE_RESTART	0
 #else
 int	close(int);
 #endif

 /* lib/libc/sys/posixly_correct_close.c */

 int
 __posixly_correct_close(int d)
 {
 	const int ret = close(d);

 	if (ret == -1 && errno == EINTR)
 		errno = EINPROGRESS;

 	return ret;
 }

 int
 posix_close(int d, int flag)
 {
 	const int ret = __posixly_correct_close(d);

 	_DIAGASSERT(ret == 0 || errno != EINTR);

 	if (ret == 0 && flag != 0) {
 		errno = EINVAL;
 		ret = -1;
 	}
 	return ret;
 }

 What a colossal waste of effort -- and attack surface for new bugs to
 crawl onto -- to force everyone else to bend over backwards to
 accommodate bugs in proprietary implementations of close().

From: Robert Elz <kre@munnari.OZ.AU>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/58929: POSIX.1-2024 compliance: posix_close, POSIX_CLOSE_RESTART
Date: Sun, 22 Dec 2024 07:49:25 +0700

     Date:        Sat, 21 Dec 2024 22:07:19 +0000
     From:        Taylor R Campbell <riastradh@NetBSD.org>
     Message-ID:  <20241221220723.04ECB84E92@mail.netbsd.org>

   | Correction: Our current close() -- and probably all existing close()
   | implementations outside buggy proprietary Unix variants -- is not
   | compliant with POSIX.1-2024, because if some underlying I/O is
   | interrupted by a signal it can fail with EINTR even though the
   | descriptor is unconditionally closed.  That is now forbidden (emphasis
   | added):

 Can it really?   I haven't looked at our implementation (kernel)
 but it used to be the case that even if close() slept for some time,
 (like waiting for tty output to drain) and a signal occurred, close()
 would still return 0.

 But if not, surely the simple change here is to make that happen, not
 to work around it in libc - just have sys_close() never return EINTR.

 We have to function correctly in this case anyway, as we do close the
 fd, so another close(fd) will just return EBADF - so anything draining
 still in the kernel output queues needs to be properly handled without
 any more userland assistance.

 kre
   |
   | > If close() is interrupted by a signal that is to be caught, then it
   | > is unspecified whether it returns -1 with errno set to [EINTR] and
   | > fildes _remaining open_, or returns -1 with errno set to
   | > [EINPROGRESS] and fildes being closed, or returns 0 to indicate
   | > successful completion[...]
   | > 
   | > RATIONALE
   | > 
   | > [...] Note that the standard requires that close() and posix_close()
   | > _must leave fildes open after [EINTR]_ (in the cases where [EINTR]
   | > is permitted)
   |
   | So, in order to deal with POSIX.1-2024 having addressed the problem of
   | its own failure to adequately specify semantics by breaking
   | compatibility with every major OS on the planet, it looks like we'll
   | have to do some symbol magic under _POSIX_C_SOURCE -- something like
   | this:
   |
   | /* include/unistd.h */
   |
   | #if (_POSIX_C_SOURCE >= 202408L) && !defined(_NETBSD_SOURCE)
   | int	close(int) __RENAME(__posixly_correct_close);
   | int	posix_close(int, int);
   | #define	POSIX_CLOSE_RESTART	0
   | #else
   | int	close(int);
   | #endif
   |
   | /* lib/libc/sys/posixly_correct_close.c */
   |
   | int
   | __posixly_correct_close(int d)
   | {
   | 	const int ret = close(d);
   |
   | 	if (ret == -1 && errno == EINTR)
   | 		errno = EINPROGRESS;
   |
   | 	return ret;
   | }
   |
   | int
   | posix_close(int d, int flag)
   | {
   | 	const int ret = __posixly_correct_close(d);
   |
   | 	_DIAGASSERT(ret == 0 || errno != EINTR);
   |
   | 	if (ret == 0 && flag != 0) {
   | 		errno = EINVAL;
   | 		ret = -1;
   | 	}
   | 	return ret;
   | }
   |
   | What a colossal waste of effort -- and attack surface for new bugs to
   | crawl onto -- to force everyone else to bend over backwards to
   | accommodate bugs in proprietary implementations of close().
   |


From: Robert Elz <kre@munnari.OZ.AU>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/58929: POSIX.1-2024 compliance: posix_close, POSIX_CLOSE_RESTART
Date: Sun, 22 Dec 2024 10:45:54 +0700

     Date:        Sat, 21 Dec 2024 22:07:19 +0000
     From:        Taylor R Campbell <riastradh@NetBSD.org>
     Message-ID:  <20241221220723.04ECB84E92@mail.netbsd.org>


   | So, in order to deal with POSIX.1-2024 having addressed the problem of
   | its own failure to adequately specify semantics by breaking
   | compatibility with every major OS on the planet, it looks like we'll
   | have to do some symbol magic under _POSIX_C_SOURCE -- something like
   | this:

 Surely all we need is...

   | /* include/unistd.h */
   |
   | #if (_POSIX_C_SOURCE >= 202408L) && !defined(_NETBSD_SOURCE)
   | int	posix_close(int, int);
   | #define	POSIX_CLOSE_RESTART	0
   | #endif
   | int	close(int);

   | int
   | posix_close(int d, int flag)
   | {
   | 	const int ret = close(d);
   |
   | 	_DIAGASSERT(ret == 0 || errno != EINTR);
   |
   | 	if (ret == 0 && flag != 0) {
   | 		errno = EINVAL;
   | 		ret = -1;
   | 	}
   | 	return ret;
   | }

 Along with:

 Index: sys_descrip.c
 ===================================================================
 RCS file: /cvsroot/src/sys/kern/sys_descrip.c,v
 retrieving revision 1.40
 diff -u -r1.40 sys_descrip.c
 --- sys_descrip.c	16 Apr 2022 07:59:02 -0000	1.40
 +++ sys_descrip.c	22 Dec 2024 03:36:00 -0000
 @@ -519,7 +519,7 @@
  		printf("%s[%d]: close(%d) returned ERESTART\n",
  		    l->l_proc->p_comm, (int)l->l_proc->p_pid, fd);
  #endif
 -		error = EINTR;
 +		error = 0;
  	}

  	return error;

 where that 0 could be EINPROGRESS if anyone really believes that
 conveys any information anyone would ever use (what?  how?)

 kre

From: Taylor R Campbell <riastradh@NetBSD.org>
To: Robert Elz <kre@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/58929: POSIX.1-2024 compliance: posix_close, POSIX_CLOSE_RESTART
Date: Sun, 22 Dec 2024 04:21:24 +0000

 > Date: Sun, 22 Dec 2024 07:49:25 +0700
 > From: Robert Elz <kre@munnari.OZ.AU>
 > 
 >     Date:        Sat, 21 Dec 2024 22:07:19 +0000
 >     From:        Taylor R Campbell <riastradh@NetBSD.org>
 >     Message-ID:  <20241221220723.04ECB84E92@mail.netbsd.org>
 > 
 >   | Correction: Our current close() -- and probably all existing close()
 >   | implementations outside buggy proprietary Unix variants -- is not
 >   | compliant with POSIX.1-2024, because if some underlying I/O is
 >   | interrupted by a signal it can fail with EINTR even though the
 >   | descriptor is unconditionally closed.  That is now forbidden (emphasis
 >   | added):
 > 
 > Can it really?   I haven't looked at our implementation (kernel)
 > but it used to be the case that even if close() slept for some time,
 > (like waiting for tty output to drain) and a signal occurred, close()
 > would still return 0.

 Mount an nfs volume with `-i', open a file, unplug the network, write
 to the file, close it, and hit ^C.  I bet close(2) will fail with
 EINTR.  And I bet it will do this on most other operating systems too.

 > But if not, surely the simple change here is to make that happen, not
 > to work around it in libc - just have sys_close() never return EINTR.

 Would you like to volunteer to audit every device driver and file
 system, and every subroutine call they all make, to verify this?

 Gigantic waste of effort for a bad decision in POSIX which I'm
 inclined to say we should just flout because it's so bad.

 Current semantics -- which Linux, Solaris, and other BSDs already
 agree on -- is extremely simple and reliable: after close(d), d is
 closed.  No ifs or buts about it.  POSIX.1-2024 semantics is
 hopelessly confusing for users and implementors alike, and is
 incompatible with the vast majority of existing practice -- a mistake.

 If it was really necessary for some political reason not to simply
 endorse the simple rule `close() is final' (much like fclose() is
 already specified to be final), then POSIX could have added a simple
 feature macro like

 #define	_POSIX_CLOSE_IS_FINAL,

 or
 #define	_POSIX_CLOSE_IS_UNRELIABLE
 to accommodate the buggy proprietary Unix systems that motivated this
 whole mess.  Why should everyone else have to change their semantics
 of closing a file descriptor to accommodate their design bugs?

From: Robert Elz <kre@munnari.OZ.AU>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/58929: POSIX.1-2024 compliance: posix_close, POSIX_CLOSE_RESTART
Date: Sun, 22 Dec 2024 11:50:26 +0700

     Date:        Sun, 22 Dec 2024 04:21:24 +0000
     From:        Taylor R Campbell <riastradh@NetBSD.org>
     Message-ID:  <20241222042128.8CA8185574@mail.netbsd.org>

   | Mount an nfs volume with `-i', open a file, unplug the network, write
   | to the file, close it, and hit ^C.  I bet close(2) will fail with
   | EINTR.

 It quite likely might, but it shouldn't.   EINTR is supposed to mean
 "do it again (if you want to)" - or more verbosely "While waiting to
 perform the operation you requested, a signal was sent to you.  Rather
 than keep that signal pending for perhaps a long time, we are abandoning
 your request, so you can process the signal.  After you have done that,
 should you still desire to perform this operation, repeat the call".

 Note that in cases like read(), write() etc, once a single byte has
 been transferred to/from a device (or network connection, etc) EINTR
 is no longer allowed, rather the call returns successfully with a
 short count - as simply repeating the call would result in lost/duplicated
 data.   EINTR happens only when the system call hasn't changed the
 state of the process, or should.

   | Would you like to volunteer to audit every device driver and file
   | system, and every subroutine call they all make, to verify this?

 No, nor do I need to.   I just looked at the close system call, and
 it does, as you say, and has more or less been true forever:

   | Current semantics -- which Linux, Solaris, and other BSDs already
   | agree on -- is extremely simple and reliable: after close(d), d is
   | closed.

 Exactly.   The close has happened.   So why are we returning EINTR ?

 That makes no sense at all.

   | No ifs or buts about it.  POSIX.1-2024 semantics is
   | hopelessly confusing for users and implementors alike, and is
   | incompatible with the vast majority of existing practice -- a mistake.

 It is attempting (like almost always) to inform users of what they
 may encounter, while still making the system robust.

 For any system with the "close is final" semantic, it is really very
 simple, just don't ever return EINTR (which would indicate the close
 didn't occur) and all is good.

   | Why should everyone else have to change their semantics
   | of closing a file descriptor to accommodate their design bugs?

 No-one has to change (almost) anything, except to avoid returning an
 obviously bogus error code.

 What do you expect application code that receives EINTR from close()
 to do with that information as things are currently?

 kre

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2024 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.