NetBSD Problem Report #51024

From www@NetBSD.org  Wed Mar 30 09:55:08 2016
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 44AA47A46A
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 30 Mar 2016 09:55:08 +0000 (UTC)
Message-Id: <20160330095507.640247A64F@mollari.NetBSD.org>
Date: Wed, 30 Mar 2016 09:55:07 +0000 (UTC)
From: joerg.schilling@fokus.fraunhofer.de
Reply-To: joerg.schilling@fokus.fraunhofer.de
To: gnats-bugs@NetBSD.org
Subject: NetBSD misses the mandatory waitid() syscall
X-Send-Pr-Version: www-1.0

>Number:         51024
>Category:       standards
>Synopsis:       NetBSD misses the mandatory waitid() syscall
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    standards-manager
>State:          closed
>Class:          change-request
>Submitter-Id:   net
>Arrival-Date:   Wed Mar 30 10:00:00 +0000 2016
>Closed-Date:    Fri Nov 05 06:09:52 +0000 2021
>Last-Modified:  Fri Nov 05 06:09:52 +0000 2021
>Originator:     Jörg Schilling
>Release:        any
>Organization:
Fraunhofer FOKUS
>Environment:
any
>Description:
Since the mid-1990s, the waitid() call (which first appeared in 1989
with SVr4) has been part of the Single UNIX Specification.

This call is missing in NetBSD even though this is a mandatory interface.

Note that the relevant standard text was correct when the interface
was introduced into the standard in the mid-1990s. Unfortunately,
this correct text was damaged in the late 1990s. The related POSIX
standard was fixed in 2012 and the new text was approved in 2013.

See the related entry in the bugtracking system:

http://austingroupbugs.net/view.php?id=594

BTW: FreeBSD already had waitid() but implemented it incorrectly,
masking the exit code with 0xFF. FreeBSD fixed their bug in May
2015, within 20 hours of the bug report being filed.
>How-To-Repeat:
Try to compile a program that calls waitid()
>Fix:
Introduce waitid() and make sure that the siginfo_t * delivers 
the full 32 bits from the exit() call of the child to waitid()
and the SIGCHLD signal handler.
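
The requested behavior can be checked with a small probe. The following is a minimal sketch, not part of the original report: the helper name probe_waitid() and the test value are illustrative assumptions. It forks a child that exits with a value wider than 8 bits and returns what waitid() reports in siginfo_t.

```c
#define _XOPEN_SOURCE 700	/* request the POSIX.1-2008 interfaces */
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that calls _exit(code), reap it
 * with waitid(), and store what the kernel reported in *si_out.
 * Returns 0 on success, -1 on error.
 */
int probe_waitid(int code, siginfo_t *si_out)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);	/* child: exit with the requested value */

	memset(si_out, 0, sizeof(*si_out));
	if (waitid(P_PID, (id_t)pid, si_out, WEXITED) < 0)
		return -1;
	return 0;
}

/*
 * Returns 1 if si_status carried the whole exit value, 0 if it was
 * altered (for example masked to the low 8 bits), -1 on error.
 */
int waitid_keeps_full_value(int code)
{
	siginfo_t si;

	if (probe_waitid(code, &si) < 0 || si.si_code != CLD_EXITED)
		return -1;
	return si.si_status == code;
}
```

Calling waitid_keeps_full_value(0x12345678) from a test program returns 1 on a system that behaves as requested in this report and 0 on one that truncates the value.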

>Release-Note:

>Audit-Trail:

State-Changed-From-To: open->feedback
State-Changed-By: wiz@NetBSD.org
State-Changed-When: Mon, 04 Apr 2016 11:25:27 +0000
State-Changed-Why:
Christos fixed this, can you confirm it's conformant now?


From: Joerg Schilling <Joerg.Schilling@fokus.fraunhofer.de>
To: <wiz@netbsd.org>, <standards-manager@netbsd.org>, <netbsd-bugs@netbsd.org>,
        <gnats-bugs@netbsd.org>, <gnats-admin@netbsd.org>
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Mon, 4 Apr 2016 14:48:49 +0200

 <wiz@netbsd.org> wrote:

 > Synopsis: NetBSD misses the mandatory waitid() syscall
 >
 > State-Changed-From-To: open->feedback
 > State-Changed-By: wiz@NetBSD.org
 > State-Changed-When: Mon, 04 Apr 2016 11:25:27 +0000
 > State-Changed-Why:
 > Christos fixed this, can you confirm it's conformant now?

 Thank you for your work!

 I do not have netbsd installed, so I checked the sources.

 1)	I do not see any changes in the signal handling.
 	Is siginfo_t filled in correctly for the signal handler
 	that handles SIGCHLD? It should deliver the same 
 	information as waitid() delivers, including the full 32 bits
 	from the exit() call in the child.

 2)	kern_exit.c does not look correct (complete).
 	If you really like to keep the deprecated wait status in the kernel,
 	you should use a more complete conversion code to convert the
 	deprecated wait status into siginfo_t. A correct piece of code
 	is in the compatibility code of the recent Bourne Shell:

 		http://schilytools.sourceforge.net/bosh.html

 	Source code is in the schily tools:

 		https://sourceforge.net/projects/schilytools/files/

 	Use the latest 'schily-*' tar archive.

 	Check the bottom of the file sh/jobs.c


 3)	From what I can tell, the new netbsd waitid() code only delivers
 	the low 8 bits from the exit() parameter. This is incorrect.

 	waitid() was introduced in 1989 by AT&T for SVr4 and is defined to
 	deliver all 32 bits from the exit() call of the child.

 	Given that POSIX does not introduce its own definitions but just
 	standardizes existing implementations, it should be obvious that
 	the SVID3 interface definitions apply to the POSIX waitid() call.

 	As a hint, here is the currently valid relevant POSIX text:

 	**********
 	    the least significant 8 bits (that is, status & 0377) shall be 
 	    available from wait() and waitpid(); the full value shall 
 	    be available from waitid() and in the siginfo_t passed to a 
 	    signal handler for SIGCHLD. 
 	**********

 I hope that you are interested in full POSIX compliance with waitid() as
 waitid() is part of the basic set of interfaces. 

 Jörg

 -- 
  EMail:joerg@schily.net                    (home) Jörg Schilling D-13353 Berlin
        joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
  URL:  http://cdrecord.org/private/ http://sourceforge.net/projects/schilytools/files/'

From: christos@zoulas.com (Christos Zoulas)
To: Joerg Schilling <Joerg.Schilling@fokus.fraunhofer.de>, <wiz@netbsd.org>, 
	<standards-manager@netbsd.org>, <netbsd-bugs@netbsd.org>, 
	<gnats-bugs@netbsd.org>, <gnats-admin@netbsd.org>
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Mon, 4 Apr 2016 10:08:07 -0400

 On Apr 4,  2:48pm, Joerg.Schilling@fokus.fraunhofer.de (Joerg Schilling) wrote:
 -- Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall

 | Thank you for your work!
 | 
 | I do not have netbsd installed, so I checked the sources.
 | 
 | 1)	I do not see any changes in the signal handling.
 | 	Is siginfo_t filled correctly in for the signalhandler
 | 	that handles SIGCHLD? It should deliver the same 
 | 	information as waitid() delivers, including the full 32 bits
 | 	from the exit() call in the child.

 Do you have the reference where it says it needs to preserve all 32 bits
 of exit code? I have not found one.

 | 2)	kern_exit.c does not look correct (complete).
 | 	If you really like to keep the deprecated wait status in the kernel,
 | 	you should use a more complete conversion code to convert the
 | 	deprecated wait status into siginfo_t. A correct piece of code
 | 	is in the compatibility code of the recent Bourne Shell:
 | 
 | 		http://schilytools.sourceforge.net/bosh.html
 | 
 | 	Source code is in the schily tools:
 | 
 | 		https://sourceforge.net/projects/schilytools/files/
 | 
 | 	Use latest *schily-*' tar archive.
 | 
 | 	Check the bottom of the file sh/jobs.c

 I'll take a look.

 | 3)	From what I can tell, the new netbsd waitid() code only delivers
 | 	the low 8 bits from the exit() parameter. This is incorrect.
 | 
 | 	waitid() was introduced in 1989 by AT&T for SVr4 and defines to
 | 	deliver all 32 bits from the exit() call of the child.
 | 
 | 	Given that POSIX does not introduce own definitions but just
 | 	standardizes existing implementations, it should be obvious that
 | 	the SVID3 interface definitions apply to the POSIX waitid() call.
 | 
 | 	As a hint, here is the currently valid relevant POSIX text:
 | 
 | 	**********
 | 	    the least significant 8 bits (that is, status & 0377) shall be 
 | 	    available from wait() and waitpid(); the full value shall 
 | 	    be available from waitid() and in the siginfo_t passed to a 
 | 	    signal handler for SIGCHLD. 
 | 	**********
 | 
 | I hope that you are interested in full POSIX compliance with waitid() as
 | waitid() is part of the basic set of interfaces. 

 Yes, and I know how to do that; the question is, where is it spelled in
 the docs that the full 32 bits of exit code need to be preserved?

 christos

From: christos@zoulas.com (Christos Zoulas)
To: Joerg Schilling <Joerg.Schilling@fokus.fraunhofer.de>, <wiz@netbsd.org>, 
	<standards-manager@netbsd.org>, <netbsd-bugs@netbsd.org>, 
	<gnats-bugs@netbsd.org>, <gnats-admin@netbsd.org>
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Mon, 4 Apr 2016 11:21:03 -0400

 On Apr 4, 10:08am, christos@zoulas.com (Christos Zoulas) wrote:
 -- Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall

 | Do you have the reference where it says it needs to preserve all 32 bits
 | of exit code? I have not found one.

 The reason I am asking all this is because although FreeBSD has made the
 change to pass all 32 bits in this particular instance, they have overloaded
 the meaning of si_status, so that now the user code needs to understand
 what context we are in to interpret it properly. I.e.

 - it could be a full 32 bit exit code
 - it could be a signal number
 - it could be a composite 'wait status' (as it is always now with NetBSD,
   except in the waitid case...).

 kern_exit.c:                    siginfo->si_status = WTERMSIG(p->p_xsig);
 kern_exit.c:                    siginfo->si_status = WTERMSIG(p->p_xsig);
 kern_exit.c:                    siginfo->si_status = p->p_xexit;
 kern_exit.c:                            siginfo->si_status = p->p_xsig;
 kern_exit.c:                            siginfo->si_status = p->p_xsig;
 kern_exit.c:                            siginfo->si_status = SIGCONT;
 kern_sig.c:             p->p_ksi->ksi_status = status;

 christos

From: Joerg Schilling <Joerg.Schilling@fokus.fraunhofer.de>
To: <wiz@netbsd.org>, <standards-manager@netbsd.org>, <netbsd-bugs@netbsd.org>,
        <gnats-bugs@netbsd.org>, <gnats-admin@netbsd.org>,
        <christos@zoulas.com>
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Mon, 4 Apr 2016 17:49:34 +0200

 Christos Zoulas <christos@zoulas.com> wrote:

 > On Apr 4, 10:08am, christos@zoulas.com (Christos Zoulas) wrote:
 > -- Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall
 >
 > | Do you have the reference where it says it needs to preserve all 32 bits
 > | of exit code? I have not found one.
 >
 > The reason I am asking all this is because although FreeBSD has made the
 > change to pass all 32 bits in this particular instance, they have overloaded
 > the meaning of si_status, so that now the user code needs to understand
 > what context we are in to interpret it properly. I.e.
 >
 > - it could be a full 32 bit exit code
 > - it could be a signal number
 > - it could be a composite 'wait status' (as it is always now with NetBSD,
 >   except in the waitid case...).

 Does the last apply to the signal handler for SIGCHLD only?
 It is wrong in any case. Only the first two cases are correct.

 > kern_exit.c:                    siginfo->si_status = WTERMSIG(p->p_xsig);
 > kern_exit.c:                    siginfo->si_status = WTERMSIG(p->p_xsig);
 > kern_exit.c:                    siginfo->si_status = p->p_xexit;
 > kern_exit.c:                            siginfo->si_status = p->p_xsig;
 > kern_exit.c:                            siginfo->si_status = p->p_xsig;
 > kern_exit.c:                            siginfo->si_status = SIGCONT;
 > kern_sig.c:             p->p_ksi->ksi_status = status;

 Thank you for this hint, it seems that I would need to check FreeBSD. 
 Fortunately I have access to an installation with a recent version.

 si_status holds the exit() parameter in case of a normal exit (CLD_EXITED).

 It holds the signal number in case the child has been terminated as a result of 
 a signal: CLD_DUMPED, CLD_KILLED, CLD_TRAPPED, CLD_STOPPED. 

 In case of CLD_CONTINUED, the status is 0.

 So in general: si_status either contains the exit code or a signal number.
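
As a rough illustration of the rule just stated, the sketch below shows how user code might interpret si_status based on si_code. The function name describe_child() and the message wording are assumptions made for this example; only the CLD_* codes come from the discussion above.

```c
#define _XOPEN_SOURCE 700	/* request the POSIX.1-2008 interfaces */
#include <signal.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a description of a child-state
 * siginfo_t into buf and return buf. si_status is treated as an exit
 * value only for CLD_EXITED; for the signal-related codes it is
 * treated as a signal number.
 */
const char *describe_child(const siginfo_t *si, char *buf, size_t len)
{
	switch (si->si_code) {
	case CLD_EXITED:
		snprintf(buf, len, "exited, exit value %d", si->si_status);
		break;
	case CLD_KILLED:
		snprintf(buf, len, "killed by signal %d", si->si_status);
		break;
	case CLD_DUMPED:
		snprintf(buf, len, "killed by signal %d (core dumped)",
		    si->si_status);
		break;
	case CLD_TRAPPED:
		snprintf(buf, len, "trapped by signal %d", si->si_status);
		break;
	case CLD_STOPPED:
		snprintf(buf, len, "stopped by signal %d", si->si_status);
		break;
	case CLD_CONTINUED:
		/* si_status is deliberately not interpreted here. */
		snprintf(buf, len, "continued");
		break;
	default:
		snprintf(buf, len, "unknown si_code %d", si->si_code);
		break;
	}
	return buf;
}
```

The caller never needs WEXITSTATUS() or WTERMSIG() here; si_code alone decides how si_status is read.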

 See the waitid() emulation in sh/jobs.c. The exit code of course cannot be 
 retrieved from the historical wait() status value as this only holds the low 8 
 bits. What you can see from the emulation code in sh/jobs.c is what is expected 
 to be under what condition in si_code.

 If you ever implement more than 255 signals, WTERMSIG() will not work either.

 Jörg

 -- 
  EMail:joerg@schily.net                    (home) Jörg Schilling D-13353 Berlin
        joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
  URL:  http://cdrecord.org/private/ http://sourceforge.net/projects/schilytools/files/'

From: Joerg Schilling <Joerg.Schilling@fokus.fraunhofer.de>
To: <wiz@netbsd.org>, <standards-manager@netbsd.org>, <netbsd-bugs@netbsd.org>,
        <gnats-bugs@netbsd.org>, <gnats-admin@netbsd.org>,
        <christos@zoulas.com>
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Mon, 4 Apr 2016 16:56:16 +0200

 Christos Zoulas <christos@zoulas.com> wrote:

 > On Apr 4,  2:48pm, Joerg.Schilling@fokus.fraunhofer.de (Joerg Schilling) wrote:
 > -- Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall
 >
 > | Thank you for your work!
 > | 
 > | I do not have netbsd installed, so I checked the sources.
 > | 
 > | 1)	I do not see any changes in the signal handling.
 > | 	Is siginfo_t filled correctly in for the signalhandler
 > | 	that handles SIGCHLD? It should deliver the same 
 > | 	information as waitid() delivers, including the full 32 bits
 > | 	from the exit() call in the child.
 >
 > Do you have the reference where it says it needs to preserve all 32 bits
 > of exit code? I have not found one.

 The description is in the current POSIX standard, which is called
 	"ISSUE 7 tc2".

 The text applies to (_Exit() and _exit()) starting at page 585
 line 19365 and on page 827 starting at line 26976 (this is the exit() 
 description):

     The value of status may be 0, EXIT_SUCCESS, EXIT_FAILURE, or any other 
     value, though only the least significant 8 bits (that is, status & 0377)
     shall be available from wait() and waitpid(); the full value shall
     be available from waitid() and in the siginfo_t passed to a
     signal handler for SIGCHLD.

 See also http://austingroupbugs.net/view.php?id=594 that was mentioned in the 
 original bug report. The original text from SUSV1 was correct, but a bug has 
 been introduced in the POSIX text in the late 1990s. This bug was fixed on 
 August 8, 2012. Given that POSIX standardized the AT&T waitid() interface from
 1989, waitid() has always been required to return all 32 bits from exit().

 > | 3)	From what I can tell, the new netbsd waitid() code only delivers
 > | 	the low 8 bits from the exit() parameter. This is incorrect.
 > | 
 > | 	waitid() was introduced in 1989 by AT&T for SVr4 and defines to
 > | 	deliver all 32 bits from the exit() call of the child.
 > | 
 > | 	Given that POSIX does not introduce own definitions but just
 > | 	standardizes existing implementations, it should be obvious that
 > | 	the SVID3 interface definitions apply to the POSIX waitid() call.
 > | 
 > | 	As a hint, here is the currently valid relevant POSIX text:
 > | 
 > | 	**********
 > | 	    the least significant 8 bits (that is, status & 0377) shall be 
 > | 	    available from wait() and waitpid(); the full value shall 
 > | 	    be available from waitid() and in the siginfo_t passed to a 
 > | 	    signal handler for SIGCHLD. 
 > | 	**********
 > | 
 > | I hope that you are interested in full POSIX compliance with waitid() as
 > | waitid() is part of the basic set of interfaces. 
 >
 > Yes, and I know how to do that; the question is, where is it spelled in
 > the docs that the full 32 bits of exit code need to be preserved?

 See above, I mentioned the current text already in the original report and in 
 my last mail.

 For people who are not actively involved in the standardization, the current 
 text will be available as a whole approximately in mid-summer 2016. This is 
 because the official procedure requires a review by IEEE after the Open Group 
 has presented a new standard.

 Jörg

 -- 
  EMail:joerg@schily.net                    (home) Jörg Schilling D-13353 Berlin
        joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
  URL:  http://cdrecord.org/private/ http://sourceforge.net/projects/schilytools/files/'

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Wed, 20 Apr 2016 08:50:11 +0000

 On Mon, Apr 04, 2016 at 04:10:01PM +0000, Joerg Schilling wrote:
  >    The value of status may be 0, EXIT_SUCCESS, EXIT_FAILURE, or any other 
  >    value, though only the least significant 8 bits (that is, status & 0377)
  >    shall be available from wait() and waitpid(); the full value shall
  >    be available from waitid() and in the siginfo_t passed to a
  >    signal handler for SIGCHLD.

 Am I supposed to interpret this to *prohibit* wait() from returning
 more than 8 bits of status?

 That is... stupid.

 -- 
 David A. Holland
 dholland@netbsd.org

From: Joerg Schilling <Joerg.Schilling@fokus.fraunhofer.de>
To: <standards-manager@netbsd.org>, <netbsd-bugs@netbsd.org>,
        <gnats-bugs@netbsd.org>, <gnats-admin@netbsd.org>
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Wed, 20 Apr 2016 11:45:15 +0200

 David Holland <dholland-bugs@netbsd.org> wrote:

 > The following reply was made to PR standards/51024; it has been noted by GNATS.
 >
 > From: David Holland <dholland-bugs@netbsd.org>
 > To: gnats-bugs@NetBSD.org
 > Cc: 
 > Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
 > Date: Wed, 20 Apr 2016 08:50:11 +0000
 >
 >  On Mon, Apr 04, 2016 at 04:10:01PM +0000, Joerg Schilling wrote:
 >   >    The value of status may be 0, EXIT_SUCCESS, EXIT_FAILURE, or any other 
 >   >    value, though only the least significant 8 bits (that is, status & 0377)
 >   >    shall be available from wait() and waitpid(); the full value shall
 >   >    be available from waitid() and in the siginfo_t passed to a
 >   >    signal handler for SIGCHLD.
 >  
 >  Am I supposed to interpret this to *prohibit* wait() from returning
 >  more than 8 bits of status?

 Do you confuse "status" with the exit status of the child?

 wait() and waitpid() are deprecated historic interfaces that return a pointer 
 to 16 bits of "stat-loc" data. The interface they implement was never designed 
 to return more than 8 bits of the exit status from the child.

 The POSIX standard indeed disallows WEXITSTATUS(statloc) to return more than 8 
 bits from the child's exit code and even when you try to make use of the fact 
 that stat-loc was and is an "int", you would not be able to include more than 
 24 bits from the child's exit code in that int.

 NOTE: while the definition of the behavior for WEXITSTATUS() could have been 
 different, there are some people who are in fear that the shell could 
 unexpectedly report a value != 0 in $? by default when the child calls 
 exit(256). 

 BTW: here is the history of the interfaces:

 In 1980, a group of former AT&T employees created "Charles River Data Systems" 
 that offered "UNOS", the first UNIX clone. This OS did come with an interface 
 called "cwait":

 	int cwait(pidp,statp)
 	  int *pidp;            /* where to fill in pid */
 	  int *statp;           /* where to fill in status */

 	Returns  the  following  values, depending on  why  cwait()  came
 	back:

 	     < 0 error (no children)
 	      0  normal termination, exit code of child
 	      1  ^C
 	      2  killed
 	      3  trap, particular trap in status
 	      4  suspended, can be resumed or killed
 	      5  exec failure, standard system error code in status
 	      6  syserr, error detected in kernel, standard error code


 This interface already returned the full 32 bits from the child's exit code.

 Later, in 1989, AT&T added waitid() to UNIX for SVr4 and offered a very similar 
 interface to "cwait". It just moved the exit reason, the pid and the child exit 
 status into a structure.

 Jörg

 -- 
  EMail:joerg@schily.net                    (home) Jörg Schilling D-13353 Berlin
        joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
  URL:  http://cdrecord.org/private/ http://sourceforge.net/projects/schilytools/files/'

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, standards-manager@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, 
	joerg.schilling@fokus.fraunhofer.de
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Wed, 20 Apr 2016 08:15:10 -0400

 On Apr 20,  8:55am, dholland-bugs@netbsd.org (David Holland) wrote:
 -- Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall

 |  Am I supposed to interpret this to *prohibit* wait() from returning
 |  more than 8 bits of status?
 |  
 |  That is... stupid.

 There is no place to put it by the ABI...

 christos

From: Joerg Schilling <Joerg.Schilling@fokus.fraunhofer.de>
To: <standards-manager@netbsd.org>, <netbsd-bugs@netbsd.org>,
        <gnats-bugs@netbsd.org>, <gnats-admin@netbsd.org>
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Wed, 20 Apr 2016 14:31:43 +0200

 Christos Zoulas <christos@zoulas.com> wrote:

 > The following reply was made to PR standards/51024; it has been noted by GNATS.
 >
 > From: christos@zoulas.com (Christos Zoulas)
 > To: gnats-bugs@NetBSD.org, standards-manager@netbsd.org, 
 > 	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, 
 > 	joerg.schilling@fokus.fraunhofer.de
 > Cc: 
 > Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
 > Date: Wed, 20 Apr 2016 08:15:10 -0400
 >
 >  On Apr 20,  8:55am, dholland-bugs@netbsd.org (David Holland) wrote:
 >  -- Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall
 >  
 >  |  Am I supposed to interpret this to *prohibit* wait() from returning
 >  |  more than 8 bits of status?
 >  |  
 >  |  That is... stupid.
 >  
 >  There is no place to put it by the ABI...

 Well, not really. OS X puts 16 more bits from the child's exit code in and thus 
 returns up to 24 bits from the exit code if you shift down the upper 16 bits 
 from the int and add the result to the low 8 bits.

 Given that this is not enough for POSIX, it can be seen as a nice hack only.
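
Taking the description above at face value, the arithmetic might look like the sketch below. The bit layout is an assumption derived only from this message (low exit byte in bits 8-15 of the status int, the next 16 bits of the exit value in bits 16-31); it has not been checked against any OS X release, and exit24_from_status() is a name made up for the example.

```c
#include <stdint.h>

/*
 * Hypothetical helper: recover up to 24 bits of exit value from a
 * wait status, assuming the layout described in the message above.
 */
uint32_t exit24_from_status(uint32_t status)
{
	uint32_t low8 = (status >> 8) & 0xffu;		/* classic WEXITSTATUS() byte */
	uint32_t high16 = (status >> 16) & 0xffffu;	/* upper 16 bits of the int */

	return (high16 << 8) | low8;
}
```

Under that assumed layout a status word of 0x12345600 would yield 0x123456, which is still 8 bits short of the full value.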

 Jörg

 -- 
  EMail:joerg@schily.net                    (home) Jörg Schilling D-13353 Berlin
        joerg.schilling@fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
  URL:  http://cdrecord.org/private/ http://sourceforge.net/projects/schilytools/files/'

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Wed, 20 Apr 2016 14:06:08 +0000

 On Wed, Apr 20, 2016 at 12:20:00PM +0000, Christos Zoulas wrote:
  >  |  Am I supposed to interpret this to *prohibit* wait() from returning
  >  |  more than 8 bits of status?
  >  |  
  >  |  That is... stupid.
  >  
  >  There is no place to put it by the ABI...

 It's easy to construct a different encoding of WIF* that gives you 30
 bits of exit code. It would be stupid for that to be *against* POSIX
 rather than merely not being required.

 There isn't any compelling reason for NetBSD to change, but there are
 obvious reasons for any new implementation to favor such an encoding.
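
One way such an encoding could be laid out is sketched below. This is purely hypothetical and does not describe any existing system: a 2-bit kind tag in the low bits and a 30-bit payload above it. All hyp_* names are invented for the example, and a real design would also need room for details such as a core-dump flag.

```c
#include <stdint.h>

/* Hypothetical 2-bit tag stored in the low bits of the status word. */
enum hyp_kind {
	HYP_EXITED = 0,		/* payload is the exit value */
	HYP_SIGNALED = 1,	/* payload is the terminating signal */
	HYP_STOPPED = 2,	/* payload is the stopping signal */
	HYP_CONTINUED = 3	/* payload unused */
};

/* Pack a kind and up to 30 bits of payload into one 32-bit status. */
uint32_t hyp_encode(enum hyp_kind kind, uint32_t payload)
{
	return ((payload & 0x3fffffffu) << 2) | (uint32_t)kind;
}

/* Counterpart of the WIF* tests: which kind of status is this? */
enum hyp_kind hyp_kind_of(uint32_t status)
{
	return (enum hyp_kind)(status & 3u);
}

/* Counterpart of WEXITSTATUS()/WTERMSIG(): the 30-bit payload. */
uint32_t hyp_payload(uint32_t status)
{
	return status >> 2;
}
```

With this layout hyp_payload(hyp_encode(HYP_EXITED, 0x12345678)) gives back 0x12345678, while values above 0x3fffffff lose their top two bits.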

 Wouldn't exactly be the first POSIX_MISTAKE though.

 -- 
 David A. Holland
 dholland@netbsd.org

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Thu, 21 Apr 2016 08:47:18 +0700

     Date:        Wed, 20 Apr 2016 08:55:01 +0000 (UTC)
     From:        David Holland <dholland-bugs@netbsd.org>
     Message-ID:  <20160420085501.749AD7AA93@mollari.NetBSD.org>

   |  Am I supposed to interpret this to *prohibit* wait() from returning
   |  more than 8 bits of status?
   |  
   |  That is... stupid.

 In most situations I would agree with that reasoning, but here I am not
 sure that I do.

 We need to work out just what the status is for, and how it is intended
 to be used.

 Basic POSIX allows just two values to be passed to exit() (by applications)
 EXIT_SUCCESS and EXIT_FAILURE (strictly, 3, with 0 as well, but if there's
 a system where EXIT_SUCCESS and 0 are not the same thing I have yet to see it.)

 Allowing applications to use other values (even just another 254 of them)
 is an extension (one of the posix options.)


 If the exit status is just to say whether a program worked or not, then
 two values is enough.   It is nice to have a few more than that so it
 is easy for parent processes (especially shell scripts) to determine what
 kind of "not worked" it was, when a program failed - eg: a program like
 grep can "fail" because the pattern was not in the file(s), or because
 one or more files could not be opened, or because of a regular expression
 syntax error (or illegal option or similar.)

 However, more than a small set of such different codes would soon become
 unmanageable - about the biggest rational set I have seen are the
 <sysexits.h> values that sendmail originally introduced (or maybe delivermail
 before it).   There are 15 of them (plus "ok").

 So, we need to ask ourselves, just what is the point of allowing more than
 a couple of hundred exit codes (or even that many) ?

 No-one I have asked has ever supplied a reason why they'd rationally need
 more than 10 or 15.

 If we start having programs generate zillions of different error exit codes
 then we might end up with a system that does like some others I could
 name (but won't) where the program (or shell equivalent) ends up printing
 something like "Error 12073" ... and then the user has to go look up
 the value in a list somewhere to find out what that really means.

 The other reason one might decide to use more exit codes, is to stop using
 them to indicate success/failure status and instead start using them to
 return answers.   But if we do that, are 32 bit integers really enough?
 Surely we really need at least 64 bits these days (32 might have been enough
 back in the 1980's, but now?)   And what about floats - or character strings?
 Or even general structs.   Clearly we would need a much richer exit value
 mechanism than just a simple integer....   One might even suggest that one
 exit value might be an array of char strings with a name=value type syntax,
 so it can be used by programs to export values back to the parent's 
 environment.

 The thought of any of this simply makes me ill.

 Hence, actually prohibiting systems from implementing more than a reasonably
 small set of exit values, such that they can only rationally be used for
 either success, or a fairly generic failure type, is not something I
 necessarily see as a mistake, or stupid - it might actually be enlightened.

 kre


From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Mon, 30 May 2016 04:07:08 +0000

 On Thu, Apr 21, 2016 at 01:50:00AM +0000, Robert Elz wrote:
  >    |  Am I supposed to interpret this to *prohibit* wait() from returning
  >    |  more than 8 bits of status?
  >    |  
  >    |  That is... stupid.
  >  
  >  In most situations I would agree with that reasoning, but here I am not
  >  sure that I do.
  >  
  >  We need to work out just what the status is for, and how it is intended
  >  to be used.
  >  
  >  Basic POSIX allows just two values to be passed to exit() (by applications)
  >  EXIT_SUCCESS and EXIT_FAILURE (strictly, 3, with 0 as well, but if there's
  >  a system where EXIT_SUCCESS and 0 are not the same thing I have
  >  yet to see it.)

 That would be VMS...

  >  Allowing applications to use other values (even just another 254 of them)
  >  is an extension (one of the posix options.)
  >  [...]
  >  So, we need to ask ourselves, just what is the point of allowing more than
  >  a couple of hundred exit codes (or even that many) ?

 It allows you to use them for signalling. For example (offhand, so may
 not be that persuasive if examined too closely) given some kind of
 preforking httpd that runs application-server things with exec, you
 might conceivably define the exit codes as
    0 - everything went fine, nothing to do
    1 - something went wrong
    2 - something went badly wrong
    3+ - everything went fine; you should close this fd

 Yes, this can become gross, and if you want to generate significant
 output that's what stdout (or other things) is for, but there's no
 reason a cooperating subnet of processes can't do productive things
 even if it doesn't make sense in the general case.

 A perhaps more persuasive argument is: we provide an integer result
 argument. Is there any reason *to* arbitrarily truncate it, rather
 than delivering the value as submitted or as much of it as can
 be carried through readily? I can't think of any.

 One could also make an argument that like errno is a special case of
 an exception, an exit code is also a special case of an exception and
 there should be a way to throw system-level exceptions out of
 programs. That's definitely into crazy talk though :-)

 -- 
 David A. Holland
 dholland@netbsd.org

State-Changed-From-To: feedback->open
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Wed, 09 Jun 2021 02:43:27 +0000
State-Changed-Why:
This PR has been in feedback since 2016 and the feedback mails have started
bouncing.
Some feedback was received, but was not clearly actionable and it's not
entirely clear what the current state is.


From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: standards/51024 (NetBSD misses the mandatory waitid() syscall)
Date: Wed, 09 Jun 2021 16:41:26 +0700

     Date:        Wed,  9 Jun 2021 02:43:27 +0000 (UTC)
     From:        dholland@NetBSD.org
     Message-ID:  <20210609024327.D54F71A9239@mollari.NetBSD.org>

   | This PR has been in feedback since 2016 and the feedback mails have started
   | bouncing.

 Joerg retired... change the address to schily@schily.net and the bounces
 should stop.

   | Some feedback was received, but was not clearly actionable and it's not
   | entirely clear what the current state is.

 My memory is that everything is "fixed" (if that really is an appropriate
 description) but I'm not sure anyone has taken the time to verify it.

 kre

State-Changed-From-To: open->closed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Fri, 05 Nov 2021 06:09:52 +0000
State-Changed-Why:
This is believed fixed.  waitid() exists and is
believed correct.   No more feedback is possible.


>Unformatted:
