NetBSD Problem Report #59498

From www@netbsd.org  Tue Jul  1 16:03:46 2025
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
	 client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id B30011A923C
	for <gnats-bugs@gnats.NetBSD.org>; Tue,  1 Jul 2025 16:03:46 +0000 (UTC)
Message-Id: <20250701160345.825FD1A923E@mollari.NetBSD.org>
Date: Tue,  1 Jul 2025 16:03:45 +0000 (UTC)
From: rbranco@suse.de
Reply-To: rbranco@suse.de
To: gnats-bugs@NetBSD.org
Subject: Add missing POSIX O_CLOFORK flag
X-Send-Pr-Version: www-1.0

>Number:         59498
>Category:       kern
>Synopsis:       Add missing POSIX O_CLOFORK flag
>Confidential:   no
>Severity:       non-critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jul 01 16:05:00 +0000 2025
>Last-Modified:  Fri Jul 11 02:40:01 +0000 2025
>Originator:     Ricardo Branco
>Release:        NetBSD 10.99.14
>Organization:
>Environment:
>Description:
Add missing POSIX O_CLOFORK flag

It's 99% done. The TODO bits are documented. Mostly the manpages.
>How-To-Repeat:

>Fix:
https://github.com/NetBSD/src/pull/53

>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Wed, 02 Jul 2025 00:13:07 +0700

     Date:        Tue,  1 Jul 2025 16:05:00 +0000 (UTC)
     From:        rbranco@suse.de
     Message-ID:  <20250701160500.760D61A9242@mollari.NetBSD.org>

 Have you investigated how many (if any, I haven't looked) of
 the applications in NetBSD simply assume that fcntl(fd, F_GETFD)
 returning non-zero means that close-on-exec is set ?

 That's what I always considered the most difficult part of
 implementing O_CLOFORK -- the alteration of this fairly long
 held assumption.

 kre

From: Ricardo Branco <rbranco@suse.de>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Tue, 1 Jul 2025 19:20:47 +0200

 On 7/1/25 7:15 PM, Robert Elz via gnats wrote:
 > The following reply was made to PR kern/59498; it has been noted by GNATS.
 >
 > From: Robert Elz <kre@munnari.OZ.AU>
 > To: gnats-bugs@netbsd.org
 > Cc:
 > Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
 > Date: Wed, 02 Jul 2025 00:13:07 +0700
 >
 >       Date:        Tue,  1 Jul 2025 16:05:00 +0000 (UTC)
 >       From:        rbranco@suse.de
 >       Message-ID:  <20250701160500.760D61A9242@mollari.NetBSD.org>
 >   
 >   Have you investigated how many (if any, I haven't looked) of
 >   the applications in NetBSD simply assume that fcntl(fd, F_GETFD)
 >   returning non-zero means that close-on-exec is set ?
 >   
 >   That's what I always considered the most difficult part of
 >   implementing O_CLOFORK -- the alteration of this fairly long
 >   held assumption.
 >   
 >   kre
 >   

 In this implementation, O_CLOFORK is cleared on exec, so
 this point is moot for applications.

 Library code needs minor fixing, though.  That can be done
 and should be done.

 Best,
 Ricardo.

From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Tue, 01 Jul 2025 14:25:38 -0400

 In addition to kre@'s comments about reviewing fcntl(F_GETFD),
 I would add a fd_set_fdflags_from_oflags() that does:

        fd_set_fdflags(curlwp, newfd,
 	    ((flags & O_CLOEXEC) ? FD_CLOEXEC : 0) |
 	    ((flags & O_CLOFORK) ? FD_CLOFORK : 0));

 since this is repeated a bunch of times.

 christos

From: Robert Elz <kre@munnari.OZ.AU>
To: Ricardo Branco <rbranco@suse.de>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Wed, 02 Jul 2025 02:05:35 +0700

     Date:        Tue, 1 Jul 2025 19:20:47 +0200
     From:        Ricardo Branco <rbranco@suse.de>
     Message-ID:  <78c4992b-b5b6-493b-8eb2-594df8990ad6@suse.de>

   | In this implementation, O_CLOFORK is cleared on exec,

 That's fine, but not related to what I asked.

 Typically O_CLOFORK is set by library functions to guard against
 possible other threads forking while a temporary fd is open.
 Those temporary fd's can last for noticeable time, and can be
 revealed to the application code.

 The application might want to see if close-on-exec has been set
 for the fd (for some reason) and use fcntl(F_GETFD) to do it, and
 never having heard of O_CLOFORK (or FD_CLOFORK) simply assumes that
 the non-zero return means O_CLOEXEC is set on the fd.

 I think we need an audit of applications (which includes library code
 that they might call) to examine all fcntl(F_GETFD) (and fcntl(F_SETFD))
 calls, and make sure they are doing the right thing, before the
 O_CLOFORK mechanism is exposed to user space in any way (it doesn't hurt
 to have it in the kernel, as long as nothing, tests excepted, ever
 sets it).

 kre


From: Ricardo Branco <rbranco@suse.de>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Tue, 1 Jul 2025 21:16:22 +0200

 On 7/1/25 9:05 PM, Robert Elz wrote:
 >      Date:        Tue, 1 Jul 2025 19:20:47 +0200
 >      From:        Ricardo Branco <rbranco@suse.de>
 >      Message-ID:  <78c4992b-b5b6-493b-8eb2-594df8990ad6@suse.de>
 >
 >    | In this implementation, O_CLOFORK is cleared on exec,
 >
 > That's fine, but not related to what I asked.
 >
 > Typically O_CLOFORK is set by library functions to guard against
 > possible other threads forking while a temporary fd is open.
 > Those temporary fd's can last for noticeable time, and can be
 > revealed to the application code.
 >
 > The application might want to see if close-on-exec has been set
 > for the fd (for some reason) and use fcntl(F_GETFD) to do it, and
 > never having heard of O_CLOFORK (or FD_CLOFORK) simply assumes that
 > the non-zero return means O_CLOEXEC is set on the fd.
 >
 > I think we need an audit of applications (which includes library code
 > that they might call) to examine all fcntl(F_GETFD) (and fcntl(F_SETFD))
 > calls, and make sure they are doing the right thing, before the
 > O_CLOFORK mechanism is exposed to user space in any way (it doesn't hurt
 > to have it in the kernel, as long as nothing, tests excepted, ever
 > sets it).
 >
 > kre

 That sounds like a reasonable compromise.

 FWIW, FreeBSD recently introduced the FD_RESOLVE_BENEATH flag.

 I identified a number of places where FD_CLOEXEC is naively set
 without ORing with the results from F_GETFD, but first wanted to
 check whether you want to support this flag at all.


From: Robert Elz <kre@munnari.OZ.AU>
To: Ricardo Branco <rbranco@suse.de>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Wed, 02 Jul 2025 03:32:33 +0700

     Date:        Tue, 1 Jul 2025 21:16:22 +0200
     From:        Ricardo Branco <rbranco@suse.de>
     Message-ID:  <c1904fad-a790-48e0-aaa3-781e60541686@suse.de>

   | I identified a number of places where FD_CLOEXEC is naively set
   | without ORing with the results from F_GETFD, but first wanted to
   | check whether you want to support this flag at all.

 O_CLOFORK (and FD_CLOFORK) - yes certainly, it has to happen, we just
 need to do it a bit carefully, as for a long time (like ever since
 fcntl(F_GETFD) was invented) it has been "known" that the only flag
 at that level was close-on-exec (the only one which belongs to the
 process's open file table, rather than the system's global open file
 table).   So we need to avoid breaking things which currently assume
 that (by fixing them) when this appears - ideally before they break.

 kre



From: Ricardo Branco <rbranco@suse.de>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
 gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Thu, 10 Jul 2025 15:13:49 +0200

 On 7/1/25 8:35 PM, Christos Zoulas via gnats wrote:
 > The following reply was made to PR kern/59498; it has been noted by GNATS.
 >
 > From: Christos Zoulas <christos@zoulas.com>
 > To: gnats-bugs@netbsd.org
 > Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
 >   netbsd-bugs@netbsd.org
 > Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
 > Date: Tue, 01 Jul 2025 14:25:38 -0400
 >
 >   In addition to kre@'s comments about reviewing fcntl(F_GETFD),
 >   I would add a fd_set_fdflags_from_oflags() that does:
 >   
 >          fd_set_fdflags(curlwp, newfd,
 >   	    ((flags & O_CLOEXEC) ? FD_CLOEXEC : 0) |
 >   	    ((flags & O_CLOFORK) ? FD_CLOFORK : 0));
 >   
 >   since this is repeated a bunch of times.
 >   
 >   christos
 >   

 Now that it's merged in both FreeBSD & DragonflyBSD, and
 with the OpenBSD version being taken care of by Theo Buehler,
 I can devote attention to this one.

 To me it's the most problematic because I had to change a
 bool field in a struct to u_char, and also because the dup3()
 implementation is or was reportedly broken:
 https://github.com/NetBSD/src/blob/trunk/sys/sys/filedesc.h#L109

 I added tests, updated manpages and will work on it over the
 weekend.

 Best,
 Ricardo

From: Robert Elz <kre@munnari.OZ.AU>
To: Ricardo Branco <rbranco@suse.de>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Fri, 11 Jul 2025 07:27:42 +0700

     Date:        Thu, 10 Jul 2025 15:13:49 +0200
     From:        Ricardo Branco <rbranco@suse.de>
     Message-ID:  <330bc7f1-ff17-45c5-b211-6b8cf1844d9d@suse.de>

   | To me it's the most problematic because I had to change a
   | bool field in a struct to u_char,

 That, or you could just add a new bool field.   The following
 field is an int, so there's padding between the 2 bool fields
 that start the struct fdfile, and the ff_refcnt field anyway.
 Add a 3rd bool (between ff_allocated and ff_refcnt, so it goes
 into the 2 byte padding space) and I think that means that we
 won't need a kernel revbump for this addition (unless some other
 change elsewhere requires it).

   | and also because the dup3()
   | implementation is or was reportedly broken:

 dup3() had some changes last year, but that seems like a minor
 issue, and would be unrelated to what you're doing.   What is
 (or was) the problem supposed to be?

 kre
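
 The padding argument above can be checked with a small sketch. The two
 structs are simplified stand-ins for the start of struct fdfile, not
 the real definition: ff_allocated and ff_refcnt are named in this
 thread, while ff_exclose and ff_foclose are names assumed here for
 illustration. The result holds on ABIs where bool is one byte and
 unsigned int is four-byte aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the current layout: two bools, then an int-sized field. */
struct fdfile_before {
	bool		ff_exclose;	/* assumed name */
	bool		ff_allocated;
	unsigned int	ff_refcnt;
};

/* Same layout with a third bool placed in the padding. */
struct fdfile_after {
	bool		ff_exclose;	/* assumed name */
	bool		ff_allocated;
	bool		ff_foclose;	/* hypothetical close-on-fork flag */
	unsigned int	ff_refcnt;
};

/* Neither the struct size nor the offset of ff_refcnt should change. */
static void
check_fdfile_layout(void)
{
	assert(sizeof(struct fdfile_before) == sizeof(struct fdfile_after));
	assert(offsetof(struct fdfile_before, ff_refcnt) ==
	    offsetof(struct fdfile_after, ff_refcnt));
	assert(offsetof(struct fdfile_after, ff_foclose) <
	    offsetof(struct fdfile_after, ff_refcnt));
}
```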

From: Robert Elz <kre@munnari.OZ.AU>
To: Ricardo Branco <rbranco@suse.de>, gnats-bugs@netbsd.org,
        kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Fri, 11 Jul 2025 09:13:36 +0700

     Date:        Fri, 11 Jul 2025 07:27:42 +0700
     From:        Robert Elz <kre@munnari.OZ.AU>
     Message-ID:  <7169.1752193662@jacaranda.noi.kre.to>

   | What is (or was) the problem supposed to be?

 And when I finally found a browser that would let me actually see
 the URL that you included, I guess you meant the bug with the PR
 number (which was wrong incidentally, a typo when that comment was
 added, now fixed) - that one was also fixed last year, there should
 be no issue there.

 But please do send me diffs (not a reference to anything on github)
 or even just a rough outline of how you see the changes happening,
 showing the kinds of changes you're planning to make (doesn't need
 to be everything - just so I can see what method you're using) so we
 can make sure that nothing like that bug reappears in this case.
 Off list would be best, both because I'm requesting something in the
 very early stages (so we can perhaps avoid some wasted effort) and
 as (eventually) a complete set of diffs touching everywhere in the
 kernel that needs to be touched for this, might be fairly large.

 kre

Copyright © 1994-2025 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.