NetBSD Problem Report #59498
From www@netbsd.org Tue Jul 1 16:03:46 2025
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
client-signature RSA-PSS (2048 bits) client-digest SHA256)
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id B30011A923C
for <gnats-bugs@gnats.NetBSD.org>; Tue, 1 Jul 2025 16:03:46 +0000 (UTC)
Message-Id: <20250701160345.825FD1A923E@mollari.NetBSD.org>
Date: Tue, 1 Jul 2025 16:03:45 +0000 (UTC)
From: rbranco@suse.de
Reply-To: rbranco@suse.de
To: gnats-bugs@NetBSD.org
Subject: Add missing POSIX O_CLOFORK flag
X-Send-Pr-Version: www-1.0
>Number: 59498
>Category: kern
>Synopsis: Add missing POSIX O_CLOFORK flag
>Confidential: no
>Severity: non-critical
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Jul 01 16:05:00 +0000 2025
>Last-Modified: Fri Jul 11 02:40:01 +0000 2025
>Originator: Ricardo Branco
>Release: NetBSD 10.99.14
>Organization:
>Environment:
>Description:
Add missing POSIX O_CLOFORK flag
It's 99% done. The TODO bits are documented. Mostly the manpages.
>How-To-Repeat:
>Fix:
https://github.com/NetBSD/src/pull/53
>Audit-Trail:
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Wed, 02 Jul 2025 00:13:07 +0700
Date: Tue, 1 Jul 2025 16:05:00 +0000 (UTC)
From: rbranco@suse.de
Message-ID: <20250701160500.760D61A9242@mollari.NetBSD.org>
Have you investigated how many (if any, I haven't looked) of
the applications in NetBSD simply assume that fcntl(fd, F_GETFD)
returning non-zero means that close-on-exec is set?
That's what I always considered the most difficult part of
implementing O_CLOFORK -- the alteration of this fairly long
held assumption.
kre
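
To make the concern above concrete, here is a minimal userland sketch (not taken from the NetBSD tree) contrasting the assumption kre describes with a check that masks for FD_CLOEXEC. The function names are hypothetical, and the fallback value for FD_CLOFORK is only a placeholder so the example compiles on systems whose headers do not define it yet.

```c
#include <fcntl.h>
#include <stdbool.h>

/* Placeholder for illustration only; use the system's value when present. */
#ifndef FD_CLOFORK
#define FD_CLOFORK 4
#endif

/*
 * Fragile: treats any non-zero F_GETFD result as "close-on-exec is set".
 * That only holds while FD_CLOEXEC is the sole descriptor flag.
 */
bool
cloexec_from_flags_fragile(int fdflags)
{
	return fdflags != 0;
}

/* Robust: tests the FD_CLOEXEC bit and ignores any other descriptor flag. */
bool
cloexec_from_flags(int fdflags)
{
	return (fdflags & FD_CLOEXEC) != 0;
}

/* Convenience wrapper: false on error or when FD_CLOEXEC is clear. */
bool
fd_has_cloexec(int fd)
{
	int fdflags = fcntl(fd, F_GETFD);

	return fdflags != -1 && cloexec_from_flags(fdflags);
}
```

With only FD_CLOFORK set, the fragile check reports close-on-exec while the masked check does not, which is the kind of call site an audit would be looking for.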
From: Ricardo Branco <rbranco@suse.de>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Tue, 1 Jul 2025 19:20:47 +0200
On 7/1/25 7:15 PM, Robert Elz via gnats wrote:
> The following reply was made to PR kern/59498; it has been noted by GNATS.
>
> From: Robert Elz <kre@munnari.OZ.AU>
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
> Date: Wed, 02 Jul 2025 00:13:07 +0700
>
> Date: Tue, 1 Jul 2025 16:05:00 +0000 (UTC)
> From: rbranco@suse.de
> Message-ID: <20250701160500.760D61A9242@mollari.NetBSD.org>
>
> Have you investigated how many (if any, I haven't looked) of
> the applications in NetBSD simply assume that fcntl(fd, F_GETFD)
> returning non-zero means that close-on-exec is set?
>
> That's what I always considered the most difficult part of
> implementing O_CLOFORK -- the alteration of this fairly long
> held assumption.
>
> kre
>
In this implementation, O_CLOFORK is cleared on exec, so
this point is moot for applications.
Library code needs minor fixing, though. That can be done
and should be done.
Best,
Ricardo.
From: Christos Zoulas <christos@zoulas.com>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Tue, 01 Jul 2025 14:25:38 -0400
In addition to kre@'s comments about reviewing fcntl(F_GETFD),
I would add a fd_set_fdflags_from_oflags() that does:
    fd_set_fdflags(curlwp, newfd,
        ((flags & O_CLOEXEC) ? FD_CLOEXEC : 0) |
        ((flags & O_CLOFORK) ? FD_CLOFORK : 0));
since this is repeated a bunch of times.
christos
From: Robert Elz <kre@munnari.OZ.AU>
To: Ricardo Branco <rbranco@suse.de>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Wed, 02 Jul 2025 02:05:35 +0700
Date: Tue, 1 Jul 2025 19:20:47 +0200
From: Ricardo Branco <rbranco@suse.de>
Message-ID: <78c4992b-b5b6-493b-8eb2-594df8990ad6@suse.de>
| In this implementation, O_CLOFORK is cleared on exec,
That's fine, but not related to what I asked.
Typically O_CLOFORK is set by library functions to guard against
possible other threads forking while a temporary fd is open.
Those temporary fd's can last for noticeable time, and can be
revealed to the application code.
The application might want to see if close-on-exec has been set
for the fd (for some reason) and use fcntl(F_GETFD) to do it, and
never having heard of O_CLOFORK (or FD_CLOFORK) simply assumes that
the non-zero return means O_CLOEXEC is set on the fd.
I think we need an audit of applications (which includes library code
that they might call) to examine all fcntl(F_GETFD) (and fcntl(F_SETFD))
calls, and make sure they are doing the right thing, before the
O_CLOFORK mechanism is exposed to user space in any way (it doesn't hurt
to have it in the kernel, as long as nothing, tests excepted, ever
sets it).
kre
From: Ricardo Branco <rbranco@suse.de>
To: Robert Elz <kre@munnari.OZ.AU>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Tue, 1 Jul 2025 21:16:22 +0200
On 7/1/25 9:05 PM, Robert Elz wrote:
> Date: Tue, 1 Jul 2025 19:20:47 +0200
> From: Ricardo Branco <rbranco@suse.de>
> Message-ID: <78c4992b-b5b6-493b-8eb2-594df8990ad6@suse.de>
>
> | In this implementation, O_CLOFORK is cleared on exec,
>
> That's fine, but not related to what I asked.
>
> Typically O_CLOFORK is set by library functions to guard against
> possible other threads forking while a temporary fd is open.
> Those temporary fd's can last for noticeable time, and can be
> revealed to the application code.
>
> The application might want to see if close-on-exec has been set
> for the fd (for some reason) and use fcntl(F_GETFD) to do it, and
> never having heard of O_CLOFORK (or FD_CLOFORK) simply assumes that
> the non-zero return means O_CLOEXEC is set on the fd.
>
> I think we need an audit of applications (which includes library code
> that they might call) to examine all fcntl(F_GETFD) (and fcntl(F_SETFD))
> calls, and make sure they are doing the right thing, before the
> O_CLOFORK mechanism is exposed to user space in any way (it doesn't hurt
> to have it in the kernel, as long as nothing, tests excepted, ever
> sets it).
>
> kre
That sounds like a reasonable compromise.
FWIW, FreeBSD recently introduced the FD_RESOLVE_BENEATH flag.
I identified a number of places where FD_CLOEXEC is naively set
without ORing with the results from F_GETFD, but first wanted to
check whether you want to support this flag at all.
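
As an illustration of the pattern described above, a minimal sketch of setting FD_CLOEXEC without discarding other descriptor flags might look like this. The function name is hypothetical. A bare fcntl(fd, F_SETFD, FD_CLOEXEC) replaces the whole flag set, so it would silently clear any other descriptor flag that happened to be set.

```c
#include <fcntl.h>

/*
 * Set FD_CLOEXEC on fd while keeping any other descriptor flags
 * (for example FD_CLOFORK, where the system supports it).
 * Returns 0 on success, -1 on error with errno set by fcntl().
 */
int
set_cloexec_preserving(int fd)
{
	int fdflags = fcntl(fd, F_GETFD);

	if (fdflags == -1)
		return -1;
	return fcntl(fd, F_SETFD, fdflags | FD_CLOEXEC) == -1 ? -1 : 0;
}
```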
From: Robert Elz <kre@munnari.OZ.AU>
To: Ricardo Branco <rbranco@suse.de>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Wed, 02 Jul 2025 03:32:33 +0700
Date: Tue, 1 Jul 2025 21:16:22 +0200
From: Ricardo Branco <rbranco@suse.de>
Message-ID: <c1904fad-a790-48e0-aaa3-781e60541686@suse.de>
| I identified a number of places where FD_CLOEXEC is naively set
| without ORing with the results from F_GETFD, but first wanted to
| check whether you want to support this flag at all.
O_CLOFORK (and FD_CLOFORK) - yes certainly, it has to happen, we just
need to do it a bit carefully, as for a long time (like ever since
fcntl(F_GETFD) was invented) it has been "known" that the only flag
at that level was close-on-exec (the only one which belongs to the
process's open file table, rather than the system's global open file
table). So we need to avoid breaking things which currently assume
that (by fixing them) when this appears - ideally before they break.
kre
From: Ricardo Branco <rbranco@suse.de>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Thu, 10 Jul 2025 15:13:49 +0200
On 7/1/25 8:35 PM, Christos Zoulas via gnats wrote:
> The following reply was made to PR kern/59498; it has been noted by GNATS.
>
> From: Christos Zoulas <christos@zoulas.com>
> To: gnats-bugs@netbsd.org
> Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
> netbsd-bugs@netbsd.org
> Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
> Date: Tue, 01 Jul 2025 14:25:38 -0400
>
> In addition to kre@'s comments about reviewing fcntl(F_GETFD),
> I would add a fd_set_fdflags_from_oflags() that does:
>
>     fd_set_fdflags(curlwp, newfd,
>         ((flags & O_CLOEXEC) ? FD_CLOEXEC : 0) |
>         ((flags & O_CLOFORK) ? FD_CLOFORK : 0));
>
> since this is repeated a bunch of times.
>
> christos
>
Now that it's merged in both FreeBSD & DragonflyBSD, and
with the OpenBSD version being taken care of by Theo Buehler,
I can devote attention to this one.
To me it's the most problematic because I had to change a
bool field in a struct to u_char, and also because the dup3()
implementation is or was reportedly broken:
https://github.com/NetBSD/src/blob/trunk/sys/sys/filedesc.h#L109
I added tests, updated manpages and will work on it over the
weekend.
Best,
Ricardo
From: Robert Elz <kre@munnari.OZ.AU>
To: Ricardo Branco <rbranco@suse.de>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Fri, 11 Jul 2025 07:27:42 +0700
Date: Thu, 10 Jul 2025 15:13:49 +0200
From: Ricardo Branco <rbranco@suse.de>
Message-ID: <330bc7f1-ff17-45c5-b211-6b8cf1844d9d@suse.de>
| To me it's the most problematic because I had to change a
| bool field in a struct to u_char,
That, or you could just add a new bool field. The following
field is an int, so there's padding between the 2 bool fields
that start the struct fdfile, and the ff_refcnt field anyway.
Add a 3rd bool (between ff_allocated and ff_refcnt, so it goes
into the 2 byte padding space) and I think that means that we
won't need a kernel revbump for this addition (unless some other
change elsewhere requires it).
| and also because the dup3()
| implementation is or was reportedly broken:
dup3() had some changes last year, but that seems like a minor
issue, and would be unrelated to what you're doing. What is
(or was) the problem supposed to be?
kre
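
The padding argument above can be checked with a small sketch. The two structs below are simplified stand-ins, not the real struct fdfile from sys/sys/filedesc.h: only the leading fields are modelled, ff_clofork is a hypothetical field name, and the result assumes a typical ABI where bool is one byte and unsigned int is 4-byte aligned.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the current layout: two bools, then an int. */
struct fdfile_before {
	bool		ff_exclose;
	bool		ff_allocated;
	/* 2 bytes of padding here on typical ABIs */
	unsigned int	ff_refcnt;
	void		*ff_file;
};

/* Same layout with a third bool placed in the former padding. */
struct fdfile_after {
	bool		ff_exclose;
	bool		ff_allocated;
	bool		ff_clofork;	/* hypothetical name for the new flag */
	unsigned int	ff_refcnt;
	void		*ff_file;
};

/* The new field must not move later members or grow the struct. */
_Static_assert(offsetof(struct fdfile_before, ff_refcnt) ==
    offsetof(struct fdfile_after, ff_refcnt), "ff_refcnt moved");
_Static_assert(sizeof(struct fdfile_before) == sizeof(struct fdfile_after),
    "struct size changed");
```

If both assertions hold for the real structure, existing members keep their offsets, which is the property the remark about avoiding a kernel version bump relies on.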
From: Robert Elz <kre@munnari.OZ.AU>
To: Ricardo Branco <rbranco@suse.de>, gnats-bugs@netbsd.org,
kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: kern/59498: Add missing POSIX O_CLOFORK flag
Date: Fri, 11 Jul 2025 09:13:36 +0700
Date: Fri, 11 Jul 2025 07:27:42 +0700
From: Robert Elz <kre@munnari.OZ.AU>
Message-ID: <7169.1752193662@jacaranda.noi.kre.to>
| What is (or was) the problem supposed to be?
And when I finally found a browser that would let me actually see
the URL that you included, I guess you meant the bug with the PR
number (which was wrong incidentally, a typo when that comment was
added, now fixed) - that one was also fixed last year, there should
be no issue there.
But please do send me diffs (not a reference to anything on github)
or even just a rough outline of how you see the changes happening,
showing the kinds of changes you're planning to make (doesn't need
to be everything - just so I can see what method you're using) so we
can make sure that nothing like that bug reappears in this case.
Off list would be best, both because I'm requesting something in the
very early stages (so we can perhaps avoid some wasted effort) and
as (eventually) a complete set of diffs touching everywhere in the
kernel that needs to be touched for this, might be fairly large.
kre
Copyright © 1994-2025
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.