NetBSD Problem Report #58916
From www@netbsd.org Wed Dec 18 14:52:02 2024
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
client-signature RSA-PSS (2048 bits) client-digest SHA256)
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 8A8841A9238
for <gnats-bugs@gnats.NetBSD.org>; Wed, 18 Dec 2024 14:52:02 +0000 (UTC)
Message-Id: <20241218145201.81F491A923A@mollari.NetBSD.org>
Date: Wed, 18 Dec 2024 14:52:01 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: timerfd(2) claims ready for write
X-Send-Pr-Version: www-1.0
>Number: 58916
>Category: kern
>Synopsis: timerfd(2) claims ready for write
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Dec 18 14:55:00 +0000 2024
>Last-Modified: Thu Dec 19 17:45:01 +0000 2024
>Originator: Taylor R Campbell
>Release: current, 10
>Organization:
The TimerFD Writation
>Environment:
>Description:
For a timerfd(2), select(2) returns writable if asked and poll(2) returns POLLOUT|POLLWRNORM if asked. (Also POLLRDNORM if asked.)
In contrast, in Linux, select(2) never returns writable and poll(2) never returns POLLOUT|POLLWRNORM -- or, for that matter, anything other than POLLIN.
Writing to a timerfd fails with EOPNOTSUPP, so it's not useful to claim writable. This doesn't appear to be a POSIX interface, so it looks like Linux is the `spec' here if I haven't missed anything obvious.
>How-To-Repeat:
run Python 3.13.1 test suite
======================================================================
FAIL: test_timerfd_ns_select (test.test_os.TimerfdTests.test_timerfd_ns_select)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/scratch/lang/python313/work/Python-3.13.1/Lib/test/test_os.py", line 4436, in test_timerfd_ns_select
self.assertEqual((rfd, wfd, xfd), ([], [], []))
~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: Tuples differ: ([], [3], []) != ([], [], [])
First differing element 1:
[3]
[]
- ([], [3], [])
? -
+ ([], [], [])
>Fix:
Yes, please!
>Audit-Trail:
From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/58916: timerfd(2) claims ready for write
Date: Wed, 18 Dec 2024 16:49:02 +0100
--8vIkFKi2bneNJ0ZW
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Wed, Dec 18, 2024 at 02:55:00PM +0000, campbell+netbsd@mumble.net wrote:
> >Number: 58916
> >Category: kern
> >Synopsis: timerfd(2) claims ready for write
This seems to fix the two tests for me. Ok to commit?
Thomas
--8vIkFKi2bneNJ0ZW
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="poll.diff"
Index: sys_timerfd.c
===================================================================
RCS file: /cvsroot/src/sys/kern/sys_timerfd.c,v
retrieving revision 1.8
diff -u -r1.8 sys_timerfd.c
--- sys_timerfd.c 17 Feb 2022 16:28:29 -0000 1.8
+++ sys_timerfd.c 18 Dec 2024 15:48:37 -0000
@@ -337,7 +337,7 @@
 timerfd_fop_poll(file_t * const fp, int const events)
 {
 	struct timerfd * const tfd = fp->f_timerfd;
-	int revents = events & (POLLOUT | POLLWRNORM);
+	int revents = 0;
 
 	if (events & (POLLIN | POLLRDNORM)) {
 		itimer_lock();
--8vIkFKi2bneNJ0ZW--
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/58916 CVS commit: src
Date: Wed, 18 Dec 2024 16:01:28 +0000
Module Name: src
Committed By: riastradh
Date: Wed Dec 18 16:01:28 UTC 2024
Modified Files:
src/sys/kern: sys_timerfd.c
src/tests/lib/libc/sys: t_timerfd.c
Log Message:
timerfd(2): Do not claim writable.
Writes will fail with EOPNOTSUPP.
PR kern/58916: timerfd(2) claims ready for write
To generate a diff of this commit:
cvs rdiff -u -r1.8 -r1.9 src/sys/kern/sys_timerfd.c
cvs rdiff -u -r1.5 -r1.6 src/tests/lib/libc/sys/t_timerfd.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Thomas Klausner <wiz@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/58916: timerfd(2) claims ready for write
Date: Wed, 18 Dec 2024 16:23:22 +0000
> Date: Wed, 18 Dec 2024 16:49:02 +0100
> From: Thomas Klausner <wiz@NetBSD.org>
>
> This seems to fix the two tests for me. Ok to commit?
Oops -- I already committed. The other part I committed is just as
important: updating the tests.
(And for the other issues we need automatic tests added! I went ahead
with the itimespecfix patch for PR 58914 without adding tests because,
well, the tests would crash our testbed, but we do need to add those
tests.)
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/58916: timerfd(2) claims ready for write
Date: Thu, 19 Dec 2024 08:34:07 +0700
Date: Wed, 18 Dec 2024 14:55:00 +0000 (UTC)
From: campbell+netbsd@mumble.net
Message-ID: <20241218145500.8B58E1A923B@mollari.NetBSD.org>
| For a timerfd(2), select(2) returns writable if asked and poll(2)
| returns POLLOUT|POLLWRNORM if asked. (Also POLLRDNORM if asked.)
Seems reasonable to me.
| In contrast, in Linux, select(2) never returns writable
I don't know what Linux does on writes, is it also:
| Writing to a timerfd fails with EOPNOTSUPP,
If so, then I'd suggest Linux's poll/select has a bug.
| so it's not useful to claim writable.
Someone fails to understand the purpose of poll/select. It isn't
to inform the process that a read/write will succeed, it is to inform
the process that a read/write will not hang. There are too many possible
results for select (which just returns 1 bit) in particular to be able
to differentiate between different possible results (poll() possibly
could, but doesn't really) - so there is no attempt to do that.
| This doesn't appear to be a POSIX interface,
Timerfd isn't (devices in general aren't) but select() and poll() are.
Further, they're generic, while the actual implementation depends upon
the underlying device, the functions themselves do not.
What POSIX says of select() is:
to see whether some of their descriptors are ready for
reading, are ready for writing, or...
That is, there won't be an EAGAIN (or similar) because the underlying
device is not ready yet, nor would the call hang if not in non-blocking
mode. Whether there might be an error, EOF, or other return (or how
much data might be successfully transferred) isn't conveyed by these
interfaces.
kre
ps: not that I really care what timerfd devices consider ready or not.
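The distinction drawn above, between "a write will not hang" and "a write will succeed", can be illustrated with an ordinary pipe rather than a timerfd. The following is only a sketch: the helper name write_errno_after_writable is hypothetical, and the exact readiness reported for a pipe with no readers may differ between systems.

```c
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper: close the read end of a pipe, ask select(2)
 * whether the write end is writable, then attempt a one-byte write.
 * Returns errno from write(2) (0 if the write succeeded) when select
 * reported the descriptor writable, or -1 if setup failed or select
 * did not report it writable.
 * Side effect: SIGPIPE is set to be ignored for the whole process so
 * that the failing write returns an error instead of killing it.
 */
int
write_errno_after_writable(void)
{
	struct timeval tv = { 0, 0 };	/* do not block */
	fd_set wfds;
	int p[2], n, e;

	signal(SIGPIPE, SIG_IGN);
	if (pipe(p) == -1)
		return -1;
	close(p[0]);			/* no reader remains */

	FD_ZERO(&wfds);
	FD_SET(p[1], &wfds);
	n = select(p[1] + 1, NULL, &wfds, NULL, &tv);
	if (n <= 0 || !FD_ISSET(p[1], &wfds)) {
		close(p[1]);
		return -1;
	}

	e = (write(p[1], "x", 1) == -1) ? errno : 0;
	close(p[1]);
	return e;
}
```

Where select reports the write end as ready, the write returns immediately with EPIPE: the descriptor was "writable" in the sense that the call did not block, not in the sense that it transferred data.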
From: Taylor R Campbell <campbell@mumble.net>
To: Robert Elz <kre@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/58916: timerfd(2) claims ready for write
Date: Thu, 19 Dec 2024 02:39:17 +0000
> Date: Thu, 19 Dec 2024 08:34:07 +0700
> From: Robert Elz <kre@munnari.OZ.AU>
>
> Date: Wed, 18 Dec 2024 14:55:00 +0000 (UTC)
> From: campbell+netbsd@mumble.net
> Message-ID: <20241218145500.8B58E1A923B@mollari.NetBSD.org>
>
> | For a timerfd(2), select(2) returns writable if asked and poll(2)
> | returns POLLOUT|POLLWRNORM if asked. (Also POLLRDNORM if asked.)
>
> Seems reasonable to me.
select(2) returns nonwritable if asked about the reader side of a
pipe (and vice versa, nonreadable if asked about the writer side of a
pipe). Is that a bug?
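The pipe case mentioned above can be checked with a short C sketch. The helper name pipe_read_end_writable is hypothetical; it asks select(2), without blocking, whether the read end of a freshly created pipe is writable.

```c
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if select(2) reports the read end
 * of a new pipe as writable, 0 if it does not, -1 on failure.
 */
int
pipe_read_end_writable(void)
{
	struct timeval tv = { 0, 0 };	/* do not block */
	fd_set wfds;
	int p[2], n, writable;

	if (pipe(p) == -1)
		return -1;

	FD_ZERO(&wfds);
	FD_SET(p[0], &wfds);		/* p[0] is the read end */
	n = select(p[0] + 1, NULL, &wfds, NULL, &tv);
	writable = (n > 0 && FD_ISSET(p[0], &wfds)) ? 1 : 0;

	close(p[0]);
	close(p[1]);
	return n == -1 ? -1 : writable;
}
```

A result of 0 matches the behaviour described above: a write to the read end would fail immediately rather than hang, yet select does not report the descriptor writable.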
> | In contrast, in Linux, select(2) never returns writable
>
> I don't know what Linux does on writes, is it also:
>
> | Writing to a timerfd fails with EOPNOTSUPP,
>
> If so, then I'd suggest Linux's poll/select has a bug.
Correction:
- On Linux, writing to a timerfd fails with EINVAL.
- On NetBSD, writing to a timerfd fails with EBADF (same as the reader
side of a pipe).
> | so it's not useful to claim writable.
>
> Someone fails to understand the purpose of poll/select. It isn't
> to inform the process that a read/write will succeed, it is to inform
> the process that a read/write will not hang. There are too many possible
> results for select (which just returns 1 bit) in particular to be able
> to differentiate between different possible results (poll() possibly
> could, but doesn't really) - so there is no attempt to do that.
OK, but:
(a) Probably need to audit all the struct fileops and struct cdevsw
*poll functions in tree if you want to enforce this -- and you'll
have to change a lot of existing cases, I expect.
(b) Currently Linux is the `spec' for timerfd and we're deviating from
that behaviour.
(c) timerfd-on-NetBSD is the only case of {pipe, timerfd, kqueue,
...}-on-{NetBSD, Linux} that I've found that behaves like this,
where writes are nonsensical and immediately rejected but select
returns writable. (But I haven't done an exhaustive search.)
(This came up in Python's automatic tests of select on timerfds,
details in <https://gnats.netbsd.org/58914>. Happy to revert the
change. Mainly I just want to make sure these timer- and timerfd-
related paths are adequately exercised with clear automatic tests that
cover enough to avoid gratuitous application incompatibility and
kernel crashes.)
From: Robert Elz <kre@munnari.OZ.AU>
To: Taylor R Campbell <campbell@mumble.net>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/58916: timerfd(2) claims ready for write
Date: Thu, 19 Dec 2024 19:49:40 +0700
Date: Thu, 19 Dec 2024 02:39:17 +0000
From: Taylor R Campbell <campbell@mumble.net>
Message-ID: <20241219023918.2023460BB5@jupiter.mumble.net>
| select(2) returns nonwritable if asked about the reader side of a
| pipe (and vice versa, nonreadable if asked about the writer side of a
| pipe). Is that a bug?
Perhaps - I assume that also applies to any fd open O_WRONLY or O_RDONLY
with select/poll used for the other direction. In practice I'm not
sure it matters, as real code just doesn't do things like this, code
doesn't ask when it can write to a fd which doesn't support write()
there's no point.
| (This came up in Python's automatic tests of select on timerfds,
Yes, I saw that -- the number of issues made that look more like
an "is this linux" test, that kind of thing is also likely the
only place where this difference would be noted, and even then I
don't see the point, after all we could extend the timerfd interface
to make writing have some meaning (no, I have no idea what that
might be) - unless they actually intend writing to it in real
applications, testing that it must fail (at all, not just via
select/poll) seems like a pointless waste of time, it isn't as
if this is a specific test of just that interface even.
| Happy to revert the change.
No need, like I said initially, I don't really care what the
behaviour is here. I just like to make sure we don't fall
into the trap of "select() (or poll()) said it was writable,
so a write must succeed!" mentality.
kre
From: Jason Thorpe <thorpej@me.com>
To: Taylor Campbell <campbell@mumble.net>
Cc: Robert Elz <kre@NetBSD.org>,
"gnats-bugs@netbsd.org" <gnats-bugs@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>
Subject: Re: kern/58916: timerfd(2) claims ready for write
Date: Thu, 19 Dec 2024 05:15:05 -0800
> On Dec 18, 2024, at 6:39 PM, Taylor R Campbell <campbell@mumble.net> wrote:
>
> (This came up in Python's automatic tests of select on timerfds,
> details in <https://gnats.netbsd.org/58914>. Happy to revert the
> change. Mainly I just want to make sure these timer- and timerfd-
> related paths are adequately exercised with clear automatic tests that
> cover enough to avoid gratuitous application incompatibility and
> kernel crashes.)
In this case, I think it's better to file a bug against the
Python test suite.
-- thorpej
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Jason Thorpe <thorpej@me.com>
Cc: Robert Elz <kre@NetBSD.org>,
gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/58916: timerfd(2) claims ready for write
Date: Thu, 19 Dec 2024 13:57:15 +0000
> Date: Thu, 19 Dec 2024 05:15:05 -0800
> From: Jason Thorpe <thorpej@me.com>
>
> In this case, I think it's better to file a bug against the Python
> test suite.
We can do that but I want to make sure we have a clear story for what
timerfd(2) _should_ be doing so we have a compelling argument that it
is a Python bug.
Right now there's another part of the Python test suite that crashes
the kernel, and on closer inspection -- in the course of writing tests
to cover that case -- I'm not actually sure what the right thing is
there (PR 58914).
So I'm not about to go claiming bugs in other projects' code until I
get this egg off our face!
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Robert Elz <kre@NetBSD.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Subject: Re: kern/58916: timerfd(2) claims ready for write
Date: Thu, 19 Dec 2024 14:15:01 +0000
> Date: Thu, 19 Dec 2024 19:49:40 +0700
> From: Robert Elz <kre@munnari.OZ.AU>
>
> Date: Thu, 19 Dec 2024 02:39:17 +0000
> From: Taylor R Campbell <campbell@mumble.net>
> Message-ID: <20241219023918.2023460BB5@jupiter.mumble.net>
>
> | select(2) returns nonwritable if asked about the reader side of a
> | pipe (and vice versa, nonreadable if asked about the writer side of a
> | pipe). Is that a bug?
>
> Perhaps - I assume that also applies to any fd open O_WRONLY or O_RDONLY
> with select/poll used for the other direction. In practice I'm not
> sure it matters, as real code just doesn't do things like this, code
> doesn't ask when it can write to a fd which doesn't support write()
> there's no point.
It's not limited to O_WRONLY and O_RDONLY, or the read-only and
write-only sides of a pipe.
I tested with a disconnected stream-type socket and got the same
result: read and write don't hang (they fail immediately with
ENOTCONN, rather than EBADF), but select doesn't report them readable
or writable.
I checked some input-only devices too like wskbd and wsmouse, and they
too never return POLLOUT|POLLWRNORM even though write would fail
immediately with ENODEV.
I think if we want select and poll to guarantee claiming
readable/writable if a read or write would return immediately, we have
a lot of work to do.
Now it may be worthwhile to audit device poll routines for various
types of bugs (like returning E* instead of POLL*), because they are
often not very carefully written.
But I don't think the particular behaviour you're asking for is useful
when the underlying file object guarantees that read or write will
_never_ work because the operation is nonsensical.
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/58916 CVS commit: src/tests/lib/libc/sys
Date: Thu, 19 Dec 2024 14:42:32 +0000
Module Name: src
Committed By: riastradh
Date: Thu Dec 19 14:42:32 UTC 2024
Modified Files:
src/tests/lib/libc/sys: t_timerfd.c
Log Message:
t_timerfd: Fix select/poll tests and add kevent EVFILT_WRITE test.
PR kern/58916: timerfd(2) claims ready for write
To generate a diff of this commit:
cvs rdiff -u -r1.6 -r1.7 src/tests/lib/libc/sys/t_timerfd.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Robert Elz <kre@munnari.OZ.AU>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: Jason Thorpe <thorpej@me.com>, gnats-bugs@NetBSD.org,
netbsd-bugs@NetBSD.org
Subject: Re: kern/58916: timerfd(2) claims ready for write
Date: Fri, 20 Dec 2024 00:42:10 +0700
Date: Thu, 19 Dec 2024 13:57:15 +0000
From: Taylor R Campbell <riastradh@NetBSD.org>
Message-ID: <20241219135721.0AF69855E1@mail.netbsd.org>
| We can do that but I want to make sure we have a clear story for what
| timerfd(2) _should_ be doing so we have a compelling argument that it
| is a Python bug.
In this case that should be easy ... since no-one apparently supports
writing to a timerfd, there is no reason to test, in any way, that
doing so fails, as no existing (python or other) code can possibly
have a valid reason for attempting a write to a timerfd, but it is
possible that some implementation might add a timerfd extension which
would involve writing to one of them.
Note that this doesn't mean that a kernel specific test for how
timerfds work shouldn't test for attempting a write, and checking that
the (system dependent) error is returned, if only so that no local
change inadvertently alters how things work (eg: whether it is even
possible to open a timerfd for writing, and if it is, what happens
when a write is attempted) - but there is no reason at all for this
to be connected to anything intended to be portable across systems.
| Right now there's another part of the Python test suite that crashes
| the kernel, and on closer inspection -- in the course of writing tests
| to cover that case -- I'm not actually sure what the right thing is
| there (PR 58914).
Yes, that one looks like it ought to be fixed, and your patch looked like
it was reasonable to me. But I have never used a timerfd, so have no
real way to know for sure what should happen.
kre
Copyright © 1994-2024
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.