NetBSD Problem Report #59132

From www@netbsd.org  Wed Mar  5 13:30:12 2025
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
	 client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id A37B31A923D
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  5 Mar 2025 13:30:12 +0000 (UTC)
Message-Id: <20250305133011.58EF61A923F@mollari.NetBSD.org>
Date: Wed,  5 Mar 2025 13:30:11 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: t_futex_ops:futex_wait_timeout_* sometimes fails on early wakeup
X-Send-Pr-Version: www-1.0

>Number:         59132
>Category:       kern
>Synopsis:       t_futex_ops:futex_wait_timeout_* sometimes fails on early wakeup
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          needs-pullups
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Mar 05 13:35:00 +0000 2025
>Closed-Date:    
>Last-Modified:  Wed Mar 05 14:06:00 +0000 2025
>Originator:     Taylor R Campbell
>Release:        current
>Organization:
The NetBSD Futexwakeupearly
>Environment:
>Description:
tc-start: 1741180577.73934, futex_wait_timeout_deadline
tc-se:*** Check failed: /home/riastradh/netbsd/current/src/tests/lib/libc/sys/t_futex_ops.c:1334: ts=98.529265778sec deadline=98.544610297sec
tc-end: 1741180579.62123, futex_wait_timeout_deadline, failed, 1 checks failed; see output for more details
tc-start: 1741180579.62211, futex_wait_timeout_deadline_rt
tc-end: 1741180581.51927, futex_wait_timeout_deadline_rt, passed
tc-start: 1741180581.52108, futex_wait_timeout_relative
tc-se:*** Check failed: /home/riastradh/netbsd/current/src/tests/lib/libc/sys/t_futex_ops.c:1334: ts=102.509214057sec deadline=102.522998637sec
tc-end: 1741180583.42052, futex_wait_timeout_relative, failed, 1 checks failed; see output for more details
tc-start: 1741180583.42136, futex_wait_timeout_relative_rt
tc-end: 1741180585.38947, futex_wait_timeout_relative_rt, passed

This is under qemu-system-x86_64 with nvmm on a NetBSD 10ish host.

So far I have only recorded failures with the monotonic clock, not the realtime clock (!). However, I noticed this in passing while testing other things and was not paying close attention, so I can't rule out failures with the realtime clock too.
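
To put numbers on the failed checks above: each one reports a clock reading ts that is still short of the requested deadline, presumably taken right after the futex wait returned. The following is a minimal sketch of that comparison, assuming both values are struct timespec readings from the same clock. The helper names timespec_diff_ns and early_by_ns are hypothetical and are not taken from t_futex_ops.c.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: signed difference a - b in nanoseconds. */
static int64_t
timespec_diff_ns(struct timespec a, struct timespec b)
{
	/* Both inputs are assumed to be normalized timespecs. */
	assert(a.tv_nsec >= 0 && a.tv_nsec < 1000000000L);
	assert(b.tv_nsec >= 0 && b.tv_nsec < 1000000000L);

	return (int64_t)(a.tv_sec - b.tv_sec) * 1000000000LL +
	    (int64_t)(a.tv_nsec - b.tv_nsec);
}

/*
 * Hypothetical helper: how many nanoseconds before the deadline the
 * reading ts was taken.  A positive result means an early wakeup;
 * zero or negative means the deadline had already passed.
 */
static int64_t
early_by_ns(struct timespec ts, struct timespec deadline)
{
	return timespec_diff_ns(deadline, ts);
}
```

Applied to the two failures logged above, this gives 15344519 ns and 13784580 ns, i.e. wakeups roughly 15.3 ms and 13.8 ms ahead of the deadline.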
>How-To-Repeat:
1. boot NetBSD-current under qemu
2. cd /usr/tests/lib/libc/sys && atf-run t_futex_ops
>Fix:
Yes, please!

>Release-Note:

>Audit-Trail:
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/59132 CVS commit: src/sys/kern
Date: Wed, 5 Mar 2025 14:01:55 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Wed Mar  5 14:01:55 UTC 2025

 Modified Files:
 	src/sys/kern: sys_futex.c

 Log Message:
 futex(2): Avoid returning early on timeout.

 Rounding in the arithmetic leading into cv_timedwait_sig, and any
 skew between the timecounter used by clock_gettime and the hardclock
 timer used to wake cv_timedwait_sig, can lead cv_timedwait_sig to
 wake up before the deadline as observable by clock_gettime.

 futex(FUTEX_WAIT) is not supposed to do that, so ignore when
 cv_timedwait_sig returns EWOULDBLOCK -- we'll notice the deadline has
 passed in the next iteration anyway, if it has actually passed.

 While here, make sure that we never pass less than 1 tick to
 cv_timedwait_sig -- a timeout of 0 turns it into cv_wait_sig, which
 waits indefinitely with no timeout.

 With this change, I have not seen any failures as reported in:

 PR kern/59132: t_futex_ops:futex_wait_timeout_* sometimes fails on
 early wakeup

 Some instrumentation in futex_wait to count when cv_timedwait_sig
 returns early as measured by clock_gettime (not committed in this
 change, just local experiments) supports this hypothesis for the
 symptoms observed in the PR.
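
 The pattern described in the log message (round the remaining time to
 ticks, never sleep for less than 1 tick, and treat a timed-out sleep as
 a cue to recheck the deadline rather than as proof that it has passed)
 can be sketched in userspace. This is a simulation under stated
 assumptions, not the committed kernel code: the clock is a plain
 variable, sim_timedwait stands in for a tick-based sleep that wakes a
 little early, and SIM_HZ, SIM_SKEW_NS and all function names are
 hypothetical.

```c
#include <stdint.h>

#define SIM_HZ		100			/* assumed tick rate */
#define NS_PER_TICK	(1000000000LL / SIM_HZ)
#define SIM_SKEW_NS	500000LL		/* assumed early wake per sleep */

static int64_t sim_now_ns;	/* simulated clock_gettime reading */

/* Stand-in for a tick-based timed sleep that wakes slightly early. */
static void
sim_timedwait(int64_t ticks)
{
	sim_now_ns += ticks * NS_PER_TICK - SIM_SKEW_NS;
}

/* Round down to whole ticks, but never return less than 1 tick. */
static int64_t
ns_to_ticks_min1(int64_t ns)
{
	int64_t ticks = ns / NS_PER_TICK;

	return ticks < 1 ? 1 : ticks;
}

/*
 * Sleep until the simulated clock has reached deadline_ns.  A sleep
 * that times out is not trusted: the loop rereads the clock and only
 * stops once the deadline has actually passed.  Returns the number of
 * sleeps that were needed.
 */
static int
sim_wait_until(int64_t deadline_ns)
{
	int sleeps = 0;

	while (sim_now_ns < deadline_ns) {
		sim_timedwait(ns_to_ticks_min1(deadline_ns - sim_now_ns));
		sleeps++;
	}
	return sleeps;
}
```

 With sim_now_ns starting at 0 and a deadline of 25000000 ns, the first
 sleep of 2 ticks ends at 19.5 ms, short of the deadline; the loop then
 sleeps for the clamped minimum of 1 tick and returns at 29 ms, after
 the deadline. Returning after the first timeout instead would
 reproduce the early wakeup seen in the PR.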


 To generate a diff of this commit:
 cvs rdiff -u -r1.25 -r1.26 src/sys/kern/sys_futex.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->needs-pullups
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Wed, 05 Mar 2025 14:06:00 +0000
State-Changed-Why:
probably fixed in HEAD, needs pullup-10


>Unformatted:

Copyright © 1994-2025 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.