NetBSD Problem Report #58733

From kre@munnari.OZ.AU  Wed Oct  9 07:53:06 2024
Return-Path: <kre@munnari.OZ.AU>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
	 client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 6EB971A923B
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  9 Oct 2024 07:53:06 +0000 (UTC)
Message-Id: <202410090749.4997ni3s010887@jacaranda.noi.kre.to>
Date: Wed, 9 Oct 2024 14:49:44 +0700 (+07)
From: kre@munnari.OZ.AU
To: gnats-bugs@NetBSD.org
Subject: clock_nanosleep() (and I suspect more) is badly broken
X-Send-Pr-Version: 3.95

>Number:         58733
>Category:       kern
>Synopsis:       clock_nanosleep() (and I suspect more) is badly broken
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kre
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Oct 09 07:55:00 +0000 2024
>Closed-Date:    Sun Oct 13 20:33:45 +0000 2024
>Last-Modified:  Sun Oct 13 20:33:45 +0000 2024
>Originator:     Robert Elz
>Release:        NetBSD 10.99.12
>Organization:
>Environment:
System: NetBSD jacaranda.noi.kre.to 10.99.12 NetBSD 10.99.12 (JACARANDA:1.1-20241009) #156: Wed Oct 9 11:55:18 +07 2024 kre@jacaranda.noi.kre.to:/usr/obj/testing/kernels/amd64/JACARANDA amd64
Architecture: x86_64
Machine: amd64
>Description:

	clock_nanosleep() is used by sleep(1) to delay until a specified
	time (called with TIMER_ABSTIME after sleep.c calculates what the
	desired ending time is).

	In normal circumstances, all goes well, nothing happens, and the
	sleep ends at the desired time.

	But not always...

	NOTE:	This affects NetBSD 10 and should be fixed before 10.1 is
		released.   Haven't looked at 9 (yet), but that's less urgent
		right now.

>How-To-Repeat:

	First, the system to run this test must have been running for more
	than a short while - longer than the time we are to sleep for, or
	the problem is not obvious (there's still a problem, but it you might
	not notice it).   Further, we want the sleep time we're testing to
	be large enough so we can tell when sleep(1) exits too quickly.
	Testing with "sleep 10" will be useless, the value needs to be
	at least 1000, 10000 makes things even more obvious, but for that
	the system needs to have been booted no less than 3 hours ago,
	with 1000, an uptime of 20 mins is enough.

	With such a setup, run (from a shell supporting job control)

		/bin/sleep 1000			(or whatever)
		^Z
		fg  (or bg if you prefer)

	Do this reasonably quickly.

	If all is well, (no bug occurs) when sleep is resumed after having
	been stopped, it will just keep on sleeping until the specified
	time has expired.   (By all means, do date before and after, verify
	the correct time has passed).

	Don't worry, you don't need to wait that long, provided that the
	system has been running long enough, as soon as you resume running
	sleep again, it will exit.   That's because clock_nanotime() has
	returned prematurely in this case -- if the system hasn't been running
	long enough then it won't immediately return, but the sleep will end
	sooner than it should have - how much sooner depends upon how close
	to having been up long enough you waited - if you run this test
	immediately the system has booted, it will all appear to work
	perfectly.

	I know - that's what happened to me when I first started debugging
	this.   Boot new kernel, immediately run test.  It works!   I changed
	nothing (relevant).   Must be a compiler bug..... no.

>Fix:
	Not a fix yet, but I understand the problem, and I think there is
	a serious issue that needs to be fixed.   I believe it affects more
	than just clock_nanosleep().

	clock_nanosleep() computes the number of ticks to sleep for, and
	uses kpause() to wait for that long.   If nothing happens, kpause()
	waits that long (or as long as an int representation of the number
	of ticks can manage - which is long enough that none of us will ever
	observe what happens when that's exceeded) and all is good.

	But kpause(9) says:
	   kpause() can wake up spontaneously.  Callers should prepare
	   to handle it.

	One thing that will cause a spontaneous early return is the process
	having been stopped and resumed.

	That's fine, and clock_nanosleep() is prepared to deal with that,
	it uses the same code that deals with sleeps of very long periods
	(more ticks than an int can represent).   The code looks perfect,
	and would be except for one little thing.

	Wait for it, we're getting there.

	The problem stems from this wording in ts2timo(9).   ts2timo() is
	the function used to convert a timespec specifying when something
	is to happen, into a count of the number ticks to wait until that
	something should have occurred (or just how long to sleep).

	The issue is this (in ts2timo(9)):

	        If the interval is specified as an absolute time, then
		the clock_id clock is used to convert it to a relative time.

	Now as it stands that is fine, if we're passing in an absolute
	end time (according to one of the system's clocks) then to work
	out how long between now and then, we need to subtract now from
	the end time, to determine how long (as a timespec first, then
	converted into ticks).   The whole issue is the use of "it" in
	that sentence.

	If ts2timo(9) instead said:

		If the interval is specified as an absolute time, then
		the clock_id clock is used to compute the relative time.

	But as it is, ts2timo() says it will convert "it" (that being
	the interval specified as an absolute time) into a relative time.
	And that's what it does.

	From what I can see, looking at the relatively few uses of ts2timo()
	(including the one in nanosleep1() - the function that actually
	implements clock_nanosleep() and nanosleep()) none of the callers
	are expecting that.

	nanosleep1() certainly isn't - that's why the sleep ends early.
	Initially the correct number of ticks are computed, and it pauses.
	When the pause ends, it determines if the current time is now
	greater than the time to end, and if not, goes back to sleep again.
	But because of ts2timo()'s conversion of the absolute time into a
	relative time, what we're now testing (since sleep(1) is using
	CLOCK_MONATONIC - which is approximately just uptime) if the uptime
	is greater than the (original) relative time to sleep (that is, the
	arg given to sleep(1)) then it looks as if the end time has passed,
	and nanosleep1() concludes it has successfully finished.

	There are several other calls of ts2timo() that use, or might use,
	TIMER_ABSTIME

	First is lwp_park() in sys_lwp.c (the generic implementation of the
	_lwp_park() system call's many variants).   That might use
	TIMER_ABSTIME or TIMER_RELTIME at the caller's discretion (just
	like clock_nanosleep()).

	It suffers from the same manipulation by ts2timo() in the TIMER_ABSTIME
	case, but doesn't care, as it never uses the ts again in that case,
	only in the relative time case when the remaining time is computed
	and returned to the caller.   I didn't check, but I assume (hope) in
	the TIMER_ABSTIME case, the (various) interfaces to this fuction don't
	copyout() the timespec back to the caller (if they did, bad things
	would probably be happening).   (I can also comment on the
	clock_timeleft() function, used only by lwp_park() from what I could
	find - which is a little (or perhaps a lot) weird, though seems to
	work OK for its one use).

	Anyway, lwp_park() seems not to care what ts2timo() does, or doesn't
	do to the timespec passed in in the TIMER_ABSTIME case.

	Next are mq_recv1() and mq_send1() in sys_mqueue.c.  Both functions
	are identical for this pupose, they are handed an optional pointer
	to a timespec (we're only interested in the case where it us used,
	obviously).   These always use TIMER_ABSTIME.

	Each has a loop...

		while (mqattr->mq_curmsgs ...
		{
			/* the condition is different for each function, but
			   for us here, it is irrelevant, all that matters is
			   that there is a loop.  */

			if (ts) {
			    error = ts2timo(..., TIMER_ABSTIME, ts, &t, ...);
				/* the other args and error handling immaterial
				   here.  That turns ts(timespec) -> t(ticks) */
			 } else t = 0;	/* the case we don't care about */

			 error = cv_timedwait_sig(..., t);
				/* do stuff, no longer than t ticks */
		}

	Very approximately.   First time around the loop, that's all good.
	Second time around, now we have the timespec ts points to having
	been adjusted by ts2timo() to be a relative time.   What used to
	be a BIG number (since this actually uses CLOCK_REALTIME which is
	the time of day clock, and hence already has 54 years (almost)
	in its value) into a small one (the time to wait, in seconds, which
	is unlikely to be anything like 54 years).   So second time around
	the loop, the current time will be > than the "absolute" end time
	passed in to ts2timo() which will detect that, and return ETIMEDOUT.
	The code not shown above detects that as "goto error", and the
	loop abruptly terminates.   Not what is expected I think.

	Next is do_ksem_wait() in uips_sem.c and takes an optional timespec
	as an absolute time (uses TIMER_ABSTIME only).   do_ksem_wait()
	is used by a few sys__ksem_*() functions, but only sys__ksem_timedwait()
	has the ability to pass in a timespec - the other just pass NULL.

	sys__ksem_timedwait() doesn't care what is left in the timespec
	when do_ksem_wait() returns (it is never sent back to userland).

	do_ksem_wait() contains a loop, really very similar to the one
	used in the mqueue functions.   I won't show the code again, it
	has the same issues (exactly).

	That's it.

	So, it seems like nothing actually wants the "update the timespec
	passed in" behaviour of ts2timo() - which the manual page even
	only hints is what happens, it doesn't say it explicitly.


	So, what I propose to do is to simply stop that from happening.
	Make the "struct timespec *ts" parameter (the 3rd) of ts2timo()
	be instead a "const struct timespec *ts" (no modifications allowed)
	to make it quite clear that it will *not* be changed, and adjust
	the code in ts2timo() to avoid doing that (at worst, this is one
	copy of a struct timespec to be added).

	But before I do that, I wanted to see if there is anyone who knows
	why ts2timo() was written like that - it is a very peculiar way to
	implement it by accident.

	An alternative fix is simply to have clock_nanosleep() (whcih is the
	interface I care about at the minute) pass a never to be used again
	copy of the ABSTIME timespec to ts2timo() and retain the real one
	to be used again next time (and I'd guess, those other functions above,
	except possibly lwp_park() which doesn't seem to care) would need to
	do that same.

	Opinions?

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: kern-bug-people->kre
Responsible-Changed-By: kre@NetBSD.org
Responsible-Changed-When: Wed, 09 Oct 2024 07:57:03 +0000
Responsible-Changed-Why:
I am looking into this PR


State-Changed-From-To: open->analyzed
State-Changed-By: kre@NetBSD.org
State-Changed-When: Wed, 09 Oct 2024 07:57:23 +0000
State-Changed-Why:
What is happening is understood.


From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/58733: clock_nanosleep() (and I suspect more) is badly broken
Date: Wed, 09 Oct 2024 15:28:32 +0700

 This is the simple change I am planning (sans objections giving a
 reason why this is the wrong way) to make to fix (I believe, I haven't
 yet tested it) this issue.

 Index: subr_time.c
 ===================================================================
 RCS file: /cvsroot/src/sys/kern/subr_time.c,v
 retrieving revision 1.38
 diff -u -r1.38 subr_time.c
 --- subr_time.c	8 Jul 2023 20:02:10 -0000	1.38
 +++ subr_time.c	9 Oct 2024 08:15:44 -0000
 @@ -331,7 +331,8 @@
  	if ((flags & TIMER_ABSTIME) != 0) {
  		if (!timespecsubok(ts, &tsd))
  			return EINVAL;
 -		timespecsub(ts, &tsd, ts);
 +		timespecsub(ts, &tsd, &tsd);
 +		ts = &tsd;
  	}

  	error = itimespecfix(ts);



 I had to (for this simple change) omit the (desirable) addition
 of the "const" to the parameter, as that would apply to "ts" and
 while it is no issue for timespecsubok() (and certainly not for
 timespecsub() which is just a macro, and where the change would
 actually benefit, as the compiler could check bad things don't
 happen) it isn't OK for itimespecfix() which wants to modify its
 arg, so can't be passed a const * pointer.    A little more code,
 and an extra variable, would fix that, but that's just a frill
 really, the real objective is to have the bug fixed.

 Further, it is possible that callers using TIMER_RELTIME instead
 of TIMER_ABSTIME are actually expecting the change that is made,
 when needed, by itimespecfix() to be reflected back to the passed
 in timespec - that I haven't audited (and understanding what lwp_park()
 is doing in that case is more than I care to do, really...) - I
 didn't even bother to look at the one or two cases of ts2timo() which
 always use TIMER_RELTIME as they cannot be affected by the issue of
 concern - as seen above, the change is confined to the TIMER_ABSTIME
 case.

 I will also make the change suggested in the PR to ts2timo(9) if
 we proceed with this method of fetching things.


From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/58733 CVS commit: src
Date: Wed, 9 Oct 2024 13:02:54 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Wed Oct  9 13:02:54 UTC 2024

 Modified Files:
 	src/distrib/sets/lists/debug: mi
 	src/distrib/sets/lists/tests: mi
 	src/tests/kernel: Makefile
 Added Files:
 	src/tests/kernel: t_nanosleep.c

 Log Message:
 Add a test for clock_nanotime() [ PR kern/58733 ]

 Add a t_nanosleep test to the kernel tests, to (to a limited extent)
 validate its functionality.

 Initially this concentrates on the issue from PR kern/58733
 where if a process paused in nanosleep() is stopped, and then
 continued, and if it is using TIMER_ABSTIME, the call will
 return prematurely, but indicating success.

 There are (currently) 4 test cases, to test all 4 possibilities
 using CLOCK_MONOTONIC and CLOCK_REALTIME (if someone wants to
 add any other clocks that make sense, that should be easy) and
 TIMER_ABSTIME and TIMER_RELTIME.

 Currently both TIMER_ABSTIME tests fail (the TIMER_RELTIME
 tests pass).   When the kernel bug is fixed, the ABSTIME
 tests should be fixed along with it.

 These tests are currently somewhat crude, and I'm not sure
 how well they will work on a qemu test system (they work as
 expected on bare metal).


 To generate a diff of this commit:
 cvs rdiff -u -r1.450 -r1.451 src/distrib/sets/lists/debug/mi
 cvs rdiff -u -r1.1340 -r1.1341 src/distrib/sets/lists/tests/mi
 cvs rdiff -u -r1.81 -r1.82 src/tests/kernel/Makefile
 cvs rdiff -u -r0 -r1.1 src/tests/kernel/t_nanosleep.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/58733: clock_nanosleep() (and I suspect more) is badly broken
Date: Thu, 10 Oct 2024 18:09:49 +0700

 I have been looking more into this.   I am running the patch I suggested
 in an earlier message, and all seems well, I plan to commit that soon.

 This is what happens now (using some diagnostics requested by setting
 a magic bit in the flags word - and no, neither the diagnostics nor the
 modified version of sleep used to request them are going to be committed).

 [  3098.057195] nanosleep1(): ts2timo() OK [4098.057887029] @[3098.057887687] -> 1000001
 [  3105.459332] nanosleep1(): kpause(1000001)->0
 [  3105.459332] nanosleep1(): now=[3105.459390099] err=0
 [  3105.459332] nanosleep1(): abstime end-now=[992.598496930]
 [  3105.459332] nanosleep1(): for [992.598496930] need 992600
 [  3166.676468] nanosleep1(): kpause(992600)->0
 [  3166.676468] nanosleep1(): now=[3166.676518727] err=0
 [  3166.676468] nanosleep1(): abstime end-now=[931.381368302]
 [  3166.676468] nanosleep1(): for [931.381368302] need 931383
 [  3323.383389] nanosleep1(): kpause(931383)->0
 [  3323.383389] nanosleep1(): now=[3323.383446755] err=0
 [  3323.383389] nanosleep1(): abstime end-now=[774.674440274]
 [  3323.383389] nanosleep1(): for [774.674440274] need 774676
 [  3612.484853] nanosleep1(): kpause(774676)->0
 [  3612.484853] nanosleep1(): now=[3612.485074839] err=0
 [  3612.484853] nanosleep1(): abstime end-now=[485.572812190]
 [  3612.484853] nanosleep1(): for [485.572812190] need 485574
 [  4098.067055] nanosleep1(): kpause(485574)->35
 [  4098.067055] nanosleep1(): now=[4098.067775389] err=0
 [  4098.067055] nanosleep1(): abstime end-now=[-1.990111640]
 [  4098.067055] nanosleep1(): for [0.000000000] need 0
 [  4098.067055] nanosleep1()->0

 The first line of that is where the change is most clear, before
 the patch it would have been something more like

 [  2794.557267] nanosleep1(): ts2timo() OK [999.999999361] @[2794.558139625] -> 1000001

 In both cases the request was for a TIMER_ABSTIME sleep until
 the current time (that is in the [] after '@' in those lines) + 1000
 seconds.   This diagnostic shows what is in the request timespec
 after ts2timo() is called (and returns 0, the "OK" in the line).

 In the patched version that value is unchanged from the request,
 in the previous version it has been set to the interval requested
 (1000 secs, more or less - slightly less as some time passed between
 when sleep obtained the current time, added 10000, and requested that
 end time, and when the kernel subtracted the slightly later (larger)
 current time from that, so leaving a little less than 1000).

 The rest of the diagnostics show the working of nanosleep1() - it
 calls kpause with the timeout (nb: on my system, HZ=1000 so for
 1000 seconds we have 1000000 ticks, plus the free one the kernel
 throws in as I'm such a good customer...)

 In the successful trace above, kpause returned after about 7.4
 seconds (that was as a result of a ^Z fg sequence) leaving 992.6
 seconds left to sleep, 992600 ticks to go, go back and repeat the
 kpause() all over again.   That repeats several times (me stopping
 and resuming the sleep) until the end when kpause() returns 35 (EWOULDBLOCK)
 indicating that kpause() is returning because the time ran out,
 When we calculate the end time then, we see we have overrun by about
 2 seconds (that's acceptable) and nanosleep1() and then clock_nanosleep()
 return 0.   All good.

 With the old ts2timo() the sequence was more like this (from yesterday's
 debugging)...

 [  2794.557267] nanosleep1(): ts2timo() OK [999.999999361] @[2794.558139625] -> 1000001
 [  2797.301318] nanosleep1(): kpause(1000001)->0
 [  2797.301318] nanosleep1(): now=[2797.301376196] err=0
 [  2797.301318] nanosleep1(): abstime end-now=[-1798.698623165]
 [  2797.301318] nanosleep1(): for [0.000000000] need 0
 [  2797.301318] nanosleep1()->0

 The initial kpause() is the same, that time I stopped, and resumed,
 it after about 3 seconds (time went from 2794 to 2797).   The code
 then calculates how much time is left to run, and discovers that it
 has already overrun by almost 1800 seconds!   Better quit quickly,
 so it does.    All because 999-2797 == -1798.

 (How long the overrun appears to be depends upon the current time, the
 uptime for CLOCK_MONOTONIC, if the uptime had been < 1000, when the
 sleep process was resumed, then there would have been a positive
 result from the subtraction, and it would have gone back to sleep
 again, for a while.   That is, the -1800 value is irrelevant, just
 that it is something <= 0 - there was no actual overrun, just faulty
 calculations).

 I have done more examination of the kernel sources, I am even more
 convinced now that this change is the right one to make.   The analysis
 of the callers of ts2timo() stands.   I looked more into lwp_park(),
 that (as I suspected) doesn't care at all what happens to the timespec
 structure that is passed in, it just converts it to ticks, and waits
 for that long (or less, or perhaps a little more, it really doesn't care).
 In the TIMER_ABSTIME case that's it.   This change will make no difference
 to lwp_park() (the only caller I was slightly concerned about) at all.

 However, I have also been considering the TIMER_RELTIME case.   For
 that the timespec is already the interval, so ts2timo() doesn't need
 to do any computation, or adjustment, to get that.

 But if the caller specifies a very small interval, < 1 tick, ts2timo()
 adjusts the timespec passed in to be 1 tick (itimerfix()).

 That also seems wrong to me - this one is an adjustment to the input
 value in the TIMER_RELTIME case only (at least after the change to be
 made for the TIMER_ABSTIME case, when that is made itimerfix() in that
 case will be adjusting a local temporary variable no-one is ever going
 to see again, just wasting everyone's time).

 That can be investigated at some later time, but I suspect that what
 is happening in that case is (at best) sub-optimal as well (even if
 just needless computations doing nothing perhaps, in some of the
 cases.)

 It appears to be as if netbsd-10 has the exact same code and hence
 the same issue, and will need a pullup.   But that should be trivial
 as the change is very small and will apply cleanly to the -10 sources.

 For netbsd-9 the code isn't identical, but that appears to have the
 same problem - simply applying the patch won't work, it will need a
 separate patch (not difficult).   There's one minor additional complication
 though, in -9 ts2timo() is used for the linux compat futex calls (or one
 of them).   It seems from a not very thorough examination admittedly, that
 what ts2timo() does to the timespec is irrelevant to it - in fact from
 the comments, it seems that the author of that code was aware of the
 issue, and didn't much like it happening.   So I think that won't be
 a problem either.

 I do not plan on requesting pullups of either the change to ts2timo(9)
 or the added test case - the older versions don't need either of those.
 The man page, as anyone doing kernel development should be doing it
 using HEAD, not -10 (or -9) so the section-9 man pages there don't need
 to be perfect.   The AFT test working in HEAD (which it does for me)
 should be enough to verify that the identical code in -10 works as well,
 no need to run extra tests there for no reason.   Of course if anyone
 else disagrees and wants to requests pullups of those, go ahead.

 Commit coming very soon.

 kre

From: "Robert Elz" <kre@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/58733 CVS commit: src
Date: Thu, 10 Oct 2024 11:14:29 +0000

 Module Name:	src
 Committed By:	kre
 Date:		Thu Oct 10 11:14:28 UTC 2024

 Modified Files:
 	src/share/man/man9: ts2timo.9
 	src/sys/kern: subr_time.c

 Log Message:
 PR kern/58733 - avoid ts2timo() clobbering its arg

 See the PR for the gory details - in the TIMER_ABSTIME case
 ts2timo() should not (really *must* not) alter the timespec
 it is passed (in that case it should be const - but for now
 anyway, cannot be for the TIMER_RELTIME case, and there is
 just one of them!)

 XXX pullup -10
 XXX pullup -9 (will need a patch).


 To generate a diff of this commit:
 cvs rdiff -u -r1.4 -r1.5 src/share/man/man9/ts2timo.9
 cvs rdiff -u -r1.38 -r1.39 src/sys/kern/subr_time.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: analyzed->needs-pullups
State-Changed-By: kre@NetBSD.org
State-Changed-When: Thu, 10 Oct 2024 11:36:57 +0000
State-Changed-Why:
Needs pullups to -10 (soon) and -9 (later, will need a patch prepared).


From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/58733: clock_nanosleep() (and I suspect more) is badly broken
Date: Thu, 10 Oct 2024 18:57:34 +0700

     Date:        Thu, 10 Oct 2024 18:09:49 +0700
     From:        Robert Elz <kre@munnari.OZ.AU>
     Message-ID:  <10285.1728558589@jacaranda.noi.kre.to>

   | [  4098.067055] nanosleep1(): abstime end-now=[-1.990111640]
   | When we calculate the end time then, we see we have overrun by about
   | 2 seconds

 And of course, that was just me mis-reading how negative timespecs (and
 timevals for that matter) work...   -1.99 means -1 second + 0.99 seconds,
 which is really -0.01 seconds (approx), not almost -2 !

 Intellectually I know that, but when I see -1.99 the brain just jumps
 to normal numbers.

 kre

From: kre@NetBSD.ORG
To: gnats-bugs@NetBSD.ORG
Cc: kre@munnari.OZ.AU
Subject: Re: kern/58733: clock_nanosleep() (and I suspect more) is badly broken
Date: Fri, 11 Oct 2024 21:23:03 +0700 (+07)

 Pullup to -10 requested: [pullup-10 #942]

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/58733 CVS commit: [netbsd-10] src
Date: Fri, 11 Oct 2024 17:07:17 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Fri Oct 11 17:07:17 UTC 2024

 Modified Files:
 	src/share/man/man9 [netbsd-10]: ts2timo.9
 	src/sys/kern [netbsd-10]: subr_time.c

 Log Message:
 Pull up following revision(s) (requested by kre in ticket #942):

 	share/man/man9/ts2timo.9: revision 1.4
 	share/man/man9/ts2timo.9: revision 1.5
 	sys/kern/subr_time.c: revision 1.39

 ts2timo() uses struct timespec, those don't have a tv_usec
 field, they have tv_nsec instead.   EINVAL will happen if the
 tv_nsec field is invalid, not the non-existant tv_usec field.

 PR kern/58733 - avoid ts2timo() clobbering its arg

 See the PR for the gory details - in the TIMER_ABSTIME case
 ts2timo() should not (really *must* not) alter the timespec
 it is passed (in that case it should be const - but for now
 anyway, cannot be for the TIMER_RELTIME case, and there is
 just one of them!).


 To generate a diff of this commit:
 cvs rdiff -u -r1.3 -r1.3.42.1 src/share/man/man9/ts2timo.9
 cvs rdiff -u -r1.35 -r1.35.4.1 src/sys/kern/subr_time.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: needs-pullups->pending-pullups
State-Changed-By: kre@NetBSD.org
State-Changed-When: Sun, 13 Oct 2024 03:40:17 +0000
State-Changed-Why:
[pullup-9 #1909] requested (-10 pullup completed already).


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/58733 CVS commit: [netbsd-9] src
Date: Sun, 13 Oct 2024 15:33:17 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Sun Oct 13 15:33:17 UTC 2024

 Modified Files:
 	src/share/man/man9 [netbsd-9]: ts2timo.9
 	src/sys/kern [netbsd-9]: subr_time.c

 Log Message:
 Pull up following revision(s) (requested by kre in ticket #1909):

 	share/man/man9/ts2timo.9: revision 1.4
 	share/man/man9/ts2timo.9: revision 1.5
 	sys/kern/subr_time.c: revision 1.39 (patch)

 ts2timo() uses struct timespec, those don't have a tv_usec
 field, they have tv_nsec instead.   EINVAL will happen if the
 tv_nsec field is invalid, not the non-existant tv_usec field.
 PR kern/58733 - avoid ts2timo() clobbering its arg

 See the PR for the gory details - in the TIMER_ABSTIME case
 ts2timo() should not (really *must* not) alter the timespec
 it is passed (in that case it should be const - but for now
 anyway, cannot be for the TIMER_RELTIME case, and there is
 just one of them!).


 To generate a diff of this commit:
 cvs rdiff -u -r1.3 -r1.3.34.1 src/share/man/man9/ts2timo.9
 cvs rdiff -u -r1.20.8.1 -r1.20.8.2 src/sys/kern/subr_time.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: pending-pullups->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Sun, 13 Oct 2024 20:33:45 +0000
State-Changed-Why:
Looks like this was fixed in HEAD and pulled up to 9 and 10?
(Feel free to reopen if I have assessed incorrectly!)


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2024 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.