NetBSD Problem Report #57512

From conorh@sdf.org  Fri Jul  7 21:43:02 2023
Return-Path: <conorh@sdf.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 379C21A923D
	for <gnats-bugs@gnats.NetBSD.org>; Fri,  7 Jul 2023 21:43:02 +0000 (UTC)
Message-Id: <202307072142.367Lgo0w023311@sdf.org>
Date: Fri, 7 Jul 2023 14:42:50 -0700 (PDT)
From: conorh@sdf.org
Reply-To: conorh@sdf.org
To: gnats-bugs@NetBSD.org
Subject: clock_gettime(CLOCK_THREAD_CPUTIME_ID) sometimes goes backwards
X-Send-Pr-Version: 3.95

>Number:         57512
>Category:       kern
>Synopsis:       clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) sometimes reports a timespec less than one obtained on a previous invocation (i.e, it goes backwards)
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jul 07 21:45:00 +0000 2023
>Last-Modified:  Sat Jul 08 20:05:01 +0000 2023
>Originator:     Conor Hughes
>Release:        NetBSD 9.3
>Organization:
conorh@sdf.org
SDF Public Access UNIX System - http://sdf.org
>Environment:


System: NetBSD otaku 9.3 NetBSD 9.3 (GENERIC) #0: Thu Aug 4 15:30:37 UTC 2022 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
	When calling clock_gettime(2) and using CLOCK_THREAD_CPUTIME_ID as the clockid_t, the value filled into the timespec is sometimes less than a previous value obtained via the same call on the same thread. Since consuming negative CPU time doesn't make sense, this seems like a bug.

        CLOCK_PROCESS_CPUTIME_ID does not seem to be affected and is a workaround in some cases. I've reproduced this on NetBSD 9.3 on x86_64 and evbarm/armv6.

>How-To-Repeat:
        Compile and run the following test program.

#include <stdio.h>
#include <time.h>
#include <err.h>
#include <sysexits.h>
#include <stdlib.h>
#include <string.h>

/* The specifics of this function are immaterial but we have to do some work in order to separate
 * start and end.
 */
static void waste_some_time()
{
  for ( int i = 0; i< 1000; i++ ) {
    char buf[4 * 4096];
    arc4random_buf(buf, sizeof(buf));
    char buf2[4 * 4096];
    arc4random_buf(buf2, sizeof(buf2));
    if ( memcmp(buf, buf2, sizeof(buf)) == 0 ) {
      fprintf(stderr, "wow!\n");
    }
  }
}

int main(void)
{
  static const int niter = 100;
  static const clockid_t query_clock = CLOCK_THREAD_CPUTIME_ID;
  for ( int iter = 0; iter < niter; iter++ ) {
    struct timespec start;
    if ( clock_gettime(query_clock, &start) != 0 ) {
      err(EX_OSERR, "clock_gettime");
    }

    waste_some_time();

    struct timespec end;
    if ( clock_gettime(query_clock, &end) != 0 ) {
      err(EX_OSERR, "clock_gettime");
    }

    if ( end.tv_sec < start.tv_sec ||
         (end.tv_sec == start.tv_sec && end.tv_nsec < start.tv_nsec) ) {
      errx(EX_OSERR, "clock_gettime went backwards!? after %d iterations, start (%lld %lld) end (%lld %lld)",
           iter,
           (long long)start.tv_sec, (long long)start.tv_nsec,
           (long long)end.tv_sec, (long long)end.tv_nsec);
    }
  }
  fprintf(stderr, "did %d iterations.\n", niter);
  return EXIT_SUCCESS;
}

>Fix:

Unknown
>Audit-Trail:
From: Taylor R Campbell <riastradh@NetBSD.org>
To: conorh@sdf.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/57512: clock_gettime(CLOCK_THREAD_CPUTIME_ID) sometimes goes backwards
Date: Sat, 8 Jul 2023 00:18:21 +0000

 This is because CLOCK_THREAD_CPUTIME_ID is implemented by summing
 l->l_rticksum + l->l_slpticksum, but these are both reset periodically
 by sched_lwp_stats.

 Aside: Not sure why we add the sleeping time, l_slpticksum?

 I think this should probably instead use l->l_ru.ru_utime +
 l->l_ru.ru_stime, possibly with an adjustment from the timecounter
 since the last tick like calcru does for process getrusage.  But I
 would need to think more about it to be sure.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: conorh@sdf.org
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/57512: clock_gettime(CLOCK_THREAD_CPUTIME_ID) sometimes goes backwards
Date: Sat, 8 Jul 2023 01:31:03 +0000

 This is a multi-part message in MIME format.
 --=_17Wymd7lPaKOmlAiVMiz9N/EY5eA0fm1

 Can you please try the attached patch?

 --=_17Wymd7lPaKOmlAiVMiz9N/EY5eA0fm1
 Content-Type: text/plain; charset="ISO-8859-1"; name="clockcputimeid"
 Content-Transfer-Encoding: quoted-printable
 Content-Disposition: attachment; filename="clockcputimeid.patch"

 From 881ce53d36439f2dc9a4cc3be389095344d42501 Mon Sep 17 00:00:00 2001
 From: Taylor R Campbell <riastradh@NetBSD.org>
 Date: Sat, 8 Jul 2023 01:29:28 +0000
 Subject: [PATCH] WIP: clock_gettime(2): Fix CLOCK_PROCESS/THREAD_CPUTIME_ID.

 Use same calculation as getrusage, not some ad-hoc arithmetic of
 internal scheduler parameters that are periodically rewound.

 XXX This creates copypasta of calcru for lwps -- should factor it out
 into a calcrulwp or something to avoid this.
 ---
  sys/kern/subr_time.c | 67 ++++++++++++++++++++++----------------------
  1 file changed, 33 insertions(+), 34 deletions(-)

 diff --git a/sys/kern/subr_time.c b/sys/kern/subr_time.c
 index eaa8312dd8fb..cad7fa26e998 100644
 --- a/sys/kern/subr_time.c
 +++ b/sys/kern/subr_time.c
 @@ -45,12 +45,6 @@ __KERNEL_RCSID(0, "$NetBSD: subr_time.c,v 1.37 2023/04/2=
 9 03:36:55 isaki Exp $")
  #include <sys/timetc.h>
  #include <sys/intr.h>
 =20
 -#ifdef DEBUG_STICKS
 -#define DPRINTF(a) uprintf a
 -#else
 -#define DPRINTF(a)=20
 -#endif
 -
  /*
   * Compute number of hz until specified time.  Used to compute second
   * argument to callout_reset() from an absolute time.
 @@ -248,32 +242,16 @@ clock_timeleft(clockid_t clockid, struct timespec *ts=
 , struct timespec *sleepts)
  	*sleepts =3D sleptts;
  }
 =20
 -static void
 -ticks2ts(uint64_t ticks, struct timespec *ts)
 -{
 -	ts->tv_sec =3D ticks / hz;
 -	uint64_t sticks =3D ticks - ts->tv_sec * hz;
 -	if (sticks > BINTIME_SCALE_MS)	/* floor(2^64 / 1000) */
 -		ts->tv_nsec =3D sticks / hz * 1000000000LL;
 -	else if (sticks > BINTIME_SCALE_US)	/* floor(2^64 / 1000000) */
 -		ts->tv_nsec =3D sticks * 1000LL / hz * 1000000LL;
 -	else
 -		ts->tv_nsec =3D sticks * 1000000000LL / hz;
 -	DPRINTF(("%s: %ju/%ju -> %ju.%ju\n", __func__,
 -	    (uintmax_t)ticks, (uintmax_t)sticks,
 -	    (uintmax_t)ts->tv_sec, (uintmax_t)ts->tv_nsec));
 -}
 -
  int
  clock_gettime1(clockid_t clock_id, struct timespec *ts)
  {
  	int error;
 -	uint64_t ticks;
  	struct proc *p;
 =20
  #define CPUCLOCK_ID_MASK (~(CLOCK_THREAD_CPUTIME_ID|CLOCK_PROCESS_CPUTIME_=
 ID))
  	if (clock_id & CLOCK_PROCESS_CPUTIME_ID) {
  		pid_t pid =3D clock_id & CPUCLOCK_ID_MASK;
 +		struct timeval cputime;
 =20
  		mutex_enter(&proc_lock);
  		p =3D pid =3D=3D 0 ? curproc : proc_find(pid);
 @@ -281,10 +259,10 @@ clock_gettime1(clockid_t clock_id, struct timespec *t=
 s)
  			mutex_exit(&proc_lock);
  			return ESRCH;
  		}
 -		ticks =3D p->p_uticks + p->p_sticks + p->p_iticks;
 -		DPRINTF(("%s: u=3D%ju, s=3D%ju, i=3D%ju\n", __func__,
 -		    (uintmax_t)p->p_uticks, (uintmax_t)p->p_sticks,
 -		    (uintmax_t)p->p_iticks));
 +		mutex_enter(p->p_lock);
 +		calcru(p, /*usertime*/NULL, /*systime*/NULL, /*intrtime*/NULL,
 +		    &cputime);
 +		mutex_exit(p->p_lock);
  		mutex_exit(&proc_lock);
 =20
  		// XXX: Perhaps create a special kauth type
 @@ -293,9 +271,14 @@ clock_gettime1(clockid_t clock_id, struct timespec *ts)
  		    KAUTH_ARG(KAUTH_REQ_PROCESS_CANSEE_ENTRY), NULL, NULL);
  		if (error)
  			return error;
 +
 +		TIMEVAL_TO_TIMESPEC(&cputime, ts);
 +		return 0;
  	} else if (clock_id & CLOCK_THREAD_CPUTIME_ID) {
  		struct lwp *l;
  		lwpid_t lid =3D clock_id & CPUCLOCK_ID_MASK;
 +		struct bintime tm =3D {0, 0};
 +
  		p =3D curproc;
  		mutex_enter(p->p_lock);
  		l =3D lid =3D=3D 0 ? curlwp : lwp_find(p, lid);
 @@ -303,15 +286,31 @@ clock_gettime1(clockid_t clock_id, struct timespec *t=
 s)
  			mutex_exit(p->p_lock);
  			return ESRCH;
  		}
 -		ticks =3D l->l_rticksum + l->l_slpticksum;
 -		DPRINTF(("%s: r=3D%ju, s=3D%ju\n", __func__,
 -		    (uintmax_t)l->l_rticksum, (uintmax_t)l->l_slpticksum));
 +
 +		/* XXX begin copypasta from calcru */
 +		lwp_lock(l);
 +		bintime_add(&tm, &l->l_rtime);
 +		if ((l->l_pflag & LP_RUNNING) !=3D 0 &&
 +		    (l->l_pflag & (LP_INTR | LP_TIMEINTR)) !=3D LP_INTR) {
 +			struct bintime diff;
 +			/*
 +			 * Adjust for the current time slice.  This is
 +			 * actually fairly important since the error
 +			 * here is on the order of a time quantum,
 +			 * which is much greater than the sampling
 +			 * error.
 +			 */
 +			binuptime(&diff);
 +			membar_consumer(); /* for softint_dispatch() */
 +			bintime_sub(&diff, &l->l_stime);
 +			bintime_add(&tm, &diff);
 +		}
 +		lwp_unlock(l);
 +		/* XXX end copypasta from calcru */
 +
  		mutex_exit(p->p_lock);
 -	} else
 -		ticks =3D (uint64_t)-1;
 =20
 -	if (ticks !=3D (uint64_t)-1) {
 -		ticks2ts(ticks, ts);
 +		bintime2timespec(&tm, ts);
  		return 0;
  	}
 =20

 --=_17Wymd7lPaKOmlAiVMiz9N/EY5eA0fm1--

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57512 CVS commit: src/tests/lib/libc/sys
Date: Sat, 8 Jul 2023 14:05:51 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Jul  8 14:05:51 UTC 2023

 Modified Files:
 	src/tests/lib/libc/sys: t_clock_gettime.c

 Log Message:
 t_clock_gettime: Add test for PR kern/57512.


 To generate a diff of this commit:
 cvs rdiff -u -r1.3 -r1.4 src/tests/lib/libc/sys/t_clock_gettime.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57512 CVS commit: src
Date: Sat, 8 Jul 2023 20:02:10 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Sat Jul  8 20:02:10 UTC 2023

 Modified Files:
 	src/sys/kern: kern_resource.c subr_time.c
 	src/sys/sys: resourcevar.h
 	src/tests/lib/libc/sys: t_clock_gettime.c

 Log Message:
 clock_gettime(2): Fix CLOCK_PROCESS/THREAD_CPUTIME_ID.

 Use same calculation as getrusage, not some ad-hoc arithmetic of
 internal scheduler parameters that are periodically rewound.

 PR kern/57512

 XXX pullup-8
 XXX pullup-9
 XXX pullup-10


 To generate a diff of this commit:
 cvs rdiff -u -r1.190 -r1.191 src/sys/kern/kern_resource.c
 cvs rdiff -u -r1.37 -r1.38 src/sys/kern/subr_time.c
 cvs rdiff -u -r1.57 -r1.58 src/sys/sys/resourcevar.h
 cvs rdiff -u -r1.4 -r1.5 src/tests/lib/libc/sys/t_clock_gettime.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.