NetBSD Problem Report #57920
From www@netbsd.org Sat Feb 10 19:55:30 2024
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 5AC441A9239
for <gnats-bugs@gnats.NetBSD.org>; Sat, 10 Feb 2024 19:55:30 +0000 (UTC)
Message-Id: <20240210195528.8497C1A923A@mollari.NetBSD.org>
Date: Sat, 10 Feb 2024 19:55:28 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: hardclock(9) contract is unclear about missed ticks
X-Send-Pr-Version: www-1.0
>Number: 57920
>Category: kern
>Synopsis: hardclock(9) contract is unclear about missed ticks
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Feb 10 20:00:00 +0000 2024
>Last-Modified: Sun Apr 07 23:00:02 +0000 2024
>Originator: Taylor R Campbell
>Release: current
>Organization:
The NetBSD Hardclock
>Environment:
>Description:
Quoth the hardclock(9) man page:
The hardclock() function is called hz(9) times per second. It implements
the real-time system clock. The argument frame is an opaque, machine-
dependent structure that encapsulates the previous machine state.
What should happen if the machine-dependent periodic timer interrupt is delayed, or some timer interrupts have been missed, but the underlying timer hardware can tell by how much it was delayed or how many interrupts were missed?
Reasons for this include entering and exiting ddb, suspending and resuming hardware, scheduling delays on virtual hardware, and flaky hardware.
Here are some options if n > 1 periods have elapsed since the last hardclock tick:
1. Call hardclock once, i.e., pretend nothing happened and let the timecounter sort out clock jumps.
2. Call hardclock n times, i.e., try to catch up as fast as we can even if that means hardclocks happen much faster than hz times per second.
3. Call hardclock MIN(n, k) times for some bound k, i.e., try to catch up but by at most k/hz seconds.
Some drivers, like the i8254 driver in arch/x86/isa/clock.c and the Intel local APIC driver in arch/x86/x86/lapic.c, do (1); some drivers, like the PowerPC e500 clock driver in arch/powerpc/booke/e500_timer.c, do (2); other drivers, like the Xen clock driver in arch/xen/xen/xen_clock.c, do (3). Which should it be?
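For concreteness, the three options can be written as one small policy function. This is only a sketch: the enum, the name hardclock_calls, and the userland assert are hypothetical and do not exist in the tree; the drivers named above each open-code their own variant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the three options listed above. */
enum catchup_policy {
	CATCHUP_NONE,		/* option 1: call hardclock once */
	CATCHUP_ALL,		/* option 2: call hardclock n times */
	CATCHUP_BOUNDED		/* option 3: call hardclock MIN(n, k) times */
};

/*
 * Return how many times to call hardclock when n tick periods have
 * elapsed since the last hardclock tick.  k is only used by
 * CATCHUP_BOUNDED and must be at least 1.
 */
static uint64_t
hardclock_calls(enum catchup_policy policy, uint64_t n, uint64_t k)
{

	if (n == 0)
		return 0;	/* no full period has elapsed yet */

	switch (policy) {
	case CATCHUP_ALL:
		return n;
	case CATCHUP_BOUNDED:
		assert(k >= 1);
		return n < k ? n : k;
	case CATCHUP_NONE:
	default:
		return 1;
	}
}
```

With n = 1 all three policies agree; they differ only once ticks have been missed.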
>How-To-Repeat:
code inspection, diagnosing heartbeat issues with ddb on riscv, writing a new clock driver and wondering what to do in this case
>Fix:
Yes, please!
Perhaps hardclock(9) should be extended with an argument saying how many ticks the MD clock driver thinks have elapsed; if >1, it missed some. We can have the policy about what to do in this case -- dtrace probe, event counter, printf, callout scheduling, whatever -- in MI code, and leave only the mechanism for detecting missed ticks in MD code.
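A rough sketch of what that split might look like, written as standalone C rather than kernel code. Everything here is an assumption for illustration: the name hardclock_n, the stats structure, and the particular policy shown (count the missed ticks, do the per-tick work once) are hypothetical, and the real MI policy is exactly what this PR leaves open.

```c
#include <assert.h>
#include <stdint.h>

struct clockframe;	/* opaque, machine-dependent; never dereferenced here */

/* Hypothetical stand-in for an event counter kept by MI code. */
struct hardclock_stats {
	uint64_t ticks;		/* times the per-tick work ran */
	uint64_t missed;	/* ticks the MD driver reported as missed */
};

static struct hardclock_stats hc_stats;

/*
 * Hypothetical extended entry point.  The MD driver passes how many
 * tick periods it thinks have elapsed (the mechanism); what to do
 * about nticks > 1 is decided here in one place (the policy).
 */
static void
hardclock_n(struct clockframe *frame, uint64_t nticks)
{

	(void)frame;
	assert(nticks >= 1);

	if (nticks > 1)
		hc_stats.missed += nticks - 1;	/* record, don't replay */
	hc_stats.ticks++;			/* per-tick work, done once */
}
```

An MD driver that cannot detect missed ticks would simply always pass 1.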
>Audit-Trail:
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57920 CVS commit: src/sys/arch/riscv/riscv
Date: Sun, 7 Apr 2024 22:59:13 +0000
Module Name: src
Committed By: riastradh
Date: Sun Apr 7 22:59:13 UTC 2024
Modified Files:
src/sys/arch/riscv/riscv: clock_machdep.c
Log Message:
riscv: Schedule next hardclock tick in the future, not the past.
If we have missed hardclock ticks, schedule up to one tick interval
in the future anyway; don't try to play hardclock catchup by
scheduling for when the next hardclock tick _should_ have been, in
the past, leading to ticking as fast as possible until we've caught
up.
Playing hardclock catchup triggers heartbeat panics when continuing
from ddb, if you've been in ddb for >15sec. Other hardclock drivers
like x86 lapic don't play hardclock catchup either.
PR kern/57920
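The scheduling rule described in the log message can be sketched as follows. This is not the code from clock_machdep.c; the function name and parameters are hypothetical, times are in arbitrary timer units, and counter wraparound is ignored.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the deadline for the next hardclock tick.
 *
 * now            current timer value
 * last_deadline  when the tick being handled was scheduled to fire
 * interval       timer units per hardclock tick
 */
static uint64_t
next_tick_deadline(uint64_t now, uint64_t last_deadline, uint64_t interval)
{
	uint64_t next;

	assert(interval > 0);

	/* Normal case: stay on the regular tick grid. */
	next = last_deadline + interval;

	/*
	 * If that deadline is not in the future, ticks were missed.
	 * Scheduling it anyway would fire immediately, again and again,
	 * until caught up; schedule one interval from now instead.
	 */
	if (next <= now)
		next = now + interval;

	return next;
}
```

For example, with interval 10 and last_deadline 100, a handler running at now = 105 schedules 110, while one running at now = 1000 (say, after a long stay in ddb) schedules 1010 rather than 110.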
To generate a diff of this commit:
cvs rdiff -u -r1.7 -r1.8 src/sys/arch/riscv/riscv/clock_machdep.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
Copyright © 1994-2024
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.