NetBSD Problem Report #58560
From www@netbsd.org Wed Aug 7 14:36:35 2024
Message-Id: <20240807143633.CFFD51A923C@mollari.NetBSD.org>
Date: Wed, 7 Aug 2024 14:36:33 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: experimental wg(4) handshake retry doesn't add timing jitter
X-Send-Pr-Version: www-1.0
>Number: 58560
>Category: kern
>Synopsis: experimental wg(4) handshake retry doesn't add timing jitter
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Aug 07 14:40:00 +0000 2024
>Originator: Taylor R Campbell
>Release: current, 10
>Organization:
The Jitterbug Wgdation
>Environment:
>Description:
From the technical whitepaper (https://www.wireguard.com/papers/wireguard.pdf):
[W] 6.1 Timers & Stateless UX: Preliminaries
"Whenever a handshake initiation message is sent as the result of an expiring timer, an additional amount of jitter is added to the expiration, in order to prevent two peers from repeatedly initiating handshakes at the same time."
This leaves the distribution of the delay entirely unspecified, so strictly speaking we already implement it, simply because there is no perfect clock behind callout_schedule.
But maybe the intent is to have much higher variance in the actual timing difference -- say, an exponential distribution averaging about 1/16 sec truncated at 1 sec.
>How-To-Repeat:
code inspection
>Fix:
Add a random delay when scheduling wgp_handshake_timeout_timer.
Here's a candidate implementation. I'm not committing it just yet because it will require adjusting the tests that currently rely on timeouts precise to the second to exercise various paths in the state machine -- they will need to have more slop added to avoid spuriously failing.
/*
 * wg_jittery_timeout(n)
 *
 *	Return a random timeout of at most n + 1 seconds, and at most
 *	INT_MAX ticks, and, unless n is too large for INT_MAX ticks, at
 *	least n seconds.
 *
 *	There is no guidance on the distribution of jitter, so we'll
 *	approximate an exponential distribution with average 1/16 sec,
 *	truncated to 1 sec, to add to n sec.
 *
 *	[W] 6.1 Timers & Stateless UX: Preliminaries
 *	"Whenever a handshake initiation message is sent as the result
 *	of an expiring timer, an additional amount of jitter is added
 *	to the expiration, in order to prevent two peers from
 *	repeatedly initiating handshakes at the same time."
 */
static int
wg_jittery_timeout(int nsec)
{
	/*
	 * Sample g from a truncated Geometric(1/2) distribution,
	 * truncated to MIN(hz, 32).
	 *
	 * ffs32 is 1-based, with ffs32(0) = 0, so in unsigned
	 * arithmetic ffs32(0) - 1 = -1 = UINT_MAX.  For nonzero x,
	 * ffs32(x) - 1 lies in {0, 1, 2, ..., 31}.  Hence
	 * g = MIN(ffs32(x) - 1, MIN(hz, 32)) lies in
	 * {0, 1, 2, ..., MIN(hz, 32)}; UINT_MAX is clamped to the
	 * largest point, MIN(hz, 32).
	 *
	 * Since g <= MIN(hz, 32), we have g/MIN(hz, 32) <= 1, so
	 *
	 *	hz*(g/MIN(hz, 32)) <= hz,
	 *
	 * and hence the delay d that we actually compute is bounded
	 * by:
	 *
	 *	d := g*floor(hz/MIN(hz, 32))
	 *	  <= g*(hz/MIN(hz, 32))
	 *	   = hz*(g/MIN(hz, 32))
	 *	  <= hz.
	 */
	const uint32_t x = cprng_fast32();
	/* Cast to unsigned so that ffs32(0) - 1 wraps and gets clamped. */
	const unsigned g = MIN((unsigned)ffs32(x) - 1, (unsigned)MIN(hz, 32));
	const unsigned d = g*(hz/MIN(hz, 32));

	KASSERTMSG(d <= hz, "x=%"PRIu32" g=%u d=%u hz=%d", x, g, d, hz);

	/* Clamp nsec so that nsec*hz + d cannot exceed INT_MAX ticks. */
	return MIN((unsigned)nsec, (unsigned)(INT_MAX/hz - 1))*hz + d;
}
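
The bound argued in the comment above can be checked outside the kernel. What follows is only a userland sketch of the same arithmetic, not the candidate patch itself: model_ffs32, umin, model_jitter_ticks and model_jittery_timeout are hypothetical names, hz is passed as a parameter rather than read from the kernel global, and the random word x is supplied by the caller so that the cases x = 0 and x = 2^31 can be exercised deterministically.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's ffs32(): 1-based index of
 * the lowest set bit of x, or 0 if x == 0.
 */
static unsigned
model_ffs32(uint32_t x)
{
	unsigned i;

	if (x == 0)
		return 0;
	for (i = 1; (x & 1) == 0; i++)
		x >>= 1;
	return i;
}

static unsigned
umin(unsigned a, unsigned b)
{
	return a < b ? a : b;
}

/*
 * Jitter in ticks for random word x at tick rate hz (assumed >= 1).
 * The result is always in [0, hz].
 */
static unsigned
model_jitter_ticks(uint32_t x, unsigned hz)
{
	const unsigned cap = umin(hz, 32);
	/* model_ffs32(0) - 1 wraps to UINT_MAX and is clamped to cap. */
	const unsigned g = umin(model_ffs32(x) - 1, cap);

	return g * (hz / cap);
}

/*
 * Timeout in ticks: nsec seconds plus jitter, never more than
 * INT_MAX ticks.
 */
static int
model_jittery_timeout(unsigned nsec, uint32_t x, unsigned hz)
{
	const unsigned d = model_jitter_ticks(x, hz);
	const unsigned n = umin(nsec, (unsigned)(INT_MAX / (int)hz - 1));

	return (int)(n * hz + d);
}
```

For example, at hz = 100 the jitter is a multiple of 3 ticks between 0 and 96, so model_jittery_timeout(5, x, 100) always lies between 500 and 596 ticks. For a uniformly random x, P(g = k) = 2^-(k+1), so g averages just under 1 and the mean added delay in this model is about floor(hz/32) ticks, i.e. roughly 1/32 sec for hz >= 32; that may be worth comparing against the 1/16 sec average stated in the comment.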