NetBSD Problem Report #58198
From kardel@Kardel.name Fri Apr 26 14:48:11 2024
Return-Path: <kardel@Kardel.name>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 166151A9238
for <gnats-bugs@gnats.NetBSD.org>; Fri, 26 Apr 2024 14:48:11 +0000 (UTC)
Message-Id: <20240426132832.3CB3044AFF@Andromeda.Kardel.name>
Date: Fri, 26 Apr 2024 15:28:32 +0200 (CEST)
From: kardel@netbsd.org
Reply-To: kardel@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: ZFS can lead to UVM kills (no swap, out of swap)
X-Send-Pr-Version: 3.95
>Number: 58198
>Category: kern
>Synopsis: ZFS can lead to UVM kills (no swap, out of swap)
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Apr 26 14:50:00 +0000 2024
>Originator: kardel@netbsd.org
>Release: NetBSD 10.0_STABLE
>Organization:
>Environment:
System: NetBSD AlpineTest.acrys.com 10.0_STABLE NetBSD 10.0_STABLE (GENERIC) #2: Thu Apr 25 19:55:18 CEST 2024 kardel@gaia.acrys.com:/src/NetBSD/n10/src/obj.amd64/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
Using ZFS can lead to processes being killed by UVM because the system is
out of swap or has no swap space configured.
This has been observed on systems with a sufficiently large memory footprint
(e.g. 240GB) running a GENERIC kernel as a Xen DOMU (HVM).
The use case is, for example, an 8-way parallel load of a database.
The out of swap / no swap space kill happens around the time
when ZFS should, at the latest, start releasing pool memory.
The hypothesis is that, even though enough physical memory is
available, ZFS eventually hogs most pool memory and does not free
up pool memory resources *in time* before the pagedaemon decides it
needs to swap. At that point swap is required or the process is killed
when no swap is available.
With swap available, processing continues, ZFS frees some of
its pool memory, and everything carries on.
So, though there is no real resource reason for needing swap,
the ZFS / pagedaemon / swap / UVM interaction is at best suboptimal.
There should be no need to have swap space available when running
a database (or any other writing process) on ZFS, as ZFS can always
evict data to storage.
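
The hypothesized race can be illustrated with a toy model. The sketch below
is not NetBSD or ZFS code: the function name, the parameters and the page
counts are all hypothetical and chosen only for illustration. It assumes a
fixed amount of memory, a cache that takes pages on every tick, and a drain
request that only takes effect some ticks after the free-page count has
fallen below a threshold.

```python
def simulate(total=1000, demand_per_tick=50, low_water=100,
             drain_delay=3, drain_amount=400, ticks=40):
    """Toy model: count the ticks in which an allocation found no free pages.

    All parameters are hypothetical units, not kernel tunables:
    total           -- pages of physical memory
    demand_per_tick -- pages the cache wants to take each tick
    low_water       -- free-page threshold at which a drain is requested
    drain_delay     -- ticks between requesting a drain and pages being freed
    drain_amount    -- pages released by one completed drain
    """
    free = total
    cache = 0
    pending = None      # ticks left until the requested drain completes
    shortfalls = 0      # ticks where swap would have been needed
    for _ in range(ticks):
        # A previously requested drain completes asynchronously.
        if pending is not None:
            if pending == 0:
                released = min(drain_amount, cache)
                cache -= released
                free += released
                pending = None
            else:
                pending -= 1
        # New allocations keep arriving regardless of the drain state.
        if free >= demand_per_tick:
            free -= demand_per_tick
            cache += demand_per_tick
        else:
            shortfalls += 1   # nothing free: swap out or kill a process
        # The "pagedaemon" notices the shortage and requests a drain.
        if free < low_water and pending is None:
            pending = drain_delay
    return shortfalls
```

In this model simulate(drain_delay=0) reports no shortfall, while
simulate(drain_delay=3) does, although the cache could have been shrunk at
any time. Raising low_water (for example to 250) removes the shortfall
again, which matches the idea that the drain is merely triggered too late.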
>How-To-Repeat:
Set up a Xen DOMU with significant memory and no swap. Create a database on ZFS
and load a larger database backup with a higher degree of parallelism (e.g. 8).
Sit back and watch ZFS consume pool memory. Once almost all pool memory is consumed,
some unlucky processes may be killed by UVM.
Apr 22 09:32:02 Alpine-next /netbsd: [ 8871047.9934402] UVM: pid 26090.6694 (java), uid 1802 killed: out of swap
Apr 22 09:32:02 Alpine-next /netbsd: [ 8871047.9934402] UVM: pid 1944.1944 (postgres), uid 1003 killed: out of swap
Apr 22 09:32:02 Alpine-next /netbsd: [ 8871047.9934402] UVM: pid 23508.26256 (java), uid 1802 killed: out of swap
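
To check a syslog file for such kills while reproducing, a small filter can
help. This is a minimal sketch that assumes the message layout shown in the
three lines above; the function names are hypothetical, and reading the
second number after "pid" as the LWP id is an assumption.

```python
import re
from collections import Counter

# Matches e.g. "UVM: pid 1944.1944 (postgres), uid 1003 killed: out of swap"
UVM_KILL = re.compile(
    r"UVM: pid (?P<pid>\d+)\.(?P<lwp>\d+) \((?P<comm>[^)]*)\), "
    r"uid (?P<uid>\d+) killed: (?P<reason>.*)$"
)

def parse_uvm_kills(lines):
    """Return one dict per UVM kill message found in the given lines."""
    kills = []
    for line in lines:
        m = UVM_KILL.search(line.rstrip("\n"))
        if m:
            kills.append({
                "pid": int(m.group("pid")),
                "lwp": int(m.group("lwp")),   # assumed to be the LWP id
                "comm": m.group("comm"),
                "uid": int(m.group("uid")),
                "reason": m.group("reason"),
            })
    return kills

def kills_per_command(lines):
    """Count UVM kills by command name."""
    return Counter(k["comm"] for k in parse_uvm_kills(lines))
```

Fed with an open log file, e.g. kills_per_command(open(path)), it gives a
quick overview of which processes were hit.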
>Fix:
Rework ZFS/pagedaemon communication so that overshooting into swap before ZFS manages to free memory cannot happen.
This might require better coordination with ZFS, as currently the pool drain mechanism is too asynchronous:
further memory requests arriving just after pool draining was triggered lead to swap usage.
Alternatively, ZFS could be limited so that it does not consume so much memory.
May be related to PR kern/57558 (there ZFS does not free resources though it could).
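
Both ideas (a limit on the cache, and reclaim that completes before an
allocation fails) can be sketched in a toy accounting model. This is not a
proposal for the actual kernel change: the class, its methods and the page
counts are hypothetical, and writing dirty data back to storage is not
modelled.

```python
class Memory:
    """Toy accounting of pages shared by a cache and other users."""

    def __init__(self, total_pages, cache_limit):
        self.free = total_pages
        self.cache = 0
        self.cache_limit = cache_limit  # hypothetical cap on the cache size

    def _evict(self, pages):
        # Drop up to `pages` pages from the cache and return them to free.
        released = min(pages, self.cache)
        self.cache -= released
        self.free += released
        return released

    def cache_insert(self, pages):
        """Grow the cache, evicting old entries so the cap is respected."""
        pages = min(pages, self.cache_limit)
        over = self.cache + pages - self.cache_limit
        if over > 0:
            self._evict(over)
        take = min(pages, self.free)
        self.free -= take
        self.cache += take
        return take

    def alloc(self, pages):
        """Allocate for a process, reclaiming from the cache synchronously.

        The request only fails if the cache cannot cover the shortage,
        i.e. if memory is genuinely exhausted.
        """
        if self.free < pages:
            self._evict(pages - self.free)
        if self.free < pages:
            return False
        self.free -= pages
        return True
```

With Memory(1000, 600), repeated cache_insert(50) calls level off at 600
cached pages, and a following alloc(700) succeeds by shrinking the cache
first instead of failing.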
>Unformatted:
Copyright © 1994-2024
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.