NetBSD Problem Report #57140

From www@netbsd.org  Wed Dec 28 12:21:35 2022
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 5ED751A9239
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 28 Dec 2022 12:21:35 +0000 (UTC)
Message-Id: <20221228122133.DA9841A923A@mollari.NetBSD.org>
Date: Wed, 28 Dec 2022 12:21:33 +0000 (UTC)
From: cryintothebluesky@gmail.com
Reply-To: cryintothebluesky@gmail.com
To: gnats-bugs@NetBSD.org
Subject: ZFS on NetBSD results in heavy swapping to disk
X-Send-Pr-Version: www-1.0

>Number:         57140
>Category:       kern
>Synopsis:       ZFS on NetBSD results in heavy swapping to disk
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Dec 28 12:25:00 +0000 2022
>Originator:     Sad Clouds
>Release:        10.0_BETA
>Organization:
>Environment:
NetBSD rp4-netbsd 10.0_BETA NetBSD 10.0_BETA (GENERIC64) #0: Fri Dec 23 11:21:46 GMT 2022
>Description:
This issue only affects ZFS. For example, the VM subsystem with FFSv2 behaves correctly and releases file cache for the use of anonymous memory.

The test was done on Raspberry Pi with 4 GiB of RAM and 4 GiB of swap space.


First enable ZFS and create a file system with compression disabled.

Add to /etc/rc.conf:
zfs=YES

zpool create data /dev/dk4
zfs create -o compression=off -o mountpoint=/opt data/opt


Write a large file, equal in size to the available RAM, to ZFS. This will use most of the RAM as ZFS cache:

dd if=/dev/zero of=/opt/out bs=1m count=4000


Next, execute a program that allocates 3 GiB of memory:

# cat test.c
#include <assert.h>
#include <stdlib.h>
#include <stdint.h>

int main(void)
{
        uint8_t *u8 = NULL;
        size_t size_val = 3U * 1024U * 1024U * 1024U;

        u8 = malloc(size_val);
        assert(u8 != NULL);

        /* Force page faults and physical memory allocation for every 4K */
        for (size_t i = 0; i < (size_val / 4096); i++)
        {
                *u8 = 123;
                u8 += 4096;
        }
}


This is where the ZFS issues start: it does not release file-cache memory and the system starts swapping heavily.

# top -s1 -b
load averages:  0.89,  0.31,  0.12;               up 0+00:05:33        11:57:42
20 processes: 18 sleeping, 2 on CPU
CPU states:  0.0% user,  0.0% nice, 25.7% system,  0.7% interrupt, 73.5% idle
Memory: 669M Act, 41M Inact, 9984K Exec, 448K File, 56K Free
Swap: 4096M Total, 2065M Used, 2031M Free / Pools: 3043M Used

  PID USERNAME PRI NICE   SIZE   RES STATE       TIME   WCPU    CPU COMMAND
    0 root     126    0     0K 2972M CPU/2       1:36 98.10% 98.10% [system]
  402 root      85    0  3089M  701M flt_no/1    0:03  2.08%  2.05% a.out


There does not seem to be a tunable setting to limit the ZFS file cache or force it to release memory more aggressively. I am not sure if I am missing something here, or if ZFS is simply not suitable for systems with heavy memory contention between the file cache and anonymous mappings. ZFS uses its own ARC as the file cache, rather than the NetBSD VM page cache, hence the issue.
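
For comparison (not from the PR): other OpenZFS platforms do expose an ARC cap, e.g. the vfs.zfs.arc.max sysctl on FreeBSD and the zfs_arc_max module parameter on Linux. Whether the NetBSD port wires up anything equivalent is not confirmed here, but any exposed ZFS/ARC nodes can be probed generically:

```shell
# List any ZFS/ARC-related sysctl nodes the kernel exposes, if any
sysctl -a 2>/dev/null | grep -i -e zfs -e arcstats
```

An empty result would support the observation above that no such tunable is available on this system.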
>How-To-Repeat:

>Fix:
