NetBSD Problem Report #53422

From gson@gson.org  Tue Jul  3 14:59:31 2018
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 01AEE7A180
	for <gnats-bugs@gnats.NetBSD.org>; Tue,  3 Jul 2018 14:59:30 +0000 (UTC)
Message-Id: <20180703145924.D23879896A2@guava.gson.org>
Date: Tue,  3 Jul 2018 17:59:24 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: Many zfs tests fail on low-memory systems
X-Send-Pr-Version: 3.95

>Number:         53422
>Category:       kern
>Synopsis:       Many zfs tests fail on low-memory systems
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Jul 03 15:00:00 +0000 2018
>Closed-Date:    Wed Apr 24 12:01:35 +0000 2019
>Last-Modified:  Wed Apr 24 12:01:35 +0000 2019
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current, source date >= 2018-05-28
>Organization:

>Environment:
System: NetBSD
Architecture: x86_64
Machine: amd64
>Description:

Since the zfs updates of May 28, a large number of zfs tests are
failing on the amd64 testbed:

  http://releng.netbsd.org/b5reports/amd64/commits-2018.05.html#2018.05.29.01.09.49

Most of the failures, for example fs/vfs/t_full/zfs_fillfs,
are accompanied by console messages like the following:

    zfs_fillfs: [ 53721.7611795] WARNING: ZFS on NetBSD is under development
    [ 53721.7611795] ERROR: at least 512MB of memory required touse ZFS
    [ 53721.7611795] WARNING: module error: modcmd function failed for `zfs', error 12
    [3.195509s] Failed: mount failed: Unknown error: 256

Before the changes, the tests worked even though the virtual machine
running them has only 128 MB of memory.  The zfs module was loaded
not into the real kernel of the VM but into a rump kernel, and rump
emulates 512 MB of memory regardless of the memory size of the system
it runs in.  This size was chosen specifically for the benefit of zfs,
as the comment in src/sys/rump/librump/rumpkern/emul.c says:

  /*
   * physmem is largely unused (except for nmbcluster calculations),
   * so pick a default value which suits ZFS.  if an application wants
   * a very small memory footprint, it can still adjust this before
   * calling rump_init()
   */
  #define PHYSMEM 512*256
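
For reference, the arithmetic behind that value can be sketched as
follows.  physmem is counted in pages, so 512*256 pages correspond to
512 MB only under the assumption of 4 KB pages, as on amd64.  The
names SKETCH_PAGE_SIZE, SKETCH_PHYSMEM_PAGES, pages_to_bytes() and
pages_to_mib() are made up for this illustration and do not appear in
emul.c.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KB pages, as on amd64. */
#define SKETCH_PAGE_SIZE	4096u

/* Same value as the quoted PHYSMEM, parenthesized for safe expansion. */
#define SKETCH_PHYSMEM_PAGES	(512 * 256)

/* Convert a page count to bytes. */
static uint64_t
pages_to_bytes(uint64_t npages)
{
	/* Guard against overflow of the multiplication below. */
	assert(npages <= UINT64_MAX / SKETCH_PAGE_SIZE);
	return npages * SKETCH_PAGE_SIZE;
}

/* Convert a page count to whole megabytes (2^20 bytes). */
static uint64_t
pages_to_mib(uint64_t npages)
{
	return pages_to_bytes(npages) / (1024u * 1024u);
}
```

With these helpers, pages_to_mib(SKETCH_PHYSMEM_PAGES) evaluates to
512, while the 128 MB test VM corresponds to 128*256 pages.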

After the changes, the tests attempt to load the zfs module into the
real kernel, triggering the console messages shown above.  I tracked
down one such attempt to where the zfs_fillfs test executes the shell
command "zpool create mnt /zfsdev".  Previously, all zfs operations
executed by the zpool process were successfully hijacked using
RUMPHIJACK and executed in the rump kernel.  Now libzfs_init() has
been changed to call libzfs_load(), which attempts to load the module
into the real kernel using modctl(MODCTL_LOAD...), and that call is
not subject to hijacking.

However, it looks like this is not all there is to the problem,
because when I tried commenting out the modctl(MODCTL_LOAD...) call
in libzfs_load(), the tests still failed.

There are also other new zfs test failures that occur even on systems
with 512 MB or more, but I will file a separate PR about those.

>How-To-Repeat:

Run the ATF tests in a virtual machine with 128 MB memory.

>Fix:

>Release-Note:

>Audit-Trail:
From: "Juergen Hannken-Illjes" <hannken@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/53422 CVS commit: src/tests/fs
Date: Sun, 16 Dec 2018 14:04:14 +0000

 Module Name:	src
 Committed By:	hannken
 Date:		Sun Dec 16 14:04:14 UTC 2018

 Modified Files:
 	src/tests/fs/common: fstest_zfs.c
 	src/tests/fs/zfs: t_zpool.sh

 Log Message:
 Have to hijack sysctl() and modctl() for zfs commands.

 Should fix PR kern/53422
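
 A minimal sketch of what hijacking these two calls can look like from
 C, assuming the "sysctl=yes" and "modctl=yes" options described in
 rumphijack(3).  This is not the committed diff: the function names
 build_hijack_env() and export_hijack_env() and the example option
 string "blanket=/dev/zfs" are hypothetical.

```c
#define _POSIX_C_SOURCE 200112L	/* for setenv() */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build a RUMPHIJACK value that keeps the caller's existing options
 * and additionally asks for sysctl and modctl to be directed to the
 * rump kernel.  Returns 0 on success, -1 if the buffer is too small.
 */
static int
build_hijack_env(char *buf, size_t buflen, const char *base_opts)
{
	const char *sep;
	int n;

	assert(buf != NULL && base_opts != NULL);

	/* Only emit a separating comma if there are options to keep. */
	sep = (strlen(base_opts) > 0) ? "," : "";
	n = snprintf(buf, buflen, "%s%ssysctl=yes,modctl=yes",
	    base_opts, sep);
	return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}

/* Export the value so that child processes such as zpool inherit it. */
static int
export_hijack_env(const char *base_opts)
{
	char buf[256];

	if (build_hijack_env(buf, sizeof(buf), base_opts) == -1)
		return -1;
	return setenv("RUMPHIJACK", buf, 1);
}
```

 Calling export_hijack_env("blanket=/dev/zfs") before spawning a zfs
 command would leave RUMPHIJACK set to
 "blanket=/dev/zfs,sysctl=yes,modctl=yes" in the child's environment.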


 To generate a diff of this commit:
 cvs rdiff -u -r1.1 -r1.2 src/tests/fs/common/fstest_zfs.c
 cvs rdiff -u -r1.3 -r1.4 src/tests/fs/zfs/t_zpool.sh

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Wed, 24 Apr 2019 12:01:35 +0000
State-Changed-Why:
Fixed by hannken some time ago, thanks!


>Unformatted:
