NetBSD Problem Report #57199

From brad@anduin.eldar.org  Fri Jan 27 17:38:12 2023
Return-Path: <brad@anduin.eldar.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 363051A9239
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 27 Jan 2023 17:38:12 +0000 (UTC)
Message-Id: <202301271738.30RHc5UR029176@anduin.eldar.org>
Date: Fri, 27 Jan 2023 12:38:05 -0500 (EST)
From: brad@anduin.eldar.org
Reply-To: brad@anduin.eldar.org
To: gnats-bugs@NetBSD.org
Subject: Pure PVH i386 guests hang on disk activity
X-Send-Pr-Version: 3.95

>Number:         57199
>Category:       kern
>Synopsis:       Pure PVH i386 guests hang on disk activity
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Jan 27 17:40:00 +0000 2023
>Originator:     Brad Spencer
>Release:        NetBSD 10.0_BETA
>Organization:
	eldar.org
>Environment:
System: NetBSD nbsd10test32bit 10.0_BETA NetBSD 10.0_BETA (GENERIC_PAE) #0: Thu Jan 26 20:00:30 EST 2023  brad@samwise.nat.eldar.org:/lhome/NetBSD_10_branch_20230110/i386/OBJ/sys/arch/i386/compile/GENERIC_PAE i386
Architecture: x86_64
Machine: amd64
>Description:

A 32 bit i386 pure PVH DOMU will hang processes after a moderate
amount of disk activity.  A PV guest using XEN3PAE_DOMU will not hang
while performing the same actions.

>How-To-Repeat:

Set up a 32 bit i386 DOMU and use the GENERIC or GENERIC_PAE kernel to
create a pure PVH Xen guest.  The system should boot.  Use nearly any
large tar archive you like, such as the sets and try to unpack them.
The system will hang in short order with user processes stuck in
biowait or uvmfp2 (seen using a CTRL-T).  User processes appear to be
unkillable at that point, but the shells will respond to CTRL-T (at
least for a while).  You can get into DDB.  The exact same system
using XEN3PAE_DOMU is fine.  Commands that have been cached, such as
repeating 'vmstat -m' in another shell will continue to work for a
while after the tar hangs.  Any new commands will hang, however.

I used this config for the guest:

kernel = "/lhome/xen/kernels/NetBSD_10.x/NetBSD_10_branch_20230110/i386/netbsd-GENERIC_PAE"
memory = 512
cpu_weight = 32
name = "nbsd10test32bit"
type="pvh"
vcpus = 1
vif = [ 'mac=NO:NO:NO:NO:NO:NO, bridge=bridge4' ]
disk = [ 'phy:/dev/mapper/rustvg0-nbsd10test32bitlv0,0x01,w' ]
root = "xbd0"

The file system is a FFSv2 without WAPBL enabled.  After a 'xl destory
...' is performed on the guest and the guest rebooted, the filesystem
appears have suffered some damage that fsck fixes up.  The guest is
not configured to have any swap space.

It did not seem that adding memory to the guest helped out in any
great amount.

>Fix:

No idea.  Resource exhaustion or a lack of freeing something or
other??  Nothing is printed on the console when the hang happens.

>Unformatted:
 10.0_BETA from 2023-01-10
Home
PR Database Search
(Contact us) $NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.