NetBSD Problem Report #57199
From brad@anduin.eldar.org Fri Jan 27 17:38:12 2023
Return-Path: <brad@anduin.eldar.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 363051A9239
for <gnats-bugs@gnats.NetBSD.org>; Fri, 27 Jan 2023 17:38:12 +0000 (UTC)
Message-Id: <202301271738.30RHc5UR029176@anduin.eldar.org>
Date: Fri, 27 Jan 2023 12:38:05 -0500 (EST)
From: brad@anduin.eldar.org
Reply-To: brad@anduin.eldar.org
To: gnats-bugs@NetBSD.org
Subject: Pure PVH i386 guests hang on disk activity
X-Send-Pr-Version: 3.95
>Number: 57199
>Category: kern
>Synopsis: Pure PVH i386 guests hang on disk activity
>Confidential: no
>Severity: serious
>Priority: low
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Jan 27 17:40:00 +0000 2023
>Closed-Date: Wed Jul 24 00:39:02 +0000 2024
>Last-Modified: Wed Jul 24 00:39:02 +0000 2024
>Originator: Brad Spencer
>Release: NetBSD 10.0_BETA
>Organization:
eldar.org
>Environment:
System: NetBSD nbsd10test32bit 10.0_BETA NetBSD 10.0_BETA (GENERIC_PAE) #0: Thu Jan 26 20:00:30 EST 2023 brad@samwise.nat.eldar.org:/lhome/NetBSD_10_branch_20230110/i386/OBJ/sys/arch/i386/compile/GENERIC_PAE i386
Architecture: x86_64
Machine: amd64
>Description:
A 32-bit i386 pure PVH DOMU will hang processes after a moderate
amount of disk activity. A PV guest using XEN3PAE_DOMU will not hang
while performing the same actions.
>How-To-Repeat:
Set up a 32-bit i386 DOMU and use the GENERIC or GENERIC_PAE kernel to
create a pure PVH Xen guest. The system should boot. Use nearly any
large tar archive you like, such as the distribution sets, and try to
unpack it. The system will hang in short order with user processes
stuck in biowait or uvnfp2 (seen using CTRL-T). User processes appear
to be unkillable at that point, but the shells will respond to CTRL-T
(at least for a while). You can get into DDB. The exact same system
using XEN3PAE_DOMU is fine. Commands that are already cached, such as
repeating 'vmstat -m' in another shell, will continue to work for a
while after the tar hangs. Any new commands will hang, however.
I used this config for the guest:
kernel = "/lhome/xen/kernels/NetBSD_10.x/NetBSD_10_branch_20230110/i386/netbsd-GENERIC_PAE"
memory = 512
cpu_weight = 32
name = "nbsd10test32bit"
type="pvh"
vcpus = 1
vif = [ 'mac=NO:NO:NO:NO:NO:NO, bridge=bridge4' ]
disk = [ 'phy:/dev/mapper/rustvg0-nbsd10test32bitlv0,0x01,w' ]
root = "xbd0"
The file system is FFSv2 without WAPBL enabled. After an 'xl destroy
...' is performed on the guest and the guest is rebooted, the filesystem
appears to have suffered some damage that fsck fixes up. The guest is
not configured to have any swap space.
Adding memory to the guest did not seem to help to any great extent.
>Fix:
No idea. Resource exhaustion or a lack of freeing something or
other? Nothing is printed on the console when the hang happens.
>Release-Note:
>Audit-Trail:
From: Brad Spencer <brad@anduin.eldar.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
gdt@lexort.com
Subject: Re: kern/57199: Pure PVH i386 guests hang on disk activity
Date: Fri, 12 Jul 2024 13:17:50 -0400
This is still 100% reproducible in 10.0 built from source as of
2024-03-28.
Config file:
kernel = "/lhome/xen/kernels/NetBSD_10.x/NetBSD_10_branch_20240328/i386/netbsd-GENERIC"
memory = 512
cpu_weight = 32
name = "nbsd10test32bit"
type="pvh"
vcpus = 1
vif = [ 'mac=aa:00:00:50:07:f4, bridge=bridge4' ]
disk = [ 'phy:/dev/mapper/rustvg0-nbsd10test32bitlv0,0x01,w' ]
root = "xbd0"
nbsd10test32bit# uname -a
NetBSD nbsd10test32bit 10.0 NetBSD 10.0 (GENERIC) #0: Sat Mar 30 22:01:46 EDT 2024 brad@samwise.nat.eldar.org:/lhome/NetBSD_10_branch_20240328/i386/OBJ/sys/arch/i386/compile/GENERIC i386
Create a tar archive that is fairly big. The larger distribution sets
would likely work, but what I used was a tar archive of the userland of
the system itself (something like: cd /; tar -cf /var/tmp/distball.tar
./bin ./lib ./sbin ./usr ./stand). I placed this in /var/tmp and called
it distball.tar.
Create a directory in the path /var/tmp/Q and cd there.
Untar the archive, something like: tar -xvpf ../distball.tar
The untar will work for a short time and then hang up. A CTRL-T
produces this:
x ./usr/libexec/cc1obj[ 115.0723697] load: 0.15 cmd: tar 1854 [uvnfp2] 0.04u 0.14s 0% 5340k
[ 116.9523882] load: 0.13 cmd: tar 1854 [uvnfp2] 0.04u 0.14s 0% 5340k
[ 118.0724972] load: 0.13 cmd: tar 1854 [uvnfp2] 0.04u 0.14s 0% 5340k
[ 118.6725079] load: 0.13 cmd: tar 1854 [uvnfp2] 0.04u 0.14s 0% 5340k
If I run the same test with 1GB of memory it gets a little further
along, but still hangs up:
x ./usr/libexec/postfix/scache[ 66.3212543] load: 0.34 cmd: tar 1977 [biowait] 0.04u 0.20s 0% 5344k
[ 67.1712694] load: 0.34 cmd: tar 1977 [biowait] 0.04u 0.20s 0% 5344k
[ 67.3612609] load: 0.34 cmd: tar 1977 [biowait] 0.04u 0.20s 0% 5344k
[ 68.2412022] load: 0.34 cmd: tar 1977 [biowait] 0.04u 0.20s 0% 5344k
[ 68.4512385] load: 0.34 cmd: tar 1977 [biowait] 0.04u 0.20s 0% 5344k
[ 68.6312665] load: 0.34 cmd: tar 1977 [biowait] 0.04u 0.20s 0% 5344k
[ 68.8012961] load: 0.34 cmd: tar 1977 [biowait] 0.04u 0.20s 0% 5344k
[ 68.9713106] load: 0.34 cmd: tar 1977 [biowait] 0.04u 0.20s 0% 5344k
[ 69.1213030] load: 0.34 cmd: tar 1977 [biowait] 0.04u 0.20s 0% 5344k
When the system is hung up, trying to log into the console will stick
after entering the username. You can get into ddb from the console.
A PV+PVSHIM kernel of the same generation of source will work just fine
with the same test.
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, gdt@lexort.com
Subject: Re: kern/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 13:25:31 +0200
On Fri, Jul 12, 2024 at 01:17:50PM -0400, Brad Spencer wrote:
> This is still 100% reproducible in 10.0 built from source as of
> 2024-03-28.
> [...]
I can reproduce a hang of the network interface, which may have the same
cause. Looks like an event is missed by the backend.
What is your dom0?
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, gdt@lexort.com
Subject: Re: kern/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 18:36:41 +0200
--C4fsJCkvRmqOXAf5
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
On Mon, Jul 15, 2024 at 01:25:31PM +0200, Manuel Bouyer wrote:
> I can reproduce a hang of the network interface, which may have the same
> cause. Looks like an event is missed by the backend.
>
> What is your dom0 ?
Can you try the attached patch on your guest?
With this my network hang is gone, and I can run your tar test to
completion.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
--C4fsJCkvRmqOXAf5
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff
Index: include/hypervisor.h
===================================================================
RCS file: /cvsroot/src/sys/arch/xen/include/hypervisor.h,v
retrieving revision 1.55.4.3
diff -u -p -u -r1.55.4.3 hypervisor.h
--- include/hypervisor.h 18 Oct 2023 16:53:03 -0000 1.55.4.3
+++ include/hypervisor.h 15 Jul 2024 16:08:09 -0000
@@ -108,13 +108,15 @@ struct xen_npx_attach_args {
* The proper fix is to get upstream to stop assuming that all OSs use
* mb(), rmb(), wmb().
*/
+
#undef xen_mb
#undef xen_rmb
#undef xen_wmb
-#define xen_mb() membar_sync()
-#define xen_rmb() membar_acquire()
-#define xen_wmb() membar_release()
+#define xen_mb() membar_sync(); asm volatile("mfence":::"memory");
+#define xen_rmb() membar_acquire(); asm volatile("lfence":::"memory");
+#define xen_wmb() membar_release(); asm volatile("sfence" ::: "memory");
+
#endif /* __XEN_INTERFACE_VERSION */
#include <machine/xen/hypercalls.h>
Index: include/xenring.h
===================================================================
RCS file: /cvsroot/src/sys/arch/xen/include/xenring.h,v
retrieving revision 1.6.20.1
diff -u -p -u -r1.6.20.1 xenring.h
--- include/xenring.h 31 Jul 2023 15:23:02 -0000 1.6.20.1
+++ include/xenring.h 15 Jul 2024 16:08:09 -0000
@@ -24,9 +24,9 @@
#undef xen_rmb
#undef xen_wmb
-#define xen_mb() membar_sync()
-#define xen_rmb() membar_acquire()
-#define xen_wmb() membar_release()
+#define xen_mb() membar_sync(); asm volatile("mfence":::"memory");
+#define xen_rmb() membar_acquire(); asm volatile("lfence":::"memory");
+#define xen_wmb() membar_release(); asm volatile("sfence" ::: "memory");
/*
* Define ring types. These were previously part of the public API.
--C4fsJCkvRmqOXAf5--
From: Brad Spencer <brad@anduin.eldar.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 13:24:48 -0400
Manuel Bouyer <bouyer@antioche.eu.org> writes:
> The following reply was made to PR kern/57199; it has been noted by GNATS.
>
[snip]
>
> I can reproduce a hang of the network interface, which may have the same
> cause. Looks like an event is missed by the backend.
>
> What is your dom0 ?
9.3_STABLE with:
xenkernel415-4.15.3 Xen 4.15.x Kernel
xentools415-4.15.3nb1 Userland Tools for Xen 4.15.x
The DOM0 runs the normal DOM0 kernel, except that HZ=1000
> --
> Manuel Bouyer <bouyer@antioche.eu.org>
> NetBSD: 26 ans d'experience feront toujours la difference
> --
>
--
Brad Spencer - brad@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
From: Brad Spencer <brad@anduin.eldar.org>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, gdt@lexort.com
Subject: Re: kern/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 13:25:30 -0400
Manuel Bouyer <bouyer@antioche.eu.org> writes:
> [1:text/plain Hide]
>
> On Mon, Jul 15, 2024 at 01:25:31PM +0200, Manuel Bouyer wrote:
>> I can reproduce a hang of the network interface, which may have the same
>> cause. Looks like an event is missed by the backend.
>>
>> What is your dom0 ?
>
> Can you try the attached patch on your guest ?
> With this my network hang is gone, and I can run your tar test to
> completion
As long as it is only for the DOMU guest, then yes, this can be tried.
--
Brad Spencer - brad@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Brad Spencer <brad@anduin.eldar.org>,
Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org, gdt@lexort.com
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 17:33:17 +0000
I can think of two ways this patch could have an impact:
1. Some Xen driver relies on write-combining memory (i.e.,
`prefetchable' in PCIese and bus_dmaese), or on non-temporal
stores. This seems unlikely.
2. This is a single-(v)CPU system which has patched out the lock
prefix in membar_sync.
Unless (1) is happening, I doubt there's any reason to need mfence,
lfence, or sfence -- except in the circumstances of (1), mfence is
just a more expensive version of a locked-add for store-before-load
ordering, and lfence and sfence are never necessary. See, e.g., the
AMD memory access ordering rules table:
AMD64 Architecture Programmer's Manual, Volume 2: System Programming,
24593--Rev. 3.38--November 2021, Sec. 7.4.2 Memory Barrier Interaction
with Memory Types, Table 7-3, p. 196.
https://web.archive.org/web/20220625040004/https://www.amd.com/system/files/TechDocs/24593.pdf#page=256
Is this a single-(v)CPU system? Can you enter crash(8) or drop into
ddb and disassemble the membar_sync function? I bet you'll find no
lock prefix there, which would explain the hangs.
If my hypothesis about (2) is correct, the right thing is probably
either to make xen_mb be an assembly stub that does
lock
addq $0,-8(%rsp)
(without the membar_sync hotpatching), or to make xen_mb be inline asm
to do the same.
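For illustration, the inline-asm alternative mentioned above might look
roughly like the following sketch (hypothetical name xen_mb_sketch,
GCC-style extended asm; not a patch from this PR):

/*
 * Sketch: xen_mb as an always-LOCKed no-op add.  Unlike membar_sync,
 * nothing ever hotpatches the LOCK prefix away here, so the guest
 * keeps store-before-load ordering against the hypervisor even on a
 * single-vCPU boot.
 */
static inline void
xen_mb_sketch(void)
{
#ifdef __x86_64__
	asm volatile("lock; addq $0,-8(%%rsp)" ::: "memory", "cc");
#else
	asm volatile("lock; addl $0,-4(%%esp)" ::: "memory", "cc");
#endif
}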
From: Brad Spencer <brad@anduin.eldar.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 13:55:10 -0400
Taylor R Campbell <riastradh@NetBSD.org> writes:
> The following reply was made to PR kern/57199; it has been noted by GNATS.
>
> From: Taylor R Campbell <riastradh@NetBSD.org>
> To: Brad Spencer <brad@anduin.eldar.org>,
> Manuel Bouyer <bouyer@antioche.eu.org>
> Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org, gdt@lexort.com
> Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
> Date: Mon, 15 Jul 2024 17:33:17 +0000
>
> I can think of two ways this patch could have an impact:
>
> 1. Some Xen driver relies on write-combining memory (i.e.,
> `prefetchable' in PCIese and bus_dmaese), or on non-temporal
> stores. This seems unlikely.
>
> 2. This is a single-(v)CPU system which has patched out the lock
> prefix in membar_sync.
>
> Unless (1) is happening, I doubt there's any reason to need mfence,
> lfence, or sfence -- except in the circumstances of (1), mfence is
> just a more expensive version of a locked-add for store-before-load
> ordering, and lfence and sfence are never necessary. See, e.g., the
> AMD memory access ordering rules table:
>
> AMD64 Architecture Programmer's Manual, Volume 2: System Programming,
> 24593--Rev. 3.38--November 2021, Sec. 7.4.2 Memory Barrier Interaction
> with Memory Types, Table 7-3, p. 196.
> https://web.archive.org/web/20220625040004/https://www.amd.com/system/files/TechDocs/24593.pdf#page=256
>
>
> Is this a single-(v)CPU system? Can you enter crash(8) or drop into
> ddb and disassemble the membar_sync function? I bet you'll find no
> lock prefix there, which would explain the hangs.
Yes, it is a single-vCPU DOMU. I can get into ddb without any problems
but would need the commands to do the disassembly.
Context reminder: this is a 32-bit i386 DOMU running in PVH mode that
hangs up. For reasons that are not clear to me, an i386 guest running
PV+PVSHIM does not hang. A 64-bit PVH DOMU does not hang.
> If my hypothesis about (2) is correct, the right thing is probably
> either to make xen_mb be an assembly stub that does
>
> lock
> addq $0,-8(%rsp)
>
> (without the membar_sync hotpatching), or to make xen_mb be inline asm
> to do the same.
>
--
Brad Spencer - brad@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Brad Spencer <brad@anduin.eldar.org>,
Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org, gdt@lexort.com
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 18:25:05 +0000
This is a multi-part message in MIME format.
--=_fDlpu8OzeqfFpAJUhARZcts9raUpcXJJ
> Date: Mon, 15 Jul 2024 17:33:17 +0000
> From: Taylor R Campbell <riastradh@NetBSD.org>
>
> 2. This is a single-(v)CPU system which has patched out the lock
> prefix in membar_sync.
> [...]
> If my hypothesis about (2) is correct, the right thing is probably
> either to make xen_mb be an assembly stub that does
>
> lock
> addq $0,-8(%rsp)
>
> (without the membar_sync hotpatching), or to make xen_mb be inline asm
> to do the same.
The attached patch implements this approach, and leaves extensive
comments explaining what's going on, without issuing any unnecessary
mfence/lfence/sfence instructions. Can you try it out?
Under my hypothesis, the domU kernel certainly needs this change. And
the dom0 kernel may also need it, because I believe it uses
RING_PUSH_RESPONSES_AND_CHECK_NOTIFY and RING_FINAL_CHECK_FOR_REQUESTS
which rely on xen_mb too.
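Those ring macros rely on xen_mb for exactly this store-before-load
case.  As a rough paraphrase of the request-push variant in Xen's
public io/ring.h (abbreviated here for illustration, not NetBSD's
actual header):

/*
 * Paraphrase of RING_PUSH_REQUESTS_AND_CHECK_NOTIFY from Xen's
 * io/ring.h: publish the new producer index, then read req_event to
 * decide whether to notify the backend.  The xen_mb between the store
 * and the load is what prevents a missed event -- without real
 * store-before-load ordering the load of req_event can be satisfied
 * before the new req_prod is visible, and the notification is skipped.
 */
#define RING_PUSH_REQUESTS_AND_CHECK_NOTIFY(_r, _notify) do {		\
	RING_IDX __old = (_r)->sring->req_prod;				\
	RING_IDX __new = (_r)->req_prod_pvt;				\
	xen_wmb();	/* backend sees requests before new index */	\
	(_r)->sring->req_prod = __new;					\
	xen_mb();	/* backend sees index before we read req_event */ \
	(_notify) = ((RING_IDX)(__new - (_r)->sring->req_event) <	\
		     (RING_IDX)(__new - __old));			\
} while (0)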
--=_fDlpu8OzeqfFpAJUhARZcts9raUpcXJJ
Content-Type: text/plain; charset="ISO-8859-1"; name="pr57199-xenmbuniproc"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="pr57199-xenmbuniproc.patch"
# HG changeset patch
# User Taylor R Campbell <riastradh@NetBSD.org>
# Date 1721067262 0
# Mon Jul 15 18:14:22 2024 +0000
# Branch trunk
# Node ID 7ecbeb4758c5d1bd73ce8492ada9c88cb4a3f4b3
# Parent b61cb6f1e4ce13fd55dc3584c62bcd23b9a9d2dd
# EXP-Topic riastradh-pr57199-xenmbuniproc
xen: Don't hotpatch away LOCK prefix in xen_mb, even on UP boots.
Both xen_mb and membar_sync are designed to provide store-before-load
ordering, but xen_mb has to provide it in synchronizing guest with
hypervisor, while membar_sync only has to provide it in synchronizing
one (guest) CPU with another (guest) CPU.
It is safe to hotpatch away the LOCK prefix in membar_sync on a
uniprocessor boot because membar_sync is only designed to coordinate
between normal memory on multiple CPUs, and is never necessary when
there's only one CPU involved.
But xen_mb is used to coordinate between the guest and the `device'
implemented by a hypervisor, which might be running on another
_physical_ CPU even if the NetBSD guest only sees one `CPU', i.e.,
one _virtual_ CPU. So even on `uniprocessor' boots, xen_mb must
still issue an instruction with store-before-load ordering on
multiprocessor systems, such as a LOCK ADD (or MFENCE, but MFENCE is
costlier for no benefit here).
No need to change xen_wmb (release ordering, load/store-before-store)
or xen_rmb (acquire ordering, load-before-load/store) because every
x86 store is a store-release and every x86 load is a load-acquire,
even on multiprocessor systems, so there's no hotpatching involved
anyway.
PR kern/57199
diff -r b61cb6f1e4ce -r 7ecbeb4758c5 common/lib/libc/arch/i386/atomic/atomic.S
--- a/common/lib/libc/arch/i386/atomic/atomic.S Mon Jul 15 01:57:23 2024 +0000
+++ b/common/lib/libc/arch/i386/atomic/atomic.S Mon Jul 15 18:14:22 2024 +0000
@@ -211,6 +211,8 @@ ENTRY(_membar_sync)
* https://pvk.ca/Blog/2014/10/19/performance-optimisation-~-writing-an-essay/
* https://shipilev.net/blog/2014/on-the-fence-with-dependencies/
* https://www.agner.org/optimize/instruction_tables.pdf
+ *
+ * Sync with xen_mb in sys/arch/i386/i386/cpufunc.S.
*/
LOCK
addl $0, -4(%esp)
diff -r b61cb6f1e4ce -r 7ecbeb4758c5 common/lib/libc/arch/x86_64/atomic/atomic.S
--- a/common/lib/libc/arch/x86_64/atomic/atomic.S Mon Jul 15 01:57:23 2024 +0000
+++ b/common/lib/libc/arch/x86_64/atomic/atomic.S Mon Jul 15 18:14:22 2024 +0000
@@ -279,6 +279,8 @@ ENTRY(_membar_sync)
* https://pvk.ca/Blog/2014/10/19/performance-optimisation-~-writing-an-essay/
* https://shipilev.net/blog/2014/on-the-fence-with-dependencies/
* https://www.agner.org/optimize/instruction_tables.pdf
+ *
+ * Sync with xen_mb in sys/arch/amd64/amd64/cpufunc.S.
*/
LOCK
addq $0, -8(%rsp)
diff -r b61cb6f1e4ce -r 7ecbeb4758c5 sys/arch/amd64/amd64/cpufunc.S
--- a/sys/arch/amd64/amd64/cpufunc.S Mon Jul 15 01:57:23 2024 +0000
+++ b/sys/arch/amd64/amd64/cpufunc.S Mon Jul 15 18:14:22 2024 +0000
@@ -61,6 +61,28 @@ ENTRY(x86_mfence)
ret
END(x86_mfence)

+#ifdef XEN
+ENTRY(xen_mb)
+ /*
+ * Store-before-load ordering with respect to matching logic
+ * on the hypervisor side.
+ *
+ * This is the same as membar_sync, but without hotpatching
+ * away the LOCK prefix on uniprocessor boots -- because under
+ * Xen, we still have to coordinate with a `device' backed by a
+ * hypervisor that is potentially on another physical CPU even
+ * if we observe only one virtual CPU as the guest.
+ *
+ * See common/lib/libc/arch/x86_64/atomic/atomic.S for
+ * rationale and keep this in sync with the implementation
+ * of membar_sync there.
+ */
+ lock
+ addq $0,-8(%rsp)
+ ret
+END(xen_mb)
+#endif /* XEN */
+
#ifndef XENPV
ENTRY(invlpg)
#ifdef SVS
diff -r b61cb6f1e4ce -r 7ecbeb4758c5 sys/arch/i386/i386/cpufunc.S
--- a/sys/arch/i386/i386/cpufunc.S Mon Jul 15 01:57:23 2024 +0000
+++ b/sys/arch/i386/i386/cpufunc.S Mon Jul 15 18:14:22 2024 +0000
@@ -67,6 +67,28 @@ ENTRY(x86_mfence)
ret
END(x86_mfence)

+#ifdef XEN
+ENTRY(xen_mb)
+ /*
+ * Store-before-load ordering with respect to matching logic
+ * on the hypervisor side.
+ *
+ * This is the same as membar_sync, but without hotpatching
+ * away the LOCK prefix on uniprocessor boots -- because under
+ * Xen, we still have to coordinate with a `device' backed by a
+ * hypervisor that is potentially on another physical CPU even
+ * if we observe only one virtual CPU as the guest.
+ *
+ * See common/lib/libc/arch/i386/atomic/atomic.S for
+ * rationale and keep this in sync with the implementation
+ * of membar_sync there.
+ */
+ lock
+ addl $0,-4(%esp)
+ ret
+END(xen_mb)
+#endif /* XEN */
+
#ifndef XENPV
ENTRY(lidt)
movl 4(%esp), %eax
diff -r b61cb6f1e4ce -r 7ecbeb4758c5 sys/arch/xen/include/hypervisor.h
--- a/sys/arch/xen/include/hypervisor.h Mon Jul 15 01:57:23 2024 +0000
+++ b/sys/arch/xen/include/hypervisor.h Mon Jul 15 18:14:22 2024 +0000
@@ -112,7 +112,7 @@ struct xen_npx_attach_args {
#undef xen_rmb
#undef xen_wmb

-#define xen_mb() membar_sync()
+void xen_mb(void);
#define xen_rmb() membar_acquire()
#define xen_wmb() membar_release()
#endif /* __XEN_INTERFACE_VERSION */
diff -r b61cb6f1e4ce -r 7ecbeb4758c5 sys/arch/xen/include/xenring.h
--- a/sys/arch/xen/include/xenring.h Mon Jul 15 01:57:23 2024 +0000
+++ b/sys/arch/xen/include/xenring.h Mon Jul 15 18:14:22 2024 +0000
@@ -24,7 +24,7 @@
#undef xen_rmb
#undef xen_wmb

-#define xen_mb() membar_sync()
+void xen_mb(void);
#define xen_rmb() membar_acquire()
#define xen_wmb() membar_release()

--=_fDlpu8OzeqfFpAJUhARZcts9raUpcXJJ--
From: Brad Spencer <brad@anduin.eldar.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: bouyer@antioche.eu.org, gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org,
gdt@lexort.com
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 14:31:41 -0400
Taylor R Campbell <riastradh@NetBSD.org> writes:
> [1:text/plain Hide]
>
>> Date: Mon, 15 Jul 2024 17:33:17 +0000
>> From: Taylor R Campbell <riastradh@NetBSD.org>
>>
>> 2. This is a single-(v)CPU system which has patched out the lock
>> prefix in membar_sync.
>> [...]
>> If my hypothesis about (2) is correct, the right thing is probably
>> either to make xen_mb be an assembly stub that does
>>
>> lock
>> addq $0,-8(%rsp)
>>
>> (without the membar_sync hotpatching), or to make xen_mb be inline asm
>> to do the same.
>
> The attached patch implements this approach, and leaves extensive
> comments explaining what's going on, without issuing any unnecessary
> mfence/lfence/sfence instructions. Can you try it out?
>
> Under my hypothesis, the domU kernel certainly needs this change. And
> the dom0 kernel may also need it, because I believe it uses
> RING_PUSH_RESPONSES_AND_CHECK_NOTIFY and RING_FINAL_CHECK_FOR_REQUESTS
> which rely on xen_mb too.
a) Manuel's patch allowed the 32-bit PVH DOMU to do the untar without
hanging.
b) I will test Taylor's patch on the DOMU. Doing that on the DOM0 is
probably not something I can manage right now.
--
Brad Spencer - brad@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: bouyer@antioche.eu.org, gnats-bugs@NetBSD.org,
netbsd-bugs@NetBSD.org, gdt@lexort.com
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 18:52:14 +0000
> Date: Mon, 15 Jul 2024 14:31:41 -0400
> From: Brad Spencer <brad@anduin.eldar.org>
>
> b) I will test Taylor's patch on the DOMU. Doing that on the DOM0 is
> probably not something I can manage right now.
I should clarify: I suspect this may be needed on the dom0, but only
if the dom0 _also_ runs uniprocessor, i.e., single-vCPU.
If you can run crash(8) or enter ddb on the dom0, you can check like
so:
# crash
Crash version 10.0_STABLE, image version 10.0.
WARNING: versions differ, you may not be able to examine this image.
Kernel compiled without options LOCKDEBUG.
Output from a running system is unreliable.
crash> x/i membar_sync,3
_membar_enter: lock addq $0,fffffffffffffff8 (%rsp)
_membar_enter+0x7: ret
_membar_enter+0x8: nopl
If it says `lock addq' or `lock addl', you're good. If it's just
`addq' or `addl' with no `lock', the patch is needed.
From: Brad Spencer <brad@anduin.eldar.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 14:58:34 -0400
Taylor R Campbell <riastradh@NetBSD.org> writes:
> The following reply was made to PR kern/57199; it has been noted by GNATS.
>
> [...]
>
> The attached patch implements this approach, and leaves extensive
> comments explaining what's going on, without issuing any unnecessary
> mfence/lfence/sfence instructions. Can you try it out?
>
> [patch snipped]
>
This patch allowed the 32-bit PVH DOMU to do the untar without hanging
up.
From: Brad Spencer <brad@anduin.eldar.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 15:00:03 -0400
Taylor R Campbell <riastradh@NetBSD.org> writes:
> The following reply was made to PR kern/57199; it has been noted by GNATS.
>
> From: Taylor R Campbell <riastradh@NetBSD.org>
> To: Brad Spencer <brad@anduin.eldar.org>
> Cc: bouyer@antioche.eu.org, gnats-bugs@NetBSD.org,
> netbsd-bugs@NetBSD.org, gdt@lexort.com
> Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
> Date: Mon, 15 Jul 2024 18:52:14 +0000
>
> > Date: Mon, 15 Jul 2024 14:31:41 -0400
> > From: Brad Spencer <brad@anduin.eldar.org>
> >
> > b) I will test Taylor's patch on the DOMU. Doing that on the DOM0 is
> > probably not something I can manage right now.
>
> I should clarify: I suspect this may be needed on the dom0, but only
> if the dom0 _also_ runs uniprocessor, i.e., single-vCPU.
>
> If you can run crash(8) or enter ddb on the dom0, you can check like
> so:
>
> # crash
> Crash version 10.0_STABLE, image version 10.0.
> WARNING: versions differ, you may not be able to examine this image.
> Kernel compiled without options LOCKDEBUG.
> Output from a running system is unreliable.
> crash> x/i membar_sync,3
> _membar_enter: lock addq $0,fffffffffffffff8 (%rsp)
> _membar_enter+0x7: ret
> _membar_enter+0x8: nopl
>
> If it says `lock addq' or `lock addl', you're good. If it's just
> `addq' or `addl' with no `lock', the patch is needed.
>
The DOM0 is NetBSD_9.x, but it appears to be ok according to what you
wrote above:
DOM0# crash
Crash version 9.3_STABLE, image version 9.3_STABLE.
Output from a running system is unreliable.
crash> x/i membar_sync,3
_membar_sync: lock addq $0,fffffffffffffff8 (%rsp)
_membar_sync+0x7: ret
_membar_sync+0x8: nopl
crash>
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Brad Spencer <brad@anduin.eldar.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 21:50:03 +0200
On Mon, Jul 15, 2024 at 03:00:03PM -0400, Brad Spencer wrote:
> [...]
>
> The DOM0 is NetBSD_9.x, but it appears to be ok according to what you
> wrote above:
>
> DOM0# crash
> Crash version 9.3_STABLE, image version 9.3_STABLE.
> Output from a running system is unreliable.
> crash> x/i membar_sync,3
> _membar_sync: lock addq $0,fffffffffffffff8 (%rsp)
> _membar_sync+0x7: ret
> _membar_sync+0x8: nopl
> crash>
Yes, I don't think hotpatch is run when running PV
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Brad Spencer <brad@anduin.eldar.org>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 15:54:54 -0400
Manuel Bouyer <bouyer@antioche.eu.org> writes:
> On Mon, Jul 15, 2024 at 03:00:03PM -0400, Brad Spencer wrote:
>> [...]
>
> Yes, I don't think hotpatch is run when running PV
That may explain something that seemed strange to me: the 32-bit
test system did not hang when it was run as PV+PVSHIM.
--
Brad Spencer - brad@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: Brad Spencer <brad@anduin.eldar.org>, gnats-bugs@NetBSD.org,
netbsd-bugs@NetBSD.org, gdt@lexort.com
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Mon, 15 Jul 2024 21:56:31 +0200
On Mon, Jul 15, 2024 at 05:33:17PM +0000, Taylor R Campbell wrote:
> I can think of two ways this patch could have an impact:
>
> 1. Some Xen driver relies on write-combining memory (i.e.,
> `prefetchable' in PCIese and bus_dmaese), or on non-temporal
> stores. This seems unlikely.
>
> 2. This is a single-(v)CPU system which has patched out the lock
> prefix in membar_sync.
>
> Unless (1) is happening, I doubt there's any reason to need mfence,
> lfence, or sfence -- except in the circumstances of (1), mfence is
> just a more expensive version of a locked-add for store-before-load
> ordering, and lfence and sfence are never necessary. See, e.g., the
> AMD memory access ordering rules table:
>
> AMD64 Architecture Programmer's Manual, Volume 2: System Programming,
> 24593--Rev. 3.38--November 2021, Sec. 7.4.2 Memory Barrier Interaction
> with Memory Types, Table 7-3, p. 196.
> https://web.archive.org/web/20220625040004/https://www.amd.com/system/files/TechDocs/24593.pdf#page=256
>
>
> Is this a single-(v)CPU system? Can you enter crash(8) or drop into
> ddb and disassemble the membar_sync function? I bet you'll find no
> lock prefix there, which would explain the hangs.
>
> If my hypothesis about (2) is correct, the right thing is probably
> either to make xen_mb be an assembly stub that does
It is indeed a single-vCPU system, and in PV kernels we're probably not
running hotpatch.
>
> lock
> addq $0,-8(%rsp)
>
> (without the membar_sync hotpatching), or to make xen_mb be inline asm
> to do the same.
I misread the Linux code in this area; mb() is not the same as smp_mb().
Linux in fact does not use *fence instructions for virt_*mb(), but
a lock addl for virt_mb() and just barrier() for virt_[rw]mb().
So just adding a lock addl to xen_mb should be enough.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
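For comparison, the Linux-style arrangement described above amounts to
something like the following sketch (32-bit x86 shown; illustrative
macro names, not the actual Linux virt_*mb() or NetBSD definitions):

/*
 * Sketch only: a real LOCKed instruction for the full guest/host
 * barrier, and compiler-only barriers for the acquire/release
 * directions, since plain x86 loads and stores already carry
 * acquire/release semantics.
 */
#define demo_virt_mb()	asm volatile("lock; addl $0,-4(%%esp)" ::: "memory", "cc")
#define demo_virt_rmb()	asm volatile("" ::: "memory")	/* compiler barrier */
#define demo_virt_wmb()	asm volatile("" ::: "memory")	/* compiler barrier */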
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
brad@anduin.eldar.org
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Tue, 16 Jul 2024 10:58:38 +0200
On Mon, Jul 15, 2024 at 06:30:03PM +0000, Taylor R Campbell wrote:
> The attached patch implements this approach, and leaves extensive
> comments explaining what's going on, without issuing any unnecessary
> mfence/lfence/sfence instructions. Can you try it out?
This patch works for me: I couldn't reproduce the network hang, and
the tar test also completed.
Please commit and request a pullup to netbsd-10.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Brad Spencer <brad@anduin.eldar.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
Date: Tue, 16 Jul 2024 08:23:11 -0400
Manuel Bouyer <bouyer@antioche.eu.org> writes:
> The following reply was made to PR kern/57199; it has been noted by GNATS.
>
> From: Manuel Bouyer <bouyer@antioche.eu.org>
> To: gnats-bugs@netbsd.org
> Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
> brad@anduin.eldar.org
> Subject: Re: port-xen/57199: Pure PVH i386 guests hang on disk activity
> Date: Tue, 16 Jul 2024 10:58:38 +0200
>
> On Mon, Jul 15, 2024 at 06:30:03PM +0000, Taylor R Campbell wrote:
> > The attached patch implements this approach, and leaves extensive
> > comments explaining what's going on, without issuing any unnecessary
> > mfence/lfence/sfence instructions. Can you try it out?
>
> This patch works for me, I couldn't reproduce the network hang and
> the tar test also completed.
>
> please commit and request a pullup to netbsd-10
>
> --
> Manuel Bouyer <bouyer@antioche.eu.org>
> NetBSD: 26 ans d'experience feront toujours la difference
> --
>
Seconded, and thanks very much for this fix. I owe some folks drinks
for this one. Much appreciated.
--
Brad Spencer - brad@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57199 CVS commit: src
Date: Tue, 16 Jul 2024 22:44:38 +0000
Module Name: src
Committed By: riastradh
Date: Tue Jul 16 22:44:38 UTC 2024
Modified Files:
src/common/lib/libc/arch/i386/atomic: atomic.S
src/common/lib/libc/arch/x86_64/atomic: atomic.S
src/sys/arch/amd64/amd64: cpufunc.S
src/sys/arch/i386/i386: cpufunc.S
src/sys/arch/xen/include: hypervisor.h xenring.h
Log Message:
xen: Don't hotpatch away LOCK prefix in xen_mb, even on UP boots.
Both xen_mb and membar_sync are designed to provide store-before-load
ordering, but xen_mb has to provide it in synchronizing guest with
hypervisor, while membar_sync only has to provide it in synchronizing
one (guest) CPU with another (guest) CPU.
It is safe to hotpatch away the LOCK prefix in membar_sync on a
uniprocessor boot because membar_sync is only designed to coordinate
between normal memory on multiple CPUs, and is never necessary when
there's only one CPU involved.
But xen_mb is used to coordinate between the guest and the `device'
implemented by a hypervisor, which might be running on another
_physical_ CPU even if the NetBSD guest only sees one `CPU', i.e.,
one _virtual_ CPU. So even on `uniprocessor' boots, xen_mb must
still issue an instruction with store-before-load ordering on
multiprocessor systems, such as a LOCK ADD (or MFENCE, but MFENCE is
costlier for no benefit here).
No need to change xen_wmb (release ordering, load/store-before-store)
or xen_rmb (acquire ordering, load-before-load/store) because every
x86 store is a store-release and every x86 load is a load-acquire,
even on multiprocessor systems, so there's no hotpatching involved
anyway.
PR kern/57199
To generate a diff of this commit:
cvs rdiff -u -r1.36 -r1.37 src/common/lib/libc/arch/i386/atomic/atomic.S
cvs rdiff -u -r1.29 -r1.30 src/common/lib/libc/arch/x86_64/atomic/atomic.S
cvs rdiff -u -r1.67 -r1.68 src/sys/arch/amd64/amd64/cpufunc.S
cvs rdiff -u -r1.51 -r1.52 src/sys/arch/i386/i386/cpufunc.S
cvs rdiff -u -r1.59 -r1.60 src/sys/arch/xen/include/hypervisor.h
cvs rdiff -u -r1.7 -r1.8 src/sys/arch/xen/include/xenring.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->needs-pullups
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Tue, 16 Jul 2024 22:49:31 +0000
State-Changed-Why:
fixed in HEAD, needs pullup-10 and pullup-9
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57199 CVS commit: src/common/lib/libc/arch/x86_64/atomic
Date: Tue, 16 Jul 2024 22:45:10 +0000
Module Name: src
Committed By: riastradh
Date: Tue Jul 16 22:45:10 UTC 2024
Modified Files:
src/common/lib/libc/arch/x86_64/atomic: atomic.S
Log Message:
amd64: Fix performance regression in uniprocessor atomics/membars.
Back in 2022, I eliminated the MFENCE hotpatch in membar_sync because
it's essentially always more expensive than LOCK ADD with no benefit
for CPU/CPU store-before-load ordering. (It is relevant only for
non-temporal stores or write-combining memory.)
https://mail-index.netbsd.org/source-changes/2022/07/30/msg140047.html
But in that change, I made a mistake and _also_ eliminated the LOCK
hotpatch on uniprocessor amd64. And our assembler gas helpfully
interprets uppercase LOCK just like lowercase lock and assembles them
the same way, so I didn't notice.
This change restores the LOCK hotpatch, so that when booting on a
uniprocessor system (or a uniprocessor guest on a multicore host),
the LOCK prefix is replaced by NOP for a cheaper instruction.
Found by puzzling over how my explanation for PR kern/57199 could
possibly be correct when (on an amd64 guest) ddb x/i membar_sync kept
showing the lock prefix even in uniprocessor boots.
To generate a diff of this commit:
cvs rdiff -u -r1.30 -r1.31 src/common/lib/libc/arch/x86_64/atomic/atomic.S
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org,
riastradh@NetBSD.org, brad@anduin.eldar.org
Subject: Re: kern/57199 (Pure PVH i386 guests hang on disk activity)
Date: Wed, 17 Jul 2024 01:15:12 +0200
On Tue, Jul 16, 2024 at 10:49:32PM +0000, riastradh@NetBSD.org wrote:
> Synopsis: Pure PVH i386 guests hang on disk activity
>
> State-Changed-From-To: open->needs-pullups
> State-Changed-By: riastradh@NetBSD.org
> State-Changed-When: Tue, 16 Jul 2024 22:49:31 +0000
> State-Changed-Why:
> fixed in HEAD, needs pullup-10 and pullup-9
AFAIK it's not needed in netbsd-9, PVH is not supported before -10
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
From: Taylor R Campbell <riastradh@NetBSD.org>
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, brad@anduin.eldar.org
Subject: Re: kern/57199 (Pure PVH i386 guests hang on disk activity)
Date: Tue, 16 Jul 2024 23:33:39 +0000
> Date: Wed, 17 Jul 2024 01:15:12 +0200
> From: Manuel Bouyer <bouyer@antioche.eu.org>
>
> On Tue, Jul 16, 2024 at 10:49:32PM +0000, riastradh@NetBSD.org wrote:
> > fixed in HEAD, needs pullup-10 and pullup-9
>
> AFAIK it's not needed in netbsd-9, PVH is not supported before -10
Don't XEN3_DOM0 and XEN3_DOMU kernels have, e.g., xennet(4)? If they
rely on xen_mb to enforce store-before-load ordering with respect to
the hypervisor (possibly on another physical CPU), they may need this
fix.
That said, by accident, it looks like this will only be relevant for
very old 32-bit x86 CPUs -- those without SSE2. On CPUs with SSE2,
netbsd-9 hotpatches membar_sync to use MFENCE. And all amd64 CPUs
have SSE2.
From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Taylor R Campbell <riastradh@NetBSD.org>
Cc: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org, netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org, brad@anduin.eldar.org
Subject: Re: kern/57199 (Pure PVH i386 guests hang on disk activity)
Date: Wed, 17 Jul 2024 08:49:40 +0200
On Tue, Jul 16, 2024 at 11:33:39PM +0000, Taylor R Campbell wrote:
> > Date: Wed, 17 Jul 2024 01:15:12 +0200
> > From: Manuel Bouyer <bouyer@antioche.eu.org>
> >
> > On Tue, Jul 16, 2024 at 10:49:32PM +0000, riastradh@NetBSD.org wrote:
> > > fixed in HEAD, needs pullup-10 and pullup-9
> >
> > AFAIK it's not needed in netbsd-9, PVH is not supported before -10
>
> Don't XEN3_DOM0 and XEN3_DOMU kernels have, e.g., xennet(4)? If they
> rely on xen_mb to enforce store-before-load ordering with respect to
> the hypervisor (possibly on another physical CPU), they may need this
> fix.
>
> That said, by accident, it looks like this will only be relevant for
> very old 32-bit x86 CPUs -- those without SSE2. On CPUs with SSE2,
> netbsd-9 hotpatches membar_sync to use MFENCE. And all amd64 CPUs
> have SSE2.
AFAIK hotpatch is disabled for XEN PV kernels.
--
Manuel Bouyer <bouyer@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--
State-Changed-From-To: needs-pullups->pending-pullups
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Wed, 17 Jul 2024 12:13:53 +0000
State-Changed-Why:
pullup-10 #764 https://releng.netbsd.org/cgi-bin/req-10.cgi?show=764
not needed <10 because pv kernels don't hotpatch at all
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57199 CVS commit: [netbsd-10] src
Date: Sat, 20 Jul 2024 16:11:27 +0000
Module Name: src
Committed By: martin
Date: Sat Jul 20 16:11:27 UTC 2024
Modified Files:
src/common/lib/libc/arch/i386/atomic [netbsd-10]: atomic.S
src/common/lib/libc/arch/x86_64/atomic [netbsd-10]: atomic.S
src/sys/arch/amd64/amd64 [netbsd-10]: cpufunc.S
src/sys/arch/i386/i386 [netbsd-10]: cpufunc.S
src/sys/arch/xen/include [netbsd-10]: hypervisor.h xenring.h
Log Message:
Pull up following revision(s) (requested by riastradh in ticket #764):
common/lib/libc/arch/i386/atomic/atomic.S: revision 1.37
sys/arch/xen/include/xenring.h: revision 1.8
sys/arch/i386/i386/cpufunc.S: revision 1.52
sys/arch/amd64/amd64/cpufunc.S: revision 1.68
sys/arch/xen/include/hypervisor.h: revision 1.60
common/lib/libc/arch/x86_64/atomic/atomic.S: revision 1.30
xen: Don't hotpatch away LOCK prefix in xen_mb, even on UP boots.
Both xen_mb and membar_sync are designed to provide store-before-load
ordering, but xen_mb has to provide it in synchronizing guest with
hypervisor, while membar_sync only has to provide it in synchronizing
one (guest) CPU with another (guest) CPU.
It is safe to hotpatch away the LOCK prefix in membar_sync on a
uniprocessor boot because membar_sync is only designed to coordinate
between normal memory on multiple CPUs, and is never necessary when
there's only one CPU involved.
But xen_mb is used to coordinate between the guest and the `device'
implemented by a hypervisor, which might be running on another
_physical_ CPU even if the NetBSD guest only sees one `CPU', i.e.,
one _virtual_ CPU. So even on `uniprocessor' boots, xen_mb must
still issue an instruction with store-before-load ordering on
multiprocessor systems, such as a LOCK ADD (or MFENCE, but MFENCE is
costlier for no benefit here).
No need to change xen_wmb (release ordering, load/store-before-store)
or xen_rmb (acquire ordering, load-before-load/store) because every
x86 store is a store-release and every x86 load is a load-acquire,
even on multiprocessor systems, so there's no hotpatching involved
anyway.
PR kern/57199
To generate a diff of this commit:
cvs rdiff -u -r1.36 -r1.36.2.1 src/common/lib/libc/arch/i386/atomic/atomic.S
cvs rdiff -u -r1.29 -r1.29.2.1 \
src/common/lib/libc/arch/x86_64/atomic/atomic.S
cvs rdiff -u -r1.65 -r1.65.18.1 src/sys/arch/amd64/amd64/cpufunc.S
cvs rdiff -u -r1.49 -r1.49.20.1 src/sys/arch/i386/i386/cpufunc.S
cvs rdiff -u -r1.55.4.3 -r1.55.4.4 src/sys/arch/xen/include/hypervisor.h
cvs rdiff -u -r1.6.20.1 -r1.6.20.2 src/sys/arch/xen/include/xenring.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: pending-pullups->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Wed, 24 Jul 2024 00:39:02 +0000
State-Changed-Why:
fixed in HEAD, pulled up to 10, inapplicable <10
>Unformatted:
10.0_BETA from 2023-01-10