NetBSD Problem Report #58643
From www@netbsd.org Sun Aug 25 19:05:47 2024
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
client-signature RSA-PSS (2048 bits) client-digest SHA256)
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id D5A741A923F
for <gnats-bugs@gnats.NetBSD.org>; Sun, 25 Aug 2024 19:05:47 +0000 (UTC)
Message-Id: <20240825190546.21B321A9241@mollari.NetBSD.org>
Date: Sun, 25 Aug 2024 19:05:46 +0000 (UTC)
From: campbell+netbsd@mumble.net
Reply-To: campbell+netbsd@mumble.net
To: gnats-bugs@NetBSD.org
Subject: bus_dma(9) fails to bounce misaligned inputs requiring extra segments
X-Send-Pr-Version: www-1.0
>Number: 58643
>Category: kern
>Synopsis: bus_dma(9) fails to bounce misaligned inputs requiring extra segments
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: riastradh
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Aug 25 19:10:00 +0000 2024
>Last-Modified: Tue Aug 27 13:40:01 +0000 2024
>Originator: Taylor R Campbell
>Release: current, 10, 9, ...
>Organization:
The NetBounceDMA Foundation
>Environment:
>Description:
In the x86 bus_dma(9) implementation (and, I suspect, many others, but I haven't checked), bus_dmamap_load and its variants may fail with EFBIG when the transfer is not, in fact, too large to fit in the DMA map.
Specifically, create a DMA map with:
- size=PAGE_SIZE
- nseg=1
- maxsegsz=PAGE_SIZE
- boundary=PAGE_SIZE
And then load it with a page-sized transfer that's not page-aligned; say it starts at some address n*PAGE_SIZE + k where 0 < k < PAGE_SIZE.
What happens is that bus_dmamap_load tries to split the transfer at the page boundary into two segments, one of size PAGE_SIZE - k and one of size k. And it runs headfirst into the limit on the number of segments, which is 1.
At this point it could bounce, but bus_dmamap_create didn't consider that possibility, so it didn't allocate a bounce buffer, and the load won't bounce -- it will just fail with EFBIG, because it thinks there are too many segments.
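The segment arithmetic above can be sketched in userland C. This is only an illustration, not the bus_dma(9) implementation: the names sketch_count_segments and sketch_fits, and the 4096-byte SKETCH_PAGE_SIZE, are assumptions made up for this example, and the model pessimistically assumes that no two pages are physically contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/*
 * Hypothetical helper, not part of bus_dma(9): count the segments a
 * load would need for [addr, addr + len) when no segment may cross a
 * page boundary or exceed maxsegsz bytes.  Assumes no two pages are
 * physically contiguous.
 */
unsigned
sketch_count_segments(uintptr_t addr, size_t len, size_t maxsegsz)
{
	unsigned nsegs = 0;

	assert(maxsegsz > 0);
	while (len > 0) {
		/* Bytes left before the next page boundary. */
		size_t chunk = SKETCH_PAGE_SIZE - (addr % SKETCH_PAGE_SIZE);

		if (chunk > len)
			chunk = len;
		if (chunk > maxsegsz)
			chunk = maxsegsz;
		addr += chunk;
		len -= chunk;
		nsegs++;
	}
	return nsegs;
}

/* Hypothetical: would the load fit in a map with room for nsegments? */
int
sketch_fits(uintptr_t addr, size_t len, size_t maxsegsz, unsigned nsegments)
{
	return sketch_count_segments(addr, len, maxsegsz) <= nsegments;
}
```

Under these assumptions, a page-sized buffer at a page-aligned address needs one segment, while the same buffer shifted by k bytes needs two, which is what exceeds nseg=1 in the map described above.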
>How-To-Repeat:
as above
>Fix:
Yes, please!
This might not be a common problem in general, but I suspect it is common in NIC drivers, which often work around it by manually defragmenting the mbuf instead of using a bounce buffer. It's not clear what the right tradeoff is here -- maybe changing bus_dmamap_create so that it preallocates a bounce buffer whenever this situation is possible would waste a lot of wired kernel memory; maybe having the caller allocate a bounce buffer on the fly is better, if most of the time that buffer isn't needed. Or maybe that's what BUS_DMA_ALLOCNOW is for. In any case, it's very much not obvious from the bus_dma(9) documentation that this might happen -- it _sounds_ like bus_dma(9) is supposed to take care of these details internally.
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: kern-bug-people->thorpej
Responsible-Changed-By: riastradh@NetBSD.org
Responsible-Changed-When: Sun, 25 Aug 2024 19:11:13 +0000
Responsible-Changed-Why:
Can you assess this analysis, thorpej?
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org, netbsd-bugs@NetBSD.org
Cc:
Subject: Re: kern/58643: bus_dma(9) fails to bounce misaligned inputs requiring extra segments
Date: Sun, 25 Aug 2024 20:49:39 +0000
This is a multi-part message in MIME format.
--=_fvz4bw6Sgb7EvIjKFB6Nr1oMeWUc8Daj
Attached patch attempts to address the issue for x86.
The idea is that if userland attempts a uio transfer with, say,
{.iov_base = 0xabc00800, .iov_len = 0x1000},
{.iov_base = 0xdef00800, .iov_len = 0x1000},
{.iov_base = 0x12300800, .iov_len = 0x1000},
and maxsegsz=PAGE_SIZE, then (unless we get amazingly lucky and the
user's pages are physically contiguous), this requires six DMA
segments, for the ranges
[0xabc00800,0xabc01000)
[0xabc01000,0xabc01800)
[0xdef00800,0xdef01000)
[0xdef01000,0xdef01800)
[0x12300800,0x12301000)
[0x12301000,0x12301800)
If I got my arithmetic right (which I almost certainly didn't --
probably made some kind of fencepost errors), the new criterion always
allocates bounce buffers if this situation can happen, and doesn't if
it can't.
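The six ranges listed above can be reproduced with a small userland sketch. Everything here is hypothetical scaffolding for illustration (struct sketch_iov, struct sketch_range, sketch_split_iovs, and the 4096-byte page size are not kernel names), and it again assumes that no two pages are physically contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size, for illustration only */

/* Hypothetical stand-ins for struct iovec and a DMA segment. */
struct sketch_iov { uintptr_t base; size_t len; };
struct sketch_range { uintptr_t start; uintptr_t end; };	/* [start, end) */

/*
 * Split each iovec at page boundaries and at maxsegsz.  Stores up to
 * maxout ranges in out[] and returns how many ranges are needed in
 * total, so a return value larger than maxout means the transfer
 * needs more segments than the caller has room for.
 */
size_t
sketch_split_iovs(const struct sketch_iov *iov, size_t niov, size_t maxsegsz,
    struct sketch_range *out, size_t maxout)
{
	size_t n = 0;

	assert(maxsegsz > 0);
	for (size_t i = 0; i < niov; i++) {
		uintptr_t addr = iov[i].base;
		size_t len = iov[i].len;

		while (len > 0) {
			/* Bytes left before the next page boundary. */
			size_t chunk =
			    SKETCH_PAGE_SIZE - (addr % SKETCH_PAGE_SIZE);

			if (chunk > len)
				chunk = len;
			if (chunk > maxsegsz)
				chunk = maxsegsz;
			if (n < maxout) {
				out[n].start = addr;
				out[n].end = addr + chunk;
			}
			n++;
			addr += chunk;
			len -= chunk;
		}
	}
	return n;
}
```

Fed the three iovecs from the example with maxsegsz equal to the assumed page size, this sketch reports six ranges, two per iovec, matching the list above.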
However, this might be very costly for drivers where this isn't really
an issue. Maybe there should be a flag to opt into this, like
BUS_DMA_BOUNCEMISALIGNED.
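To get a feel for how often the criterion in the attached patch would fire, it can be evaluated outside the kernel. The sketch below copies only the comparison from the patch; the SKETCH_ macros are local stand-ins for PAGE_SIZE, MIN, and howmany, the page size of 4096 is an assumption, and sketch_might_need_bounce is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE	4096u	/* assumed page size */
#define SKETCH_MIN(a, b)	((a) < (b) ? (a) : (b))
#define SKETCH_HOWMANY(x, y)	(((x) + ((y) - 1)) / (y))

/*
 * Hypothetical userland copy of the patch's test:
 *	nsegments/2 < howmany(size, MIN(PAGE_SIZE, maxsegsz))
 * Returns nonzero if the map would be flagged as possibly needing
 * to bounce.
 */
int
sketch_might_need_bounce(size_t size, int nsegments, size_t maxsegsz)
{
	assert(maxsegsz > 0);
	assert(nsegments > 0);
	return (size_t)(nsegments / 2) <
	    SKETCH_HOWMANY(size,
		SKETCH_MIN((size_t)SKETCH_PAGE_SIZE, maxsegsz));
}
```

With these assumptions, the map from the original report (size=PAGE_SIZE, nseg=1, maxsegsz=PAGE_SIZE) is flagged, and a three-page map is flagged with five segments but not with six. Any map with fewer than two segments per page of maximum transfer is flagged, which gives a rough sense of the memory cost concern.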
--=_fvz4bw6Sgb7EvIjKFB6Nr1oMeWUc8Daj
Content-Type: text/plain; charset="ISO-8859-1"; name="pr58643-x86busdmamisalignedbounce"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="pr58643-x86busdmamisalignedbounce.patch"
# HG changeset patch
# User Taylor R Campbell <riastradh@NetBSD.org>
# Date 1724615712 0
# Sun Aug 25 19:55:12 2024 +0000
# Branch trunk
# Node ID 4c129994ef155f8604d2b6acbd4493d8b73a084f
# Parent cf7a8f9687ea781207542c43a006460dc134ea3b
# EXP-Topic riastradh-pr58643-busdmamisalignedbounce
x86/bus_dma(9): Prepare to bounce for misaligned inputs.
PR kern/58643: bus_dma(9) fails to bounce misaligned inputs requiring
extra segments
diff -r cf7a8f9687ea -r 4c129994ef15 sys/arch/x86/x86/bus_dma.c
--- a/sys/arch/x86/x86/bus_dma.c Sat Aug 24 07:24:34 2024 +0000
+++ b/sys/arch/x86/x86/bus_dma.c Sun Aug 25 19:55:12 2024 +0000
@@ -322,6 +322,18 @@ static int
 	if (map->_dm_bounce_thresh != 0)
 		cookieflags |= X86_DMA_MIGHT_NEED_BOUNCE;
 
+	/*
+	 * If we try to load a misaligned buffer, we may need to split
+	 * all of the pages of the buffer into two segments apiece --
+	 * one for the low part of the page, the next for the high part
+	 * of the page.
+	 *
+	 * If there's not enough segments to fit the maximum transfer
+	 * size split up this way, we may need to bounce.
+	 */
+	if (nsegments/2 < howmany(size, MIN(PAGE_SIZE, maxsegsz)))
+		cookieflags |= X86_DMA_MIGHT_NEED_BOUNCE;
+
 	if ((cookieflags & X86_DMA_MIGHT_NEED_BOUNCE) == 0) {
 		*dmamp = map;
 		return 0;
--=_fvz4bw6Sgb7EvIjKFB6Nr1oMeWUc8Daj--
Responsible-Changed-From-To: thorpej->riastradh
Responsible-Changed-By: thorpej@NetBSD.org
Responsible-Changed-When: Tue, 27 Aug 2024 13:37:47 +0000
Responsible-Changed-Why:
Comments added, back to Taylor.
From: Jason Thorpe <thorpej@me.com>
To: Taylor Campbell <riastradh@NetBSD.org>
Cc: "gnats-bugs@netbsd.org" <gnats-bugs@NetBSD.org>,
"netbsd-bugs@netbsd.org" <netbsd-bugs@NetBSD.org>
Subject: Re: kern/58643: bus_dma(9) fails to bounce misaligned inputs
requiring extra segments
Date: Tue, 27 Aug 2024 06:37:02 -0700
> On Aug 25, 2024, at 1:49 PM, Taylor R Campbell <riastradh@NetBSD.org> wrote:
>
> Attached patch attempts to address the issue for x86.
Bounce buffer allocation (and resource allocation in general) is definitely something I’ve never been totally happy with in bus_dma. It may be worth revisiting the issue entirely (resource exhaustion in IOMMU-having systems is a concern, too). But I don’t particularly like having to allocate the resources lazily, either, because it’s then possible to get into a situation where the I/O may never take place.
In any case, yah, your analysis is correct and your patch is “fine” modulo the memory cost issue. I’m not particularly stoked to add another flag to opt in to the behavior, because really any driver that has this constraint needs this and the information is already available in the arguments used to create the map.
-- thorpej
>Unformatted: