NetBSD Problem Report #57643
From www@netbsd.org Wed Oct 4 06:51:29 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 195371A9238
for <gnats-bugs@gnats.NetBSD.org>; Wed, 4 Oct 2023 06:51:29 +0000 (UTC)
Message-Id: <20231004065127.2A5B61A923A@mollari.NetBSD.org>
Date: Wed, 4 Oct 2023 06:51:27 +0000 (UTC)
From: rokuyama.rk@gmail.com
Reply-To: rokuyama.rk@gmail.com
To: gnats-bugs@NetBSD.org
Subject: aarch64: smmu(4) seems mandatory for some SoC and/or memory configuration
X-Send-Pr-Version: www-1.0
>Number: 57643
>Category: port-arm
>Synopsis: aarch64: smmu(4) seems mandatory for some SoC and/or memory configuration
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-arm-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed Oct 04 06:55:00 +0000 2023
>Last-Modified: Tue Oct 10 04:10:01 +0000 2023
>Originator: Rin Okuyama
>Release: 10.99.8
>Organization:
Internet Initiative Japan Inc.
>Environment:
NetBSD lx2k-igc-intel 10.99.8 NetBSD 10.99.8 (GENERIC64) #152: Wed Oct 4 12:28:25 JST 2023 rin@netbsd:/home/rin/src/sys/arch/evbarm/compile/GENERIC64 evbarm aarch64
>Description:
For HoneyComb LX2K with its latest (2021-08-10) UEFI firmware:
https://images.solid-run.com/LX2k/lx2160a_uefi
TX stalls indefinitely with the upcoming igc(4) driver if the 64-bit
DMA tag is used; only part of a TX packet is marked as processed, and
the succeeding buffers up to EOP are left untouched forever.
As far as I can see, this behavior has never been observed on
OpenBSD or Linux, both of which have an SMMU driver.
It occurs on LX2K with both 64GB and 32GB of memory. On the other
hand, I've never observed similar errors on ROCKPro64 (U-Boot, 4GB
memory) or Quartz64 (UEFI, *8GB* memory).
Even on LX2K, this behavior has never been observed with the 32-bit
DMA tag. Therefore, I *guess* that at least some SoCs and/or memory
configurations require SMMU support for DMA to/from the whole
physical address space.
>How-To-Repeat:
On LX2K with the soon-to-be-committed igc(4) driver:
$ j=0; while iperf3 -c somewhere; do j=$((j + 1)); echo $j; done
TX then stalls within roughly 100 iterations (j <~ 100).
>Fix:
(1) Test LX2K with <= 4GB memory, and other boards with > 4GB memory?
(2) Import smmu(4) from OpenBSD. But, don't we need to have MI
frameworks for IOMMUs?
>Audit-Trail:
From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57643 CVS commit: src/sys/dev/pci
Date: Wed, 4 Oct 2023 07:35:27 +0000
Module Name: src
Committed By: rin
Date: Wed Oct 4 07:35:27 UTC 2023
Modified Files:
src/sys/dev/pci: files.pci
src/sys/dev/pci/igc: if_igc.c if_igc.h igc_api.c igc_api.h igc_base.c
igc_base.h igc_defines.h igc_hw.h igc_i225.c igc_i225.h igc_mac.c
igc_mac.h igc_nvm.c igc_nvm.h igc_phy.c igc_phy.h igc_regs.h
Added Files:
src/sys/dev/pci/igc: igc_evcnt.h
Log Message:
igc(4): Add support to Intel I225 / I226 series ethernet devices
Originally written by kevlo@o for OpenBSD, and ported by knakahara@,
msaitoh@, and myself.
The driver is *EXPERIMENTAL* at the moment, as some minor error
handling paths are not fully implemented.
Hardware VLAN tagging and TSO are not supported yet.
Although, we have never observed strange behaviors at least on amd64,
aarch64{,eb}, and evbppc (IBM405), except for PR port-arm/57643.
We will send pullup request to netbsd-10, after successful snapshot
build for -current.
To generate a diff of this commit:
cvs rdiff -u -r1.446 -r1.447 src/sys/dev/pci/files.pci
cvs rdiff -u -r1.1.1.1 -r1.2 src/sys/dev/pci/igc/if_igc.c \
src/sys/dev/pci/igc/if_igc.h src/sys/dev/pci/igc/igc_api.c \
src/sys/dev/pci/igc/igc_api.h src/sys/dev/pci/igc/igc_base.c \
src/sys/dev/pci/igc/igc_base.h src/sys/dev/pci/igc/igc_defines.h \
src/sys/dev/pci/igc/igc_hw.h src/sys/dev/pci/igc/igc_i225.c \
src/sys/dev/pci/igc/igc_i225.h src/sys/dev/pci/igc/igc_mac.c \
src/sys/dev/pci/igc/igc_mac.h src/sys/dev/pci/igc/igc_nvm.c \
src/sys/dev/pci/igc/igc_nvm.h src/sys/dev/pci/igc/igc_phy.c \
src/sys/dev/pci/igc/igc_phy.h src/sys/dev/pci/igc/igc_regs.h
cvs rdiff -u -r0 -r1.1 src/sys/dev/pci/igc/igc_evcnt.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/57643 CVS commit: src/sys/dev/pci/igc
Date: Wed, 4 Oct 2023 07:41:55 +0000
Module Name: src
Committed By: rin
Date: Wed Oct 4 07:41:55 UTC 2023
Modified Files:
src/sys/dev/pci/igc: if_igc.c
Log Message:
igc(4): XXX: Temporarily disable 64-bit DMA for aarch64
Until PR port-arm/57643 is properly addressed.
To generate a diff of this commit:
cvs rdiff -u -r1.2 -r1.3 src/sys/dev/pci/igc/if_igc.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-arm/57643: aarch64: smmu(4) seems mandatory for some SoC and/or memory configuration
Date: Wed, 4 Oct 2023 13:25:46 +0000
> Date: Wed, 4 Oct 2023 06:55:00 +0000 (UTC)
> From: rokuyama.rk@gmail.com
>
> (2) Import smmu(4) from OpenBSD. But, don't we need to have MI
> frameworks for IOMMUs?
Isn't that what bus_dma is all about? What other framework is needed?
From: Tobias Nygren <tnn@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: rokuyama.rk@gmail.com
Subject: Re: port-arm/57643: aarch64: smmu(4) seems mandatory for some SoC
and/or memory configuration
Date: Wed, 4 Oct 2023 17:12:17 +0200
> It occurs on LX2K with 64 and 32GB memory. On the other hand, I've
> never observed similar errors on ROCKPro64 (U-Boot, 4GB memory)
> and Quartz64 (UEFI, *8GB* memory).
Just to add a data point. Writing this letter from an LX2K, 64 GB
with wm(4) that says it uses 64-bit DMA:
wm0 at pci5 dev 0 function 0, 64-bit DMA: Intel i82574L (rev. 0x00)
This card works fine, as does ahcisata(4) and xhci(4) which also
use 64-bit DMA.
From: Rin Okuyama <rokuyama.rk@gmail.com>
To: Taylor R Campbell <riastradh@NetBSD.org>, gnats-bugs@netbsd.org
Cc: port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: port-arm/57643: aarch64: smmu(4) seems mandatory for some SoC
and/or memory configuration
Date: Fri, 6 Oct 2023 22:19:32 +0900
On 2023/10/04 22:25, Taylor R Campbell wrote:
>> Date: Wed, 4 Oct 2023 06:55:00 +0000 (UTC)
>> From: rokuyama.rk@gmail.com
>>
>> (2) Import smmu(4) from OpenBSD. But, don't we need to have MI
>> frameworks for IOMMUs?
>
> Isn't that what bus_dma is all about? What other framework is needed?
A userland API for virtualization, perhaps?
But that may be getting ahead of ourselves. Yes, we should first
make it usable from bus_dma(9).
Thanks,
rin
From: Rin Okuyama <rokuyama.rk@gmail.com>
To: Tobias Nygren <tnn@NetBSD.org>, gnats-bugs@netbsd.org,
netbsd-bugs@netbsd.org
Cc:
Subject: Re: port-arm/57643: aarch64: smmu(4) seems mandatory for some SoC
and/or memory configuration
Date: Fri, 6 Oct 2023 22:31:28 +0900
On 2023/10/05 0:12, Tobias Nygren wrote:
>> It occurs on LX2K with 64 and 32GB memory. On the other hand, I've
>> never observed similar errors on ROCKPro64 (U-Boot, 4GB memory)
>> and Quartz64 (UEFI, *8GB* memory).
>
> Just to add a data point. Writing this letter from an LX2K, 64 GB
> with wm(4) that says it uses 64-bit DMA:
>
> wm0 at pci5 dev 0 function 0, 64-bit DMA: Intel i82574L (rev. 0x00)
>
> This card works fine, as does ahcisata(4) and xhci(4) which also
> use 64-bit DMA.
Thank you for your feedback.
Hmm, this suggests there may be some bugs in igc(4)...
It turns out that only 2GB of DRAM is mapped below 0xffffffff
on LX2K (thanks to hikaru@ for pointing this out).
I tried a 2GB SO-DIMM, but the UEFI firmware crashes with a sync
exception (confirmed for multiple revisions). Maybe the vendor
does not test memory configurations < 4GB; they recommend only
modules >= 4GB:
https://developer.solid-run.com/knowledge-base/lx2160a-cex7-tested-memory-so-dimms/
I will get another module and test for sure...
So, the current state of igc(4) is not perfect, but I will
send a pull-up request for netbsd-10; it would be good to
receive as much feedback as possible from NetBSD 10.0 RC1 testers.
Thanks,
rin
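For illustration, the 2GB observation above can be expressed as a simple
range check. This is a minimal userland sketch, not NetBSD kernel code:
dma32_reachable() and DMA32_LIMIT are hypothetical names, and it assumes
that bus addresses equal physical addresses (no SMMU translation).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Highest address a 32-bit DMA tag can reach (assumption: bus == physical). */
#define DMA32_LIMIT UINT64_C(0xffffffff)

/*
 * Hypothetical helper, not a NetBSD API: returns true if the whole
 * buffer [pa, pa + len) lies at or below DMA32_LIMIT.
 */
static bool
dma32_reachable(uint64_t pa, uint64_t len)
{
	assert(len > 0);
	assert(pa <= UINT64_MAX - (len - 1));	/* range must not wrap */

	/* Compare the last byte of the buffer against the limit. */
	return pa + (len - 1) <= DMA32_LIMIT;
}
```

Under these assumptions, a buffer for which the check fails can only be
reached through the 64-bit DMA tag, which is the case where the stall
was observed.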
From: Rin Okuyama <rokuyama.rk@gmail.com>
To: port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org
Cc:
Subject: Re: port-arm/57643: aarch64: smmu(4) seems mandatory for some SoC
and/or memory configuration
Date: Tue, 10 Oct 2023 13:08:31 +0900
Some unsuccessful attempts:
(1) The UEFI firmware also crashes with another 2GB module from a
different vendor.
(2) The wm(4) driver (since rev 1.69, back in 2004) imposes a boundary
condition on TX/RX descriptors, so that they never cross a 4GB boundary:
http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dev/pci/if_wm.c#rev1.69
https://nxr.netbsd.org/xref/src/sys/dev/pci/if_wm.c#wm_alloc_tx_descs
But this is irrelevant in this case; the descriptor buffer does not
cross a 4GB boundary when this symptom is observed.
(3) The problem cannot be reproduced with an I226 card:
igc0 at pci3 dev 0 function 0: Intel(R) Ethernet Controller I226-V (rev. 0x04)
But this is a dual-port card, and that may be only because the
device is internally connected via ppb(4):
pci1 at acpipchb1 bus 1
pci1: i/o space, memory space enabled, rd/line, rd/mult, wr/inv o
ppb0 at pci1 dev 0 function 0: vendor 1b21 product 1182 (rev. 0x00)
ppb0: PCI Express capability version 2 <Upstream Port of PCI-E Switch>
pci2 at ppb0 bus 2
pci2: i/o space, memory space enabled
ppb1 at pci2 dev 3 function 0: vendor 1b21 product 1182 (rev. 0x00)
ppb1: PCI Express capability version 2 <Downstream Port of PCI-E Switch> x1 @ 5.0GT/s
pci3 at ppb1 bus 3
I ordered a single port I226 card as well as a dual port I225 in
order to confirm this scenario.
Thanks,
rin
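The 4GB boundary condition discussed in item (2) can be sketched as
follows. This is a standalone illustration, not code from wm(4) or
igc(4): crosses_4g_boundary() is a hypothetical helper, and in the kernel
the constraint would normally be expressed through the boundary argument
of bus_dmamem_alloc(9) rather than checked by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a NetBSD API: returns true if the buffer
 * [pa, pa + len) spans a multiple of 4GB, i.e. its first and last
 * bytes differ in the upper 32 bits of the address.
 */
static bool
crosses_4g_boundary(uint64_t pa, uint64_t len)
{
	assert(len > 0);
	assert(pa <= UINT64_MAX - (len - 1));	/* range must not wrap */

	uint64_t last = pa + (len - 1);		/* address of the last byte */

	return (pa >> 32) != (last >> 32);
}
```

A descriptor ring that starts exactly on a 4GB boundary does not cross
it; only a ring whose first and last bytes fall on different sides of
the boundary does.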
Copyright © 1994-2023
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.