NetBSD Problem Report #57643

From www@netbsd.org  Wed Oct  4 06:51:29 2023
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 195371A9238
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  4 Oct 2023 06:51:29 +0000 (UTC)
Message-Id: <20231004065127.2A5B61A923A@mollari.NetBSD.org>
Date: Wed,  4 Oct 2023 06:51:27 +0000 (UTC)
From: rokuyama.rk@gmail.com
Reply-To: rokuyama.rk@gmail.com
To: gnats-bugs@NetBSD.org
Subject: aarch64: smmu(4) seems mandatory for some SoC and/or memory configuration
X-Send-Pr-Version: www-1.0

>Number:         57643
>Category:       port-arm
>Synopsis:       aarch64: smmu(4) seems mandatory for some SoC and/or memory configuration
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    port-arm-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Oct 04 06:55:00 +0000 2023
>Last-Modified:  Tue Oct 10 04:10:01 +0000 2023
>Originator:     Rin Okuyama
>Release:        10.99.8
>Organization:
Internet Initiative Japan Inc.
>Environment:
NetBSD lx2k-igc-intel 10.99.8 NetBSD 10.99.8 (GENERIC64) #152: Wed Oct  4 12:28:25 JST 2023  rin@netbsd:/home/rin/src/sys/arch/evbarm/compile/GENERIC64 evbarm aarch64
>Description:
On the HoneyComb LX2K with its latest (2021-08-10) UEFI firmware:

https://images.solid-run.com/LX2k/lx2160a_uefi

With the upcoming igc(4) driver, TX stalls indefinitely if the 64-bit DMA
tag is used: only part of a TX packet is marked as processed, and the
remaining buffers up to EOP are never touched.

As far as I can see, this behavior has never been observed on
OpenBSD or Linux, both of which have an SMMU driver.

It occurs on LX2K with 32GB and 64GB of memory. On the other hand, I've
never observed similar errors on ROCKPro64 (U-Boot, 4GB memory)
or Quartz64 (UEFI, *8GB* memory).

Even on LX2K, this behavior has never been observed with the 32-bit
DMA tag. Therefore, I *guess* that at least some SoCs and/or memory
configurations require SMMU support for DMA to/from the whole
physical address space.
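
To make the distinction concrete, here is a minimal userland sketch in C
(not kernel code, and not taken from igc(4)) of what the two DMA tags
permit. The names `dma_tag_model` and `buffer_reachable` are hypothetical
and exist only for illustration. The assumption is that a 32-bit DMA tag
restricts device-visible addresses to below 4GB, while a 64-bit tag
places no such limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a DMA tag: only the address limit matters here. */
enum dma_tag_model { DMA_TAG_32BIT, DMA_TAG_64BIT };

/*
 * Return true if the buffer [paddr, paddr + len) may be handed to the
 * device under the given tag.  Assumption: a 32-bit tag requires every
 * byte of the buffer to lie below 4GB.
 */
static bool
buffer_reachable(enum dma_tag_model tag, uint64_t paddr, uint64_t len)
{
	assert(len > 0);
	assert(paddr <= UINT64_MAX - (len - 1));	/* no wrap-around */

	if (tag == DMA_TAG_64BIT)
		return true;
	return paddr + (len - 1) <= UINT32_MAX;
}
```

In NetBSD PCI drivers this choice is typically made at attach time,
between `pa_dmat` and `pa_dmat64`; the model above only captures the
address limit, not any translation an IOMMU might perform.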
>How-To-Repeat:
On LX2K with the soon-to-be-committed igc(4) driver:

$ j=0; while iperf3 -c somewhere; do j=$((j + 1)); echo $j; done

TX then stalls at some iteration j <~ 100.
>Fix:
(1) Test LX2K with <= 4GB memory, and other boards with > 4GB memory?

(2) Import smmu(4) from OpenBSD. But, don't we need to have MI
    frameworks for IOMMUs?

>Audit-Trail:
From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57643 CVS commit: src/sys/dev/pci
Date: Wed, 4 Oct 2023 07:35:27 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Wed Oct  4 07:35:27 UTC 2023

 Modified Files:
 	src/sys/dev/pci: files.pci
 	src/sys/dev/pci/igc: if_igc.c if_igc.h igc_api.c igc_api.h igc_base.c
 	    igc_base.h igc_defines.h igc_hw.h igc_i225.c igc_i225.h igc_mac.c
 	    igc_mac.h igc_nvm.c igc_nvm.h igc_phy.c igc_phy.h igc_regs.h
 Added Files:
 	src/sys/dev/pci/igc: igc_evcnt.h

 Log Message:
 igc(4): Add support for Intel I225 / I226 series ethernet devices

 Originally written by kevlo@o for OpenBSD, and ported by knakahara@,
 msaitoh@, and myself.

 The driver is *EXPERIMENTAL* at the moment, as some minor error
 handling paths are not fully implemented.

 Hardware VLAN tagging and TSO are not supported yet.

 That said, we have not observed any strange behavior, at least on amd64,
 aarch64{,eb}, and evbppc (IBM405), except for PR port-arm/57643.

 We will send a pullup request for netbsd-10 after a successful snapshot
 build of -current.


 To generate a diff of this commit:
 cvs rdiff -u -r1.446 -r1.447 src/sys/dev/pci/files.pci
 cvs rdiff -u -r1.1.1.1 -r1.2 src/sys/dev/pci/igc/if_igc.c \
     src/sys/dev/pci/igc/if_igc.h src/sys/dev/pci/igc/igc_api.c \
     src/sys/dev/pci/igc/igc_api.h src/sys/dev/pci/igc/igc_base.c \
     src/sys/dev/pci/igc/igc_base.h src/sys/dev/pci/igc/igc_defines.h \
     src/sys/dev/pci/igc/igc_hw.h src/sys/dev/pci/igc/igc_i225.c \
     src/sys/dev/pci/igc/igc_i225.h src/sys/dev/pci/igc/igc_mac.c \
     src/sys/dev/pci/igc/igc_mac.h src/sys/dev/pci/igc/igc_nvm.c \
     src/sys/dev/pci/igc/igc_nvm.h src/sys/dev/pci/igc/igc_phy.c \
     src/sys/dev/pci/igc/igc_phy.h src/sys/dev/pci/igc/igc_regs.h
 cvs rdiff -u -r0 -r1.1 src/sys/dev/pci/igc/igc_evcnt.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Rin Okuyama" <rin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/57643 CVS commit: src/sys/dev/pci/igc
Date: Wed, 4 Oct 2023 07:41:55 +0000

 Module Name:	src
 Committed By:	rin
 Date:		Wed Oct  4 07:41:55 UTC 2023

 Modified Files:
 	src/sys/dev/pci/igc: if_igc.c

 Log Message:
 igc(4): XXX: Temporarily disable 64-bit DMA for aarch64

 Until PR port-arm/57643 is properly addressed.


 To generate a diff of this commit:
 cvs rdiff -u -r1.2 -r1.3 src/sys/dev/pci/igc/if_igc.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-arm/57643: aarch64: smmu(4) seems mandatory for some SoC and/or memory configuration
Date: Wed, 4 Oct 2023 13:25:46 +0000

 > Date: Wed,  4 Oct 2023 06:55:00 +0000 (UTC)
 > From: rokuyama.rk@gmail.com
 > 
 > (2) Import smmu(4) from OpenBSD. But, don't we need to have MI
 >     frameworks for IOMMUs?

 Isn't that what bus_dma is all about?  What other framework is needed?

From: Tobias Nygren <tnn@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: rokuyama.rk@gmail.com
Subject: Re: port-arm/57643: aarch64: smmu(4) seems mandatory for some SoC
 and/or memory configuration
Date: Wed, 4 Oct 2023 17:12:17 +0200

 > It occurs on LX2K with 32GB and 64GB of memory. On the other hand, I've
 > never observed similar errors on ROCKPro64 (U-Boot, 4GB memory)
 > and Quartz64 (UEFI, *8GB* memory).

 Just to add a data point. Writing this letter from an LX2K, 64 GB
 with wm(4) that says it uses 64-bit DMA:

 wm0 at pci5 dev 0 function 0, 64-bit DMA: Intel i82574L (rev. 0x00)

 This card works fine, as does ahcisata(4) and xhci(4) which also
 use 64-bit DMA.

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: Taylor R Campbell <riastradh@NetBSD.org>, gnats-bugs@netbsd.org
Cc: port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org,
 netbsd-bugs@netbsd.org
Subject: Re: port-arm/57643: aarch64: smmu(4) seems mandatory for some SoC
 and/or memory configuration
Date: Fri, 6 Oct 2023 22:19:32 +0900

 On 2023/10/04 22:25, Taylor R Campbell wrote:
 >> Date: Wed,  4 Oct 2023 06:55:00 +0000 (UTC)
 >> From: rokuyama.rk@gmail.com
 >>
 >> (2) Import smmu(4) from OpenBSD. But, don't we need to have MI
 >>      frameworks for IOMMUs?
 > 
 > Isn't that what bus_dma is all about?  What other framework is needed?

 An API for userland, for virtualization?

 But that may be a leap. Yes, we should first make it usable from
 bus_dma(9).

 Thanks,
 rin

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: Tobias Nygren <tnn@NetBSD.org>, gnats-bugs@netbsd.org,
 netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-arm/57643: aarch64: smmu(4) seems mandatory for some SoC
 and/or memory configuration
Date: Fri, 6 Oct 2023 22:31:28 +0900

 On 2023/10/05 0:12, Tobias Nygren wrote:
 >> It occurs on LX2K with 32GB and 64GB of memory. On the other hand, I've
 >> never observed similar errors on ROCKPro64 (U-Boot, 4GB memory)
 >> and Quartz64 (UEFI, *8GB* memory).
 > 
 > Just to add a data point. Writing this letter from an LX2K, 64 GB
 > with wm(4) that says it uses 64-bit DMA:
 > 
 > wm0 at pci5 dev 0 function 0, 64-bit DMA: Intel i82574L (rev. 0x00)
 > 
 > This card works fine, as does ahcisata(4) and xhci(4) which also
 > use 64-bit DMA.

 Thank you for your feedback.

 Hmm, this suggests there may be some bugs in igc(4)...

 It turns out that only 2GB of DRAM is mapped below 0xffff ffff
 on LX2K (thanks to hikaru@ for pointing it out).
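
As a rough illustration of how little memory a 32-bit DMA tag has to
work with on such a board, the following C sketch sums the DRAM that
lies below the 4GB line for a given set of ranges. `struct mem_range`
and `bytes_below_4g` are hypothetical names, and the layout used in the
note after the code is an assumption chosen to match the "only 2GB
below 0xffff ffff" observation, not something read out of the firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FOUR_GB	0x100000000ULL

/* Hypothetical description of one physical DRAM range. */
struct mem_range {
	uint64_t start;	/* first byte of the range */
	uint64_t size;	/* length in bytes */
};

/* Sum the bytes of all ranges that lie below the 4GB line. */
static uint64_t
bytes_below_4g(const struct mem_range *r, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++) {
		assert(r[i].start <= UINT64_MAX - r[i].size);
		uint64_t end = r[i].start + r[i].size;	/* exclusive */

		if (r[i].start >= FOUR_GB)
			continue;	/* entirely above 4GB */
		if (end > FOUR_GB)
			end = FOUR_GB;	/* clip at the 4GB line */
		total += end - r[i].start;
	}
	return total;
}
```

For an assumed 64GB layout of 2GB at 0x80000000 plus 62GB starting at
0x2080000000, this returns 2GB: only that first range is usable for DMA
with a 32-bit tag.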

 I tried a 2GB SO-DIMM, but the UEFI firmware crashes with a sync
 exception (confirmed for multiple revisions). Maybe the vendor
 does not test memory configurations < 4GB; they recommend only
 modules >= 4GB:

 https://developer.solid-run.com/knowledge-base/lx2160a-cex7-tested-memory-so-dimms/

 I will get another module and test for sure...

 So, the current state of igc(4) is not perfect, but I will
 send a pullup request for netbsd-10; it would be good to
 receive plenty of feedback from NetBSD 10.0 RC1 testers.

 Thanks,
 rin

From: Rin Okuyama <rokuyama.rk@gmail.com>
To: port-arm-maintainer@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org, gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-arm/57643: aarch64: smmu(4) seems mandatory for some SoC
 and/or memory configuration
Date: Tue, 10 Oct 2023 13:08:31 +0900

 Some unsuccessful attempts:

 (1) The UEFI firmware also crashes with another 2GB module from a
     different vendor.

 (2) The wm(4) driver (since rev 1.69, back in 2004) imposes a boundary
     condition on TX/RX descriptors, forbidding them from crossing
     4GB boundaries:

 http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/dev/pci/if_wm.c#rev1.69
 https://nxr.netbsd.org/xref/src/sys/dev/pci/if_wm.c#wm_alloc_tx_descs

     But this is irrelevant in this case; the descriptor buffer does not
     cross a 4GB boundary when this symptom is observed.

 (3) The problem cannot be reproduced with an I226 card:

 igc0 at pci3 dev 0 function 0: Intel(R) Ethernet Controller I226-V (rev. 0x04)

     But this is a dual-port card, and that may be just because the
     device is internally connected via ppb(4):

 pci1 at acpipchb1 bus 1
 pci1: i/o space, memory space enabled, rd/line, rd/mult, wr/inv o
 ppb0 at pci1 dev 0 function 0: vendor 1b21 product 1182 (rev. 0x00)
 ppb0: PCI Express capability version 2 <Upstream Port of PCI-E Switch>
 pci2 at ppb0 bus 2
 pci2: i/o space, memory space enabled
 ppb1 at pci2 dev 3 function 0: vendor 1b21 product 1182 (rev. 0x00)
 ppb1: PCI Express capability version 2 <Downstream Port of PCI-E Switch> x1 @ 5.0GT/s
 pci3 at ppb1 bus 3

     I have ordered a single-port I226 card as well as a dual-port I225
     to confirm this scenario.
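
The boundary condition from item (2) can be sketched as follows. This
is a userland illustration with a hypothetical name (`crosses_4g`), not
the wm(4) code itself. It only checks whether a descriptor area
straddles a 4GB boundary, that is, whether the upper 32 bits of the
addresses of its first and last bytes differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if [paddr, paddr + len) crosses a 4GB boundary, i.e. the
 * first and last byte do not share the same upper 32 address bits.
 */
static bool
crosses_4g(uint64_t paddr, uint64_t len)
{
	assert(len > 0);
	assert(paddr <= UINT64_MAX - (len - 1));	/* no wrap-around */

	uint64_t last = paddr + (len - 1);
	return (paddr >> 32) != (last >> 32);
}
```

In bus_dma(9) terms, a constraint of this kind is normally expressed
through the boundary argument of bus_dmamem_alloc() rather than checked
by hand after allocation.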

 Thanks,
 rin

Copyright © 1994-2023 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.