NetBSD Problem Report #58649

From manu@netbsd.org  Tue Aug 27 00:47:07 2024
Return-Path: <manu@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256
	 client-signature RSA-PSS (2048 bits) client-digest SHA256)
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id C6ECC1A9241
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 27 Aug 2024 00:47:07 +0000 (UTC)
Message-Id: <20240827004707.32ECF84EF2@mail.netbsd.org>
Date: Tue, 27 Aug 2024 00:47:07 +0000 (UTC)
From: manu@netbsd.org
Reply-To: manu@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: Intel 82801H SATA interrupts bug on NetBSD 10.0 
X-Send-Pr-Version: 3.95

>Number:         58649
>Category:       kern
>Synopsis:       Intel 82801H SATA interrupts bug on NetBSD 10.0
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Aug 27 00:50:00 +0000 2024
>Last-Modified:  Tue Aug 27 23:00:01 +0000 2024
>Originator:     Emmanuel Dreyfus
>Release:        NetBSD 10.0
>Organization:
NetBSD
>Environment:
		NetBSD 10.0/i386
Machine: i386
>Description:
	The machine runs NetBSD 9.3 fine. Here is what is detected:

piixide0 at pci0 dev 31 function 2: Intel 82801H Serial ATA Controller (ICH8) (rev. 0x02)
piixide0: bus-master DMA support present
piixide0: primary channel configured to native-PCI mode
piixide0: using ioapic0 pin 19 for native-PCI interrupt
atabus0 at piixide0 channel 0
piixide0: secondary channel configured to native-PCI mode
atabus1 at piixide0 channel 1
piixide1 at pci0 dev 31 function 5: Intel 82801H Serial ATA Controller (ICH8) (rev. 0x02)
piixide1: bus-master DMA support present
piixide1: primary channel wired to native-PCI mode
piixide1: using ioapic0 pin 19 for native-PCI interrupt
atabus2 at piixide1 channel 0
piixide1: secondary channel wired to native-PCI mode
atabus3 at piixide1 channel 1
wd0 at atabus0 drive 0
wd0: <CF CARD 4GB>
wd0: drive supports 1-sector PIO transfers, LBA addressing
wd0: 3599 MB, 7314 cyl, 16 head, 63 sec, 512 bytes/sect x 7372512 sectors
wd0: 32-bit data port
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
wd0(piixide0:0:0): using PIO mode 4
wd1 at atabus2 drive 0
wd1: <WDC WD15EARX-00ZUDB0>
wd1: drive supports 16-sector PIO transfers, LBA48 addressing
wd1: 1397 GB, 2907021 cyl, 16 head, 63 sec, 512 bytes/sect x 2930277168 sectors (0 bytes/physsect; first aligned sector: 8)
wd1: 32-bit data port
wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133), NCQ (32 tags) w/PRIO
wd1(piixide1:0:0): using PIO mode 4

On NetBSD 10.0, no kernel disk access works. Before the kernel has even
finished booting, it loops on:
piixide0:0:0: lost interrupt
type: ata tc_bcount: 512 tc_skip: 0
piixide0:0:0: bus-master DMA error: missing interrupt, status=0x21
wd1(piixide1:0:0): using PIO mode 4

Booting NetBSD 10.0 with userconf "disable piixide*" gets it to multiuser,
with the pciide driver handling the disk. However, the machine experiences
interrupt storms even when there is no disk access at all. systat reports
more than 50% of time spent in interrupts while the machine is idle:
105031 out of a total of 105406 are attributed to ioapic0 pin 19.
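
As a quick check of those figures, here is a minimal Python sketch. The
function name is illustrative only, and the two numbers are treated as raw
interrupt counts, which the systat output quoted above does not state
explicitly.

```python
# Figures quoted from systat on the idle NetBSD 10.0 machine.
# Assumption: both values are interrupt counts over the same interval.
PIN19_INTERRUPTS = 105031
TOTAL_INTERRUPTS = 105406


def interrupt_share(source_count: int, total_count: int) -> float:
    """Return the percentage of all interrupts raised by one source."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 100.0 * source_count / total_count


share = interrupt_share(PIN19_INTERRUPTS, TOTAL_INTERRUPTS)
others = TOTAL_INTERRUPTS - PIN19_INTERRUPTS
print(f"ioapic0 pin 19: {share:.2f}% of interrupts, "
      f"{others} from all other sources combined")
```

In other words, nearly every interrupt the idle machine takes arrives on
that one shared pin.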

These devices are connected to ioapic0 pin 19:

uhci1: interrupting at ioapic0 pin 19
pciide0: using ioapic0 pin 19 for native-PCI interrupt
pciide1: using ioapic0 pin 19 for native-PCI interrupt

Note that no USB device is connected to the system; uhci1 only has
internal USB hubs as children.
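
To see which devices share an interrupt pin on another machine, the
attachment lines can be grouped mechanically. This is a minimal Python
sketch: the regular expression is an assumption derived only from the three
lines quoted above, and other drivers may word their messages differently.

```python
import re
from collections import defaultdict

# Attachment lines copied from this report (NetBSD 10.0, piixide disabled).
DMESG = """\
uhci1: interrupting at ioapic0 pin 19
pciide0: using ioapic0 pin 19 for native-PCI interrupt
pciide1: using ioapic0 pin 19 for native-PCI interrupt
"""

# Matches "<device>: ... ioapicN pin M". Hypothetical pattern, based solely
# on the wording of the three lines above.
PIN_RE = re.compile(r"^(?P<dev>\w+): .*?(?P<apic>ioapic\d+) pin (?P<pin>\d+)")


def devices_by_pin(dmesg: str) -> dict:
    """Group device names by the (ioapic, pin) pair they interrupt on."""
    groups = defaultdict(list)
    for line in dmesg.splitlines():
        match = PIN_RE.match(line)
        if match:
            key = (match["apic"], int(match["pin"]))
            groups[key].append(match["dev"])
    return dict(groups)


shared = devices_by_pin(DMESG)
for (apic, pin), devs in sorted(shared.items()):
    print(f"{apic} pin {pin}: {', '.join(devs)}")
```

Feeding it the full dmesg output instead of the three-line excerpt would
list every pin with the devices attached to it.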

On NetBSD 9.3, systat vm reports a quiet ioapic0 pin 19, which is what
we expect for no disk access.

>How-To-Repeat:
	I assume one needs an Intel 82801H Serial ATA Controller.
>Fix:
	The workaround is to stick with NetBSD 9.3 for now.

>Audit-Trail:
From: Patrick Welche <prlw1@welche.eu>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/58649: Intel 82801H SATA interrupts bug on NetBSD 10.0
Date: Tue, 27 Aug 2024 23:30:46 +0100

 On Tue, Aug 27, 2024 at 12:50:01AM +0000, manu@netbsd.org wrote:
 > On NetBSD 10.0, no kernel disk access will work. While kernel boot is not
 > even finished, it loops on 
 > pixide0:0:0: lost interrupt
 > type: ata tc_bcount: 512 tc_skip: 0
 > piixide0:0:0: bus-master DMA error: missing interrupt, status=0x21
 > wd1(piixide1:0:0): using PIO mode 4

 This is reminding me of something along the lines of a delay being
 shortened as most drives are faster, but then having to add an
 option to the kernel configs of old servers with slow disks to get
 the old longer timeout back. Of course this has been a while and
 I can't remember the option...

From: Patrick Welche <prlw1@welche.eu>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/58649: Intel 82801H SATA interrupts bug on NetBSD 10.0
Date: Tue, 27 Aug 2024 23:57:48 +0100

 On Tue, Aug 27, 2024 at 10:35:01PM +0000, gnats-admin@netbsd.org wrote:
 > The following reply was made to PR kern/58649; it has been noted by GNATS.
 > 
 > From: Patrick Welche <prlw1@welche.eu>
 > To: gnats-bugs@netbsd.org
 > Cc: 
 > Subject: Re: kern/58649: Intel 82801H SATA interrupts bug on NetBSD 10.0
 > Date: Tue, 27 Aug 2024 23:30:46 +0100
 > 
 >  On Tue, Aug 27, 2024 at 12:50:01AM +0000, manu@netbsd.org wrote:
 >  > On NetBSD 10.0, no kernel disk access will work. While kernel boot is not
 >  > even finished, it loops on 
 >  > pixide0:0:0: lost interrupt
 >  > type: ata tc_bcount: 512 tc_skip: 0
 >  > piixide0:0:0: bus-master DMA error: missing interrupt, status=0x21
 >  > wd1(piixide1:0:0): using PIO mode 4
 >  
 >  This is reminding me of something along the lines of a delay being
 >  shortened as most drives are faster, but then having to add an
 >  option to the kernel configs of old servers with slow disks to get
 >  the old longer timeout back. Of course this has been a while and
 >  I can't remember the option...

 No sooner do I hit return - I was thinking of AHCISATA_EXTRA_DELAY
 which wasn't needed in 9 and which was reworked, jettisoned and
 pulled up to 10 in PR 56737.
