NetBSD Problem Report #54899

From kardel@kardel.name  Mon Jan 27 07:38:36 2020
Return-Path: <kardel@kardel.name>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 854C57A172
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 27 Jan 2020 07:38:36 +0000 (UTC)
Message-Id: <20200127073831.CB377DA0D98@pip.kardel.name>
Date: Mon, 27 Jan 2020 08:38:31 +0100 (CET)
From: kardel@netbsd.org
Reply-To: kardel@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: crash DIAGNOSTIC in extent_alloc_region in ahd driver attach
X-Send-Pr-Version: 3.95

>Number:         54899
>Category:       kern
>Synopsis:       crash DIAGNOSTIC in extent_alloc_region in ahd driver attach
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Jan 27 07:40:01 +0000 2020
>Closed-Date:    Thu Aug 05 05:36:10 +0000 2021
>Last-Modified:  Thu Aug 05 05:36:10 +0000 2021
>Originator:     Frank Kardel
>Release:        NetBSD 9.99.42
>Organization:

>Environment:


System: NetBSD pip.kardel.name 9.99.42 NetBSD 9.99.42 (PIPGEN) #0: Sat Jan 25 16:40:30 CET 2020 kardel@...:/src/NetBSD/act/src/obj.amd64/sys/arch/amd64/compile/PIPGEN amd64
Architecture: x86_64
Machine: amd64
>Description:
	When a SCSI interface for the ahd driver is added, the system crashes in extent_alloc_region while the driver attaches.
/*
 * Allocate a specific region in an extent map.
 */
int
extent_alloc_region(struct extent *ex, u_long start, u_long size, int flags)
{
        struct extent_region *rp, *last, *myrp;
        u_long end = start + (size - 1);
        int error;

#ifdef DIAGNOSTIC
        /* Check arguments. */
        if (ex == NULL)
                panic("extent_alloc_region: NULL extent");
        if (size < 1) {
                printf("extent_alloc_region: extent `%s', size 0x%lx\n",
                    ex->ex_name, size);
                panic("extent_alloc_region: bad size");
        }
        if (end < start) {
                printf(
                 "extent_alloc_region: extent `%s', start 0x%lx, size 0x%lx\n",
                    ex->ex_name, start, size);   /* <--- crash here */
                panic("extent_alloc_region: overflow");
        }
#endif

ex->ex_name is not correctly initialized or passed.
There also seems to be an issue with the start and size parameters, as
end < start is true. I will try to gather the actual values of start and
size this evening.
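
The DIAGNOSTIC check quoted above relies on unsigned wraparound: when
start + (size - 1) does not fit in a u_long, the computed end comes out
smaller than start. A minimal standalone sketch of that condition follows.
It assumes a 64-bit u_long as on amd64 (modelled here with uint64_t); the
helper name region_end_wraps is hypothetical and is not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the "end < start" test in the
 * DIAGNOSTIC block. uint64_t stands in for u_long on amd64
 * (an assumption). The caller must pass size >= 1, which the
 * kernel checks separately before this test.
 */
static bool
region_end_wraps(uint64_t start, uint64_t size)
{
        /* Unsigned arithmetic wraps modulo 2^64 instead of trapping. */
        uint64_t end = start + (size - 1);

        return end < start;
}
```

With illustrative values (not taken from the crash),
region_end_wraps(UINT64_MAX - 0xff, 0x1000) is true, while
region_end_wraps(0x1000, 0x100) is false.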

Stack (manual partial copy from photo)
extent_alloc_region()
bus_space_reserve()
bus_space_map()
pci_mapreg_submap()
pci_mapreg_map()
ahd_pci_attach()
...

>How-To-Repeat:
	Use an ASUS PRIME X570-PRO motherboard with a Ryzen 9 CPU and an AIC-7901X
	based SCSI controller.
>Fix:
	?

>Release-Note:

>Audit-Trail:
From: Frank Kardel <kardel@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/54899: crash DIAGNOSTIC in extent_alloc_region in ahd driver
 attach
Date: Wed, 5 Feb 2020 11:40:28 +0100

 Got a new BIOS (1406). The card is visible again and is initialized by the
 BIOS CSM (legacy).

 The panic still happens, as the memory address is reported to be 64-bit
 and located at
 07:04.0
    Type 0 ("normal" device) header:
      0x10: 0x0000b101 0xf7080004 0x40000000 0x0000b001
      0x20: 0x00000000 0x00000000 0x00000000 0x00459005
      0x30: 0xf7000000 0x000000dc 0x00000000 0x1928010a

      Base address register at 0x10
        type: 16-bit I/O
        base: 0x0000b100
        size: 0x00000100
      Base address register at 0x14
        type: 64-bit nonprefetchable memory
        base: 0x40000000f7080000
        size: 0x0000000000080000
      Base address register at 0x1c
        type: 16-bit I/O
        base: 0x0000b000
        size: 0x00001000
      Base address register at 0x20
        not implemented
      Base address register at 0x24
        not implemented
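
The 64-bit base shown in the dump can be reproduced from the raw
configuration dwords at 0x14 and 0x18 (0xf7080004 and 0x40000000). The
following is a minimal sketch of standard PCI memory BAR decoding, not
NetBSD's pci_mapreg code; all function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Memory BAR layout in the low dword:
 *   bit 0     0 = memory space, 1 = I/O space
 *   bits 2:1  type, where 0x2 means a 64-bit BAR
 *   bit 3     prefetchable
 *   bits 31:4 low part of the base address
 * For a 64-bit BAR the next dword holds bits 63:32 of the base.
 */
static bool
bar_is_mem64(uint32_t lo)
{
        return (lo & 0x1u) == 0 && ((lo >> 1) & 0x3u) == 0x2u;
}

static bool
bar_is_prefetchable(uint32_t lo)
{
        return (lo & 0x8u) != 0;
}

static uint64_t
bar_mem64_base(uint32_t lo, uint32_t hi)
{
        /* Mask off the four flag bits, then combine both halves. */
        return ((uint64_t)hi << 32) | (uint64_t)(lo & 0xfffffff0u);
}
```

For the dwords above, bar_is_mem64(0xf7080004) is true,
bar_is_prefetchable(0xf7080004) is false, and
bar_mem64_base(0xf7080004, 0x40000000) yields 0x40000000f7080000, which
matches the decoded entry for the register at 0x14.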

 The address 0x40000000f7080000 looks suspicious, especially as this
 controller chip is located behind a bridge that has its memory
 configured as
 06:00.0
      Base address register at 0x10
        type: 32-bit nonprefetchable memory
        base: 0xf7100000
        size: 0x00100000
      Base address register at 0x14
        not implemented
      Primary bus number: 0x06
      Secondary bus number: 0x07
      Subordinate bus number: 0x07

 The other bridges on the path also remain in the 32-bit range.

 To the PCI experts here: are we having a BIOS issue or do we have to fix 
 something in NetBSD? The previous motherboard remained in the 32-bit 
 range for the memory address.

 Frank


From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/54899: crash DIAGNOSTIC in extent_alloc_region in ahd
 driver attach
Date: Wed, 9 Jun 2021 04:31:32 +0000

 Found the following buried in the gnats administrator mailbox.

 Note that it's dated before the preceding followup message.

    ------

 From: Frank Kardel <kardel@netbsd.org>
 To: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
 Subject: Re: kern/54899: crash DIAGNOSTIC in extent_alloc_region in ahd driver
 	attach
 Date: Sun, 2 Feb 2020 14:12:13 +0100

 It turns out that this card and mainboard (BIOS) have issues. The PCI dump
 gives:

     Base address register at 0x14
       type: 64-bit nonprefetchable memory
       base: 0x40000000f7080000
       size: 0x0000000000080000

 Bit 62 is set, so the address exceeds the iomem range.

 On the previous board the value was:

     Base address register at 0x14
       type: 64-bit nonprefetchable memory
       base: 0x00000000fe180000

 which is much more sensible.
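
The difference between the two base values can be checked mechanically.
The sketch below tests a single address bit and whether an address fits in
32 bits; the helper names are hypothetical and this is only an
illustration, not code from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the given bit (0..63) is set in addr. */
static bool
addr_bit_set(uint64_t addr, unsigned bit)
{
        return ((addr >> bit) & 1u) != 0;
}

/* True if addr is representable in 32 bits, i.e. lies below 4 GiB. */
static bool
addr_fits_32bit(uint64_t addr)
{
        return addr <= UINT32_MAX;
}
```

For 0x40000000f7080000, addr_bit_set(addr, 62) is true and
addr_fits_32bit(addr) is false. For 0x00000000fe180000 from the previous
board the results are the other way round.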

 Upgrading the BIOS made the BIOS start very slowly, and the card was no
 longer recognized, apart from delaying the startup from the normal ~30
 seconds to several minutes.

 Will close this bug.

 The mainboard is an ASUS Prime X570-PRO with BIOS 1405.

 Frank


State-Changed-From-To: open->closed
State-Changed-By: kardel@NetBSD.org
State-Changed-When: Thu, 05 Aug 2021 05:36:10 +0000
State-Changed-Why:
BIOS bug
BIOS maps 64-bit BAR to 0x4000000000000000 behind
a 32-bit bridge.
Workaround: disable memory-mapped I/O


>Unformatted:

Copyright © 1994-2020 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.