NetBSD Problem Report #55067
From www@netbsd.org Thu Mar 12 21:04:01 2020
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id A827F1A9213
for <gnats-bugs@gnats.NetBSD.org>; Thu, 12 Mar 2020 21:04:01 +0000 (UTC)
Message-Id: <20200312210400.82D9E1A924B@mollari.NetBSD.org>
Date: Thu, 12 Mar 2020 21:04:00 +0000 (UTC)
From: thorpej@me.com
Reply-To: thorpej@me.com
To: gnats-bugs@NetBSD.org
Subject: bge(4) does not work on MIPS systems (it panics)
X-Send-Pr-Version: www-1.0
>Number: 55067
>Category: port-mips
>Synopsis: bge(4) does not work on MIPS systems (it panics)
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: thorpej
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Mar 12 21:05:00 +0000 2020
>Closed-Date: Fri Mar 13 03:51:06 +0000 2020
>Last-Modified: Fri Mar 13 07:40:01 +0000 2020
>Originator: Jason Thorpe
>Release: NetBSD 9.99.48
>Organization:
Riscy Business
>Environment:
NetBSD 9.99.48 cobalt mipsel
>Description:
The bge(4) driver does not function correctly on MIPS platforms. The first symptom is the following panic:
Waiting for duplicate address detection to finish...
Starting dhcpcd.
[ 23.6599208] panic: _bus_dmamap_sync: bad length
[ 23.6713571] cpu0: Begin traceback...
[ 23.6713571] pid -2086495680 not found
[ 23.6801778] cpu0: End traceback...
[ 23.6801778] kernel: breakpoint trap
Stopped in pid 119.1 (dhcpcd) at netbsd:cpu_Debugger+0x4: jr ra
bdslot: nop
db> bt
0x83a29a68: cpu_Debugger+4 (3,8000,c,80657c40) ra 803cd228 sz 0
0x83a29a68: vpanic+15c (3,8000,c,80657c40) ra 803cd2bc sz 48
0x83a29a98: panic+24 (3,83edfc90,10998,0) ra 8000dd9c sz 32
0x83a29ab8: _bus_dmamap_sync+4bc (3,83edfc90,10998,0) ra 800726ec sz 72
0x83a29b00: bge_intr+730 (3,83edfc90,10998,0) ra 8000bef8 sz 104
0x83a29b68: icu_intr+130 (3,83edfc90,10998,0) ra 8000c8ac sz 64
0x83a29ba8: cpu_intr+17c (3,83edfc90,10998,0) ra 800119cc sz 56
0x83a29be0: mips3_kern_intr+cc (0,fb00,0,80657c40) ra 80064708 sz 192
0x83a29ca0: bge_ioctl+f8 (0,fb00,0,80657c40) ra 804750f8 sz 48
0x83a29cd0: doifioctl+9a4 (0,80906910,83e3f6e8,80657c40) ra 803dd8d4 sz 304
0x83a29e00: sys_ioctl+2d4 (0,80906910,83e3f6e8,80657c40) ra 8001f1dc sz 200
0x83a29ec8: syscall+28c (0,80906910,83e3f6e8,80657c40) ra 80011e60 sz 128
0x83a29f48: mips3_systemcall+e0 (0,80906910,83e3f6e8,80657c40) ra 7dd7fe40 sz 0
PC 0x7dd7fe40: not in kernel space
0x83a29f48: 0+7dd7fe40 (0,80906910,83e3f6e8,80657c40) ra 0 sz 0
User-level: pid 119.1
db>
There may be other problems.
>How-To-Repeat:
Attempt to use bge(4) on MIPS.
>Fix:
N/A
>Release-Note:
>Audit-Trail:
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/55067: bge(4) does not work on MIPS systems (it panics)
Date: Thu, 12 Mar 2020 22:36:51 -0000 (UTC)
thorpej@me.com writes:
>[ 23.6599208] panic: _bus_dmamap_sync: bad length
>0x83a29ab8: _bus_dmamap_sync+4bc (3,83edfc90,10998,0) ra 800726ec sz 72
mips _bus_dmamap_sync erroneously traps len == 0. It should be gracefully
allowing that value like other archs.
--
--
Michael van Elst
Internet: mlelstv@serpens.de
"A potential Snark may lurk in every tree."
Responsible-Changed-From-To: kern-bug-people->thorpej
Responsible-Changed-By: thorpej@NetBSD.org
Responsible-Changed-When: Thu, 12 Mar 2020 23:15:47 +0000
Responsible-Changed-Why:
Take.
State-Changed-From-To: open->analyzed
State-Changed-By: thorpej@NetBSD.org
State-Changed-When: Thu, 12 Mar 2020 23:15:47 +0000
State-Changed-Why:
Have a fix.
From: Jason Thorpe <thorpej@me.com>
To: gnats-bugs@netbsd.org
Cc: netbsd-bugs@netbsd.org
Subject: Re: kern/55067: bge(4) does not work on MIPS systems (it panics)
Date: Thu, 12 Mar 2020 16:25:52 -0700
> On Mar 12, 2020, at 3:40 PM, Michael van Elst <mlelstv@serpens.de> wrote:
> mips _bus_dmamap_sync erroneously traps len == 0. It should be gracefully
> allowing that value like other archs.
Yah, it stops crashing once I fix that. But it still doesn't work.
Although, I think THAT problem is an issue with my test set-up. I'm
using it in a 32-bit slot, but I suspect the firmware on the board
thinks it got plugged into a 64-bit slot; I noticed that my "gsip" board
also thinks it's connected to a 64-bit slot on this system.
I believe that ACK64# and REQ64# are supposed to be left floating on a
32-bit slot (I need to re-read the spec to be sure). I'm guessing that
they're being pulled in some direction they're not supposed to be on the
Qube2 mainboard, causing the card to mis-detect. That will cause half
of the data path to march off a cliff. Luckily, to make these cards
physically fit the machine, I have to use a couple of risers in a sort
of Rube Goldberg arrangement, so if I'm right about what's supposed to
happen with ACK64# and REQ64#, I'll simply cut those traces on the
risers.
-- thorpej
From: Jason Thorpe <thorpej@me.com>
To: gnats-bugs@netbsd.org
Cc: netbsd-bugs@netbsd.org
Subject: Re: kern/55067: bge(4) does not work on MIPS systems (it panics)
Date: Thu, 12 Mar 2020 20:28:03 -0700
> On Mar 12, 2020, at 4:25 PM, Jason Thorpe <thorpej@me.com> wrote:
>
> Yah, it stops crashing once I fix that. But it still doesn't work.
> Although, I think THAT problem is an issue with my test set-up. I'm
> using it in a 32-bit slot, but I suspect the firmware on the board
> thinks it got plugged into a 64-bit slot; I noticed that my "gsip" board
> also thinks it's connected to a 64-bit slot on this system.
>
> I believe that ACK64# and REQ64# are supposed to be left floating on a
> 32-bit slot (I need to re-read the spec to be sure). I'm guessing that
> they're being pulled in some direction they're not supposed to be on the
> Qube2 mainboard, causing the card to mis-detect. That will cause half
> of the data path to march off a cliff. Luckily, to make these cards
> physically fit the machine, I have to use a couple of risers in a sort
> of Rube Goldberg arrangement, so if I'm right about what's supposed to
> happen with ACK64# and REQ64#, I'll simply cut those traces on the
> risers.
Well, I recalled incorrectly about 64-bit slot detection.
ACK64# and REQ64# are supposed to be pulled high by the system board on
32-bit slots according to PCI 2.1... but I need to dig up an older
version of the spec to see if those pins were floating / NC prior to PCI
2.x.
However, it may simply be the case that these boards don't work in
32-bit slots ... I tried a D-Link DGE-550T (64-bit card, "stge") and it
worked perfectly.
-- thorpej
State-Changed-From-To: analyzed->closed
State-Changed-By: thorpej@NetBSD.org
State-Changed-When: Fri, 13 Mar 2020 03:51:06 +0000
State-Changed-Why:
Module Name: src
Committed By: thorpej
Date: Fri Mar 13 03:49:39 UTC 2020
Modified Files:
src/sys/arch/mips/mips: bus_dma.c
Log Message:
Allow len == 0 in bus_dmamap_sync().
XXX pullup-9
To generate a diff of this commit:
cvs rdiff -u -r1.38 -r1.39 src/sys/arch/mips/mips/bus_dma.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
No longer panics, but also still doesn't work, but that is a separate
problem.
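For readers following the thread, the behavioural change named in the commit log (accepting len == 0) can be illustrated with a minimal userland sketch. This is not the code from src/sys/arch/mips/mips/bus_dma.c and not the r1.39 diff; the function name, the result codes, and the mapsize parameter are all hypothetical and exist only for illustration. It assumes that a zero-length sync should be treated as a no-op while a range that runs past the end of the map is still reported as an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for this userland model; not NetBSD names. */
enum sync_result { SYNC_OK, SYNC_NOOP, SYNC_BAD_RANGE };

/*
 * Model of the argument checks in a bus_dmamap_sync()-style routine.
 * mapsize stands in for the size of the loaded DMA map (an assumption).
 */
enum sync_result
sketch_dmamap_sync_check(size_t mapsize, size_t offset, size_t len)
{
	/* A range that starts or ends past the map is a genuine caller bug. */
	if (offset > mapsize || len > mapsize - offset)
		return SYNC_BAD_RANGE;

	/* A zero-length sync has nothing to flush or invalidate. */
	if (len == 0)
		return SYNC_NOOP;

	/* A real implementation would do its cache maintenance here. */
	return SYNC_OK;
}

/* Usage example: zero length is accepted, overruns are still rejected. */
void
sketch_dmamap_sync_selfcheck(void)
{
	assert(sketch_dmamap_sync_check(4096, 0, 0) == SYNC_NOOP);
	assert(sketch_dmamap_sync_check(4096, 0, 64) == SYNC_OK);
	assert(sketch_dmamap_sync_check(4096, 4000, 200) == SYNC_BAD_RANGE);
}
```

In this sketch the range check runs before the zero-length check, so a zero-length call with an offset beyond the map is still flagged rather than silently ignored.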
From: Nick Hudson <nick.hudson@gmx.co.uk>
To: gnats-bugs@netbsd.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, thorpej@me.com
Cc:
Subject: Re: kern/55067: bge(4) does not work on MIPS systems (it panics)
Date: Fri, 13 Mar 2020 06:58:27 +0000
On 12/03/2020 22:40, Michael van Elst wrote:
> The following reply was made to PR kern/55067; it has been noted by GNATS.
>
> From: mlelstv@serpens.de (Michael van Elst)
> To: gnats-bugs@netbsd.org
> Cc:
> Subject: Re: kern/55067: bge(4) does not work on MIPS systems (it panics)
> Date: Thu, 12 Mar 2020 22:36:51 -0000 (UTC)
>
> thorpej@me.com writes:
>
> >[ 23.6599208] panic: _bus_dmamap_sync: bad length
>
> >0x83a29ab8: _bus_dmamap_sync+4bc (3,83edfc90,10998,0) ra 800726ec sz 72
>
> mips _bus_dmamap_sync erroneously traps len == 0. It should be gracefully
> allowing that value like other archs.
I said the same on source-changes-d, but here goes again.
The check for len !=0 in arm bus_dma.c has caught several bugs over the
years.
I plan on leaving it there.
Nick
From: mlelstv@serpens.de (Michael van Elst)
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/55067: bge(4) does not work on MIPS systems (it panics)
Date: Fri, 13 Mar 2020 07:38:14 -0000 (UTC)
nick.hudson@gmx.co.uk (Nick Hudson) writes:
>On 12/03/2020 22:40, Michael van Elst wrote:
>> mips _bus_dmamap_sync erroneously traps len == 0. It should be gracefully
>> allowing that value like other archs.
>I said the same on source-changes-d, but here goes again.
>The check for len !=0 in arm bus_dma.c has caught several bugs over the
>years.
>I plan on leaving it there.
If the check is supposed to catch bugs it should be a) a message, not a panic,
b) only for DIAGNOSTIC and c) maintained for all archs and not your lone
decision.
--
--
Michael van Elst
Internet: mlelstv@serpens.de
"A potential Snark may lurk in every tree."
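The alternative proposed in the last message (report a zero-length sync with a message rather than a panic, and only in diagnostic builds) could look roughly like the userland sketch below. Everything in it is an assumption made for illustration: SKETCH_DIAGNOSTIC is a stand-in macro defined locally rather than the kernel's DIAGNOSTIC option, and sketch_sync_len() and sketch_warning_count() are hypothetical helpers, not NetBSD functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Stand-in for a DIAGNOSTIC-style build option (an assumption); it is
 * defined here so that the warning path is compiled in.
 */
#define SKETCH_DIAGNOSTIC 1

/* Counts warnings so that callers and tests can observe them. */
static unsigned int sketch_zero_len_warnings;

/* Returns the number of bytes that would be synced; never aborts on 0. */
size_t
sketch_sync_len(const char *caller, size_t len)
{
	assert(caller != NULL);

	if (len == 0) {
#ifdef SKETCH_DIAGNOSTIC
		/* Report the suspicious call instead of panicking. */
		fprintf(stderr, "%s: zero-length sync\n", caller);
		sketch_zero_len_warnings++;
#endif
		return 0;
	}
	return len;
}

/* Number of zero-length calls reported so far. */
unsigned int
sketch_warning_count(void)
{
	return sketch_zero_len_warnings;
}
```

With the stand-in macro left undefined, the zero-length path compiles down to a silent no-op, so a check of this shape could flag suspicious callers in debugging builds without stopping a production system.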
>Unformatted:
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.