NetBSD Problem Report #56459
From www@netbsd.org Tue Oct 19 19:03:31 2021
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id C77281A9239
for <gnats-bugs@gnats.NetBSD.org>; Tue, 19 Oct 2021 19:03:31 +0000 (UTC)
Message-Id: <20211019190329.DE9691A923A@mollari.NetBSD.org>
Date: Tue, 19 Oct 2021 19:03:29 +0000 (UTC)
From: rorybolt@gmail.com
Reply-To: rorybolt@gmail.com
To: gnats-bugs@NetBSD.org
Subject: Raspberry Pi 3B+ boot failure and solution
X-Send-Pr-Version: www-1.0
>Number: 56459
>Category: port-arm
>Synopsis: Raspberry Pi 3B+ boot failure and solution
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: skrll
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Oct 19 19:05:00 +0000 2021
>Closed-Date: Tue Oct 19 19:18:29 +0000 2021
>Last-Modified: Tue Oct 19 20:25:01 +0000 2021
>Originator: Rory Bolt
>Release: 9.99.91
>Organization:
Kioxia
>Environment:
NetBSD arm64 9.99.91 NetBSD 9.99.91 (GENERIC64)
>Description:
The Raspberry Pi 3B+ platform has been broken since 9.99.88 with various boot problems. The latest is a panic with the following backtrace (sorry, this was transcribed from a picture of the panic, so the addresses and other details are omitted):
panic: kernel diagnostic assertion "l->l_stat == LSONPROC" failed in kern_sleepq.c
vpanic()
kern_assert()
sleepq_enqueue()
cv_enter()
cv_wait()
xc_wait()
pic_establish_intr()
bcm2836mp_intr_init()
arm_fdt_cpu_hatch()
cpu_hatch()
cpu_mpstart()
By adding debugging output I was able to verify that l->l_stat was LSIDL: we were trying to sleep on the idle lwp.
The fundamental problem is the same as the earlier ones Rin fixed: when the secondary processors are initializing on the idle lwp, they cannot suspend/sleep. As has been previously mentioned on the port-arm mailing list, there are MANY opportunities for locking in the processor initialization path - and it would be great if this were reworked.
The specific problem here is that the "cold" flag has been cleared before pic_establish_intr() was called, and as a result xc_broadcast() and xc_wait() are being executed instead of just pic_unblock_irqs().
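The failure mode can be modelled in a few lines of user-space C. This is a simplified sketch, not the kernel source: the enums and every name with a model_ or MODEL_ prefix are hypothetical stand-ins, and only the branch on "cold" and the LSONPROC requirement are taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two lwp states named in this report. */
enum model_lstat { MODEL_LSIDL, MODEL_LSONPROC };

/* Possible outcomes of establishing an interrupt in this model. */
enum model_result {
	MODEL_UNBLOCKED_LOCALLY,	/* cold path: pic_unblock_irqs() only */
	MODEL_XCALL_OK,			/* xc_broadcast()/xc_wait() may sleep */
	MODEL_PANIC			/* the diagnostic assertion would fire */
};

/*
 * Model of the choice described above: while "cold" is set the IRQ is
 * unblocked directly; once it is cleared a cross-call is issued and the
 * caller has to sleep, which is only legal when its lwp is LSONPROC.
 */
enum model_result
model_pic_establish_intr(bool cold, enum model_lstat l_stat)
{
	/* Reject values outside the two states this model knows about. */
	assert(l_stat == MODEL_LSIDL || l_stat == MODEL_LSONPROC);

	if (cold)
		return MODEL_UNBLOCKED_LOCALLY;
	if (l_stat != MODEL_LSONPROC)
		return MODEL_PANIC;
	return MODEL_XCALL_OK;
}
```

In this model, model_pic_establish_intr(false, MODEL_LSIDL) yields MODEL_PANIC, which corresponds to the reported backtrace, while the same call with "cold" still set yields MODEL_UNBLOCKED_LOCALLY.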
>How-To-Repeat:
Attempt to boot any of the daily builds since June 2021 on a Raspberry Pi 3B+.
>Fix:
In this case the fix is easy, although as mentioned in the description I see many other opportunities to enter sleepq_enqueue() during the arm secondary processor initialization path.
The solution to the current problem is to move the "cold = 0" statement in sys/kern/init_main.c from its current location in configure2() at line 808 to AFTER the call to cpu_boot_secondary_processors() at line 827. I inserted it immediately prior to the "mp_ready = true" line.
By doing this I can successfully boot the latest development kernel on my Raspberry Pi 3B+.
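The effect of the reordering can be sketched with a small user-space model. Everything here is hypothetical scaffolding (the model_ names, the step enum, the counting of unsafe hatches); it only encodes the claim made in this report, that secondary CPUs hatching on the idle lwp need to see "cold" still set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical boot steps; only their relative order matters here. */
enum model_step {
	MODEL_CLEAR_COLD,		/* "cold = 0" */
	MODEL_BOOT_SECONDARIES,		/* cpu_boot_secondary_processors() */
	MODEL_SET_MP_READY		/* "mp_ready = true" */
};

/*
 * Walk a boot sequence and count how many times the secondary CPUs are
 * started after "cold" has already been cleared, i.e. how many times
 * the hatch path could end up trying to sleep on the idle lwp.
 */
int
model_unsafe_hatches(const enum model_step *steps, size_t nsteps)
{
	bool cold = true;
	int unsafe = 0;

	assert(steps != NULL || nsteps == 0);
	for (size_t i = 0; i < nsteps; i++) {
		switch (steps[i]) {
		case MODEL_CLEAR_COLD:
			cold = false;
			break;
		case MODEL_BOOT_SECONDARIES:
			if (!cold)
				unsafe++;
			break;
		case MODEL_SET_MP_READY:
			break;
		}
	}
	return unsafe;
}

/* Order as reported: "cold" is cleared before the secondaries boot. */
const enum model_step model_reported_order[] = {
	MODEL_CLEAR_COLD, MODEL_BOOT_SECONDARIES, MODEL_SET_MP_READY
};

/* Order proposed above: "cold" is cleared just before mp_ready is set. */
const enum model_step model_proposed_order[] = {
	MODEL_BOOT_SECONDARIES, MODEL_CLEAR_COLD, MODEL_SET_MP_READY
};
```

Passing model_reported_order to model_unsafe_hatches() counts one unsafe hatch, and passing model_proposed_order counts none. The model says nothing about the other sleep opportunities in the initialization path mentioned above.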
>Release-Note:
>Audit-Trail:
Responsible-Changed-From-To: port-arm-maintainer->skrll
Responsible-Changed-By: skrll@NetBSD.org
Responsible-Changed-When: Tue, 19 Oct 2021 19:18:29 +0000
Responsible-Changed-Why:
Take as I'm fixing this
State-Changed-From-To: open->closed
State-Changed-By: skrll@NetBSD.org
State-Changed-When: Tue, 19 Oct 2021 19:18:29 +0000
State-Changed-Why:
Close as duplicate of 56264.
Thanks for the analysis Rory - I hope you don't mind that I close this PR and track
the issue via 56264.
From: Rory Bolt <rory.bolt@gmail.com>
To: gnats-bugs@netbsd.org
Cc: skrll@netbsd.org, port-arm-maintainer@netbsd.org, netbsd-bugs@netbsd.org,
gnats-admin@netbsd.org
Subject: Re: port-arm/56459 (Raspberry Pi 3B+ boot failure and solution)
Date: Tue, 19 Oct 2021 13:21:10 -0700
BTW...
I attempted to join the port-arm mailing list to discuss this further, but
my request to join was denied?
-Rory
On Tue, Oct 19, 2021 at 12:18 PM <skrll@netbsd.org> wrote:
> Synopsis: Raspberry Pi 3B+ boot failure and solution
>
> Responsible-Changed-From-To: port-arm-maintainer->skrll
> Responsible-Changed-By: skrll@NetBSD.org
> Responsible-Changed-When: Tue, 19 Oct 2021 19:18:29 +0000
> Responsible-Changed-Why:
> Take as I'm fixing this
>
>
> State-Changed-From-To: open->closed
> State-Changed-By: skrll@NetBSD.org
> State-Changed-When: Tue, 19 Oct 2021 19:18:29 +0000
> State-Changed-Why:
> Close as duplicate of 56264.
>
> Thanks for the analysis Rory - I hope you don't mind that I close this PR
> and track
> the issue via 56264.
>
>
>
>
>Unformatted:
$NetBSD: query-full-pr,v 1.46 2020/01/03 16:35:01 leot Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2020
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.