NetBSD Problem Report #56459

From www@netbsd.org  Tue Oct 19 19:03:31 2021
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id C77281A9239
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 19 Oct 2021 19:03:31 +0000 (UTC)
Message-Id: <20211019190329.DE9691A923A@mollari.NetBSD.org>
Date: Tue, 19 Oct 2021 19:03:29 +0000 (UTC)
From: rorybolt@gmail.com
Reply-To: rorybolt@gmail.com
To: gnats-bugs@NetBSD.org
Subject: Raspberry Pi 3B+ boot failure and solution
X-Send-Pr-Version: www-1.0

>Number:         56459
>Category:       port-arm
>Synopsis:       Raspberry Pi 3B+ boot failure and solution
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    skrll
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Oct 19 19:05:00 +0000 2021
>Closed-Date:    Tue Oct 19 19:18:29 +0000 2021
>Last-Modified:  Tue Oct 19 20:25:01 +0000 2021
>Originator:     Rory Bolt
>Release:        9.99.91
>Organization:
Kioxia
>Environment:
NetBSD arm64 9.99.91 NetBSD 9.99.91 (GENERIC64) 
>Description:
The Raspberry Pi 3B+ platform has been broken since 9.99.88 with various boot problems. The latest is a panic with the following backtrace (transcribed from a picture of the panic, so not all the addresses/details are included):

panic: kernel diagnostic assertion "l->l_stat == LSONPROC" failed in kern_sleepq.c

vpanic()
kern_assert()
sleepq_enqueue()
cv_enter()
cv_wait()
xc_wait()
pic_establish_intr()
bcm2836mp_intr_init()
arm_fdt_cpu_hatch()
cpu_hatch()
cpu_mpstart()

By adding debugging info I was able to verify that l->l_stat was LSIDL; in other words, we were trying to sleep on the idle lwp.

The fundamental problem is the same as the earlier ones Rin fixed: when the secondary processors are initializing on the idle lwp, they cannot suspend/sleep. As has been previously mentioned on the port-arm mailing list, there are MANY opportunities for locking in the processor initialization path, and it would be great if this were reworked.

The specific problem here is that the "cold" flag has been cleared before pic_establish_intr() was called, and as a result xc_broadcast() and xc_wait() are being executed instead of just pic_unblock_irqs(). 
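
A minimal user-space sketch of the decision described above; this is not the kernel code. The names model_establish_intr, enum intr_path and its values are hypothetical and exist only for illustration; only the two lwp states named in this report are modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the two lwp states mentioned in the report. */
enum lwp_stat { LSIDL, LSONPROC };

/* Hypothetical labels for the three outcomes discussed above. */
enum intr_path {
	PATH_UNBLOCK_DIRECT,	/* cold: only pic_unblock_irqs(), no sleep */
	PATH_XCALL_OK,		/* warm, LSONPROC: xc_wait() may sleep */
	PATH_XCALL_PANIC	/* warm, not LSONPROC: assertion fires */
};

/*
 * Model of the branch taken when an interrupt is established:
 * while "cold" is set the cross-call is skipped entirely; once it
 * is cleared, xc_wait() -> cv_wait() -> sleepq_enqueue() requires
 * l_stat == LSONPROC, which does not hold on the idle lwp.
 */
enum intr_path
model_establish_intr(bool cold, enum lwp_stat l_stat)
{
	/* The model only knows about the two states listed above. */
	assert(l_stat == LSIDL || l_stat == LSONPROC);

	if (cold)
		return PATH_UNBLOCK_DIRECT;
	return (l_stat == LSONPROC) ? PATH_XCALL_OK : PATH_XCALL_PANIC;
}
```

In this model, a secondary CPU hatching on the idle lwp (LSIDL) with "cold" already cleared lands on PATH_XCALL_PANIC, which corresponds to the backtrace above; with "cold" still set it takes PATH_UNBLOCK_DIRECT.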

>How-To-Repeat:
Attempt to boot any of the daily builds since June 2021 on a Raspberry Pi 3B+.
>Fix:
In this case the fix is easy, although as mentioned in the description I see many other opportunities to enter sleepq_enqueue() in the arm secondary processor initialization path.

The solution to the current problem is to move the "cold = 0" statement in sys/kern/init_main.c from its current location in configure2() at line 808 to AFTER the call to cpu_boot_secondary_processors() at line 827. I inserted it immediately prior to the "mp_ready = true" line.

By doing this I can successfully boot the latest development kernel on my Raspberry Pi 3B+.
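
The effect of the reordering can be sketched with a toy user-space model, assuming that the only thing that matters is whether "cold" is still set while the secondary CPUs hatch. This is not a patch against init_main.c; struct boot_model, model_hatch_secondary and model_boot are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state, not kernel data structures. */
struct boot_model {
	bool cold;		/* mirrors the global "cold" flag */
	bool secondary_slept;	/* a hatching CPU took the sleeping path */
};

/* Stand-in for a secondary CPU establishing interrupts while hatching. */
void
model_hatch_secondary(struct boot_model *b)
{
	assert(b != NULL);
	if (!b->cold)
		b->secondary_slept = true;	/* would reach xc_wait() */
}

/*
 * Runs the modelled boot sequence in one of two orders and returns
 * true if no hatching CPU tried to sleep.
 */
bool
model_boot(bool clear_cold_before_hatch)
{
	struct boot_model b = { .cold = true, .secondary_slept = false };

	if (clear_cold_before_hatch)
		b.cold = false;		/* ordering reported as failing */

	/* Stands in for cpu_boot_secondary_processors(). */
	model_hatch_secondary(&b);

	if (!clear_cold_before_hatch)
		b.cold = false;		/* ordering proposed in this fix */

	assert(!b.cold);		/* either way the system ends up warm */
	return !b.secondary_slept;
}
```

Calling model_boot(true) models the ordering reported as failing, and model_boot(false) models the ordering proposed here, in which the secondaries hatch while "cold" is still set.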

>Release-Note:

>Audit-Trail:

Responsible-Changed-From-To: port-arm-maintainer->skrll
Responsible-Changed-By: skrll@NetBSD.org
Responsible-Changed-When: Tue, 19 Oct 2021 19:18:29 +0000
Responsible-Changed-Why:
Take as I'm fixing this


State-Changed-From-To: open->closed
State-Changed-By: skrll@NetBSD.org
State-Changed-When: Tue, 19 Oct 2021 19:18:29 +0000
State-Changed-Why:
Close as duplicate of 56264.

Thanks for the analysis Rory - I hope you don't mind that I close this PR and track
the issue via 56264.


From: Rory Bolt <rory.bolt@gmail.com>
To: gnats-bugs@netbsd.org
Cc: skrll@netbsd.org, port-arm-maintainer@netbsd.org, netbsd-bugs@netbsd.org, 
	gnats-admin@netbsd.org
Subject: Re: port-arm/56459 (Raspberry Pi 3B+ boot failure and solution)
Date: Tue, 19 Oct 2021 13:21:10 -0700

 BTW...

 I attempted to join the port-arm mailing list to discuss this further, but
 my request to join was denied?

 -Rory


>Unformatted:
