NetBSD Problem Report #53053
From abs@forsaken.absd.org Sun Feb 25 01:36:27 2018
Return-Path: <abs@forsaken.absd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 24E797A18F
for <gnats-bugs@gnats.NetBSD.org>; Sun, 25 Feb 2018 01:36:27 +0000 (UTC)
Message-Id: <20180225001744.7E89A3974B3@forsaken.absd.org>
Date: Sun, 25 Feb 2018 00:17:44 +0000 (GMT)
From: abs@absd.org
Reply-To: abs@absd.org
To: gnats-bugs@NetBSD.org
Subject: Building lang/go hangs DOM0 xen kernel
X-Send-Pr-Version: 3.95
>Number: 53053
>Notify-List: gson@gson.org
>Category: kern
>Synopsis: non-MULTIPROCESSOR hangs building lang/go
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Feb 25 01:40:00 +0000 2018
>Closed-Date: Mon Jun 11 10:49:45 +0000 2018
>Last-Modified: Mon Jun 11 10:49:45 +0000 2018
>Originator: abs@absd.org
>Release: NetBSD 8.0_BETA
>Organization:
>Environment:
System: NetBSD forsaken.absd.org 8.0_BETA NetBSD 8.0_BETA (GENERIC) #0: Sun Feb 11 02:12:12 GMT 2018 abs@hermes.social-events.net:/var/obj/nbbuild-8-amd64/objdir/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
Try to build pkgsrc/lang/go. Output stops at:
===> Building for go-1.9.4
cd /var/obj/pkg/lang/go/work/go/src && env GOROOT_BOOTSTRAP=/usr/pkg/go14 GOROOT_FINAL=/usr/pkg/go /usr/pkg/bin/bash ./make.bash
##### Building Go bootstrap tool.
cmd/dist
Reproduced on two different boxes - an i7 based Dell desktop and
an i5 Thinkpad T420s.
Also tested under -current (kernel only) on the same T420s, with the same result:
NetBSD descent.absd.org 8.99.12 NetBSD 8.99.12 (XEN3_DOM0) #0: Sat Feb 24 14:06:27 UTC 2018 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/xen/compile/XEN3_DOM0 amd64
Typically the go14 package has only just been installed before this go build, and it ends up corrupted, requiring: pkg_delete go14; pkg_admin rebuild; pkg_add .../go14
>How-To-Repeat:
Drop to boot prompt
boot -1
cd pkgsrc/lang/go
cvs update -rpkgsrc-2017Q4
make depends; sync
make install
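Because the synopsis ties the hang to non-MULTIPROCESSOR operation, it may be worth confirming the CPU count before starting the build. The following is only a sketch and was not part of the original report: the function name cpu_mode is hypothetical, and it assumes that `sysctl -n hw.ncpu` reports 1 both after `boot -1` and on a non-MULTIPROCESSOR kernel.

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): classify a CPU count.
# Prints "uni" for exactly one CPU, "multi" for more than one,
# and "unknown" for an empty, zero, or non-numeric value.
cpu_mode() {
    case "$1" in
        1) echo uni ;;
        0|''|*[!0-9]*) echo unknown ;;
        *) echo multi ;;
    esac
}

# Assumption: hw.ncpu is readable through sysctl(8); otherwise leave empty.
ncpu=$(sysctl -n hw.ncpu 2>/dev/null || echo "")

case "$(cpu_mode "$ncpu")" in
    uni)   echo "one CPU in use: the lang/go build may reproduce the hang" ;;
    multi) echo "hw.ncpu=$ncpu: reboot with 'boot -1' before trying to reproduce" ;;
    *)     echo "could not determine the CPU count" ;;
esac
```

Running this before `make install` only guards against testing on a multiprocessor boot by mistake; it does not itself trigger the problem.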
>Fix:
>Release-Note:
>Audit-Trail:
From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
Date: Sun, 25 Feb 2018 15:39:13 +0000
Hand-transcribed backtrace:
NOTE: xen dom0 kernels aren't MULTIPROCESSOR.
cv_wait_sig
lwp_wait
exit_lwps
exit1
sys_exit
syscall
Reproduced on -current amd64 (non-Xen), booted with multiprocessor disabled.
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
Date: Wed, 21 Mar 2018 20:04:46 +0000
On Sun, Feb 25, 2018 at 03:40:01PM +0000, coypu@sdf.org wrote:
> Hand transcribed backtrace
> NOTE: xen dom0 kernels aren't MULTIPROCESSOR.
>
> cv_wait_sig
> lwp_wait
> exit_lwps
> exit1
> sys_exit
> syscall
That should not be able to wedge the system... are you sure the system
is really dead and it's not an application-level deadlock?
--
David A. Holland
dholland@netbsd.org
From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
Date: Wed, 21 Mar 2018 21:46:27 +0000
> That should not be able to wedge the system... are you sure the system
> is really dead and it's not an application-level deadlock?
Yes, I was playing audio and it stopped playing when the system got stuck.
However, I could still enter DDB.
From: coypu@sdf.org
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
Date: Mon, 23 Apr 2018 11:25:30 +0000
This does need Go 1.9.x to reproduce, sorry for confusing whoever asked.
I can build Go 1.10 fine.
Steps to reproduce on a regular amd64 setup:
Drop to boot prompt
boot -1
# login etc
cd /usr/pkgsrc/lang/go;
cvs update -rpkgsrc-2017Q4 -Pd;
make install
From: David Brownlee <abs@absd.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
Date: Mon, 23 Apr 2018 19:56:49 +0100
This still triggers for me on go 1.10 (or building syncthing with go 1.10)
Thanks
On 23 April 2018 at 12:30, <coypu@sdf.org> wrote:
> The following reply was made to PR kern/53053; it has been noted by GNATS.
>
> From: coypu@sdf.org
> To: gnats-bugs@NetBSD.org
> Cc:
> Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
> Date: Mon, 23 Apr 2018 11:25:30 +0000
>
> This does need Go 1.9.x to reproduce, sorry for confusing whoever asked.
> I can build Go 1.10 fine.
>
> Steps to reproduce on a regular amd64 setup:
>
> Drop to boot prompt
> boot -1
>
> # login etc
>
> cd /usr/pkgsrc/lang/go;
> cvs update -rpkgsrc-2017Q4 -Pd;
> make install
>
>
From: Andreas Gustafsson <gson@gson.org>
To: abs@absd.org, gnats-bugs@NetBSD.org
Cc: Christos Zoulas <christos@zoulas.com>, Martin Husemann <martin@duskware.de>
Subject: Re: kern/53053: non-MULTIPROCESSOR hangs building lang/go
Date: Wed, 9 May 2018 11:33:20 +0300
Hello abs,
I believe this kernel hang has been fixed in -current by
src/sys/kern/kern_lwp.c 1.192. Can you confirm this?
I have requested a pullup to -8 (#805), but it has been stalled due
to claims that it "breaks golang". If you can help either confirm or
refute those claims, that would also be helpful.
--
Andreas Gustafsson, gson@gson.org
State-Changed-From-To: open->feedback
State-Changed-By: gson@NetBSD.org
State-Changed-When: Wed, 09 May 2018 08:36:56 +0000
State-Changed-Why:
feedback requested
From: Andreas Gustafsson <gson@NetBSD.org>
To: abs@absd.org, gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53053 (non-MULTIPROCESSOR hangs building lang/go)
Date: Wed, 23 May 2018 23:30:34 +0300
Hello abs,
Earlier, I wrote:
> I believe this kernel hang has been fixed in -current by
> src/sys/kern/kern_lwp.c 1.192. Can you confirm this?
This should now also be fixed in -8 as the pullup of
src/sys/kern/kern_lwp.c 1.192 has been processed.
A confirmation, refutation, or report of possible side
effects would still be appreciated.
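
For anyone testing a -current kernel, one way to check whether a given kern_lwp.c revision is at or after 1.192 is sketched below. This is an illustration only: the helper names is_trunk_rev and rev_at_least are hypothetical, and the sketch handles trunk revisions of the form "1.N" only. Revisions on a release branch such as -8 have more components and are deliberately rejected rather than compared.

```shell
#!/bin/sh
# Hypothetical helpers (not from the PR) for comparing CVS trunk revisions.

# Succeeds only for a two-component revision such as "1.192".
is_trunk_rev() {
    case "$1" in
        *[!0-9.]*|*.*.*|.*|*.|'') return 1 ;;
        *.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds when revision $1 is at or after revision $2.
# Returns 2 when either argument is not a trunk revision.
rev_at_least() {
    is_trunk_rev "$1" && is_trunk_rev "$2" || return 2
    have_major=${1%%.*} have_minor=${1#*.}
    want_major=${2%%.*} want_minor=${2#*.}
    if [ "$have_major" -ne "$want_major" ]; then
        [ "$have_major" -gt "$want_major" ]
    else
        [ "$have_minor" -ge "$want_minor" ]
    fi
}

# Example with a revision typed in by hand.
rev=1.192
if rev_at_least "$rev" 1.192; then
    echo "kern_lwp.c $rev is at or after 1.192"
else
    echo "kern_lwp.c $rev predates 1.192 or is not a trunk revision"
fi
```

The revision string could be read from the checked-out source file; extracting it from a running kernel (for example with ident(1) on the kernel image) is assumed to be possible but has not been verified here.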
--
Andreas Gustafsson, gson@NetBSD.org
From: David Brownlee <abs@absd.org>
To: Andreas Gustafsson <gson@netbsd.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/53053 (non-MULTIPROCESSOR hangs building lang/go)
Date: Sun, 27 May 2018 22:56:14 +0100
On 23 May 2018 at 21:30, Andreas Gustafsson <gson@netbsd.org> wrote:
> Earlier, I wrote:
> > I believe this kernel hang has been fixed in -current by
> > src/sys/kern/kern_lwp.c 1.192. Can you confirm this?
>
> This should now also be fixed in -8 as the pullup of
> src/sys/kern/kern_lwp.c 1.192 has been processed.
> A confirmation, refutation, or report of possible side
> effects would still be appreciated.
Apologies - it took a little longer than expected to get into a
position to test.
I can confirm that I do not appear to be able to trigger the issue any
more on Xen DOM0 with 8.0_RC1, including while a DOMU is running.
State-Changed-From-To: feedback->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Mon, 11 Jun 2018 10:49:45 +0000
State-Changed-Why:
Submitter can no longer reproduce the problem.
>Unformatted:
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.