NetBSD Problem Report #53053

From abs@forsaken.absd.org  Sun Feb 25 01:36:27 2018
Return-Path: <abs@forsaken.absd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 24E797A18F
	for <gnats-bugs@gnats.NetBSD.org>; Sun, 25 Feb 2018 01:36:27 +0000 (UTC)
Message-Id: <20180225001744.7E89A3974B3@forsaken.absd.org>
Date: Sun, 25 Feb 2018 00:17:44 +0000 (GMT)
From: abs@absd.org
Reply-To: abs@absd.org
To: gnats-bugs@NetBSD.org
Subject: Building lang/go hangs DOM0 xen kernel
X-Send-Pr-Version: 3.95

>Number:         53053
>Notify-List:    gson@gson.org
>Category:       kern
>Synopsis:       non-MULTIPROCESSOR hangs building lang/go
>Confidential:   no
>Severity:       critical
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sun Feb 25 01:40:00 +0000 2018
>Closed-Date:    Mon Jun 11 10:49:45 +0000 2018
>Last-Modified:  Mon Jun 11 10:49:45 +0000 2018
>Originator:     abs@absd.org
>Release:        NetBSD 8.0_BETA
>Organization:

>Environment:
System: NetBSD forsaken.absd.org 8.0_BETA NetBSD 8.0_BETA (GENERIC) #0: Sun Feb 11 02:12:12 GMT 2018 abs@hermes.social-events.net:/var/obj/nbbuild-8-amd64/objdir/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
Try to build pkgsrc/lang/go. Output stops at:

    ===> Building for go-1.9.4
    cd /var/obj/pkg/lang/go/work/go/src && env GOROOT_BOOTSTRAP=/usr/pkg/go14 GOROOT_FINAL=/usr/pkg/go  /usr/pkg/bin/bash ./make.bash
    ##### Building Go bootstrap tool.
    cmd/dist

Reproduced on two different boxes - an i7 based Dell desktop and
an i5 Thinkpad T420s.

Also tested under -current (kernel only) on the same T420s; same issue.

NetBSD descent.absd.org 8.99.12 NetBSD 8.99.12 (XEN3_DOM0) #0: Sat Feb 24 14:06:27 UTC 2018  mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/xen/compile/XEN3_DOM0 amd64

Typically the go14 package has just been installed before this go build, and it ends up corrupted, requiring a pkg_delete go14; pkg_admin rebuild; pkg_add .../go14

>How-To-Repeat:
drop to boot prompt
boot -1

cd pkgsrc/lang/go
cvs update -rpkgsrc-2017Q4
make depends; sync
make install
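
The steps above could be wrapped in a small guard script so the build is
only started on a kernel that is really running on one CPU. This is a
sketch, not part of the original report. It assumes that hw.ncpu reads 1
after "boot -1"; PKGSRC_DIR and RUN_REPRO are hypothetical variables
introduced here for illustration.

```shell
#!/bin/sh
# Sketch only: run the lang/go build from this report, but refuse to start
# unless the kernel reports a single CPU (the configuration that hangs).

# Succeeds when the given CPU count is exactly 1.
is_uniprocessor() {
    [ "$1" = "1" ]
}

# Prints the CPU count the kernel reports, or "unknown" if it cannot be read.
# Assumption: hw.ncpu reads 1 after booting with "boot -1".
cpu_count() {
    sysctl -n hw.ncpu 2>/dev/null || echo unknown
}

# The steps from How-To-Repeat. PKGSRC_DIR is a hypothetical variable that
# names the pkgsrc checkout; the report itself uses a relative path.
run_repro() {
    cd "${PKGSRC_DIR:-pkgsrc}/lang/go" || return 1
    cvs update -rpkgsrc-2017Q4 || return 1
    make depends || return 1
    sync
    make install
}

# RUN_REPRO is a hypothetical switch so that nothing is built by accident.
if [ "${RUN_REPRO:-no}" = "yes" ]; then
    if is_uniprocessor "$(cpu_count)"; then
        run_repro
    else
        echo "kernel does not report exactly one CPU; use 'boot -1' first" >&2
        exit 1
    fi
fi
```

Run with RUN_REPRO=yes after booting with "boot -1". Expect the machine to
hang if the bug is present, so keep the sync before the final step.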

>Fix:


>Release-Note:

>Audit-Trail:
From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
Date: Sun, 25 Feb 2018 15:39:13 +0000

 Hand transcribed backtrace
 NOTE: xen dom0 kernels aren't MULTIPROCESSOR.

 cv_wait_sig
 lwp_wait
 exit_lwps
 exit1
 sys_exit
 syscall

Reproduced on -current amd64 (non-xen), boot with multiprocessor disabled.


From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
Date: Wed, 21 Mar 2018 20:04:46 +0000

 On Sun, Feb 25, 2018 at 03:40:01PM +0000, coypu@sdf.org wrote:
  >  Hand transcribed backtrace
  >  NOTE: xen dom0 kernels aren't MULTIPROCESSOR.
  >  
  >  cv_wait_sig
  >  lwp_wait
  >  exit_lwps
  >  exit1
  >  sys_exit
  >  syscall

 That should not be able to wedge the system... are you sure the system
 is really dead and it's not an application-level deadlock?

 -- 
 David A. Holland
 dholland@netbsd.org

From: coypu@sdf.org
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
Date: Wed, 21 Mar 2018 21:46:27 +0000

 > That should not be able to wedge the system... are you sure the system
 > is really dead and it's not an application-level deadlock?

 Yes, I was playing audio and it stopped playing when it got stuck.
 However I could enter DDB.

From: coypu@sdf.org
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
Date: Mon, 23 Apr 2018 11:25:30 +0000

 This does need Go 1.9.x to reproduce, sorry for confusing whoever asked.
 I can build Go 1.10 fine.

 Steps to reproduce on a regular amd64 setup:

 Drop to boot prompt
 boot -1

 # login etc

 cd /usr/pkgsrc/lang/go;
 cvs update -rpkgsrc-2017Q4 -Pd;
 make install

From: David Brownlee <abs@absd.org>
To: gnats-bugs@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
Date: Mon, 23 Apr 2018 19:56:49 +0100


 This still triggers for me on go 1.10 (or building syncthing with go 1.10)

 Thanks

 On 23 April 2018 at 12:30, <coypu@sdf.org> wrote:

 > The following reply was made to PR kern/53053; it has been noted by GNATS.
 >
 > From: coypu@sdf.org
 > To: gnats-bugs@NetBSD.org
 > Cc:
 > Subject: Re: port-xen/53053: Building lang/go hangs DOM0 xen kernel
 > Date: Mon, 23 Apr 2018 11:25:30 +0000
 >
 >  This does need Go 1.9.x to reproduce, sorry for confusing whoever asked.
 >  I can build Go 1.10 fine.
 >
 >  Steps to reproduce on a regular amd64 setup:
 >
 >  Drop to boot prompt
 >  boot -1
 >
 >  # login etc
 >
 >  cd /usr/pkgsrc/lang/go;
 >  cvs update -rpkgsrc-2017Q4 -Pd;
 >  make install
 >
 >


From: Andreas Gustafsson <gson@gson.org>
To: abs@absd.org, gnats-bugs@NetBSD.org
Cc: Christos Zoulas <christos@zoulas.com>, Martin Husemann <martin@duskware.de>
Subject: Re: kern/53053: non-MULTIPROCESSOR hangs building lang/go
Date: Wed, 9 May 2018 11:33:20 +0300

 Hello abs,

 I believe this kernel hang has been fixed in -current by
 src/sys/kern/kern_lwp.c 1.192.  Can you confirm this?

 I have requested a pullup to -8 (#805), but it has been stalled due
 to claims that it "breaks golang".  If you can help either confirm or
 refute those claims, that would also be helpful.
 -- 
 Andreas Gustafsson, gson@gson.org

State-Changed-From-To: open->feedback
State-Changed-By: gson@NetBSD.org
State-Changed-When: Wed, 09 May 2018 08:36:56 +0000
State-Changed-Why:
feedback requested


From: Andreas Gustafsson <gson@NetBSD.org>
To: abs@absd.org, gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/53053 (non-MULTIPROCESSOR hangs building lang/go)
Date: Wed, 23 May 2018 23:30:34 +0300

 Hello abs,

 Earlier, I wrote:
 > I believe this kernel hang has been fixed in -current by
 > src/sys/kern/kern_lwp.c 1.192.  Can you confirm this?

 This should now also be fixed in -8 as the pullup of
 src/sys/kern/kern_lwp.c 1.192 has been processed.
 A confirmation, refutation, or report of possible side
 effects would still be appreciated.
 -- 
 Andreas Gustafsson, gson@NetBSD.org

From: David Brownlee <abs@absd.org>
To: Andreas Gustafsson <gson@netbsd.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/53053 (non-MULTIPROCESSOR hangs building lang/go)
Date: Sun, 27 May 2018 22:56:14 +0100

 On 23 May 2018 at 21:30, Andreas Gustafsson <gson@netbsd.org> wrote:
 > Earlier, I wrote:
 > > I believe this kernel hang has been fixed in -current by
 > > src/sys/kern/kern_lwp.c 1.192.  Can you confirm this?
 >
 > This should now also be fixed in -8 as the pullup of
 > src/sys/kern/kern_lwp.c 1.192 has been processed.
 > A confirmation, refutation, or report of possible side
 > effects would still be appreciated.

 Apologies - it took a little longer than expected to get into a
 position to test.

 I can confirm that I do not appear to be able to trigger the issue any
 more on xen DOM0 with 8.0_RC1, including when a DOMU is running

State-Changed-From-To: feedback->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Mon, 11 Jun 2018 10:49:45 +0000
State-Changed-Why:
Submitter can no longer reproduce the problem.


>Unformatted:
