NetBSD Problem Report #50730

From www@NetBSD.org  Sat Jan 30 17:07:52 2016
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.NetBSD.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 98D307ABF0
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 30 Jan 2016 17:07:52 +0000 (UTC)
Message-Id: <20160130170751.367D87ACB3@mollari.NetBSD.org>
Date: Sat, 30 Jan 2016 17:07:51 +0000 (UTC)
From: bsiegert@NetBSD.org
Reply-To: bsiegert@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: Go test panics the kernel (kqueue related?)
X-Send-Pr-Version: www-1.0

>Number:         50730
>Category:       kern
>Synopsis:       Go test panics the kernel (kqueue related?)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Jan 30 17:10:01 +0000 2016
>Closed-Date:    Tue Jun 19 01:10:05 +0000 2018
>Last-Modified:  Tue Jun 19 01:10:05 +0000 2018
>Originator:     Benny Siegert
>Release:        NetBSD 7-amd64
>Organization:
The NetBSD Foundation
>Environment:
>Description:
This is from https://github.com/golang/go/issues/14127.

Running TestRoundTripGzip from the net/http test suite (in Go 1.6rc1) results in a kernel panic on amd64:

/netbsd: uvm_fault(0xfffffe8079bc8e80, 0x0, 1) -> e
netbsd-amd64 /netbsd: fatal page fault in supervisor mode
netbsd-amd64 /netbsd: trap type 6 code 0 rip ffffffff805bf9d5 cs 8 rflags 10246 cr2 18 ilevel 0 rsp fffffe80443f5c88
netbsd-amd64 /netbsd: curlwp 0xfffffe805bc46a80 pid 23983.2 lowest kstack 0xfffffe80443f32c0
netbsd-amd64 /netbsd: panic: trap
netbsd-amd64 /netbsd: cpu0: Begin traceback...
netbsd-amd64 /netbsd: vpanic() at netbsd:vpanic+0x13c
netbsd-amd64 /netbsd: snprintf() at netbsd:snprintf
netbsd-amd64 /netbsd: startlwp() at netbsd:startlwp
netbsd-amd64 /netbsd: alltraps() at netbsd:alltraps+0x96
netbsd-amd64 /netbsd: sys___kevent50() at netbsd:sys___kevent50+0x33
netbsd-amd64 /netbsd: syscall() at netbsd:syscall+0x9a
netbsd-amd64 /netbsd: --- syscall (number 435) ---
netbsd-amd64 /netbsd: 45f643:
netbsd-amd64 /netbsd: cpu0: End traceback...
netbsd-amd64 /netbsd: 
netbsd-amd64 /netbsd: dumping to dev 0,1 (offset=0, size=0): not possible
netbsd-amd64 /netbsd: rebooting...

That crash is in kqueue_scan().

This does not happen on my i386 test system.
>How-To-Repeat:
1. Install lang/go14 from pkgsrc.
2. Download and unpack https://github.com/golang/go/archive/go1.6rc1.tar.gz
3. cd go/src; env GOROOT_BOOTSTRAP=/usr/lang/go14 ./make.bash
4. Add the newly built go/bin directory to PATH.
5. cd net/http; go test -c
6. ./http.test -test.v -test.run=TestRoundTripGzip

Boom.
>Fix:

>Release-Note:

>Audit-Trail:
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/50730 CVS commit: src/sys
Date: Sat, 30 Jan 2016 23:40:02 -0500

 Module Name:	src
 Committed By:	christos
 Date:		Sun Jan 31 04:40:01 UTC 2016

 Modified Files:
 	src/sys/kern: kern_event.c
 	src/sys/sys: event.h

 Log Message:
 PR/50730: Benny Siegert: Go kqueue test panics kernel.
 - use a marker knote from the stack instead of allocating and freeing on
   each scan.
 - add more KASSERTS
 - introduce a KN_BUSY bit that indicates that the knote is currently being
   scanned, so that knote_detach does not end up deleting it when the file
   descriptor gets closed and we don't end up using/trashing free memory from
   the scan.


 To generate a diff of this commit:
 cvs rdiff -u -r1.84 -r1.85 src/sys/kern/kern_event.c
 cvs rdiff -u -r1.25 -r1.26 src/sys/sys/event.h

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc: 
Subject: Re: port-amd64/50730: Go test panics the kernel (kqueue related?)
Date: Sat, 30 Jan 2016 23:48:30 -0500

 On Jan 30,  5:10pm, bsiegert@NetBSD.org (bsiegert@NetBSD.org) wrote:
 -- Subject: port-amd64/50730: Go test panics the kernel (kqueue related?)

 This is still racy: it sometimes deadlocks when the process exits.
     crash> ps
     1662     5 3   2   1000000   fffffe83f7aff700          http.test lwpwait
     1662 >   4 7   7   1100000   fffffe83f853f540          http.test
     1662     1 2   7   1100000   fffffe83f823f920          http.test

 This thread tries to close the socket:
     crash> t/a fffffe83f853f540
     trace: pid 1662 lid 4 at 0xfffffe810330bd68
     knote_detach() at knote_detach+0x23f
     knote_fdclose() at knote_fdclose+0x68
     fd_close() at fd_close+0x246
     sys_close() at sys_close+0x3a
     sy_call() at sy_call+0x40
     sy_invoke() at sy_invoke+0xd5
     syscall() at syscall+0xfe
     --- syscall (number 6) ---

 This thread is exiting:
     crash> t/a fffffe83f7aff700
     trace: pid 1662 lid 5 at 0xfffffe8103d8bc68
     sleepq_block() at sleepq_block+0xf6
     cv_wait() at cv_wait+0x116
     lwp_wait() at lwp_wait+0x34a
     exit_lwps() at exit_lwps+0x13b
     exit1() at exit1+0x146
     exit1() at exit1
     sy_call() at sy_call+0x40
     sy_invoke() at sy_invoke+0xd5
     syscall() at syscall+0xfe
     --- syscall (number 1) ---

 This thread is trying to lock the socket which we are trying to close:
     crash> t/a fffffe83f823f920
     trace: pid 1662 lid 1 at 0xfffffe81042919c8
     sleepq_block() at sleepq_block+0xf6
     turnstile_block() at turnstile_block+0x4e6
     mutex_enter() at mutex_enter+0x51d
     solock() at solock+0x23
     filt_soread() at filt_soread+0x36
     kqueue_scan() at kqueue_scan+0x4c2
     kevent1() at kevent1+0x315
     sys___kevent50() at sys___kevent50+0x5b
     sy_call() at sy_call+0x40
     sy_invoke() at sy_invoke+0xd5
     syscall() at syscall+0xfe
     --- syscall (number 435) ---

 So we deadlock.

State-Changed-From-To: open->needs-pullups
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 02 Jul 2017 20:16:41 +0000
State-Changed-Why:
As per discussion elsewhere this change needs to get into -7.


Responsible-Changed-From-To: port-amd64-maintainer->kern-bug-people
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Sun, 02 Jul 2017 20:22:43 +0000
Responsible-Changed-Why:
also, not amd64-specific.


State-Changed-From-To: needs-pullups->closed
State-Changed-By: maya@NetBSD.org
State-Changed-When: Tue, 19 Jun 2018 01:10:05 +0000
State-Changed-Why:
pullup-7 #1442 covers this.


>Unformatted:
