NetBSD Problem Report #50730
From www@NetBSD.org Sat Jan 30 17:07:52 2016
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.NetBSD.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 98D307ABF0
for <gnats-bugs@gnats.NetBSD.org>; Sat, 30 Jan 2016 17:07:52 +0000 (UTC)
Message-Id: <20160130170751.367D87ACB3@mollari.NetBSD.org>
Date: Sat, 30 Jan 2016 17:07:51 +0000 (UTC)
From: bsiegert@NetBSD.org
Reply-To: bsiegert@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: Go test panics the kernel (kqueue related?)
X-Send-Pr-Version: www-1.0
>Number: 50730
>Category: kern
>Synopsis: Go test panics the kernel (kqueue related?)
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Jan 30 17:10:01 +0000 2016
>Closed-Date: Tue Jun 19 01:10:05 +0000 2018
>Last-Modified: Tue Jun 19 01:10:05 +0000 2018
>Originator: Benny Siegert
>Release: NetBSD 7-amd64
>Organization:
The NetBSD Foundation
>Environment:
>Description:
This is from https://github.com/golang/go/issues/14127.
Running TestRoundTripGzip from the net/http test suite (in Go 1.6rc1) results in a kernel panic on amd64:
/netbsd: uvm_fault(0xfffffe8079bc8e80, 0x0, 1) -> e
netbsd-amd64 /netbsd: fatal page fault in supervisor mode
netbsd-amd64 /netbsd: trap type 6 code 0 rip ffffffff805bf9d5 cs 8 rflags 10246 cr2 18 ilevel 0 rsp fffffe80443f5c88
netbsd-amd64 /netbsd: curlwp 0xfffffe805bc46a80 pid 23983.2 lowest kstack 0xfffffe80443f32c0
netbsd-amd64 /netbsd: panic: trap
netbsd-amd64 /netbsd: cpu0: Begin traceback...
netbsd-amd64 /netbsd: vpanic() at netbsd:vpanic+0x13c
netbsd-amd64 /netbsd: snprintf() at netbsd:snprintf
netbsd-amd64 /netbsd: startlwp() at netbsd:startlwp
netbsd-amd64 /netbsd: alltraps() at netbsd:alltraps+0x96
netbsd-amd64 /netbsd: sys___kevent50() at netbsd:sys___kevent50+0x33
netbsd-amd64 /netbsd: syscall() at netbsd:syscall+0x9a
netbsd-amd64 /netbsd: --- syscall (number 435) ---
netbsd-amd64 /netbsd: 45f643:
netbsd-amd64 /netbsd: cpu0: End traceback...
netbsd-amd64 /netbsd:
netbsd-amd64 /netbsd: dumping to dev 0,1 (offset=0, size=0): not possible
netbsd-amd64 /netbsd: rebooting...
That crash is in kqueue_scan().
This does not happen on my i386 test system.
>How-To-Repeat:
1. Install lang/go14 from pkgsrc.
2. Download and unpack https://github.com/golang/go/archive/go1.6rc1.tar.gz
3. cd go/src; env GOROOT_BOOTSTRAP=/usr/lang/go14 ./make.bash
4. Add the bin dir to PATH.
5. cd net/http; go test -c
6. ./http.test -test.v -test.run=TestRoundTripGzip
Boom.
>Fix:
>Release-Note:
>Audit-Trail:
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/50730 CVS commit: src/sys
Date: Sat, 30 Jan 2016 23:40:02 -0500
Module Name: src
Committed By: christos
Date: Sun Jan 31 04:40:01 UTC 2016
Modified Files:
src/sys/kern: kern_event.c
src/sys/sys: event.h
Log Message:
PR/50730: Benny Siegert: Go kqueue test panics kernel.
- use a marker knote from the stack instead of allocating and freeing on
each scan.
- add more KASSERTS
- introduce a KN_BUSY bit that indicates that the knote is currently being
scanned, so that knote_detach does not end up deleting it when the file
descriptor gets closed and we don't end up using/trashing free memory from
the scan.
To generate a diff of this commit:
cvs rdiff -u -r1.84 -r1.85 src/sys/kern/kern_event.c
cvs rdiff -u -r1.25 -r1.26 src/sys/sys/event.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, port-amd64-maintainer@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Cc:
Subject: Re: port-amd64/50730: Go test panics the kernel (kqueue related?)
Date: Sat, 30 Jan 2016 23:48:30 -0500
On Jan 30, 5:10pm, bsiegert@NetBSD.org (bsiegert@NetBSD.org) wrote:
-- Subject: port-amd64/50730: Go test panics the kernel (kqueue related?)
This is still racy sometimes when the process exits.
crash> ps
1662 5 3 2 1000000 fffffe83f7aff700 http.test lwpwait
1662 > 4 7 7 1100000 fffffe83f853f540 http.test
1662 1 2 7 1100000 fffffe83f823f920 http.test
This thread tries to close the socket:
crash> t/a fffffe83f853f540
trace: pid 1662 lid 4 at 0xfffffe810330bd68
knote_detach() at knote_detach+0x23f
knote_fdclose() at knote_fdclose+0x68
fd_close() at fd_close+0x246
sys_close() at sys_close+0x3a
sy_call() at sy_call+0x40
sy_invoke() at sy_invoke+0xd5
syscall() at syscall+0xfe
--- syscall (number 6) ---
This thread is exiting:
crash> t/a fffffe83f7aff700
trace: pid 1662 lid 5 at 0xfffffe8103d8bc68
sleepq_block() at sleepq_block+0xf6
cv_wait() at cv_wait+0x116
lwp_wait() at lwp_wait+0x34a
exit_lwps() at exit_lwps+0x13b
exit1() at exit1+0x146
exit1() at exit1
sy_call() at sy_call+0x40
sy_invoke() at sy_invoke+0xd5
syscall() at syscall+0xfe
--- syscall (number 1) ---
This thread is trying to lock the socket which we are trying to close:
crash> t/a fffffe83f823f920
trace: pid 1662 lid 1 at 0xfffffe81042919c8
sleepq_block() at sleepq_block+0xf6
turnstile_block() at turnstile_block+0x4e6
mutex_enter() at mutex_enter+0x51d
solock() at solock+0x23
filt_soread() at filt_soread+0x36
kqueue_scan() at kqueue_scan+0x4c2
kevent1() at kevent1+0x315
sys___kevent50() at sys___kevent50+0x5b
sy_call() at sy_call+0x40
sy_invoke() at sy_invoke+0xd5
syscall() at syscall+0xfe
--- syscall (number 435) ---
So we deadlock.
State-Changed-From-To: open->needs-pullups
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Sun, 02 Jul 2017 20:16:41 +0000
State-Changed-Why:
As per discussion elsewhere this change needs to get into -7.
Responsible-Changed-From-To: port-amd64-maintainer->kern-bug-people
Responsible-Changed-By: dholland@NetBSD.org
Responsible-Changed-When: Sun, 02 Jul 2017 20:22:43 +0000
Responsible-Changed-Why:
also, not amd64-specific.
State-Changed-From-To: needs-pullups->closed
State-Changed-By: maya@NetBSD.org
State-Changed-When: Tue, 19 Jun 2018 01:10:05 +0000
State-Changed-Why:
pullup-7 #1442 covers this.
>Unformatted:
Copyright © 1994-2017
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.