NetBSD Problem Report #51219
From tls@panix.com Sun Jun 5 22:46:21 2016
Return-Path: <tls@panix.com>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id E27F87ABE0
for <gnats-bugs@gnats.NetBSD.org>; Sun, 5 Jun 2016 22:46:21 +0000 (UTC)
Message-Id: <20160605224618.DAD12242AA@panix5.panix.com>
Date: Sun, 5 Jun 2016 18:46:18 -0400 (EDT)
From: tls@NetBSD.ORG
Reply-To: tls@NetBSD.ORG
To: gnats-bugs@NetBSD.org
Subject: TLB miss panic in in_cksum
X-Send-Pr-Version: 3.95
>Number: 51219
>Category: port-evbmips
>Synopsis: TLB miss panic in in_cksum triggered by piping gzip through ssh.
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: port-evbmips-maintainer
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Jun 05 22:50:01 +0000 2016
>Closed-Date: Thu May 25 06:14:15 +0000 2017
>Last-Modified: Thu May 25 06:14:15 +0000 2017
>Originator: tls@NetBSD.ORG
>Release: NetBSD 7.99.30
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD 7.99.30 NetBSD 7.99.30 (ERLITE.201606051010Z) evbmips
Architecture: mips64eb
Machine: evbmips
>Description:
# dd if=/dev/rsd0c bs=64k | gzip -1 | ssh root@192.168.100.1 dd bs=128k of=/diskless-mips64eb/erlite-factory.dd.gz
The authenticity of host '192.168.100.1 (192.168.100.1)' can't be established.
ECDSA key fingerprint is SHA256:u54QRuCPUUI+WZ6P1cQIz/bQawAjq6IVK4FAugd/nXo.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.100.1' (ECDSA) to the list of known hosts.
root@192.168.100.1's password:
pid 0(system): trap: cpu0, TLB miss (load or instr. fetch) in kernel mode
status=0x80fc83, cause=0x208, epc=0xffffffff8045cb98, vaddr=0xc00000000e5e4844
tf=0x9800000410003800 ksp=0x9800000410003940 ra=0xffffffff8025cf38 ppl=0
kernel: TLB miss (load or instr. fetch) trap
Stopped in pid 0.3 (system) at netbsd:cpu_in_cksum+0x128: lwu a6,-60(v0)
db> trace
0x9800000410003940: cpu_in_cksum+128 (568,2c7e3a000,2785ba000,c8e4a832) ra ffffffff8025cf38 sz 48
0x9800000410003970: in_delayed_cksum+40 (568,2c7e3a000,2785ba000,c8e4a832) ra ffffffff8025e760 sz 48
0x98000004100039a0: ip_output+f68 (980000000fee6400,2c7e3a000,2785ba000,c8e4a832) ra ffffffff802697d4 sz 192
0x9800000410003a60: tcp_output+14e4 (980000000fee6400,2c7e3a000,2785ba000,c8e4a832) ra ffffffff80265b90 sz 320
0x9800000410003ba0: tcp_input+dc0 (980000000fee6400,14,6,20) ra ffffffff8025b8a0 sz 512
0x9800000410003da0: ipintr+9f0 (980000000fee6400,14,6,20) ra ffffffff803a0fc4 sz 160
0x9800000410003e40: softint_dispatch+114 (980000000fee6400,14,6,20) ra ffffffff80200244 sz 96
0x9800000410003ea0: softint_fast_dispatch+7c (0,14,6,20) ra 0 sz 32
User-level: pid 0.3
db>
Note the traceback through ipintr->tcp_input->tcp_output: is this an ack?
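The register dump above can be decoded by hand. The sketch below does so for the values copied from this report, assuming the standard MIPS64 Cause register layout (ExcCode in bits 6..2, IP bits in 15..8) and the usual 64-bit segment map. The helper names are illustrative only and do not come from the NetBSD tree.

```python
# Values copied verbatim from the panic message in this report.
CAUSE = 0x208
VADDR = 0xC00000000E5E4844   # faulting address
EPC = 0xFFFFFFFF8045CB98     # faulting instruction
TF = 0x9800000410003800      # trapframe (kernel stack)

# Only the exception codes relevant here are named.
EXC_CODES = {0: "Int", 1: "Mod", 2: "TLBL", 3: "TLBS"}


def exc_code(cause):
    """Extract Cause.ExcCode (bits 6..2)."""
    return (cause >> 2) & 0x1F


def pending_interrupts(cause):
    """Return the indices of the IP bits (Cause bits 15..8) that are set."""
    return [i for i in range(8) if (cause >> (8 + i)) & 1]


def segment(addr):
    """Classify a 64-bit MIPS virtual address by its top bits.

    Simplified: does not check the implementation-specific upper
    bound of each segment.
    """
    if addr >= 0xFFFFFFFF80000000:
        # 32-bit compatibility segments at the top of the address space.
        return ["ckseg0", "ckseg1", "cksseg", "ckseg3"][(addr >> 29) & 0x3]
    return ["xuseg", "xsseg", "xkphys", "xkseg"][addr >> 62]


if __name__ == "__main__":
    print("ExcCode:", EXC_CODES[exc_code(CAUSE)])
    print("IP bits:", pending_interrupts(CAUSE))
    for name, addr in (("vaddr", VADDR), ("epc", EPC), ("tf", TF)):
        print(name, hex(addr), segment(addr))
```

Under those assumptions the exception code is TLBL (TLB miss on load or fetch), matching the panic text. The faulting address falls in xkseg, the TLB-mapped kernel segment, while the code (ckseg0) and the stack (xkphys) are both in unmapped segments. So the load in cpu_in_cksum is touching mapped kernel virtual memory that has no valid TLB entry.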
>How-To-Repeat:
Boot the ERLITE kernel. Try to back up the factory image using
dd if=/dev/rsd0c bs=64k | gzip -1 | ssh somehost dd of=img.dd.gz
Wham.
>Fix:
Building the kernel with options SOSEND_NO_LOAN makes the problem
disappear. A pmap issue? Whatever it is, it's been there for at
least a year -- a kernel from last May has the problem.
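As background on why the fault would surface in cpu_in_cksum in particular: a software Internet checksum has to load every 16-bit word of the outgoing data, so it is presumably the first kernel code to dereference the loaned pages that SOSEND_NO_LOAN avoids. The sketch below is a plain RFC 1071 checksum for illustration. It is not the MIPS cpu_in_cksum implementation and it ignores mbuf chains.

```python
def in_cksum(data: bytes) -> int:
    """RFC 1071 Internet checksum over a flat byte string (illustrative)."""
    if len(data) % 2:
        # Pad odd-length input with a trailing zero byte.
        data += b"\x00"
    total = 0
    # Every 16-bit big-endian word is read; nothing in the buffer is skipped.
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold carries back into the low 16 bits (one's-complement addition).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


if __name__ == "__main__":
    # Example words from RFC 1071, section 3.
    sample = bytes.fromhex("0001f203f4f5f6f7")
    print(hex(in_cksum(sample)))
```

Because the loop reads the whole buffer, any page in it that lacks a valid mapping will fault on a load, which is consistent with the lwu instruction shown in the trap.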
>Release-Note:
>Audit-Trail:
From: Thor Lancelot Simon <tls@panix.com>
To: gnats-bugs@NetBSD.org
Cc: tls@NetBSD.ORG
Subject: Re: port-evbmips/51219: TLB miss panic in in_cksum
Date: Sun, 5 Jun 2016 19:24:59 -0400
It can be triggered even more simply:
cat /dev/zero | ssh somehost dd of=/dev/null
The trace through ipintr->tcp_input->tcp_output suggests this is triggered
by sending an ack, I think.
pid 0(system): trap: cpu0, TLB miss (load or instr. fetch) in kernel mode
status=0x80fc83, cause=0x208, epc=0xffffffff8045cb98, vaddr=0xc00000000e5c86a4
tf=0x9800000410003800 ksp=0x9800000410003940 ra=0xffffffff8025cf38 ppl=0
kernel: TLB miss (load or instr. fetch) trap
Stopped in pid 0.3 (system) at netbsd:cpu_in_cksum+0x128: lwu a6,-60(v0)
db> trace
0x9800000410003940: cpu_in_cksum+128 (568,30a6616a2,2bad416a2,ee330d06) ra ffffffff8025cf38 sz 48
0x9800000410003970: in_delayed_cksum+40 (568,30a6616a2,2bad416a2,ee330d06) ra ffffffff8025e760 sz 48
0x98000004100039a0: ip_output+f68 (980000000fef3a00,30a6616a2,2bad416a2,ee330d06) ra ffffffff802697d4 sz 192
0x9800000410003a60: tcp_output+14e4 (980000000fef3a00,30a6616a2,2bad416a2,ee330d06) ra ffffffff80265b90 sz 320
0x9800000410003ba0: tcp_input+dc0 (980000000fef3a00,14,6,20) ra ffffffff8025b8a0 sz 512
0x9800000410003da0: ipintr+9f0 (980000000fef3a00,14,6,20) ra ffffffff803a0fc4 sz 160
0x9800000410003e40: softint_dispatch+114 (980000000fef3a00,14,6,20) ra ffffffff80200244 sz 96
0x9800000410003ea0: softint_fast_dispatch+7c (0,14,6,20) ra 0 sz 32
User-level: pid 0.3
State-Changed-From-To: open->feedback
State-Changed-By: maya@NetBSD.org
State-Changed-When: Mon, 31 Oct 2016 02:50:41 +0000
State-Changed-Why:
Seems like this was fixed with the rest of the problems and I can't reproduce it now on .41
State-Changed-From-To: feedback->closed
State-Changed-By: dholland@NetBSD.org
State-Changed-When: Thu, 25 May 2017 06:14:15 +0000
State-Changed-Why:
feedback timeout
>Unformatted:
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.