NetBSD Problem Report #51767
From martin@duskware.de Tue Jan 3 15:09:58 2017
Return-Path: <martin@duskware.de>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id A0F367A223
for <gnats-bugs@gnats.NetBSD.org>; Tue, 3 Jan 2017 15:09:58 +0000 (UTC)
Message-Id: <20170103150949.899795CC761@emmas.aprisoft.de>
Date: Tue, 3 Jan 2017 16:09:49 +0100 (CET)
From: martin@NetBSD.org
Reply-To: martin@NetBSD.org
To: gnats-bugs@NetBSD.org
Subject: reproducable kernel stack overflow(?!)
X-Send-Pr-Version: 3.95
>Number: 51767
>Category: kern
>Synopsis: reproducable kernel stack overflow(?!)
>Confidential: no
>Severity: critical
>Priority: high
>Responsible: christos
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Tue Jan 03 15:10:00 +0000 2017
>Closed-Date: Wed Jan 04 15:10:40 +0000 2017
>Last-Modified: Wed Jan 04 15:10:40 +0000 2017
>Originator: Martin Husemann
>Release: NetBSD 7.99.54
>Organization:
The NetBSD Foundation, Inc.
>Environment:
System: NetBSD night-owl.duskware.de 7.99.53 NetBSD 7.99.53 (NIGHT-OWL) #450: Wed Dec 28 12:18:50 CET 2016 martin@night-owl.duskware.de:/usr/src/sys/arch/amd64/compile/NIGHT-OWL amd64
Architecture: x86_64
Machine: amd64
... but the crash happens with a newer .54 kernel!
>Description:
ssh'ing to a machine that still has the recently fixed SACK bug
(not sure whether this is relevant) and running a cvs update there crashes
my machine almost instantly.
stack overflow detected; terminated
...
vpanic()
snprintf()
ssp_init()
tcp_output()+0x231e
tcp_input()+0x10b2
ipintr()
and the source lines are:
0xffffffff804f11d8 is in tcp_output (../../../../netinet/tcp_output.c:592).
587 #endif
588 uint64_t *tcps;
589
590 #ifdef DIAGNOSTIC
591 if (tp->t_inpcb && tp->t_in6pcb)
592 panic("tcp_output: both t_inpcb and t_in6pcb are set");
593 #endif
594 so = NULL;
595 ro = NULL;
596 if (tp->t_inpcb) {
0xffffffff804ecabc is in tcp_input (../../../../netinet/tcp_input.c:3027).
3022 * Return any desired output.
3023 */
3024 if (needoutput || (tp->t_flags & TF_ACKNOW)) {
3025 KERNEL_LOCK(1, NULL);
3026 (void) tcp_output(tp);
3027 KERNEL_UNLOCK_ONE(NULL);
3028 }
3029 if (tcp_saveti)
3030 m_freem(tcp_saveti);
3031
>How-To-Repeat:
See above.
>Fix:
n/a
>Release-Note:
>Audit-Trail:
From: Jaromír Doleček <jaromir.dolecek@gmail.com>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: kern/51767: reproducable kernel stack overflow(?!)
Date: Tue, 3 Jan 2017 16:13:39 +0100
Christos just fixed some off-by-one error in the general area, can you
check if that fixed this?
Jaromir
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/51767: reproducable kernel stack overflow(?!)
Date: Tue, 3 Jan 2017 16:18:13 +0100
On Tue, Jan 03, 2017 at 03:15:00PM +0000, Jaromír Doleček wrote:
> Christos just fixed some off-by-one error in the general area, can you
> check if that fixed this?
I have tcp_output.c rev 1.190 already.
Martin
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/51767: reproducable kernel stack overflow(?!)
Date: Tue, 3 Jan 2017 16:24:58 +0100
On Tue, Jan 03, 2017 at 03:20:01PM +0000, Martin Husemann wrote:
> I have tcp_output.c rev 1.190 already.
And it also happens with 1.191.
Martin
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, martin@NetBSD.org
Cc:
Subject: Re: kern/51767: reproducable kernel stack overflow(?!)
Date: Tue, 3 Jan 2017 10:40:25 -0500
On Jan 3, 3:20pm, martin@duskware.de (Martin Husemann) wrote:
-- Subject: Re: kern/51767: reproducable kernel stack overflow(?!)
| I have tcp_output.c rev 1.190 already.
And that was not a problem... I reverted it.
christos
From: Thomas Klausner <wiz@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/51767: reproducable kernel stack overflow(?!)
Date: Tue, 3 Jan 2017 18:34:34 +0100
On Tue, Jan 03, 2017 at 03:10:01PM +0000, martin@NetBSD.org wrote:
> >Synopsis: reproducable kernel stack overflow(?!)
I see the same in a kernel from this morning on amd64.
Not immediate crashes, but quick ones.
Thomas
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/51767: reproducable kernel stack overflow(?!)
Date: Wed, 4 Jan 2017 09:01:57 +0100
Just to make sure we are not barking up the wrong tree: I ran the
following in src/sys/netinet:
cvs up -r1.31 tcp.h
cvs up -r1.351 tcp_input.c
cvs up -r1.187 tcp_output.c
cvs up -r1.268 tcp_subr.c
and this avoids the (still reproducible for me) crash.
Martin
Responsible-Changed-From-To: kern-bug-people->christos
Responsible-Changed-By: wiz@NetBSD.org
Responsible-Changed-When: Wed, 04 Jan 2017 13:59:10 +0000
Responsible-Changed-Why:
martin identified a commit by christos.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51767 CVS commit: src/sys/netinet
Date: Wed, 4 Jan 2017 15:09:37 +0000
Module Name: src
Committed By: martin
Date: Wed Jan 4 15:09:37 UTC 2017
Modified Files:
src/sys/netinet: tcp_output.c
Log Message:
Fix optlen calculation for the SACK block - 2 bytes too few were
calculated, causing corruption in PR kern/51767.
To generate a diff of this commit:
cvs rdiff -u -r1.193 -r1.194 src/sys/netinet/tcp_output.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
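For readers unfamiliar with the option layout, here is a minimal sketch of
the arithmetic the log message refers to. It is not the NetBSD code. The
sizes follow RFC 2018: a SACK option is a kind byte, a length byte, and
8 bytes per block, and it is commonly preceded by two NOPs for alignment.
All names (sack_optlen, sack_optlen_short, sack_write,
guard_bytes_clobbered) are hypothetical. The commit message does not say
which two bytes the real code missed, so the "short" variant simply omits
the padding as an illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Option kinds and sizes per RFC 793 / RFC 2018; identifiers are illustrative. */
enum {
	OPT_NOP = 1,		/* no-operation, used as padding */
	OPT_SACK = 5,		/* selective acknowledgment */
	SACK_HDR_LEN = 2,	/* kind byte + length byte */
	SACK_BLOCK_LEN = 8,	/* left edge + right edge, 4 bytes each */
	SACK_PAD_LEN = 2,	/* two NOPs in front of the option */
	MAX_SACK_BLOCKS = 4	/* most that fit in 40 bytes of options */
};

/* Bytes needed for a NOP-padded SACK option carrying nblocks blocks. */
static size_t
sack_optlen(size_t nblocks)
{
	assert(nblocks >= 1 && nblocks <= MAX_SACK_BLOCKS);
	return SACK_PAD_LEN + SACK_HDR_LEN + nblocks * SACK_BLOCK_LEN;
}

/* Assumed buggy variant: two bytes short because the padding is forgotten. */
static size_t
sack_optlen_short(size_t nblocks)
{
	return SACK_HDR_LEN + nblocks * SACK_BLOCK_LEN;
}

/* Write the padded option in network byte order; returns bytes written. */
static size_t
sack_write(uint8_t *dst, const uint32_t *edges, size_t nblocks)
{
	uint8_t *p = dst;

	*p++ = OPT_NOP;
	*p++ = OPT_NOP;
	*p++ = OPT_SACK;
	*p++ = (uint8_t)(SACK_HDR_LEN + nblocks * SACK_BLOCK_LEN);
	for (size_t i = 0; i < 2 * nblocks; i++) {
		*p++ = (uint8_t)(edges[i] >> 24);
		*p++ = (uint8_t)(edges[i] >> 16);
		*p++ = (uint8_t)(edges[i] >> 8);
		*p++ = (uint8_t)edges[i];
	}
	return (size_t)(p - dst);
}

/*
 * Count how many bytes beyond the first `reserved` bytes get overwritten
 * when the option is written. The scratch buffer is larger than any
 * possible option, so this demonstration itself stays in bounds.
 */
static size_t
guard_bytes_clobbered(size_t reserved, size_t nblocks)
{
	uint8_t buf[64];
	uint32_t edges[2 * MAX_SACK_BLOCKS] = { 0 };
	size_t clobbered = 0;

	memset(buf, 0xAA, sizeof(buf));		/* guard pattern */
	(void)sack_write(buf, edges, nblocks);
	for (size_t i = reserved; i < sizeof(buf); i++)
		if (buf[i] != 0xAA)
			clobbered++;
	return clobbered;
}
```

With three blocks, sack_optlen(3) is 28 and sack_optlen_short(3) is 26.
Reserving only the shorter length lets the writer run two bytes past the
reserved space, which is the kind of overrun a stack protector would flag.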
State-Changed-From-To: open->closed
State-Changed-By: martin@NetBSD.org
State-Changed-When: Wed, 04 Jan 2017 15:10:40 +0000
State-Changed-Why:
Fixed
>Unformatted:
wiz@NetBSD.org