NetBSD Problem Report #49427

From hauke@Espresso.Rhein-Neckar.DE  Fri Nov 28 21:17:39 2014
Return-Path: <hauke@Espresso.Rhein-Neckar.DE>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 2FC1EA650D
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 28 Nov 2014 21:17:39 +0000 (UTC)
Message-Id: <201411282106.sASL6EmH001960@pizza.causeuse.org>
Date: Fri, 28 Nov 2014 22:06:14 +0100 (CET)
From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Reply-To: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
To: gnats-bugs@NetBSD.org
Cc: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
Subject: netinet/in4_cksum.c message flood
X-Send-Pr-Version: 3.95

>Number:         49427
>Category:       kern
>Synopsis:       netinet/in4_cksum.c message flood
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Nov 28 21:20:00 +0000 2014
>Closed-Date:    Mon Feb 26 09:41:46 +0000 2018
>Last-Modified:  Mon Feb 26 09:41:46 +0000 2018
>Originator:     Hauke Fath
>Release:        NetBSD 7.0_BETA
>Organization:
Falling Raindrops
>Environment:


System: NetBSD pizza.causeuse.org 7.0_BETA NetBSD 7.0_BETA (BLACKBOX-$Revision: 1.85 $) #0: Thu Nov 27 17:56:03 CET 2014 hauke@pizza.causeuse.org:/var/obj/netbsd-builds/7/amd64/sys/arch/amd64/compile/BLACKBOX amd64
Architecture: x86_64
Machine: amd64
>Description:

	After the upgrade to netbsd-7, a router machine floods the
	console with an endless stream of in4_cksum: offset 0 too
	short for IP header 20

	and syslogd(8) is hogging a cpu core at 100%.

	The machine shows this with both pf and npf active. If it
	matters, /etc/sysctl.conf has

net.inet.ip.do_loopback_cksum=1
net.inet.tcp.do_loopback_cksum=1
net.inet.udp.do_loopback_cksum=1
net.inet6.tcp6.do_loopback_cksum=1
net.inet6.udp6.do_loopback_cksum=1


>How-To-Repeat:

	Install netbsd-7 on a filtering router.

>Fix:
	Yes, please.

>Release-Note:

>Audit-Trail:
From: Joerg Sonnenberger <joerg@britannica.bec.de>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/49427: netinet/in4_cksum.c message flood
Date: Fri, 28 Nov 2014 22:30:46 +0100

 On Fri, Nov 28, 2014 at 09:20:00PM +0000, Hauke Fath wrote:
 > 	After the upgrade to netbsd-7, a router machine floods the
 > 	console with an endless stream of in4_cksum: offset 0 too
 > 	short for IP header 20

 Please run a DIAGNOSTIC kernel and provide the backtrace of the panic.

 Joerg

From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org
Subject: Re: kern/49427: netinet/in4_cksum.c message flood
Date: Fri, 28 Nov 2014 23:42:41 +0100

 On Fri, 28 Nov 2014 21:35:00 +0000 (UTC), Joerg Sonnenberger wrote:
 >  On Fri, Nov 28, 2014 at 09:20:00PM +0000, Hauke Fath wrote:
 >  > 	After the upgrade to netbsd-7, a router machine floods the
 >  > 	console with an endless stream of in4_cksum: offset 0 too
 >  > 	short for IP header 20
 >  
 >  Please run a DIAGNOSTIC kernel and provide the backtrace of the panic.

 Will do.

 The panic is only taken when DIAGNOSTIC is defined? I was wondering... 
 interesting.

 hauke

From: Hauke Fath <hauke@Espresso.Rhein-Neckar.DE>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@NetBSD.org, gnats-admin@NetBSD.org
Subject: Re: kern/49427: netinet/in4_cksum.c message flood
Date: Sun, 30 Nov 2014 12:56:12 +0100

 On Fri, 28 Nov 2014 21:35:00 +0000 (UTC), Joerg Sonnenberger wrote:
 >  On Fri, Nov 28, 2014 at 09:20:00PM +0000, Hauke Fath wrote:
 >  > 	After the upgrade to netbsd-7, a router machine floods the
 >  > 	console with an endless stream of in4_cksum: offset 0 too
 >  > 	short for IP header 20
 >  
 >  Please run a DIAGNOSTIC kernel and provide the backtrace of the panic.

 Ah, now if only I had a serial console... sorry for the blurred shots, 
 after the morning coffee my hand wasn't steady enough for 1/8 sec.

 The machine stops with
 <https://www2.spg.tu-darmstadt.de/~hf/netbsd/pr49427/IMG_6391.jpg>, 
 usually after starting rarpd(8) or timed(8)

 The stack trace
 <https://www2.spg.tu-darmstadt.de/~hf/netbsd/pr49427/IMG_6392.jpg>

 Here, 'ps' gives me some fast-scrolling output, then a blanked screen 
 and a hard reset.

 savecore has problems with the DIAGNOSTIC kernel core written
 <https://www2.spg.tu-darmstadt.de/~hf/netbsd/pr49427/IMG_6389.jpg>

 HTH,
 hauke

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@NetBSD.org
Cc: christos@netbsd.org
Subject: Re: kern/49427: netinet/in4_cksum.c message flood
Date: Sun, 30 Nov 2014 23:32:01 +0700

 Maybe I'm missing something, but I cannot see how the code in question
 can possibly work ...

 looutput() does ...

                 if (csum_flags != 0 && IN_LOOPBACK_NEED_CHECKSUM(csum_flags)) {
                         ip_undefer_csum(m, 0, csum_flags);
                 }

 ip_undefer_csum(m, hdrlen, csum_flags) does ...

         if (csum_flags & M_CSUM_IPv4) {
                 csum = in4_cksum(m, 0, hdrlen, iphdrlen);

 Note, hdrlen passed down from looutput() is 0.

 in4_cksum(m, nxt, off, len) does ...

 Note that nxt & off are both 0 (nxt the const 0, and off because hdrlen == 0)

         if (__predict_false(off < sizeof(struct ip)))
                 PANIC("%s: offset %d too short for IP header %zu", __func__,
                     off, sizeof(struct ip));

 (where the PANIC() is just printf() & return if !DIAGNOSTIC).

 Since off is 0 (was hdrlen in ip_undefer_csum()), off < sizeof(almost anything)
 and the PANIC() is guaranteed.
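
 The call chain described above can be modelled in user space. The
 following is only a sketch, not kernel code: every model_* name and
 MODEL_IP_HDR_LEN are hypothetical stand-ins, and the mbuf, the checksum
 flags and the PANIC() macro are left out. It assumes sizeof(struct ip)
 is 20, as the console message reports.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-in for sizeof(struct ip): 20 bytes, per the console message. */
#define MODEL_IP_HDR_LEN ((size_t)20)

/*
 * Hypothetical model of the offset validation quoted from in4_cksum().
 * Returns 1 when the check would fire (panic or printf), 0 otherwise.
 */
static int
model_offset_check_fires(int off)
{
	assert(off >= 0);	/* this model only covers non-negative offsets */
	return (size_t)off < MODEL_IP_HDR_LEN;
}

/* Hypothetical model of ip_undefer_csum(): hdrlen is forwarded as "off". */
static int
model_ip_undefer_csum(int hdrlen)
{
	return model_offset_check_fires(hdrlen);
}

/* Hypothetical model of the looutput() call site, which passes hdrlen 0. */
static int
model_looutput(void)
{
	return model_ip_undefer_csum(0);
}
```

 In this model, model_looutput() always reports that the check fires,
 which matches the observation that every checksummed loopback packet
 produces the message.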

 Setting net.inet.ip.do_loopback_cksum to 0 instead of 1, so that
 IN_LOOPBACK_NEED_CHECKSUM() becomes false, would avoid the problem, but
 someone who understands what is supposed to be happening here needs to
 look at this code carefully.

 To me it looks as if in4_cksum() cannot really be used to calculate IP
 header checksums - it always wants to include a pseudo-header checksum,
 suitable for UDP & TCP (and ICMPv6) but not for IP itself.

 That is, unless the

         if (nxt == 0) 
                 return cpu_in_cksum(m, len, off, 0);

 case is supposed to handle that, in which case, perhaps the problem is
 just that the validation tests immediately above shouldn't be done in
 this case.   I notice that switching the order of those tests is the
 most recent change to in4_cksum(), which could explain why this is
 newly seen in NetBSD 7 (though the change is about 18 months old and was
 in the 6.99.x series for a long time; I guess almost no one bothers
 turning on loopback checksum calculations).
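
 The ordering question raised above can be illustrated with a user-space
 sketch in which the nxt == 0 case is handled before the offset
 validation. This is an assumption-laden illustration, not the change
 that was committed: model_cksum() stands in for cpu_in_cksum() on a flat
 buffer instead of an mbuf chain, model_in4_cksum() and its return codes
 are hypothetical, and the pseudo-header path is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for sizeof(struct ip). */
#define MODEL_IP_HDR_LEN ((size_t)20)

/*
 * One's-complement Internet checksum (RFC 1071) over len bytes of buf
 * starting at off. Hypothetical flat-buffer stand-in for cpu_in_cksum().
 */
static uint16_t
model_cksum(const uint8_t *buf, size_t len, size_t off)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[off + i] << 8) | buf[off + i + 1];
	if (len & 1)				/* pad a trailing odd byte */
		sum += (uint32_t)buf[off + len - 1] << 8;
	while (sum >> 16)			/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Hypothetical model of in4_cksum() with the nxt == 0 case tested first.
 * Returns 0 and stores the checksum in *out, -1 if validation rejects
 * the arguments, -2 for the pseudo-header path this sketch omits.
 */
static int
model_in4_cksum(const uint8_t *buf, size_t buflen, int nxt, size_t off,
    size_t len, uint16_t *out)
{
	if (nxt == 0) {
		/* Plain checksum, no pseudo-header: off == 0 is legitimate. */
		if (off > buflen || len > buflen - off)
			return -1;
		*out = model_cksum(buf, len, off);
		return 0;
	}
	if (off < MODEL_IP_HDR_LEN)
		return -1;		/* the check that fires in this PR */
	return -2;			/* pseudo-header path not modelled */
}
```

 With this ordering, a caller that passes nxt == 0 and off == 0 gets a
 plain header checksum, while callers asking for a pseudo-header
 checksum are still rejected when off is shorter than the IP header.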

 kre

State-Changed-From-To: open->closed
State-Changed-By: maxv@NetBSD.org
State-Changed-When: Mon, 26 Feb 2018 09:41:46 +0000
State-Changed-Why:
Fixed and pulled up in 2014.


>Unformatted:
