NetBSD Problem Report #44508

From mouse@Sparkle.Rodents-Montreal.ORG  Thu Feb  3 16:33:23 2011
Return-Path: <mouse@Sparkle.Rodents-Montreal.ORG>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id 6BC5A63B873
	for <gnats-bugs@gnats.NetBSD.org>; Thu,  3 Feb 2011 16:33:23 +0000 (UTC)
Message-Id: <201102031633.LAA06448@Sparkle.Rodents-Montreal.ORG>
Date: Thu, 3 Feb 2011 11:33:19 -0500 (EST)
From: der Mouse <mouse@Rodents-Montreal.ORG>
Reply-To: mouse@Rodents-Montreal.ORG
To: gnats-bugs@gnats.NetBSD.org
Subject: [dM] ICMP_UNREACH_NEEDFRAG uses wrong mtu
X-Send-Pr-Version: 3.95

>Number:         44508
>Category:       kern
>Synopsis:       [dM] ICMP_UNREACH_NEEDFRAG uses wrong mtu
>Confidential:   no
>Severity:       serious
>Priority:       low
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Thu Feb 03 16:35:00 +0000 2011
>Last-Modified:  Fri Nov 18 15:20:02 +0000 2011
>Originator:     Mouse
>Release:        NetBSD 4.0.1
>Organization:
	Dis-
>Environment:
System: NetBSD VAIO-Frank.Rodents-Montreal.ORG 4.0.1 NetBSD 4.0.1 (VAIO-MP) #0: Wed Feb 2 21:56:02 EST 2011 mouse@VAIO-Frank.Rodents-Montreal.ORG:/home/mouse/kbuild/VAIO-MP i386
Architecture: i386
Machine: i386
>Description:
	When generating an ICMP_UNREACH_NEEDFRAG message, ip_input uses
	the target interface's MTU:

			destmtu = ipforward_rt.ro_rt->rt_ifp->if_mtu;

	However, if the route's MTU is less than the interface's, this
	is the wrong MTU to use; the resulting ICMP message gives an
	MTU that will not actually work.

	This is a regression as compared to 1.4T.  I discovered this
	bug the hard way when moving my house gateway (which, for
	reasons not relevant here, has such a route) from 1.4T to
	4.0.1, and found things breaking.
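
	As a minimal sketch of the MTU the ICMP should carry, assuming
	a route MTU of 0 means "none set" and that a route MTU only
	matters when it is below the interface's: the struct and
	function names here are hypothetical stand-ins, not the
	kernel's own definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's interface and route records. */
struct sketch_ifnet {
	unsigned if_mtu;                    /* MTU of the outgoing interface */
};

struct sketch_route {
	unsigned rt_mtu;                    /* per-route MTU; 0 = none set (assumption) */
	const struct sketch_ifnet *rt_ifp;  /* interface the route points at */
};

/*
 * MTU that a need-frag ICMP should advertise for this route:
 * the route's MTU when it is set and smaller than the interface's,
 * otherwise the interface's MTU.
 */
static unsigned
effective_mtu(const struct sketch_route *rt)
{
	unsigned mtu;

	assert(rt != NULL && rt->rt_ifp != NULL);
	mtu = rt->rt_ifp->if_mtu;
	if (rt->rt_mtu != 0 && rt->rt_mtu < mtu)
		mtu = rt->rt_mtu;
	return mtu;
}
```

	With an interface MTU of 1500 and a route MTU of 1400, this
	yields 1400, whereas reading if_mtu alone yields 1500.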
>How-To-Repeat:
	Configure a 4.0.1 machine as a router, with a route whose MTU
	is less than that of the target interface.  Try to send a
	packet with DF set through it; notice the MTU in the ICMP is
	wrong.

	My test setup when working on the fix for this involved two
	machines, A, with ex0 and vr0, and B, with re0 and cue0.  A ex0
	and B re0 are on the same switch; A vr0 and B cue0 are
	connected with a crossover cable.  (It's possible B's cue0 is
	unnecessary; I used it as a convenient way to get carrier on
	A's vr0.  Another switch would have worked as well, but cue0
	was handier - and, with a little ARP hackery, not shown below,
	cue0 can also be used to snoop the packets A wants to send on
	vr0.)

	Machine A:
	# ifconfig ex0 10.0.0.3/24
	# ifconfig vr0 10.1.0.1/24
	# route add -host 10.2.0.1 10.1.0.2
	# route add -host 10.2.0.2 10.1.0.2 -mtu 1400
	# sysctl -w net.inet.ip.forwarding=1
	# tcpdump -n -s 2000 -p -i ex0 icmp

	Machine B:
	# ifconfig re0 10.0.0.1/24
	# ifconfig cue0 up
	# route add -net 10.2.0.0 10.0.0.3 -netmask 255.255.255.0
	# ping -D -s 1472 -n -c 1 10.2.0.1
	# ping -D -s 1472 -n -c 1 10.2.0.2

	Note that the tcpdump on A sees the echo request and nothing
	more for the first ping, but sees a need-frag ICMP for the
	second.  With the bug, the MTU in the ICMP is 1500; when A is
	running a kernel with the fix below, it's 1400.
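
	The payload size of 1472 is chosen so the packet is exactly
	1500 bytes: 1472 bytes of data plus an 8-byte ICMP echo header
	plus a 20-byte IPv4 header (no options).  That fits a 1500-byte
	interface MTU but not the 1400-byte route MTU.  The arithmetic
	can be sketched as follows; the helper names are illustrative
	only.

```c
#include <assert.h>
#include <stddef.h>

enum {
	IPV4_HDR_LEN = 20,      /* IPv4 header without options */
	ICMP_ECHO_HDR_LEN = 8   /* ICMP echo request header */
};

/* Total IPv4 packet length produced by "ping -s payload". */
static size_t
echo_packet_len(size_t payload)
{
	return IPV4_HDR_LEN + ICMP_ECHO_HDR_LEN + payload;
}

/* Nonzero if a packet of pktlen bytes does not fit in the given MTU. */
static int
exceeds_mtu(size_t pktlen, size_t mtu)
{
	assert(mtu > 0);
	return pktlen > mtu;
}
```

	For the two pings above, echo_packet_len(1472) is 1500, which
	does not exceed an MTU of 1500 (first ping, forwarded) but does
	exceed 1400 (second ping, need-frag).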
>Fix:
	Rather than duplicate the MTU logic from ip_output in ip_input,
	we can just use IP_RETURNMTU to have ip_output tell us what the
	necessary MTU is.  (Not that the logic is complicated...now.
	But having ip_output tell us what it decided it needs is more
	reliable than trusting two semantically equivalent pieces of
	code to stay in sync if/when their common task becomes more
	complicated.)
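
	The shape of that approach, reduced to a standalone sketch:
	the output routine reports the MTU it decided on through an
	out-parameter, and the forwarding side uses that value for the
	ICMP, falling back to the interface MTU only if nothing was
	reported.  sketch_output and sketch_forward are hypothetical
	and are not the real ip_output or ip_forward.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical output routine.  If the packet is too big for path_mtu
 * and DF is set, it fails with EMSGSIZE and reports the MTU it used
 * through mtu_out (when the caller supplied one).
 */
static int
sketch_output(size_t pktlen, int df, unsigned path_mtu, unsigned *mtu_out)
{
	assert(path_mtu > 0);
	if (pktlen <= path_mtu)
		return 0;               /* fits; sent as-is */
	if (df) {
		if (mtu_out != NULL)
			*mtu_out = path_mtu;
		return EMSGSIZE;        /* cannot fragment */
	}
	return 0;                       /* would be fragmented and sent */
}

/*
 * Hypothetical forwarding side.  Returns the MTU to put in a need-frag
 * ICMP, or 0 if no such ICMP is called for.
 */
static unsigned
sketch_forward(size_t pktlen, int df, unsigned path_mtu, unsigned if_mtu)
{
	unsigned rmtu = 0;              /* stays 0 unless the output side reports */
	int error;

	error = sketch_output(pktlen, df, path_mtu, &rmtu);
	if (error != EMSGSIZE)
		return 0;
	return rmtu != 0 ? rmtu : if_mtu;
}
```

	The forwarding side never recomputes the MTU itself, so there
	is only one copy of that decision to keep correct.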

	--- OLD/sys/netinet/ip_input.c	2008-02-14 21:03:51.000000000 -0500
	+++ NEW/sys/netinet/ip_input.c	2011-02-02 21:55:25.000000000 -0500
	@@ -1843,6 +1843,7 @@
	 	int error, type = 0, code = 0, destmtu = 0;
	 	struct mbuf *mcopy;
	 	n_long dest;
	+	int rmtu;

	 	/*
	 	 * We are now in the output path.
	@@ -1934,9 +1935,10 @@
	 		}
	 	}

	+	rmtu = 0;
	 	error = ip_output(m, (struct mbuf *)0, &ipforward_rt,
	-	    (IP_FORWARDING | (ip_directedbcast ? IP_ALLOWBROADCAST : 0)),
	-	    (struct ip_moptions *)NULL, (struct socket *)NULL);
	+	    (IP_FORWARDING | IP_RETURNMTU | (ip_directedbcast ? IP_ALLOWBROADCAST : 0)),
	+	    (struct ip_moptions *)NULL, (struct socket *)NULL, &rmtu);

	 	if (error)
	 		ipstat.ips_cantforward++;
	@@ -1997,7 +1999,7 @@
	 			    &ipsecerror);
	 #endif

	-			destmtu = ipforward_rt.ro_rt->rt_ifp->if_mtu;
	+			destmtu = rmtu ? : ipforward_rt.ro_rt->rt_ifp->if_mtu;
	 #if defined(IPSEC) || defined(FAST_IPSEC)
	 			if (sp != NULL) {
	 				/* count IPsec header size */

	"It works for me."

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mouse@rodents-montreal.org
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

>Audit-Trail:
From: Mouse <mouse@Rodents-Montreal.ORG>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/44508: [dM] ICMP_UNREACH_NEEDFRAG uses wrong mtu
Date: Fri, 18 Nov 2011 10:19:46 -0500 (EST)

 5.1 suffers from this as well.  Here's a diff that "works for me" on
 5.1, relative to ip_input.c,v 1.275.4.1, the one shipped with 5.1.

 /~\ The ASCII				  Mouse
 \ / Ribbon Campaign
  X  Against HTML		mouse@rodents-montreal.org
 / \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B

 diff --git a/sys/netinet/ip_input.c b/sys/netinet/ip_input.c
 index 8cd094d..f1920cb 100644
 --- a/sys/netinet/ip_input.c
 +++ b/sys/netinet/ip_input.c
 @@ -1847,6 +1847,7 @@ ip_forward(struct mbuf *m, int srcrt)
  		struct sockaddr		dst;
  		struct sockaddr_in	dst4;
  	} u;
 +	int rmtu;

  	/*
  	 * We are now in the output path.
 @@ -1926,8 +1927,8 @@ ip_forward(struct mbuf *m, int srcrt)
  	}

  	error = ip_output(m, NULL, &ipforward_rt,
 -	    (IP_FORWARDING | (ip_directedbcast ? IP_ALLOWBROADCAST : 0)),
 -	    (struct ip_moptions *)NULL, (struct socket *)NULL);
 +	    (IP_FORWARDING | IP_RETURNMTU | (ip_directedbcast ? IP_ALLOWBROADCAST : 0)),
 +	    (struct ip_moptions *)NULL, (struct socket *)NULL, &rmtu);

  	if (error)
  		IP_STATINC(IP_STAT_CANTFORWARD);
 @@ -1974,6 +1975,8 @@ ip_forward(struct mbuf *m, int srcrt)
  		if ((rt = rtcache_validate(&ipforward_rt)) != NULL)
  			destmtu = rt->rt_ifp->if_mtu;

 +		if (rmtu && (rmtu < destmtu)) destmtu = rmtu;
 +
  #if defined(IPSEC) || defined(FAST_IPSEC)
  		{
  			/*
