NetBSD Problem Report #52074

From kardel@gateway.kardel.name  Tue Mar 14 22:48:59 2017
Return-Path: <kardel@gateway.kardel.name>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 8E4487A16A
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 14 Mar 2017 22:48:59 +0000 (UTC)
Message-Id: <20170314224847.47355343EB2@gateway.kardel.name>
Date: Tue, 14 Mar 2017 22:48:47 +0000 (UTC)
From: kardel@netbsd.org
Reply-To: kardel@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: -current npf map directive broken
X-Send-Pr-Version: 3.95

>Number:         52074
>Category:       kern
>Synopsis:       -current npf map directive broken
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Mar 14 22:50:00 +0000 2017
>Closed-Date:    Sat Jan 19 21:47:16 +0000 2019
>Last-Modified:  Sat Jan 19 21:47:16 +0000 2019
>Originator:     kardel@netbsd.org
>Release:        NetBSD 7.99.65
>Organization:

>Environment:


System: NetBSD Gateway 7.99.65 NetBSD 7.99.65 (GATEWAY) #9: Tue Mar 14 09:27:38 CET 2017 kardel@xxx:/fs/raid2a/src/NetBSD/cur/src/obj.i386/sys/arch/i386/compile/GATEWAY i386
Architecture: i386
Machine: i386
>Description:
	/etc/npf.conf
	map tun0 dynamic 127.0.0.1 port smtp <- 10.x.y.z port smtp

	This used to currectly intercept and reroute to localhost:smtp in 7.99.16
	Now replied reply packets bear the source address 127.0.0.1 instead of 10.x.y.z.
	These packets travel happily through the tunnel to be dropped at the other side...

	This is a regression against 7.99.16 or some later time,
	It is also unlikely that the "man npf.conf map <-" example will still work.

	Additionally a panic was triggered during and /etc/rc.d/npf reload.
uvm_fault(0xc60dd018, 0, 2) -> 0xe
fatal page fault in supervisor mode
trap type 6 code 2 eip db9cda2c cs 8 eflags 10246 cr2 0 ilevel 0 esp c563c200
curlwp 0xc60242a0 pid 13351 lid 1 lowest kstack 0xdcf5c2c0
panic: trap
cpu1: Begin traceback...
vpanic(c0cb26d0,dcf5ebe0,dcf5ec5c,c0117302,c0cb26d0,dcf5ec68,dcf5ec68,1,dcf5c2c0,10246) at netbsd:vpanic+0x121
snprintf(c0cb26d0,dcf5ec68,dcf5ec68,1,dcf5c2c0,10246,0,0,c563c200,0) at netbsd:snprintf
trap_tss() at netbsd:trap_tss
--- trap via task gate ---
0:
cpu1: End traceback...

dumping to dev 0,9 offset 8
dump uvm_fault(0xc0faa880, 0xdb80f000, 2) -> 0xd
fatal page fault in supervisor mode
trap type 6 code 3 eip c011160c cs 8 eflags 10246 cr2 db80fdc0 ilevel 8 esp c0f68040
curlwp 0xc60242a0 pid 13351 lid 1 lowest kstack 0xdcf5c2c0
Skipping crash dump on recursive panic
panic: trap
cpu1: Begin traceback...
vpanic(c0cb26d0,dcf5ea70,dcf5eaec,c0117302,c0cb26d0,dcf5eaf8,dcf5eaf8,1,dcf5c2c0,10246) at netbsd:vpanic+0x121
snprintf(c0cb26d0,dcf5eaf8,dcf5eaf8,1,dcf5c2c0,10246,db80fdc0,8,c0f68040,0) at netbsd:snprintf
trap_tss() at netbsd:trap_tss
--- trap via task gate ---
1:
cpu1: End traceback...
rebooting...

>How-To-Repeat:
	Take a recent -current (20170312) and try to redirect a connection.
	See that the new destination address is used as source fr the reply
	packets instead of the orginal destination address,

>Fix:
	n/a

>Release-Note:

>Audit-Trail:
From: Frank Kardel <kardel@netbsd.org>
To: gnats-bugs@NetBSD.org
Cc: 
Subject: Re: kern/52074: -current npf map directive broken
Date: Sun, 07 May 2017 11:51:12 +0200

 Debugging this further the analysis is:

 Given a map:
 map <some if> dynamic 127.0.0.1 port 25 <- 10.200.1.2 port 25 
 (10.200.1.2 is not a local interface address)

 Following happens:
 1) packet arrives on interface (a.b.c.d:x -> 10.200.1.2:25
 2) NAT assoc gets creates
 3) packet gets NATed (=> a.b.c.d:x -> 127.0.0.1:25)
 4) packet gets entered in syn_cache
 5) syn_cache attempts to send SYN,ACK response (127.0.0.1:25 -> 
 a.b.c.d:x) via ip_output()
 6) ip_output calls pfil_hooks (->NPF)
 7) NPF find nat assoc
 8) SYN,ACK packet gets reverse NATed (10.200.1.2:25 -> a.b.c.d:x)
 9) ip_output tries for find the interface adress (ia) for 10.200.1.2 and 
 fails
 10) ip_output fumbles badly at ip_output:1.276:620 and declares the 
 packet coming from an invalid address(ip_ifaddrvalid(NULL))
 11) confusion commences from here on

 In ip_ouput ia is only used for interface transfer statistics and a 
 sanity check after pfil_run_hooks().
 ia will/must be NULL when non local source addresses are in the packet.

 The condition in 620 currently is (see if()):
      ia = in_get_ia_psref(ip->ip_src, &psref_ia);

      /* Ensure we only send from a valid address. */==>
      if ((ia != NULL || (flags & IP_FORWARDING) == 0) && <<<<!
          (error = ip_ifaddrvalid(ia)) != 0)
      {
          ARPLOG(LOG_ERR,
              "refusing to send from invalid address %s (pid %d)\n",
              ARPLOGADDR(ip->ip_src), curproc->p_pid);
          IP_STATINC(IP_STAT_ODROPPED);
          if (error == 1)
              /*
               * Address exists, but is tentative or detached.
               * We can't send from it because it's invalid,
               * so we drop the packet.
               */
              error = 0;
          else
              error = EADDRNOTAVAIL;
          goto bad;
      }

 Proposed fix is to replace the || inf the if with && as this also seems 
 to have been the original intention by the author.


State-Changed-From-To: open->analyzed
State-Changed-By: kardel@NetBSD.org
State-Changed-When: Sun, 07 May 2017 09:58:18 +0000
State-Changed-Why:
analysis was added


From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52074 CVS commit: src/sys/netinet
Date: Sun, 7 May 2017 12:41:23 -0400

 Module Name:	src
 Committed By:	christos
 Date:		Sun May  7 16:41:22 UTC 2017

 Modified Files:
 	src/sys/netinet: ip_output.c

 Log Message:
 PR/52074: Frank Kardel: current npf map directive broken
 Don't filter packets that can't be resolved to source interfaces because
 they could have been generated by a packet filter.


 To generate a diff of this commit:
 cvs rdiff -u -r1.276 -r1.277 src/sys/netinet/ip_output.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Martin Husemann <martin@duskware.de>
To: Joerg Sonnenberger <joerg@bec.de>
Cc: gnats-bugs@NetBSD.org
Subject: Re: kern/52074: -current npf map directive broken
Date: Thu, 11 May 2017 14:51:51 +0200

 On Thu, May 11, 2017 at 02:19:49PM +0200, Joerg Sonnenberger wrote:
 > On Thu, May 11, 2017 at 10:47:28AM +0100, Roy Marples wrote:
 > > I agree with Robert, we shouldn't be sending packets on the wire from an
 > > address we don't own.
 > 
 > That depends. The transparent proxy case is tricky in this regard...

 I wonder if we should just tag the mbufs as "authorized" somehow if
 they come via a packet filter and have been NATed.

 Martin

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/52074: -current npf map directive broken
Date: Tue, 16 May 2017 06:10:00 +0000

 Not sent to gnats.

    ------

 From: Mindaugas Rasiukevicius <rmind@netbsd.org>
 To: Frank Kardel <kardel@netbsd.org>, roy@netbsd.org
 Cc: netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, rmind@netbsd.org
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Sun, 7 May 2017 17:28:04 +0100

 >  The condition in 620 currently is (see if()):
 >       ia = in_get_ia_psref(ip->ip_src, &psref_ia);
 >  
 >       /* Ensure we only send from a valid address. */==>
 >       if ((ia != NULL || (flags & IP_FORWARDING) == 0) && <<<<!
 >           (error = ip_ifaddrvalid(ia)) != 0)
 >       {
 >           ARPLOG(LOG_ERR,
 >               "refusing to send from invalid address %s (pid %d)\n",
 >               ARPLOGADDR(ip->ip_src), curproc->p_pid);
 >           IP_STATINC(IP_STAT_ODROPPED);
 >           if (error == 1)
 >               /*
 >                * Address exists, but is tentative or detached.
 >                * We can't send from it because it's invalid,
 >                * so we drop the packet.
 >                */
 >               error = 0;
 >           else
 >               error = EADDRNOTAVAIL;
 >           goto bad;
 >       }
 >  
 >  Proposed fix is to replace the || inf the if with && as this also seems 
 >  to have been the original intention by the author.

 Good catch.

 Looks like the regression was introduced 7 months ago, as part of the
 ip_output.c rev 1.261.  Roy, would you like to have a look into this?

 -- 
 Mindaugas

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/52074: -current npf map directive broken
Date: Tue, 16 May 2017 06:10:53 +0000

 Not sent to gnats.

    ------

 From: Roy Marples <roy@marples.name>
 To: Mindaugas Rasiukevicius <rmind@netbsd.org>, Frank Kardel
 	<kardel@netbsd.org>
 Cc: netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Sun, 7 May 2017 20:50:07 +0100

 On 07/05/2017 17:28, Mindaugas Rasiukevicius wrote:
 > >  The condition in 620 currently is (see if()):
 > >       ia = in_get_ia_psref(ip->ip_src, &psref_ia);
 > > 
 > >       /* Ensure we only send from a valid address. */==>
 > >       if ((ia != NULL || (flags & IP_FORWARDING) == 0) && <<<<!
 > >           (error = ip_ifaddrvalid(ia)) != 0)
 > >       {
 > >           ARPLOG(LOG_ERR,
 > >               "refusing to send from invalid address %s (pid %d)\n",
 > >               ARPLOGADDR(ip->ip_src), curproc->p_pid);
 > >           IP_STATINC(IP_STAT_ODROPPED);
 > >           if (error == 1)
 > >               /*
 > >                * Address exists, but is tentative or detached.
 > >                * We can't send from it because it's invalid,
 > >                * so we drop the packet.
 > >                */
 > >               error = 0;
 > >           else
 > >               error = EADDRNOTAVAIL;
 > >           goto bad;
 > >       }
 > > 
 > >  Proposed fix is to replace the || inf the if with && as this also seems
 > >  to have been the original intention by the author.
 > 
 > Good catch.
 > 
 > Looks like the regression was introduced 7 months ago, as part of the
 > ip_output.c rev 1.261.  Roy, would you like to have a look into this?

 I  think xtos already comitted a fix, but I'm unsure his fix is correct.
 I think the workflow should be this:

 if (ia != NULL)
 	error = ip_ifaddrvalid(ia);
 else
 	error = flags & IP_FORWARDING ? 0 : -1;
 if (error != 0) { ...

 The idea is that if we claim to send from an address it has to be valid, but
 allow the NULL address if forwarded from the filter.

 Does this make sense?
 The same path probably needs adjustment in inet6.

 Roy

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/52074: -current npf map directive broken
Date: Tue, 16 May 2017 06:12:16 +0000

 Not sent to gnats.

    ------

 From: Frank Kardel <kardel@netbsd.org>
 To: Roy Marples <roy@marples.name>, Mindaugas Rasiukevicius <rmind@netbsd.org>
 CC: netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, Christos Zoulas
 	<christos@NetBSD.org>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Sun, 07 May 2017 23:07:42 +0200

 Hi Roy !

 Hmm, wouldn't this bring us back the bug again? ia == NULL for a non-local
 source addresses (generated via pfil_run_hooks-NAT operation) and
 IP_FORWARDING is not set as tcp_input.c:syn_cache_respond does rightfully not
 set IP_FORWARDING and pfil_run_hooks has no means to set that flag. That gives
 us error == -1 with your sequence.
 So we would return EADDRNOTAVAIL breaking packet filter NAT action again, if I
 didn't overlook something.

 From what I understand  this code originally attempted to avoid sending from
 invalid/unusable local address (e. g. duplicate IP - error, tentative and
 detached should just be dropped).
 No validation can be done for non-local addresses at all. IP_FORWARDING
 formerly used to be used to suppress infinite recursion on mcast forwarding,
 but it seems the semantics where extended a little bit in the mean time (like
 here to suppress a check).
 So I cannot say something about the intentions for the IP_FORWARDING check.

 For correct packet filter operation we need to distinguish the local and
 non-local address case here. If we keep the previous logic for IP_FORWARDING
 and ip_ifaddrvalid() then
 the new expression
     if (ia != NULL && (flags & IP_FORWARDING) == 0 &&
         (error = ip_ifaddrvalid(ia)) != 0)
 correctly limits the check to local interface addresses only keeping the
 original intention/implementation for IP_FORWARDING and validity checks.

 If then IP_FORWARDING and validity check needs to be adjusted that part must
 be limited to local interface addresses only. (ia != NULL).

 That's how I understand that code section or the intention there in the normal
 and packet filter context.

 I didn't check IPv6.

 Frank

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/52074: -current npf map directive broken
Date: Tue, 16 May 2017 06:17:51 +0000

 Huge pile more not sent to gnats.

    ------

 From: Christos Zoulas <christos@zoulas.com>
 To: Roy Marples <roy@marples.name>, Mindaugas Rasiukevicius <rmind@netbsd.org>,
 	Frank Kardel <kardel@netbsd.org>
 Cc: netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Sun, 7 May 2017 17:17:28 -0400

 On May 7,  8:50pm, roy@marples.name (Roy Marples) wrote:
 -- Subject: Re: kern/52074: -current npf map directive broken

 | I  think xtos already comitted a fix, but I'm unsure his fix is correct.
 | I think the workflow should be this:
 | 
 | if (ia != NULL)
 | 	error = ip_ifaddrvalid(ia);
 | else
 | 	error = flags & IP_FORWARDING ? 0 : -1;
 | if (error != 0) { ...
 | 
 | The idea is that if we claim to send from an address it has to be valid, 
 | but allow the NULL address if forwarded from the filter.
 | 
 | Does this make sense?
 | The same path probably needs adjustment in inet6.

 Sure, go for it. Why not put all the logic in ip_ifaddrvalid then?

 christos


 From: Robert Elz <kre@munnari.OZ.AU>
 To: Christos Zoulas <christos@zoulas.com>
 cc: Roy Marples <roy@marples.name>, Mindaugas Rasiukevicius <rmind@netbsd.org>,
 	Frank Kardel <kardel@netbsd.org>, netbsd-bugs@netbsd.org,
 	gnats-admin@netbsd.org
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Mon, 08 May 2017 06:12:04 +0700

     Date:        Sun, 7 May 2017 17:17:28 -0400
     From:        christos@zoulas.com (Christos Zoulas)
     Message-ID:  <20170507211728.8AF0A17FDA8@rebar.astron.com>

   | On May 7,  8:50pm, roy@marples.name (Roy Marples) wrote:

   | | The idea is that if we claim to send from an address it has to be valid, 
   | | but allow the NULL address if forwarded from the filter.
   | | 
   | | Does this make sense?
   | | The same path probably needs adjustment in inet6.
   | 
   | Sure, go for it. Why not put all the logic in ip_ifaddrvalid then?

 While you are fixing this, please also fix ...

                          * Address exists, but is tentative or detached.
                          * We can't send from it because it's invalid,

 Tentative addresses are not invalid, and we *have* to be able to send
 from them, without that DAD does't work correctly.

 Consider a (perhaps unlikely, but possible) case where 2 systems attempt
 to allocate the same address at approximately exactly the same time
 (broken dhcp server, or common reboot after power fail and they have both
 been configured to use the same addr by a broken sys-admin).

 In that case they both make the addr tentative, each then starts DAD by
 sending a "does anyone own..." query.   Each receives the other's DAD
 packet.   The protocol then requires that, as we have that address, even
 as a tentative one, we must reply (from the tentative address) and
 claim the addr - both systems should do that, both receive the other's
 defence of the addr, DAD fails (on both) and they both abandon the address.

 But if we are not able to send from the addr, because it is considered
 "invalid" then neither defends its addr, neither receives the other's
 defence, and both then conclude that the addr is safe, and both (eventually)
 reset the tentative flag, and chaos ensues.

 What we should be doing with tentative addreses is allowing them to be
 used (for any purpose), but just not announcing them to the rest of the
 system (except to tools like ifconfig and netstat) until DAD has
 completed.   That means no other process will discover that the addr
 exists, and so will not start to use it (nor will TCP accidentally see
 the addr as existing, and start to use it) and so the incentive that
 led to prohibiting sending from tentative addrs, or considering them
 to be invalid, is gone.

 It also permits applications that really need to start running quickly, and
 know how to deal with addresses that appear and disappear, to start using
 the tentative addr before it is made permanent, if they like (and if they
 can find some mechanism - perhaps using whatever hook ifconfig/netstat use -
 to discover that the addr exists.)

 kre


 From: Roy Marples <roy@marples.name>
 To: Robert Elz <kre@munnari.OZ.AU>, Christos Zoulas <christos@zoulas.com>
 Cc: Mindaugas Rasiukevicius <rmind@netbsd.org>, Frank Kardel
 	<kardel@netbsd.org>, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Mon, 8 May 2017 01:17:18 +0100

 On 08/05/2017 00:12, Robert Elz wrote:
 >     Date:        Sun, 7 May 2017 17:17:28 -0400
 >     From:        christos@zoulas.com (Christos Zoulas)
 >     Message-ID:  <20170507211728.8AF0A17FDA8@rebar.astron.com>
 > 
 >   | On May 7,  8:50pm, roy@marples.name (Roy Marples) wrote:
 > 
 >   | | The idea is that if we claim to send from an address it has to be valid,
 >   | | but allow the NULL address if forwarded from the filter.
 >   | |
 >   | | Does this make sense?
 >   | | The same path probably needs adjustment in inet6.
 >   |
 >   | Sure, go for it. Why not put all the logic in ip_ifaddrvalid then?
 > 
 > While you are fixing this, please also fix ...
 > 
 >                          * Address exists, but is tentative or detached.
 >                          * We can't send from it because it's invalid,
 > 
 > Tentative addresses are not invalid, and we *have* to be able to send
 > from them, without that DAD does't work correctly.

 DAD happens at the ARP level, not the IP level.
 Different checks are in place there.

 > Consider a (perhaps unlikely, but possible) case where 2 systems attempt
 > to allocate the same address at approximately exactly the same time
 > (broken dhcp server, or common reboot after power fail and they have both
 > been configured to use the same addr by a broken sys-admin).
 > 
 > In that case they both make the addr tentative, each then starts DAD by
 > sending a "does anyone own..." query.   Each receives the other's DAD
 > packet.   The protocol then requires that, as we have that address, even
 > as a tentative one, we must reply (from the tentative address) and
 > claim the addr - both systems should do that, both receive the other's
 > defence of the addr, DAD fails (on both) and they both abandon the address.

 Address defence only happens once DAD has passed.
 In the scenario you describe this has not occured for either host.
 If both recieves each others initial probe then both will abort DAD and mark
 as duplicate.
 This is described in RFC5227, 2.1.1:

    In addition, if during this period the host receives any ARP Probe
    where the packet's 'target IP address' is the address being probed
    for, and the packet's 'sender hardware address' is not the hardware
    address of any of the host's interfaces, then the host SHOULD
    similarly treat this as an address conflict and signal an error to
    the configuring agent as above.  This can occur if two (or more)
    hosts have, for whatever reason, been inadvertently configured with
    the same address, and both are simultaneously in the process of
    probing that address to see if it can safely be used.

 And RFC5227, 2.4:
    (b) If a host currently has active TCP connections or other reasons
        to prefer to keep the same IPv4 address, and it has not seen any
        other conflicting ARP packets within the last DEFEND_INTERVAL
        seconds, then it MAY elect to attempt to defend its address by
        recording the time that the conflicting ARP packet was received,
        and then broadcasting one single ARP Announcement, giving its own
        IP and hardware addresses as the sender addresses of the ARP,
        with the 'target IP address' set to its own IP address, and the
        'target hardware address' set to all zeroes.  Having done this,
        the host can then continue to use the address normally without
        any further special action.  However, if this is not the first
        conflicting ARP packet the host has seen, and the time recorded
        for the previous conflicting ARP packet is recent, within
        DEFEND_INTERVAL seconds, then the host MUST immediately cease
        using this address and signal an error to the configuring agent
        as described above.  This is necessary to ensure that two hosts
        do not get stuck in an endless loop with both hosts trying to
        defend the same address.

 > But if we are not able to send from the addr, because it is considered
 > "invalid" then neither defends its addr, neither receives the other's
 > defence, and both then conclude that the addr is safe, and both (eventually)
 > reset the tentative flag, and chaos ensues.

 This is not what happens.
 DAD has not passed, there is no address to defend, it's just marked DUPLICATED
 and stays that way until manual intervention.

 > What we should be doing with tentative addreses is allowing them to be
 > used (for any purpose), but just not announcing them to the rest of the
 > system (except to tools like ifconfig and netstat) until DAD has
 > completed.   That means no other process will discover that the addr
 > exists, and so will not start to use it (nor will TCP accidentally see
 > the addr as existing, and start to use it) and so the incentive that
 > led to prohibiting sending from tentative addrs, or considering them
 > to be invalid, is gone.

 On a bad day, DAD on IPv4 can take about 10 seconds to complete.
 That's a big window to run ifconfig or getifaddrs in to find and start using
 these non announced addresses.
 Consider the wireless interface

 dhcpcd starts, sees all interfaces are currently down, forks right away bootup
 continues
 more daemons start up
 wifi comes up, dhcpcd is offered an address and adds it
 ntpd starts at this point and tries to start using the tentative address.

 > It also permits applications that really need to start running quickly, and
 > know how to deal with addresses that appear and disappear, to start using
 > the tentative addr before it is made permanent, if they like (and if they
 > can find some mechanism - perhaps using whatever hook ifconfig/netstat use -
 > to discover that the addr exists.)

 If the application really neeeds to start running quickly, it can disable DAD
 like so:
 net.inet.ip.dad_count=0

 Otherwise it can wait for the address to finish DAD and then it can start
 using it.

 Roy


 From: Frank Kardel <kardel@kardel.name>
 To: Roy Marples <roy@marples.name>, Mindaugas Rasiukevicius <rmind@netbsd.org>,
 	Frank Kardel <kardel@netbsd.org>
 CC: netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, Christos Zoulas
 	<christos@NetBSD.org>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Mon, 08 May 2017 08:51:21 +0200

 Hi Roy !

 I had a quick look at IPv6. There the packet filter seems to be after local
 address checking and scope related checks.
 I haven't read up on the scope rules - so somebody should look at the
 different pfil run location and whether it is appropriate to handle in case
 the packet filter changes to source address to be a non local source address.

 Frank


 From: Roy Marples <roy@marples.name>
 To: Frank Kardel <kardel@netbsd.org>, Mindaugas Rasiukevicius
 	<rmind@netbsd.org>
 Cc: netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, Christos Zoulas
 	<christos@NetBSD.org>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Mon, 8 May 2017 10:04:01 +0100

 Hi Frank!

 On 07/05/2017 22:07, Frank Kardel wrote:
 > Hmm, wouldn't this bring us back the bug again? ia == NULL for a
 > non-local source addresses (generated via pfil_run_hooks-NAT operation)
 > and IP_FORWARDING is not set as tcp_input.c:syn_cache_respond does
 > rightfully not set IP_FORWARDING and pfil_run_hooks has no means to set
 > that flag. That gives us error == -1 with your sequence.
 > So we would return EADDRNOTAVAIL breaking packet filter NAT action
 > again, if I didn't overlook something.

 Ah yes.
 I didn't actually read your initial report .... my bad!
 OK, so I don't fully understand the use case for sending packets from an
 address we don't have locally. Could you fill me in on this please?

 > From what I understand  this code originally attempted to avoid sending
 > from invalid/unusable local address (e. g. duplicate IP - error,
 > tentative and detached should just be dropped).
 > No validation can be done for non-local addresses at all. IP_FORWARDING
 > formerly used to be used to suppress infinite recursion on mcast
 > forwarding, but it seems the semantics where extended a little bit in
 > the mean time (like here to suppress a check).
 > So I cannot say something about the intentions for the IP_FORWARDING check.

 I *think* the idea was IP_FORWARDING would be set and we could skip
 source address validation because the filter may have changed it.
 https://nxr.netbsd.org/xref/src/sys/net/npf/npf_sendpkt.c#178
 Does NAT not hit that? I ask because I do run NPF+NAT on my erlite
 router which uses -current, but my mapping rule uses a local address -
 hence asking for your use case about a non local address.

 Could NAT set another flag we could check? rmind?

 Roy


 From: Frank Kardel <kardel@netbsd.org>
 To: Roy Marples <roy@marples.name>, Mindaugas Rasiukevicius <rmind@netbsd.org>
 CC: netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, Christos Zoulas
 	<christos@NetBSD.org>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Mon, 08 May 2017 13:23:40 +0200

 On 05/08/17 11:04, Roy Marples wrote:
 > Hi Frank!
 > 
 > On 07/05/2017 22:07, Frank Kardel wrote:
 > > Hmm, wouldn't this bring us back the bug again? ia == NULL for a
 > > non-local source addresses (generated via pfil_run_hooks-NAT operation)
 > > and IP_FORWARDING is not set as tcp_input.c:syn_cache_respond does
 > > rightfully not set IP_FORWARDING and pfil_run_hooks has no means to set
 > > that flag. That gives us error == -1 with your sequence.
 > > So we would return EADDRNOTAVAIL breaking packet filter NAT action
 > > again, if I didn't overlook something.
 > Ah yes.
 > I didn't actually read your initial report .... my bad!
 No worries - I am just trying to keep us from running too far in the wrong
 direction.
 > OK, so I don't fully understand the use case for sending packets from an
 > address we don't have locally. Could you fill me in on this please?
 The mapping is as follows

 Machine A (Client)        Machine B (Gateway/NAT/redirect box)      Machine C
 (target)
 a.b.c.d:x        ->       W.X.Y.Z           -> 10.200.1.2 (will never see a
 packet from a.b.c.dx)
                           map 127.0.0.1 25 <- 10.200.1.2 25
                           redirects to 127.0.0.1 25 (local)
                           A must see the association a.b.c.d:x <->
 10.200.1.2:25

                           Thus local postfix will see a.b.c.d x <-> 127.0.0.1
 25

                           So dest addr 10.200.1.2 will be replaced incoming
 with
                           127.0.0.1 25 in pf_fil_run_hooks in ip_input and on
 the way back
                           the src addr of 127.0.0.1 is replaced with
 10.200.1.2 (not local)
                           in pfil_run_hooks in ip_output.

 To quote the condensed analysis from the bug + where in the stack it happens:

  Following happens:
  1) ip_input: packet arrives on interface (a.b.c.d:x -> 10.200.1.2:25
  1a) ip_input call pfil_run_hooks -> NPF
  2) NPF: NAT assoc gets created
  3) NPF: packet gets NATed (=> a.b.c.d:x -> 127.0.0.1:25)
  3a) NPF returns
  3b) ip_input hands over to tcp_input
  4) tcp_input: packet gets entered in syn_cache
  5) tcp_input: syn_cache attempts to send SYN,ACK response (127.0.0.1:25 ->
  a.b.c.d:x) via ip_output()
  6) ip_output: ip_output calls pfil_hooks (->NPF)
  7) NPF: find nat assoc
  8) NPF: SYN,ACK packet gets reverse NATed (10.200.1.2:25 -> a.b.c.d:x)
  8a) NPF returns
  9) ip_output: ip_output tries for find the interface adress (ia) for 10.200.1.2 and
  fails (ia == NULL)
  10) ip_output: ip_output fumbles badly at ip_output:1.276:620 and declares the
  packet coming from an invalid address(ip_ifaddrvalid(NULL))
  11) confusion commences from here on

 > 
 > >  From what I understand  this code originally attempted to avoid sending
 > > from invalid/unusable local address (e. g. duplicate IP - error,
 > > tentative and detached should just be dropped).
 > > No validation can be done for non-local addresses at all. IP_FORWARDING
 > > formerly used to be used to suppress infinite recursion on mcast
 > > forwarding, but it seems the semantics where extended a little bit in
 > > the mean time (like here to suppress a check).
 > > So I cannot say something about the intentions for the IP_FORWARDING check.
 > I *think* the idea was IP_FORWARDING would be set and we could skip
 > source address validation because the filter may have changed it.
 > https://nxr.netbsd.org/xref/src/sys/net/npf/npf_sendpkt.c#178
 That's what I thought was the intention. As you see above the transmitted
 SYN,ACK is
 not created by NPF and not by npf_sendpkt.c#178. NPF and other filters do an
 in place header modification and the flags variable is not part of
 pfil_run_hooks(). So
 the IP_FORWARDING logic cannot be used by the packet filters AFAICS.
 > Does NAT not hit that?
 Nope - see above - coming directly from
 tcp_input:syn_cache_add:syn_cache_respond. IP_FORWARD is not set there.
 Before the packet filter we still have a valid local address, Maybe we should
 move the check there - IPv6 also does the check before IIANM.
 >   I ask because I do run NPF+NAT on my erlite
 > router which uses -current, but my mapping rule uses a local address -
 > hence asking for your use case about a non local address.
 > 
 > Could NAT set another flag we could check? rmind?
 OTOMHI see
     1) extend pfil_run_hooks interface to pass a pointer to flags- relatively
 cheap, somewhat dirty
     2) add a tag the the mbuf -  more expensive b/c we also need to add checks

 Would we gain enough speed/functionality/abstraction to warrant an interface
 change?
 Does the condition "before packet filter we have either IP_FORWARD set or a
 local address as source" always hold?
 If so, we could just move the check and set up ia correctly for IFASTATS.IPv6
 checks before packet filter runs.

 I have not analyzed the invariants enough to make a quick recommendation here.

 I hope I explained the setup good enough to describe the issue with NAT
 redirection.

 > 
 > Roy
 Frank


 From: Robert Elz <kre@munnari.OZ.AU>
 To: Frank Kardel <kardel@netbsd.org>
 cc: Roy Marples <roy@marples.name>, Mindaugas Rasiukevicius <rmind@netbsd.org>,
 	netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, Christos Zoulas
 	<christos@NetBSD.org>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Wed, 10 May 2017 05:45:11 +0700

     Date:        Sun, 07 May 2017 23:07:42 +0200
     From:        Frank Kardel <kardel@netbsd.org>
     Message-ID:  <590F8C9E.3040102@netbsd.org>

   | From what I understand  this code originally attempted to avoid sending 
   | from invalid/unusable local address (e. g. duplicate IP - error, 
   | tentative and detached should just be dropped).

 You also shouldn't be able to send from an address you don't own
 (generally - a router has to be able to forward, as distinct from
 originate, packets from anywhere of course).

 Maybe you could explain what you're trying to achieve, rather than
 worry too much (right now) at some particular kernel test which seems
 to be defeating the way you are currently trying to accomplish that.

 If your aim is to have machine "B" (the router/NAT box) from your
 later e-mail example, intercept SMTP (and perhaps other) connection
 attempts that your internal system (A) is attempting to make to
 some external system (C) - so that the connection is handled by B
 instead (acting as a proxy) then I suspect that someone can share
 a config which accomplishes that ... but I very much doubt it will
 involve something so weird as attempting to NAT into the loopback address.

 If you have some other aim, explain it, and someone can probably
 show how to accomplish it.   But please explain the objective, not
 the technique you believe you can use to meet that objective.

 Once we know what it is you want to do, it is still possible that
 kernel (or other) changes might be needed to accomplish it (or it
 might just not be possible at all), but if changes are needed the
 right ones to make could be somewhere quite different from what
 you have been concentrating on.

 kre

 ps: this issue is of course totally unrelated to the other question of
 whether tentative addresses should be considered invalid, which certainly
 has nothing to do with your problem.


 From: Frank Kardel <kardel@netbsd.org>
 To: Robert Elz <kre@munnari.OZ.AU>
 CC: Roy Marples <roy@marples.name>, Mindaugas Rasiukevicius <rmind@netbsd.org>,
 	netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, Christos Zoulas
 	<christos@NetBSD.org>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Wed, 10 May 2017 11:11:13 +0200

 Hi Robert !

 On 05/10/17 00:45, Robert Elz wrote:
 >      Date:        Sun, 07 May 2017 23:07:42 +0200
 >      From:        Frank Kardel <kardel@netbsd.org>
 >      Message-ID:  <590F8C9E.3040102@netbsd.org>
 > 
 >    | From what I understand  this code originally attempted to avoid sending
 >    | from invalid/unusable local address (e. g. duplicate IP - error,
 >    | tentative and detached should just be dropped).
 > 
 > You also shouldn't be able to send from an address you don't own
 > (generally - a router has to be able to forward, as distinct from
 > originate, packets from anywhere of course).
 You are correct - in this case (52074) we are looking at both aspects - the
 local machine and the router/NAT box.
 It is *not* about originating packets from anywhere. It is about redirecting
 packets for non local targets to a locally existing proxy.
 > 
 > Maybe you could explain what you're trying to achieve, rather than
 > worry too much (right now) at some particular kernel test which seems
 > to be defeating the way you are currently trying to accomplish that.
 I am just reporting a regression that was introduced by the code related local
 address checks.
 > 
 > If your aim is to have machine "B" (the router/NAT box) from your
 > later e-mail example, intercept SMTP (and perhaps other) connection
 > attempts that your internal system (A) is attempting to make to
 > some external system (C) - so that the connection is handled by B
 > instead (acting as a proxy) then I suspect that someone can share
 > a config which accomplishes that ...
 See man 5 npf.conf - there is an example. It also used to work with pf years
 ago but I used pf long ago
 and haven't retested it with last years address-check/tentative/detached
 changes. pf could
 also be impacted by last years change.

 Example from pf.conf(5)
      Note that redirecting external incoming connections to the loopback
      address, as in

            rdr on ne3 inet proto tcp to port spamd -> 127.0.0.1 port smtp

 Example from npf.conf(5)
      map $ext_if dynamic proto tcp 10.1.1.2 port 22 <- $ext_if port 9022

 Not necessarily a local address, but local addresses are not forbidden either.
 >   but I very much doubt it will
 > involve something so weird as attempting to NAT into the loopback address.
 (Transparent) proxies might very well do that at boundaries between external
 and internal networks.
 Using any other local address instead of 127.0.0.1 wouldn't change a thing for
 the discussion as the
 original target address is redirected to the local address and needs to be
 rewritten back to the
 original target address on the way back.
 > 
 > If you have some other aim, explain it, and someone can probably
 > show how to accomplish it.   But please explain the objective, not
 > the technique you believe you can use to meet that objective.
 It is a proxy for a system C that is not always online. And the proxy is
 located at the gateway.
 The solution is a common redirect scenario. nothing special unless you have a
 regression in the stack.

 I believe that the current  local address check is just placed wrong. But I
 just uncovered that
 while analyzing why the NAT redirect broke - I am not responsible for the
 address check code which broke
 NAT redirection to local addresses as an unwanted/unexpected side effect.

 I think we have a discussion going on how to implement the address check
 properly. And I think
 there may be a better solution which needs to be found and implemented which
 enforces the local address
 check more sensibly..

 If we put the local address check before the packet filter call (doing the
 reverse NAT)
 then all would be fine, But I think it should be discussed (roy@, rmind@,
 christos@, ...)
 whether this is the right solution to have a valid local address check and the
 redirect
 to local address NAT functionality.
 > 
 > Once we know what it is you want to do, it is still possible that
 > kernel (or other) changes might be needed to accomplish it (or it
 > might just not be possible at all), but if changes are needed the
 > right ones to make could be somewhere quite different from what
 > you have been concentrating on.
 A proxy on a gateway is something that should work and it used to work with
 npf and pf.
 Last year a regression was introduced breaking that functionality.

 I am happy learn another way to configure a proxy using local addresses of the
 gateway
 *without* changing ip_output.c - we just need to fix our own documentation
 then,

 Can you provide an alternative configuration to accomplish the above?
 > kre
 > 
 > ps: this issue is of course totally unrelated to the other question of
 > whether tentative addresses should be considered invalid, which certainly
 > has nothing to do with your problem.
 > 
 Frank


 From: Roy Marples <roy@marples.name>
 To: Robert Elz <kre@munnari.OZ.AU>, Frank Kardel <kardel@netbsd.org>
 Cc: Mindaugas Rasiukevicius <rmind@netbsd.org>, netbsd-bugs@netbsd.org,
 	gnats-admin@netbsd.org, Christos Zoulas <christos@NetBSD.org>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Wed, 10 May 2017 13:22:01 +0100

 On 09/05/2017 23:45, Robert Elz wrote:
 >     Date:        Sun, 07 May 2017 23:07:42 +0200
 >     From:        Frank Kardel <kardel@netbsd.org>
 >     Message-ID:  <590F8C9E.3040102@netbsd.org>
 > 
 >   | From what I understand  this code originally attempted to avoid sending 
 >   | from invalid/unusable local address (e. g. duplicate IP - error, 
 >   | tentative and detached should just be dropped).
 > 
 > You also shouldn't be able to send from an address you don't own
 > (generally - a router has to be able to forward, as distinct from
 > originate, packets from anywhere of course).

 This is what my initial code did.
 What I'm more concerned about though is the panic.
 I think we should revert xtos's change and solve the panic, as this just
 masks over it.

 Roy


 From: Christos Zoulas <christos@zoulas.com>
 To: Roy Marples <roy@marples.name>, Robert Elz <kre@munnari.OZ.AU>, Frank
 	Kardel <kardel@netbsd.org>
 Cc: Mindaugas Rasiukevicius <rmind@netbsd.org>, netbsd-bugs@netbsd.org,
 	gnats-admin@netbsd.org
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Wed, 10 May 2017 09:33:25 -0400

 On May 10,  1:22pm, roy@marples.name (Roy Marples) wrote:
 -- Subject: Re: kern/52074: -current npf map directive broken

 | > You also shouldn't be able to send from an address you don't own
 | > (generally - a router has to be able to forward, as distinct from
 | > originate, packets from anywhere of course).
 | 
 | This is what my initial code did.
 | What I'm more concerned about though is the panic.
 | I think we should revert xtos's change and solve the panic, as this just
 | masks over it.

 This is not about fixing the panic. It is about the ability of the packet
 filter to construct packets for which the origin interface cannot be
 determined from the packet source address.

 christos


 From: Frank Kardel <kardel@netbsd.org>
 To: Roy Marples <roy@marples.name>, Robert Elz <kre@munnari.OZ.AU>
 CC: Mindaugas Rasiukevicius <rmind@netbsd.org>, netbsd-bugs@netbsd.org,
 	gnats-admin@netbsd.org, Christos Zoulas <christos@NetBSD.org>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Wed, 10 May 2017 15:37:50 +0200

 Hi Roy !

 Don't worry about the panic at all - it is unrelated - that was a side issue I
 already analyzed and discussed with christos@ and rmind@. Root cause there is
 currently a semantic problem in NPF regarding naming of dynamic rules.

 Frank


 From: Christos Zoulas <christos@zoulas.com>
 To: Frank Kardel <kardel@netbsd.org>, Roy Marples <roy@marples.name>, Robert
 	Elz <kre@munnari.OZ.AU>
 Cc: Mindaugas Rasiukevicius <rmind@netbsd.org>, netbsd-bugs@netbsd.org,
 	gnats-admin@netbsd.org
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Wed, 10 May 2017 09:42:34 -0400

 On May 10,  3:37pm, kardel@netbsd.org (Frank Kardel) wrote:
 -- Subject: Re: kern/52074: -current npf map directive broken

 | Hi Roy !
 | 
 | Don't worry about the panic at all - it is unrelated - that was a side 
 | issue I already analyzed and discussed with christos@ and rmind@. Root 
 | cause there is currently a semantic problem in NPF regarding naming of 
 | dynamic rules.

 To be even more clear: the panic is unrelated to packet forwarding and
 has to do with an internal issue of npf; that issue is unrelated to
 the formation of packets by npf.

 christos


 From: Robert Elz <kre@munnari.OZ.AU>
 To: Roy Marples <roy@marples.name>
 cc: Frank Kardel <kardel@netbsd.org>, Mindaugas Rasiukevicius
 	<rmind@netbsd.org>, netbsd-bugs@netbsd.org, gnats-admin@netbsd.org,
 	Christos Zoulas <christos@NetBSD.org>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Thu, 11 May 2017 17:43:27 +0700

     Date:        Thu, 11 May 2017 10:47:28 +0100
     From:        Roy Marples <roy@marples.name>
     Message-ID:  <c2d2d033-a421-df5f-a0cd-4046b7d00808@marples.name>

   | I agree with Robert, we shouldn't be sending packets on the wire from an
   | address we don't own.
   | But you're not sending on the wire are you?

 Actually, he is, he wants to masquerade as some other (remote) host
 (proxy for them) and return packets as if the client host was communicating
 with that one, when it is really communicating with his NAT box.

 That is, he wants to pretend he is forwarding a packet from the remote
 host back to the client, when the packet has actually been originated
 locally.

 This has no chance of working if any protocol that verifies identity
 is in use (even SMTP has that available now, I think), but otherwise
 could work.

 Personally I wouldn't be trying to fake it at that level though, I'd
 be returning MX records to the client listing the NAT box as the
 preferred (only, probably) mail relay for the host in question
 (or for anyone that the client wants to talk to, if that is the
 objective.)   Of course, that only works for SMTP.

 kre


 From: Joerg Sonnenberger <joerg@bec.de>
 To: Roy Marples <roy@marples.name>
 Cc: netbsd-bugs@netbsd.org, gnats-admin@netbsd.org
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Thu, 11 May 2017 14:19:49 +0200

 On Thu, May 11, 2017 at 10:47:28AM +0100, Roy Marples wrote:
 > I agree with Robert, we shouldn't be sending packets on the wire from an
 > address we don't own.

 That depends. The transparent proxy case is tricky in this regard...

 Joerg


 From: Frank Kardel <kardel@netbsd.org>
 To: Roy Marples <roy@marples.name>, Robert Elz <kre@munnari.OZ.AU>
 CC: Mindaugas Rasiukevicius <rmind@netbsd.org>, netbsd-bugs@netbsd.org,
 	gnats-admin@netbsd.org, Christos Zoulas <christos@NetBSD.org>, Martin
 	Husemann <martin@duskware.de>, Joerg Sonnenberger <joerg@bec.de>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Thu, 11 May 2017 22:33:58 +0200

 Hi Roy and all!

 On 05/11/17 11:47, Roy Marples wrote:
 > Hi Frank On 10/05/2017 10:11, Frank Kardel wrote:
 > > On 05/10/17 00:45, Robert Elz wrote:
 > > > Date: Sun, 07 May 2017 23:07:42 +0200 From: Frank Kardel
 > > > <kardel@netbsd.org> Message-ID: <590F8C9E.3040102@netbsd.org> |
 > > > From what I understand this code originally attempted to avoid
 > > > sending | from invalid/unusable local address (e. g. duplicate IP
 > > > - error, | tentative and detached should just be dropped). You
 > > > also shouldn't be able to send from an address you don't own
 > > > (generally - a router has to be able to forward, as distinct from
 > > > originate, packets from anywhere of course).
 > > You are correct - in this case (52074) we are looking at both aspects
 > > - the local machine and the router/NAT box. It is *not* about
 > > originating packets from anywhere. It is about redirecting packets for
 > > non local targets to a locally existing proxy.
 > I agree with Robert, we shouldn't be sending packets on the wire from an
 > address we don't own. But you're not sending on the wire are you?
 See kre@'s comment. He described it.The local addres gets replaced with the
 remote address we proxied for.
 > I think a check to satisfy us all would be to test for IP_FORWARDING on
 > the packet or IFF_LOOPBACK on the outgoing interface - if either are true
 > we can skip address validation.
 The IFF_LOOPBACK property is of no use here. What counts is that the redirect
 is to a local address to masquerade a remote address.
 Any local address used for redirect would trigger the issue as the packet
 filter puts back the original remote address and that's violating the
 assumption
 the we must use local addresses on locally generated packets.
 > Thoughts? Roy
 I think the solution is much simpler (as I mentioned before).

 If we actually take look at ip_output we see roughly following processing
 steps:
  1) insert ip options
  2) create IP header unless forwarding or raw output
  3) handle routing (choose output interface)
  4) if no source address is set, use the interface address of the outgoing
 interface
  5) handle IPSEC
  6) run packet filter
  7) check whether we are sending from a valid local address (our discussion
 point)
  8) do xmit accounting
  9) handle TSO offloading or packet not needing fragmentation and output via
 interface -> done
 10) handle checksum offloading/calculation
 11) handle fragmentation output fragmented packets
 12) finalize

 During 1...5 the assumption/requirement that non IP_FORWARDING packets must
 originate from local addresses holds.
 Number 6 breaks this assumption in case of a redirected remote address to a
 local address. There the test in 7 relies
 on a requirement that cannot be guaranteed after packet filter NAT operations.

 Moving ia pickup and !IP_FORWARDING local address check from 7 to 5a will
 achieve what we want (guard against from invalid local addresses). Also
 IFA accounting would correctly measure bytes-sent at the right local address
 involved in the redirect scenario.

 I am currently running a check on local address violations before the packet
 filter call. The invariant that 1..5 follow the local address requirement
 holds so
 far. Traffic processed is: normal connections, ipsec transport/NATT, some
 multicast.

 Can somebody verify that strategy as a correct solution?

 NB: the current fix has the drawback of at least miscounting output bytes and
 missing some local checks.

 Frank


 From: Roy Marples <roy@marples.name>
 To: Frank Kardel <kardel@netbsd.org>, Robert Elz <kre@munnari.OZ.AU>
 Cc: Mindaugas Rasiukevicius <rmind@netbsd.org>, netbsd-bugs@netbsd.org,
 	gnats-admin@netbsd.org, Christos Zoulas <christos@NetBSD.org>, Martin
 	Husemann <martin@duskware.de>, Joerg Sonnenberger <joerg@bec.de>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Thu, 11 May 2017 21:42:07 +0100

 Yo

 On 11/05/2017 21:33, Frank Kardel wrote:
 > I think the solution is much simpler (as I mentioned before).
 > 
 > If we actually take look at ip_output we see roughly following
 > processing steps:
 >  1) insert ip options
 >  2) create IP header unless forwarding or raw output
 >  3) handle routing (choose output interface)
 >  4) if no source address is set, use the interface address of the
 > outgoing interface
 >  5) handle IPSEC
 >  6) run packet filter
 >  7) check whether we are sending from a valid local address (our
 > discussion point)
 >  8) do xmit accounting
 >  9) handle TSO offloading or packet not needing fragmentation and output
 > via interface -> done
 > 10) handle checksum offloading/calculation
 > 11) handle fragmentation output fragmented packets
 > 12) finalize
 > 
 > During 1...5 the assumption/requirement that non IP_FORWARDING packets
 > must originate from local addresses holds.
 > Number 6 breaks this assumption in case of a redirected remote address
 > to a local address. There the test in 7 relies
 > on a requirement that cannot be guaranteed after packet filter NAT
 > operations.
 > 
 > Moving ia pickup and !IP_FORWARDING local address check from 7 to 5a
 > will achieve what we want (guard against from invalid local addresses).
 > Also
 > IFA accounting would correctly measure bytes-sent at the right local
 > address involved in the redirect scenario.
 > 
 > I am currently running a check on local address violations before the
 > packet filter call. The invariant that 1..5 follow the local address
 > requirement holds so
 > far. Traffic processed is: normal connections, ipsec transport/NATT,
 > some multicast.
 > 
 > Can somebody verify that strategy as a correct solution?
 > 
 > NB: the current fix has the drawback of at least miscounting output
 > bytes and missing some local checks.

 Why not grab the pre-filter address at 5 and if the post filter address is
 NULL, use the pre-filter address?

 Would that work and solve the counting issue as well?

 Roy


 From: Frank Kardel <kardel@netbsd.org>
 To: Roy Marples <roy@marples.name>, Robert Elz <kre@munnari.OZ.AU>
 CC: Mindaugas Rasiukevicius <rmind@netbsd.org>, netbsd-bugs@netbsd.org,
 	gnats-admin@netbsd.org, Christos Zoulas <christos@NetBSD.org>, Martin
 	Husemann <martin@duskware.de>, Joerg Sonnenberger <joerg@bec.de>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Thu, 11 May 2017 23:06:42 +0200

 Hi Roy !

 Would be possible, but I think we are not gaining any benefit from the
 additional look-up.
 Before the packet filter we see the originating interface. That is the one for
 which we need to count the bytes sent.
 After the packet filter we see either no change most of the time, some other
 local interface or nothing.
 So I would prefer just to look the the originating interface.

 There is one subtle case for not being able to find the correct local
 interface from the source address.
 In BSD multiple interfaces can have the same IP address. In this situation we
 cannot correctly determine
 the correct interface from the source address alone for bytes-sent accounting.
 As this didn't work properly up to now we are not making things worse with
 respect to bytes-sent accounting.

 Frank


From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/52074: -current npf map directive broken
Date: Tue, 16 May 2017 06:20:11 +0000

 Another.

    ------

 From: Roy Marples <roy@marples.name>
 To: Frank Kardel <kardel@netbsd.org>, Robert Elz <kre@munnari.OZ.AU>
 Cc: Mindaugas Rasiukevicius <rmind@netbsd.org>, netbsd-bugs@netbsd.org,
 	gnats-admin@netbsd.org, Christos Zoulas <christos@NetBSD.org>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Thu, 11 May 2017 10:47:28 +0100

 Hi Frank

 On 10/05/2017 10:11, Frank Kardel wrote:
 > On 05/10/17 00:45, Robert Elz wrote:
 >>      Date:        Sun, 07 May 2017 23:07:42 +0200
 >>      From:        Frank Kardel <kardel@netbsd.org>
 >>      Message-ID:  <590F8C9E.3040102@netbsd.org>
 >>
 >>    | From what I understand  this code originally attempted to avoid
 >> sending
 >>    | from invalid/unusable local address (e. g. duplicate IP - error,
 >>    | tentative and detached should just be dropped).
 >>
 >> You also shouldn't be able to send from an address you don't own
 >> (generally - a router has to be able to forward, as distinct from
 >> originate, packets from anywhere of course).
 > You are correct - in this case (52074) we are looking at both aspects -
 > the local machine and the router/NAT box.
 > It is *not* about originating packets from anywhere. It is about
 > redirecting packets for non local targets to a locally existing proxy.

 I agree with Robert, we shouldn't be sending packets on the wire from an
 address we don't own.
 But you're not sending on the wire are you?

 I think a check to satisfy us all would be to test for IP_FORWARDING on
 the packet or IFF_LOOPBACK on the outgoing interface - if either are
 true we can skip address validation.

 Thoughts?

 Roy

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/52074: -current npf map directive broken
Date: Mon, 12 Mar 2018 00:22:25 +0000

 Not sent to gnats.

    ------

 From: Mindaugas Rasiukevicius <rmind@netbsd.org>
 To: Frank Kardel <kardel@netbsd.org>
 Cc: Roy Marples <roy@marples.name>, Robert Elz <kre@munnari.OZ.AU>,
 	netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, Christos Zoulas
 	<christos@NetBSD.org>, Martin Husemann <martin@duskware.de>, Joerg
 	Sonnenberger <joerg@bec.de>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Sun, 10 Dec 2017 00:31:09 +0000

 Frank Kardel <kardel@netbsd.org> wrote:
 > Hi Roy !
 > 
 > Would be possible, but I think we are not gaining any benefit from the 
 > additional look-up.
 > Before the packet filter we see the originating interface. That is the 
 > one for which we need to count the bytes sent.
 > After the packet filter we see either no change most of the time, some 
 > other local interface or nothing.
 > So I would prefer just to look the the originating interface.
 > 
 > There is one subtle case for not being able to find the correct local 
 > interface from the source address.
 > In BSD multiple interfaces can have the same IP address. In this 
 > situation we cannot correctly determine
 > the correct interface from the source address alone for bytes-sent 
 > accounting.
 > As this didn't work properly up to now we are not making things worse 
 > with respect to bytes-sent accounting.
 > 

 AFAIK this fixed in -current, right?

 -- 
 Mindaugas


 From: Roy Marples <roy@marples.name>
 To: Mindaugas Rasiukevicius <rmind@netbsd.org>, Frank Kardel
 	<kardel@netbsd.org>
 Cc: Robert Elz <kre@munnari.OZ.AU>, netbsd-bugs@netbsd.org,
 	gnats-admin@netbsd.org, Christos Zoulas <christos@NetBSD.org>, Martin
 	Husemann <martin@duskware.de>, Joerg Sonnenberger <joerg@bec.de>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Sun, 10 Dec 2017 10:43:37 +0000

 On 10/12/2017 00:31, Mindaugas Rasiukevicius wrote:
 > Frank Kardel <kardel@netbsd.org> wrote:
 > > Hi Roy !
 > > 
 > > Would be possible, but I think we are not gaining any benefit from the
 > > additional look-up.
 > > Before the packet filter we see the originating interface. That is the
 > > one for which we need to count the bytes sent.
 > > After the packet filter we see either no change most of the time, some
 > > other local interface or nothing.
 > > So I would prefer just to look the the originating interface.
 > > 
 > > There is one subtle case for not being able to find the correct local
 > > interface from the source address.
 > > In BSD multiple interfaces can have the same IP address. In this
 > > situation we cannot correctly determine
 > > the correct interface from the source address alone for bytes-sent
 > > accounting.
 > > As this didn't work properly up to now we are not making things worse
 > > with respect to bytes-sent accounting.
 > > 
 > 
 > AFAIK this fixed in -current, right?
 > 

 Christos fixed if in sys/netinet/ip_output -r1.277

 Roy


 From: Frank Kardel <kardel@netbsd.org>
 To: Roy Marples <roy@marples.name>, Mindaugas Rasiukevicius <rmind@netbsd.org>
 Cc: Robert Elz <kre@munnari.OZ.AU>, netbsd-bugs@netbsd.org,
 	gnats-admin@netbsd.org, Christos Zoulas <christos@NetBSD.org>, Martin
 	Husemann <martin@duskware.de>, Joerg Sonnenberger <joerg@bec.de>
 Subject: Re: kern/52074: -current npf map directive broken
 Date: Sun, 10 Dec 2017 11:48:42 +0100

 Yes the current ip_output is fixed in that regard.

 Frank

State-Changed-From-To: analyzed->closed
State-Changed-By: rmind@NetBSD.org
State-Changed-When: Sat, 19 Jan 2019 21:47:16 +0000
State-Changed-Why:
The problem was unrelated to NPF; it has been fixed.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.43 2018/01/16 07:36:43 maya Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2017 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.