NetBSD Problem Report #52643

From www@NetBSD.org  Mon Oct 23 07:40:36 2017
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 9F2E47A1CE
	for <gnats-bugs@gnats.NetBSD.org>; Mon, 23 Oct 2017 07:40:36 +0000 (UTC)
Message-Id: <20171023074034.6B6177A1CE@mollari.NetBSD.org>
Date: Mon, 23 Oct 2017 07:40:34 +0000 (UTC)
From: ozaki-r@netbsd.org
Reply-To: ozaki-r@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: npfctl reload on rump kernels fails randomly
X-Send-Pr-Version: www-1.0

>Number:         52643
>Category:       kern
>Synopsis:       npfctl reload on rump kernels fails randomly
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Oct 23 07:45:00 +0000 2017
>Closed-Date:    Fri Aug 31 14:40:45 +0000 2018
>Last-Modified:  Fri Aug 31 14:40:45 +0000 2018
>Originator:     Ryota Ozaki
>Release:        -current (and maybe earlier releases)
>Organization:
>Environment:
NetBSD rangeley 8.99.3 NetBSD 8.99.3 (RANGELEY) #77: Wed Oct  4 10:23:38 JST 2017  ozaki-r@rangeley:(hidden) amd64
>Description:
npfctl reload fails sometimes on rump kernels with EIO like this:

  tc-so:Executing command [ env LD_PRELOAD=/usr/lib/librumphijack.so RUMPHIJACK=path=/rump,socket=all:nolocal,sysctl=yes,blanket=/dev/npf npfctl reload ./npf.conf ]
  tc-se:Fail: incorrect exit status: 1, expected: 0
  tc-se:stdout:
  tc-se:
  tc-se:stderr:
  tc-se:npfctl: source  line 333501697
  tc-se:npfctl: object: 535
  tc-se:npfctl: npfctl_config_send: Input/output error
  tc-se:

The error message is generated by libnpf that uses libprop.

npfctl uses npf_config_submit of libnpf to call ioctl to the kernel.
libnpf uses prop_dictionary_sendrecv_ioctl to call ioctl and
get an errno of the ioctl by prop_dictionary_get_int32.
On npfctl reload on rump kernels, prop_dictionary_sendrecv_ioctl succeeds
but prop_dictionary_get_int32 fails because errno isn't set correctly.

It seems that npfctl_load that is a kernel function of npf and called
via the prop_dictionary_sendrecv_ioctl doesn't set the errno to the return
value of libprop: https://nxr.netbsd.org/xref/src/sys/net/npf/npf_ctl.c#635

          /* Error report. */
  #if !defined(_NPF_TESTING) && !defined(_NPF_STANDALONE)
          prop_dictionary_set_int32(errdict, "errno", error);
          prop_dictionary_copyout_ioctl(pref, cmd, errdict);
          prop_object_release(errdict);
          error = 0;
  #endif
          return error;

The code setting an errno is trimmed on rump kernels since by default
rump kernels set _NPF_TESTING. If we enable the code, npfctl reload
works expectedly.

Note that npfctl reload sometimes succeeds, I guess, because npf_config_submit
tries to read an errno from a uninitialized memory and it sometimes works luckily.
>How-To-Repeat:
# From an ATF test script that is under development

rump_server -lrumpnet -lrumpnet_net -lrumpnet_netinet -lrumpnet_shmif -lrumpdev -lrumpvfs -lrumpdev_bpf -lrumpnet_bpfjit -lrumpnet_npf -lrumpnet_netinet6 unix://npf_nat_local

export RUMP_SERVER=unix://npf_nat_local
rump.ifconfig shmif0 create
rump.ifconfig shmif0 linkstr ./bus_npf_local
rump.ifconfig shmif1 create
rump.ifconfig shmif1 linkstr ./bus_npf_nat

rump.sysctl -q -w net.inet.ip.dad_count=0
rump.ifconfig shmif0 10.0.1.1/24
rump.ifconfig shmif1 20.0.0.1/24
rump.sysctl -q -w net.inet.ip.forwarding=1
rump.route -n add -net 10.0.2.0 20.0.0.2

cat > ./npf.conf <<-EOF
set bpf.jit off
\$int_if = inet4(shmif0)
\$ext_if = inet4(shmif1)
\$localnet = { 10.0.1.0/24 }
map \$ext_if dynamic \$localnet -> \$ext_if
group "external" on \$ext_if {
        pass stateful out final all
}
group "internal" on \$int_if {
        block in all
        pass in final from \$localnet
        pass out final all
}
group default {
        pass final on lo0 all
        block all
}
EOF

env LD_PRELOAD=/usr/lib/librumphijack.so RUMPHIJACK=path=/rump,socket=all:nolocal,sysctl=yes,blanket=/dev/npf npfctl reload ./npf.conf
>Fix:
As described above, enabling the code fixes the issue, but I'm not sure
if the fix is appropriate.

diff --git a/sys/net/npf/npf_ctl.c b/sys/net/npf/npf_ctl.c
index b1af2d347ac..43105db9c85 100644
--- a/sys/net/npf/npf_ctl.c
+++ b/sys/net/npf/npf_ctl.c
@@ -633,12 +633,12 @@ fail:
        prop_object_release(npf_dict);

        /* Error report. */
-#if !defined(_NPF_TESTING) && !defined(_NPF_STANDALONE)
+//#if !defined(_NPF_TESTING) && !defined(_NPF_STANDALONE)
        prop_dictionary_set_int32(errdict, "errno", error);
        prop_dictionary_copyout_ioctl(pref, cmd, errdict);
        prop_object_release(errdict);
        error = 0;
-#endif
+//#endif
        return error;
 }


>Release-Note:

>Audit-Trail:
From: Ryota Ozaki <ozaki-r@netbsd.org>
To: "gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/52643: npfctl reload on rump kernels fails randomly
Date: Fri, 27 Oct 2017 12:41:33 +0900

 On Mon, Oct 23, 2017 at 4:45 PM,  <ozaki-r@netbsd.org> wrote:
 >>Number:         52643
 >>Category:       kern
 >>Synopsis:       npfctl reload on rump kernels fails randomly
 >>Confidential:   no
 >>Severity:       serious
 >>Priority:       medium
 >>Responsible:    kern-bug-people
 >>State:          open
 >>Class:          sw-bug
 >>Submitter-Id:   net
 >>Arrival-Date:   Mon Oct 23 07:45:00 +0000 2017
 >>Originator:     Ryota Ozaki
 >>Release:        -current (and maybe earlier releases)
 >>Organization:
 >>Environment:
 > NetBSD rangeley 8.99.3 NetBSD 8.99.3 (RANGELEY) #77: Wed Oct  4 10:23:38 JST 2017  ozaki-r@rangeley:(hidden) amd64
 >>Description:
 > npfctl reload fails sometimes on rump kernels with EIO like this:
 >
 >   tc-so:Executing command [ env LD_PRELOAD=/usr/lib/librumphijack.so RUMPHIJACK=path=/rump,socket=all:nolocal,sysctl=yes,blanket=/dev/npf npfctl reload ./npf.conf ]
 >   tc-se:Fail: incorrect exit status: 1, expected: 0
 >   tc-se:stdout:
 >   tc-se:
 >   tc-se:stderr:
 >   tc-se:npfctl: source  line 333501697
 >   tc-se:npfctl: object: 535
 >   tc-se:npfctl: npfctl_config_send: Input/output error
 >   tc-se:
 >
 > The error message is generated by libnpf that uses libprop.
 >
 > npfctl uses npf_config_submit of libnpf to call ioctl to the kernel.
 > libnpf uses prop_dictionary_sendrecv_ioctl to call ioctl and
 > get an errno of the ioctl by prop_dictionary_get_int32.
 > On npfctl reload on rump kernels, prop_dictionary_sendrecv_ioctl succeeds
 > but prop_dictionary_get_int32 fails because errno isn't set correctly.
 >
 > It seems that npfctl_load that is a kernel function of npf and called
 > via the prop_dictionary_sendrecv_ioctl doesn't set the errno to the return
 > value of libprop: https://nxr.netbsd.org/xref/src/sys/net/npf/npf_ctl.c#635
 >
 >           /* Error report. */
 >   #if !defined(_NPF_TESTING) && !defined(_NPF_STANDALONE)
 >           prop_dictionary_set_int32(errdict, "errno", error);
 >           prop_dictionary_copyout_ioctl(pref, cmd, errdict);
 >           prop_object_release(errdict);
 >           error = 0;
 >   #endif
 >           return error;
 >
 > The code setting an errno is trimmed on rump kernels since by default
 > rump kernels set _NPF_TESTING. If we enable the code, npfctl reload
 > works expectedly.
 >
 > Note that npfctl reload sometimes succeeds, I guess, because npf_config_submit
 > tries to read an errno from a uninitialized memory and it sometimes works luckily.
 >>How-To-Repeat:
 > # From an ATF test script that is under development
 >
 > rump_server -lrumpnet -lrumpnet_net -lrumpnet_netinet -lrumpnet_shmif -lrumpdev -lrumpvfs -lrumpdev_bpf -lrumpnet_bpfjit -lrumpnet_npf -lrumpnet_netinet6 unix://npf_nat_local
 >
 > export RUMP_SERVER=unix://npf_nat_local
 > rump.ifconfig shmif0 create
 > rump.ifconfig shmif0 linkstr ./bus_npf_local
 > rump.ifconfig shmif1 create
 > rump.ifconfig shmif1 linkstr ./bus_npf_nat
 >
 > rump.sysctl -q -w net.inet.ip.dad_count=0
 > rump.ifconfig shmif0 10.0.1.1/24
 > rump.ifconfig shmif1 20.0.0.1/24
 > rump.sysctl -q -w net.inet.ip.forwarding=1
 > rump.route -n add -net 10.0.2.0 20.0.0.2
 >
 > cat > ./npf.conf <<-EOF
 > set bpf.jit off
 > \$int_if = inet4(shmif0)
 > \$ext_if = inet4(shmif1)
 > \$localnet = { 10.0.1.0/24 }
 > map \$ext_if dynamic \$localnet -> \$ext_if
 > group "external" on \$ext_if {
 >         pass stateful out final all
 > }
 > group "internal" on \$int_if {
 >         block in all
 >         pass in final from \$localnet
 >         pass out final all
 > }
 > group default {
 >         pass final on lo0 all
 >         block all
 > }
 > EOF
 >
 > env LD_PRELOAD=/usr/lib/librumphijack.so RUMPHIJACK=path=/rump,socket=all:nolocal,sysctl=yes,blanket=/dev/npf npfctl reload ./npf.conf
 >>Fix:
 > As described above, enabling the code fixes the issue, but I'm not sure
 > if the fix is appropriate.
 >
 > diff --git a/sys/net/npf/npf_ctl.c b/sys/net/npf/npf_ctl.c
 > index b1af2d347ac..43105db9c85 100644
 > --- a/sys/net/npf/npf_ctl.c
 > +++ b/sys/net/npf/npf_ctl.c
 > @@ -633,12 +633,12 @@ fail:
 >         prop_object_release(npf_dict);
 >
 >         /* Error report. */
 > -#if !defined(_NPF_TESTING) && !defined(_NPF_STANDALONE)
 > +//#if !defined(_NPF_TESTING) && !defined(_NPF_STANDALONE)
 >         prop_dictionary_set_int32(errdict, "errno", error);
 >         prop_dictionary_copyout_ioctl(pref, cmd, errdict);
 >         prop_object_release(errdict);
 >         error = 0;
 > -#endif
 > +//#endif
 >         return error;
 >  }
 >

 Further investigations found the following requirements:
 - prop_object_release(errdict) must be always called
   because errdict = prop_dictionary_create() is always called
 - prop_object_release(npf_dict) needs to be called only if
   npf_dict is allocated by prop_dictionary_copyin_ioctl_size
 - _NPF_STANDALONE doesn't require to set prop
 - For _NPF_TESTING, if npfctl_testing, setting prop isn't
   needed, otherwise it's needed

 So I think a fix looks like the following patch. Comments?

   ozaki-r


 --

 diff --git a/sys/net/npf/npf_ctl.c b/sys/net/npf/npf_ctl.c
 index b1af2d347ac..3e82db6bf73 100644
 --- a/sys/net/npf/npf_ctl.c
 +++ b/sys/net/npf/npf_ctl.c
 @@ -630,15 +630,25 @@ fail:
         if (tblset) {
                 npf_tableset_destroy(tblset);
         }
 -       prop_object_release(npf_dict);
 +#if defined(_NPF_TESTING) || defined(_NPF_STANDALONE)
 +       if (!npfctl_testing)
 +#endif
 +               prop_object_release(npf_dict);

 -       /* Error report. */
 -#if !defined(_NPF_TESTING) && !defined(_NPF_STANDALONE)
 -       prop_dictionary_set_int32(errdict, "errno", error);
 -       prop_dictionary_copyout_ioctl(pref, cmd, errdict);
 +#ifndef _NPF_STANDALONE
 +#ifdef _NPF_TESTING
 +       if (!npfctl_testing) {
 +#endif
 +               /* Error report. */
 +               prop_dictionary_set_int32(errdict, "errno", error);
 +               prop_dictionary_copyout_ioctl(pref, cmd, errdict);
 +               error = 0;
 +#ifdef _NPF_TESTING
 +       }
 +#endif
 +#endif /* _NPF_STANDALONE */
         prop_object_release(errdict);
 -       error = 0;
 -#endif
 +
         return error;
  }

From: christos@zoulas.com (Christos Zoulas)
To: Ryota Ozaki <ozaki-r@netbsd.org>, 
	"gnats-bugs@NetBSD.org" <gnats-bugs@netbsd.org>
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, 
	netbsd-bugs@netbsd.org
Subject: Re: kern/52643: npfctl reload on rump kernels fails randomly
Date: Fri, 27 Oct 2017 07:28:20 -0400

 On Oct 27, 12:41pm, ozaki-r@netbsd.org (Ryota Ozaki) wrote:
 -- Subject: Re: kern/52643: npfctl reload on rump kernels fails randomly

 | Further investigations found the following requirements:
 | - prop_object_release(errdict) must be always called
 |   because errdict = prop_dictionary_create() is always called
 | - prop_object_release(npf_dict) needs to be called only if
 |   npf_dict is allocated by prop_dictionary_copyin_ioctl_size
 | - _NPF_STANDALONE doesn't require to set prop
 | - For _NPF_TESTING, if npfctl_testing, setting prop isn't
 |   needed, otherwise it's needed
 | 
 | So I think a fix looks like the following patch. Comments?

 Go for it, add the big fat comment above to the patch though :-)

 christos

From: "Ryota Ozaki" <ozaki-r@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52643 CVS commit: src/sys/net/npf
Date: Mon, 30 Oct 2017 03:02:35 +0000

 Module Name:	src
 Committed By:	ozaki-r
 Date:		Mon Oct 30 03:02:35 UTC 2017

 Modified Files:
 	src/sys/net/npf: npf_ctl.c

 Log Message:
 Fix npfclt reload on rump kernels

 It fails because npfctl cannot get an errno when it calls ioctl to the (rump)
 kernel; npfctl (libnpf) expects that an errno is returned via proplib,
 however, the rump library of npf doesn't so. It happens because of mishandlings
 of complicate npf kernel options.

 PR kern/52643


 To generate a diff of this commit:
 cvs rdiff -u -r1.48 -r1.49 src/sys/net/npf/npf_ctl.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Soren Jacobsen" <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/52643 CVS commit: [netbsd-8] src
Date: Fri, 17 Nov 2017 20:43:11 +0000

 Module Name:	src
 Committed By:	snj
 Date:		Fri Nov 17 20:43:11 UTC 2017

 Modified Files:
 	src/distrib/sets/lists/debug [netbsd-8]: mi
 	src/distrib/sets/lists/tests [netbsd-8]: mi
 	src/etc/mtree [netbsd-8]: NetBSD.dist.tests
 	src/sys/net/npf [netbsd-8]: npf_ctl.c
 	src/tests/net [netbsd-8]: net_common.sh
 	src/tests/net/ipsec [netbsd-8]: Makefile algorithms.sh
 	src/usr.sbin/npf/npfctl [netbsd-8]: npfctl.c
 Added Files:
 	src/tests/net/ipsec [netbsd-8]: natt_terminator.c t_ipsec_natt.sh

 Log Message:
 Pull up following revision(s) (requested by ozaki-r in ticket #357):
 	distrib/sets/lists/debug/mi: 1.228
 	distrib/sets/lists/tests/mi: 1.765-1.766
 	etc/mtree/NetBSD.dist.tests: 1.149
 	sys/net/npf/npf_ctl.c: 1.49
 	tests/net/ipsec/Makefile: 1.10
 	tests/net/ipsec/algorithms.sh: 1.6
 	tests/net/ipsec/natt_terminator.c: 1.1
 	tests/net/ipsec/t_ipsec_natt.sh: 1.1
 	tests/net/net_common.sh: 1.23-1.24
 	usr.sbin/npf/npfctl/npfctl.c: 1.54
 Handle esp-udp for NAT-T
 --
 Fix npfclt reload on rump kernels
 It fails because npfctl cannot get an errno when it calls ioctl to the (rump)
 kernel; npfctl (libnpf) expects that an errno is returned via proplib,
 however, the rump library of npf doesn't so. It happens because of mishandlings
 of complicate npf kernel options.
 PR kern/52643
 --
 Fix showing translated port (ntohs-ed twice wrongly)
 --
 Add test cases of NAT-T (transport mode)
 A small C program is added to make a special socket (UDP_ENCAP_ESPINUDP)
 and keep it to handle UDP-encapsulated ESP packets.
 --
 Add net/ipsec debug lib directory
 --
 Add ./usr/libdata/debug/usr/tests/net/ipsec
 --
 Stop using bpfjit
 Because most architectures don't support it and npf still works without it.


 To generate a diff of this commit:
 cvs rdiff -u -r1.216.2.4 -r1.216.2.5 src/distrib/sets/lists/debug/mi
 cvs rdiff -u -r1.752.2.5 -r1.752.2.6 src/distrib/sets/lists/tests/mi
 cvs rdiff -u -r1.147.2.1 -r1.147.2.2 src/etc/mtree/NetBSD.dist.tests
 cvs rdiff -u -r1.48 -r1.48.2.1 src/sys/net/npf/npf_ctl.c
 cvs rdiff -u -r1.18.2.2 -r1.18.2.3 src/tests/net/net_common.sh
 cvs rdiff -u -r1.6.2.1 -r1.6.2.2 src/tests/net/ipsec/Makefile
 cvs rdiff -u -r1.4.2.1 -r1.4.2.2 src/tests/net/ipsec/algorithms.sh
 cvs rdiff -u -r0 -r1.1.2.2 src/tests/net/ipsec/natt_terminator.c \
     src/tests/net/ipsec/t_ipsec_natt.sh
 cvs rdiff -u -r1.53 -r1.53.6.1 src/usr.sbin/npf/npfctl/npfctl.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: maxv@NetBSD.org
State-Changed-When: Fri, 31 Aug 2018 14:40:45 +0000
State-Changed-Why:
Fixed and pulled up, close this PR.


>Unformatted:

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.43 2018/01/16 07:36:43 maya Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2017 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.