NetBSD Problem Report #49264
From www@NetBSD.org Thu Oct 9 14:59:27 2014
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 5AF5DA6602
for <gnats-bugs@gnats.NetBSD.org>; Thu, 9 Oct 2014 14:59:27 +0000 (UTC)
Message-Id: <20141009145925.CC331A6653@mollari.NetBSD.org>
Date: Thu, 9 Oct 2014 14:59:25 +0000 (UTC)
From: ozaki-r@netbsd.org
Reply-To: ozaki-r@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: vlan(4): concurrent executions of ifconfig cause a fatal page fault
X-Send-Pr-Version: www-1.0
>Number: 49264
>Category: kern
>Synopsis: vlan(4): concurrent executions of ifconfig cause a fatal page fault
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Oct 09 15:00:00 +0000 2014
>Closed-Date: Wed Jun 01 02:55:33 +0000 2016
>Last-Modified: Sun Sep 24 20:10:00 +0000 2017
>Originator: Ryota Ozaki
>Release: current
>Organization:
>Environment:
NetBSD kvm 7.99.1 NetBSD 7.99.1 (KVM) #89: Thu Oct 9 20:43:55 JST 2014 ozaki-r@(hidden):(hidden) amd64
>Description:
Running "ifconfig vlan0 -vlanif vioif0" and "ifconfig vlan0 destroy" in parallel under some load sometimes causes a fatal page fault:
uvm_fault(0xfffffe8002e14188, 0x0, 1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip ffffffff8025cc44 cs 8 rflags 10246 cr2 50 ilevel 6 rsp fffffe8000bacc08
curlwp 0xfffffe8000d48440 pid 2376.1 lowest kstack 0xfffffe8000ba92c0
kernel: page fault trap, code=0
Stopped in pid 2376.1 (ifconfig) at netbsd:vlan_unconfig+0x32: cmpb $0x6,50(%rax)
db{0}> bt
vlan_unconfig() at netbsd:vlan_unconfig+0x32
vlan_ioctl() at netbsd:vlan_ioctl+0x235
doifioctl() at netbsd:doifioctl+0x2d8
soo_ioctl() at netbsd:soo_ioctl+0x2af
sys_ioctl() at netbsd:sys_ioctl+0x17e
syscall() at netbsd:syscall+0x9a
--- syscall (number 54) ---
7f7ff6ccea0a:
vlan_unconfig+0x32 corresponds to this line of the source code:
switch (ifv->ifv_p->if_type) {
ifv->ifv_p is unexpectedly NULL at that point, which is consistent with the fault address (cr2 50) for the access at 50(%rax). The function checks that ifv->ifv_p is non-NULL on entry, so another LWP must have run between that check and the faulting line.
vlan_unconfig is protected by splnet and by KERNEL_LOCK in soo_ioctl, but (*ifv->ifv_msw->vmsw_purgemulti)(ifv), which vlan_unconfig calls, may sleep, so a second LWP can enter the function while the first LWP is sleeping there.
We have to serialize executions of vlan_unconfig somehow.
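The serialization described above can be illustrated outside the kernel. The following is only a user-space sketch using POSIX threads, not the kernel change itself: struct parent_if, struct vlan_softc, unconfig() and run_concurrent_unconfig() are hypothetical stand-ins, pthread_mutex_t stands in for kmutex_t, and sched_yield() stands in for the call that may sleep. It shows the NULL check and the later dereference being made atomic with respect to a second caller.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct ifnet and struct ifvlan. */
struct parent_if { int if_type; };
struct vlan_softc {
	struct parent_if *ifv_p;	/* NULL once unconfigured */
	int seen_type;			/* if_type read during teardown */
	int teardowns;			/* callers that ran the teardown body */
};

/* User-space analogue of a mutex guarding the unconfigure path. */
static pthread_mutex_t ifv_mtx = PTHREAD_MUTEX_INITIALIZER;

static void
unconfig(struct vlan_softc *ifv)
{
	pthread_mutex_lock(&ifv_mtx);
	if (ifv->ifv_p == NULL) {
		/* Another caller already unconfigured the interface. */
		pthread_mutex_unlock(&ifv_mtx);
		return;
	}

	sched_yield();	/* stands in for a call that may sleep */

	/* Still non-NULL: nobody can clear ifv_p while we hold ifv_mtx. */
	assert(ifv->ifv_p != NULL);
	ifv->seen_type = ifv->ifv_p->if_type;
	ifv->ifv_p = NULL;
	ifv->teardowns++;
	pthread_mutex_unlock(&ifv_mtx);
}

static void *
worker(void *arg)
{
	unconfig(arg);
	return NULL;
}

/* Run two concurrent unconfig() calls; return how many did the teardown. */
static int
run_concurrent_unconfig(void)
{
	struct parent_if parent = { 6 };
	struct vlan_softc sc = { &parent, 0, 0 };
	pthread_t t[2];

	for (int i = 0; i < 2; i++)
		pthread_create(&t[i], NULL, worker, &sc);
	for (int i = 0; i < 2; i++)
		pthread_join(t[i], NULL);
	return sc.teardowns;
}
```

Calling run_concurrent_unconfig() from a main() should report that exactly one of the two threads ran the teardown body. Without the lock, the second thread could pass the NULL check while the first is yielding and then dereference a pointer that the first thread has since cleared.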
>How-To-Repeat:
Run the following script with some load:
while true; do
ifconfig vlan0 create
ifconfig vlan0 vlan 10 vlanif vioif0
ifconfig vlan0 -vlanif vioif0 &
ifconfig vlan0 destroy
done
>Fix:
Introduce a mutex to protect vlan_unconfig.
diff --git a/sys/net/if_vlan.c b/sys/net/if_vlan.c
index 5c75e34..30b8724 100644
--- a/sys/net/if_vlan.c
+++ b/sys/net/if_vlan.c
@@ -180,6 +180,8 @@ void vlanattach(int);
/* XXX This should be a hash table with the tag as the basis of the key. */
static LIST_HEAD(, ifvlan) ifv_list;
+static kmutex_t ifv_mtx __cacheline_aligned;
+
struct if_clone vlan_cloner =
IF_CLONE_INITIALIZER("vlan", vlan_clone_create, vlan_clone_destroy);
@@ -191,6 +193,7 @@ vlanattach(int n)
{
LIST_INIT(&ifv_list);
+ mutex_init(&ifv_mtx, MUTEX_DEFAULT, IPL_NONE);
if_clone_attach(&vlan_cloner);
}
@@ -359,8 +362,12 @@ vlan_unconfig(struct ifnet *ifp)
{
struct ifvlan *ifv = ifp->if_softc;
- if (ifv->ifv_p == NULL)
+ mutex_enter(&ifv_mtx);
+
+ if (ifv->ifv_p == NULL) {
+ mutex_exit(&ifv_mtx);
return;
+ }
/*
* Since the interface is being unconfigured, we need to empty the
@@ -412,6 +419,8 @@ vlan_unconfig(struct ifnet *ifp)
if_down(ifp);
ifp->if_flags &= ~(IFF_UP|IFF_RUNNING);
ifp->if_capabilities = 0;
+
+ mutex_exit(&ifv_mtx);
}
/*
>Release-Note:
>Audit-Trail:
From: "Ryota Ozaki" <ozaki-r@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/49264 CVS commit: src/sys/net
Date: Sat, 11 Oct 2014 10:16:49 +0000
Module Name: src
Committed By: ozaki-r
Date: Sat Oct 11 10:16:49 UTC 2014
Modified Files:
src/sys/net: if_vlan.c
Log Message:
Protect vlan_unconfig with a mutex
It is not thread-safe but is likely to be executed in concurrent.
See PR 49264 for more detail.
To generate a diff of this commit:
cvs rdiff -u -r1.75 -r1.76 src/sys/net/if_vlan.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Ryota Ozaki" <ozaki-r@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/49264 CVS commit: src/sys/net
Date: Sat, 11 Oct 2014 10:27:31 +0000
Module Name: src
Committed By: ozaki-r
Date: Sat Oct 11 10:27:31 UTC 2014
Modified Files:
src/sys/net: if_vlan.c
Log Message:
Execute if_detach within splnet where vlan_unconfig is
With the fix, a ifnet data of vlan can avoid use after free
that results in a fatal page fault.
This problem was found when fixing PR 49264. See
http://mail-index.netbsd.org/netbsd-bugs/2014/10/10/msg038536.html
for more detail.
To generate a diff of this commit:
cvs rdiff -u -r1.77 -r1.78 src/sys/net/if_vlan.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: pgoyette@NetBSD.org
State-Changed-When: Wed, 01 Jun 2016 02:55:33 +0000
State-Changed-Why:
Committed on Oct 11 10:27:31 UTC 2014 by ozaki-r
From: "Soren Jacobsen" <snj@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/49264 CVS commit: [netbsd-7] src/sys
Date: Sun, 24 Sep 2017 20:05:03 +0000
Module Name: src
Committed By: snj
Date: Sun Sep 24 20:05:03 UTC 2017
Modified Files:
src/sys/arch/xen/xen [netbsd-7]: if_xennet_xenbus.c xennetback_xenbus.c
src/sys/net [netbsd-7]: if_bridge.c if_ether.h if_ethersubr.c if_vlan.c
Log Message:
Pull up following revision(s) (requested by manu in ticket #1409):
sys/arch/xen/xen/if_xennet_xenbus.c: 1.65
sys/arch/xen/xen/xennetback_xenbus.c: 1.53, 1.56 via patch
sys/net/if_bridge.c: 1.105
sys/net/if_ether.h: 1.65
sys/net/if_ethersubr.c: 1.215, 1.235
sys/net/if_vlan.c: 1.76, 1.77, 1.83, 1.88, 1.94
Protect vlan_unconfig with a mutex
It is not thread-safe but is likely to be executed in concurrent.
See PR 49264 for more detail.
--
Tweak vlan_unconfig
No functional change.
--
Add handling of VLAN packets in if_bridge where the parent interface supports
them (Jean-Jacques.Puig%espci.fr@localhost). Factor out the vlan_mtu enabling and
disabling code.
--
Enable the VLAN mtu capability and check for the adjusted packet size
(Jean-Jacques.Puig at espci.fr).
Factor out the packet-size checking function for clarity.
--
Don't increment the reference count only when it was 0...
From Jean-Jacques.Puig
--
Account for the CRC len (Jean-Jacques.Puig)
--
Fix a bug that the parent interface's callback wasn't called when the vlan
interface is configured. A callback function uses VLAN_ATTACHED() function
which check ec->ec_nvlans, the value should be incremented before calling the
callback. This bug was added in if_vlan.c rev. 1.83 (2015/11/19).
To generate a diff of this commit:
cvs rdiff -u -r1.63.2.2 -r1.63.2.3 src/sys/arch/xen/xen/if_xennet_xenbus.c
cvs rdiff -u -r1.52.4.1 -r1.52.4.2 src/sys/arch/xen/xen/xennetback_xenbus.c
cvs rdiff -u -r1.90 -r1.90.2.1 src/sys/net/if_bridge.c
cvs rdiff -u -r1.64 -r1.64.2.1 src/sys/net/if_ether.h
cvs rdiff -u -r1.204.2.1 -r1.204.2.2 src/sys/net/if_ethersubr.c
cvs rdiff -u -r1.70.2.4 -r1.70.2.5 src/sys/net/if_vlan.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted:
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.