NetBSD Problem Report #54189
From ozaki-r@netbsd.org Fri May 10 09:22:58 2019
Return-Path: <ozaki-r@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 0ADBB7A16F
for <gnats-bugs@gnats.NetBSD.org>; Fri, 10 May 2019 09:22:58 +0000 (UTC)
Message-Id: <20190510092256.B03457A1D2@mollari.NetBSD.org>
Date: Fri, 10 May 2019 09:22:56 +0000 (UTC)
From: ozaki-r@netbsd.org
Reply-To: ozaki-r@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: IFNET_LOCK for if_mcast_op can cause a deadlock
X-Send-Pr-Version: 3.95
>Number: 54189
>Category: kern
>Synopsis: IFNET_LOCK for if_mcast_op can cause a deadlock
>Confidential: no
>Severity: critical
>Priority: medium
>Responsible: kern-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri May 10 09:25:00 +0000 2019
>Closed-Date: Wed Apr 17 21:56:42 +0000 2024
>Last-Modified: Wed Apr 17 21:56:42 +0000 2024
>Originator: Ryota Ozaki
>Release: current and 8
>Organization:
>Environment:
System: NetBSD rangeley 8.99.38 NetBSD 8.99.38 (RANGELEY) #2: Fri May 10 13:37:40 JST 2019 ozaki-r@rangeley:(hidden) amd64
Architecture: x86_64
Machine: amd64
>Description:
From src/doc/TODO.smpnet:
To avoid data race on if_flags it should be protected by a lock (currently it's
IFNET_LOCK). (snip)
IFF_ALLMULTI can be set/unset via if_mcast_op. To protect updates of the flag,
we had added IFNET_LOCK around if_mcast_op. However that was not a good
approach because if_mcast_op is typically called in the middle of a call path
and holding IFNET_LOCK such places is problematic. Actually a deadlock is
observed. Probably we should remove IFNET_LOCK and manage IFF_ALLMULTI
somewhere other than if_flags, for example ethercom or driver itself (or a
common driver framework once it appears). Such a change is feasible because
IFF_ALLMULTI is only set/unset by a driver and not accessed from any common
components such as network protocols.
The resource dependency graph of a deadlock I encountered is like this:
Thread A: IFNET_LOCK => psref_target_destroy => softint
Thread B: softint => softnet_lock
Thread C: softnet_lock => IFNET_LOCK (for if_mcast_op)
Note that the deadlock doesn't occur if NET_MPSAFE is NOT enabled because
psref_target_destroy is called only if NET_MPSAFE is enabled.
>How-To-Repeat:
Run userspace processes below concurrently while receive some packets destinating to the host, then wait several hours:
- a process adding/removing an IP address on an interface, and
- a process configuring/unconfiguring multicast groups on the same interface.
>Fix:
One solution is to remove IFNET_LOCK for if_mcat_op as described above and
protect operations of IFF_ALLMULTI in another way.
The below patch introduces a new field to struct ethercom (ec_flags) and store
the IFF_ALLMULTI flag to it (as ETHER_F_ALLMULTI). So we can use ETHER_LOCK
to protect operations of the flag instead of IFNET_LOCK.
https://www.netbsd.org/~ozaki-r/allmulti.diff
One known issue of the approach is that it can't be applied to netbsd-8 because
the change breaks ABI IIUC.
>Release-Note:
>Audit-Trail:
From: "Ryota Ozaki" <ozaki-r@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54189 CVS commit: src/sys/dev/pci
Date: Tue, 14 May 2019 09:43:55 +0000
Module Name: src
Committed By: ozaki-r
Date: Tue May 14 09:43:55 UTC 2019
Modified Files:
src/sys/dev/pci: if_wm.c
src/sys/dev/pci/ixgbe: ixgbe.c
Log Message:
Remove unnecessary checks of IFF_ALLMULTI
IFF_ALLMULTI is changed by only driver itself, so we don't need to check its
change on ec_ifflags_cb.
This is part of PR kern/54189. NFCI.
To generate a diff of this commit:
cvs rdiff -u -r1.634 -r1.635 src/sys/dev/pci/if_wm.c
cvs rdiff -u -r1.181 -r1.182 src/sys/dev/pci/ixgbe/ixgbe.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Ryota Ozaki" <ozaki-r@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54189 CVS commit: src/sys
Date: Wed, 15 May 2019 02:56:48 +0000
Module Name: src
Committed By: ozaki-r
Date: Wed May 15 02:56:48 UTC 2019
Modified Files:
src/sys/dev/ic: dwc_gmac.c
src/sys/dev/pci: if_wm.c
src/sys/dev/pci/ixgbe: ixgbe.c
src/sys/net: if.c if_ether.h if_ethersubr.c
Log Message:
Store IFF_ALLMULTI in ec_flags instead of if_flags to avoid data races
IFF_ALLMULTI is set/unset to if_flags via if_mcast_op. To avoid data races on
if_flags, IFNET_LOCK was added for if_mcast_op. Unfortunately it produces
a deadlock so we want to remove added IFNET_LOCK by avoiding the data races by
another approach.
This fix introduces ec_flags to struct ethercom and stores IFF_ALLMULTI to it.
ec_flags is protected by ETHER_LOCK and thus IFNET_LOCK is no longer necessary
for if_mcast_op. Note that the fix is applied only to MP-safe drivers that
the data races matter.
In the kernel, IFF_ALLMULTI is set by a driver and used by the driver itself.
So changing the storing place doesn't break anything. One exception is
ioctl(SIOCGIFFLAGS); we have to include IFF_ALLMULTI in a result if needed to
export the flag as well as before.
A upcoming commit will remove IFNET_LOCK.
PR kern/54189
To generate a diff of this commit:
cvs rdiff -u -r1.59 -r1.60 src/sys/dev/ic/dwc_gmac.c
cvs rdiff -u -r1.635 -r1.636 src/sys/dev/pci/if_wm.c
cvs rdiff -u -r1.182 -r1.183 src/sys/dev/pci/ixgbe/ixgbe.c
cvs rdiff -u -r1.451 -r1.452 src/sys/net/if.c
cvs rdiff -u -r1.77 -r1.78 src/sys/net/if_ether.h
cvs rdiff -u -r1.273 -r1.274 src/sys/net/if_ethersubr.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Ryota Ozaki" <ozaki-r@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/54189 CVS commit: src/sys
Date: Wed, 15 May 2019 02:59:19 +0000
Module Name: src
Committed By: ozaki-r
Date: Wed May 15 02:59:19 UTC 2019
Modified Files:
src/sys/net: if_vlan.c
src/sys/netinet: in_pcb.c ip_output.c
src/sys/netinet6: in6_pcb.c ip6_output.c
Log Message:
Get rid of IFNET_LOCK for if_mcast_op to avoid a deadlock
The IFNET_LOCK was added to avoid data races on if_flags for IFF_ALLMULTI.
Unfortunatetly it caused a deadlock instead. A known scenario causing a
deadlock is to occur the following two operations concurrently: (a) a removal of
an IP adddres assigned to an interface and (b) a manipulation of multicast
groups to the interface. The resource dependency graph is like this:
softnet_lock => IFNET_LOCK => psref_target_destroy => softint => softnet_lock
Thanks to the previous commit that avoids data races on if_flags for
IFF_ALLMULTI by another approach, we can remove IFNET_LOCK and defuse the
deadlock.
PR kern/54189
To generate a diff of this commit:
cvs rdiff -u -r1.135 -r1.136 src/sys/net/if_vlan.c
cvs rdiff -u -r1.182 -r1.183 src/sys/netinet/in_pcb.c
cvs rdiff -u -r1.311 -r1.312 src/sys/netinet/ip_output.c
cvs rdiff -u -r1.165 -r1.166 src/sys/netinet6/in6_pcb.c
cvs rdiff -u -r1.219 -r1.220 src/sys/netinet6/ip6_output.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
State-Changed-From-To: open->closed
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Wed, 17 Apr 2024 21:56:42 +0000
State-Changed-Why:
Unless you want to pull this up to 8, in some way that doesn't break
kernel ABI, I think we can consider this closed?
(If there are other drivers that need updates, let's file separate PRs
for each driver to track pullups.)
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.47 2022/09/11 19:34:41 kim Exp $
$NetBSD: gnats_config.sh,v 1.9 2014/08/02 14:16:04 spz Exp $
Copyright © 1994-2024
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.