NetBSD Problem Report #55395

From kardel@Kardel.name  Wed Jun 17 15:33:43 2020
Return-Path: <kardel@Kardel.name>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 51F9F1A9244
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 17 Jun 2020 15:33:43 +0000 (UTC)
Message-Id: <20200617153309.9820B44B33@Andromeda.Kardel.name>
Date: Wed, 17 Jun 2020 17:33:09 +0200 (CEST)
From: kardel@netbsd.org
Reply-To: kardel@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: panic: locking against myself (interface, bridges, vlans)
X-Send-Pr-Version: 3.95

>Number:         55395
>Category:       kern
>Synopsis:       panic: locking against myself (interface, bridges, vlans)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jun 17 15:35:00 +0000 2020
>Originator:     Frank Kardel
>Release:        NetBSD 9.99.65
>Organization:

>Environment:


System: NetBSD dolomiti.hw.abs.acrys.com 9.99.65 NetBSD 9.99.65 (GENERIC) #2: Tue Jun 9 01:44:28 CEST 2020 kardel@Andromeda:/src/NetBSD/cur/src/obj.amd64/sys/arch/amd64/compile/GENERIC amd64
Architecture: x86_64
Machine: amd64
>Description:
	The following network configuration

	wm0(ip-addr-1)-+-bridge0-xvif1i0(ip-addr-2)
                       |
		       +-vlan10(no ip addr)-bridge1-?
                       |
		       +-vlan20(no ip addr)-bridge2-?
                       |
		       \-vlan30(no ip addr)-bridge3-?

	leads to a panic:
	      System panicked: lock error: Mutex: mutex_vector_enter,543: locking against myself: lock 0xffffc5066b824080 cpu 1 lwp 0xffffc499ddef6600
	      Backtrace from time of crash is available.
	      crash> bt
	      _KERNEL_OPT_NARCNET() at 0
	      ostype() at ffffffff8143717b
	      sys_reboot() at sys_reboot
	      vpanic() at vpanic+0x15b
	      snprintf() at snprintf
	      lockdebug_abort() at lockdebug_abort+0xd3
	      mutex_vector_enter() at mutex_vector_enter+0x402
	      bridge_input() at bridge_input+0xb11
	      vlan_input() at vlan_input+0x102
	      ether_input() at ether_input+0x4ce
	      bridge_input() at bridge_input+0xb33
	      if_percpuq_softint() at if_percpuq_softint+0x90
	      softint_dispatch() at softint_dispatch+0x2d1
	      DDB lost frame for Xsoftintr+0x4f, trying 0xffff9323fbea40f0
	      Xsoftintr() at Xsoftintr+0x4f
	      --- interrupt ---

	The panic can be avoided by using a second interface, as in the following configuration:

	wm0(ip-addr-1)-bridge0-xvif1i0(ip-addr-2)

	wm1(no-ip-addr)-+-vlan10(no ip addr)-bridge1-?
                        |
		        +-vlan20(no ip addr)-bridge2-?
                        |
		        \-vlan30(no ip addr)-bridge3-?

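	Reading the code (if_bridge.c), I see that softnet_lock is held while ether_input is
	called. When ether_input encounters another bridge, the bridge code attempts to acquire
	softnet_lock again and panics. This matches the backtrace above, where bridge_input
	calls ether_input, then vlan_input, then bridge_input a second time.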
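
	The re-entry described above can be reproduced in miniature in userland. The
	following is only a sketch under stated assumptions, not kernel code: it uses a
	POSIX error-checking mutex in place of the kernel mutex, and every name ending
	in _sketch is hypothetical. A second lock attempt by the same thread is reported
	as EDEADLK, where the kernel panics instead.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical userland stand-in for the kernel's softnet_lock. */
static pthread_mutex_t softnet_lock_sketch;

/* Error returned by the nested lock attempt, or 0 if none failed. */
static int nested_lock_error;

static void bridge_input_sketch(int depth);

/* Stand-in for ether_input() -> vlan_input(): the frame reaches a second bridge. */
static void ether_input_sketch(int depth)
{
	bridge_input_sketch(depth + 1);
}

/* Stand-in for bridge_input(): takes the lock, then passes the frame up. */
static void bridge_input_sketch(int depth)
{
	int error = pthread_mutex_lock(&softnet_lock_sketch);

	if (error != 0) {
		/* The kernel panics at this point; the sketch only records the error. */
		nested_lock_error = error;
		return;
	}
	if (depth == 0)
		ether_input_sketch(depth);	/* called with the lock still held */
	pthread_mutex_unlock(&softnet_lock_sketch);
}

/* Runs the nested call chain once and returns the nested lock error. */
int run_reentry_sketch(void)
{
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	/* An error-checking mutex detects relocking by the owning thread. */
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&softnet_lock_sketch, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);

	nested_lock_error = 0;
	bridge_input_sketch(0);

	pthread_mutex_destroy(&softnet_lock_sketch);
	(void)rc;
	return nested_lock_error;
}
```

	Calling run_reentry_sketch() from a small main() and comparing the result with
	EDEADLK shows the same self-deadlock condition that the kernel reports as
	"locking against myself".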

>How-To-Repeat:
	Configure the interfaces as in the first example above and wait for the panic.
>Fix:
	re-work locking / decouple packet processing?
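
	One possible reading of "decouple packet processing" is sketched below, again in
	userland and only as an illustration: rather than calling the next input routine
	while the lock is held, the packet is put on a caller-owned queue and processed
	after the lock has been released. All names (pkt_sketch, defer_queue_sketch,
	bridge_input_deferred and so on) are assumptions made for this example and do not
	exist in if_bridge.c.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEFER_CAP_SKETCH 8

/* Hypothetical packet: only records whether a VLAN tag is still present. */
struct pkt_sketch {
	int vlan_tagged;
};

/* Hypothetical caller-owned queue of packets awaiting a second pass. */
struct defer_queue_sketch {
	struct pkt_sketch *items[DEFER_CAP_SKETCH];
	size_t len;
};

static pthread_mutex_t input_lock_sketch;
static int lock_errors;	/* failed lock attempts (nested locking) */
static int delivered;	/* packets that completed processing */
static int dropped;	/* packets dropped because the queue was full */

/* Handle one packet under the lock; defer further processing, never recurse. */
static void bridge_input_deferred(struct pkt_sketch *p, struct defer_queue_sketch *q)
{
	if (pthread_mutex_lock(&input_lock_sketch) != 0) {
		lock_errors++;
		return;
	}
	if (p->vlan_tagged) {
		if (q->len < DEFER_CAP_SKETCH) {
			p->vlan_tagged = 0;		/* tag stripped for the next pass */
			q->items[q->len++] = p;
		} else {
			dropped++;
		}
	} else {
		delivered++;
	}
	pthread_mutex_unlock(&input_lock_sketch);
}

/* Feed one tagged packet through both passes; returns the lock error count. */
int run_deferred_sketch(void)
{
	pthread_mutexattr_t attr;
	struct defer_queue_sketch q = { .len = 0 };
	struct pkt_sketch p = { .vlan_tagged = 1 };
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&input_lock_sketch, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);

	lock_errors = delivered = dropped = 0;
	bridge_input_deferred(&p, &q);

	/* Drain the queue with the lock released, so each pass locks only once. */
	while (q.len > 0)
		bridge_input_deferred(q.items[--q.len], &q);

	pthread_mutex_destroy(&input_lock_sketch);
	(void)rc;
	return lock_errors;
}
```

	In this sketch the tagged packet takes two passes, each acquiring the lock
	exactly once, so the error-checking mutex never sees a nested acquisition.
	Whether a comparable deferral is practical in the kernel's input path is an
	open question that the sketch does not answer.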

>Unformatted:
