NetBSD Problem Report #56455

From www@netbsd.org  Fri Oct 15 05:51:36 2021
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id E70D91A9FC7
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 15 Oct 2021 05:51:35 +0000 (UTC)
Message-Id: <20211015032440.3EA3F1A923B@mollari.NetBSD.org>
Date: Fri, 15 Oct 2021 03:24:40 +0000 (UTC)
From: s-yamaguchi@iij.ad.jp
Reply-To: s-yamaguchi@iij.ad.jp
To: gnats-bugs@NetBSD.org
Subject: panic in if_get_bylla when if_set_sadl is called
X-Send-Pr-Version: www-1.0

>Number:         56455
>Category:       kern
>Synopsis:       panic in if_get_bylla when if_set_sadl is called
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Oct 15 05:55:00 +0000 2021
>Originator:     Shoichi Yamaguchi
>Release:        NetBSD 9.99.91
>Organization:
Internet Initiative Japan Inc.
>Environment:
NetBSD hidden 9.99.91 NetBSD 9.99.91 (GENERIC) #12: Fri Oct 15 09:21:21 JST 2021  hidden/obj/sys/arch/amd64/compile/GENERIC amd64

>Description:
[ 478.9688824] fatal page fault in supervisor mode
[ 478.9796964] trap type 6 code 0 rip 0xffffffff80dfccc3 cs 0x8 rflags 0x10246 cr2 0x5 ilevel 0x5 rsp 0xffffb180af52ce30
[ 478.9902694] curlwp 0xffff9aa6c5205480 pid 0.3 lowest kstack 0xffffb180af5282c0
kernel: page fault trap, code=0
Stopped in pid 0.3 (system) at  netbsd:if_get_bylla+0x60:       movzbl  5(%rax),%edx
if_get_bylla() at netbsd:if_get_bylla+0x60
in_arpinput() at netbsd:in_arpinput+0xb5
arpintr() at netbsd:arpintr+0x160
softint_dispatch() at netbsd:softint_dispatch+0xf2
DDB lost frame for netbsd:Xsoftintr+0x4f, trying 0xffffb180af52d0f0
Xsoftintr() at netbsd:Xsoftintr+0x4f
--- interrupt ---
0:
ds          84
es          74
fs          20
gs          70e0
rdi         ffff9aa5562bc016
rsi         ffff9aa555d08523
rbp         ffffb180af52ce80
rbx         6
rdx         6
rcx         e2
rax         0
r8          0
r9          ffff9aa5574a2808
r10         ffff9aa5562bc00e
r11         dead
r12         ffff9aa5562bc016
r13         6
r14         4
r15         ffff9aa5574a2808
rip         ffffffff80dfccc3    if_get_bylla+0x60
cs          8
rflags      10246
rsp         ffffb180af52ce30
ss          10
netbsd:if_get_bylla+0x60:       movzbl  5(%rax),%edx
db{0}> 

if_deactivate_sadl(), called from if_set_sadl(), stores NULL to ifp->if_sadl,
while if_get_bylla() dereferences ifp->if_sadl. Neither access is protected
by a lock or any other form of mutual exclusion.
As a result, the kernel panics when a command that changes the MAC address
(e.g. "ifconfig vlan0 vlan 1 vlanif wm0" or "ifconfig lagg0 laggport wm0")
runs while ARP frames are being received.
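
The register dump fits this reading: rax is 0 and the faulting instruction
loads one byte from 5(%rax), which matches cr2 0x5. The userland sketch below
illustrates the pattern. It is not the kernel code and not a proposed fix. The
names sketch_sdl, sketch_ifnet and sketch_lla_matches are hypothetical, and the
field layout is an assumption modeled on struct sockaddr_dl. The reader takes
one snapshot of the pointer and skips the interface when it is NULL.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct sockaddr_dl (layout is an assumption). */
struct sketch_sdl {
	uint8_t  len;
	uint8_t  family;
	uint16_t index;
	uint8_t  type;
	uint8_t  nlen;		/* interface name length, byte offset 5 */
	uint8_t  alen;		/* link-layer address length */
	uint8_t  slen;
	char     data[24];	/* name bytes followed by address bytes */
};

/* Hypothetical stand-in for the part of struct ifnet that matters here. */
struct sketch_ifnet {
	_Atomic(struct sketch_sdl *) sadl;	/* may be set to NULL by a writer */
};

/*
 * Return 1 if the interface's link-layer address equals lla, else 0.
 * The pointer is loaded exactly once into a local, so a concurrent store
 * of NULL cannot turn a checked pointer into a NULL dereference.
 */
static int
sketch_lla_matches(struct sketch_ifnet *ifp, const uint8_t *lla, size_t len)
{
	struct sketch_sdl *sdl =
	    atomic_load_explicit(&ifp->sadl, memory_order_acquire);

	if (sdl == NULL)
		return 0;	/* address is being replaced; skip this interface */
	if (sdl->alen != len)
		return 0;
	if ((size_t)sdl->nlen + sdl->alen > sizeof(sdl->data))
		return 0;	/* malformed entry */
	return memcmp(sdl->data + sdl->nlen, lla, len) == 0;
}
```

A NULL check alone only avoids the NULL dereference shown in the trace. It
does not keep the old address from being freed while a reader still uses it,
so the kernel would still need proper synchronization between the reader and
the writer.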

>How-To-Repeat:
HOST-A% ifconfig wm0 192.168.0.1/24
HOST-A% arping -q -f 192.168.0.254

TARGET% ifconfig wm0 192.168.0.2/24
TARGET% ifconfig vlan0 create
TARGET% while true
do
    ifconfig vlan0 vlan 1 vlanif wm1
    sleep 0.$((RANDOM % 9 + 1))
    ifconfig vlan0 -vlanif
    sleep 0.$((RANDOM % 9 + 1))
done

----
To make the panic easier to reproduce, I applied the following patch to the TARGET kernel.

index 130ca14235a..41b5c2dea66 100644
--- a/sys/net/if.c
+++ b/sys/net/if.c
@@ -520,8 +520,10 @@ if_alloc_sadl(struct ifnet *ifp)
         * now.  This is useful for interfaces that can change
         * link types, and thus switch link names often.
         */
-       if (ifp->if_sadl != NULL)
+       if (ifp->if_sadl != NULL) {
                if_free_sadl(ifp, 0);
+               delay(10000);
+       }

        ifa = if_dl_create(ifp, &sdl);

>Fix:
