NetBSD Problem Report #56844

From ocb@dc.localdomain  Wed May 18 13:55:37 2022
Return-Path: <ocb@dc.localdomain>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 9E0771A921F
	for <gnats-bugs@gnats.NetBSD.org>; Wed, 18 May 2022 13:55:37 +0000 (UTC)
Message-Id: <20220518123749.4DD40272918@dc.localdomain>
Date: Wed, 18 May 2022 14:37:49 +0200 (CEST)
From: ocb@dc.localdomain
Reply-To: ocb@dc.localdomain
To: gnats-bugs@NetBSD.org
Subject: delete auto-modified network route crash 
X-Send-Pr-Version: 3.95

>Number:         56844
>Category:       port-amd64
>Synopsis:       delete auto-modified network route crash
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    port-amd64-maintainer
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed May 18 14:00:00 +0000 2022
>Last-Modified:  Wed Feb 22 19:00:02 +0000 2023
>Originator:     ocb@l25.fi
>Release:        NetBSD 9.2
>Organization:

>Environment:


System: NetBSD 9.2 (KERNEL1_KASLR) #0: Fri Jan 21 03:09:17 UTC 2022 kernel@dc:/usr/kernel/usr/src/sys/arch/amd64/compile/KERNEL1_KASLR amd64
Architecture: x86_64
Machine: amd64
>Description:
	The system uses 192.168.88.2 as its default gateway; that host sits behind the router at 192.168.88.1. Connectivity works as expected. To let some additional virtual machines on a different subnet reach 192.168.88.2, I added a host route with 'route add 192.168.88.2 192.168.88.1'. Connectivity still works as expected.

dc# route -n show  | grep 192.168.88.2
default            192.168.88.2       UG          -        -      -  wm0
192.168.88/24      link#1             U           -        -      -  wm0
192.168.88.2       192.168.88.1       UGH         -        -      -  wm0
192.168.88.252     link#1             UHl         -        -      -  lo0
192.168.88.2       4c:5e:0c:b7:7e:08  UH          -        -      -  wm0

When I then connect to the SSH port on 192.168.88.2, the recently added route gets modified:

dc# route -n show  | grep 192.168.88.2
default            192.168.88.2       UG          -        -      -  wm0
192.168.88/24      link#1             U           -        -      -  wm0
192.168.88.2       192.168.88.2       UGH         -        -      -  wm0 <-- note this line
192.168.88.252     link#1             UHl         -        -      -  lo0
192.168.88.2       4c:5e:0c:b7:7e:08  UH          -        -      -  wm0

Finally, removing this route with 'route delete 192.168.88.2 192.168.88.2' leaves a machine running the KASLR kernel barely usable. X remains visually responsive, and terminal emulators that are already running can start commands, but those commands hang and cannot be paused or killed. New terminal emulators cannot be started. Changing the active virtual console finally crashes X and the kernel. I am not experienced with crashes, but it looks as if memory is slowly being corrupted. The kernel crash dump is saved to swap successfully, but cannot be retrieved from swap after reboot. Note that saving and retrieving crash dumps works fine on a non-KASLR kernel, but not in this situation, since..

On a GENERIC (non-KASLR) kernel, the situation is different. After the 'route delete' command is executed, the system freezes instantly and requires a hard power-off with the power button. There is no backtrace and no crash log.

>How-To-Repeat:

>Fix:


>Release-Note:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-amd64/56844: delete auto-modified network route crash
Date: Wed, 18 May 2022 16:10:39 +0200

 On Wed, May 18, 2022 at 02:00:00PM +0000, ocb@dc.localdomain wrote:

 > 	The system uses IP .2 as default gateway, where IP .2 is behind a router IP .1. Connection works, everything is ok. To allow some additional virtual machines from different subnet access IP .2, a route 'route add 192.168.88.2 192.168.88.1' is added.

 How can 192.168.88.2 be behind a router if you have the full /24 on wm0?

 > 192.168.88/24      link#1             U           -        -      -  wm0

 I don't understand how you intend your setup to work.

 Martin

From: ocb@l25.fi
To: gnats-bugs@NetBSD.org
Cc: martin@duskware.de
Subject: Re: port-amd64/56844
Date: Wed, 18 May 2022 19:35:24 +0000

 Excuse me for not being clear.

 All devices named below are physical devices.

 192.168.88.1 - mikrotik router
 192.168.88.2 - mikrotik antenna connected to PoE port 10 on the router
 192.168.88.252 - netbsd workstation connected to port 6 on the router

 Network configuration was made according to ifconfig.if(5), and all network connectivity with local devices and remote hosts works as expected.

 dc$ cat /etc/ifconfig.wm0
 up
 inet 192.168.88.252 netmask 255.255.255.0
 !route add 192.168.88.2 192.168.88.1

 dc$ cat /etc/rc.conf | grep -E 'defaultroute'
 defaultroute="192.168.88.2"

 dc$ ifconfig wm0
 wm0: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
         capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
         capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
         capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
         enabled=0
         ec_capabilities=17<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,EEE>
         ec_enabled=2<VLAN_HWTAGGING>
         address: 3c:97:0e:4e:db:de
         media: Ethernet none (none)
         inet 192.168.88.252/24 broadcast 192.168.88.255 flags 0x0
         inet6 fe80::3e97:eff:fe4e:dbde%wm0/64 flags 0x0 scopeid 0x1

 The reason I add the '192.168.88.2 192.168.88.1' route is so that a tap interface (behind 192.168.88.252, which is acting as hypervisor) on network '192.168.50.0/30' can reach '192.168.88.2'.

 # commands before hang

 dc# route delete 192.168.88.2 192.168.88.1
 delete host 192.168.88.2: gateway 192.168.88.1
 dc# route add 192.168.88.2 192.168.88.1
 add host 192.168.88.2: gateway 192.168.88.1
 dc# fping -l -A 192.168.88.2
 192.168.88.2 : [0], 84 bytes, 0.76 ms (0.76 avg, 0% loss)
 ICMP Redirect from  for ICMP Echo sent to 192.168.88.2
 ##### this does not always happen; it usually happens once the route has been active for more than 10 minutes.
 dc# route delete 192.168.88.2 192.168.88.2
 delete host 192.168.88.2: gateway 192.168.88.2
 dc# route add 192.168.88.2 192.168.88.1
 add host 192.168.88.2: gateway 192.168.88.1
 dc# fping -l -A 192.168.88.2
 192.168.88.2 : [0], 84 bytes, 0.76 ms (0.76 avg, 0% loss)
 ICMP Redirect from  for ICMP Echo sent to 192.168.88.2
 dc# route -n show | grep 192.168.88
 default            192.168.88.2       UG          -        -      -  wm0
 192.168.88/24      link#1             U           -        -      -  wm0
 192.168.88.2       192.168.88.2       UGH         -        -      -  wm0
 192.168.88.252     link#1             UHl         -        -      -  lo0
 192.168.88.1       b8:69:f4:db:8d:07  UH          -        -      -  wm0
 192.168.88.2       4c:5e:0c:b7:7e:08  UH          -        -      -  wm0
 ##### away for lunch
 dc# route delete 192.168.88.2 192.168.88.2
 <HANG>

 The system starts acting up: typed keys are shown with a delay or not at all, network connections are dead, and network-related commands hang. At random times typing starts working normally and then breaks again. Trying to attach gdb to the hanging route process also hangs.

 Finally, changing the virtual console or killing X crashes or freezes the system.

 It takes multiple tries to reproduce; I was able to reproduce it 10+ times. With the KASLR kernel it either hangs when I try to change the virtual console or kill X, or the kernel crashes after X is killed. A couple of times I received a backtrace that stops at ':breakpoint'. With the GENERIC (non-KASLR) kernel it freezes immediately and requires a hard reset.

 Thanks for your time, will look further into this.

From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: ocb@l25.fi
Subject: Re: port-amd64/56844
Date: Thu, 19 May 2022 22:11:01 +0700

     Date:        Wed, 18 May 2022 21:15:01 +0000 (UTC)
     From:        ocb@l25.fi
     Message-ID:  <20220518211502.0136F1A923A@mollari.NetBSD.org>

 First, no matter how badly you misconfigure the
 network, NetBSD should not hang/crash, so it looks
 as if something should be investigated here, and fixed.

 But you should be easily able to avoid the issue
 by configuring things properly.

   |  All devices named below are physical devices.
   |  
   |  192.168.88.1 - mikrotik router
   |  192.168.88.2 - mikrotik antenna connected to PoE port 10 on the router

 If the router is really a router, and not a switch (or a switch
 with some routing abilities) then that config is wrong.  In IP
 networking, each link should have its own network number.
 That is, for a device on each link, the IP addr anded with
 the netmask should produce different answers for each link.
 The router interfaces should have an address on each link
 (some point to point links can avoid having network addresses
 at all, but that is a different issue).   In the cases where
 this is not possible, the router must fake it, by pretending
 to be all the devices on one link to devices on the other
 link, and forwarding packets between them (which does not
 work for all protocols).

 Routes must always name a destination on the same link as
 the sending device (the one with the route).  IP uses hop
 by hop routing, all anything ever does is send packets one
 hop closer to the destination.   It makes no sense to
 specify a next hop for a route which is not on the same
 link as the sender, at best it would mean that every packet
 sent would need to lookup the first route, and then upon
 discovering that the destination is not local (the next
 hop must be a destination we can send to directly) look
 up another route to see where to send instead of where the
 first route directed things.  Better to avoid doing that
 for every packet, and instead do it once when installing
 the route.

 So, in this case the default route should be 192.168.88.1


 If the "router" is really a switch, then it has essentially
 no role in IP level packet forwarding at all.  It can have
 an address (usually does) for admin/stats/... which is best
 viewed as a host connected to the switch.

 In that case the default route should be to 192.168.88.2
 and 192.168.88.1 should never be used for anything related
 to routing.

 Some devices are both switch and router, those are best
 considered as 2 separate connected devices, if you're on the
 same logical network as the destination you are using the
 switch and the router is not involved.  If you are using
 the router the destination should have a different network
 number, and the route to that network will reference the
 router's address on our network.

   |  Network configuration was made according to ifconfig.if(5),
   | and all network connectivity with local devices and remote hosts
   | works as expected.

 IP networking can be remarkably resilient.   Lots of things
 often seem to work, when perhaps they shouldn't.


   |  dc$ cat /etc/ifconfig.wm0
   |  up
   |  inet 192.168.88.252 netmask 255.255.255.0
   |  !route add 192.168.88.2 192.168.88.1

 that route with that netmask can never be correct.
 If we know how to reach ...1 (which that implies
 we must) then we know how to reach ....2 as that
 is on the same link.

 If it isn't on the same link its addr should be changed,
 or the router must do proxy arp for it.

   |  dc$ cat /etc/rc.conf | grep -E 'defaultroute'
   |  defaultroute="192.168.88.2"

 That is fine if ....2 is on the same link as the
 host, and wrong otherwise.   The next hop must be
 on the same link.  Always.  Otherwise it is not a
 "next" hop, but something later.


   |  The reason I add '192.168.88.2 192.168.88.1' route is so
   |  that a tap interface (behind 192.168.88.252 which is acting
   |  as hypervisor) from network '192.168.50.0/30' can reach
   |  '192.168.88.2'

 Things on 192.168.50/24 should reach devices (any devices)
 on 192.168.88/24 via a route to the router on their local
 link, which will be 192.168.50.xx and be the address on
 the netbsd (base) host for that network.  For this the
 netbsd system is the router (anything which forwards
 packets from one IP network to another is a router).

   |  ICMP Redirect from  for ICMP Echo sent to 192.168.88.2

 That suggests to me that the "router" (192.168.88.1) is
 being sent a packet for 192.168.88.2 and it is telling
 you to stop bothering it, send directly to 192.168.88.2
 instead.   That suggests the "router" is acting as a switch,
 not a router (ignore what vendors call their products,
 they call them whatever they think sells best).

 NetBSD is probably reacting to that by adjusting the
 route destination from ...1 (you told it) to ...2
 (which ...1 told it).

 Now we have a route pointing to itself, which is a perfect
 candidate for infinite loops.

 Get the networking config correct, and all these issues will
 simply vanish.

 NetBSD probably could do with some improvements, but avoiding
 misconfig errors by the root user typically simply reduces
 flexibility.

 kre

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, ocb@l25.fi
Subject: Re: port-amd64/56844: delete auto-modified network route crash
Date: Thu, 19 May 2022 20:49:36 +0000

 > Date: Wed, 18 May 2022 14:00:00 +0000 (UTC)
 > From: ocb@dc.localdomain
 > 
 > Finally if we try to remove this route by executing 'route delete
 > 192.168.88.2 192.168.88.2', machine running KASLR kernel will be
 > very limited, X continues to be visually responsive, already running
 > terminal emulators can run commands but will hang and can not be
 > paused or killed. New terminal emulators can not be started.
 > Changing active virtual console will finally crash X and kernel. I
 > am not experienced with crashes but this appears like memory is
 > slowly getting corrupted. Kernel crash dump successfully gets saved
 > to swap, but upon reboot is unable to retrieve it from swap. Note,
 > crash dump saving and retrieving works fine on non-kaslr kernel, but
 > not in this situation, since..

 Do you have a serial console, or a working network interface where you
 can redirect output via netcat?

 Can you start crash(8) running before triggering this (with output to
 serial console or netcat), and, when it starts to hang, type `ps' at
 crash(8)?  And, if any threads listed there are in `tstile' or are
 marked with a `>' on the line, can you type `bt/a 0xffff...' at
 crash(8), where 0xffff... is the lwp address shown in ps output?

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-amd64/56844: delete auto-modified network route crash
Date: Tue, 12 Jul 2022 07:10:00 +0000

 On Thu, May 19, 2022 at 08:50:02PM +0000, Taylor R Campbell wrote:
  >  Do you have a serial console, or a working network interface where you
  >  can redirect output via netcat?
  >  
  >  Can you start crash(8) running before triggering this (with output to
  >  serial console or netcat), and, when it starts to hang, type `ps' at
  >  crash(8)?  And, if any threads listed there are in `tstile' or are
  >  marked with a `>' on the line, can you type `bt/a 0xffff...' at
  >  crash(8), where 0xffff... is the lwp address shown in ps output?

 The originator hasn't replied and their email address is invalid, so
 setting the PR to feedback is useless and we aren't going to get
 answers.

 If we can reproduce the crashes, we should fix them; otherwise, close
 the PR.

 -- 
 David A. Holland
 dholland@netbsd.org

From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: ocb@l25.fi, ozaki-r@NetBSD.org
Subject: Re: kern/56844: delete auto-modified network route crash
Date: Sat, 3 Dec 2022 02:36:47 +0000

 There's some logic in rt_free to defer the freeing action to workqueue
 if the caller is in softint, presumably because cv_wait tripped an
 assertion that forbids sleeping in softint context:

 void
 rt_free(struct rtentry *rt)
 {

         KASSERTMSG(rt->rt_refcnt > 0, "rt_refcnt=%d", rt->rt_refcnt);
         if (rt_wait_ok()) {
                 atomic_dec_uint(&rt->rt_refcnt);
                 _rt_free(rt);
                 return;
         }

         mutex_enter(&rt_free_global.lock);
         /* No need to add a reference here. */
         SLIST_INSERT_HEAD(&rt_free_global.queue, rt, rt_free);
         if (!rt_free_global.enqueued) {
                 workqueue_enqueue(rt_free_global.wq, &rt_free_global.wk, NULL);
                 rt_free_global.enqueued = true;
         }
         mutex_exit(&rt_free_global.lock);
 }

 Unfortunately, this doesn't work.  It appears that some lock is held
 around the rt_free and cv_wait (probably softnet_lock), and that lock
 is taken in softint context, so cv_wait under it is forbidden too --
 but there's no assertion to catch it, so _most_ of the time this code
 gets away with it.  That is, until someone hits a softint deadlock.

 I think for now rt_wait_ok should be made to always return false, but
 this logic needs some more thought to ensure starvation won't happen.

From: ocb@l25.fi
To: gnats-bugs@NetBSD.org, riastradh@NetBSD.org
Cc: ocb@l25.fi, ozaki-r@NetBSD.org
Subject: Re: kern/56844: delete auto-modified network route crash
Date: Sat,  3 Dec 2022 06:37:27 +0100 (CET)

 The requested information is below.

 fatal breakpoint trap in supervisor mode
 [ 1082.4798295] trap type 1 code 0 rip 0xffffffff80235315 cs 0x8 rflags 0x202 cr2 0x7acf3de6a660 ilevel 0x8 rsp 0xffffa70127c1cdc8
 [ 1082.4798295] curlwp 0xffff9f35f87fd040 pid 0.2 lowest kstack 0xffffa70127c182c0
 Stopped in pid 0.2 (system) at  netbsd:breakpoint+0x5:  leave
 breakpoint() at netbsd:breakpoint+0x5
 comintr() at netbsd:comintr+0x7e0
 intr_kdtrace_wrapper() at netbsd:intr_kdtrace_wrapper+0x26
 Xhandle_ioapic_edge1() at netbsd:Xhandle_ioapic_edge1+0x75
 --- interrupt ---
 x86_stihlt() at netbsd:x86_stihlt+0x6
 acpicpu_cstate_idle() at netbsd:acpicpu_cstate_idle+0x19a
 idle_loop() at netbsd:idle_loop+0x14c
 ds          a9c0
 es          cdc8
 fs          1cc9
 gs          cdc8
 rdi         ffffffff818450a0    x86_io
 rsi         800
 rbp         ffffa70127c1cdc8
 rbx         ffffa70007dd718a
 rdx         7f
 rcx         2b
 rax         1
 r8          19985
 r9          20
 r10         0
 r11         0
 r12         ffff9f35c120c4d0
 r13         800
 r14         cc
 r15         ffff9f35c120c400
 rip         ffffffff80235315    breakpoint+0x5
 cs          8
 rflags      202
 rsp         ffffa70127c1cdc8
 ss          10
 netbsd:breakpoint+0x5:  leave


 db{0}> ps
 PID    LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
 1207  1207 3   0         0   ffff9f35f6e06600              route rtentry
 1195  1195 3   3       180   ffff9f35f679b200                 sh wait
 1187  1187 3   0       180   ffff9f35f7e4f540              login wait
 1188  1188 3   2       180   ffff9f35f7d65580               cron nanoslp
 1277  1277 3   3       180   ffff9f35f6e061c0              inetd kqueue
 954    954 3   3       180   ffff9f35f7190a00             powerd kqueue
 639    639 3   1       180   ffff9f35f7190180            syslogd kqueue
 1        1 3   2       180   ffff9f35f7b82080               init wait
 0      178 3   2       200   ffff9f35f7dba0c0            physiod physiod
 0      222 3   1       200   ffff9f35f7e4f100          pooldrain pooldrain
 0      221 3   1       200   ffff9f35f7dba940            ioflush syncer
 0      220 3   2       200   ffff9f35f7dba500           pgdaemon pgdaemon
 0      217 3   3       200   ffff9f35f7b824c0          swwreboot swwreboot
 0      216 3   1       200   ffff9f35f7bce600             npfgc0 npfgcw
 0      215 3   2       200   ffff9f35f7b658c0            rt_free rt_free
 0      214 3   2       200   ffff9f35f7b65480              unpgc unpgc
 0      213 3   1       200   ffff9f35f7b65040    key_timehandler key_timehandler
 0      212 3   3       200   ffff9f35f7b2fbc0    icmp6_wqinput/3 icmp6_wqinput
 0      211 3   2       200   ffff9f35f7b2f780    icmp6_wqinput/2 icmp6_wqinput
 0      210 3   1       200   ffff9f35f7b2f340    icmp6_wqinput/1 icmp6_wqinput
 0      209 3   0       200   ffff9f35f7bf95c0    icmp6_wqinput/0 icmp6_wqinput
 0      208 3   0       200   ffff9f35f7be4140               usb3 usbevt
 0      207 3   0       200   ffff9f35f7be49c0               usb0 usbevt
 0      206 3   2       200   ffff9f35f7be4580               usb1 usbevt
 0      205 3   1       200   ffff9f35f7eb9b80               usb2 usbevt
 0      204 3   0       200   ffff9f35f7eb9740          nd6_timer nd6_timer
 0      203 3   3       200   ffff9f35f7eb9300    carp6_wqinput/3 carp6_wqinput
 0      202 3   2       200   ffff9f35f7ee4b40    carp6_wqinput/2 carp6_wqinput
 0      201 3   1       200   ffff9f35f7ee4700    carp6_wqinput/1 carp6_wqinput
 0      200 3   0       200   ffff9f35f7ee42c0    carp6_wqinput/0 carp6_wqinput
 0      199 3   3       200   ffff9f35f7ecfb00     carp_wqinput/3 carp_wqinput
 0      198 3   2       200   ffff9f35f7ecf6c0     carp_wqinput/2 carp_wqinput
 0      197 3   1       200   ffff9f35f7ecf280     carp_wqinput/1 carp_wqinput
 0      196 3   0       200   ffff9f35f7ebaac0     carp_wqinput/0 carp_wqinput
 0      195 3   3       200   ffff9f35f7eba680     icmp_wqinput/3 icmp_wqinput
 0      194 3   2       200   ffff9f35f7eba240     icmp_wqinput/2 icmp_wqinput
 0      193 3   1       200   ffff9f35f7ba5a80     icmp_wqinput/1 icmp_wqinput
 0      192 3   0       200   ffff9f35f7bf9180     icmp_wqinput/0 icmp_wqinput
 0      185 3   2       200   ffff9f35f7ba5640          atapibus0 sccomp
 0       31 3   3       200   ffff9f35f7ba5200           rt_timer rt_timer
 0       63 3   1       200   ffff9f35f7bcea40        vmem_rehash vmem_rehash
 0      117 3   0       200   ffff9f35c1392980          entbutler entropy
 0      116 3   0       200   ffff9f35c1392540              viomb balloon
 0      115 3   2       200   ffff9f35c1392100         usbtask-dr usbtsk
 0      114 3   1       200   ffff9f35c12ab940         usbtask-hc usbtsk
 0      113 3   3       200   ffff9f35c12ab500           wm0Reset wm0Reset
 0      112 3   3       200   ffff9f35c12ab0c0          wm0TxRx/3 wm0TxRx
 0      111 3   2       200   ffff9f35c13d6900          wm0TxRx/2 wm0TxRx
 0      110 3   1       200   ffff9f35c13d64c0          wm0TxRx/1 wm0TxRx
 0      109 3   0       200   ffff9f35c13d6080          wm0TxRx/0 wm0TxRx
 0      108 3   0       200   ffff9f35c12d08c0            atabus1 atath
 0      107 3   0       200   ffff9f35c12d0480            atabus0 atath
 0      106 3   0       200   ffff9f35c12d0040               pms0 pmsreset
 0      105 3   3       200   ffff9f35c1126bc0            xcall/3 xcall
 0      104 1   3       200   ffff9f35c1126780          softser/3
 0      103 1   3       200   ffff9f35c1126340          softclk/3
 0      102 1   3       200   ffff9f35c1149b80          softbio/3
 0      101 1   3       200   ffff9f35c1149740          softnet/3
 0    > 100 1   3       201   ffff9f35c1149300             idle/3
 0       99 3   2       200   ffff9f35c1069b40            xcall/2 xcall
 0       98 1   2       200   ffff9f35c1069700          softser/2
 0       97 3   2       200   ffff9f35c10692c0          softclk/2 tstile
 0       96 1   2       200   ffff9f35c108cb00          softbio/2
 0       30 1   2       200   ffff9f35c108c6c0          softnet/2
 0    >  29 1   2       201   ffff9f35c108c280             idle/2
 0       28 3   1       200   ffff9f35c0fadac0            xcall/1 xcall
 0       27 1   1       200   ffff9f35c0fad680          softser/1
 0       26 1   1       200   ffff9f35c0fad240          softclk/1
 0       25 1   1       200   ffff9f35c0f9da80          softbio/1
 0       24 1   1       200   ffff9f35c0f9d640          softnet/1
 0    >  23 1   1       201   ffff9f35c0f9d200             idle/1
 0       22 3   0       200   ffff9f35f7f3aa40           lnxsyswq lnxsyswq
 0       21 3   0       200   ffff9f35f7f3a600           lnxubdwq lnxubdwq
 0       20 3   0       200   ffff9f35f7f3a1c0           lnxpwrwq lnxpwrwq
 0       19 3   0       200   ffff9f35f7f4fa00           lnxlngwq lnxlngwq
 0       18 3   0       200   ffff9f35f7f4f5c0           lnxhipwq lnxhipwq
 0       17 3   0       200   ffff9f35f7f4f180           lnxrcugc lnxrcugc
 0       16 3   0       200   ffff9f35f7f629c0             sysmon smtaskq
 0       15 3   0       200   ffff9f35f7f62580         pmfsuspend pmfsuspend
 0       14 3   0       200   ffff9f35f7f62140           pmfevent pmfevent
 0       13 3   0       200   ffff9f35f7f77980         sopendfree sopendfr
 0       12 3   2       200   ffff9f35f7f77540             ifwdog ifwdog
 0       11 3   1       200   ffff9f35f7f77100            iflnkst iflnkst
 0       10 3   3       200   ffff9f35f87a2940           nfssilly nfssilly
 0        9 3   0       200   ffff9f35f87a2500             vdrain vdrain
 0        8 3   0       200   ffff9f35f87a20c0          modunload mod_unld
 0        7 3   0       200   ffff9f35f87d3900            xcall/0 xcall
 0        6 1   0       200   ffff9f35f87d34c0          softser/0
 0        5 3   0       200   ffff9f35f87d3080          softclk/0 tstile
 0        4 1   0       200   ffff9f35f87fd8c0          softbio/0
 0        3 1   0       200   ffff9f35f87fd480          softnet/0
 0    >   2 1   0       201   ffff9f35f87fd040             idle/0
 0        0 3   1       200   ffffffff8188a6c0            swapper uvm



 db{0}> ps/w
 PID   LID          COMMAND     EMUL  PRI WAIT-MSG    WAIT-CHANNEL
 1207  1207            route   netbsd   43 rtentry      ffff9f35f7c26298
 1195  1195               sh   netbsd   43 wait         ffff9f35f67c5b98
 1187  1187            login   netbsd   43 wait         ffff9f35f7b7a3d8
 1188  1188             cron   netbsd   43 nanoslp      ffff9f35f7d65580
 1277  1277            inetd   netbsd   43 kqueue       ffff9f35f7c0cfa0
 954    954           powerd   netbsd   43 kqueue       ffff9f35c117a460
 639    639          syslogd   netbsd   43 kqueue       ffff9f35c11fd220
 1        1             init   netbsd   43 wait         ffff9f35f7b7a058
 0      178           system   netbsd  123 physiod      ffff9f35f768c848
 0      222           system   netbsd  125 pooldrain    ffffffff8190f900
 0      221           system   netbsd  124 syncer       ffff9f35f7dba940
 0      220           system   netbsd  126 pgdaemon     ffffffff8190d4c8
 0      217           system   netbsd   43 swwreboot    ffff9f35c142b1c8
 0      216           system   netbsd   96 npfgcw       ffff9f35c11fca08
 0      215           system   netbsd  222 rt_free      ffff9f35f79c1d88
 0      214           system   netbsd   96 unpgc        ffffffff81980a30
 0      213           system   netbsd  222 key_timehandler ffff9f35f79c1c48
 0      212           system   netbsd  222 icmp6_wqinput ffff9f35f79c3f08
 0      211           system   netbsd  222 icmp6_wqinput ffff9f35f79c3ec8
 0      210           system   netbsd  222 icmp6_wqinput ffff9f35f79c3e88
 0      209           system   netbsd  222 icmp6_wqinput ffff9f35f79c3e48
 0      208           system   netbsd   96 usbevt       ffff9f35c127e4b8
 0      207           system   netbsd   96 usbevt       ffffa70007de2478
 0      206           system   netbsd   96 usbevt       ffffa70007de4478
 0      205           system   netbsd   96 usbevt       ffffa70007de6478
 0      204           system   netbsd  222 nd6_timer    ffff9f35f79c19c8
 0      203           system   netbsd  222 carp6_wqinput ffff9f35f7edd508
 0      202           system   netbsd  222 carp6_wqinput ffff9f35f7edd4c8
 0      201           system   netbsd  222 carp6_wqinput ffff9f35f7edd488
 0      200           system   netbsd  222 carp6_wqinput ffff9f35f7edd448
 0      199           system   netbsd  222 carp_wqinput ffff9f35f7edd108
 0      198           system   netbsd  222 carp_wqinput ffff9f35f7edd0c8
 0      197           system   netbsd  222 carp_wqinput ffff9f35f7edd088
 0      196           system   netbsd  222 carp_wqinput ffff9f35f7edd048
 0      195           system   netbsd  222 icmp_wqinput ffff9f35f7f7cd08
 0      194           system   netbsd  222 icmp_wqinput ffff9f35f7f7ccc8
 0      193           system   netbsd  222 icmp_wqinput ffff9f35f7f7cc88
 0      192           system   netbsd  222 icmp_wqinput ffff9f35f7f7cc48
 0      185           system   netbsd   96 sccomp       ffffa70007dd88f8
 0       31           system   netbsd  222 rt_timer     ffff9f35f79c1888
 0       63           system   netbsd  125 vmem_rehash  ffff9f35f79c14c8
 0      117           system   netbsd   43 entropy      ffffffff818b1d28
 0      116           system   netbsd    0 balloon      ffff9f35c1366608
 0      115           system   netbsd   96 usbtsk       ffffffff818d52d8
 0      114           system   netbsd   96 usbtsk       ffffffff818d5298
 0      113           system   netbsd  222 wm0Reset     ffff9f35c10f7988
 0      112           system   netbsd  222 wm0TxRx      ffff9f35f7f7c908
 0      111           system   netbsd  222 wm0TxRx      ffff9f35f7f7c8c8
 0      110           system   netbsd  222 wm0TxRx      ffff9f35f7f7c888
 0      109           system   netbsd  222 wm0TxRx      ffff9f35f7f7c848
 0      108           system   netbsd   96 atath        ffffa70007dd8938
 0      107           system   netbsd   96 atath        ffffa70007dd83c0
 0      106           system   netbsd   96 pmsreset     ffff9f35c121fc94
 0      105           system   netbsd  127 xcall        ffffa70127db9010
 0      104           system   netbsd  223              0
 0      103           system   netbsd  220              0
 0      102           system   netbsd  221              0
 0      101           system   netbsd  222              0
 0    > 100           system   netbsd    0              0
 0       99           system   netbsd  127 xcall        ffffa70127d7c010
 0       98           system   netbsd  223              0
 0       97           system   netbsd  220 tstile       ffff9f35f8a6f080
 0       96           system   netbsd  221              0
 0       30           system   netbsd  222              0
 0    >  29           system   netbsd    0              0
 0       28           system   netbsd  127 xcall        ffffa70127acc010
 0       27           system   netbsd  223              0
 0       26           system   netbsd  220              0
 0       25           system   netbsd  221              0
 0       24           system   netbsd  222              0
 0    >  23           system   netbsd    0              0
 0       22           system   netbsd   43 lnxsyswq     ffff9f35f8a5ec08
 0       21           system   netbsd   43 lnxubdwq     ffff9f35f8a5eb08
 0       20           system   netbsd   43 lnxpwrwq     ffff9f35f8a5ea08
 0       19           system   netbsd   43 lnxlngwq     ffff9f35f8a5e908
 0       18           system   netbsd   43 lnxhipwq     ffff9f35f8a5e808
 0       17           system   netbsd   43 lnxrcugc     ffffffff818b0308
 0       16           system   netbsd   96 smtaskq      ffffffff818f5f60
 0       15           system   netbsd   43 pmfsuspend   ffff9f35f8812808
 0       14           system   netbsd   43 pmfevent     ffff9f35f88126c8
 0       13           system   netbsd   96 sopendfr     ffffffff819809b0
 0       12           system   netbsd  222 ifwdog       ffff9f35f8812588
 0       11           system   netbsd  222 iflnkst      ffff9f35f8812448
 0       10           system   netbsd   43 nfssilly     ffff9f35f8812308
 0        9           system   netbsd  125 vdrain       ffffffff81981bb0
 0        8           system   netbsd  125 mod_unld     ffffffff81973830
 0        7           system   netbsd  127 xcall        ffffffff8183bcd0
 0        6           system   netbsd  223              0
 0        5           system   netbsd  220 tstile       ffff9f35f8a6f080
 0        4           system   netbsd  221              0
 0        3           system   netbsd  222              0
 0    >   2           system   netbsd    0              0
 0        0           system   netbsd  125 uvm          ffffffff8188a6c0


 db{0}> bt/a ffff9f35f6e06600
 trace: pid 1207 lid 1207 at 0xffffa70138749af0
 sleepq_block() at netbsd:sleepq_block+0x13a
 cv_wait() at netbsd:cv_wait+0x49
 _rt_free() at netbsd:_rt_free+0x44
 route_output() at netbsd:route_output+0x4c0
 route_send_wrapper() at netbsd:route_send_wrapper+0x6d
 sosend() at netbsd:sosend+0x944
 soo_write() at netbsd:soo_write+0x2f
 dofilewrite() at netbsd:dofilewrite+0x80
 sys_write() at netbsd:sys_write+0x49
 syscall() at netbsd:syscall+0x196
 --- syscall (number 4) ---
 netbsd:syscall+0x196:


 db{0}> bt/a ffff9f35c10692c0
 trace: pid 0 lid 97 at 0xffffa70127da6e60
 sleepq_block() at netbsd:sleepq_block+0x13a
 turnstile_block() at netbsd:turnstile_block+0x3b8
 mutex_vector_enter() at netbsd:mutex_vector_enter+0x12b
 tcp_slowtimo() at netbsd:tcp_slowtimo+0x10
 callout_softclock() at netbsd:callout_softclock+0xd2
 softint_dispatch() at netbsd:softint_dispatch+0x10b
 DDB lost frame for netbsd:Xsoftintr+0x4c, trying 0xffffa70127da70f0
 Xsoftintr() at netbsd:Xsoftintr+0x4c
 --- interrupt ---
 0:


 db{0}> bt/a ffff9f35f87d3080
 trace: pid 0 lid 5 at 0xffffa70127c7ae40
 sleepq_block() at netbsd:sleepq_block+0x13a
 turnstile_block() at netbsd:turnstile_block+0x3b8
 mutex_vector_enter() at netbsd:mutex_vector_enter+0x12b
 ip_slowtimo() at netbsd:ip_slowtimo+0x10
 pfslowtimo() at netbsd:pfslowtimo+0x34
 callout_softclock() at netbsd:callout_softclock+0xd2
 softint_dispatch() at netbsd:softint_dispatch+0x10b
 DDB lost frame for netbsd:Xsoftintr+0x4c, trying 0xffffa70127c7b0f0
 Xsoftintr() at netbsd:Xsoftintr+0x4c
 --- interrupt ---
 0:


 db{0}> x/Lx ffff9f35f8a6f080
 ffff9f35c10692c0:       ffff9f35f679ba82

 db{0}> bt/a ffff9f35f679ba80
 trace: pid 1214 lid 1214 at 0xffffa70138749af0
 sleepq_block() at netbsd:sleepq_block+0x13a
 cv_wait() at netbsd:cv_wait+0x49
 _rt_free() at netbsd:_rt_free+0x44
 route_output() at netbsd:route_output+0x4c0
 route_send_wrapper() at netbsd:route_send_wrapper+0x6d
 sosend() at netbsd:sosend+0x944
 soo_write() at netbsd:soo_write+0x2f
 dofilewrite() at netbsd:dofilewrite+0x80
 sys_write() at netbsd:sys_write+0x49
 syscall() at netbsd:syscall+0x196
 --- syscall (number 4) ---
 netbsd:syscall+0x196:
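
The address given to the last `bt/a` differs from the word printed by `x/Lx` only in its low bits. A minimal sketch of why, assuming the word read is the owner field of an adaptive mutex and that the low four bits of that field hold flags. The constant names and values below are recalled from NetBSD's kern_mutex.c, not taken from this report, and should be checked against the source tree in use.

```python
# Assumed layout of the adaptive-mutex owner word (recalled from NetBSD's
# kern_mutex.c; verify against your own tree before relying on it).
MUTEX_BIT_SPIN = 0x01
MUTEX_BIT_WAITERS = 0x02
MUTEX_BIT_NODEBUG = 0x04
MUTEX_THREAD_MASK = 0xFFFFFFFFFFFFFFF0  # (uintptr_t)-16L on a 64-bit kernel

_FLAG_NAMES = (
    (MUTEX_BIT_SPIN, "SPIN"),
    (MUTEX_BIT_WAITERS, "WAITERS"),
    (MUTEX_BIT_NODEBUG, "NODEBUG"),
)


def decode_mutex_owner(word):
    """Split an owner word into (owner LWP address, list of flag names)."""
    flags = [name for bit, name in _FLAG_NAMES if word & bit]
    return word & MUTEX_THREAD_MASK, flags


if __name__ == "__main__":
    owner, flags = decode_mutex_owner(0xFFFF9F35F679BA82)
    print(hex(owner), flags)  # → 0xffff9f35f679ba80 ['WAITERS']
```

Under those assumptions, the lock that the two timer softints (lid 97 and lid 5) are blocked on is held by lid 1214, which is itself asleep in cv_wait() under _rt_free().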


 db{0}> show routes
 rtentry=0xffff9f35f7c262c8 flags=0x803 refcnt=0 use=936 expire=0
  key=[16,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
  mask=[]
  gw=[16,2,0,0,192,168,88,2,0,0,0,0,0,0,0,0]
  ifp=0xffffa70007de0060 (wm0) ifa=0xffff9f35f76a4c88
   ifa_addr=[16,2,0,0,192,168,88,56,0,0,0,0,0,0,0,0]
   ifa_dsta=[16,2,0,0,192,168,88,255,0,0,0,0,0,0,0,0]
   ifa_mask=[7,2,0,0,255,255,255]
   flags=0x101,refcnt=6,metric=0
  gwroute=0x0 llinfo=0x0
 rtentry=0xffff9f35f708b7b8 flags=0x80b refcnt=0 use=0 expire=0
  key=[16,2,0,0,127,0,0,0,0,0,0,0,0,0,0,0]
  mask=[5,255,255,255,255]
  gw=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
  ifp=0xffff9f35f7db60c0 (lo0) ifa=0xffff9f35f71b0048
   ifa_addr=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
   ifa_dsta=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
   ifa_mask=[5,2,0,0,255]
   flags=0x0,refcnt=4,metric=0
  gwroute=0x0 llinfo=0x0
 rtentry=0xffff9f35f708b038 flags=0x40005 refcnt=0 use=0 expire=0
  key=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
  mask=[NULL] gw=[11,18,2,0,24,3,0,0,108,111,48]
  ifp=0xffff9f35f7db60c0 (lo0) ifa=0xffff9f35f71b0048
   ifa_addr=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
   ifa_dsta=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
   ifa_mask=[5,2,0,0,255]
   flags=0x0,refcnt=4,metric=0
  gwroute=0x0 llinfo=0x0
 rtentry=0xffff9f35f7c26048 flags=0x101 refcnt=0 use=933 expire=0
  key=[16,2,0,0,192,168,88,0,0,0,0,0,0,0,0,0]
  mask=[7,255,255,255,255,255,255]
  gw=[17,18,1,0,6,0,0,0,0,0,0,0,0,0,0,0,0]
  ifp=0xffffa70007de0060 (wm0) ifa=0xffff9f35f76a4c88
   ifa_addr=[16,2,0,0,192,168,88,56,0,0,0,0,0,0,0,0]
   ifa_dsta=[16,2,0,0,192,168,88,255,0,0,0,0,0,0,0,0]
   ifa_mask=[7,2,0,0,255,255,255]
   flags=0x101,refcnt=6,metric=0
  gwroute=0x0 llinfo=0x0
 rtentry=0xffff9f35f78a0e00 flags=0x40005 refcnt=0 use=0 expire=0
  key=[16,2,0,0,192,168,88,56,0,0,0,0,0,0,0,0]
  mask=[NULL] gw=[17,18,1,0,6,0,0,0,0,0,0,0,0,0,0,0,0]
  ifp=0xffff9f35f7db60c0 (lo0) ifa=0xffff9f35f76a4c88
   ifa_addr=[16,2,0,0,192,168,88,56,0,0,0,0,0,0,0,0]
   ifa_dsta=[16,2,0,0,192,168,88,255,0,0,0,0,0,0,0,0]
   ifa_mask=[7,2,0,0,255,255,255]
   flags=0x101,refcnt=6,metric=0
  gwroute=0x0 llinfo=0x0
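
The bracketed byte lists in the `show routes` output are raw sockaddrs. The following is a small sketch for reading them, assuming the usual BSD layouts: `sockaddr_in` as len, family, 2-byte port, 4-byte address, and `sockaddr_dl` as len, family, 2-byte index in host order, type, nlen, alen, slen, then data. The helper name `decode_sockaddr` is made up for illustration. Netmask sockaddrs (the short lists with a length below 8 or with 255 in the family byte) are not interpreted.

```python
import ipaddress

AF_INET = 2   # BSD address family value
AF_LINK = 18  # BSD address family value


def decode_sockaddr(raw):
    """Decode a ddb byte list such as [16,2,0,0,192,168,88,2,0,...]."""
    if len(raw) < 2:
        return "(empty)"
    sa_len, sa_family = raw[0], raw[1]
    if sa_family == AF_INET and len(raw) >= 8:
        # Bytes 2-3 are sin_port; bytes 4-7 are sin_addr in network order.
        return f"inet {ipaddress.IPv4Address(bytes(raw[4:8]))}"
    if sa_family == AF_LINK and len(raw) >= 8:
        index = raw[2] | (raw[3] << 8)  # sdl_index, little-endian on amd64
        nlen = raw[5]                   # length of the interface name in sdl_data
        name = bytes(raw[8:8 + nlen]).decode("ascii", "replace")
        return f"link#{index} {name}".rstrip()
    return f"family {sa_family} (sa_len={sa_len})"


if __name__ == "__main__":
    print(decode_sockaddr([16, 2, 0, 0, 192, 168, 88, 2, 0, 0, 0, 0, 0, 0, 0, 0]))
    print(decode_sockaddr([17, 18, 1, 0, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]))
    print(decode_sockaddr([11, 18, 2, 0, 24, 3, 0, 0, 108, 111, 48]))
```

Read this way, the first entry is the default route via 192.168.88.2, and the `gw` of the 192.168.88/24 entry is link#1, matching the `route -n show` output earlier in the report.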

From: ocb@l25.fi
To: gnats-bugs@NetBSD.org, ocb@l25.fi, riastradh@NetBSD.org
Cc: ozaki-r@NetBSD.org
Subject: Re: kern/56844: delete auto-modified network route crash
Date: Sun, 18 Dec 2022 09:18:56 +0100 (CET)

 As you suggested, making rt_wait_ok() always return false appears to resolve the issue on 9.9.108. I tested it multiple times over a 10-hour window.

 $ diff sys/net/route.c sys/net/route.c.bak
 646,647c646
 <       /* return !cpu_softintr_p(); */
 <       return 0;
 ---
 >       return !cpu_softintr_p();

From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56844 CVS commit: src/sys/net
Date: Thu, 22 Dec 2022 13:54:57 +0000

 Module Name:	src
 Committed By:	riastradh
 Date:		Thu Dec 22 13:54:57 UTC 2022

 Modified Files:
 	src/sys/net: route.c

 Log Message:
 route(4): Work around deadlock in rt_free wait path.

 PR kern/56844

 XXX pullup-8
 XXX pullup-9
 XXX pullup-10


 To generate a diff of this commit:
 cvs rdiff -u -r1.235 -r1.236 src/sys/net/route.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56844 CVS commit: [netbsd-10] src/sys/net
Date: Wed, 22 Feb 2023 18:52:46 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Wed Feb 22 18:52:46 UTC 2023

 Modified Files:
 	src/sys/net [netbsd-10]: route.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #99):

 	sys/net/route.c: revision 1.236

 route(4): Work around deadlock in rt_free wait path.
 PR kern/56844


 To generate a diff of this commit:
 cvs rdiff -u -r1.235 -r1.235.2.1 src/sys/net/route.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56844 CVS commit: [netbsd-9] src/sys/net
Date: Wed, 22 Feb 2023 18:53:56 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Wed Feb 22 18:53:56 UTC 2023

 Modified Files:
 	src/sys/net [netbsd-9]: route.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #1602):

 	sys/net/route.c: revision 1.236

 route(4): Work around deadlock in rt_free wait path.
 PR kern/56844


 To generate a diff of this commit:
 cvs rdiff -u -r1.219.2.2 -r1.219.2.3 src/sys/net/route.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/56844 CVS commit: [netbsd-8] src/sys/net
Date: Wed, 22 Feb 2023 18:55:07 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Wed Feb 22 18:55:07 UTC 2023

 Modified Files:
 	src/sys/net [netbsd-8]: route.c

 Log Message:
 Pull up following revision(s) (requested by riastradh in ticket #1801):

 	sys/net/route.c: revision 1.236

 route(4): Work around deadlock in rt_free wait path.
 PR kern/56844


 To generate a diff of this commit:
 cvs rdiff -u -r1.194.6.15 -r1.194.6.16 src/sys/net/route.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:
