NetBSD Problem Report #56844
From ocb@dc.localdomain Wed May 18 13:55:37 2022
Return-Path: <ocb@dc.localdomain>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 9E0771A921F
for <gnats-bugs@gnats.NetBSD.org>; Wed, 18 May 2022 13:55:37 +0000 (UTC)
Message-Id: <20220518123749.4DD40272918@dc.localdomain>
Date: Wed, 18 May 2022 14:37:49 +0200 (CEST)
From: ocb@dc.localdomain
Reply-To: ocb@dc.localdomain
To: gnats-bugs@NetBSD.org
Subject: delete auto-modified network route crash
X-Send-Pr-Version: 3.95
>Number: 56844
>Category: port-amd64
>Synopsis: delete auto-modified network route crash
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: port-amd64-maintainer
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Wed May 18 14:00:00 +0000 2022
>Last-Modified: Wed Feb 22 19:00:02 +0000 2023
>Originator: ocb@l25.fi
>Release: NetBSD 9.2
>Organization:
>Environment:
System: NetBSD 9.2 (KERNEL1_KASLR) #0: Fri Jan 21 03:09:17 UTC 2022 kernel@dc:/usr/kernel/usr/src/sys/arch/amd64/compile/KERNEL1_KASLR amd64
Architecture: x86_64
Machine: amd64
>Description:
The system uses IP .2 as default gateway, where IP .2 is behind a router IP .1. Connection works, everything is ok. To allow some additional virtual machines from different subnet access IP .2, a route 'route add 192.168.88.2 192.168.88.1' is added. Connection works, everything is ok.
dc# route -n show | grep 192.168.88.2
default 192.168.88.2 UG - - - wm0
192.168.88/24 link#1 U - - - wm0
192.168.88.2 192.168.88.1 UGH - - - wm0
192.168.88.252 link#1 UHl - - - lo0
192.168.88.2 4c:5e:0c:b7:7e:08 UH - - - wm0
Now I decide to connect to IP .2 SSH port and the recently added route gets modified.
dc# route -n show | grep 192.168.88.2
default 192.168.88.2 UG - - - wm0
192.168.88/24 link#1 U - - - wm0
192.168.88.2 192.168.88.2 UGH - - - wm0 <-- note this line
192.168.88.252 link#1 UHl - - - lo0
192.168.88.2 4c:5e:0c:b7:7e:08 UH - - - wm0
Finally if we try to remove this route by executing 'route delete 192.168.88.2 192.168.88.2', machine running KASLR kernel will be very limited, X continues to be visually responsive, already running terminal emulators can run commands but will hang and can not be paused or killed. New terminal emulators can not be started. Changing active virtual console will finally crash X and kernel. I am not experienced with crashes but this appears like memory is slowly getting corrupted. Kernel crash dump successfully gets saved to swap, but upon reboot is unable to retrieve it from swap. Note, crash dump saving and retrieving works fine on non-kaslr kernel, but not in this situation, since..
On a GENERIC (non-KASLR) kernel, the situation is different. After executing 'route delete' command, the system freezes instantly and requires hard poweroff on the button. No backtrace, no crash log.
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-amd64/56844: delete auto-modified network route crash
Date: Wed, 18 May 2022 16:10:39 +0200
On Wed, May 18, 2022 at 02:00:00PM +0000, ocb@dc.localdomain wrote:
> The system uses IP .2 as default gateway, where IP .2 is behind a router IP .1. Connection works, everything is ok. To allow some additional virtual machines from different subnet access IP .2, a route 'route add 192.168.88.2 192.168.88.1' is added.
How can 192.168.88.2 be behind a router if you have the full /24 on wm0?
> 192.168.88/24 link#1 U - - - wm0
I don't understand how you intend your setup to work.
Martin
From: ocb@l25.fi
To: gnats-bugs@NetBSD.org
Cc: martin@duskware.de
Subject: Re: port-amd64/56844
Date: Wed, 18 May 2022 19:35:24 +0000
Excuse me for not being clear.
All devices named below are physical devices.
192.168.88.1 - mikrotik router
192.168.88.2 - mikrotik antenna connected to PoE port 10 on the router
192.168.88.252 - netbsd workstation connected to port 6 on the router
Network configuration was made according to ifconfig.if(5), and all network connectivity with local devices and remote hosts works as expected.
dc$ cat /etc/ifconfig.wm0
up
inet 192.168.88.252 netmask 255.255.255.0
!route add 192.168.88.2 192.168.88.1
dc$ cat /etc/rc.conf | grep -E 'defaultroute'
defaultroute="192.168.88.2"
dc$ ifconfig wm0
wm0: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=0
ec_capabilities=17<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,EEE>
ec_enabled=2<VLAN_HWTAGGING>
address: 3c:97:0e:4e:db:de
media: Ethernet none (none)
inet 192.168.88.252/24 broadcast 192.168.88.255 flags 0x0
inet6 fe80::3e97:eff:fe4e:dbde%wm0/64 flags 0x0 scopeid 0x1
The reason I add '192.168.88.2 192.168.88.1' route is so that a tap interface (behind 192.168.88.252 which is acting as hypervisor) from network '192.168.50.0/30' can reach '192.168.88.2'
# commands before hang
dc# route delete 192.168.88.2 192.168.88.1
delete host 192.168.88.2: gateway 192.168.88.1
dc# route add 192.168.88.2 192.168.88.1
add host 192.168.88.2: gateway 192.168.88.1
dc# fping -l -A 192.168.88.2
192.168.88.2 : [0], 84 bytes, 0.76 ms (0.76 avg, 0% loss)
ICMP Redirect from for ICMP Echo sent to 192.168.88.2
##### it does not happen always, usually happens when route is active for more than 10 minutes.
dc# route delete 192.168.88.2 192.168.88.2
delete host 192.168.88.2: gateway 192.168.88.2
dc# route add 192.168.88.2 192.168.88.1
add host 192.168.88.2: gateway 192.168.88.1
dc# fping -l -A 192.168.88.2
192.168.88.2 : [0], 84 bytes, 0.76 ms (0.76 avg, 0% loss)
ICMP Redirect from for ICMP Echo sent to 192.168.88.2
dc# route -n show | grep 192.168.88
default 192.168.88.2 UG - - - wm0
192.168.88/24 link#1 U - - - wm0
192.168.88.2 192.168.88.2 UGH - - - wm0
192.168.88.252 link#1 UHl - - - lo0
192.168.88.1 b8:69:f4:db:8d:07 UH - - - wm0
192.168.88.2 4c:5e:0c:b7:7e:08 UH - - - wm0
##### away for lunch
dc# route delete 192.168.88.2 192.168.88.2
<HANG>
System starts acting up: typed keys shown with delay or not shown at all, network connections are dead, network related commands hang, at random times typing starts working normally then breaks again, trying to attach gdb to hanging route process also hangs.
Finally, change virtual console or kill X to crash or freeze system.
It takes multiple tries to reproduce. I was able to reproduce it 10+ times. With KASLR kernel it either hangs when trying to change virtual console or kill X, at other times it crashes kernel after killing X. A couple of times received a backtrace that stops at ':breakpoint'. With GENERIC (non-KASLR) kernel it freezes immediately and requires a hard reset.
Thanks for your time, will look further into this.
From: Robert Elz <kre@munnari.OZ.AU>
To: gnats-bugs@netbsd.org
Cc: ocb@l25.fi
Subject: Re: port-amd64/56844
Date: Thu, 19 May 2022 22:11:01 +0700
Date: Wed, 18 May 2022 21:15:01 +0000 (UTC)
From: ocb@l25.fi
Message-ID: <20220518211502.0136F1A923A@mollari.NetBSD.org>
First, no matter how badly you misconfigure the
network, NetBSD should not hang/crash, so it looks
as if something should be investigated here, and fixed.
But you should be easily able to avoid the issue
by configuring things properly.
| All devices named below are physical devices.
|
| 192.168.88.1 - mikrotik router
| 192.168.88.2 - mikrotik antenna connected to PoE port 10 on the router
If the router is really a router, and not a switch (or a switch
with some routing abilities) then that config is wrong. In IP
networking, each link should have its own network number.
That is, for a device on each link, the IP addr anded with
the netmask should produce different answers for each link.
The router interfaces should have an address on each link
(some point to point links can avoid having network addresses
at all, but that is a different issue). In the cases where
this is not possible, the router must fake it, by pretending
to be all the devices on one link to devices on the other
link, and forwarding packets between them (which does not
work for all protocols).
Routes must always name a destination on the same link as
the sending device (the one with the route). IP uses hop
by hop routing, all anything ever does is send packets one
hop closer to the destination. It makes no sense to
specify a next hop for a route which is not on the same
link as the sender, at best it would mean that every packet
sent would need to lookup the first route, and then upon
discovering that the destination is not local (the next
hop must be a destination we can send to directly) look
up another route to see where to send instead of where the
first route directed things. Better to avoid doing that
for every packet, and instead do it once when installing
the route.
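The "same link" rule above comes down to comparing network numbers. The following is a minimal sketch of that comparison, assuming IPv4 addresses held as host-order 32-bit integers; the helper names ipv4, network_of and gateway_on_link are invented for illustration and are not part of NetBSD.

```c
#include <stdbool.h>
#include <stdint.h>

/* Build a host-order IPv4 address from four octets. */
static uint32_t
ipv4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
	    ((uint32_t)c << 8) | (uint32_t)d;
}

/* Network number: the address ANDed with the netmask. */
static uint32_t
network_of(uint32_t addr, uint32_t mask)
{
	return addr & mask;
}

/*
 * A gateway can only be a next hop if it has the same network
 * number as the local interface address, i.e. it is on the same link.
 */
static bool
gateway_on_link(uint32_t ifaddr, uint32_t mask, uint32_t gw)
{
	return network_of(ifaddr, mask) == network_of(gw, mask);
}
```

With the reported configuration (192.168.88.252, netmask 255.255.255.0), both 192.168.88.1 and 192.168.88.2 pass this check, so by addressing alone the host treats both as directly reachable.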
So, in this case the default route should be 192.168.88.1
If the "router" is really a switch, then it has essentially
no role in IP level packet forwarding at all. It can have
an address (usually does) for admin/stats/... which is best
viewed as a host connected to the switch.
In that case the default route should be to 192.168.88.2
and 192.168.88.1 should never be used for anything related
to routing.
Some devices are both switch and router, those are best
considered as 2 separate connected devices, if you're on the
same logical network as the destination you are using the
switch and the router is not involved. If you are using
the router the destination should have a different network
number, and the route to that network will reference the
router's address on our network.
| Network configuration was made according to ifconfig.if(5),
| and all network connectivity with local devices and remote hosts
| works as expected.
IP networking can be remarkably resilient. Lots of things
often seem to work, when perhaps they shouldn't.
| dc$ cat /etc/ifconfig.wm0
| up
| inet 192.168.88.252 netmask 255.255.255.0
| !route add 192.168.88.2 192.168.88.1
that route with that netmask can never be correct.
If we know how to reach ...1 (which that implies
we must) then we know how to reach ....2 as that
is on the same link.
If it isn't on the same link its addr should be changed,
or the router must do proxy arp for it.
| dc$ cat /etc/rc.conf | grep -E 'defaultroute'
| defaultroute="192.168.88.2"
That is fine if ....2 is on the same link as the
host, and wrong otherwise. The next hop must be
on the same link. Always. Otherwise it is not a
"next" hop, but something later.
| The reason I add '192.168.88.2 192.168.88.1' route is so
| that a tap interface (behind 192.168.88.252 which is acting
| as hypervisor) from network '192.168.50.0/30' can reach
| '192.168.88.2'
Things on 192.168.50/24 should reach devices (any devices)
on 192.168.88/24 via a route to the router on their local
link, which will be 192.168.50.xx and be the address on
the netbsd (base) host for that network. For this the
netbsd system is the router (anything which forwards
packets from one IP network to another is a router).
| ICMP Redirect from for ICMP Echo sent to 192.168.88.2
That suggests to me that the "router" (192.168.88.1) is
being sent a packet for 192.168.88.2 and it is telling
you to stop bothering it, send directly to 192.168.88.2
instead. That suggests the "router" is acting as a switch,
not a router (ignore what vendors call their products,
they call them whatever they think sells best).
NetBSD is probably reacting to that by adjusting the
route destination from ...1 (you told it) to ...2
(which ...1 told it).
Now we have a route pointing to itself, which is a perfect
candidate for infinite loops.
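The suspected sequence can be modelled in a few lines. This is only a sketch of the behaviour guessed at above, not NetBSD's redirect handling; struct host_route and both functions are hypothetical, and addresses are host-order integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified host route: destination and next hop. */
struct host_route {
	uint32_t dst;	/* destination address, host byte order */
	uint32_t gw;	/* next-hop address, host byte order */
};

/* True if the route names its own destination as the gateway. */
static bool
route_is_self_referential(const struct host_route *rt)
{
	return rt->dst == rt->gw;
}

/*
 * Model of the suspected redirect handling: the gateway is simply
 * replaced by whatever the redirect names, with no sanity check.
 */
static void
apply_redirect(struct host_route *rt, uint32_t new_gw)
{
	rt->gw = new_gw;
}
```

Starting from dst 192.168.88.2 (0xC0A85802) with gw 192.168.88.1 (0xC0A85801), a redirect naming 192.168.88.2 leaves dst equal to gw, which matches the modified line in the 'route -n show' output earlier in this report.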
Get the networking config correct, and all these issues will
simply vanish.
NetBSD probably could do with some improvements, but avoiding
misconfig errors by the root user typically simply reduces
flexibility.
kre
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@netbsd.org
Cc: port-amd64-maintainer@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, ocb@l25.fi
Subject: Re: port-amd64/56844: delete auto-modified network route crash
Date: Thu, 19 May 2022 20:49:36 +0000
> Date: Wed, 18 May 2022 14:00:00 +0000 (UTC)
> From: ocb@dc.localdomain
>
> Finally if we try to remove this route by executing 'route delete
> 192.168.88.2 192.168.88.2', machine running KASLR kernel will be
> very limited, X continues to be visually responsive, already running
> terminal emulators can run commands but will hang and can not be
> paused or killed. New terminal emulators can not be started.
> Changing active virtual console will finally crash X and kernel. I
> am not experienced with crashes but this appears like memory is
> slowly getting corrupted. Kernel crash dump successfully gets saved
> to swap, but upon reboot is unable to retrieve it from swap. Note,
> crash dump saving and retrieving works fine on non-kaslr kernel, but
> not in this situation, since..
Do you have a serial console, or a working network interface where you
can redirect output via netcat?
Can you start crash(8) running before triggering this (with output to
serial console or netcat), and, when it starts to hang, type `ps' at
crash(8)? And, if any threads listed there are in `tstile' or are
marked with a `>' on the line, can you type `bt/a 0xffff...' at
crash(8), where 0xffff... is the lwp address shown in ps output?
From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: port-amd64/56844: delete auto-modified network route crash
Date: Tue, 12 Jul 2022 07:10:00 +0000
On Thu, May 19, 2022 at 08:50:02PM +0000, Taylor R Campbell wrote:
> Do you have a serial console, or a working network interface where you
> can redirect output via netcat?
>
> Can you start crash(8) running before triggering this (with output to
> serial console or netcat), and, when it starts to hang, type `ps' at
> crash(8)? And, if any threads listed there are in `tstile' or are
> marked with a `>' on the line, can you type `bt/a 0xffff...' at
> crash(8), where 0xffff... is the lwp address shown in ps output?
The originator hasn't replied and their email address is invalid, so
setting the PR to feedback is useless and we aren't going to get
answers.
If we can reproduce the crashes, we should fix them; otherwise, close
the PR.
--
David A. Holland
dholland@netbsd.org
From: Taylor R Campbell <riastradh@NetBSD.org>
To: gnats-bugs@NetBSD.org
Cc: ocb@l25.fi, ozaki-r@NetBSD.org
Subject: Re: kern/56844: delete auto-modified network route crash
Date: Sat, 3 Dec 2022 02:36:47 +0000
There's some logic in rt_free to defer the freeing action to workqueue
if the caller is in softint, presumably because cv_wait tripped an
assertion that forbids sleeping in softint context:
void
rt_free(struct rtentry *rt)
{
	KASSERTMSG(rt->rt_refcnt > 0, "rt_refcnt=%d", rt->rt_refcnt);
	if (rt_wait_ok()) {
		atomic_dec_uint(&rt->rt_refcnt);
		_rt_free(rt);
		return;
	}
	mutex_enter(&rt_free_global.lock);
	/* No need to add a reference here. */
	SLIST_INSERT_HEAD(&rt_free_global.queue, rt, rt_free);
	if (!rt_free_global.enqueued) {
		workqueue_enqueue(rt_free_global.wq, &rt_free_global.wk, NULL);
		rt_free_global.enqueued = true;
	}
	mutex_exit(&rt_free_global.lock);
}
Unfortunately, this doesn't work. It appears that some lock is held
around the rt_free and cv_wait (probably softnet_lock), and that lock
is taken in softint context, so cv_wait under it is forbidden too --
but there's no assertion to catch it, so _most_ of the time this code
gets away with it. That is, until someone hits a softint deadlock.
I think for now rt_wait_ok should be made to always return false, but
this logic needs some more thought to ensure starvation won't happen.
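To illustrate the proposed stopgap, here is a single-threaded userland model of "always defer". It is a sketch under stated assumptions, not the kernel change itself: struct model_rt, model_free_queue and the model_* functions are hypothetical, there is no locking, and a plain function call stands in for the workqueue.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct rtentry: only a queue link. */
struct model_rt {
	struct model_rt *next;
};

/* Entries waiting to be freed from a context that may sleep. */
static struct model_rt *model_free_queue;

/* The proposed stopgap: never wait in the caller, always defer. */
static bool
model_rt_wait_ok(void)
{
	return false;
}

/* Returns true if the entry was freed at once, false if deferred. */
static bool
model_rt_free(struct model_rt *rt)
{
	if (model_rt_wait_ok()) {
		free(rt);
		return true;
	}
	rt->next = model_free_queue;
	model_free_queue = rt;
	return false;
}

/* Stand-in for the workqueue handler: free everything queued so far. */
static size_t
model_rt_free_drain(void)
{
	size_t n = 0;

	while (model_free_queue != NULL) {
		struct model_rt *rt = model_free_queue;

		model_free_queue = rt->next;
		free(rt);
		n++;
	}
	return n;
}
```

In this model the caller never blocks; every entry is queued and released later by the drain function. The starvation concern mentioned above would correspond to the drain step being delayed indefinitely, which the model does not attempt to address.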
From: ocb@l25.fi
To: gnats-bugs@NetBSD.org, riastradh@NetBSD.org
Cc: ocb@l25.fi, ozaki-r@NetBSD.org
Subject: Re: kern/56844: delete auto-modified network route crash
Date: Sat, 3 Dec 2022 06:37:27 +0100 (CET)
requested information is below.
fatal breakpoint trap in supervisor mode
[ 1082.4798295] trap type 1 code 0 rip 0xffffffff80235315 cs 0x8 rflags 0x202 cr2 0x7acf3de6a660 ilevel 0x8 rsp 0xffffa70127c1cdc8
[ 1082.4798295] curlwp 0xffff9f35f87fd040 pid 0.2 lowest kstack 0xffffa70127c182c0
Stopped in pid 0.2 (system) at netbsd:breakpoint+0x5: leave
breakpoint() at netbsd:breakpoint+0x5
comintr() at netbsd:comintr+0x7e0
intr_kdtrace_wrapper() at netbsd:intr_kdtrace_wrapper+0x26
Xhandle_ioapic_edge1() at netbsd:Xhandle_ioapic_edge1+0x75
--- interrupt ---
x86_stihlt() at netbsd:x86_stihlt+0x6
acpicpu_cstate_idle() at netbsd:acpicpu_cstate_idle+0x19a
idle_loop() at netbsd:idle_loop+0x14c
ds a9c0
es cdc8
fs 1cc9
gs cdc8
rdi ffffffff818450a0 x86_io
rsi 800
rbp ffffa70127c1cdc8
rbx ffffa70007dd718a
rdx 7f
rcx 2b
rax 1
r8 19985
r9 20
r10 0
r11 0
r12 ffff9f35c120c4d0
r13 800
r14 cc
r15 ffff9f35c120c400
rip ffffffff80235315 breakpoint+0x5
cs 8
rflags 202
rsp ffffa70127c1cdc8
ss 10
netbsd:breakpoint+0x5: leave
db{0}> ps
PID LID S CPU FLAGS STRUCT LWP * NAME WAIT
1207 1207 3 0 0 ffff9f35f6e06600 route rtentry
1195 1195 3 3 180 ffff9f35f679b200 sh wait
1187 1187 3 0 180 ffff9f35f7e4f540 login wait
1188 1188 3 2 180 ffff9f35f7d65580 cron nanoslp
1277 1277 3 3 180 ffff9f35f6e061c0 inetd kqueue
954 954 3 3 180 ffff9f35f7190a00 powerd kqueue
639 639 3 1 180 ffff9f35f7190180 syslogd kqueue
1 1 3 2 180 ffff9f35f7b82080 init wait
0 178 3 2 200 ffff9f35f7dba0c0 physiod physiod
0 222 3 1 200 ffff9f35f7e4f100 pooldrain pooldrain
0 221 3 1 200 ffff9f35f7dba940 ioflush syncer
0 220 3 2 200 ffff9f35f7dba500 pgdaemon pgdaemon
0 217 3 3 200 ffff9f35f7b824c0 swwreboot swwreboot
0 216 3 1 200 ffff9f35f7bce600 npfgc0 npfgcw
0 215 3 2 200 ffff9f35f7b658c0 rt_free rt_free
0 214 3 2 200 ffff9f35f7b65480 unpgc unpgc
0 213 3 1 200 ffff9f35f7b65040 key_timehandler key_timehandler
0 212 3 3 200 ffff9f35f7b2fbc0 icmp6_wqinput/3 icmp6_wqinput
0 211 3 2 200 ffff9f35f7b2f780 icmp6_wqinput/2 icmp6_wqinput
0 210 3 1 200 ffff9f35f7b2f340 icmp6_wqinput/1 icmp6_wqinput
0 209 3 0 200 ffff9f35f7bf95c0 icmp6_wqinput/0 icmp6_wqinput
0 208 3 0 200 ffff9f35f7be4140 usb3 usbevt
0 207 3 0 200 ffff9f35f7be49c0 usb0 usbevt
0 206 3 2 200 ffff9f35f7be4580 usb1 usbevt
0 205 3 1 200 ffff9f35f7eb9b80 usb2 usbevt
0 204 3 0 200 ffff9f35f7eb9740 nd6_timer nd6_timer
0 203 3 3 200 ffff9f35f7eb9300 carp6_wqinput/3 carp6_wqinput
0 202 3 2 200 ffff9f35f7ee4b40 carp6_wqinput/2 carp6_wqinput
0 201 3 1 200 ffff9f35f7ee4700 carp6_wqinput/1 carp6_wqinput
0 200 3 0 200 ffff9f35f7ee42c0 carp6_wqinput/0 carp6_wqinput
0 199 3 3 200 ffff9f35f7ecfb00 carp_wqinput/3 carp_wqinput
0 198 3 2 200 ffff9f35f7ecf6c0 carp_wqinput/2 carp_wqinput
0 197 3 1 200 ffff9f35f7ecf280 carp_wqinput/1 carp_wqinput
0 196 3 0 200 ffff9f35f7ebaac0 carp_wqinput/0 carp_wqinput
0 195 3 3 200 ffff9f35f7eba680 icmp_wqinput/3 icmp_wqinput
0 194 3 2 200 ffff9f35f7eba240 icmp_wqinput/2 icmp_wqinput
0 193 3 1 200 ffff9f35f7ba5a80 icmp_wqinput/1 icmp_wqinput
0 192 3 0 200 ffff9f35f7bf9180 icmp_wqinput/0 icmp_wqinput
0 185 3 2 200 ffff9f35f7ba5640 atapibus0 sccomp
0 31 3 3 200 ffff9f35f7ba5200 rt_timer rt_timer
0 63 3 1 200 ffff9f35f7bcea40 vmem_rehash vmem_rehash
0 117 3 0 200 ffff9f35c1392980 entbutler entropy
0 116 3 0 200 ffff9f35c1392540 viomb balloon
0 115 3 2 200 ffff9f35c1392100 usbtask-dr usbtsk
0 114 3 1 200 ffff9f35c12ab940 usbtask-hc usbtsk
0 113 3 3 200 ffff9f35c12ab500 wm0Reset wm0Reset
0 112 3 3 200 ffff9f35c12ab0c0 wm0TxRx/3 wm0TxRx
0 111 3 2 200 ffff9f35c13d6900 wm0TxRx/2 wm0TxRx
0 110 3 1 200 ffff9f35c13d64c0 wm0TxRx/1 wm0TxRx
0 109 3 0 200 ffff9f35c13d6080 wm0TxRx/0 wm0TxRx
0 108 3 0 200 ffff9f35c12d08c0 atabus1 atath
0 107 3 0 200 ffff9f35c12d0480 atabus0 atath
0 106 3 0 200 ffff9f35c12d0040 pms0 pmsreset
0 105 3 3 200 ffff9f35c1126bc0 xcall/3 xcall
0 104 1 3 200 ffff9f35c1126780 softser/3
0 103 1 3 200 ffff9f35c1126340 softclk/3
0 102 1 3 200 ffff9f35c1149b80 softbio/3
0 101 1 3 200 ffff9f35c1149740 softnet/3
0 > 100 1 3 201 ffff9f35c1149300 idle/3
0 99 3 2 200 ffff9f35c1069b40 xcall/2 xcall
0 98 1 2 200 ffff9f35c1069700 softser/2
0 97 3 2 200 ffff9f35c10692c0 softclk/2 tstile
0 96 1 2 200 ffff9f35c108cb00 softbio/2
0 30 1 2 200 ffff9f35c108c6c0 softnet/2
0 > 29 1 2 201 ffff9f35c108c280 idle/2
0 28 3 1 200 ffff9f35c0fadac0 xcall/1 xcall
0 27 1 1 200 ffff9f35c0fad680 softser/1
0 26 1 1 200 ffff9f35c0fad240 softclk/1
0 25 1 1 200 ffff9f35c0f9da80 softbio/1
0 24 1 1 200 ffff9f35c0f9d640 softnet/1
0 > 23 1 1 201 ffff9f35c0f9d200 idle/1
0 22 3 0 200 ffff9f35f7f3aa40 lnxsyswq lnxsyswq
0 21 3 0 200 ffff9f35f7f3a600 lnxubdwq lnxubdwq
0 20 3 0 200 ffff9f35f7f3a1c0 lnxpwrwq lnxpwrwq
0 19 3 0 200 ffff9f35f7f4fa00 lnxlngwq lnxlngwq
0 18 3 0 200 ffff9f35f7f4f5c0 lnxhipwq lnxhipwq
0 17 3 0 200 ffff9f35f7f4f180 lnxrcugc lnxrcugc
0 16 3 0 200 ffff9f35f7f629c0 sysmon smtaskq
0 15 3 0 200 ffff9f35f7f62580 pmfsuspend pmfsuspend
0 14 3 0 200 ffff9f35f7f62140 pmfevent pmfevent
0 13 3 0 200 ffff9f35f7f77980 sopendfree sopendfr
0 12 3 2 200 ffff9f35f7f77540 ifwdog ifwdog
0 11 3 1 200 ffff9f35f7f77100 iflnkst iflnkst
0 10 3 3 200 ffff9f35f87a2940 nfssilly nfssilly
0 9 3 0 200 ffff9f35f87a2500 vdrain vdrain
0 8 3 0 200 ffff9f35f87a20c0 modunload mod_unld
0 7 3 0 200 ffff9f35f87d3900 xcall/0 xcall
0 6 1 0 200 ffff9f35f87d34c0 softser/0
0 5 3 0 200 ffff9f35f87d3080 softclk/0 tstile
0 4 1 0 200 ffff9f35f87fd8c0 softbio/0
0 3 1 0 200 ffff9f35f87fd480 softnet/0
0 > 2 1 0 201 ffff9f35f87fd040 idle/0
0 0 3 1 200 ffffffff8188a6c0 swapper uvm
db{0}> ps/w
PID LID COMMAND EMUL PRI WAIT-MSG WAIT-CHANNEL
1207 1207 route netbsd 43 rtentry ffff9f35f7c26298
1195 1195 sh netbsd 43 wait ffff9f35f67c5b98
1187 1187 login netbsd 43 wait ffff9f35f7b7a3d8
1188 1188 cron netbsd 43 nanoslp ffff9f35f7d65580
1277 1277 inetd netbsd 43 kqueue ffff9f35f7c0cfa0
954 954 powerd netbsd 43 kqueue ffff9f35c117a460
639 639 syslogd netbsd 43 kqueue ffff9f35c11fd220
1 1 init netbsd 43 wait ffff9f35f7b7a058
0 178 system netbsd 123 physiod ffff9f35f768c848
0 222 system netbsd 125 pooldrain ffffffff8190f900
0 221 system netbsd 124 syncer ffff9f35f7dba940
0 220 system netbsd 126 pgdaemon ffffffff8190d4c8
0 217 system netbsd 43 swwreboot ffff9f35c142b1c8
0 216 system netbsd 96 npfgcw ffff9f35c11fca08
0 215 system netbsd 222 rt_free ffff9f35f79c1d88
0 214 system netbsd 96 unpgc ffffffff81980a30
0 213 system netbsd 222 key_timehandler ffff9f35f79c1c48
0 212 system netbsd 222 icmp6_wqinput ffff9f35f79c3f08
0 211 system netbsd 222 icmp6_wqinput ffff9f35f79c3ec8
0 210 system netbsd 222 icmp6_wqinput ffff9f35f79c3e88
0 209 system netbsd 222 icmp6_wqinput ffff9f35f79c3e48
0 208 system netbsd 96 usbevt ffff9f35c127e4b8
0 207 system netbsd 96 usbevt ffffa70007de2478
0 206 system netbsd 96 usbevt ffffa70007de4478
0 205 system netbsd 96 usbevt ffffa70007de6478
0 204 system netbsd 222 nd6_timer ffff9f35f79c19c8
0 203 system netbsd 222 carp6_wqinput ffff9f35f7edd508
0 202 system netbsd 222 carp6_wqinput ffff9f35f7edd4c8
0 201 system netbsd 222 carp6_wqinput ffff9f35f7edd488
0 200 system netbsd 222 carp6_wqinput ffff9f35f7edd448
0 199 system netbsd 222 carp_wqinput ffff9f35f7edd108
0 198 system netbsd 222 carp_wqinput ffff9f35f7edd0c8
0 197 system netbsd 222 carp_wqinput ffff9f35f7edd088
0 196 system netbsd 222 carp_wqinput ffff9f35f7edd048
0 195 system netbsd 222 icmp_wqinput ffff9f35f7f7cd08
0 194 system netbsd 222 icmp_wqinput ffff9f35f7f7ccc8
0 193 system netbsd 222 icmp_wqinput ffff9f35f7f7cc88
0 192 system netbsd 222 icmp_wqinput ffff9f35f7f7cc48
0 185 system netbsd 96 sccomp ffffa70007dd88f8
0 31 system netbsd 222 rt_timer ffff9f35f79c1888
0 63 system netbsd 125 vmem_rehash ffff9f35f79c14c8
0 117 system netbsd 43 entropy ffffffff818b1d28
0 116 system netbsd 0 balloon ffff9f35c1366608
0 115 system netbsd 96 usbtsk ffffffff818d52d8
0 114 system netbsd 96 usbtsk ffffffff818d5298
0 113 system netbsd 222 wm0Reset ffff9f35c10f7988
0 112 system netbsd 222 wm0TxRx ffff9f35f7f7c908
0 111 system netbsd 222 wm0TxRx ffff9f35f7f7c8c8
0 110 system netbsd 222 wm0TxRx ffff9f35f7f7c888
0 109 system netbsd 222 wm0TxRx ffff9f35f7f7c848
0 108 system netbsd 96 atath ffffa70007dd8938
0 107 system netbsd 96 atath ffffa70007dd83c0
0 106 system netbsd 96 pmsreset ffff9f35c121fc94
0 105 system netbsd 127 xcall ffffa70127db9010
0 104 system netbsd 223 0
0 103 system netbsd 220 0
0 102 system netbsd 221 0
0 101 system netbsd 222 0
0 > 100 system netbsd 0 0
0 99 system netbsd 127 xcall ffffa70127d7c010
0 98 system netbsd 223 0
0 97 system netbsd 220 tstile ffff9f35f8a6f080
0 96 system netbsd 221 0
0 30 system netbsd 222 0
0 > 29 system netbsd 0 0
0 28 system netbsd 127 xcall ffffa70127acc010
0 27 system netbsd 223 0
0 26 system netbsd 220 0
0 25 system netbsd 221 0
0 24 system netbsd 222 0
0 > 23 system netbsd 0 0
0 22 system netbsd 43 lnxsyswq ffff9f35f8a5ec08
0 21 system netbsd 43 lnxubdwq ffff9f35f8a5eb08
0 20 system netbsd 43 lnxpwrwq ffff9f35f8a5ea08
0 19 system netbsd 43 lnxlngwq ffff9f35f8a5e908
0 18 system netbsd 43 lnxhipwq ffff9f35f8a5e808
0 17 system netbsd 43 lnxrcugc ffffffff818b0308
0 16 system netbsd 96 smtaskq ffffffff818f5f60
0 15 system netbsd 43 pmfsuspend ffff9f35f8812808
0 14 system netbsd 43 pmfevent ffff9f35f88126c8
0 13 system netbsd 96 sopendfr ffffffff819809b0
0 12 system netbsd 222 ifwdog ffff9f35f8812588
0 11 system netbsd 222 iflnkst ffff9f35f8812448
0 10 system netbsd 43 nfssilly ffff9f35f8812308
0 9 system netbsd 125 vdrain ffffffff81981bb0
0 8 system netbsd 125 mod_unld ffffffff81973830
0 7 system netbsd 127 xcall ffffffff8183bcd0
0 6 system netbsd 223 0
0 5 system netbsd 220 tstile ffff9f35f8a6f080
0 4 system netbsd 221 0
0 3 system netbsd 222 0
0 > 2 system netbsd 0 0
0 0 system netbsd 125 uvm ffffffff8188a6c0
db{0}> bt/a ffff9f35f6e06600
trace: pid 1207 lid 1207 at 0xffffa70138749af0
sleepq_block() at netbsd:sleepq_block+0x13a
cv_wait() at netbsd:cv_wait+0x49
_rt_free() at netbsd:_rt_free+0x44
route_output() at netbsd:route_output+0x4c0
route_send_wrapper() at netbsd:route_send_wrapper+0x6d
sosend() at netbsd:sosend+0x944
soo_write() at netbsd:soo_write+0x2f
dofilewrite() at netbsd:dofilewrite+0x80
sys_write() at netbsd:sys_write+0x49
syscall() at netbsd:syscall+0x196
--- syscall (number 4) ---
netbsd:syscall+0x196:
db{0}> bt/a ffff9f35c10692c0
trace: pid 0 lid 97 at 0xffffa70127da6e60
sleepq_block() at netbsd:sleepq_block+0x13a
turnstile_block() at netbsd:turnstile_block+0x3b8
mutex_vector_enter() at netbsd:mutex_vector_enter+0x12b
tcp_slowtimo() at netbsd:tcp_slowtimo+0x10
callout_softclock() at netbsd:callout_softclock+0xd2
softint_dispatch() at netbsd:softint_dispatch+0x10b
DDB lost frame for netbsd:Xsoftintr+0x4c, trying 0xffffa70127da70f0
Xsoftintr() at netbsd:Xsoftintr+0x4c
--- interrupt ---
0:
db{0}> bt/a ffff9f35f87d3080
trace: pid 0 lid 5 at 0xffffa70127c7ae40
sleepq_block() at netbsd:sleepq_block+0x13a
turnstile_block() at netbsd:turnstile_block+0x3b8
mutex_vector_enter() at netbsd:mutex_vector_enter+0x12b
ip_slowtimo() at netbsd:ip_slowtimo+0x10
pfslowtimo() at netbsd:pfslowtimo+0x34
callout_softclock() at netbsd:callout_softclock+0xd2
softint_dispatch() at netbsd:softint_dispatch+0x10b
DDB lost frame for netbsd:Xsoftintr+0x4c, trying 0xffffa70127c7b0f0
Xsoftintr() at netbsd:Xsoftintr+0x4c
--- interrupt ---
0:
db{0}> x/Lx ffff9f35f8a6f080
ffff9f35c10692c0: ffff9f35f679ba82
db{0}> bt/a ffff9f35f679ba80
trace: pid 1214 lid 1214 at 0xffffa70138749af0
sleepq_block() at netbsd:sleepq_block+0x13a
cv_wait() at netbsd:cv_wait+0x49
_rt_free() at netbsd:_rt_free+0x44
route_output() at netbsd:route_output+0x4c0
route_send_wrapper() at netbsd:route_send_wrapper+0x6d
sosend() at netbsd:sosend+0x944
soo_write() at netbsd:soo_write+0x2f
dofilewrite() at netbsd:dofilewrite+0x80
sys_write() at netbsd:sys_write+0x49
syscall() at netbsd:syscall+0x196
--- syscall (number 4) ---
netbsd:syscall+0x196:
db{0}> show routes
rtentry=0xffff9f35f7c262c8 flags=0x803 refcnt=0 use=936 expire=0
key=[16,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
mask=[]
gw=[16,2,0,0,192,168,88,2,0,0,0,0,0,0,0,0]
ifp=0xffffa70007de0060 (wm0) ifa=0xffff9f35f76a4c88
ifa_addr=[16,2,0,0,192,168,88,56,0,0,0,0,0,0,0,0]
ifa_dsta=[16,2,0,0,192,168,88,255,0,0,0,0,0,0,0,0]
ifa_mask=[7,2,0,0,255,255,255]
flags=0x101,refcnt=6,metric=0
gwroute=0x0 llinfo=0x0
rtentry=0xffff9f35f708b7b8 flags=0x80b refcnt=0 use=0 expire=0
key=[16,2,0,0,127,0,0,0,0,0,0,0,0,0,0,0]
mask=[5,255,255,255,255]
gw=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
ifp=0xffff9f35f7db60c0 (lo0) ifa=0xffff9f35f71b0048
ifa_addr=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
ifa_dsta=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
ifa_mask=[5,2,0,0,255]
flags=0x0,refcnt=4,metric=0
gwroute=0x0 llinfo=0x0
rtentry=0xffff9f35f708b038 flags=0x40005 refcnt=0 use=0 expire=0
key=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
mask=[NULL] gw=[11,18,2,0,24,3,0,0,108,111,48]
ifp=0xffff9f35f7db60c0 (lo0) ifa=0xffff9f35f71b0048
ifa_addr=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
ifa_dsta=[16,2,0,0,127,0,0,1,0,0,0,0,0,0,0,0]
ifa_mask=[5,2,0,0,255]
flags=0x0,refcnt=4,metric=0
gwroute=0x0 llinfo=0x0
rtentry=0xffff9f35f7c26048 flags=0x101 refcnt=0 use=933 expire=0
key=[16,2,0,0,192,168,88,0,0,0,0,0,0,0,0,0]
mask=[7,255,255,255,255,255,255]
gw=[17,18,1,0,6,0,0,0,0,0,0,0,0,0,0,0,0]
ifp=0xffffa70007de0060 (wm0) ifa=0xffff9f35f76a4c88
ifa_addr=[16,2,0,0,192,168,88,56,0,0,0,0,0,0,0,0]
ifa_dsta=[16,2,0,0,192,168,88,255,0,0,0,0,0,0,0,0]
ifa_mask=[7,2,0,0,255,255,255]
flags=0x101,refcnt=6,metric=0
gwroute=0x0 llinfo=0x0
rtentry=0xffff9f35f78a0e00 flags=0x40005 refcnt=0 use=0 expire=0
key=[16,2,0,0,192,168,88,56,0,0,0,0,0,0,0,0]
mask=[NULL] gw=[17,18,1,0,6,0,0,0,0,0,0,0,0,0,0,0,0]
ifp=0xffff9f35f7db60c0 (lo0) ifa=0xffff9f35f76a4c88
ifa_addr=[16,2,0,0,192,168,88,56,0,0,0,0,0,0,0,0]
ifa_dsta=[16,2,0,0,192,168,88,255,0,0,0,0,0,0,0,0]
ifa_mask=[7,2,0,0,255,255,255]
flags=0x101,refcnt=6,metric=0
gwroute=0x0 llinfo=0x0
From: ocb@l25.fi
To: gnats-bugs@NetBSD.org, ocb@l25.fi, riastradh@NetBSD.org
Cc: ozaki-r@NetBSD.org
Subject: Re: kern/56844: delete auto-modified network route crash
Date: Sun, 18 Dec 2022 09:18:56 +0100 (CET)
Per your suggestion, making rt_wait_ok always return false appears to resolve the issue on 9.9.108; tested multiple times over a 10-hour window.
$ diff sys/net/route.c sys/net/route.c.bak
646,647c646
< /* return !cpu_softintr_p(); */
< return 0;
---
> return !cpu_softintr_p();
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56844 CVS commit: src/sys/net
Date: Thu, 22 Dec 2022 13:54:57 +0000
Module Name: src
Committed By: riastradh
Date: Thu Dec 22 13:54:57 UTC 2022
Modified Files:
src/sys/net: route.c
Log Message:
route(4): Work around deadlock in rt_free wait path.
PR kern/56844
XXX pullup-8
XXX pullup-9
XXX pullup-10
To generate a diff of this commit:
cvs rdiff -u -r1.235 -r1.236 src/sys/net/route.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56844 CVS commit: [netbsd-10] src/sys/net
Date: Wed, 22 Feb 2023 18:52:46 +0000
Module Name: src
Committed By: martin
Date: Wed Feb 22 18:52:46 UTC 2023
Modified Files:
src/sys/net [netbsd-10]: route.c
Log Message:
Pull up following revision(s) (requested by riastradh in ticket #99):
sys/net/route.c: revision 1.236
route(4): Work around deadlock in rt_free wait path.
PR kern/56844
To generate a diff of this commit:
cvs rdiff -u -r1.235 -r1.235.2.1 src/sys/net/route.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56844 CVS commit: [netbsd-9] src/sys/net
Date: Wed, 22 Feb 2023 18:53:56 +0000
Module Name: src
Committed By: martin
Date: Wed Feb 22 18:53:56 UTC 2023
Modified Files:
src/sys/net [netbsd-9]: route.c
Log Message:
Pull up following revision(s) (requested by riastradh in ticket #1602):
sys/net/route.c: revision 1.236
route(4): Work around deadlock in rt_free wait path.
PR kern/56844
To generate a diff of this commit:
cvs rdiff -u -r1.219.2.2 -r1.219.2.3 src/sys/net/route.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/56844 CVS commit: [netbsd-8] src/sys/net
Date: Wed, 22 Feb 2023 18:55:07 +0000
Module Name: src
Committed By: martin
Date: Wed Feb 22 18:55:07 UTC 2023
Modified Files:
src/sys/net [netbsd-8]: route.c
Log Message:
Pull up following revision(s) (requested by riastradh in ticket #1801):
sys/net/route.c: revision 1.236
route(4): Work around deadlock in rt_free wait path.
PR kern/56844
To generate a diff of this commit:
cvs rdiff -u -r1.194.6.15 -r1.194.6.16 src/sys/net/route.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted: