NetBSD Problem Report #59236
From gson@gson.org Sun Mar 30 09:13:12 2025
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
key-exchange X25519 server-signature RSA-PSS (2048 bits)
client-signature RSA-PSS (2048 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 8ED051A9239
for <gnats-bugs@gnats.NetBSD.org>; Sun, 30 Mar 2025 09:13:12 +0000 (UTC)
Message-Id: <20250330091310.650C9253F02@guava.gson.org>
Date: Sun, 30 Mar 2025 12:13:10 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: Multiple segfaults in erlite3 boot
X-Send-Pr-Version: 3.95
>Number: 59236
>Notify-List: riastradh@NetBSD.org
>Category: port-evbmips
>Synopsis: Multiple segfaults in erlite3 boot
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: port-evbmips-maintainer
>State: needs-pullups
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Mar 30 09:15:00 +0000 2025
>Closed-Date:
>Last-Modified: Wed May 14 23:18:12 +0000 2025
>Originator: Andreas Gustafsson
>Release: NetBSD-current, source date 2025.03.29.19.40.42
>Organization:
>Environment:
System: NetBSD
Architecture: evbmips
Machine: mips64eb
>Description:
I tried to install NetBSD-current on a Ubiquiti EdgeRouter Lite aka
ERLite-3 by building an evbmips-mips64eb release (without debug
symbols because of PR 59233), uncompressing octeon.img.gz, dd:ing it
onto a USB flash drive, plugging the drive into the internal
USB connector, and using the u-boot configuration from
https://www.cambus.net/netbsd-on-the-edgerouter-lite/.
During the initial boot, ps dumped core:
[ 1.7995905] WARNING: CHECK AND RESET THE DATE!
Sat Mar 29 19:40:42 UTC 2025
[1] Segmentation fault /bin/sh -c "ps -p \$\$ -o ppid="
Starting root file system check:
The system nonetheless managed to resize the root partition and
reboot, but during the second boot, there were further segfaults
from ps, mount, and fsck:
[ 6.6503459] WARNING: CHECK AND RESET THE DATE!
Sat Mar 29 19:40:44 UTC 2025
[1] Segmentation fault /bin/sh -c "ps -p \$\$ -o ppid="
Starting root file system check:
/dev/rdk1: file system is clean; not checking
Not resizing / (NAME=octeon-root): already correct size
mount: /: Segmentation fault
Setting sysctl variables:
ddb.onpanic: 1 -> 0
Starting file system checks:
[1] Segmentation fault fsck -x / ${fsck_flags}
Unknown error 139; help!
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
init: `/bin/sh' Enter pathname of shell or RETURN for /bin/sh:
A full console log is at
https://www.gson.org/netbsd/bugs/erlite3/erlite3-2025.03.29.19.40.42-boot.txt
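A note on the "Unknown error 139" above: Bourne-style shells
conventionally report a command killed by signal N as exit status
128 + N, so 139 is consistent with fsck dying from signal 11 (SIGSEGV).
A minimal C sketch of that arithmetic follows; the function name is
made up for illustration and is not taken from any rc script:

```c
/*
 * Hypothetical helper: map a shell exit status to the signal that
 * killed the command, assuming the usual 128 + N convention.
 * Returns 0 if the status does not look like a signal death.
 */
static int
signal_from_shell_status(int status)
{
	if (status > 128 && status < 256)
		return status - 128;
	return 0;
}
```

Calling signal_from_shell_status(139) yields 11, the number of SIGSEGV
on NetBSD.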
A system built from CVS source date 2025.01.01.00.00.05 does not have
this problem. Perhaps it is related to the mipsn64 problem reported in
https://mail-index.netbsd.org/port-mips/2025/02/03/msg001435.html
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
State-Changed-From-To: open->feedback
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Fri, 18 Apr 2025 19:04:59 +0000
State-Changed-Why:
This is probably the same CN50xx bug that we have been puzzling
over in PR port-mips/59064: jemalloc switch to 5.3 broke userland
<https://gnats.NetBSD.org/59064>.
Can you try the patch at the bottom of this message?
https://mail-index.NetBSD.org/netbsd-bugs/2025/04/14/msg088307.html
If you open one of the core dumps in gdb (if you are able to do that
from another machine where everything isn't segfaulting all the time,
e.g. if the core dump is written to nfs) and do `x/i $pc' and `bt', I
bet you will find it in malloc_default (via some stack trace through
jemalloc) at this instruction:
00008a58 <malloc_default>:
malloc_default():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2727
8a58: 27bdff70 addiu sp,sp,-144
8a5c: ffbc0078 sd gp,120(sp)
8a60: 3c1c0000 lui gp,0x0
8a60: R_MIPS_GPREL16 malloc_default
8a60: R_MIPS_SUB *ABS*
8a60: R_MIPS_HI16 *ABS*
8a64: 0399e021 addu gp,gp,t9
8a68: 279c0000 addiu gp,gp,0
8a68: R_MIPS_GPREL16 malloc_default
8a68: R_MIPS_SUB *ABS*
8a68: R_MIPS_LO16 *ABS*
tsd_fetch_impl():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/tsd.h:270
8a6c: 8f820000 lw v0,0(gp)
8a6c: R_MIPS_TLS_GOTTPREL je_tsd_tls
8a70: 7c03e83b 0x7c03e83b
malloc_default():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2727
8a74: ffb10040 sd s1,64(sp)
8a78: ffb00038 sd s0,56(sp)
tsd_fetch_impl():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/tsd.h:270
8a7c: 00433021 addu a2,v0,v1
malloc_default():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../dist/src/jemalloc.c:2727
8a80: ffbf0088 sd ra,136(sp)
8a84: ffbe0080 sd s8,128(sp)
8a88: ffb70070 sd s7,112(sp)
8a8c: ffb60068 sd s6,104(sp)
8a90: ffb50060 sd s5,96(sp)
8a94: ffb40058 sd s4,88(sp)
8a98: ffb30050 sd s3,80(sp)
8a9c: ffb20048 sd s2,72(sp)
tsd_fetch_impl():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/tsd.h:422
=> 8aa0: 90c30258 lbu v1,600(a2)
And I bet you will find that $v0 holds the address malloc_default+0x18,
i.e., the pc of this instruction:
tsd_fetch_impl():
/home/riastradh/netbsd/current/src/external/bsd/jemalloc/lib/../include/jemalloc/internal/tsd.h:270
8a6c: 8f820000 lw v0,0(gp)
8a6c: R_MIPS_TLS_GOTTPREL je_tsd_tls
=> 8a70: 7c03e83b 0x7c03e83b
The instruction 0x7c03e83b is sometimes also written
rdhwr $3,$29
or
rdhwr v1,ulr
but it is architecturally undefined so it traps to the kernel to
emulate, and the kernel is supposed to return the thread's tcb pointer
in v1.
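To see why 0x7c03e83b reads as rdhwr $3,$29, the word can be split
into the standard MIPS fields: opcode SPECIAL3 (0x1f) in bits 31..26,
rt in bits 20..16, rd in bits 15..11, and function RDHWR (0x3b) in
bits 5..0.  A small C sketch, with a hypothetical struct and function
name, not code from the kernel:

```c
#include <stdint.h>

/* Fields of interest in a MIPS RDHWR instruction. */
struct rdhwr {
	unsigned rt;	/* destination general-purpose register */
	unsigned rd;	/* hardware register number (29 = ulr) */
};

/*
 * Returns 1 and fills *out if insn has the SPECIAL3 opcode, a zero
 * rs field, and the RDHWR function code; returns 0 otherwise.
 * Bits 10..6 are not examined in this sketch.
 */
static int
decode_rdhwr(uint32_t insn, struct rdhwr *out)
{
	unsigned opcode = (insn >> 26) & 0x3f;
	unsigned rs = (insn >> 21) & 0x1f;
	unsigned funct = insn & 0x3f;

	if (opcode != 0x1f || rs != 0 || funct != 0x3b)
		return 0;
	out->rt = (insn >> 16) & 0x1f;
	out->rd = (insn >> 11) & 0x1f;
	return 1;
}
```

For 0x7c03e83b this gives rt = 3 (v1) and rd = 29 (ulr).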
But as a side effect, the emulation clobbers the register v0 with the
address of the excepting instruction, rather than leaving it as the
value it found at -1234(gp) (or whatever; written as 0(gp) above, but
the linker will replace it by some probably-nonzero number; you can use
`objdump --disassemble=malloc_default libc.so' to find it), which is
decidedly not the instruction address malloc_default+0x18 but rather
some tls offset that is reasonable to add to the tcb pointer.
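When looking at a core dump, the two checks described above can be
written down as follows.  This is only a sketch: the function names are
invented, and the arithmetic is modelled as 32-bit wrapping addition on
the assumption that the listing is from an n32 userland (as the use of
lw and addu on pointers suggests).

```c
#include <stdint.h>

/* Offset of the rdhwr within malloc_default in the listing: 0x8a70 - 0x8a58. */
#define RDHWR_OFFSET	0x18

/* Does v0 hold the address of the rdhwr instead of a tls offset? */
static int
v0_is_clobbered(uint32_t v0, uint32_t malloc_default_addr)
{
	return v0 == malloc_default_addr + RDHWR_OFFSET;
}

/* Address read by `lbu v1,600(a2)' after `addu a2,v0,v1'. */
static uint32_t
faulting_address(uint32_t v0, uint32_t v1)
{
	return v0 + v1 + 600;
}
```

With a sane v0 the second function yields tcb + tls offset + 600; with
a clobbered v0 it yields tcb + pc + 600, which has no reason to be a
mapped address.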
Now, the emulation routine
https://nxr.netbsd.org/xref/src/sys/arch/mips/mips/mipsX_subr.S?r=1.115#1297
is not _supposed_ to clobber v0 -- it goes out of its way to save v0 on
the kernel stack and restore it before returning from the exception:
1312 /* Need two working registers */
1313 REG_S AT, CALLFRAME_SIZ+TF_REG_AST(k0)
1314 REG_S v0, CALLFRAME_SIZ+TF_REG_V0(k0)
...
1349 REG_L AT, CALLFRAME_SIZ+TF_REG_AST(k0)# restore reg
1350 REG_L v0, CALLFRAME_SIZ+TF_REG_V0(k0) # restore reg
1351 eret
But, in all my trials, it has been consistently corrupted in the same
way. The best theory we have for why it is corrupted is that cn50xx
CPUs -- found in erlite3 (but not er4) -- have some kind of
register-writeback bug (which passes through some register renaming
unchanged) provoked by the particular combination of reading
MIPS_COP_0_EXC_PC and eret, so that after the eret, the exception pc
gets written back to v0 even though we just restored v0 from the
kernel stack.
So, all that said, here is a summary of the science we did on my
erlite3, together with a patch that seems to address the issue and --
under the theory that it is the register that we move MIPS_COP_0_EXC_PC
into -- will only corrupt a temporary register k0 which is not
accessible to userland and is treated as garbage at every kernel entry
point, so it's safe:
https://mail-index.NetBSD.org/netbsd-bugs/2025/04/14/msg088307.html
From: Rin Okuyama <rokuyama.rk@gmail.com>
To: gnats-bugs@netbsd.org, port-evbmips-maintainer@netbsd.org,
netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, riastradh@NetBSD.org,
Andreas Gustafsson <gson@gson.org>, Martin Husemann <martin@duskware.de>
Cc:
Subject: Re: port-evbmips/59236 (Multiple segfaults in erlite3 boot)
Date: Sat, 19 Apr 2025 15:06:09 +0900
On 2025/04/19 4:04, riastradh@NetBSD.org wrote:
> Synopsis: Multiple segfaults in erlite3 boot
>
> State-Changed-From-To: open->feedback
> State-Changed-By: riastradh@NetBSD.org
> State-Changed-When: Fri, 18 Apr 2025 19:04:59 +0000
> State-Changed-Why:
> This is probably the same CN50xx bug that we have been puzzling
> over in PR port-mips/59064: jemalloc switch to 5.3 broke userland
> <https://gnats.NetBSD.org/59064>.
>
> Can you try the patch at the bottom of this message?
>
> https://mail-index.NetBSD.org/netbsd-bugs/2025/04/14/msg088307.html
Thank you very much for working on this problem!
Unfortunately, even with your patch, erlite3 cannot boot into
multiuser mode with either the n64 or the n32 userland:
https://gist.github.com/rokuyama/7bbe1619e55e8e3aba5bf3b112a23725
On the other hand, a MIPSSIM64 kernel on QEMU successfully boots into
multiuser mode.
In the above-mentioned log, debug printf is enabled for trap():
```
diff --git a/sys/arch/mips/mips/trap.c b/sys/arch/mips/mips/trap.c
index 58caf19e2d2..a079dec91dd 100644
--- a/sys/arch/mips/mips/trap.c
+++ b/sys/arch/mips/mips/trap.c
@@ -448,8 +448,8 @@ trap(uint32_t status, uint32_t cause, vaddr_t vaddr, vaddr_t pc,
rv = uvm_fault(map, va, ftype);
pcb->pcb_onfault = onfault;
-#if defined(VMFAULT_TRACE)
- if (!KERNLAND_P(va))
+#if defined(VMFAULT_TRACE) || 1
+ if (!KERNLAND_P(va) && rv != 0)
printf(
"uvm_fault(%p (pmap %p), %#"PRIxVADDR
" (%"PRIxVADDR"), %d) -> %d at pc %#"PRIxVADDR"\n",
```
You can see SEGVs are caused by read access to NULL:
```
[ 13.3599689] uvm_fault(0x980000041f9c0c00 (pmap 0x980000041fce44d0), 0 (0), 1) -> 14 at pc 0xfff83b1db4
[1] Segmentation fault (core dumped) /sbin/ifconfig lo0 inet6 >/dev/null 2>&1
...
[ 19.5399661] uvm_fault(0x980000041f20c800 (pmap 0x980000041fce44d0), 0 (0), 1) -> 14 at pc 0xfff8391db4
[1] Segmentation fault (core dumped) awk "/^sendmail[ \t]/{print\$2}" /etc/mailer.conf
```
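For anyone sifting through a long console log, the debug lines can be
picked apart mechanically.  Below is a rough C sketch that follows the
printf format in the diff above; the struct and function names are made
up, and which of the two printed addresses is the raw one and which the
page-truncated one is not something this sketch assumes.  In the lines
shown, the fault type 1 and result 14 match VM_PROT_READ and EFAULT on
NetBSD, which fits a read through a null pointer.

```c
#include <stdio.h>
#include <string.h>

/* One parsed uvm_fault debug line (hypothetical layout). */
struct fault_record {
	unsigned long long map, pmap, pc;
	unsigned long long va[2];	/* the two addresses, in print order */
	int ftype, rv;
};

/* Returns 1 on success, 0 if line holds no complete uvm_fault record. */
static int
parse_uvm_fault(const char *line, struct fault_record *r)
{
	/* Skip the "[ seconds]" timestamp, if any. */
	const char *p = strstr(line, "uvm_fault(");

	if (p == NULL)
		return 0;
	/* %llx accepts the values with or without a 0x prefix. */
	return sscanf(p,
	    "uvm_fault(%llx (pmap %llx), %llx (%llx), %d) -> %d at pc %llx",
	    &r->map, &r->pmap, &r->va[0], &r->va[1], &r->ftype, &r->rv,
	    &r->pc) == 7;
}
```

Comparing the pc field across records (0xfff83b1db4 and 0xfff8391db4
here) makes it easy to spot that the faults share the same low bits and
so probably come from the same instruction in a shared library loaded
at different addresses.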
As you pointed out earlier, the SEGVs can be avoided by replacing
`user_reserved_insn` with `user_gen_exception`, i.e.:
https://gist.github.com/rokuyama/c7a50b8e7a62dc25f3f536f1434eea9b
Grepping through the Linux code, I found that it checks the TLB entry
for the PC before fetching the instruction:
https://github.com/torvalds/linux/commit/5b10496b6e65#diff-bbe4c1a54ce7bd13e6109d887383993c3b5276a1362f84092e9ef31dc84064d9R390
and our `user_gen_exception` path uses copyin(9), of course.
I know almost nothing about mips, though, and much more destructive
results may be possible in this "double-fault scenario"...
Thanks,
rin
From: Andreas Gustafsson <gson@gson.org>
To: riastradh@NetBSD.org, gnats-bugs@netbsd.org
Cc:
Subject: Re: port-evbmips/59236 (Multiple segfaults in erlite3 boot)
Date: Sat, 19 Apr 2025 21:46:24 +0300
riastradh@NetBSD.org wrote:
> Can you try the patch at the bottom of this message?
With the patch, I'm still getting coredumps. This time, the root file
system resizing step did not happen, but I did get a login prompt:
[ 1.8020371] WARNING: no TOD clock present
[ 1.8020371] WARNING: using filesystem time
[ 1.8101320] WARNING: CHECK AND RESET THE DATE!
Sat Apr 19 07:20:47 UTC 2025
[1] Segmentation fault /bin/sh -c "ps -p \$\$ -o ppid="
[2] Segmentation fault rcorder -s nostart ${rc_rcorder_flags} ${scripts}
rcorder terminated with signal 11
The following components reported failures:
rcorder
See /var/run/rc.log for more information.
Sat Apr 19 07:20:47 UTC 2025
init: can't add init: can't add init: kernel secinit: can't add
NetBSD/evbmips (Amnesiac) (constty)
login:
I was able to log in, and the login shell works, but if I try to run a
subshell from it, the subshell dumps core. Many other commands, such
as ls or ps, also dump core. The ls and ps in /rescue do work.
> If you open one of the core dumps in gdb (if you are able to do that
> from another machine where everything isn't segfaulting all the time,
> e.g. if the core dump is written to nfs)
I'm not using nfs, but I managed to get some core dumps (after
remounting the root file system r/w) and saved them by removing the
embedded USB flash drive from the erlite3 and reading it on another
system. You can find them in
https://www.gson.org/netbsd/bugs/erlite3/cores.tar.gz
I have not looked at them myself yet.
--
Andreas Gustafsson, gson@gson.org
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/59236 CVS commit: src
Date: Sun, 20 Apr 2025 22:31:01 +0000
Module Name: src
Committed By: riastradh
Date: Sun Apr 20 22:31:01 UTC 2025
Modified Files:
src/distrib/sets/lists/debug: mi
src/distrib/sets/lists/tests: mi
src/tests/kernel: Makefile t_signal_and_sp.c
src/tests/kernel/arch/aarch64: stack_pointer.h
Added Files:
src/tests/kernel: h_execsp.c h_execsp.h
src/tests/kernel/arch/aarch64: execsp.S signalsphandler.S
src/tests/kernel/arch/x86_64: execsp.S signalsphandler.S
stack_pointer.h
Log Message:
Test stack pointer alignment in various scenarios.
1. elf entry point
2. main function
3. signal handler
Extend the test to amd64 while here -- fortunately both aarch64 and
amd64 pass, but others, such as mips, will fail:
PR kern/59327: user stack pointer is not aligned properly
This extends the test that was previously written for:
PR kern/58149: aarch64: Cannot return from a signal handler if SP was
misaligned when the signal arrived
With any luck, this will help us to systematically eradicate misaligned
stack pointers as hypothesized to be the reason for:
PR port-mips/59236: Multiple segfaults in erlite3 boot
To generate a diff of this commit:
cvs rdiff -u -r1.476 -r1.477 src/distrib/sets/lists/debug/mi
cvs rdiff -u -r1.1369 -r1.1370 src/distrib/sets/lists/tests/mi
cvs rdiff -u -r1.87 -r1.88 src/tests/kernel/Makefile
cvs rdiff -u -r0 -r1.1 src/tests/kernel/h_execsp.c \
src/tests/kernel/h_execsp.h
cvs rdiff -u -r1.1 -r1.2 src/tests/kernel/t_signal_and_sp.c
cvs rdiff -u -r0 -r1.1 src/tests/kernel/arch/aarch64/execsp.S \
src/tests/kernel/arch/aarch64/signalsphandler.S
cvs rdiff -u -r1.1 -r1.2 src/tests/kernel/arch/aarch64/stack_pointer.h
cvs rdiff -u -r0 -r1.1 src/tests/kernel/arch/x86_64/execsp.S \
src/tests/kernel/arch/x86_64/signalsphandler.S \
src/tests/kernel/arch/x86_64/stack_pointer.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
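The property these tests look at can be stated in one line of C.  The
following is only an illustration of the alignment predicate, with an
invented function name; it is not the code in t_signal_and_sp.c, and
the required alignment is ABI-specific (16 bytes on aarch64 and amd64).

```c
#include <stdint.h>

/* Returns 1 if sp is a multiple of align; align must be a power of two. */
static int
sp_is_aligned(uintptr_t sp, uintptr_t align)
{
	return (sp & (align - 1)) == 0;
}
```

For example, sp_is_aligned(0x7fff0010, 16) holds while
sp_is_aligned(0x7fff0018, 16) does not.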
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/59236 CVS commit: src
Date: Sun, 20 Apr 2025 22:32:26 +0000
Module Name: src
Committed By: riastradh
Date: Sun Apr 20 22:32:25 UTC 2025
Modified Files:
src/sys/arch/mips/include: mips_param.h
src/tests/kernel: Makefile t_signal_and_sp.c
Added Files:
src/tests/kernel/arch/mips: execsp.S signalsphandler.S stack_pointer.h
Log Message:
t_signal_and_sp: Add mips support.
PR kern/59327: user stack pointer is not aligned properly
PR kern/58149: Cannot return from a signal handler if SP was
misaligned when the signal arrived
Stack pointer misalignment in some cases hypothesized to be a possible
cause of:
PR port-evbmips/59236: Multiple segfaults in erlite3 boot
To generate a diff of this commit:
cvs rdiff -u -r1.52 -r1.53 src/sys/arch/mips/include/mips_param.h
cvs rdiff -u -r1.88 -r1.89 src/tests/kernel/Makefile
cvs rdiff -u -r1.4 -r1.5 src/tests/kernel/t_signal_and_sp.c
cvs rdiff -u -r0 -r1.1 src/tests/kernel/arch/mips/execsp.S \
src/tests/kernel/arch/mips/signalsphandler.S \
src/tests/kernel/arch/mips/stack_pointer.h
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: "Taylor R Campbell" <riastradh@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/59236 CVS commit: src/sys/arch/mips/mips
Date: Thu, 24 Apr 2025 01:40:27 +0000
Module Name: src
Committed By: riastradh
Date: Thu Apr 24 01:40:27 UTC 2025
Modified Files:
src/sys/arch/mips/mips: mipsX_subr.S
Log Message:
mips: Disable rdhwr emulation fast path on Octeon CPUs.
They are haunted.
PR kern/59064: jemalloc switch to 5.3 broke userland
PR kern/59236: Multiple segfaults in erlite3 boot
To generate a diff of this commit:
cvs rdiff -u -r1.115 -r1.116 src/sys/arch/mips/mips/mipsX_subr.S
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Andreas Gustafsson <gson@gson.org>
To: riastradh@NetBSD.org, gnats-bugs@netbsd.org
Cc:
Subject: Re: port-evbmips/59236 (Multiple segfaults in erlite3 boot)
Date: Thu, 24 Apr 2025 21:53:42 +0300
Last week, riastradh@NetBSD.org wrote:
> Can you try the patch at the bottom of this message?
>
> https://mail-index.NetBSD.org/netbsd-bugs/2025/04/14/msg088307.html
I built a new octeon.img from -current sources from 2025.04.24.12.54.43
with the patch (and a second patch to add comp and test to the
"sets=" line in src/distrib/utils/embedded/conf/octeon.conf), booted
it from a USB stick, and it successfully resized the root FS and
rebooted without any core dumps. I'm not sure which commit fixed it,
but since it probably was one of yours, thank you!
I have not tried this version without the patch.
The system did later lock up while running the ATF tests, but that's
a separate issue that should get its own PR.
--
Andreas Gustafsson, gson@gson.org
State-Changed-From-To: feedback->needs-pullups
State-Changed-By: riastradh@NetBSD.org
State-Changed-When: Wed, 14 May 2025 23:18:12 +0000
State-Changed-Why:
fixed in HEAD, needs pullup-9 and pullup-10
>Unformatted:
Copyright © 1994-2025
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.