NetBSD Problem Report #56701

From www@netbsd.org  Wed Feb  9 16:33:43 2022
Return-Path: <www@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 3513B1A9239
	for <gnats-bugs@gnats.NetBSD.org>; Wed,  9 Feb 2022 16:33:43 +0000 (UTC)
Message-Id: <20220209163341.BACE91A923B@mollari.NetBSD.org>
Date: Wed,  9 Feb 2022 16:33:41 +0000 (UTC)
From: nervoso@k1.com.br
Reply-To: nervoso@k1.com.br
To: gnats-bugs@NetBSD.org
Subject: panic on ufs2 while running on QEMU 6.2.0
X-Send-Pr-Version: www-1.0

>Number:         56701
>Category:       kern
>Synopsis:       panic on ufs2 while running on QEMU 6.2.0
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Feb 09 16:35:00 +0000 2022
>Last-Modified:  Thu Feb 17 15:00:01 +0000 2022
>Originator:     sergio lenzi
>Release:        9.99.93  on i386
>Organization:
k1 systems
>Environment:
NetBSD i386.netbsd 9.99.93 NetBSD 9.99.93 (LZT9) #0: Wed Nov 17 23:32:44 -03 2021  NetBSD@VMS.lenzicasa:/home/NetBSD/BUILD/9/i386/OBJ/sys/arch/i386/compile/GENERIC i386
>Description:
The HOST runs NetBSD 9.99.93 amd64. It is an AMD 8350 with 32 GB of memory and runs several QEMU VMs, some with Arch Linux, some with FreeBSD. Now I am trying to run NetBSD-i386-9.99.93: it installs OK from the ISO, the system boots, and it reaches the internet.

The same configuration runs OK on NetBSD-9.2_STABLE i386. Something is wrong; there seems to be a race condition in the ufs2 filesystem, and only on HEAD.

The QEMU script (adjust the DISK, FORMAT, IFACE and ISO variables):

qemu-system-i386 -accel nvmm \
        -cpu core2duo -smp cpus=4 -m 4g \
        -drive if=virtio,format=${FORMAT},file=${DISK} \
        -device virtio-rng-pci,rng=viornd0 \
        -object rng-random,filename=/dev/urandom,id=viornd0 \
        -netdev tap,id=${IFACE},ifname=${IFACE},script=no \
        -device virtio-net-pci,netdev=${IFACE} \
        -cdrom ${ISO}

The disk format and size do not matter; it always crashes.

>How-To-Repeat:
After downloading pkgsrc, when I try to build the first package (pkgtools/cwrappers), the system panics IN THE VM with a message about a NUL in an inode name. I have not been able to collect the kernel crash dump.

Create the disk image:  qemu-img create -f raw DISK_IMAGE.raw 128G

Run the QEMU script.
>Fix:

>Audit-Trail:
From: Paul Goyette <paul@whooppee.com>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Wed, 9 Feb 2022 08:46:41 -0800 (PST)

 On Wed, 9 Feb 2022, nervoso@k1.com.br wrote:

 >> How-To-Repeat:
 > After downloading pkgsrc, when I try to build the first package
 > (pkgtools/cwrappers), the system panics IN THE VM with a message
 > about a NUL in an inode name. I have not been able to collect the
 > kernel crash dump.

 You should at least be able to capture serial console output from the
 VM and get us a stack backtrace.
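
 One way to capture the panic message and backtrace is to send the
 guest's serial port to a file on the host. The sketch below assumes the
 reporter's QEMU script (network, RNG and CD-ROM options are omitted for
 brevity). "-serial file:" is a standard QEMU option; the log file name
 console.log and the variable defaults are placeholders, not values from
 the report. The guest also has to use its serial console, for example
 by entering "consdev com0" at the NetBSD boot prompt, otherwise the log
 stays empty.

```shell
#!/bin/sh
# Sketch only: the reporter's QEMU invocation plus a serial log on the host.
# All default values below are placeholders, not taken from the report.
DISK=${DISK:-netbsd-i386.raw}            # guest disk image
FORMAT=${FORMAT:-raw}                    # image format
CONSOLE_LOG=${CONSOLE_LOG:-console.log}  # host file receiving serial output

# Prints the command by default; set RUNNER=env to actually start the VM.
launch_vm() {
    "${RUNNER:-echo}" qemu-system-i386 -accel nvmm \
        -cpu core2duo -smp cpus=4 -m 4g \
        -drive "if=virtio,format=${FORMAT},file=${DISK}" \
        -serial "file:${CONSOLE_LOG}"
}

launch_vm
```

 Run as is, the sketch only prints the command. With RUNNER=env it
 starts the VM, and the panic text should then end up in the log file on
 the host.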


 +--------------------+--------------------------+----------------------+
 | Paul Goyette       | PGP Key fingerprint:     | E-mail addresses:    |
 | (Retired)          | FA29 0E3B 35AF E8AE 6651 | paul@whooppee.com    |
 | Software Developer | 0786 F758 55DE 53BA 7731 | pgoyette@netbsd.org  |
 | & Network Engineer |                          | pgoyette99@gmail.com |
 +--------------------+--------------------------+----------------------+

From: nervoso <nervoso@k1.com.br>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Fri, 11 Feb 2022 14:36:31 -0300

 On Wed, 2022-02-09 at 16:50 +0000, Paul Goyette wrote:
 >  You should at least be able to capture serial console output from the
 >  VM and get us a stack backtrace.

 Thank you. The message appears on the QEMU console, both in the GTK
 window and on the VNC console. It shows the kernel dump being written,
 but after reboot there is no core dump in /var/crash, even though the
 "disk" has a swap partition. I will try to capture it.

From: nervoso <nervoso@k1.com.br>
To: gnats-bugs@netbsd.org
Cc: paul@whooppee.com
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Sat, 12 Feb 2022 21:18:30 -0300

 On Wed, 2022-02-09 at 16:50 +0000, Paul Goyette wrote:
 >  You should at least be able to capture serial console output from the
 >  VM and get us a stack backtrace.

 OK, I finally have the crash dump.

 Both files are at http://k1.com.br/crash/

 You can proceed from that.
 The amd64 version is OK.
 One more tip: under FreeBSD bhyve, the same image does not boot and
 fails with an undefined error.

 Thank you.

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Sun, 13 Feb 2022 12:02:45 +0100

 The files are not accessible (403).

 Martin

From: nervoso <nervoso@k1.com.br>
To: gnats-bugs@netbsd.org
Cc: martin@duskware.de
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Sun, 13 Feb 2022 10:24:56 -0300

 On Sun, 2022-02-13 at 11:05 +0000, Martin Husemann wrote:
 >  The files are not accessible (403).

 Oops, permissions fixed. Please try again.

 The crash files are from QEMU 6.0.2 running qemu-system-x86_64 on a
 HEAD amd64 host. I have the same crash files from a run under
 qemu-system-i386, which shows the same backtrace; that file is named
 netbs.1*. Please note that the amd64 version runs fine on QEMU using
 the same source code, built from version 9.99.93.

 http://k1.com.br/crash

From: Martin Husemann <martin@duskware.de>
To: nervoso <nervoso@k1.com.br>
Cc: gnats-bugs@netbsd.org
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Sun, 13 Feb 2022 17:24:00 +0100

 System panicked: /: bad dir ino 11444433 at offset 376: NUL in name [.extract_mak] i=12, namlen=20
 #0  0xc011b67d in cpu_reboot ()
 #1  0xc0c1976a in kern_reboot ()
 #2  0xc0c558e1 in vpanic ()
 #3  0xc0c5596f in panic ()
 #4  0xc0baefe5 in ufs_lookup ()
 #5  0xc0cbf92c in VOP_LOOKUP ()
 #6  0xc0ca174f in lookup_once ()
 #7  0xc0ca2647 in namei_tryemulroot ()
 #8  0xc0ca4153 in namei ()
 #9  0xc0caacc1 in do_sys_unlinkat ()
 #10 0xc041f39a in syscall ()
 #11 0xc01007f8 in Xsyscall ()

 This means: something corrupted your disk contents in a directory.

 Martin

From: nervoso <nervoso@k1.com.br>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Sun, 13 Feb 2022 17:02:51 -0300

 On Sun, 2022-02-13 at 16:25 +0000, Martin Husemann wrote:
 >  System panicked: /: bad dir ino 11444433 at offset 376: NUL in name [.extract_mak] i=12, namlen=20
 >  
 >  This means: something corrupted your disk contents in a directory.

 Correct, but it only happens with the i386 version, only in QEMU, and
 only with HEAD (9.99.93). If I replace the /netbsd kernel on the SAME
 filesystem with the /netbsd from the original 9.2_STABLE ISO, the
 system boots, runs fsck, finds no problems, marks the filesystem clean,
 and runs fine. If I switch /netbsd back to the 9.99.93 kernel, the
 system panics.

 Something has changed in ufs2 in 9.99.93, and only on i386. Perhaps
 some assembly code, or something inlined?

 Thanks

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Mon, 14 Feb 2022 09:54:49 +0100

 On Sun, Feb 13, 2022 at 08:05:01PM +0000, nervoso wrote:
 >  Something have changed in the ufs2 in the 9.99.93 and only in the
 >  i386... perhaps some .asm code???  or inline???

 This is probably totally unrelated to ffs code.
 My guess is random memory corruption.

 Could be caused by:
  - a bug in qemu
  - some random NetBSD kernel bug

 Also note that we heavily use Qemu to test i386 builds in our automatic
 test setup, and we have not seen this issue there.

 Martin

From: nervoso <nervoso@k1.com.br>
To: gnats-bugs@netbsd.org
Cc: martin@duskware.de
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Mon, 14 Feb 2022 18:50:10 -0300

 On Mon, 2022-02-14 at 08:55 +0000, Martin Husemann wrote:
 >  This is probably totally unrelated to ffs code.
 >  My guess is random memory corruption.
 >  
 >  Could be caused by:
 >   - a bug in qemu
 >   - some random NetBSD kernel bug
 >  
 >  Also note that we heavily use Qemu to test i386 builds in our automatic
 >  test setup, and we have not seen this issue there.

 Correct, but then why does netbsd-i386-9.2_STABLE run fine? OK, I admit
 that I should not use HEAD to build systems, but I think this must be
 tracked down. It is the same machine and the same emulator; the only
 change is the kernel. Even if the runtime (base) is HEAD, running the
 9.2_STABLE kernel works fine.

 There is a bug in HEAD, and it shows up only with a parallel make when
 trying to build /usr/pkgsrc/pkgtools/cwrappers. An fsck -fy on the
 filesystem reports NO errors, so the problem occurs in the buffer
 before it is written to the media, probably a race condition in the
 HEAD kernel. I will wait for 9.99.94 to see if it still happens.

 Thanks for your attention.

From: David Holland <dholland-bugs@netbsd.org>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Mon, 14 Feb 2022 22:27:00 +0000

 On Mon, Feb 14, 2022 at 08:55:01AM +0000, Martin Husemann wrote:
  >  This is probably totally unrelated to ffs code.
  >  My guess is random memory corruption.
  >  
  >  Could be caused by:
  >   - a bug in qemu
  >   - some random NetBSD kernel bug
  >  
  >  Also note that we heavily use Qemu to test i386 builds in our automatic
  >  test setup, and we have not seen this issue there.

 My guess would be a 32-bit issue in the virtio disk code, since that's
 probably rarely used with 32-bit kernels.
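
 One way to test this guess is to boot the same image with an emulated
 IDE disk in place of the virtio one. A minimal sketch, assuming the
 reporter's QEMU script (other devices omitted); the variable defaults
 and the function name are placeholders. In the guest the disk would
 then normally attach as wd0 rather than ld0, so /etc/fstab and the boot
 device may need adjusting.

```shell
#!/bin/sh
# Sketch only: same VM, but the disk bus is selectable so that a virtio
# run can be compared with an emulated IDE run. Defaults are placeholders.
DISK=${DISK:-netbsd-i386.raw}
FORMAT=${FORMAT:-raw}

# $1 = disk interface: "virtio" (as in the report) or "ide".
# Prints the command by default; set RUNNER=env to actually start the VM.
launch_vm_with_bus() {
    bus=$1
    case "$bus" in
        virtio|ide) ;;
        *) echo "unsupported disk interface: $bus" >&2; return 1 ;;
    esac
    "${RUNNER:-echo}" qemu-system-i386 -accel nvmm \
        -cpu core2duo -smp cpus=4 -m 4g \
        -drive "if=${bus},format=${FORMAT},file=${DISK}"
}

launch_vm_with_bus ide
```

 If the panic disappears with the IDE disk and returns with virtio, that
 would point at the virtio disk path rather than at ffs itself.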

 -- 
 David A. Holland
 dholland@netbsd.org

From: nervoso <nervoso@k1.com.br>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Mon, 14 Feb 2022 20:17:24 -0300

 On Mon, 2022-02-14 at 22:30 +0000, David Holland wrote:
 >  My guess would be a 32-bit issue in the virtio disk code, since that's
 >  probably rarely used with 32-bit kernels.

 Good, I will try a non-virtio disk and report the result. Perhaps it
 is misaligned DMA, which by default is aligned on a 64-bit boundary on
 amd64 and on a 32-bit boundary on i386? The virtio code has probably
 changed since 9.2_STABLE.

 Will let you know.

From: nervoso <nervoso@k1.com.br>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Thu, 17 Feb 2022 10:43:47 -0300

 On Mon, 2022-02-14 at 22:30 +0000, David Holland wrote:
 >  My guess would be a 32-bit issue in the virtio disk code, since that's
 >  probably rarely used with 32-bit kernels.

 I have tracked this down: it probably depends on the machine on which
 the distribution was built. I ran the following sequence on an AMD 8350
 host running HEAD amd64:

 1) download the HEAD i386 ISO from netbsd.org
 2) install HEAD in the VM
 3) run the test; it works with no problem (so QEMU and the host OS,
    NetBSD HEAD amd64, are OK)

 4) download the NetBSD HEAD sources into /usr/src
 5) build a custom kernel with users=128, no npf, pppoe server....
 6) copy the kernel to /netbsd.NEW
 7) boot the new kernel; the system works OK

 Conclusion: a kernel built on the AMD 8350 under NetBSD amd64 HEAD does
 not work. It only works if built on an i386 machine (which can be a VM
 under QEMU) with a native i386 gcc, using build.sh.

 Probably gcc on amd64 produces wrong i386 code. Atomic operation
 issues?

 You can consider the PR closed with this advice.

 Thank you

From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Thu, 17 Feb 2022 14:52:04 +0100

 On Thu, Feb 17, 2022 at 01:45:01PM +0000, nervoso wrote:
 >  probably the gcc compiler on amd64 produces wrong i386 code... atomic
 >  OP issues???

 All builds are cross compiled with the same compiler, even if you build
 on the same arch as the target.

 The cross compiler emitting different code when compiled in a different
 environment is possible (and would be a gcc bug), but in this case sounds
 very unlikely.
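
 For reference, a cross build of the tools and an i386 kernel with
 build.sh typically looks like the sketch below. The -U, -j, -m, -T and
 -O options and the "tools" and "kernel=" operations are standard
 build.sh usage; the directory names, job count and function name are
 placeholders, not values from this report. Running the same commands on
 an amd64 host and inside an i386 guest, each with its own tools
 directory, would give two kernels built from the same sources to
 compare.

```shell
#!/bin/sh
# Sketch only: cross-build the toolchain and an i386 kernel with build.sh.
# TOOLDIR, OBJDIR and JOBS are placeholders, not taken from the report.
# build.sh must be run from the top of the source tree (e.g. /usr/src).
TOOLDIR=${TOOLDIR:-/usr/tools.i386}
OBJDIR=${OBJDIR:-/usr/obj.i386}
JOBS=${JOBS:-4}

# $1 = kernel configuration name (default GENERIC).
# Prints the command by default; set RUNNER=env to actually run the build.
build_i386_kernel() {
    conf=${1:-GENERIC}
    "${RUNNER:-echo}" ./build.sh -U -j "$JOBS" -m i386 \
        -T "$TOOLDIR" -O "$OBJDIR" tools "kernel=$conf"
}

build_i386_kernel GENERIC
```

 Because the tools operation builds the cross compiler from the source
 tree itself, both hosts should end up with the same gcc version for a
 given tree, which is the point made above.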

 Martin

From: nervoso <nervoso@k1.com.br>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Thu, 17 Feb 2022 11:49:11 -0300

 --=-uo4iG31YMwQvCZ7Ei5uH
 Content-Type: text/plain; charset="UTF-8"
 Content-Transfer-Encoding: 8bit

 On Thu, 2022-02-17 at 13:55 +0000, Martin Husemann wrote:
 >  All builds are cross compiled with the same compiler, even if you build
 >  on the same arch as the target.
 >  
 >  The cross compiler emitting different code when compiled in a different
 >  environment is possible (and would be a gcc bug), but in this case sounds
 >  very unlikely.

 But that is what is happening. HEAD uses gcc 10.3 while 9.2 uses gcc
 7.5. The interesting part is that when I build the GENERIC kernel using
 build.sh .... kernel=GENERIC, it builds the TOOLS.
 If I build the TOOLS on amd64 and then build the kernel, it breaks,
 but if I build the TOOLS on i386 and then build the kernel, the
 system works.

 If I build the GUEST from 9.2_STABLE sources, however, the system never
 breaks. The only possible explanation is that building the TOOLS for
 the 9.2 guest results in a gcc 7.5 compiler, while HEAD builds the
 system using gcc 10.3.

 Conclusion: the only way to build a working i386 HEAD kernel for QEMU
 is to build it inside the i386 VM, booted from the original HEAD ISO
 from netbsd.org. Once booted, I am able to build any custom kernel (no
 need to build modules) and all of the kernels are operational.

 Unfortunately I do not have any Intel machine here, but I will have a
 Core 2 in a few days. Then I will test it running on "bare metal" and
 let you know.

 This is weird, I know.


From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: kern/56701: panic on ufs2 while running on QEMU 6.2.0
Date: Thu, 17 Feb 2022 15:58:54 +0100

 On Thu, Feb 17, 2022 at 02:50:01PM +0000, nervoso wrote:
 >  the interesting part is that when I build the GENERIC kernel using
 >  build.sh .... kernel=GENERIC
 >  it builds the TOOLS  
 >  if I build the TOOLS on amd64, and then build the kernel  it breaks
 >  but if I build the TOOLS using i386 and then build the kernel, the
 >  system works

 You do all those steps with the same sources from -HEAD?
 This would result in two variants of gcc 10 (one built on an i386 host,
 one on an amd64 host), and you say these two compilers generate different
 code?

 Or do you mean something else?

 Martin
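
 One way to check whether the two tool builds really generate different
 code is to compare the same object file from both kernel builds. The
 following is only a minimal sketch, assuming a POSIX sh and cmp(1); the
 function name compare_builds and the paths in the usage note are
 hypothetical and not part of build.sh.

```shell
#!/bin/sh
# Sketch: report whether two build artifacts (for example the same kernel
# object file produced by two different tool builds) are byte-for-byte
# identical. The function name and output words are illustrative only.
compare_builds() {
    # Both arguments must be existing regular files.
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "missing"
        return 2
    fi
    # cmp -s prints nothing and exits 0 only when the files are identical.
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
        return 1
    fi
}
```

 A call might look like
 compare_builds obj.amd64-tools/ffs_alloc.o obj.i386-tools/ffs_alloc.o
 where both paths are placeholders. Comparing individual object files is
 probably more meaningful than comparing whole kernel images, since a
 kernel image usually embeds build information (such as the build date)
 that differs between builds anyway.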
