NetBSD Problem Report #53072
From www@NetBSD.org Sun Mar 4 14:39:14 2018
Return-Path: <www@NetBSD.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
by mollari.NetBSD.org (Postfix) with ESMTPS id 4F35D7A169
for <gnats-bugs@gnats.NetBSD.org>; Sun, 4 Mar 2018 14:39:14 +0000 (UTC)
Message-Id: <20180304143913.472EE7A266@mollari.NetBSD.org>
Date: Sun, 4 Mar 2018 14:39:13 +0000 (UTC)
From: rcbixler@nyx.net
Reply-To: rcbixler@nyx.net
To: gnats-bugs@NetBSD.org
Subject: netbsd-8 regression: startx (nv driver) crashes system
X-Send-Pr-Version: www-1.0
>Number: 53072
>Notify-List: bsiegert@NetBSD.org
>Category: kern
>Synopsis: netbsd-8 regression: startx (nv driver) crashes system
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: mrg
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sun Mar 04 14:40:01 +0000 2018
>Closed-Date: Tue Mar 06 09:40:18 +0000 2018
>Last-Modified: Sun Mar 11 18:10:00 +0000 2018
>Originator: Roy Bixler
>Release: netbsd-8
>Organization:
>Environment:
NetBSD laptop.bix.org 8.0_BETA NetBSD 8.0_BETA (GENERIC.201802271630Z) i386
>Description:
System is an old Dell laptop (Precision M70) with an NVidia graphics card. It doesn't work with the Nouveau DRM (problem reported in bug #50804), but it works with the old X server nv driver. When I upgraded to the kernel above, I found that "startx" crashes the system. It reboots and leaves a crash dump:
_KERNEL_OPT_NARCNET(0,104,c011e2a5,8,c0fff385,0,104,c0f73de5,dabefc5c,dabefc40) at 0
__kernel_end(104,0,c0f73de5,dabefc5c,c2b6ed40,6,dabefce4,dabefc50,c0947c9a,c0f73de5) at dabefc5c
vpanic(c0f73de5,dabefc5c,dabefcd8,c0120935,c0f73de5,dabefce4,dabefce4,1,dabed2c0,13246) at vpanic+0x131
snprintf(c0f73de5,dabefce4,dabefce4,1,dabed2c0,13246,8,0,0,0) at snprintf
trap_tss() at trap_tss
--- trap via task gate ---
_KERNEL_OPT_BEEP_ONHALT_COUNT+0x2:
The Xorg.0.log file from the crashed "startx" is:
[ 123.004] (**) |-->Input Device "Keyboard0"
[ 123.004] (==) Not automatically adding devices
[ 123.004] (==) Not automatically enabling devices
[ 123.004] (==) Not automatically adding GPU devices
[ 123.004] (==) Max clients allowed: 256, resource mask: 0x1fffff
[ 123.005] (**) FontPath set to:
/usr/X11R7/lib/X11/fonts/misc/,
/usr/X11R7/lib/X11/fonts/TTF/,
/usr/X11R7/lib/X11/fonts/Type1/,
/usr/X11R7/lib/X11/fonts/75dpi/,
/usr/X11R7/lib/X11/fonts/100dpi/,
/usr/X11R7/lib/X11/fonts/misc/,
/usr/X11R7/lib/X11/fonts/TTF/,
/usr/X11R7/lib/X11/fonts/Type1/,
/usr/X11R7/lib/X11/fonts/75dpi/,
/usr/X11R7/lib/X11/fonts/100dpi/
[ 123.005] (**) ModulePath set to "/usr/X11R7/lib/modules"
[ 123.005] Number of created screens does not match number of detected devices.
Configuration failed.
[ 123.006] (EE) Server terminated with error (2). Closing log fil
When I revert to the previous build:
NetBSD laptop.bix.org 8.0_BETA NetBSD 8.0_BETA (GENERIC.201802262040Z) i386
the X works and the system is usable again. I suspect that this change "[pullup-8 #593] please pullup pmap & pool(9) fixes for netbsd-8" is the issue.
>How-To-Repeat:
Boot up a NetBSD-8 build on or after 201802271630Z with nouveau driver disabled and run "startx" as a normal user.
>Fix:
Revert to a prior NetBSD-8 build, such as the previous one on 201802262040Z.
>Release-Note:
>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: Re: kern/53072: netbsd-8 regression: startx (nv driver) crashes
system
Date: Sun, 4 Mar 2018 17:31:03 +0100
On Sun, Mar 04, 2018 at 02:40:01PM +0000, rcbixler@nyx.net wrote:
> the X works and the system is usable again. I suspect that this
> change "[pullup-8 #593] please pullup pmap & pool(9) fixes for
> netbsd-8" is the issue.
This sounds a bit unlikely - would you be able to test a -8 kernel
just before that pullup happened?
Martin
From: rcbixler@nyx.net
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53072: netbsd-8 regression: startx (nv driver) crashes
system
Date: Sun, 4 Mar 2018 10:29:32 -0700
> The following reply was made to PR kern/53072; it has been noted by GNATS.
>
> From: Martin Husemann <martin@duskware.de>
> To: gnats-bugs@NetBSD.org
> Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
> netbsd-bugs@netbsd.org
> Subject: Re: kern/53072: netbsd-8 regression: startx (nv driver) crashes
> system
> Date: Sun, 4 Mar 2018 17:31:03 +0100
>
> On Sun, Mar 04, 2018 at 02:40:01PM +0000, rcbixler@nyx.net wrote:
> > the X works and the system is usable again. I suspect that this
> > change "[pullup-8 #593] please pullup pmap & pool(9) fixes for
> > netbsd-8" is the issue.
>
> This sounds a bit unlikely - would you be able to test a -8 kernel
> just before that pullup happened?
>
> Martin
I tried the netbsd-8 build from 201802262040Z and didn't have the
problem. I first encountered the problem with the netbsd-8 build
from 201802271630Z. I see only 2 commits between those 2 times,
the one I suspected:
To: source-changes%NetBSD.org@localhost
Subject: CVS commit: [netbsd-8] src/sys
From: "Martin Husemann" <martin%netbsd.org@localhost>
Date: Tue, 27 Feb 2018 09:07:33 +0000
and this one:
To: source-changes%NetBSD.org@localhost
Subject: CVS commit: [netbsd-8] src/doc
From: "Martin Husemann" <martin%netbsd.org@localhost>
Date: Tue, 27 Feb 2018 06:07:28 +0000
Module Name: src
Committed By: martin
Date: Tue Feb 27 06:07:28 UTC 2018
Modified Files:
src/doc [netbsd-8]: CHANGES-8.0
Log Message:
Amend ticket #587: additionally xform_esp.c r1.77 has been pulled up.
To generate a diff of this commit:
cvs rdiff -u -r1.1.2.132 -r1.1.2.133 src/doc/CHANGES-8.0
I considered the former commit more likely, as it looks pretty involved
and it affects memory allocation. Are you suggesting that I try
to build a netbsd-8 from the current tree with the suspect change reverted?
If not, what are you suggesting?
--
Roy Bixler <rcbixler@nyx.net>
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, rcbixler@nyx.net
Subject: Re: kern/53072: netbsd-8 regression: startx (nv driver) crashes
system
Date: Mon, 5 Mar 2018 08:47:21 +0100
On Sun, Mar 04, 2018 at 06:50:01PM +0000, rcbixler@nyx.net wrote:
> I considered the former commit more likely, as it looks pretty involved
> and it affects memory allocation. Are you suggesting that I try
> to build a netbsd-8 from the current tree with the suspect change reverted?
> If not, what are you suggesting?
Ok, that is strong evidence ;-)
Any chance you could try a -current kernel (just kernel, just for testing)?
Martin
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org, rcbixler@nyx.net
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: re: kern/53072: netbsd-8 regression: startx (nv driver) crashes system
Date: Mon, 05 Mar 2018 18:57:19 +1100
> _KERNEL_OPT_NARCNET(0,104,c011e2a5,8,c0fff385,0,104,c0f73de5,dabefc5c,dabefc40) at 0
> __kernel_end(104,0,c0f73de5,dabefc5c,c2b6ed40,6,dabefce4,dabefc50,c0947c9a,c0f73de5) at dabefc5c
> vpanic(c0f73de5,dabefc5c,dabefcd8,c0120935,c0f73de5,dabefce4,dabefce4,1,dabed2c0,13246) at vpanic+0x131
> snprintf(c0f73de5,dabefce4,dabefce4,1,dabed2c0,13246,8,0,0,0) at snprintf
> trap_tss() at trap_tss
> --- trap via task gate ---
> _KERNEL_OPT_BEEP_ONHALT_COUNT+0x2:
this is from crash(8)? can you try gdb, see if it can trace
through the trap and where it really is happening?
this probably is some teardown issue, as it seems that X tries
and then fails, and then we crash. can you try the vesa driver
for now, it should have reasonable performance until we
figure this problem out.
.mrg.
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org, rcbixler@nyx.net
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: re: kern/53072: netbsd-8 regression: startx (nv driver) crashes system
Date: Mon, 05 Mar 2018 18:59:15 +1100
other things you can try are to enable drm debug before
starting X, and seeing what it logs around the failure.
there are two sysctl's:
hw.drm2.drm_debug
hw.drm2.nouveau_debug
not sure about the latter, but i guess both are useful.
(the former is generic debug that i've used.)
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, rcbixler@nyx.net
Subject: Re: kern/53072: netbsd-8 regression: startx (nv driver) crashes
system
Date: Mon, 5 Mar 2018 09:06:54 +0100
FWIW: I tested a nouveau machine running -current and didn't run into any
issues there.
Martin
From: rcbixler@nyx.net
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: kern/53072: netbsd-8 regression: startx (nv driver) crashes
system
Date: Mon, 5 Mar 2018 05:36:06 -0700
> The following reply was made to PR kern/53072; it has been noted by GNATS.
>
> From: Martin Husemann <martin@duskware.de>
> To: gnats-bugs@NetBSD.org
> Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
> netbsd-bugs@netbsd.org, rcbixler@nyx.net
> Subject: Re: kern/53072: netbsd-8 regression: startx (nv driver) crashes
> system
> Date: Mon, 5 Mar 2018 08:47:21 +0100
>
> On Sun, Mar 04, 2018 at 06:50:01PM +0000, rcbixler@nyx.net wrote:
> > I considered the former commit more likely, as it looks pretty involved
> > and it affects memory allocation. Are you suggesting that I try
> > to build a netbsd-8 from the current tree with the suspect change reverted?
> > If not, what are you suggesting?
>
> Ok, that is strong evidence ;-)
>
> Any chance you could try a -current kernel (just kernel, just for
> testing)?
I have a -current installation on that machine, updated on 1 Mar, and
X works on it.
--
Roy Bixler <rcbixler@nyx.net>
From: Roy Bixler <rcbixler@nyx.net>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/53072: netbsd-8 regression: startx (nv driver) crashes
system
Date: Mon, 5 Mar 2018 07:55:19 -0700
On Mon, Mar 05, 2018 at 08:00:02AM +0000, matthew green wrote:
> The following reply was made to PR kern/53072; it has been noted by GNATS.
>
> From: matthew green <mrg@eterna.com.au>
> To: gnats-bugs@NetBSD.org, rcbixler@nyx.net
> Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
> netbsd-bugs@netbsd.org
> Subject: re: kern/53072: netbsd-8 regression: startx (nv driver) crashes system
> Date: Mon, 05 Mar 2018 18:57:19 +1100
>
> > _KERNEL_OPT_NARCNET(0,104,c011e2a5,8,c0fff385,0,104,c0f73de5,dabefc5c,dabefc40) at 0
> > __kernel_end(104,0,c0f73de5,dabefc5c,c2b6ed40,6,dabefce4,dabefc50,c0947c9a,c0f73de5) at dabefc5c
> > vpanic(c0f73de5,dabefc5c,dabefcd8,c0120935,c0f73de5,dabefce4,dabefce4,1,dabed2c0,13246) at vpanic+0x131
> > snprintf(c0f73de5,dabefce4,dabefce4,1,dabed2c0,13246,8,0,0,0) at snprintf
> > trap_tss() at trap_tss
> > --- trap via task gate ---
> > _KERNEL_OPT_BEEP_ONHALT_COUNT+0x2:
>
> this is from crash(8)? can you try gdb, see if it can trace
> through the trap and where it really is happening?
>
> this probably is some teardown issue, as it seems that X tries
> and then fails, and then we crash. can you try the vesa driver
> for now, it should have reasonable performance until we
> figure this problem out.
>
>
> .mrg.
>
I've run out of time for now, but I was able to confirm that a kernel
built from CVS as of 27 Feb. 2018 0900Z doesn't have the problem, but
a kernel built from CVS as of 27 Feb. 2018 1000Z does have the
problem. I also confirmed that I am unable to use X at all with the
latter. I tried the vesa driver and the wsfb driver with vesa mode
and those crash as well. I guess, if I want to use X, I'll be
restricted to using the 27 Feb. 2018 0900Z build or -current.
--
Roy Bixler <rcbixler@nyx.net>
"The fundamental principle of science, the definition almost, is this: the
sole test of the validity of any idea is experiment."
-- Richard P. Feynman
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, rcbixler@nyx.net
Subject: re: kern/53072: netbsd-8 regression: startx (nv driver) crashes system
Date: Tue, 06 Mar 2018 05:17:28 +1100
> I have a -current installation on that machine, updated on 1 Mar, and
> X works on it.
OK, so it's probably the case that i missed a particular change
in the pool/pmap pullup but i am having a lot of trouble figuring
out what it could be. i've made 2 more passes over the list of
updated pool callers and found nothing more.
did you get anywhere with gdb? it might be helpful to track
this down if we know what code path is failing.
thanks.
.mrg.
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org, rcbixler@nyx.net, bsiegert@netbsd.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: re: kern/53072: netbsd-8 regression: startx (nv driver) crashes system
Date: Tue, 06 Mar 2018 06:06:41 +1100
can you try this patch? it's x86/pmap.c 1.267 which was
missed in the pullup.
Benny, this might fix your new problem too. can you also
test it?
thanks.
.mrg.
Index: pmap.c
===================================================================
RCS file: /cvsroot/src/sys/arch/x86/x86/pmap.c,v
retrieving revision 1.245.6.2
diff -p -u -u -r1.245.6.2 pmap.c
--- pmap.c 27 Feb 2018 09:07:33 -0000 1.245.6.2
+++ pmap.c 5 Mar 2018 19:02:45 -0000
@@ -1737,8 +1737,8 @@ pmap_pp_needs_pve(struct pmap_page *pp)
 	 * since the first pv entry is stored in the pmap_page.
 	 */
 
-	return (pp->pp_flags & PP_EMBEDDED) != 0 ||
-	    !LIST_EMPTY(&pp->pp_head.pvh_list);
+	return pp && ((pp->pp_flags & PP_EMBEDDED) != 0 ||
+	    !LIST_EMPTY(&pp->pp_head.pvh_list));
 }
 
 /*
@@ -4123,7 +4123,7 @@ pmap_enter_ma(struct pmap *pmap, vaddr_t
 	 */
 
 	bool needpves = pmap_pp_needs_pve(new_pp);
-	if (new_pp && needpves) {
+	if (needpves) {
 		new_pve = pool_cache_get(&pmap_pv_cache, PR_NOWAIT);
 		new_sparepve = pool_cache_get(&pmap_pv_cache, PR_NOWAIT);
 	} else {
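
[Editorial note] Reading the diff above: before the patch, pmap_enter_ma() called pmap_pp_needs_pve(new_pp) first and only then tested new_pp, so a NULL new_pp would already have been dereferenced inside the helper. The patch moves the NULL test into the helper. The following is a minimal user-space sketch of that pattern, not the kernel code: struct pmap_page is reduced to two fields, the value of PP_EMBEDDED is assumed, and pp_has_pvlist, needs_pve() and needs_pve_selfcheck() are hypothetical names invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's struct pmap_page. Only what the
 * illustration needs is modelled; pp_has_pvlist replaces the real
 * !LIST_EMPTY(&pp->pp_head.pvh_list) test, and the flag value is assumed.
 */
#define PP_EMBEDDED 0x01

struct pmap_page {
	unsigned int pp_flags;
	bool pp_has_pvlist;
};

/* Patched shape: the NULL test runs before any field of pp is read. */
bool
needs_pve(const struct pmap_page *pp)
{
	return pp != NULL &&
	    ((pp->pp_flags & PP_EMBEDDED) != 0 || pp->pp_has_pvlist);
}

/* Exercises the three cases: no page, embedded entry, nothing tracked. */
void
needs_pve_selfcheck(void)
{
	struct pmap_page embedded = { PP_EMBEDDED, false };
	struct pmap_page untracked = { 0, false };

	assert(!needs_pve(NULL));
	assert(needs_pve(&embedded));
	assert(!needs_pve(&untracked));
}
```

With the check inside the helper, a caller can write "if (needs_pve(pp))" without guarding pp itself, which matches the simplified condition in the second hunk.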
From: Roy Bixler <rcbixler@nyx.net>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/53072: netbsd-8 regression: startx (nv driver) crashes
system
Date: Mon, 5 Mar 2018 17:23:19 -0700
On Mon, Mar 05, 2018 at 07:10:00PM +0000, matthew green wrote:
> The following reply was made to PR kern/53072; it has been noted by GNATS.
>
> From: matthew green <mrg@eterna.com.au>
> To: gnats-bugs@NetBSD.org, rcbixler@nyx.net, bsiegert@netbsd.org
> Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
> netbsd-bugs@netbsd.org
> Subject: re: kern/53072: netbsd-8 regression: startx (nv driver) crashes system
> Date: Tue, 06 Mar 2018 06:06:41 +1100
>
> can you try this patch? it's x86/pmap.c 1.267 which was
> missed in the pullup.
I applied the patch to the source tree pulled from netbsd-8 as of
2018-02-27 1000Z and X with the nv driver works again.
> Index: pmap.c
> ===================================================================
> RCS file: /cvsroot/src/sys/arch/x86/x86/pmap.c,v
> retrieving revision 1.245.6.2
> diff -p -u -u -r1.245.6.2 pmap.c
> --- pmap.c 27 Feb 2018 09:07:33 -0000 1.245.6.2
> +++ pmap.c 5 Mar 2018 19:02:45 -0000
> @@ -1737,8 +1737,8 @@ pmap_pp_needs_pve(struct pmap_page *pp)
>  	 * since the first pv entry is stored in the pmap_page.
>  	 */
>
> -	return (pp->pp_flags & PP_EMBEDDED) != 0 ||
> -	    !LIST_EMPTY(&pp->pp_head.pvh_list);
> +	return pp && ((pp->pp_flags & PP_EMBEDDED) != 0 ||
> +	    !LIST_EMPTY(&pp->pp_head.pvh_list));
>  }
>
>  /*
> @@ -4123,7 +4123,7 @@ pmap_enter_ma(struct pmap *pmap, vaddr_t
>  	 */
>
>  	bool needpves = pmap_pp_needs_pve(new_pp);
> -	if (new_pp && needpves) {
> +	if (needpves) {
>  		new_pve = pool_cache_get(&pmap_pv_cache, PR_NOWAIT);
>  		new_sparepve = pool_cache_get(&pmap_pv_cache, PR_NOWAIT);
>  	} else {
>
--
Roy Bixler <rcbixler@nyx.net>
"The fundamental principle of science, the definition almost, is this: the
sole test of the validity of any idea is experiment."
-- Richard P. Feynman
Responsible-Changed-From-To: kern-bug-people->mrg
Responsible-Changed-By: mrg@NetBSD.org
Responsible-Changed-When: Tue, 06 Mar 2018 09:40:18 +0000
Responsible-Changed-Why:
my pullup caused the problem.
State-Changed-From-To: open->closed
State-Changed-By: mrg@NetBSD.org
State-Changed-When: Tue, 06 Mar 2018 09:40:18 +0000
State-Changed-Why:
fix has been pulled up. thanks!
From: Benny Siegert <bsiegert@netbsd.org>
To: mrg@eterna.com.au
Cc: gnats-bugs@netbsd.org, rcbixler@nyx.net, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/53072: netbsd-8 regression: startx (nv driver) crashes system
Date: Sun, 11 Mar 2018 17:32:51 +0000
On Sun, Mar 11, 2018 at 5:33 PM Benny Siegert <bsiegert@netbsd.org> wrote:
> On Mon, Mar 5, 2018 at 8:06 PM matthew green <mrg@eterna.com.au> wrote:
> > Benny, this might fix your new problem too. can you also
> > test it?
> I downloaded a NetBSD-8 GENERIC kernel built on March 11, so that should
> include the pulled-up patch. I still get runaway memory allocation on login
> (from the login process). Because of that, I could not test whether startx
> succeeds.
I managed to gdb into the login process, and it was stuck below
__getlastlogx50. Removing /var/log/lastlogx fixed this problem!
Now I could actually run startx, which was crashing before on this
installation. It is now working perfectly. Thanks for fixing the problem.
--
Benny
From: Benny Siegert <bsiegert@netbsd.org>
To: mrg@eterna.com.au
Cc: gnats-bugs@netbsd.org, rcbixler@nyx.net, kern-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: kern/53072: netbsd-8 regression: startx (nv driver) crashes system
Date: Sun, 11 Mar 2018 16:33:18 +0000
On Mon, Mar 5, 2018 at 8:06 PM matthew green <mrg@eterna.com.au> wrote:
> Benny, this might fix your new problem too. can you also
> test it?
I downloaded a NetBSD-8 GENERIC kernel built on March 11, so that should
include the pulled-up patch. I still get runaway memory allocation on login
(from the login process). Because of that, I could not test whether startx
succeeds.
>Unformatted: