NetBSD Problem Report #54962

From knakahara@netbsd.org  Fri Feb 14 11:52:01 2020
Return-Path: <knakahara@netbsd.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 0E6DF1A9213
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 14 Feb 2020 11:52:01 +0000 (UTC)
Message-Id: <20200214115200.1DEA21A9259@mollari.NetBSD.org>
Date: Fri, 14 Feb 2020 11:52:00 +0000 (UTC)
From: knakahara@netbsd.org
Reply-To: knakahara@netbsd.org
To: gnats-bugs@NetBSD.org
Subject: cannot boot NetBSD-current on VMware player 15 on Ryzen 5 3600X after sys/arch/x86/x86/cpu_topology.c:r1.18
X-Send-Pr-Version: 3.95

>Number:         54962
>Category:       port-amd64
>Synopsis:       cannot boot NetBSD-current on VMware player 15 on Ryzen 5 3600X after sys/arch/x86/x86/cpu_topology.c:r1.18
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    mlelstv
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Feb 14 11:55:00 +0000 2020
>Closed-Date:    
>Last-Modified:  Mon Mar 09 04:28:10 +0000 2020
>Originator:     Kengo NAKAHARA
>Release:        HEAD (after sys/arch/x86/x86/cpu_topology.c:r1.18)
>Organization:
	TNF
>Environment:
	HEAD on VMware player 15 on Ryzen 5 3600X
System: NetBSD mollari.NetBSD.org 8.1_STABLE NetBSD 8.1_STABLE (amd64-DOMU_SERVER) #12: Fri Jan 24 18:09:18 UTC 2020 spz@franklin.NetBSD.org:/home/netbsd/8/amd64/obj/sys/arch/amd64/compile/amd64-DOMU_SERVER amd64
Architecture: x86_64
Machine: amd64
>Description:
	After sys/arch/x86/x86/cpu_topology.c:r1.18, NetBSD-current cannot boot on VMware player 15 on a Ryzen 5 3600X.  The boot silently stops while the kernel is loading.
>How-To-Repeat:
	Just boot on VMware player 15 on Windows 10 on a machine whose CPU is a Ryzen 5 3600X.
>Fix:
With the following patch applied, the kernel can boot.  However, I don't know whether the patch is appropriate.

diff --git a/sys/arch/x86/x86/cpu_topology.c b/sys/arch/x86/x86/cpu_topology.c
index f73b7f65052..69002070829 100644
--- a/sys/arch/x86/x86/cpu_topology.c
+++ b/sys/arch/x86/x86/cpu_topology.c
@@ -192,7 +192,7 @@ x86_cpu_topology(struct cpu_info *ci)

                KASSERT(smt_bits == 0);
                smt_bits = ilog2(threads);
-               KASSERT(smt_bits <= core_bits);
+//             KASSERT(smt_bits <= core_bits);
                core_bits -= smt_bits;
        }
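
To illustrate what the disabled assertion guards, here is a minimal
userland sketch.  It is not the kernel code: ilog2_sketch() and
core_bits_after_smt() are hypothetical names, the use of int for the bit
counts is an assumption, and the values VMware actually reports on this
host are not known from this report.  It only shows that when smt_bits
exceeds core_bits, the subtraction no longer yields a meaningful core
width.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's ilog2(): floor(log2(v)) for v >= 1. */
static int
ilog2_sketch(unsigned int v)
{
	int n = 0;

	assert(v >= 1);
	while ((v >>= 1) != 0)
		n++;
	return n;
}

/*
 * Hypothetical helper: return the number of core bits left after the
 * SMT bits are removed, or -1 when smt_bits > core_bits, i.e. when the
 * invariant checked by the KASSERT in the patch would not hold.
 */
static int
core_bits_after_smt(int core_bits, unsigned int threads)
{
	int smt_bits = ilog2_sketch(threads);

	if (smt_bits > core_bits)
		return -1;
	return core_bits - smt_bits;
}
```

For example, core_bits_after_smt(3, 2) leaves 2 core bits, while
core_bits_after_smt(0, 2) reports the violated invariant.  Commenting out
the assertion lets the boot proceed but does not by itself explain why the
two values disagree.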

>Release-Note:

>Audit-Trail:

State-Changed-From-To: open->feedback
State-Changed-By: maxv@NetBSD.org
State-Changed-When: Sat, 15 Feb 2020 18:01:49 +0000
State-Changed-Why:
I noticed a bug in x86_cpu_topology() some time ago:

115 		/* Maximum number of LPs sharing a cache (ebx[23:16]). */
116 		x86_cpuid(1, descs);
117 		lp_max = __SHIFTOUT(descs[1], CPUID_HTT_CORES);

The Intel specification (SDM Volume 1) says:

	"The nearest power-of-2 integer that is not smaller than EBX[23:16] is the
	 number of unique initial APICIDs reserved for addressing different logical
	 processors in a physical package"

So here lp_max should be rounded up to the nearest power-of-2.
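
The rounding described above could be sketched as follows.  This is only
an illustration, not the change that was tested: roundup_pow2() is a
hypothetical helper name, and the sketch assumes the input is the 8-bit
value taken from EBX[23:16].

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of cpu_topology.c: return the smallest
 * power of 2 that is not smaller than v.  The input is assumed to be
 * the 8-bit EBX[23:16] field, so it is restricted to 1..255.
 */
static uint32_t
roundup_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v >= 1 && v <= 255);
	while (p < v)
		p <<= 1;
	return p;
}
```

With this helper, a reported value of 12 becomes 16, while a value that is
already a power of 2, such as 8, is returned unchanged.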


From: Kengo NAKAHARA <k-nakahara@iij.ad.jp>
To: gnats-bugs@netbsd.org, port-amd64-maintainer@netbsd.org,
        netbsd-bugs@netbsd.org, gnats-admin@netbsd.org, maxv@NetBSD.org,
        knakahara@netbsd.org
Cc: 
Subject: Re: port-amd64/54962 (cannot boot NetBSD-current on VMware player 15
 on Ryzen 5 3600X after sys/arch/x86/x86/cpu_topology.c:r1.18)
Date: Mon, 17 Feb 2020 19:27:49 +0900

 Hi,

 Thank you for your comment.

 On 2020/02/16 3:01, maxv@NetBSD.org wrote:
 > Synopsis: cannot boot NetBSD-current on VMware player 15 on Ryzen 5 3600X after sys/arch/x86/x86/cpu_topology.c:r1.18
 > 
 > State-Changed-From-To: open->feedback
 > State-Changed-By: maxv@NetBSD.org
 > State-Changed-When: Sat, 15 Feb 2020 18:01:49 +0000
 > State-Changed-Why:
 > I noticed a bug in x86_cpu_topology() some time ago:
 > 
 > 115 		/* Maximum number of LPs sharing a cache (ebx[23:16]). */
 > 116 		x86_cpuid(1, descs);
 > 117 		lp_max = __SHIFTOUT(descs[1], CPUID_HTT_CORES);
 > 
 > The Intel specification (SDM Volume 1) says:
 > 
 > 	"The nearest power-of-2 integer that is not smaller than EBX[23:16] is the
 > 	 number of unique initial APICIDs reserved for addressing different logical
 > 	 processors in a physical package"
 > 
 > So here lp_max should be rounded up to the nearest power-of-2.

 I tried the fix; however, the kernel still cannot boot in my environment.
 Hmm, there may be other bug(s)...


 Thanks,

 -- 
 //////////////////////////////////////////////////////////////////////
 Internet Initiative Japan Inc.

 Device Engineering Section,
 Product Development Department,
 Product Division,
 Technology Unit

 Kengo NAKAHARA <k-nakahara@iij.ad.jp>

Responsible-Changed-From-To: port-amd64-maintainer->mlelstv
Responsible-Changed-By: knakahara@NetBSD.org
Responsible-Changed-When: Mon, 09 Mar 2020 04:28:10 +0000
Responsible-Changed-Why:
cpu_topology.c:r1.18 was committed by mlelstv@n.o


State-Changed-From-To: feedback->open
State-Changed-By: knakahara@NetBSD.org
State-Changed-When: Mon, 09 Mar 2020 04:28:10 +0000
State-Changed-Why:
cpu_topology.c:r1.18 was committed by mlelstv@n.o


>Unformatted:
