NetBSD Problem Report #55573

From gson@gson.org  Fri Aug 14 08:10:37 2020
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.NetBSD.org", Issuer "mail.NetBSD.org CA" (not verified))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 990161A9239
	for <gnats-bugs@gnats.NetBSD.org>; Fri, 14 Aug 2020 08:10:37 +0000 (UTC)
Message-Id: <20200814081033.D4A22253EDE@guava.gson.org>
Date: Fri, 14 Aug 2020 11:10:33 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: sparc testbed logs many "coprocessor instruction" messages
X-Send-Pr-Version: 3.95

>Number:         55573
>Category:       port-sparc
>Synopsis:       sparc testbed logs many "coprocessor instruction" messages
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    port-sparc-maintainer
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Fri Aug 14 08:15:00 +0000 2020
>Closed-Date:    Thu Aug 20 14:30:30 +0000 2020
>Last-Modified:  Wed Mar 31 13:55:01 +0000 2021
>Originator:     Andreas Gustafsson
>Release:        NetBSD-current
>Organization:

>Environment:
System: NetBSD
Architecture: sparc
Machine: sparc
>Description:

The qemu-based TNF sparc testbed has started logging large numbers of
kernel messages saying "coprocessor instruction", like the following:

  Starting syslogd.
  [  15.0167285] coprocessor instruction
  [  15.0167285] coprocessor instruction
  [  15.0167285] coprocessor instruction
  [  15.0167285] coprocessor instruction
  Mounting all file systems...

This is from:

  http://releng.netbsd.org/b5reports/sparc/2020/2020.06.22.03.16.29/test.log

There seem to be no ill effects other than cluttering the console
output and causing confusion.

The problem started during the period of build breakage between source
dates 2020.06.20.02.27.55 and 2020.06.22.03.16.29.  The only
sparc-specific commits during the period in question were:

  2020.06.21.22.16.08 christos src/crypto/external/bsd/openssl/lib/libcrypto/arch/sparc/bn.inc 1.2
  2020.06.21.22.16.08 christos src/crypto/external/bsd/openssl/lib/libcrypto/arch/sparc/crypto.inc 1.13
  2020.06.21.22.16.08 christos src/crypto/external/bsd/openssl/lib/libcrypto/arch/sparc/modes.inc 1.4

Is openssl perhaps trying to use some kind of crypto coprocessor that
is not supported by qemu, and successfully falling back to another
method?  If that's the case, perhaps the message should simply be
disabled.

>How-To-Repeat:

>Fix:

>Release-Note:

>Audit-Trail:
From: Martin Husemann <martin@duskware.de>
To: gnats-bugs@netbsd.org
Cc: 
Subject: Re: port-sparc/55573: sparc testbed logs many "coprocessor
 instruction" messages
Date: Fri, 14 Aug 2020 10:30:23 +0200

 On Fri, Aug 14, 2020 at 08:15:01AM +0000, Andreas Gustafsson wrote:
 >   [  15.0167285] coprocessor instruction

 This is a FPU instruction with FPU disabled (trap T_CPDISABLED),
 sending a SIGILL to the process.

 Wild guess: the in-kernel FPU changes modified the FPU disable method
 so we now get this instead of T_FPDISABLED (which would cause us to do
 proper FPU fixup).

 Martin

From: Andreas Gustafsson <gson@gson.org>
To: Martin Husemann <martin@duskware.de>
Cc: gnats-bugs@netbsd.org
Subject: Re: port-sparc/55573: sparc testbed logs many "coprocessor
 instruction" messages
Date: Fri, 14 Aug 2020 11:55:33 +0300

 Martin Husemann wrote:
 >  This is a FPU instruction with FPU disabled (trap T_CPDISABLED),
 >  sending a SIGILL to the process.

 A number of ATF test cases also trigger the message, yet they pass;
 if they are getting a SIGILL, why don't they fail?
 -- 
 Andreas Gustafsson, gson@gson.org

From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@netbsd.org
Cc: port-sparc-maintainer@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: re: port-sparc/55573: sparc testbed logs many "coprocessor instruction" messages
Date: Fri, 14 Aug 2020 19:08:17 +1000

 > The qemu-based TNF sparc testbed has started logging large numbers of
 > kernel messages saying "coprocessor instruction", like the following:
 >
 >   Starting syslogd.
 >   [  15.0167285] coprocessor instruction
 >   [  15.0167285] coprocessor instruction
 >   [  15.0167285] coprocessor instruction
 >   [  15.0167285] coprocessor instruction
 >   Mounting all file systems...
 >
 > This is from:
 >
 >   http://releng.netbsd.org/b5reports/sparc/2020/2020.06.22.03.16.29/test.log
 >
 > There seem to be no ill effects other than cluttering the console
 > output and causing confusion.
 >
 > The problem started during the period of build breakage between source
 > dates 2020.06.20.02.27.55 and 2020.06.22.03.16.29.  The only
 > sparc-specific commits during the period in question were:
 >
 >   2020.06.21.22.16.08 christos src/crypto/external/bsd/openssl/lib/libcrypto/arch/sparc/bn.inc 1.2
 >   2020.06.21.22.16.08 christos src/crypto/external/bsd/openssl/lib/libcrypto/arch/sparc/crypto.inc 1.13
 >   2020.06.21.22.16.08 christos src/crypto/external/bsd/openssl/lib/libcrypto/arch/sparc/modes.inc 1.4
 >
 > Is openssl perhaps trying to use some kind of crypto coprocessor that
 > is not supported by qemu, and successfully falling back to another
 > method?  If that's the case, perhaps the message should simply be
 > disabled.

 this is odd.  i had a look, there doesn't seem to be
 any co-pro functions called here.  (i actually did
 not recall they existed, and had to read the manual.)

 i'll see about trying to reproduce it, expand the
 message to be more verbose about what proc/%pc etc.


 .mrg.

From: Martin Husemann <martin@duskware.de>
To: Andreas Gustafsson <gson@gson.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: port-sparc/55573: sparc testbed logs many "coprocessor
 instruction" messages
Date: Fri, 14 Aug 2020 11:15:04 +0200

 On Fri, Aug 14, 2020 at 11:55:33AM +0300, Andreas Gustafsson wrote:
 > Martin Husemann wrote:
 > >  This is a FPU instruction with FPU disabled (trap T_CPDISABLED),
 > >  sending a SIGILL to the process.
 > 
 > A number of ATF test cases also trigger the message, yet they pass;
 > if they are getting a SIGILL, why don't they fail?

 Yes, you are right (had to look up the traps, they are different).
 Let's identify the instruction causing it and remove the kernel printf.

 Martin

From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@netbsd.org, martin@netbsd.org
Cc: port-sparc-maintainer@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org, gson@gson.org (Andreas Gustafsson)
Subject: re: port-sparc/55573: sparc testbed logs many "coprocessor instruction" messages
Date: Fri, 14 Aug 2020 19:20:27 +1000

 Martin Husemann writes:
 > The following reply was made to PR port-sparc/55573; it has been noted by GNATS.
 >
 > From: Martin Husemann <martin@duskware.de>
 > To: gnats-bugs@netbsd.org
 > Cc:
 > Subject: Re: port-sparc/55573: sparc testbed logs many "coprocessor
 >  instruction" messages
 > Date: Fri, 14 Aug 2020 10:30:23 +0200
 >
 >  On Fri, Aug 14, 2020 at 08:15:01AM +0000, Andreas Gustafsson wrote:
 >  >   [  15.0167285] coprocessor instruction
 >
 >  This is a FPU instruction with FPU disabled (trap T_CPDISABLED),
 >  sending a SIGILL to the process.
 >
 >  Wild guess: the in-kernel FPU changes modified the FPU disable method
 >  so we now get this instead of T_FPDISABLED (which would cause us to do
 >  proper FPU fixup).

 i think you're confused.  i had to look up sparc copro,
 thinking similar:

 #define T_FPDISABLED    0x04    /* (5) fp instr while fp disabled */
 #define T_CPDISABLED    0x24    /* (5) coprocessor instr while disabled */

 are different.  the latter happens for stuff that access
 the "%cN" registers, but i can't find code that does that
 in openssl...

 search for "cp_disabled" in the v8 manual for details.


 .mrg.
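
 To make the distinction between the two traps concrete, here is a
 minimal sketch in C.  The two trap numbers are the ones quoted above;
 the enum and the classify_trap() helper are hypothetical names made up
 for illustration and are not the code in trap.c.  The actions follow
 what the thread says: T_FPDISABLED leads to an FPU fixup, T_CPDISABLED
 to a SIGILL.

```c
/* Trap numbers as quoted in the message above. */
#define T_FPDISABLED    0x04    /* fp instr while fp disabled */
#define T_CPDISABLED    0x24    /* coprocessor instr while disabled */

/* Hypothetical action codes, for illustration only. */
enum trap_action {
	ACT_FPU_FIXUP,	/* enable/restore the FPU and retry */
	ACT_SIGILL,	/* deliver SIGILL to the process */
	ACT_OTHER	/* anything else, not modelled here */
};

/* Hypothetical helper: map a trap type to the action described in the thread. */
enum trap_action
classify_trap(int type)
{
	switch (type) {
	case T_FPDISABLED:
		return ACT_FPU_FIXUP;
	case T_CPDISABLED:
		return ACT_SIGILL;
	default:
		return ACT_OTHER;
	}
}
```

 With these definitions, classify_trap(T_CPDISABLED) yields ACT_SIGILL,
 which is the path that printed the message.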

From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@netbsd.org, martin@duskware.de
Cc: port-sparc-maintainer@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org, gson@gson.org (Andreas Gustafsson)
Subject: re: port-sparc/55573: sparc testbed logs many "coprocessor instruction" messages
Date: Fri, 14 Aug 2020 19:58:04 +1000

 >  Martin Husemann wrote:
 >  >  This is a FPU instruction with FPU disabled (trap T_CPDISABLED),
 >  >  sending a SIGILL to the process.
 >  
 >  A number of ATF test cases also trigger the message, yet they pass;
 >  if they are getting a SIGILL, why don't they fail?

 i suspect that this is sparcv9 code that has the same instruction
 encoding as sparcv8 co-pro (the conditional branches do, and so
 do many of the VIS instructions, and a few others.)

 if i'm right, we should just disable the message.


 .mrg.
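
 The encoding overlap can be sketched as a small decoder.  This
 assumes the SPARC V8 format-3 layout (op in bits 31:30, op3 in bits
 24:19) with op3 0x36 for CPop1 and 0x37 for CPop2; V9 reassigns those
 same op3 values to IMPDEP1 and IMPDEP2.  The function name and the
 sample instruction words are made up for the example and do not come
 from this report.

```c
#include <stdint.h>

/* SPARC format-3 fields: op is bits 31:30, op3 is bits 24:19. */
#define SPARC_OP(insn)	(((insn) >> 30) & 0x3u)
#define SPARC_OP3(insn)	(((insn) >> 19) & 0x3fu)

#define OP3_CPOP1	0x36u	/* V8 CPop1; V9 reuses this slot as IMPDEP1 */
#define OP3_CPOP2	0x37u	/* V8 CPop2; V9 reuses this slot as IMPDEP2 */

/*
 * Hypothetical helper: return 1 if a V8 CPU would decode the word as
 * cpop1, 2 if as cpop2, and 0 otherwise.
 */
int
sparc_v8_cpop_class(uint32_t insn)
{
	if (SPARC_OP(insn) != 2u)
		return 0;
	switch (SPARC_OP3(insn)) {
	case OP3_CPOP1:
		return 1;
	case OP3_CPOP2:
		return 2;
	default:
		return 0;
	}
}
```

 For instance, sparc_v8_cpop_class(0x81b00d80u) returns 1: the word has
 op = 2 and op3 = 0x36, so a V8 CPU with the coprocessor disabled would
 take the cp_disabled trap on it.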

From: Martin Husemann <martin@duskware.de>
To: Andreas Gustafsson <gson@gson.org>
Cc: gnats-bugs@netbsd.org
Subject: Re: port-sparc/55573: sparc testbed logs many "coprocessor
 instruction" messages
Date: Fri, 14 Aug 2020 15:43:12 +0200

 Matthew is right:

 Program received signal SIGILL, Illegal instruction.
 0xedc9aacc in _sparcv9_vis1_probe () from /usr/lib/libcrypto.so.14
 (gdb) c
 Continuing.
 [ 10891.4871409] coprocessor instruction

 Program received signal SIGILL, Illegal instruction.
 0xedc9ab58 in _sparcv9_fmadd_probe () from /usr/lib/libcrypto.so.14
 (gdb) x/i $pc
 => 0xedc9ab58 <_sparcv9_fmadd_probe>:       cpop1  [ %g0 + %g0 ], %g0

 so the fmadd (v9 VIS instruction) overlays the v8 cpop1 instructions and
 triggers this trap.

 I'll remove the kernel printf.

 Martin
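
 This also suggests why the ATF tests pass despite the SIGILL: a probe
 routine can catch the signal and treat it as "feature absent".  The
 following is a minimal, portable sketch of that pattern using
 sigsetjmp()/siglongjmp().  It is an assumption about how such a probe
 may be structured, not a copy of libcrypto; all function names are
 hypothetical, and raise(SIGILL) stands in for an unsupported
 instruction.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

/* SIGILL handler: abandon the probe and jump back to probe_insn(). */
static void
probe_sigill(int sig)
{
	(void)sig;
	siglongjmp(probe_env, 1);
}

/* Stand-in for an instruction the CPU does not implement. */
void
fake_unsupported_insn(void)
{
	raise(SIGILL);
}

/* Stand-in for an instruction the CPU does implement. */
void
fake_supported_insn(void)
{
}

/* Returns 1 if probe() ran to completion, 0 if it raised SIGILL. */
int
probe_insn(void (*probe)(void))
{
	struct sigaction sa, old;
	volatile int ok = 0;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = probe_sigill;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGILL, &sa, &old) != 0)
		return 0;

	/* savemask=1 so the signal mask is restored after the longjmp */
	if (sigsetjmp(probe_env, 1) == 0) {
		probe();
		ok = 1;
	}

	sigaction(SIGILL, &old, NULL);	/* restore the previous handler */
	return ok;
}
```

 A caller would run probe_insn(fake_unsupported_insn) once at startup,
 get 0, and fall back to a generic code path; the process never sees a
 fatal signal, but the kernel still takes the trap each time.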

From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55573 CVS commit: src/sys/arch/sparc/sparc
Date: Fri, 14 Aug 2020 13:45:44 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Fri Aug 14 13:45:44 UTC 2020

 Modified Files:
 	src/sys/arch/sparc/sparc: trap.c

 Log Message:
 PR port-sparc/55573: remove kernel message about disabled coprocessor
 instructions - it is triggered by userland trying to detect availability
 of sparcv9 VIS instructions.


 To generate a diff of this commit:
 cvs rdiff -u -r1.198 -r1.199 src/sys/arch/sparc/sparc/trap.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Thu, 20 Aug 2020 14:30:30 +0000
State-Changed-Why:
The messages are gone.


From: "Martin Husemann" <martin@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/55573 CVS commit: [netbsd-9] src/sys/arch/sparc/sparc
Date: Wed, 31 Mar 2021 13:51:05 +0000

 Module Name:	src
 Committed By:	martin
 Date:		Wed Mar 31 13:51:05 UTC 2021

 Modified Files:
 	src/sys/arch/sparc/sparc [netbsd-9]: trap.c

 Log Message:
 Pull up following revision(s) (requested by christos in ticket #1240):

 	sys/arch/sparc/sparc/trap.c: revision 1.199

 PR port-sparc/55573: remove kernel message about disabled coprocessor
 instructions - it is triggered by userland trying to detect availability
 of sparcv9 VIS instructions.


 To generate a diff of this commit:
 cvs rdiff -u -r1.198 -r1.198.4.1 src/sys/arch/sparc/sparc/trap.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

>Unformatted:
