NetBSD Problem Report #51569
From gson@gson.org Fri Oct 21 17:10:36 2016
Return-Path: <gson@gson.org>
Received: from mail.netbsd.org (mail.netbsd.org [199.233.217.200])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
by mollari.NetBSD.org (Postfix) with ESMTPS id 48A8A7A217
for <gnats-bugs@gnats.NetBSD.org>; Fri, 21 Oct 2016 17:10:36 +0000 (UTC)
Message-Id: <20161021171029.87CA27449D1@guava.gson.org>
Date: Fri, 21 Oct 2016 20:10:29 +0300 (EEST)
From: gson@gson.org (Andreas Gustafsson)
Reply-To: gson@gson.org (Andreas Gustafsson)
To: gnats-bugs@NetBSD.org
Subject: crypto tests randomly failing on amd64 with openssl 1.0.2j
X-Send-Pr-Version: 3.95
>Number: 51569
>Category: lib
>Synopsis: crypto tests randomly failing on amd64 with openssl 1.0.2j
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: lib-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Fri Oct 21 17:15:00 +0000 2016
>Closed-Date: Thu Jan 19 19:25:26 +0000 2017
>Last-Modified: Thu Jan 19 19:25:26 +0000 2017
>Originator: Andreas Gustafsson
>Release: NetBSD-current, source date >= 2016.10.14.20.34.29
>Organization:
>Environment:
System: NetBSD
Architecture: x86_64
Machine: amd64
>Description:
A number of tests in tests/crypto/libcrypto are randomly failing on
the amd64 testbed, in about half the runs. The failing test cases
are:
crypto/libcrypto/t_libcrypto:bn
crypto/libcrypto/t_pubkey:ec
crypto/libcrypto/t_pubkey:ecdh
crypto/libcrypto/t_pubkey:ecdsa
This appears to have started with the import of openssl 1.0.2j on
October 14.
The first logged failures are at:
http://releng.netbsd.org/b5reports/amd64/commits-2016.10.html#2016.10.14.20.34.29
>How-To-Repeat:
>Fix:
>Release-Note:
>Audit-Trail:
From: Andreas Gustafsson <gson@gson.org>
To: gnats-bugs@NetBSD.org
Cc: spz@NetBSD.org
Subject: Re: lib/51569: crypto tests randomly failing on amd64 with openssl 1.0.2j
Date: Wed, 28 Dec 2016 19:21:37 +0200
Running
cd /usr/tests/crypto/libcrypto
./t_libcrypto bn
on a physical machine yielded a core file and a backtrace:
(gdb) where
#0 0x00007d4b35f735ca in bn_GF2m_mul_2x2 () from /usr/lib/libcrypto.so.12
#1 0x00007d4b35f6ebb7 in BN_GF2m_mod_mul_arr () from /usr/lib/libcrypto.so.12
#2 0x00007d4b35f6f7ce in BN_GF2m_mod_mul () from /usr/lib/libcrypto.so.12
#3 0x0000000003407cb7 in test_gf2m_mod_mul ()
#4 0x0000000003409e99 in main ()
(gdb) x/i $pc
=> 0x7d4b35f735ca <bn_GF2m_mul_2x2+42>: pclmullqlqdq %xmm1,%xmm0
(gdb)
Wikipedia says pclmullqlqdq (I hope I will never have to pronounce
that) was introduced around 2010, but the CPU on this machine (an
Intel Core 2 Quad Q6600) is older than that, so it looks like there
is something wrong with the CPU feature detection.
--
Andreas Gustafsson, gson@gson.org
From: coypu@SDF.ORG
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/51569: crypto tests randomly failing on amd64 with openssl
1.0.2j
Date: Thu, 5 Jan 2017 13:44:48 +0000
--pWyiEgJYm5f9v55/
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Hi,
I can reproduce this on my pre-Westmere laptop.
The attached diff seems to work for me, and seems to use
the faster code for my newer desktop.
--pWyiEgJYm5f9v55/
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="pclmul.diff"
Index: x86_64/x86_64-gf2m.S
===================================================================
RCS file: /cvsroot/src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/x86_64-gf2m.S,v
retrieving revision 1.3
diff -u -p -u -r1.3 x86_64-gf2m.S
--- x86_64/x86_64-gf2m.S 14 Oct 2016 16:09:44 -0000 1.3
+++ x86_64/x86_64-gf2m.S 5 Jan 2017 13:40:39 -0000
@@ -202,8 +202,8 @@ _mul_1x1:
.type bn_GF2m_mul_2x2,@function
.align 16
bn_GF2m_mul_2x2:
- movq OPENSSL_ia32cap_P@GOTPCREL(%rip),%rax
- btq $33,%rax
+ mov OPENSSL_ia32cap_P+4(%rip),%eax
+ bt $1,%eax
jnc .Lvanilla_mul_2x2
.byte 102,72,15,110,198
--pWyiEgJYm5f9v55/--
From: coypu@SDF.ORG
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/51569
Date: Thu, 5 Jan 2017 14:28:47 +0000
should probably be $2?
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51569 CVS commit: src/crypto/external/bsd/openssl/dist/crypto/bn/asm
Date: Sat, 7 Jan 2017 20:55:49 -0500
Module Name: src
Committed By: christos
Date: Sun Jan 8 01:55:49 UTC 2017
Modified Files:
src/crypto/external/bsd/openssl/dist/crypto/bn/asm: x86_64-gf2m.pl
Log Message:
PR/51569: Andreas Gustafsson: Check the right bit for pclmulqdq:
Perform a Carry-Less Multiplication of Quadword instruction
(accelerator for GCM)
To generate a diff of this commit:
cvs rdiff -u -r1.4 -r1.5 \
src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gf2m.pl
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Andreas Gustafsson <gson@gson.org>
To: christos@NetBSD.org
Cc: gnats-bugs@netbsd.org, coypu@SDF.ORG, spz@NetBSD.org
Subject: Re: PR/51569 CVS commit: src/crypto/external/bsd/openssl/dist/crypto/bn/asm
Date: Mon, 9 Jan 2017 17:46:27 +0200
Christos,
You wrote:
> Log Message:
> PR/51569: Andreas Gustafsson: Check the right bit for pclmulqdq:
> Perform a Carry-Less Multiplication of Quadword instruction
> (accelerator for GCM)
>
> To generate a diff of this commit:
> cvs rdiff -u -r1.4 -r1.5 \
> src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gf2m.pl
I think the actual root cause of this bug was the line
-e 's/\(OPENSSL[A-Za-z0-9_]*\)(%rip)/\1@GOTPCREL(%rip)/' \
in src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/Makefile,
which you later removed in revision 1.11.
I think the change to x86_64-gf2m.pl worked simply because the added
"+4" caused the regexp not to match, so that the @GOTPCREL was not
added. Other than that, it seems to me the old and new code should
give the same result (i.e., test the same bit). Now that the
@GOTPCREL hack is gone from the Makefile, there should no longer be
any need for the change to x86_64-gf2m.pl, so I think it should be
reverted in the interest of keeping the differences wrt upstream
minimal.
I have a local change to revert the x86_64-gf2m.pl change - OK to
commit that and the corresponding regenerated .S file?
--
Andreas Gustafsson, gson@gson.org
From: christos@zoulas.com (Christos Zoulas)
To: Andreas Gustafsson <gson@gson.org>
Cc: gnats-bugs@netbsd.org, coypu@SDF.ORG, spz@NetBSD.org
Subject: Re: PR/51569 CVS commit: src/crypto/external/bsd/openssl/dist/crypto/bn/asm
Date: Mon, 9 Jan 2017 10:52:50 -0500
On Jan 9, 5:46pm, gson@gson.org (Andreas Gustafsson) wrote:
-- Subject: Re: PR/51569 CVS commit: src/crypto/external/bsd/openssl/dist/cry
| Christos,
|
| You wrote:
| > Log Message:
| > PR/51569: Andreas Gustafsson: Check the right bit for pclmulqdq:
| > Perform a Carry-Less Multiplication of Quadword instruction
| > (accelerator for GCM)
| >
| > To generate a diff of this commit:
| > cvs rdiff -u -r1.4 -r1.5 \
| > src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gf2m.pl
|
| I think the actual root cause of this bug was the line
|
| -e 's/\(OPENSSL[A-Za-z0-9_]*\)(%rip)/\1@GOTPCREL(%rip)/' \
|
| in src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/Makefile,
| which you later removed in revision 1.11.
|
| I think the change to x86_64-gf2m.pl worked simply because the added
| "+4" caused the regexp not to match, so that the @GOTPCREL was not
| added. Other than that, it seems to me the old and new code should
| give the same result (i.e., test the same bit). Now that the
| @GOTPCREL hack is gone from the Makefile, there should no longer be
| any need for the change to x86_64-gf2m.pl, so I think it should be
| reverted in the interest of keeping the differences wrt upstream
| minimal.
|
| I have a local change to revert the x86_64-gf2m.pl change - OK to
| commit that and the corresponding regenerated .S file?
If you can test it, go for it!
christos
From: "Andreas Gustafsson" <gson@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/51569 CVS commit: src/crypto/external/bsd/openssl/dist/crypto/bn/asm
Date: Mon, 9 Jan 2017 17:36:24 +0000
Module Name: src
Committed By: gson
Date: Mon Jan 9 17:36:24 UTC 2017
Modified Files:
src/crypto/external/bsd/openssl/dist/crypto/bn/asm: x86_64-gf2m.pl
Log Message:
Revert last two revisions; local changes should no longer be needed
now that the root cause of PR lib/51569 is fixed by revision 1.11 of
src/crypto/external/bsd/openssl/lib/libcrypto/arch/x86_64/Makefile.
To generate a diff of this commit:
cvs rdiff -u -r1.6 -r1.7 \
src/crypto/external/bsd/openssl/dist/crypto/bn/asm/x86_64-gf2m.pl
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: Joerg Sonnenberger <joerg@bec.de>
To: gnats-bugs@NetBSD.org
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, Andreas Gustafsson <gson@gson.org>
Subject: Re: PR/51569 CVS commit:
src/crypto/external/bsd/openssl/dist/crypto/bn/asm
Date: Mon, 9 Jan 2017 21:03:48 +0100
On Mon, Jan 09, 2017 at 03:50:01PM +0000, Andreas Gustafsson wrote:
> I think the change to x86_64-gf2m.pl worked simply because the added
> "+4" caused the regexp not to match, so that the @GOTPCREL was not
> added. Other than that, it seems to me the old and new code should
> give the same result (i.e., test the same bit). Now that the
> @GOTPCREL hack is gone from the Makefile, there should no longer be
> any need for the change to x86_64-gf2m.pl, so I think it should be
> reverted in the interest of keeping the differences wrt upstream
> minimal.
The original version penalizes both in terms of code size and
unnecessary unaligned access. Don't just revert to that crap.
Joerg
From: Andreas Gustafsson <gson@gson.org>
To: Joerg Sonnenberger <joerg@bec.de>
Cc: gnats-bugs@NetBSD.org
Subject: Re: PR/51569 CVS commit: src/crypto/external/bsd/openssl/dist/crypto/bn/asm
Date: Mon, 9 Jan 2017 22:39:39 +0200
Joerg Sonnenberger wrote:
> The original version penalizes both in terms of code size and
> unnecessary unaligned access. Don't just revert to that crap.
I already reverted it, with christos' blessing. If you don't like
the openssl code, I suggest you submit your improvements to the
openssl project rather than make NetBSD diverge.
--
Andreas Gustafsson, gson@gson.org
From: Joerg Sonnenberger <joerg@bec.de>
To: gnats-bugs@NetBSD.org
Cc: lib-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org, Andreas Gustafsson <gson@gson.org>
Subject: Re: PR/51569 CVS commit:
src/crypto/external/bsd/openssl/dist/crypto/bn/asm
Date: Tue, 10 Jan 2017 02:03:37 +0100
On Mon, Jan 09, 2017 at 08:40:01PM +0000, Andreas Gustafsson wrote:
> The following reply was made to PR lib/51569; it has been noted by GNATS.
>
> From: Andreas Gustafsson <gson@gson.org>
> To: Joerg Sonnenberger <joerg@bec.de>
> Cc: gnats-bugs@NetBSD.org
> Subject: Re: PR/51569 CVS commit: src/crypto/external/bsd/openssl/dist/crypto/bn/asm
> Date: Mon, 9 Jan 2017 22:39:39 +0200
>
> Joerg Sonnenberger wrote:
> > The original version penalizes both in terms of code size and
> > unnecessary unaligned access. Don't just revert to that crap.
>
> I already reverted it, with christos' blessing. If you don't like
> the openssl code, I suggest you submit your improvements to the
> openssl project rather than make NetBSD diverge.
Ah yes, let's ignore any on-going discussion by trigger happy commits.
Let's just make code worse, because "upstream" doesn't care about code
quality. I really like your attitude...
Joerg
From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, lib-bug-people@netbsd.org,
gnats-admin@netbsd.org, netbsd-bugs@netbsd.org,
gson@gson.org (Andreas Gustafsson)
Cc:
Subject: Re: PR/51569 CVS commit: src/crypto/external/bsd/openssl/dist/crypto/bn/asm
Date: Mon, 9 Jan 2017 20:32:17 -0500
On Jan 10, 1:05am, joerg@bec.de (Joerg Sonnenberger) wrote:
-- Subject: Re: PR/51569 CVS commit: src/crypto/external/bsd/openssl/dist/cry
| Ah yes, let's ignore any on-going discussion by trigger happy commits.
| Let's just make code worse, because "upstream" doesn't care about code
| quality. I really like your attitude...
I will push it upstream, no reason to fight :-)
christos
From: coypu@SDF.ORG
To: gnats-bugs@netbsd.org
Cc:
Subject: Re: lib/51569: crypto tests randomly failing on amd64 with openssl
Date: Tue, 10 Jan 2017 12:56:11 +0000
I have an untested diff for this against upstream openssl here:
https://github.com/openssl/openssl/compare/master...coypoop:master
I'm not sure if it will be welcome.
State-Changed-From-To: open->closed
State-Changed-By: gson@NetBSD.org
State-Changed-When: Thu, 19 Jan 2017 19:25:26 +0000
State-Changed-Why:
The bug is fixed. Discussion about OpenSSL code changes not
needed to fix this particular bug belongs elsewhere.
>Unformatted:
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2014
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.