NetBSD Problem Report #37860
From abecher@kawo2.rwth-aachen.de Thu Jan 24 11:34:05 2008
Return-Path: <abecher@kawo2.rwth-aachen.de>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by narn.NetBSD.org (Postfix) with ESMTP id 7954063B8A2
for <gnats-bugs@gnats.netbsd.org>; Thu, 24 Jan 2008 11:34:05 +0000 (UTC)
Message-Id: <200801241134.m0OBY1rq001467@abc.kawo2.rwth-aachen.de>
Date: Thu, 24 Jan 2008 12:34:01 +0100 (CET)
From: Alexander Becher <abecher@kawo2.rwth-aachen.de>
Reply-To: abecher@kawo2.rwth-aachen.de
To: gnats-bugs@gnats.NetBSD.org
Subject: sort -n sorts 0 after 0.1
X-Send-Pr-Version: 3.95
>Number: 37860
>Category: bin
>Synopsis: sort -n sorts 0 after 0.1
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: bin-bug-people
>State: closed
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Thu Jan 24 11:35:00 +0000 2008
>Closed-Date: Sat Aug 22 11:18:51 +0000 2009
>Last-Modified: Wed Oct 14 20:45:36 +0000 2009
>Originator: Alexander Becher
>Release: NetBSD 3.1_STABLE
>Organization:
>Environment:
System: NetBSD abc 3.1_STABLE NetBSD 3.1_STABLE (abc) #1: Sat Sep 29 22:43:36 CEST 2007 alex@abc:/usr/obj/sys/arch/i386/compile/abc i386
Architecture: i386
Machine: i386
$ ident /usr/bin/sort
/usr/bin/sort:
$NetBSD: crt0.c,v 1.13 2003/07/26 19:24:27 salo Exp $
$NetBSD: append.c,v 1.13 2004/02/15 11:52:12 jdolecek Exp $
$NetBSD: fields.c,v 1.18 2004/03/14 21:12:14 heas Exp $
$NetBSD: files.c,v 1.23 2004/02/15 11:52:12 jdolecek Exp $
$NetBSD: fsort.c,v 1.30 2004/02/15 11:54:17 jdolecek Exp $
$NetBSD: init.c,v 1.16 2004/11/03 20:14:36 dsl Exp $
$NetBSD: msort.c,v 1.17 2004/02/17 19:09:36 jdolecek Exp $
$NetBSD: sort.c,v 1.41 2004/07/23 13:26:11 wiz Exp $
$NetBSD: tmp.c,v 1.11 2003/08/07 11:32:34 jdolecek Exp $
>Description:
Sorting lines that contain only numbers between 0 and 1 with the -n
option leads lines that contain only "0" to be sorted after lines that
start with "0.0". Obviously, 0.01 < 0 < 0.2 is wrong, but that's the
resulting sort order.
>How-To-Repeat:
$ echo -e "0.01\n0.4\n0.0\n0\n0.2" | /usr/bin/sort -n
0.0
0.01
0
0.2
0.4
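For comparison, the numerically correct order of the same five lines can be checked independently of sort(1). The following is a minimal Python sketch, not part of the original report; the names `lines`, `observed`, `expected` and `misplaced` are illustrative only.

```python
from decimal import Decimal

# Input lines from the How-To-Repeat command above.
lines = ["0.01", "0.4", "0.0", "0", "0.2"]

# Order printed by /usr/bin/sort -n in this report.
observed = ["0.0", "0.01", "0", "0.2", "0.4"]

# Stable sort by exact decimal value. "0.0" and "0" compare equal,
# so they keep their relative input order.
expected = sorted(lines, key=Decimal)

# Indexes at which the observed output steps backwards numerically.
misplaced = [
    i for i in range(1, len(observed))
    if Decimal(observed[i]) < Decimal(observed[i - 1])
]

print(expected)
print(misplaced)
```

The only backward step in the observed output is the line "0" following "0.01". A real sort(1) may order the two zero lines either way, since they are numerically equal.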
>Fix:
As a work-around, I used GNU sort.
>Release-Note:
>Audit-Trail:
From: Alan Barrett <apb@cequrux.com>
To: gnats-bugs@NetBSD.org
Cc: netbsd-bugs@NetBSD.org
Subject: Re: bin/37860: sort -n sorts 0 after 0.1
Date: Fri, 25 Jan 2008 11:20:49 +0200
On Thu, 24 Jan 2008, Alexander Becher wrote:
> Sorting lines that contain only numbers between 0 and 1 with the -n
> option leads lines that contain only "0" to be sorted after lines that
> start with "0.0". Obviously, 0.01 < 0 < 0.2 is wrong, but that's the
> resulting sort order.
Please try the patch in PR 30504. I really should commit that some
time.
--apb (Alan Barrett)
From: alex@cyathus.de (Alexander Becher)
To: gnats-bugs@NetBSD.org
Cc:
Subject: Re: bin/37860: sort -n sorts 0 after 0.1
Date: Tue, 29 Jan 2008 21:11:46 +0100
* Alan Barrett:
> On Thu, 24 Jan 2008, Alexander Becher wrote:
> > Sorting lines that contain only numbers between 0 and 1 with the -n
> > option leads lines that contain only "0" to be sorted after lines that
> > start with "0.0". Obviously, 0.01 < 0 < 0.2 is wrong, but that's the
> > resulting sort order.
>
> Please try the patch in PR 30504.
Works.
> I really should commit that some time.
You should. It would close many sort-related PRs. And prevent new ones.
Regards,
Alexander
--
PGP key available
From: David Laight <dsl@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/37860 CVS commit: src/usr.bin/sort
Date: Sat, 22 Aug 2009 10:53:29 +0000
Module Name: src
Committed By: dsl
Date: Sat Aug 22 10:53:28 UTC 2009
Modified Files:
src/usr.bin/sort: append.c fields.c files.c fsort.c init.c msort.c
sort.c sort.h
Log Message:
Rework the way sort generates sort keys:
- If we generate a key, it is always sortable using memcmp()
- If we are sorting the whole record, then a weight-table must be used
during compares.
- Major surgery to encoding of numbers to ensure unique keys for equal
numeric values. Reverse numerics are handled by inverting the sign.
- Case folding (-f) is handled when the sort keys are generated. No other
code has to care at all.
- Key uniqueness (-u) is done during merge for large datasets. It only
has to be done when writing the output file for small files.
Since the file is in key order this is simple!
Probably fixes all of: PR/27257 PR/25551 PR/22182 PR/31095 PR/30504
PR/36816 PR/37860 PR/39308
Also PR/18614 should no longer die, but a little more work needs to be
done on the merging for very large files.
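The idea of encoding numbers so that equal values share one key and keys compare correctly with memcmp() can be illustrated as follows. This is a hedged sketch under stated assumptions, not the encoding used in src/usr.bin/sort: the function name `numeric_key`, the marker bytes, and the one-byte biased exponent are all invented for illustration, and only plain decimal strings (optional sign, digits, optional fraction) are handled. Python compares `bytes` lexicographically, which stands in for memcmp() here.

```python
def numeric_key(text: str, reverse: bool = False) -> bytes:
    """Encode a plain decimal string as bytes that order like the number.

    Hypothetical layout (an assumption, not NetBSD's format):
      zero      -> b"\\x80"
      positive  -> b"\\xc0" + exponent byte + ASCII digits + b"\\x00"
      negative  -> b"\\x40" + the same body with every byte inverted
    """
    text = text.strip()
    negative = text.startswith("-")
    if text[:1] in "+-":
        text = text[1:]
    int_part, _, frac_part = text.partition(".")
    digits = int_part + frac_part
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("not a plain decimal number")

    # Normalise to 0.d1d2... * 10**exponent with d1 != 0, so that
    # "0", "0.0", "1.50" and "01.5" lose their spelling differences.
    significant = digits.lstrip("0")
    exponent = len(int_part) - (len(digits) - len(significant))
    significant = significant.rstrip("0")
    if not significant:
        return b"\x80"  # every spelling of zero gets the same key

    if reverse:
        negative = not negative  # reverse sort by inverting the sign
    if not 0 <= exponent + 128 <= 255:
        raise OverflowError("exponent outside this sketch's one-byte range")

    # The 0x00 terminator sorts below any digit, so 1 < 1.2; once the
    # body is inverted it becomes 0xff, so -1.2 < -1 as well.
    body = bytes([exponent + 128]) + significant.encode("ascii") + b"\x00"
    if negative:
        return b"\x40" + bytes(0xFF - b for b in body)
    return b"\xc0" + body


print(sorted(["0.01", "0.4", "0.0", "0", "0.2"], key=numeric_key))
```

With such keys the lines "0" and "0.0" from this PR receive identical keys and sort ahead of "0.01", and no comparison routine other than a byte-wise compare is needed.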
To generate a diff of this commit:
cvs rdiff -u -r1.19 -r1.20 src/usr.bin/sort/append.c src/usr.bin/sort/init.c
cvs rdiff -u -r1.24 -r1.25 src/usr.bin/sort/fields.c src/usr.bin/sort/sort.h
cvs rdiff -u -r1.34 -r1.35 src/usr.bin/sort/files.c
cvs rdiff -u -r1.38 -r1.39 src/usr.bin/sort/fsort.c
cvs rdiff -u -r1.22 -r1.23 src/usr.bin/sort/msort.c
cvs rdiff -u -r1.51 -r1.52 src/usr.bin/sort/sort.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
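The -f and -u points in the log message above can be sketched in the same spirit. This is an illustration under assumptions, not the committed code: `fold_key` and `write_unique` are hypothetical names, folding is ASCII-only, and an in-memory list stands in for the output file.

```python
def fold_key(record: bytes) -> bytes:
    """Fold case once, when the key is generated (ASCII only here)."""
    return record.upper()


def write_unique(records_in_key_order, key):
    """Keep the first record of each run of equal keys.

    Because the input is already in key order, duplicates are always
    adjacent, so comparing against the previous key is sufficient.
    """
    output = []
    previous = object()  # sentinel that never equals a real key
    for record in records_in_key_order:
        current = key(record)
        if current != previous:
            output.append(record)
            previous = current
    return output


records = [b"b", b"A", b"a", b"B"]
ordered = sorted(records, key=fold_key)  # the comparison itself knows nothing about -f
print(write_unique(ordered, fold_key))
```

Since folding happens inside the key function, neither the sort nor the uniqueness pass needs any case-specific logic.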
State-Changed-From-To: open->closed
State-Changed-By: dsl@NetBSD.org
State-Changed-When: Sat, 22 Aug 2009 11:18:51 +0000
State-Changed-Why:
fixed - see above
From: Stephen Borrill <sborrill@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/37860 CVS commit: [netbsd-5] src/usr.bin/sort
Date: Wed, 14 Oct 2009 20:41:53 +0000
Module Name: src
Committed By: sborrill
Date: Wed Oct 14 20:41:53 UTC 2009
Modified Files:
src/usr.bin/sort [netbsd-5]: Makefile append.c fields.c files.c fsort.c
fsort.h init.c msort.c sort.1 sort.c sort.h tmp.c
Added Files:
src/usr.bin/sort [netbsd-5]: radix_sort.c
Log Message:
Pull up the following revision(s) (requested by dsl in ticket #1084):
usr.bin/sort/Makefile: revision 1.6-1.8
usr.bin/sort/append.c: revision 1.15-1.22
usr.bin/sort/fields.c: revision 1.20-1.30
usr.bin/sort/files.c: revision 1.27-1.40
usr.bin/sort/fsort.c: revision 1.33-1.45
usr.bin/sort/fsort.h: revision 1.14-1.17
usr.bin/sort/init.c: revision 1.19-1.23
usr.bin/sort/msort.c: revision 1.19-1.28
usr.bin/sort/radix_sort.c: revision 1.1-1.4
usr.bin/sort/sort.1: revision 1.27-1.29
usr.bin/sort/sort.c: revision 1.47-1.56
usr.bin/sort/sort.h: revision 1.20-1.30
usr.bin/sort/tmp.c: revision 1.14-1.15
Only use radix sort for in-memory sort, always merge temporary files.
Use a local radixsort() function so we can pass record length.
Avoid use of weight tables for key compares.
Fix generation of keys for numbers, negate value for reverse sort.
Write file in reverse-key order for 'sort -n'.
'sort -S' now does a POSIX sort (sort matching keys by record data).
Ensure merge sort doesn't have too many temporary files open.
Fixes: PR#18614 PR#27257 PR#25551 PR#22182 PR#31095 PR#30504 PR#36816
PR#37860 PR#39308 PR#42094
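The point about not holding too many temporary files open during the merge can be illustrated with a multi-pass merge of bounded fan-in. This is a rough sketch, not the pulled-up implementation: `merge_runs` and `max_open` are hypothetical names, and in-memory lists stand in for sorted temporary files.

```python
import heapq


def merge_runs(runs, max_open=4, key=None):
    """Merge sorted runs while reading at most max_open runs at once."""
    if max_open < 2:
        raise ValueError("max_open must be at least 2")
    runs = [list(run) for run in runs]

    # While there are more runs than may be open together, merge them
    # in groups of max_open; each pass yields fewer, longer runs.
    while len(runs) > max_open:
        runs = [
            list(heapq.merge(*runs[i:i + max_open], key=key))
            for i in range(0, len(runs), max_open)
        ]

    # The remaining runs fit within the limit: one final merge.
    return list(heapq.merge(*runs, key=key))


runs = [[1, 9], [2, 8], [3, 7], [4, 6], [5]]
print(merge_runs(runs, max_open=2))
```

A smaller `max_open` trades extra passes over the data for fewer simultaneously open inputs.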
To generate a diff of this commit:
cvs rdiff -u -r1.5 -r1.5.40.1 src/usr.bin/sort/Makefile
cvs rdiff -u -r1.14 -r1.14.6.1 src/usr.bin/sort/append.c
cvs rdiff -u -r1.19 -r1.19.6.1 src/usr.bin/sort/fields.c \
src/usr.bin/sort/sort.h
cvs rdiff -u -r1.26 -r1.26.6.1 src/usr.bin/sort/files.c \
src/usr.bin/sort/sort.1
cvs rdiff -u -r1.32 -r1.32.6.1 src/usr.bin/sort/fsort.c
cvs rdiff -u -r1.13 -r1.13.6.1 src/usr.bin/sort/fsort.h \
src/usr.bin/sort/tmp.c
cvs rdiff -u -r1.18 -r1.18.6.1 src/usr.bin/sort/init.c \
src/usr.bin/sort/msort.c
cvs rdiff -u -r0 -r1.4.2.2 src/usr.bin/sort/radix_sort.c
cvs rdiff -u -r1.46 -r1.46.4.1 src/usr.bin/sort/sort.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
>Unformatted:
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.