NetBSD Problem Report #50441

From Manuel.Bouyer@lip6.fr  Tue Nov 17 16:30:59 2015
Return-Path: <Manuel.Bouyer@lip6.fr>
Received: from mail.netbsd.org (mail.netbsd.org [149.20.53.66])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(Client CN "mail.netbsd.org", Issuer "Postmaster NetBSD.org" (verified OK))
	by mollari.NetBSD.org (Postfix) with ESMTPS id 67409A6552
	for <gnats-bugs@gnats.NetBSD.org>; Tue, 17 Nov 2015 16:30:59 +0000 (UTC)
Message-Id: <20151117163054.E8DA9A8AA@armandeche.soc.lip6.fr>
Date: Tue, 17 Nov 2015 17:30:54 +0100 (MET)
From: Manuel.Bouyer@lip6.fr
Reply-To: Manuel.Bouyer@lip6.fr
To: gnats-bugs@NetBSD.org
Subject: db(3) doesn't work with 64k pagesize 
X-Send-Pr-Version: 3.95

>Number:         50441
>Category:       lib
>Synopsis:       db(3) doesn't work with 64k pagesize
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    lib-bug-people
>State:          closed
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Tue Nov 17 16:35:00 +0000 2015
>Closed-Date:    Thu Aug 17 18:52:45 +0000 2017
>Last-Modified:  Thu Aug 17 18:52:45 +0000 2017
>Originator:     Manuel Bouyer
>Release:        NetBSD 7.0
>Organization:
>Environment:
System: NetBSD armandeche.soc.lip6.fr 7.0 NetBSD 7.0 (GENERIC)
Architecture: x86_64
Machine: amd64
>Description:
	While tracking down a problem with spamassassin and its bayes db,
	I found that db(3) doesn't work properly with a 64k pagesize.
	My spamassassin happens to store its bayes db on a
	64k-blocksize ffs, which triggers the problem. Switching to a 32k ffs
	works around the issue.

	This small test script can be used to reproduce the issue:
#!/bin/sh
pagesize=$1
for i in `cat /usr/share/dict/american`; do
    db -w -P $pagesize hash test.db $i $i
done
exit 0

	And here's how to reproduce the issue:
cuba:/home/bouyer>rm test.db
cuba:/home/bouyer>./test.sh 4096
[...]
cuba:/home/bouyer>ls -l test.db 
-rw-r--r--  1 bouyer  wheel  24576 Nov 17 16:01 test.db
cuba:/home/bouyer>db hash test.db |wc
     353     706    6634
cuba:/home/bouyer>rm test.db
cuba:/home/bouyer>./test.sh 32768
[...]
cuba:/home/bouyer>ls -l test.db
-rw-r--r--  1 bouyer  wheel  131072 Nov 17 16:06 test.db
cuba:/home/bouyer>db hash test.db | wc
     353     706    6634
cuba:/home/bouyer>rm test.db
cuba:/home/bouyer>./test.sh 65536
[...]
cuba:/home/bouyer>ls -l test.db 
-rw-r--r--  1 bouyer  wheel  80019456 Nov 17 16:09 test.db
cuba:/home/bouyer>db hash test.db | wc
     219     438    4166

	Notice that with a 64k pagesize, the db is much, much bigger and there
	are missing keys. The block size of the underlying filesystem doesn't
	matter for this test. It matters for my spamassassin usage because, I guess,
	the db pagesize defaults to the filesystem blocksize.
	The same problem happens on netbsd-7/i386 (i.e. it's not an LP64 issue).
	On netbsd-6 the database created is as big (80019456 bytes) but
	there are no missing keys.

>How-To-Repeat:
	see above
>Fix:
	none yet.

>Release-Note:

>Audit-Trail:
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/50441 CVS commit: src/lib/libc/db/hash
Date: Tue, 17 Nov 2015 15:19:55 -0500

 Module Name:	src
 Committed By:	christos
 Date:		Tue Nov 17 20:19:55 UTC 2015

 Modified Files:
 	src/lib/libc/db/hash: hash.c

 Log Message:
 PR/50441: Manuel Bouyer: hash seq enumeration skips keys on big data.
 XXX: pullup-7


 To generate a diff of this commit:
 cvs rdiff -u -r1.35 -r1.36 src/lib/libc/db/hash/hash.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: gnats-bugs@netbsd.org
Cc: netbsd-bugs@netbsd.org, christos@netbsd.org
Subject: Re: PR/50441 CVS commit: src/lib/libc/db/hash
Date: Wed, 18 Nov 2015 00:07:27 +0100

 On Tue, Nov 17, 2015 at 08:20:00PM +0000, Christos Zoulas wrote:
 > The following reply was made to PR lib/50441; it has been noted by GNATS.
 > 
 > From: "Christos Zoulas" <christos@netbsd.org>
 > To: gnats-bugs@gnats.NetBSD.org
 > Cc: 
 > Subject: PR/50441 CVS commit: src/lib/libc/db/hash
 > Date: Tue, 17 Nov 2015 15:19:55 -0500
 > 
 >  Module Name:	src
 >  Committed By:	christos
 >  Date:		Tue Nov 17 20:19:55 UTC 2015
 >  
 >  Modified Files:
 >  	src/lib/libc/db/hash: hash.c
 >  
 >  Log Message:
 >  PR/50441: Manuel Bouyer: hash seq enumeration skips keys on big data.
 >  XXX: pullup-7

 thanks for looking at this.
 But there's still something wrong. I think the db file is much larger than it
 should be (24k for 4k pages, 128k for 32k pages, and 76M for 64k pages).
 Also, with spamassassin, I still get a corrupted/unreadable db file.

 I couldn't reproduce the corruption without spamassassin (maybe this requires
 non-ascii keys, which I can't do with a simple shell script).

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: christos@zoulas.com (Christos Zoulas)
To: gnats-bugs@NetBSD.org, lib-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org, Manuel.Bouyer@lip6.fr
Cc: 
Subject: Re: PR/50441 CVS commit: src/lib/libc/db/hash
Date: Tue, 17 Nov 2015 19:24:44 -0500

 On Nov 17, 11:10pm, bouyer@antioche.eu.org (Manuel Bouyer) wrote:
 -- Subject: Re: PR/50441 CVS commit: src/lib/libc/db/hash

 |  thanks for looking at this.
 |  But there's still something wrong. I think the db file is much larger than it
 |  should be (24k for 4k pages, 128k for 32k pages, and 76M for 64k pages).

 Heh, I know. I just fixed this.

 |  Also, with spamassassin, I still get a corrupted/unreadable db file.
 |  
 |  I couldn't reproduce the corruption without spamassassin (maybe this requires
 |  non-ascii keys, which I can't do with a simple shell script).

 I don't know how to debug that! test please?

 christos

From: Manuel Bouyer <bouyer@antioche.eu.org>
To: Christos Zoulas <christos@zoulas.com>
Cc: gnats-bugs@NetBSD.org, lib-bug-people@netbsd.org, gnats-admin@netbsd.org,
        netbsd-bugs@netbsd.org
Subject: Re: PR/50441 CVS commit: src/lib/libc/db/hash
Date: Wed, 18 Nov 2015 12:30:33 +0100

 On Tue, Nov 17, 2015 at 07:24:44PM -0500, Christos Zoulas wrote:
 > On Nov 17, 11:10pm, bouyer@antioche.eu.org (Manuel Bouyer) wrote:
 > -- Subject: Re: PR/50441 CVS commit: src/lib/libc/db/hash
 > 
 > |  thanks for looking at this.
 > |  But there's still something wrong. I think the db file is much larger than it
 > |  should be (24k for 4k pages, 128k for 32k pages, and 76M for 64k pages).
 > 
 > Heh, I know. I just fixed this.

 sorry for being impatient :)

 > 
 > |  Also, with spamassassin, I still get a corrupted/unreadable db file.
 > |  
 > |  I couldn't reproduce the corruption without spamassassin (maybe this requires
 > |  non-ascii keys, which I can't do with a simple shell script).
 > 
 > I don't know how to debug that! test please?

 With
 src/lib/libc/db/hash/hash_page.c 1.27 and
 src/lib/libc/db/hash/hash.c 1.36
 pulled up to netbsd-7, the problem seems to be fixed.
 With my shell script test, the db file is 256k instead of 76M for the 64k page
 test, and spamassassin also produces a file of reasonable size, without
 complaints.

 I've reenabled the bayes filter on my mail server, let's see how it works after
 a few hours.

 -- 
 Manuel Bouyer <bouyer@antioche.eu.org>
      NetBSD: 26 ans d'experience feront toujours la difference
 --

From: christos@zoulas.com (Christos Zoulas)
To: Manuel Bouyer <bouyer@antioche.eu.org>
Cc: gnats-bugs@NetBSD.org, lib-bug-people@netbsd.org, 
	gnats-admin@netbsd.org, netbsd-bugs@netbsd.org
Subject: Re: PR/50441 CVS commit: src/lib/libc/db/hash
Date: Wed, 18 Nov 2015 07:22:30 -0500

 On Nov 18, 12:30pm, bouyer@antioche.eu.org (Manuel Bouyer) wrote:
 -- Subject: Re: PR/50441 CVS commit: src/lib/libc/db/hash

 | > Heh, I know. I just fixed this.
 | 
 | sorry for being impatient :)
 | 
 | > 
 | > |  Also, with spamassassin, I still get a corrupted/unreadable db file.
 | > |  
 | > |  I couldn't reproduce the corruption without spamassassin (maybe this requires
 | > |  non-ascii keys, which I can't do with a simple shell script).
 | > 
 | > I don't know how to debug that! test please?
 | 
 | With
 | src/lib/libc/db/hash/hash_page.c 1.27 and
 | src/lib/libc/db/hash/hash.c 1.36
 | pulled up to netbsd-7, the problem seems to be fixed.
 | With my shell script test, the db file is 256k instead of 76M for the 64k page
 | test, and spamassassin also produces a file of reasonable size, without
 | complaints.
 | 
 | I've reenabled the bayes filter on my mail server, let's see how it works after
 | a few hours.

 This db code is complicated and broken. It is a fun brain-teaser problem.
 With the latest patches, one of the unit tests breaks. I'll have to track
 that down now :-)

 christos

From: "Manuel Bouyer" <bouyer@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/50441 CVS commit: [netbsd-7] src/lib/libc/db/hash
Date: Sun, 22 Nov 2015 14:15:14 +0000

 Module Name:	src
 Committed By:	bouyer
 Date:		Sun Nov 22 14:15:14 UTC 2015

 Modified Files:
 	src/lib/libc/db/hash [netbsd-7]: hash.c hash.h hash_bigkey.c
 	    hash_page.c

 Log Message:
 Pull up following revision(s) (requested by christos in ticket #1046):
 	lib/libc/db/hash/hash_page.c: revision 1.27
 	lib/libc/db/hash/hash_page.c: revision 1.28
 	lib/libc/db/hash/hash.h: revision 1.16
 	lib/libc/db/hash/hash.c: revision 1.36
 	lib/libc/db/hash/hash.c: revision 1.37
 	lib/libc/db/hash/hash.c: revision 1.38
 	lib/libc/db/hash/hash_bigkey.c: revision 1.25
 Account for the -1 hack to fit 0x10000 in a short in hash_page.c
 Introduce a HASH_BSIZE macro to return the blocksize; in the 64K case this
 returns 0xffff to avoid overflow. This is used where sizes are stored.
 If MAX_BSIZE == hashp->BSIZE (65536) then it does not fit in a short, and
 we end up storing 0... This means that every entry needs a page. We store
 MAX_BSIZE - 1 here, but it would be better to always store (avail - 1) here
 so that we don't waste a byte and stay consistent.
 PR/50441: Manuel Bouyer: hash seq enumeration skips keys on big data.
 XXX: pullup-7


 To generate a diff of this commit:
 cvs rdiff -u -r1.33.4.1 -r1.33.4.2 src/lib/libc/db/hash/hash.c
 cvs rdiff -u -r1.15 -r1.15.40.1 src/lib/libc/db/hash/hash.h
 cvs rdiff -u -r1.24 -r1.24.10.1 src/lib/libc/db/hash/hash_bigkey.c
 cvs rdiff -u -r1.26 -r1.26.4.1 src/lib/libc/db/hash/hash_page.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

State-Changed-From-To: open->pending-pullups
State-Changed-By: bouyer@NetBSD.org
State-Changed-When: Sun, 22 Nov 2015 14:28:10 +0000
State-Changed-Why:
Fixed in HEAD and netbsd-7; waiting on pullup-6 #1349


State-Changed-From-To: pending-pullups->closed
State-Changed-By: jdolecek@NetBSD.org
State-Changed-When: Thu, 17 Aug 2017 18:52:45 +0000
State-Changed-Why:
Pulled to netbsd-7 in Nov 2015, so is part of 7.1. Thank you.


>Unformatted:
