NetBSD Problem Report #39759

From cheusov@tut.by  Sat Oct 18 09:55:41 2008
Return-Path: <cheusov@tut.by>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by narn.NetBSD.org (Postfix) with ESMTP id F200463BC83
	for <gnats-bugs@gnats.netbsd.org>; Sat, 18 Oct 2008 09:55:40 +0000 (UTC)
Message-Id: <s93bpxi2pne.fsf@chen.chizhovka.net>
Date: Sat, 18 Oct 2008 12:55:49 +0300
From: cheusov@tut.by
Reply-To:
To: gnats-bugs@gnats.NetBSD.org
Subject: NetBSD awk (nawk) is 14 times slower than GNU awk
X-Send-Pr-Version: 3.95

>Number:         39759
>Category:       bin
>Synopsis:       NetBSD awk/nawk is 14 times slower than GNU awk
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    bin-bug-people
>State:          open
>Class:          change-request
>Submitter-Id:   net
>Arrival-Date:   Sat Oct 18 10:00:00 +0000 2008
>Originator:     cheusov@tut.by
>Release:        NetBSD 4.0_STABLE
>Organization:
>Environment:
System: NetBSD chen.chizhovka.net 4.0_STABLE NetBSD 4.0_STABLE (GENERIC) #2: Wed Sep 24 23:57:38 EEST 2008 cheusov@chen.chizhovka.net:/srv/obj/sys/arch/i386/compile/GENERIC i386
Architecture: i386
Machine: i386
>Description:
The following simple script shows that nawk is unacceptably slow compared
to GNU awk.

    ~/tmp/1.awk:
    {
        accu = accu "\n" $0
    }
    NF == 0 {
        print accu
        accu = ""
        next
    }

0 ~>time -p gawk -f ~/tmp/1.awk /srv/pkgsrc/pkg_src_summary.txt > /dev/null 
real 35.08
user 14.49
sys 20.58
0 ~>time -p mawk -f ~/tmp/1.awk /srv/pkgsrc/pkg_src_summary.txt > /dev/null 
real 68.26
user 27.47
sys 40.79
0 ~>time -p nbawk -f ~/tmp/1.awk /srv/pkgsrc/pkg_src_summary.txt > /dev/null 
real 263.79
user 202.20
sys 61.59
0 ~>

nbawk is wip/netbsd-awk compiled for Linux.
Under NetBSD-4, nawk behaves the same way: it is more than 12 times slower
than gawk.
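
For reference, the slowdown factors can be derived from the timings above.
The following is only a small C sketch of that arithmetic; the names
struct timing, GAWK, MAWK, NBAWK and slowdown() are made up for this
illustration and the numbers are copied from the time -p output.

```c
#include <assert.h>

/* Timings in seconds, copied from the `time -p` output in this report. */
struct timing { double real, user, sys; };

static const struct timing GAWK  = {  35.08,  14.49, 20.58 };
static const struct timing MAWK  = {  68.26,  27.47, 40.79 };
static const struct timing NBAWK = { 263.79, 202.20, 61.59 };

/* Returns how many times larger `t` is than `baseline`. */
double slowdown(double t, double baseline)
{
    assert(baseline > 0.0);
    return t / baseline;
}
```

With these numbers, slowdown(NBAWK.user, GAWK.user) is about 13.95, which
is where the "14 times" in the synopsis comes from. By wall-clock time,
slowdown(NBAWK.real, GAWK.real) is about 7.5, and by system time it is
about 3.0.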

/srv/pkgsrc/pkg_src_summary.txt is a summary of all packages in the
pkgsrc source tree, generated by the pkg_src_summary(1) utility
from the wip/pkg_summary-utils package.
It is a ~50 MB text file:
    http://www.mova.org/~cheusov/pub/pkg_src_summary.txt

I didn't run a profiler, but it seems to me that nawk is slower
because its string concatenation ALWAYS runs the malloc/memcpy/free
functions. This is extremely inefficient.

See the cat function in run.c.
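
To make the suspected cost concrete, here is a minimal C sketch. It is NOT
the code from run.c; concat_naive(), struct strbuf, strbuf_append() and
strbuf_reset() are hypothetical names invented for this illustration. The
first function does a malloc/memcpy/free on every concatenation, as
described above. The second keeps spare capacity and doubles it when
needed, so most appends copy only the new piece.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pattern suspected above: every concatenation allocates a new buffer,
 * copies the whole accumulator plus the new piece, and frees the old one.
 * Building a string from n pieces copies O(n^2) bytes in total. */
char *concat_naive(char *acc, const char *piece)
{
    size_t alen = acc ? strlen(acc) : 0;
    size_t plen = strlen(piece);
    char *out = malloc(alen + plen + 1);

    if (out == NULL)
        return NULL;                     /* caller still owns acc */
    if (alen > 0)
        memcpy(out, acc, alen);
    memcpy(out + alen, piece, plen + 1); /* includes the trailing NUL */
    free(acc);
    return out;
}

/* Hypothetical alternative: a buffer that remembers its capacity. */
struct strbuf {
    char  *data;
    size_t len;   /* bytes used, not counting the NUL */
    size_t cap;   /* bytes allocated */
};

/* Appends piece, doubling the capacity only when it runs out, so the
 * total copying is amortized O(total length). Returns 0 on success. */
int strbuf_append(struct strbuf *sb, const char *piece)
{
    assert(sb != NULL && piece != NULL);
    size_t plen = strlen(piece);
    size_t need = sb->len + plen + 1;

    if (need > sb->cap) {
        size_t ncap = sb->cap ? sb->cap : 64;
        while (ncap < need)
            ncap *= 2;
        char *p = realloc(sb->data, ncap);
        if (p == NULL)
            return -1;                   /* buffer left unchanged */
        sb->data = p;
        sb->cap = ncap;
    }
    memcpy(sb->data + sb->len, piece, plen + 1);
    sb->len += plen;
    return 0;
}

/* Empties the buffer but keeps the allocation (like accu = ""). */
void strbuf_reset(struct strbuf *sb)
{
    sb->len = 0;
    if (sb->data != NULL)
        sb->data[0] = '\0';
}
```

In terms of the test script, each accu = accu "\n" $0 would correspond to
one concat_naive() call in the first model, and to two strbuf_append()
calls in the second. Whether nawk can adopt something like the second
model depends on how it stores string values, which I have not checked.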

>Fix:

Unknown
