NetBSD Problem Report #45539
From woods@once.weird.com Sat Oct 29 01:52:26 2011
Return-Path: <woods@once.weird.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
by www.NetBSD.org (Postfix) with ESMTP id D629C63CAE8
for <gnats-bugs@gnats.NetBSD.org>; Sat, 29 Oct 2011 01:52:25 +0000 (UTC)
Message-Id: <m1RJy5r-001EBmC@once.weird.com>
Date: Fri, 28 Oct 2011 18:52:19 -0700 (PDT)
From: "Greg A. Woods" <woods@planix.com>
Sender: "Greg A. Woods" <woods@once.weird.com>
Reply-To: "Greg A. Woods" <woods@planix.com>
To: gnats-bugs@gnats.NetBSD.org
Subject: add support for getrusage(2) memory size statistics
X-Send-Pr-Version: 3.95
>Number: 45539
>Category: kern
>Synopsis: add support for getrusage(2) memory size statistics
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat Oct 29 01:55:00 +0000 2011
>Last-Modified: Mon Oct 31 03:55:00 +0000 2011
>Originator: Greg A. Woods
>Release: NetBSD -current 2011/10/28
>Organization:
Planix, Inc.; Kelowna, BC; Canada
>Environment:
System: NetBSD
Architecture: all
Machine: all
>Description:
some time between 4.3BSD and 4.4BSD the code to track process
memory size statistics was removed and never re-written, so it
has been missing in NetBSD since the beginning
>How-To-Repeat:
run "/usr/bin/time -l some-big-long-running-process" and note
that the memory size stats are always zero
>Fix:
(untested in -current, but tested in netbsd-5)
--- kern_clock.c 28 Sep 2011 16:06:27 -0700 1.128
+++ kern_clock.c 28 Oct 2011 18:43:48 -0700
@@ -427,6 +427,36 @@
spc->spc_cp_time[CP_IDLE]++;
}
}
+ /*
+ * If the CPU is currently scheduled to a non-idle process, then charge
+ * that process with the appropriate VM resource utilization for a tick.
+ *
+ * Assume that the current process has been running the entire last
+ * tick, and account for VM use regardless of whether in user mode or
+ * system mode (XXX or interrupt mode?).
+ *
+ * rusage VM stats are expressed in kilobytes * ticks-of-execution.
+ */
+ /* based on code from 4.3BSD kern_clock.c and from FreeBSD */
+
+# define pg2kb(n) (((n) * PAGE_SIZE) / 1024)
+
+ if (p != NULL) {
+ struct vmspace *vm = p->p_vmspace;
+ long rss;
+
+ /*
+ * XXX is p_stmutex currently enough to protect updates to
+ * p_stats too?
+ */
+ p->p_stats->p_ru.ru_idrss += pg2kb(vm->vm_dsize); /* unshared data */
+ p->p_stats->p_ru.ru_isrss += pg2kb(vm->vm_ssize); /* unshared stack */
+ p->p_stats->p_ru.ru_ixrss += pg2kb(vm->vm_tsize); /* "shared" text? */
+
+ rss = pg2kb(vm_resident_count(vm));
+ if (rss > p->p_stats->p_ru.ru_maxrss)
+ p->p_stats->p_ru.ru_maxrss = rss;
+ }
spc->spc_pscnt = psdiv;
if (p != NULL) {
>Audit-Trail:
From: "Greg A. Woods" <woods@planix.ca>
To: NetBSD GNATS <gnats-bugs@NetBSD.org>
Cc:
Subject: Re: kern/45539: add support for getrusage(2) memory size statistics
Date: Fri, 28 Oct 2011 19:38:26 -0700
--pgp-sign-Multipart_Fri_Oct_28_19:38:26_2011-1
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
BTW, by "tested on netbsd-5" I should say that I've successfully run a
netbsd-5 kernel with this patch and I have seen from time(1)'s '-l'
option that it now "does something".
I don't know if the results are right -- I'm not sure they're even
really close. I also note that sometimes it doesn't show anything
at all, but it seems to be on the right track.
Here's an example of time(1)'s output after my tcountbits test run:
75.47 real 0.32 user 75.12 sys
204 maximum resident set size
150 average shared memory size
513 average unshared data size
11 average unshared stack size
21 page reclaims
8 page faults
0 swaps
0 block input operations
0 block output operations
0 messages sent
0 messages received
0 signals received
5 voluntary context switches
142 involuntary context switches
Note this is a static-linked binary, and here from the size(1) info it
looks like the text size seems about right, and I guess the stack could
be right too, but the data size seems wonky:
# size ~/tmp/tcountbits
text data bss dec hex filename
152953 3176 15572 171701 29eb5 /root/tmp/tcountbits
Hmmm.... maybe it's not wonky, as it seems that once the first set of
results has been shown the virtual size skyrockets:
# /usr/bin/time -l tmp/tcountbits -ti 1=20
tcountbits: using CLOCK_MONOTONIC timer with resolution: 0 s, 279 ns
tcountbits: now running each algorithm for 1000000 iterations....
^Z[1] + Stopped /usr/bin/time -l tmp/tcountbits -ti 1=20
# ps -lp 376
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
0 376 375 0 43 0 184 192 - T tty00 0:01.23 tmp/tcountbits -=
ti 1
# fg
/usr/bin/time -l tmp/tcountbits -ti 1=20
time() =3D -0.0004 us/c user, 8.8335 us/c sys, 39.5177 us/c wa=
it, 48.3507 us/c wall
^Z[1] + Stopped /usr/bin/time -l tmp/tcountbits -ti 1=20
# ps -lp 376 =20
UID PID PPID CPU PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
0 376 375 2998 42 0 1772 212 - T tty00 0:11.49 tmp/tcountbits -=
ti 1
The numbers from getrusage() are not quite matching what I see from
ps(1) either, but they're very close -- perhaps the averaging explains
the differences.
--=20
Greg A. Woods
Planix, Inc.
<woods@planix.com> +1 250 762-7675 http://www.planix.com/
--pgp-sign-Multipart_Fri_Oct_28_19:38:26_2011-1
Content-Type: application/pgp-signature
Content-Transfer-Encoding: 7bit
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (NetBSD)
iD8DBQBOq2ciZn1xt3i/9H8RAmaEAKDGlLNnj577PVcr9IDkpYkqZpe6bACcCvSD
bRVWjR8IyYyYAFrrp813NKQ=
=bQW2
-----END PGP SIGNATURE-----
--pgp-sign-Multipart_Fri_Oct_28_19:38:26_2011-1--
From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc:
Subject: PR/45539 CVS commit: src/sys/kern
Date: Sat, 29 Oct 2011 11:58:38 -0400
Module Name: src
Committed By: christos
Date: Sat Oct 29 15:58:38 UTC 2011
Modified Files:
src/sys/kern: kern_clock.c
Log Message:
PR/45539: Greg A. Woods: add support for getrusage(2) memory size statistics
To generate a diff of this commit:
cvs rdiff -u -r1.128 -r1.129 src/sys/kern/kern_clock.c
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
netbsd-bugs@netbsd.org
Subject: re: kern/45539: add support for getrusage(2) memory size statistics
Date: Sun, 30 Oct 2011 13:11:55 +1100
> >Synopsis: add support for getrusage(2) memory size statistics
thanks for looking at this. i've wished it worked for ages.
this is from statclock():
> + /*
> + * If the CPU is currently scheduled to a non-idle process, then charge
> + * that process with the appropriate VM resource utilization for a tick.
> + *
> + * Assume that the current process has been running the entire last
> + * tick, and account for VM use regardless of whether in user mode or
> + * system mode (XXX or interrupt mode?).
> + *
> + * rusage VM stats are expressed in kilobytes * ticks-of-execution.
> + */
> + /* based on code from 4.3BSD kern_clock.c and from FreeBSD */
> +
> +# define pg2kb(n) (((n) * PAGE_SIZE) / 1024)
> +
> + if (p != NULL) {
> + struct vmspace *vm = p->p_vmspace;
> + long rss;
> +
> + /*
> + * XXX is p_stmutex currently enough to protect updates to
> + * p_stats too?
> + */
> + p->p_stats->p_ru.ru_idrss += pg2kb(vm->vm_dsize); /* unshared data */
> + p->p_stats->p_ru.ru_isrss += pg2kb(vm->vm_ssize); /* unshared stack */
> + p->p_stats->p_ru.ru_ixrss += pg2kb(vm->vm_tsize); /* "shared" text? */
this is the wrong way to get these values. the += will
accumulate over time, 'hz' times a second, depend on the target.
this is probably why you have wrong values.
i don't think statclock() is the right place to do it. i would
simply grab them when filling in the rusage as normal. (but this
makes me question what these values actually mean. the member
names are "ru_i?rss". is the "rss" supposed to imply "resident
set size" here, to? if so, i don't know that we have any way of
getting that info.)
> + rss = pg2kb(vm_resident_count(vm));
> + if (rss > p->p_stats->p_ru.ru_maxrss)
> + p->p_stats->p_ru.ru_maxrss = rss;
> + }
initially, i didn't like the look of this one either.
vm_resident_count() can be expensive (but i don't think it is
on any current netbsd ports) but i'm not sure that pushing the
cost of it anywhere else would help. eg, pushing it into lwp
switch would end up being more expensive to maintain on a system
with very active but switching lwp's.
.mrg.
From: "Greg A. Woods" <woods@planix.ca>
To: NetBSD GNATS <gnats-bugs@NetBSD.org>
Cc:
Subject: Re: kern/45539: add support for getrusage(2) memory size statistics
Date: Sun, 30 Oct 2011 20:53:25 -0700
--pgp-sign-Multipart_Sun_Oct_30_20:53:25_2011-1
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
At Sun, 30 Oct 2011 02:15:06 +0000 (UTC), matthew green <mrg@eterna.com.au>=
wrote:
Subject: re: kern/45539: add support for getrusage(2) memory size statistics
>=20
> this is from statclock():
> =20
> > + * rusage VM stats are expressed in kilobytes * ticks-of-execution.
> =20
> this is the wrong way to get these values. the +=3D will
> accumulate over time, 'hz' times a second, depend on the target.
> this is probably why you have wrong values.
As per the comment I left quoted above, this accumulation over time is
actually what's necessary to meet the specifications.
You'll also see in time(1) that it displays the average memory size per
tick of execution, not the raw value from getrusage().
(And, after all, the instantaneous values are not representative of the
usage over the interval being measured -- also the instantaneous values
are available through ps, etc., though self-reference through sysctl(2)
could be made a wee bit easier, though maybe kern.proc2 with
KERN_PROC_PID is good enough)
> initially, i didn't like the look of this one either.
> vm_resident_count() can be expensive (but i don't think it is
> on any current netbsd ports) but i'm not sure that pushing the
> cost of it anywhere else would help. eg, pushing it into lwp
> switch would end up being more expensive to maintain on a system
> with very active but switching lwp's.
Exactly. With statclock() "normally" running at 1/100'th the rate of
hardclock() (e.g. on i386) this is the least expensive place for such
statistics to be gathered.
On the other hand it seems this difference in statclock() vs. timeslice
rates leads to very poor statistics. (the subject of my recent post in
tech-kern)
--=20
Greg A. Woods
Planix, Inc.
<woods@planix.com> +1 250 762-7675 http://www.planix.com/
--pgp-sign-Multipart_Sun_Oct_30_20:53:25_2011-1
Content-Type: application/pgp-signature
Content-Transfer-Encoding: 7bit
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (NetBSD)
iD8DBQBOrhu1Zn1xt3i/9H8RAp2xAJ4xGr3KLqhhd/hhrpuaMKj6A0yMnACg1B8D
9ufUkTAFqI8tLo99LTWchZU=
=k09I
-----END PGP SIGNATURE-----
--pgp-sign-Multipart_Sun_Oct_30_20:53:25_2011-1--
(Contact us)
$NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007
The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.