NetBSD Problem Report #45539

From woods@once.weird.com  Sat Oct 29 01:52:26 2011
Return-Path: <woods@once.weird.com>
Received: from mail.netbsd.org (mail.netbsd.org [204.152.190.11])
	by www.NetBSD.org (Postfix) with ESMTP id D629C63CAE8
	for <gnats-bugs@gnats.NetBSD.org>; Sat, 29 Oct 2011 01:52:25 +0000 (UTC)
Message-Id: <m1RJy5r-001EBmC@once.weird.com>
Date: Fri, 28 Oct 2011 18:52:19 -0700 (PDT)
From: "Greg A. Woods" <woods@planix.com>
Sender: "Greg A. Woods" <woods@once.weird.com>
Reply-To: "Greg A. Woods" <woods@planix.com>
To: gnats-bugs@gnats.NetBSD.org
Subject: add support for getrusage(2) memory size statistics
X-Send-Pr-Version: 3.95

>Number:         45539
>Category:       kern
>Synopsis:       add support for getrusage(2) memory size statistics
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Oct 29 01:55:00 +0000 2011
>Last-Modified:  Mon Oct 31 03:55:00 +0000 2011
>Originator:     Greg A. Woods
>Release:        NetBSD -current 2011/10/28
>Organization:
Planix, Inc.; Kelowna, BC; Canada
>Environment:
System: NetBSD
Architecture: all
Machine: all
>Description:

	some time between 4.3BSD and 4.4BSD the code to track process
	memory size statistics was removed and never re-written, so it
	has been missing in NetBSD since the beginning

>How-To-Repeat:

	run "/usr/bin/time -l some-big-long-running-process" and note
	that the memory size stats are always zero

>Fix:

	(untested in -current, but tested in netbsd-5)

--- kern_clock.c	28 Sep 2011 16:06:27 -0700	1.128
+++ kern_clock.c	28 Oct 2011 18:43:48 -0700	
@@ -427,6 +427,36 @@
 			spc->spc_cp_time[CP_IDLE]++;
 		}
 	}
+	/*
+	 * If the CPU is currently scheduled to a non-idle process, then charge
+	 * that process with the appropriate VM resource utilization for a tick.
+	 *
+	 * Assume that the current process has been running the entire last
+	 * tick, and account for VM use regardless of whether in user mode or
+	 * system mode (XXX or interrupt mode?).
+	 *
+	 * rusage VM stats are expressed in kilobytes * ticks-of-execution.
+	 */
+	/* based on code from 4.3BSD kern_clock.c and from FreeBSD */
+
+# define pg2kb(n)	(((n) * PAGE_SIZE) / 1024)
+
+	if (p != NULL) {
+		struct vmspace *vm = p->p_vmspace;
+		long rss;
+
+		/*
+		 * XXX is p_stmutex currently enough to protect updates to
+		 * p_stats too?
+		 */
+		p->p_stats->p_ru.ru_idrss += pg2kb(vm->vm_dsize); /* unshared data */
+		p->p_stats->p_ru.ru_isrss += pg2kb(vm->vm_ssize); /* unshared stack */
+		p->p_stats->p_ru.ru_ixrss += pg2kb(vm->vm_tsize); /* "shared" text? */
+
+		rss = pg2kb(vm_resident_count(vm));
+		if (rss > p->p_stats->p_ru.ru_maxrss)
+			p->p_stats->p_ru.ru_maxrss = rss;
+	}
 	spc->spc_pscnt = psdiv;

 	if (p != NULL) {

>Audit-Trail:
From: "Greg A. Woods" <woods@planix.ca>
To: NetBSD GNATS <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: kern/45539: add support for getrusage(2) memory size statistics
Date: Fri, 28 Oct 2011 19:38:26 -0700

 --pgp-sign-Multipart_Fri_Oct_28_19:38:26_2011-1
 Content-Type: text/plain; charset=US-ASCII
 Content-Transfer-Encoding: quoted-printable

 BTW, by "tested on netbsd-5" I should say that I've successfully run a
 netbsd-5 kernel with this patch and I have seen from time(1)'s '-l'
 option that it now "does something".

 I don't know if the results are right -- I'm not sure they're even
 really close.  I also note that sometimes it doesn't show anything
 at all, but it seems to be on the right track.

 Here's an example of time(1)'s output after my tcountbits test run:

        75.47 real         0.32 user        75.12 sys
        204  maximum resident set size
        150  average shared memory size
        513  average unshared data size
         11  average unshared stack size
         21  page reclaims
          8  page faults
          0  swaps
          0  block input operations
          0  block output operations
          0  messages sent
          0  messages received
          0  signals received
          5  voluntary context switches
        142  involuntary context switches

 Note this is a static-linked binary, and here from the size(1) info it
 looks like the text size seems about right, and I guess the stack could
 be right too, but the data size seems wonky:

 # size ~/tmp/tcountbits
    text    data     bss     dec     hex filename
  152953    3176   15572  171701   29eb5 /root/tmp/tcountbits

 Hmmm.... maybe it's not wonky, as it seems that once the first set of
 results has been shown the virtual size skyrockets:

 # /usr/bin/time -l tmp/tcountbits -ti 1=20
 tcountbits: using CLOCK_MONOTONIC timer with resolution: 0 s, 279 ns
 tcountbits: now running each algorithm for 1000000 iterations....
 ^Z[1] + Stopped              /usr/bin/time -l tmp/tcountbits -ti 1=20
 # ps -lp 376
 UID PID PPID CPU PRI NI  VSZ RSS WCHAN  STAT TTY      TIME COMMAND
   0 376  375   0  43  0  184 192 -      T    tty00 0:01.23 tmp/tcountbits -=
 ti 1
 # fg
 /usr/bin/time -l tmp/tcountbits -ti 1=20
             time() =3D -0.0004 us/c user,  8.8335 us/c sys, 39.5177 us/c wa=
 it, 48.3507 us/c wall
 ^Z[1] + Stopped              /usr/bin/time -l tmp/tcountbits -ti 1=20
 # ps -lp 376 =20
 UID PID PPID  CPU PRI NI  VSZ RSS WCHAN STAT TTY      TIME COMMAND
   0 376  375 2998  42  0 1772 212 -     T    tty00 0:11.49 tmp/tcountbits -=
 ti 1


 The numbers from getrusage() are not quite matching what I see from
 ps(1) either, but they're very close -- perhaps the averaging explains
 the differences.

 --=20
 						Greg A. Woods
 						Planix, Inc.

 <woods@planix.com>       +1 250 762-7675        http://www.planix.com/

 --pgp-sign-Multipart_Fri_Oct_28_19:38:26_2011-1
 Content-Type: application/pgp-signature
 Content-Transfer-Encoding: 7bit

 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (NetBSD)

 iD8DBQBOq2ciZn1xt3i/9H8RAmaEAKDGlLNnj577PVcr9IDkpYkqZpe6bACcCvSD
 bRVWjR8IyYyYAFrrp813NKQ=
 =bQW2
 -----END PGP SIGNATURE-----

 --pgp-sign-Multipart_Fri_Oct_28_19:38:26_2011-1--

From: "Christos Zoulas" <christos@netbsd.org>
To: gnats-bugs@gnats.NetBSD.org
Cc: 
Subject: PR/45539 CVS commit: src/sys/kern
Date: Sat, 29 Oct 2011 11:58:38 -0400

 Module Name:	src
 Committed By:	christos
 Date:		Sat Oct 29 15:58:38 UTC 2011

 Modified Files:
 	src/sys/kern: kern_clock.c

 Log Message:
 PR/45539: Greg A. Woods: add support for getrusage(2) memory size statistics


 To generate a diff of this commit:
 cvs rdiff -u -r1.128 -r1.129 src/sys/kern/kern_clock.c

 Please note that diffs are not public domain; they are subject to the
 copyright notices on the relevant files.

From: matthew green <mrg@eterna.com.au>
To: gnats-bugs@NetBSD.org
Cc: kern-bug-people@netbsd.org, gnats-admin@netbsd.org,
    netbsd-bugs@netbsd.org
Subject: re: kern/45539: add support for getrusage(2) memory size statistics
Date: Sun, 30 Oct 2011 13:11:55 +1100

 > >Synopsis:       add support for getrusage(2) memory size statistics

 thanks for looking at this.  i've wished it worked for ages.

 this is from statclock():

 > +	/*
 > +	 * If the CPU is currently scheduled to a non-idle process, then charge
 > +	 * that process with the appropriate VM resource utilization for a tick.
 > +	 *
 > +	 * Assume that the current process has been running the entire last
 > +	 * tick, and account for VM use regardless of whether in user mode or
 > +	 * system mode (XXX or interrupt mode?).
 > +	 *
 > +	 * rusage VM stats are expressed in kilobytes * ticks-of-execution.
 > +	 */
 > +	/* based on code from 4.3BSD kern_clock.c and from FreeBSD */
 > +
 > +# define pg2kb(n)	(((n) * PAGE_SIZE) / 1024)
 > +
 > +	if (p != NULL) {
 > +		struct vmspace *vm = p->p_vmspace;
 > +		long rss;
 > +
 > +		/*
 > +		 * XXX is p_stmutex currently enough to protect updates to
 > +		 * p_stats too?
 > +		 */
 > +		p->p_stats->p_ru.ru_idrss += pg2kb(vm->vm_dsize); /* unshared data */
 > +		p->p_stats->p_ru.ru_isrss += pg2kb(vm->vm_ssize); /* unshared stack */
 > +		p->p_stats->p_ru.ru_ixrss += pg2kb(vm->vm_tsize); /* "shared" text? */

 this is the wrong way to get these values.  the += will
 accumulate over time, 'hz' times a second, depend on the target.
 this is probably why you have wrong values.

 i don't think statclock() is the right place to do it.  i would
 simply grab them when filling in the rusage as normal.  (but this
 makes me question what these values actually mean.  the member
 names are "ru_i?rss".  is the "rss" supposed to imply "resident
 set size" here, to?  if so, i don't know that we have any way of
 getting that info.)

 > +		rss = pg2kb(vm_resident_count(vm));
 > +		if (rss > p->p_stats->p_ru.ru_maxrss)
 > +			p->p_stats->p_ru.ru_maxrss = rss;
 > +	}

 initially, i didn't like the look of this one either.
 vm_resident_count() can be expensive (but i don't think it is
 on any current netbsd ports) but i'm not sure that pushing the
 cost of it anywhere else would help.  eg, pushing it into lwp
 switch would end up being more expensive to maintain on a system
 with very active but switching lwp's.


 .mrg.

From: "Greg A. Woods" <woods@planix.ca>
To: NetBSD GNATS <gnats-bugs@NetBSD.org>
Cc: 
Subject: Re: kern/45539: add support for getrusage(2) memory size statistics
Date: Sun, 30 Oct 2011 20:53:25 -0700

 --pgp-sign-Multipart_Sun_Oct_30_20:53:25_2011-1
 Content-Type: text/plain; charset=US-ASCII
 Content-Transfer-Encoding: quoted-printable

 At Sun, 30 Oct 2011 02:15:06 +0000 (UTC), matthew green <mrg@eterna.com.au>=
  wrote:
 Subject: re: kern/45539: add support for getrusage(2) memory size statistics
 >=20
 >  this is from statclock():
 > =20
 >  > +	 * rusage VM stats are expressed in kilobytes * ticks-of-execution.
 > =20
 >  this is the wrong way to get these values.  the +=3D will
 >  accumulate over time, 'hz' times a second, depend on the target.
 >  this is probably why you have wrong values.

 As per the comment I left quoted above, this accumulation over time is
 actually what's necessary to meet the specifications.

 You'll also see in time(1) that it displays the average memory size per
 tick of execution, not the raw value from getrusage().

 (And, after all, the instantaneous values are not representative of the
 usage over the interval being measured -- also the instantaneous values
 are available through ps, etc., though self-reference through sysctl(2)
 could be made a wee bit easier, though maybe kern.proc2 with
 KERN_PROC_PID is good enough)

 >  initially, i didn't like the look of this one either.
 >  vm_resident_count() can be expensive (but i don't think it is
 >  on any current netbsd ports) but i'm not sure that pushing the
 >  cost of it anywhere else would help.  eg, pushing it into lwp
 >  switch would end up being more expensive to maintain on a system
 >  with very active but switching lwp's.

 Exactly.  With statclock() "normally" running at 1/100'th the rate of
 hardclock() (e.g. on i386) this is the least expensive place for such
 statistics to be gathered.

 On the other hand it seems this difference in statclock() vs. timeslice
 rates leads to very poor statistics.  (the subject of my recent post in
 tech-kern)

 --=20
 						Greg A. Woods
 						Planix, Inc.

 <woods@planix.com>       +1 250 762-7675        http://www.planix.com/

 --pgp-sign-Multipart_Sun_Oct_30_20:53:25_2011-1
 Content-Type: application/pgp-signature
 Content-Transfer-Encoding: 7bit

 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.9 (NetBSD)

 iD8DBQBOrhu1Zn1xt3i/9H8RAp2xAJ4xGr3KLqhhd/hhrpuaMKj6A0yMnACg1B8D
 9ufUkTAFqI8tLo99LTWchZU=
 =k09I
 -----END PGP SIGNATURE-----

 --pgp-sign-Multipart_Sun_Oct_30_20:53:25_2011-1--

NetBSD Home
NetBSD PR Database Search

(Contact us) $NetBSD: query-full-pr,v 1.39 2013/11/01 18:47:49 spz Exp $
$NetBSD: gnats_config.sh,v 1.8 2006/05/07 09:23:38 tsutsui Exp $
Copyright © 1994-2007 The NetBSD Foundation, Inc. ALL RIGHTS RESERVED.