[Developers] Proposed Cactus Timer API Completion

Steve White swhite at aei.mpg.de
Wed Aug 11 04:04:07 CDT 2004


On Tue, 10 Aug 2004, John Shalf wrote:
> do you want any non-get-time-of-day counter/timer examples? For
> instance, I've got timers that use PAPI, HPMCOUNT, and Linux RTC
> counters to extract timing info at sub-microsecond accuracy.
> I've been having significant problems with the gettimeofday() based
> counter-timers giving incorrect results on the Cray and NEC systems
> due to timer granularity issues.  I can provide you with some specific
> examples where the Cactus timers diverged from the hardware
> performance counters by 200% or more in the final timer dump. So
> hiding more accurate counter/timers beneath your API will be very
> useful from my standpoint.
> In order to support these "hidden timers", the timer API would benefit
> from simple opaque addition/subtraction operators.  Internally, you
> can use a timer-dependent accurate mechanism for performing the
> addition/subtraction correctly.

The Cactus timer mechanism as it is currently implemented has some rather
interesting features. The primary feature is that a new "clock" can be
added, and times can be reported from it throughout the code, without
touching any existing code at all.  One need only compile it in as a thorn,
and activate the thorn in the par file.

So I think Cactus already supports what you call "hidden timers", 
although they aren't exactly hidden.

As far as addition/subtraction operators are concerned, I think (well...I
would hope) the existing mechanism already does the necessary operations
correctly.  The user of the code only wants to know how much time has
elapsed since the last call; the timer API should separate them from the
details of how to correctly calculate the value.
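
As an illustration of the kind of detail the API should hide, here is a
minimal sketch of subtracting two gettimeofday-style readings with the
microsecond borrow handled.  The names timeval_diff and timeval_seconds are
hypothetical and are not part of Cactus; the sketch assumes the end reading
is not earlier than the start reading.

```c
#include <sys/time.h>

/* Hypothetical helper: returns end - start, normalised so that
   0 <= tv_usec < 1000000.  Assumes end is not earlier than start. */
struct timeval timeval_diff(struct timeval start, struct timeval end)
{
    struct timeval d;

    d.tv_sec  = end.tv_sec  - start.tv_sec;
    d.tv_usec = end.tv_usec - start.tv_usec;
    if (d.tv_usec < 0) {
        /* The microsecond field wrapped: borrow one second. */
        d.tv_sec  -= 1;
        d.tv_usec += 1000000;
    }
    return d;
}

/* Hypothetical helper: convert a timeval to seconds as a double. */
double timeval_seconds(struct timeval t)
{
    return (double)t.tv_sec + (double)t.tv_usec / 1e6;
}
```

A caller would take one reading at start, one at stop, and only ever see the
result of timeval_diff, never the raw fields.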

The built-in Cactus clocks are 'gettimeofday' and 'getrusage', which are
based on the unix calls of the same names.  The former reports "wall
time", the latter reports CPU time.  The reason they are built-in is that
they are universal on unix systems, not because they're wonderful.

The other timers you mentioned would be easy to add as individual thorns.
I have written a timer thorn based on the MPI timer.  Maybe it would be
worthwhile to include it in the distribution as an example.

Attached find a couple of examples demonstrating the granularity of the
built-in clocks.  On my laptop, the value for gettimeofday wavers between
2 and 3 microseconds, but sometimes can be much larger.  The getrusage
timer seems to have a resolution of one millisecond (!).

Concerning the discrepancy you mentioned, the difference between the wall
time and a hardware counter time doesn't mean much.  If another process
ran while yours ran, this would increase the wall time arbitrarily but not
the hardware counter.  Comparing your high-resolution counter to getrusage
might also be fruitless if you try to measure a time interval that is near
the (very coarse) granularity of getrusage.
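
To make the wall time versus CPU time distinction concrete, here is a small
sketch that measures both across a sleep.  A sleep stands in for "another
process ran while yours waited": the time counts toward the gettimeofday
reading but not toward the getrusage reading.  The function names are
hypothetical, and nanosleep is assumed to be available (POSIX).

```c
#define _POSIX_C_SOURCE 200809L  /* for nanosleep under strict -std= modes */
#include <stddef.h>
#include <time.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Wall-clock time in seconds, from gettimeofday(). */
double wall_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* User plus system CPU time of this process in seconds, from getrusage(). */
double cpu_seconds(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec / 1e6
         + (double)ru.ru_stime.tv_sec + (double)ru.ru_stime.tv_usec / 1e6;
}

/* Sleep for ms milliseconds and report how much wall time and how much
   CPU time elapsed across the sleep. */
void measure_sleep(long ms, double *wall, double *cpu)
{
    struct timespec req;
    double w0, c0;

    req.tv_sec  = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;

    w0 = wall_seconds();
    c0 = cpu_seconds();
    nanosleep(&req, NULL);
    *wall = wall_seconds() - w0;
    *cpu  = cpu_seconds() - c0;
}
```

Calling measure_sleep(100, &wall, &cpu) should report a wall time of roughly
a tenth of a second and a CPU time close to zero, so a large ratio between
the two says nothing by itself about either clock being wrong.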

So the default clocks could be right, in their respective senses and
accuracies, despite the discrepancy you see.  If you think something else
might be wrong, I'll definitely look at it!

Maybe a few words on this topic should be put in the Users' Guide.

> I think someone requested a subroutine call to request the timer
> granularity.  I think collecting that data will be useful to keep for
> the innards of your timing API for sanity checks. For instance, if the
> elapsed time for any timer start/stop event is < 8x the timer
> granularity, this should set some kind of "flag" associated with that
> timer to indicate that its results are not to be trusted.  So I'd
> advocate incorporating the timer granularity into your API design even
> if that information is only used internal to the timing API.

That was Erik.
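
The sanity check described in the quoted text could look roughly like the
sketch below.  Everything here is hypothetical (the timer_record struct, the
function name, and the way granularity is passed in); only the factor of 8
comes from the suggestion above.

```c
/* Factor from the suggestion: intervals shorter than 8x the clock
   granularity are not to be trusted. */
#define MIN_TRUSTED_TICKS 8.0

/* Hypothetical per-timer bookkeeping. */
typedef struct {
    double total;    /* accumulated elapsed time, in seconds */
    int    suspect;  /* non-zero once any interval was too short to trust */
} timer_record;

/* Add one start/stop interval to the record.  granularity is the clock's
   resolution in seconds; a non-positive value means "unknown", which is
   also treated as untrustworthy. */
void timer_record_add(timer_record *t, double elapsed, double granularity)
{
    t->total += elapsed;
    if (granularity <= 0.0 || elapsed < MIN_TRUSTED_TICKS * granularity)
        t->suspect = 1;
}
```

The flag is sticky on purpose: one unreliable interval taints the
accumulated total, which is what matters in the final timer dump.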

I appreciate the usefulness of granularity.  The only thing that stopped
me from implementing it was that I couldn't find a simple, portable way to
determine granularity for the built-in timers 'gettimeofday' and
'getrusage' (the MPI timer has an API call that reports it).

If I can come up with a sensible way to report granularity for these, I
will straightaway add a CCTK_ClockGranularity call (or would 'Resolution' be
better?  I think it's a more commonly understood term).
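
One possible way to get a number for gettimeofday, sketched under the
assumption that an empirical estimate is acceptable: spin until the clock
advances and keep the smallest step seen.  The function name is
hypothetical, and the result is only an upper bound on the true resolution,
since it includes the cost of the calls themselves.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical estimator: returns the smallest non-zero step, in
   microseconds, observed between successive gettimeofday() readings over
   the given number of samples.  Returns -1 if samples is not positive. */
long estimate_gettimeofday_step_usec(int samples)
{
    long best = -1;
    int n;

    for (n = 0; n < samples; n++) {
        struct timeval t0, t1;
        long step;

        gettimeofday(&t0, NULL);
        do {
            /* Spin until the reported time moves past t0. */
            gettimeofday(&t1, NULL);
            step = (long)(t1.tv_sec - t0.tv_sec) * 1000000L
                 + (long)(t1.tv_usec - t0.tv_usec);
        } while (step <= 0);

        if (best < 0 || step < best)
            best = step;
    }
    return best;
}
```

The same loop applied to getrusage readings would give the corresponding
estimate for that clock, at the cost of burning CPU until it ticks.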


Steve White : Programmer
Max-Planck-Institut für Gravitationsphysik      Albert-Einstein-Institut
Am Mühlenberg 1, D-14476 Golm, Germany                  +49-331-567-7329

-------------- next part --------------
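#include <stdio.h>
#include <sys/time.h>
#include <time.h>

/* Print the interval between two back-to-back gettimeofday() calls,
   30 times over. */
int main( void )
{
	int i;
	long usec;
	struct timeval start, end;

	printf( "gettimeofday\n" );
	for( i = 0; i < 30; i++ )
	{
		gettimeofday( &start, NULL );
		gettimeofday( &end, NULL );
		/* include tv_sec so that a wrap of tv_usec is handled */
		usec = ( end.tv_sec - start.tv_sec ) * 1000000L
			+ ( end.tv_usec - start.tv_usec );
		printf( "%ld usec\n", usec );
	}
	return 0;
}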
-------------- next part --------------
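#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Poll getrusage() until the reported user time changes, then print
   the size of that step and how many calls it took. */
int main( void )
{
	int i;
	long usec;
	struct rusage ru_start, ru_end;

	getrusage( RUSAGE_SELF, &ru_start );
	for( i = 0; i < 100000; i++ )
	{
		getrusage( RUSAGE_SELF, &ru_end );
		usec = ( ru_end.ru_utime.tv_sec - ru_start.ru_utime.tv_sec ) * 1000000L
			+ ( ru_end.ru_utime.tv_usec - ru_start.ru_utime.tv_usec );
		if( usec > 0 )
		{
			printf( "%ld usec; %d iterations\n", usec, i );
			break;	/* stop at the first observed step */
		}
	}
	return 0;
}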
