public inbox for systemtap@sourceware.org
 help / color / mirror / Atom feed
* gettimeofday_ns accuracy
@ 2010-08-07 17:02 Rob Donovan
  2010-08-07 21:53 ` Frank Ch. Eigler
  0 siblings, 1 reply; 6+ messages in thread
From: Rob Donovan @ 2010-08-07 17:02 UTC (permalink / raw)
  To: systemtap

Hi,

Is there any other kind of time measure other than gettimeofday_ns()?

Since gettimeofday_ns is not that accurate, and has the possibility (and
does) sometimes go backwards in time.

So if you're using it to measure syscall times for instance, between start
probe and .return probe, then sometimes (normally when the start probe is on
one CPU and it then switches to another CPU for the return) it gives a
negative duration for the call.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: gettimeofday_ns accuracy
  2010-08-07 17:02 gettimeofday_ns accuracy Rob Donovan
@ 2010-08-07 21:53 ` Frank Ch. Eigler
  2010-09-03  8:56   ` Rob Donovan
  0 siblings, 1 reply; 6+ messages in thread
From: Frank Ch. Eigler @ 2010-08-07 21:53 UTC (permalink / raw)
  To: Rob Donovan; +Cc: systemtap

"Rob Donovan" <rob@proivrc.com> writes:

> Is there any other kind of time measure other than gettimeofday_ns()?

There is also get_cycles() after the epolymous kernel function.

> Since gettimeofday_ns is not that accurate, and has the possibility
> (and does) sometimes go backwards in time.

Can you describe your system (CPU models, kernel versions) a little
more?  We may be able to use/expose other time sources too, or help
you write the bit of code that would do so.

- FChE

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: gettimeofday_ns accuracy
  2010-08-07 21:53 ` Frank Ch. Eigler
@ 2010-09-03  8:56   ` Rob Donovan
  2010-09-03 16:58     ` Steve Fink
  0 siblings, 1 reply; 6+ messages in thread
From: Rob Donovan @ 2010-09-03  8:56 UTC (permalink / raw)
  To: 'Frank Ch. Eigler'; +Cc: systemtap

Hi,

Sorry for the delay...

I think I've got a worse problem now with gettimeofday now.

If you run a script for a long time, it seems that the gettimeofday_s()
sometimes goes backwards by 1 second. And if you leave it running for 30
minutes (or up to 2 hrs sometimes) then some probes are returning with the
time 2-3 hrs ago.

I have the following script below, and if you start it, it will start off
with just a few errors of 1 second, but if you leave it running, it will get
worse and worse.

I've also included some of the output.

Now, this doesn't seem to happen on a RHEL5 kernel, but it does on RHEL6
Beta and also on Fedora (2.6.33.6-147.2.4.fc13.i686)

All the machines I've tried it on are multi CPU machines.

Sometimes its specific probes, like 'kprocess.exec' and sometimes it happens
with the 'timer.ms' one all the time. It seems random.

The RHEL5 & RHEL6 systems are on the same hardware, but dual boot.

Versions & machine info:

Fedora Machine:

[robdon@BlackSun ~ 09:50:27]$ stap -V
SystemTap translator/driver (version 1.3/0.148 non-git sources)
Copyright (C) 2005-2010 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
enabled features: AVAHI LIBRPM LIBSQLITE3 NSS BOOST_SHARED_PTR
TR1_UNORDERED_MAP

uname: Linux BlackSun 2.6.33.6-147.2.4.fc13.i686 #1 SMP Fri Jul 23 17:27:40
UTC 2010 i686 i686 i386 GNU/Linux

I have 4 CPUs, all like this:

[robdon@BlackSun ~ 09:47:36]$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 6
model name      : Intel(R) Pentium(R) D CPU 3.73GHz
stepping        : 4
cpu MHz         : 1400.000
cache size      : 2048 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 6
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm constant_tsc
pebs bts pni dtes64 monitor ds_cpl vmx cid cx16 xtpr pdcm lahf_lm tpr_shadow
bogomips        : 7465.81
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

RHEL6 Machine:

[cscrodo@xxxxxx rob]$ /usr/local/systap-releases/systemtap-1.3/bin/stap -V
SystemTap translator/driver (version 1.3/0.147 non-git sources)
Copyright (C) 2005-2010 Red Hat, Inc. and others
This is free software; see the source for copying conditions.
enabled features: TR1_UNORDERED_MAP

uname: Linux xxxxxxxxx 2.6.32-37.el6.x86_64 #1 SMP Sun Jun 20 19:29:35 EDT
2010 x86_64 x86_64 x86_64 GNU/Linux

This machine has 7 CPUS:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 4
model name      : Quad-Core AMD Opteron(tm) Processor 2384
stepping        : 2
cpu MHz         : 2671.732
cache size      : 512 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb
rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni
monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a
misalignsse 3dnowprefetch osvw ibs skinit wdt npt lbrv svm_lock nrip_save
bogomips        : 5399.90
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate



Script:

global last_time

probe begin
{
  mytime = gettimeofday_s()
  printf("Started at %s\n",ctime(mytime))
}

probe kprocess.exec
{
  mytime = gettimeofday_s()

  if (mytime < last_time)
  {
    printf("proc_exec: Oh dear, last time was %s but now its %s\n",
ctime(last_time), ctime(mytime))
  }

  last_time = mytime
}

probe timer.ms(10000)
{
  mytime = gettimeofday_s()

  if (mytime < last_time)
  {
    printf("timer: Oh dear, last time was %s but now its %s\n",
ctime(last_time), ctime(mytime))
  }

  last_time = mytime
}

Output:

Started at Thu Sep  2 20:08:44 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:00:02 2010 but now its Thu
Sep  2 21:00:01 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:01:02 2010 but now its Thu
Sep  2 21:01:01 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:10:02 2010 but now its Thu
Sep  2 21:10:01 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:20:02 2010 but now its Thu
Sep  2 21:20:01 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:30:02 2010 but now its Thu
Sep  2 21:30:01 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:40:01 2010 but now its Thu
Sep  2 20:17:41 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:45:04 2010 but now its Thu
Sep  2 20:22:52 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:49:48 2010 but now its Thu
Sep  2 20:27:28 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:49:56 2010 but now its Thu
Sep  2 20:27:36 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:49:56 2010 but now its Thu
Sep  2 20:27:36 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:49:57 2010 but now its Thu
Sep  2 20:27:37 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:49:57 2010 but now its Thu
Sep  2 20:27:37 2010
proc_exec: Oh dear, last time was Thu Sep  2 21:50:01 2010 but now its Thu
Sep  2 20:27:41 2010
proc_exec: Oh dear, last time was Thu Sep  2 22:00:01 2010 but now its Thu
Sep  2 20:37:41 2010
proc_exec: Oh dear, last time was Thu Sep  2 22:01:01 2010 but now its Thu
Sep  2 20:38:41 2010
proc_exec: Oh dear, last time was Thu Sep  2 22:01:01 2010 but now its Thu
Sep  2 20:38:41 2010
proc_exec: Oh dear, last time was Thu Sep  2 22:06:44 2010 but now its Thu
Sep  2 20:44:28 2010
proc_exec: Oh dear, last time was Thu Sep  2 22:10:01 2010 but now its Thu
Sep  2 20:47:41 2010
proc_exec: Oh dear, last time was Thu Sep  2 22:17:24 2010 but now its Thu
Sep  2 20:55:06 2010
proc_exec: Oh dear, last time was Thu Sep  2 22:20:01 2010 but now its Thu
Sep  2 20:57:41 2010
proc_exec: Oh dear, last time was Thu Sep  2 22:30:01 2010 but now its Thu
Sep  2 21:07:41 2010
proc_exec: Oh dear, last time was Thu Sep  2 22:40:01 2010 but now its Thu
Sep  2 21:17:41 2010
proc_exec: Oh dear, last time was Thu Sep  2 22:50:01 2010 but now its Thu
Sep  2 21:27:41 2010
proc_exec: Oh dear, last time was Thu Sep  2 22:53:01 2010 but now its Thu
Sep  2 21:30:41 2010
proc_exec: Oh dear, last time was Thu Sep  2 23:00:01 2010 but now its Thu
Sep  2 20:15:20 2010
....
....

Thanks for any help,

Rob.

-----Original Message-----
From: systemtap-owner@sourceware.org [mailto:systemtap-owner@sourceware.org]
On Behalf Of Frank Ch. Eigler
Sent: 07 August 2010 22:54
To: Rob Donovan
Cc: systemtap@sourceware.org
Subject: Re: gettimeofday_ns accuracy

"Rob Donovan" <rob@proivrc.com> writes:

> Is there any other kind of time measure other than gettimeofday_ns()?

There is also get_cycles() after the epolymous kernel function.

> Since gettimeofday_ns is not that accurate, and has the possibility
> (and does) sometimes go backwards in time.

Can you describe your system (CPU models, kernel versions) a little
more?  We may be able to use/expose other time sources too, or help
you write the bit of code that would do so.

- FChE

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: gettimeofday_ns accuracy
  2010-09-03  8:56   ` Rob Donovan
@ 2010-09-03 16:58     ` Steve Fink
  2010-09-05 16:46       ` Rob Donovan
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Fink @ 2010-09-03 16:58 UTC (permalink / raw)
  To: Rob Donovan; +Cc: Frank Ch. Eigler, systemtap

On Fri, Sep 3, 2010 at 1:56 AM, Rob Donovan <rob@proivrc.com> wrote:
> Hi,
>
> Sorry for the delay...
>
> I think I've got a worse problem now with gettimeofday now.
>
> If you run a script for a long time, it seems that the gettimeofday_s()
> sometimes goes backwards by 1 second. And if you leave it running for 30
> minutes (or up to 2 hrs sometimes) then some probes are returning with the
> time 2-3 hrs ago.
>
> I have the following script below, and if you start it, it will start off
> with just a few errors of 1 second, but if you leave it running, it will get
> worse and worse.

Just speculating - does it still happen if you turn off power management?

Your output says the CPU is a 3.73GHz machine running at 1.4GHz, which
I think means you have frequency scaling happening. I'm wondering if
the larger differences are from CPUs getting completely shut down for
a while or something.

Can you print out the CPU ids in the output as well?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: gettimeofday_ns accuracy
  2010-09-03 16:58     ` Steve Fink
@ 2010-09-05 16:46       ` Rob Donovan
  0 siblings, 0 replies; 6+ messages in thread
From: Rob Donovan @ 2010-09-05 16:46 UTC (permalink / raw)
  To: 'Steve Fink'; +Cc: 'Frank Ch. Eigler', systemtap

Hi,

Well, I've done some more testing...

I turned off Power management (by putting "apm=off acpi=off" in
/etc/grub.conf and rebooting).

GHz still showed 1400, and now gettimeofday_s() always returns '0'.

I don't think that has anything to do with the bug mind you, because the
other machine I try it on at work doesn't show the GHz incorrectly, and I
still get the problem on that computer also, but as I said, only on RHEL6
not RHEL5.

One thing I didn't say, is that this problem seems to get worse when lots of
processes are running or starting, so once the script has started if you
also run 2 commands to create processes, something like.....  watch -n 0 "ps
aux | grep stap"

Once the stap script is running, I get some of the messages that the time
has gone backwards by 1 second, but then as I said after a while it then
starts reporting that the difference is many hours. 

One thing I've noticed, is that when it does jump to the difference that is
hrs, the time that it reports instead of the correct time, is the time that
the script started! And then it just gets higher.

Also, when it was reporting 1 second diff, the 'last time' was always on CPU
3 mostly, sometimes on 1, and the 'now' time was always on CPU 0 or 2.
Exactly when it starts showing a diff of 1hr or more, it switches to the
'last time' on CPU 2 or 0, and the 'now' time on CPU 3 or 1.... not sure if
that is relevant or not...

Below is new output, with the CPU field for the 2 times...

Started at Sun Sep  5 14:55:28 2010
proc_exec: Oh dear, last time was Sun Sep  5 14:55:54 2010 (CPU=3) but now
its Sun Sep  5 14:55:53 2010 (CPU=0)
proc_exec: Oh dear, last time was Sun Sep  5 14:55:56 2010 (CPU=3) but now
its Sun Sep  5 14:55:55 2010 (CPU=2)
proc_exec: Oh dear, last time was Sun Sep  5 14:56:03 2010 (CPU=3) but now
its Sun Sep  5 14:56:02 2010 (CPU=0)
proc_exec: Oh dear, last time was Sun Sep  5 14:56:07 2010 (CPU=3) but now
its Sun Sep  5 14:56:06 2010 (CPU=0)
proc_exec: Oh dear, last time was Sun Sep  5 14:56:09 2010 (CPU=3) but now
its Sun Sep  5 14:56:08 2010 (CPU=2)
proc_exec: Oh dear, last time was Sun Sep  5 14:56:52 2010 (CPU=3) but now
its Sun Sep  5 14:56:51 2010 (CPU=0)
proc_exec: Oh dear, last time was Sun Sep  5 14:56:53 2010 (CPU=3) but now
its Sun Sep  5 14:56:52 2010 (CPU=0)
proc_exec: Oh dear, last time was Sun Sep  5 14:56:55 2010 (CPU=3) but now
its Sun Sep  5 14:56:54 2010 (CPU=0)
proc_exec: Oh dear, last time was Sun Sep  5 14:57:00 2010 (CPU=3) but now
its Sun Sep  5 14:56:59 2010 (CPU=2)
proc_exec: Oh dear, last time was Sun Sep  5 14:57:02 2010 (CPU=3) but now
its Sun Sep  5 14:57:01 2010 (CPU=2)
proc_exec: Oh dear, last time was Sun Sep  5 14:57:04 2010 (CPU=3) but now
its Sun Sep  5 14:57:03 2010 (CPU=2)
proc_exec: Oh dear, last time was Sun Sep  5 14:57:08 2010 (CPU=1) but now
its Sun Sep  5 14:57:07 2010 (CPU=2)
proc_exec: Oh dear, last time was Sun Sep  5 14:57:18 2010 (CPU=3) but now
its Sun Sep  5 14:57:17 2010 (CPU=0)
proc_exec: Oh dear, last time was Sun Sep  5 14:57:23 2010 (CPU=3) but now
its Sun Sep  5 14:57:22 2010 (CPU=0)
....
....
....
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=3) but now
its Sun Sep  5 16:17:48 2010 (CPU=0)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=3) but now
its Sun Sep  5 16:17:48 2010 (CPU=2)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=3) but now
its Sun Sep  5 16:17:48 2010 (CPU=0)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=3) but now
its Sun Sep  5 16:17:48 2010 (CPU=0)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=1) but now
its Sun Sep  5 14:55:28 2010 (CPU=3)   <- this is where it starts back to
the time the scripts starts!
proc_exec: Oh dear, last time was Sun Sep  5 16:17:48 2010 (CPU=2) but now
its Sun Sep  5 14:55:28 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:48 2010 (CPU=2) but now
its Sun Sep  5 14:55:28 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
its Sun Sep  5 14:55:28 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
its Sun Sep  5 14:55:28 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
its Sun Sep  5 14:55:28 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=0) but now
its Sun Sep  5 14:55:28 2010 (CPU=3)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
its Sun Sep  5 14:55:28 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
its Sun Sep  5 14:55:29 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
its Sun Sep  5 14:55:29 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
its Sun Sep  5 14:55:29 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
its Sun Sep  5 14:55:29 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
its Sun Sep  5 14:55:29 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
its Sun Sep  5 14:55:30 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
its Sun Sep  5 14:55:30 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
its Sun Sep  5 14:55:30 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=0) but now
its Sun Sep  5 14:55:30 2010 (CPU=3)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=0) but now
its Sun Sep  5 14:55:30 2010 (CPU=3)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
its Sun Sep  5 14:55:30 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
its Sun Sep  5 14:55:30 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
its Sun Sep  5 14:55:30 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
its Sun Sep  5 14:55:30 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
its Sun Sep  5 14:55:31 2010 (CPU=3)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
its Sun Sep  5 14:55:31 2010 (CPU=3)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
its Sun Sep  5 14:55:31 2010 (CPU=1)
proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
its Sun Sep  5 14:55:31 2010 (CPU=1)

-----Original Message-----
From: systemtap-owner@sourceware.org [mailto:systemtap-owner@sourceware.org]
On Behalf Of Steve Fink
Sent: 03 September 2010 17:58
To: Rob Donovan
Cc: Frank Ch. Eigler; systemtap@sourceware.org
Subject: Re: gettimeofday_ns accuracy

On Fri, Sep 3, 2010 at 1:56 AM, Rob Donovan <rob@proivrc.com> wrote:
> Hi,
>
> Sorry for the delay...
>
> I think I've got a worse problem now with gettimeofday now.
>
> If you run a script for a long time, it seems that the 
> gettimeofday_s() sometimes goes backwards by 1 second. And if you 
> leave it running for 30 minutes (or up to 2 hrs sometimes) then some 
> probes are returning with the time 2-3 hrs ago.
>
> I have the following script below, and if you start it, it will start 
> off with just a few errors of 1 second, but if you leave it running, 
> it will get worse and worse.

Just speculating - does it still happen if you turn off power management?

Your output says the CPU is a 3.73GHz machine running at 1.4GHz, which I
think means you have frequency scaling happening. I'm wondering if the
larger differences are from CPUs getting completely shut down for a while or
something.

Can you print out the CPU ids in the output as well?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: gettimeofday_ns accuracy
@ 2010-10-01 12:44 Rob Donovan
  0 siblings, 0 replies; 6+ messages in thread
From: Rob Donovan @ 2010-10-01 12:44 UTC (permalink / raw)
  To: systemtap

Hi,

I've tested this on the latest snapshot (systemtap-20100925), and it's still
a problem.

Is anyone particularly looking into this or do you need any more info or
anything from me?

Has anyone been able to reproduce it, or is it just me?

Thanks,

Rob.

> -----Original Message-----
> From: Rob Donovan [mailto:rob@proivrc.com]
> Sent: 05 September 2010 17:47
> To: 'Steve Fink'
> Cc: 'Frank Ch. Eigler'; 'systemtap@sourceware.org'
> Subject: RE: gettimeofday_ns accuracy
> 
> Hi,
> 
> Well, I've done some more testing...
> 
> I turned off Power management (by putting "apm=off acpi=off" in
> /etc/grub.conf and rebooting).
> 
> GHz still showed 1400, and now gettimeofday_s() always returns '0'.
> 
> I don't think that has anything to do with the bug mind you, because the
> other machine I try it on at work doesn't show the GHz incorrectly, and I
still
> get the problem on that computer also, but as I said, only on RHEL6 not
> RHEL5.
> 
> One thing I didn't say, is that this problem seems to get worse when lots
of
> processes are running or starting, so once the script has started if you
also
> run 2 commands to create processes, something like.....  watch -n 0 "ps
aux |
> grep stap"
> 
> Once the stap script is running, I get some of the messages that the time
has
> gone backwards by 1 second, but then as I said after a while it then
starts
> reporting that the difference is many hours.
> 
> One thing I've noticed, is that when it does jump to the difference that
is hrs,
> the time that it reports instead of the correct time, is the time that the
script
> started! And then it just gets higher.
> 
> Also, when it was reporting 1 second diff, the 'last time' was always on
CPU 3
> mostly, sometimes on 1, and the 'now' time was always on CPU 0 or 2.
Exactly
> when it starts showing a diff of 1hr or more, it switches to the 'last
time' on
> CPU 2 or 0, and the 'now' time on CPU 3 or 1.... not sure if that is
relevant or
> not...
> 
> Below is new output, with the CPU field for the 2 times...
> 
> Started at Sun Sep  5 14:55:28 2010
> proc_exec: Oh dear, last time was Sun Sep  5 14:55:54 2010 (CPU=3) but now
> its Sun Sep  5 14:55:53 2010 (CPU=0)
> proc_exec: Oh dear, last time was Sun Sep  5 14:55:56 2010 (CPU=3) but now
> its Sun Sep  5 14:55:55 2010 (CPU=2)
> proc_exec: Oh dear, last time was Sun Sep  5 14:56:03 2010 (CPU=3) but now
> its Sun Sep  5 14:56:02 2010 (CPU=0)
> proc_exec: Oh dear, last time was Sun Sep  5 14:56:07 2010 (CPU=3) but now
> its Sun Sep  5 14:56:06 2010 (CPU=0)
> proc_exec: Oh dear, last time was Sun Sep  5 14:56:09 2010 (CPU=3) but now
> its Sun Sep  5 14:56:08 2010 (CPU=2)
> proc_exec: Oh dear, last time was Sun Sep  5 14:56:52 2010 (CPU=3) but now
> its Sun Sep  5 14:56:51 2010 (CPU=0)
> proc_exec: Oh dear, last time was Sun Sep  5 14:56:53 2010 (CPU=3) but now
> its Sun Sep  5 14:56:52 2010 (CPU=0)
> proc_exec: Oh dear, last time was Sun Sep  5 14:56:55 2010 (CPU=3) but now
> its Sun Sep  5 14:56:54 2010 (CPU=0)
> proc_exec: Oh dear, last time was Sun Sep  5 14:57:00 2010 (CPU=3) but now
> its Sun Sep  5 14:56:59 2010 (CPU=2)
> proc_exec: Oh dear, last time was Sun Sep  5 14:57:02 2010 (CPU=3) but now
> its Sun Sep  5 14:57:01 2010 (CPU=2)
> proc_exec: Oh dear, last time was Sun Sep  5 14:57:04 2010 (CPU=3) but now
> its Sun Sep  5 14:57:03 2010 (CPU=2)
> proc_exec: Oh dear, last time was Sun Sep  5 14:57:08 2010 (CPU=1) but now
> its Sun Sep  5 14:57:07 2010 (CPU=2)
> proc_exec: Oh dear, last time was Sun Sep  5 14:57:18 2010 (CPU=3) but now
> its Sun Sep  5 14:57:17 2010 (CPU=0)
> proc_exec: Oh dear, last time was Sun Sep  5 14:57:23 2010 (CPU=3) but now
> its Sun Sep  5 14:57:22 2010 (CPU=0) ....
> ....
> ....
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=3) but now
> its Sun Sep  5 16:17:48 2010 (CPU=0)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=3) but now
> its Sun Sep  5 16:17:48 2010 (CPU=2)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=3) but now
> its Sun Sep  5 16:17:48 2010 (CPU=0)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=3) but now
> its Sun Sep  5 16:17:48 2010 (CPU=0)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=1) but now
> its Sun Sep  5 14:55:28 2010 (CPU=3)   <- this is where it starts back to
the time
> the scripts starts!
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:48 2010 (CPU=2) but now
> its Sun Sep  5 14:55:28 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:48 2010 (CPU=2) but now
> its Sun Sep  5 14:55:28 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
> its Sun Sep  5 14:55:28 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
> its Sun Sep  5 14:55:28 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
> its Sun Sep  5 14:55:28 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=0) but now
> its Sun Sep  5 14:55:28 2010 (CPU=3)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
> its Sun Sep  5 14:55:28 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
> its Sun Sep  5 14:55:29 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:49 2010 (CPU=2) but now
> its Sun Sep  5 14:55:29 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
> its Sun Sep  5 14:55:29 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
> its Sun Sep  5 14:55:29 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
> its Sun Sep  5 14:55:29 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
> its Sun Sep  5 14:55:30 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
> its Sun Sep  5 14:55:30 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:50 2010 (CPU=2) but now
> its Sun Sep  5 14:55:30 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=0) but now
> its Sun Sep  5 14:55:30 2010 (CPU=3)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=0) but now
> its Sun Sep  5 14:55:30 2010 (CPU=3)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
> its Sun Sep  5 14:55:30 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
> its Sun Sep  5 14:55:30 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
> its Sun Sep  5 14:55:30 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
> its Sun Sep  5 14:55:30 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
> its Sun Sep  5 14:55:31 2010 (CPU=3)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
> its Sun Sep  5 14:55:31 2010 (CPU=3)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
> its Sun Sep  5 14:55:31 2010 (CPU=1)
> proc_exec: Oh dear, last time was Sun Sep  5 16:17:51 2010 (CPU=2) but now
> its Sun Sep  5 14:55:31 2010 (CPU=1)
> 
> -----Original Message-----
> From: systemtap-owner@sourceware.org [mailto:systemtap-
> owner@sourceware.org] On Behalf Of Steve Fink
> Sent: 03 September 2010 17:58
> To: Rob Donovan
> Cc: Frank Ch. Eigler; systemtap@sourceware.org
> Subject: Re: gettimeofday_ns accuracy
> 
> On Fri, Sep 3, 2010 at 1:56 AM, Rob Donovan <rob@proivrc.com> wrote:
> > Hi,
> >
> > Sorry for the delay...
> >
> > I think I've got a worse problem now with gettimeofday now.
> >
> > If you run a script for a long time, it seems that the
> > gettimeofday_s() sometimes goes backwards by 1 second. And if you
> > leave it running for 30 minutes (or up to 2 hrs sometimes) then some
> > probes are returning with the time 2-3 hrs ago.
> >
> > I have the following script below, and if you start it, it will start
> > off with just a few errors of 1 second, but if you leave it running,
> > it will get worse and worse.
> 
> Just speculating - does it still happen if you turn off power management?
> 
> Your output says the CPU is a 3.73GHz machine running at 1.4GHz, which I
> think means you have frequency scaling happening. I'm wondering if the
> larger differences are from CPUs getting completely shut down for a while
or
> something.
> 
> Can you print out the CPU ids in the output as well?

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2010-10-01 12:44 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-07 17:02 gettimeofday_ns accuracy Rob Donovan
2010-08-07 21:53 ` Frank Ch. Eigler
2010-09-03  8:56   ` Rob Donovan
2010-09-03 16:58     ` Steve Fink
2010-09-05 16:46       ` Rob Donovan
2010-10-01 12:44 Rob Donovan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).