* timestamp problems
@ 2008-10-12 20:37 satya komaragiri
2008-10-12 23:44 ` Frank Ch. Eigler
0 siblings, 1 reply; 5+ messages in thread
From: satya komaragiri @ 2008-10-12 20:37 UTC (permalink / raw)
To: systemtap
Hello
I was working on a few scripts[1] that together profile the boot-up
process using systemtap and render an svg image very similar to
bootchart.
I noticed this strange error in timestamps that used to scramble the
XML log generated. I had initially thought something was wrong with
my scripts and had (after a bit of complaining on my blog) managed a
workaround that problem. Now that I upgraded to F9 and I see that
error still persisting, and considering that I do nothing apart from
just calling a gettimeofday_ms(), am quite convinced that it is not a
problem with my script and even if it is.. not an obvious error that
glares at your face.
( It might also be some boot time peculiarity as when I run the
scripts after boot-up there is no such problem.)
The error: The timestamps after a sometime jump back by a huge number.
on Fedora 8
The timestamps starts off with something like 1220226924420
and goes on till 1220226928771
(process executing rc.sysinitentry number 133, time in ms since first
probe : 4351 ms )
then suddenly jumps back to 1220207130000
(process executing rc.sysinit, entry number 134 )
and then continues normally till the end of the boot process.
The jump is 229.15244213 days.
( the output can be found at [2].. F8 output)
on Fedora 9
The timestamps starts with 1223857645400
and goes till 1223857652873
(process executing udev, entry number 1089, time in ms since first
probe: 7473 ms)
then jumps back to 1223837854000
( process executing udev, entry number 1090)
and then continues normally.
The jump is 229.153622685 days.
( entire output uploaded at [2] F9 output 2 with XMLs )
(the exact numbers are from one particular case.. but the trend and
order of difference is similar)
The process name or the number of entries or time elapsed before such
an event takes place is not constant but the jump is ( ~229 days in
all cases).
My **temporary** workaround:
This solution just serves my purpose and does not solve the main
issue in anyway.
As the boot process will never last for more than a few minutes,
taking a substring (the last 7 digits) also works for me ( and works
better as then I can use them as int type ) . Luckily, the substring
is always in order resulting in proper functioning of bootlimn. But I
know this is not a solution.
Was wondering if this is a systemtap problem. I'll be grateful if
someone checks it out.
Regards
Satya Komaragiri
[1] http://code.google.com/p/bootlimn/
[2] http://code.google.com/p/bootlimn/downloads/list
project blog: http://meworkstoo.blogspot.com/
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: timestamp problems
2008-10-12 20:37 timestamp problems satya komaragiri
@ 2008-10-12 23:44 ` Frank Ch. Eigler
2008-10-13 10:28 ` satya komaragiri
0 siblings, 1 reply; 5+ messages in thread
From: Frank Ch. Eigler @ 2008-10-12 23:44 UTC (permalink / raw)
To: satya komaragiri; +Cc: systemtap
Hi -
> [...] considering that I do nothing apart from
> just calling a gettimeofday_ms(), am quite convinced that it is not a
> problem with my script [...]
It seems like a systemtap (runtime) bug.
> The error: The timestamps after a sometime jump back by a huge number.
>
> on Fedora 8
> The timestamps starts off with something like 1220226924420
> and goes on till 1220226928771
> (process executing rc.sysinitentry number 133, time in ms since first
> probe : 4351 ms )
> then suddenly jumps back to 1220207130000
That is "only" 0.229 days (20 million units if we're talking about
milliseconds), but of course still unacceptable. (I'm assuming the
main system clock does not jump similarly.)
Can you try something like this:
probe timer.s(10) { now=gettimeofday_ns()
printf("systemtap %s %6d\n",ctime(now/1000000000),now%1000000000)
system("date -u +\"system %a %b %d %H:%M:%S %Y %N\"")
system("grep -i mhz /proc/cpuinfo") }'
and watch for a divergence over time. The timer.s(NN) parameter could
be changed to narrow in on a suspected trigger time.
- FChE
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: timestamp problems
2008-10-12 23:44 ` Frank Ch. Eigler
@ 2008-10-13 10:28 ` satya komaragiri
2008-10-13 12:11 ` Frank Ch. Eigler
0 siblings, 1 reply; 5+ messages in thread
From: satya komaragiri @ 2008-10-13 10:28 UTC (permalink / raw)
To: Frank Ch. Eigler; +Cc: systemtap
Hi
On Mon, Oct 13, 2008 at 5:13 AM, Frank Ch. Eigler <fche@redhat.com> wrote:
>
> That is "only" 0.229 days (20 million units if we're talking about
> milliseconds), but of course still unacceptable. (I'm assuming the
> main system clock does not jump similarly.)
>
Sorry for that.
> Can you try something like this:
>
> probe timer.s(10) { now=gettimeofday_ns()
> printf("systemtap %s %6d\n",ctime(now/1000000000),now%1000000000)
> system("date -u +\"system %a %b %d %H:%M:%S %Y %N\"")
> system("grep -i mhz /proc/cpuinfo") }'
>
> and watch for a divergence over time. The timer.s(NN) parameter could
> be changed to narrow in on a suspected trigger time.
I ran the script several times for different intervals and the
timestamps are working fine at normal time
just that the logging is weird at times.( not in the order that out
probe tells it to print)
snips from the outputs ( mainly showing the messed up logging at times.. )
cpu MHz : 3065.653
system Mon Oct 13 04:24:45 2008 408697424
systemtap Mon Oct 13 04:24:45 2008 389369690
systemtap Mon Oct 13 04:24:55 2008 389369783
system Mon Oct 13 04:24:55 2008 395816927
cpu MHz : 3065.653
systemtap Mon Oct 13 04:25:05 2008 389369404
system Mon Oct 13 04:25:05 2008 395788064
cpu MHz : 3065.653
systemtap Mon Oct 13 04:25:15 2008 389369407
system Mon Oct 13 04:25:15 2008 395763673
cpu MHz : 3065.653
.......
==============================
system Mon Oct 13 04:30:23 2008 725795912
cpu MHz : 3065.653
systemtap Mon Oct 13 04:30:23 2008 713991064
system Mon Oct 13 04:30:33 2008 717822396
cpu MHz : 3065.653
systemtap Mon Oct 13 04:30:33 2008 713992080
system Mon Oct 13 04:30:43 2008 725775669
.......
systemtap Mon Oct 13 04:32:13 2008 713990711
systemtap Mon Oct 13 04:32:23 2008 713993789
system Mon Oct 13 04:32:23 2008 729103889
cpu MHz : 3065.653
systemtap Mon Oct 13 04:32:33 2008 713992105
system Mon Oct 13 04:32:33 2008 725866561
cpu MHz : 3065.65
But when I run it at boot time (with some modifications)
i get
systemtap Mon Oct 13 14:42:16 2008 159315516
systemtap Mon Oct 13 14:42:17 2008 159315673
systemtap Mon Oct 13 14:42:18 2008 159315837
systemtap Mon Oct 13 14:42:19 2008 159316258
//(then jumps)
systemtap Mon Oct 13 09:12:20 2008 645426712
systemtap Mon Oct 13 09:12:21 2008 645426460
systemtap Mon Oct 13 09:12:22 2008 645426486
systemtap Mon Oct 13 09:12:23 2008 645426933
systemtap Mon Oct 13 09:12:24 2008 645426848
systemtap Mon Oct 13 09:12:25 2008 645426421
system Mon Oct 13 14:42:16 2008 195582695
system Mon Oct 13 14:42:17 2008 164907660
system Mon Oct 13 14:42:18 2008 162838435
system Mon Oct 13 14:42:19 2008 162789467
//(similar jump in system time too )
system Mon Oct 13 09:12:20 2008 649037391
system Mon Oct 13 09:12:21 2008 653586541
system Mon Oct 13 09:12:22 2008 653056695
system Mon Oct 13 09:12:23 2008 657430857
system Mon Oct 13 09:12:24 2008 656989129
system Mon Oct 13 09:12:25 2008 650496853
The time after the jump was the correct UTC time ( i had checked immediately)
I don't know if that's normal or is my clock messed up.
What I did:
Stp script:
probe timer.s(1)
{
now=gettimeofday_ns()
printf("systemtap %s %6d\n",ctime(now/1000000000),now%1000000000)
system("date -u +\"system %a %b %d %H:%M:%S %Y %N\" >> /debugout/sys.txt")
system("grep -i mhz /proc/cpuinfo")
}
===========================
mkdir /debugko
mkdir /debugout
chmod 777 /debugko
chmod 777 /debugout
(we'll only run it once for trial.. so no harm)
stap -k -g -p4 -m debug ./debug.stp
cp debug.ko /debugko/debug.ko
=============================
have a script in sbin
say /sbin/trial
contents of /sbin/trial
#!/bin/sh
# Use tmpfs to collect data, since root is still readonly
mount -n -t tmpfs -o size=5M none /debugout
# run as long until exit condition satisfied
# Start systemtap scriot & probe
/usr/bin/staprun /debugko/debug.ko -c 'sleep 10' -o /debugout/trial.txt &
# Give daemon time to start collecting...
sleep 3
# Hand off to real init
exec /sbin/init
===========================================
change the grub to include( copy pasting from my menu.lst)
title trial
root (hd0,2)
kernel /vmlinuz-2.6.25-14.fc9.i686 ro
root=UUID=be43b64a-bfa8-43fa-8c8d-cd9a036a3301 rhgb quiet
init=/sbin/trial
initrd /initrd-2.6.25-14.fc9.i686.img
=============================================
reboot into 'trial' open terminal to do a sleep 10 and the check in /debugout
Regards
Satya
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: timestamp problems
2008-10-13 10:28 ` satya komaragiri
@ 2008-10-13 12:11 ` Frank Ch. Eigler
2008-10-15 13:49 ` satya komaragiri
0 siblings, 1 reply; 5+ messages in thread
From: Frank Ch. Eigler @ 2008-10-13 12:11 UTC (permalink / raw)
To: satya komaragiri; +Cc: systemtap
Hi -
On Mon, Oct 13, 2008 at 03:58:01PM +0530, satya komaragiri wrote:
> [...at boot time ...]
> systemtap Mon Oct 13 14:42:18 2008 159315837
> systemtap Mon Oct 13 14:42:19 2008 159316258
> //(then jumps)
> systemtap Mon Oct 13 09:12:20 2008 645426712
> systemtap Mon Oct 13 09:12:21 2008 645426460
> [...]
> system Mon Oct 13 14:42:19 2008 162789467
> //(similar jump in system time too )
> system Mon Oct 13 09:12:20 2008 649037391
> [...]
> The time after the jump was the correct UTC time ( i had checked immediately)
> I don't know if that's normal or is my clock messed up.
My guess is that your hardware RTC clock is set to some local time
zone, and the kernel can't recompute its system time in UTC until
userspace comes up and tells it what the RTC time zone was. If so,
setting your hardware clock to UTC should fix this problem.
- FChE
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: timestamp problems
2008-10-13 12:11 ` Frank Ch. Eigler
@ 2008-10-15 13:49 ` satya komaragiri
0 siblings, 0 replies; 5+ messages in thread
From: satya komaragiri @ 2008-10-15 13:49 UTC (permalink / raw)
To: Frank Ch. Eigler; +Cc: systemtap
Hello
> My guess is that your hardware RTC clock is set to some local time
> zone, and the kernel can't recompute its system time in UTC until
> userspace comes up and tells it what the RTC time zone was. If so,
> setting your hardware clock to UTC should fix this problem.
Thank you. Sorry for the late response.
Regards
Satya
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2008-10-15 13:49 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-10-12 20:37 timestamp problems satya komaragiri
2008-10-12 23:44 ` Frank Ch. Eigler
2008-10-13 10:28 ` satya komaragiri
2008-10-13 12:11 ` Frank Ch. Eigler
2008-10-15 13:49 ` satya komaragiri
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).