public inbox for systemtap@sourceware.org
* RE: Summary of nightly tests 20061109
@ 2006-11-11 14:27 Nguyen, Thang P
  0 siblings, 0 replies; 4+ messages in thread
From: Nguyen, Thang P @ 2006-11-11 14:27 UTC (permalink / raw)
  To: William Cohen, SystemTAP

Similar errors were observed on my ia64 box (2.6.9-42.EL). The most
noticeable errors are:

(1) kernel.statement() failed. For example, the following probe is for the
function "generic_make_request". The same thing occurred with other
functions, such as "scheduler_tick".

probe kernel.statement(0xa000000100367ac0){
   printf ("testing\n")
}

Pass 1: parsed user script and 52 library script(s) in 324usr/7sys/332real ms.
WARNING: cannot find kernel debuginfo
stap: tapsets.cxx:654: void dwflpp::query_cu_containing_global_address(Dwarf_Addr, void*): Assertion `bias == module_bias' failed.
Abort (core dumped)

(2) Unaligned accesses in staprun were seen in many of the failed test cases.

Pass 5: starting run.
staprun(18130): unaligned access to 0x2000000000330014,ip=0x4000000000007bd1
staprun(18130): unaligned access to 0x200000000033001c,ip=0x4000000000007be0
staprun(18130): unaligned access to 0x2000000000330024,ip=0x4000000000007bd1
staprun(18130): unaligned access to 0x200000000033002c,ip=0x4000000000007be0
systemtap starting probe

===  Additional information for my environment ===
OS:   RHEL 4 (update 4) 2.6.9-42.EL
Stap version:
SystemTap translator/driver (version 0.5.11 built 2006-11-09)
(Using Red Hat elfutils 0.124 libraries.)

/proc/cpuinfo 
processor  : 0
vendor     : GenuineIntel
arch       : IA-64
family     : Itanium 2
model      : 2
revision   : 1
archrev    : 0
features   : branchlong
cpu number : 0
cpu regs   : 4
cpu MHz    : 1595.653000
itc MHz    : 1595.653000
BogoMIPS   : 2390.75
siblings   : 1

processor  : 1
vendor     : GenuineIntel
arch       : IA-64
family     : Itanium 2
model      : 2
revision   : 1
archrev    : 0
features   : branchlong
cpu number : 0
cpu regs   : 4
cpu MHz    : 1595.653000
itc MHz    : 1595.653000
BogoMIPS   : 2382.36
siblings   : 1

processor  : 2
vendor     : GenuineIntel
arch       : IA-64
family     : Itanium 2
model      : 2
revision   : 1
archrev    : 0
features   : branchlong
cpu number : 0
cpu regs   : 4
cpu MHz    : 1595.653000
itc MHz    : 1595.653000
BogoMIPS   : 2382.36
siblings   : 1

processor  : 3
vendor     : GenuineIntel
arch       : IA-64
family     : Itanium 2
model      : 2
revision   : 1
archrev    : 0
features   : branchlong
cpu number : 0
cpu regs   : 4
cpu MHz    : 1595.653000
itc MHz    : 1595.653000
BogoMIPS   : 2382.36
siblings   : 1
 


Thang




* RE: Summary of nightly tests 20061109
@ 2006-11-14 12:49 Stone, Joshua I
  0 siblings, 0 replies; 4+ messages in thread
From: Stone, Joshua I @ 2006-11-14 12:49 UTC (permalink / raw)
  To: William Cohen, Martin Hunt; +Cc: SystemTAP

On Thursday, November 09, 2006 11:19 AM, William Cohen wrote:
> RHEL4 U4 ia64
> Kernel: Linux 2.6.9-42.0.3.EL #1 SMP Mon Sep 25 17:14:34 EDT 2006
> ia64 ia64 ia64 GNU/Linux
> [...]
> The earliest record of staprun causing unaligned accesses on the ia64
> machine is Nov 5. See things like the following in the systemtap.log
> output of dejagnu.
> [...]
> staprun(9115): unaligned access to 0x2000000000338014, ip=0x4000000000007c01
> staprun(9115): unaligned access to 0x200000000033801c, ip=0x4000000000007c10
> staprun(9115): unaligned access to 0x2000000000338024, ip=0x4000000000007c01
> staprun(9115): unaligned access to 0x200000000033802c, ip=0x4000000000007c10

In do_kernel_symbols (runtime/stpd/symbols.c), there are these lines:

144:     sym_base = malloc(MAX_SYMBOLS*sizeof(struct _stp_symbol)+sizeof(int));
[...]
154:     *(int *)sym_base = STP_SYMBOLS;
155:     syms = (struct _stp_symbol *)(sym_base + sizeof(int));
[...]
178:             syms[i].addr = addr;
179:             syms[i].symbol = (char *)(dataptr - data);

The ips 7c01 and 7c10 correspond to lines 178 and 179.  Line 144 will
return an 8-byte aligned pointer, and then line 155 adds sizeof(int),
which is 4, to it.  From then on, every access to an 8-byte field through
'syms' is misaligned.  This matches the logged addresses, which all sit
4 bytes past an 8-byte boundary.  Thankfully, there's a logging rate
limiter that saves us from seeing more than four of the same message.

One potential solution is to make the transport command a long instead
of int.  Such a fix would require changes to the whole transport layer,
though, so I'll leave it to Martin to decide what to do...
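
The arithmetic behind the misalignment, and one way a sender could avoid
it, can be sketched in C. This is only an illustration under stated
assumptions: struct sym_entry is a hypothetical stand-in for
struct _stp_symbol (assumed to hold two 8-byte fields, consistent with the
8-byte spacing of the logged addresses), and the function names are made up
for this sketch. Padding the header is also a different technique from the
int-to-long change suggested above; either approach would need a matching
change on the receiving side.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct _stp_symbol: two 8-byte fields. */
struct sym_entry {
    uint64_t addr;
    uint64_t symbol;
};

/* How many bytes off alignment an entry array is when it starts
 * header_size bytes past an aligned buffer. 0 means aligned. */
size_t entry_misalignment(size_t header_size)
{
    return header_size % alignof(struct sym_entry);
}

/* Smallest size >= header_size that keeps the entry array aligned. */
size_t padded_header_size(size_t header_size)
{
    size_t a = alignof(struct sym_entry);
    return (header_size + a - 1) / a * a;
}

/* Allocates an int command word followed by n entries, with the header
 * padded so the entries are aligned. The caller frees the returned buffer. */
void *alloc_message(size_t n, int cmd, struct sym_entry **entries_out)
{
    size_t hdr = padded_header_size(sizeof(int));
    unsigned char *buf = malloc(hdr + n * sizeof(struct sym_entry));
    if (buf == NULL)
        return NULL;
    *(int *)buf = cmd;                       /* command word at offset 0 */
    *entries_out = (struct sym_entry *)(buf + hdr);
    /* malloc's result is suitably aligned, so buf + hdr must be too. */
    assert((uintptr_t)*entries_out % alignof(struct sym_entry) == 0);
    return buf;
}
```

On a platform where the entries need 8-byte alignment,
entry_misalignment(sizeof(int)) reports the 4-byte offset described above,
while entry_misalignment(8) reports 0, which is why a long-sized command
word on a 64-bit target would also avoid the problem.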

Josh


* RE: Summary of nightly tests 20061109
@ 2006-11-10  5:02 Stone, Joshua I
  0 siblings, 0 replies; 4+ messages in thread
From: Stone, Joshua I @ 2006-11-10  5:02 UTC (permalink / raw)
  To: William Cohen, SystemTAP

On Thursday, November 09, 2006 11:19 AM, William Cohen wrote:
> The RHEL4U4 i686 crashes because of 2726, could we just blacklist
> "scheduler_tick" and be done with it?

That would be treating the symptom instead of the cause...

Have you tried the patch that Chuck Ebbert posted?  I see that Jeff
Layton resurrected that LKML thread, and Andrew Morton responded, so the
problem at least has some eyes on it now...


Josh


* Summary of nightly tests 20061109
@ 2006-11-09 19:26 William Cohen
  0 siblings, 0 replies; 4+ messages in thread
From: William Cohen @ 2006-11-09 19:26 UTC (permalink / raw)
  To: SystemTAP

Only four of the six machines reported results. fc6/rawhide i686 and
rhel4 i686 died. The ia64 machine results look to have a systematic
problem.

The fc6 i686 machine has a page allocation failure in /var/log/messages:
"staprun: page allocation failure. order:5, mode:0xd0".
It looks like this is happening during systemtap.syscall/test.exp.

The RHEL4U4 i686 crashes because of 2726. Could we just blacklist
"scheduler_tick" and be done with it?

The RHEL4U5 ia64 is getting unaligned accesses in staprun on the failed tests:
add.stp, div0.stp, equal.stp, inc.stp, etc.

The kernel.statement() is failing on the 64-bit machines.

The bench (0) test is failing on all the machines.

I will see about better characterizing the problems encountered in
the testing.


-Will

FC5 i686
Kernel: Linux 2.6.18-1.2200.fc5 #1 Sat Oct 14 16:59:26 EDT 2006 i686 i686 i386 GNU/Linux

Testsuite summary of failed tests
FAIL: bench (0)
FAIL: /home/wcohen/stap_testing_200611091417/src/testsuite/systemtap.stress/current.stp startup (timeout)
		=== systemtap Summary ===

# of expected passes		239
# of unexpected failures	2
# of expected failures		107
# of unknown successes		2
# of known failures		3
# of untested testcases		1
# of unsupported tests		1


FC6/rawhide i686
Kernel: Linux 2.6.18-1.2798.fc6PAE #1 SMP Mon Oct 16 14:54:22 EDT 2006 i686 i686 i386 GNU/Linux

Testsuite summary of failed tests
FAIL: bench (0)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.stress/current.stp startup (timeout)
		=== systemtap Summary ===

# of expected passes		239
# of unexpected failures	2
# of expected failures		107
# of unknown successes		2
# of known failures		3
# of untested testcases		1



FC6/rawhide x86_64
Kernel: Linux 2.6.18-1.2798.fc6 #1 SMP Mon Oct 16 14:39:22 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux

Testsuite summary of failed tests
FAIL: bench (0)
FAIL: probefunc:kernel.statement(0xffffffff80287e1a) startup (eof)
FAIL: buildok/seventeen.stp
FAIL: /home/wcohen/stap_testing_200611091604/src/testsuite/systemtap.stress/current.stp startup (timeout)
		=== systemtap Summary ===

# of expected passes		235
# of unexpected failures	4
# of expected failures		107
# of unknown successes		2
# of known failures		3
# of untested testcases		1
# of unsupported tests		1


RHEL4 U4 i686
crashed due to 2726

RHEL4 U4 x86_64
Kernel: Linux 2.6.9-42.0.3.ELsmp #1 SMP Mon Sep 25 17:24:31 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux

Testsuite summary of failed tests
FAIL: bench (0)
FAIL: probefunc:kernel.statement(0xffffffff80133660) startup (eof)
FAIL: buildok/seventeen.stp
		=== systemtap Summary ===

# of expected passes		239
# of unexpected failures	3
# of expected failures		107
# of unknown successes		2
# of known failures		3
# of untested testcases		1


RHEL4 U4 ia64
Kernel: Linux 2.6.9-42.0.3.EL #1 SMP Mon Sep 25 17:14:34 EDT 2006 ia64 ia64 ia64 GNU/Linux

Testsuite summary of failed tests
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/add.stp startup (timeout)
FAIL: bench (0)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/div0.stp startup (timeout)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/equal.stp startup (timeout)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/finloop2.stp startup (timeout)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/if.stp startup (timeout)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/inc.stp startup (timeout)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/kfunct.stp startup (timeout)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/kmodule.stp startup (timeout)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/logical_and.stp startup (timeout)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/not.stp startup (timeout)
FAIL: probefunc:kernel.statement(0xa00000010006c840) startup (eof)
FAIL: probefunc:kernel.function("scheduler_tick") startup (timeout)
FAIL: probefunc:kernel.inline("context_switch") compilation
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/simple.stp startup (timeout)
FAIL: timeofday test startup (timeout)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/timers.stp startup (timeout)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/tri.stp startup (timeout)
FAIL: absentstats (1 13)
FAIL: buildok/seventeen.stp
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.samples/ioblocktest.stp startup (timeout)
FAIL: symbols (15)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.samples/tcptest.stp startup (timeout)
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.stress/current.stp startup (timeout)
		=== systemtap Summary ===

# of expected passes		180
# of unexpected failures	24
# of unexpected successes	2
# of expected failures		105
# of unknown successes		1
# of known failures		4
# of untested testcases		1

Some additional information about the ipf machine is

$ more /proc/cpuinfo
processor  : 0
vendor     : GenuineIntel
arch       : IA-64
family     : Itanium 2
model      : 0
revision   : 7
archrev    : 0
features   : branchlong
cpu number : 0
cpu regs   : 4
cpu MHz    : 900.000000
itc MHz    : 900.000000
BogoMIPS   : 1346.37
siblings   : 1

The earliest record of staprun causing unaligned accesses on the ia64
machine is Nov 5. See things like the following in the systemtap.log
output of dejagnu.

xp completed in 0 seconds
Running /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/add.exp ...
Pass 1: parsed user script and 52 library script(s) in 1901usr/22sys/1942real ms.
Pass 2: analyzed script: 2 probe(s), 1 function(s), 0 embed(s), 3 global(s) in 32usr/1sys/33real ms.
Pass 3: translated to C into "/tmp/stapoIYqmN/stap_1bb105f87e0392534ace6263540881cd_484.c" in 3usr/0sys/4real ms.
Pass 4: compiled C into "stap_1bb105f87e0392534ace6263540881cd_484.ko" in 4193usr/210sys/4450real ms.
Pass 5: starting run.
staprun(9115): unaligned access to 0x2000000000338014, ip=0x4000000000007c01
staprun(9115): unaligned access to 0x200000000033801c, ip=0x4000000000007c10
staprun(9115): unaligned access to 0x2000000000338024, ip=0x4000000000007c01
staprun(9115): unaligned access to 0x200000000033802c, ip=0x4000000000007c10
systemtap starting probe
FAIL: /home/wcohen/stap_testing_200611090930/src/testsuite/systemtap.base/add.stp startup (timeout)


end of thread, other threads:[~2006-11-14  2:17 UTC | newest]

Thread overview: 4+ messages
2006-11-11 14:27 Summary of nightly tests 20061109 Nguyen, Thang P
  -- strict thread matches above, loose matches on Subject: below --
2006-11-14 12:49 Stone, Joshua I
2006-11-10  5:02 Stone, Joshua I
2006-11-09 19:26 William Cohen
