* Linux AMI 2011.09.1.x86_64-ebs issue: kernel aki-825ea7eb build-id munging
@ 2011-10-27 20:56 Frank Ch. Eigler
2011-10-27 21:25 ` Gafton, Cristian
2011-10-27 22:56 ` Mark Wielaard
0 siblings, 2 replies; 3+ messages in thread
From: Frank Ch. Eigler @ 2011-10-27 20:56 UTC
To: gafton; +Cc: systemtap
Hi again, Cristian -
I'm back for some more EC2 systemtap testing. You kindly fixed one packaging
problem with kernel-debuginfo back in August, if you recall. It turns out we
have a new problem; this one related to build-ids. systemtap uses ELF
build-id notes in order to verify version matching between the running kernel
and one whose ELF/DWARF files it's reading. On the current default AMI
kernel (2.6.35.14-95.38.amzn1.x86_64), there is a mismatch.
One can see this by hex-dumping /sys/kernel/notes on a running instance,
and contrasting it with
% readelf -x .notes /usr/lib/debug/lib/modules/`uname -r`/vmlinux
from the corresponding debuginfo. The last bunch of bytes are supposed
to be identical.
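
The comparison described above can be sketched in Python. This is only an illustrative sketch, not systemtap's own code: it walks the raw ELF note records (three 32-bit words namesz, descsz, type, followed by the name and descriptor, each padded to 4 bytes) and returns the descriptor of the GNU build-id note (name "GNU", type 3). The function names, the 4-byte padding, and the little-endian default are assumptions chosen for this x86_64 case.

```python
import struct

NT_GNU_BUILD_ID = 3  # ELF note type carrying the GNU build-id


def align4(n):
    """Round n up to the next multiple of 4 (assumed note padding)."""
    return (n + 3) & ~3


def find_build_id(notes, byteorder="<"):
    """Return the GNU build-id bytes found in a raw notes buffer, or None.

    `notes` is the raw content of a notes blob, e.g. /sys/kernel/notes.
    `byteorder` is a struct prefix; "<" (little-endian) is assumed here.
    """
    offset = 0
    while offset + 12 <= len(notes):
        namesz, descsz, ntype = struct.unpack_from(byteorder + "III", notes, offset)
        offset += 12
        name = notes[offset:offset + namesz]
        offset += align4(namesz)
        desc = notes[offset:offset + descsz]
        offset += align4(descsz)
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc
    return None


def running_kernel_build_id(path="/sys/kernel/notes"):
    """Read the running kernel's notes and return its build-id, or None."""
    with open(path, "rb") as f:
        return find_build_id(f.read())
```

The hex string from `running_kernel_build_id().hex()` can then be set against the `Build ID:` line that `readelf -n` prints for the debuginfo vmlinux.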
The build-id is getting corrupted at some point during the packaging process.
This precludes systemtap operation:
sudo stap -e 'probe kernel.function("sys_open"){}' -tv
Pass 1: parsed user script and 76 library script(s) using 96240virt/21920res/2788shr kb, in 130usr/20sys/151real ms.
Pass 2: analyzed script: 1 probe(s), 0 function(s), 0 embed(s), 0 global(s) using 196460virt/86980res/51868shr kb, in 270usr/140sys/414real ms.
Pass 3: translated to C into "/tmp/stapVRU8p1/stap_5789e459df56e64ee93d1b2d5fe74936_758.c" using 196460virt/87868res/52756shr kb, in 280usr/10sys/294real ms.
Pass 4: compiled C into "stap_5789e459df56e64ee93d1b2d5fe74936_758.ko" in 4380usr/1590sys/6490real ms.
Pass 5: starting run.
ERROR: Build-id mismatch: "kernel" vs. "vmlinux" byte 0 (0x7c vs 0x01) address 0xffffffff813218f4 rc 0
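
The error above names the first byte at which the two build-ids differ. A hypothetical helper in the same spirit is sketched below; it is not systemtap's actual runtime check, and the function name and return shape are made up for illustration. Which side of the message corresponds to which argument is not asserted here.

```python
def first_mismatch(running, debuginfo):
    """Return (index, running_byte, debuginfo_byte) for the first
    differing byte, or None when the two build-ids are identical.

    If one input is a strict prefix of the other, the index of the
    first missing byte is returned with None for both byte values.
    """
    for i, (a, b) in enumerate(zip(running, debuginfo)):
        if a != b:
            return i, a, b
    if len(running) != len(debuginfo):
        return min(len(running), len(debuginfo)), None, None
    return None


def describe(running, debuginfo):
    """Format a short human-readable verdict for two build-ids."""
    diff = first_mismatch(running, debuginfo)
    if diff is None:
        return "build-ids match"
    index, a, b = diff
    if a is None:
        return "build-id length mismatch at byte %d" % index
    return "build-id mismatch at byte %d (0x%02x vs 0x%02x)" % (index, a, b)
```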
I seem to recall a kernel makefile (or perhaps elfutils) problem that
resulted in a problem like this before. IIRC, it was some sort of problem
during the vmlinux debuginfo stripping stage. Unfortunately, I can't find
a link to the fix of the actual problem.
cc:'ing our team to see if someone's memories can be jogged.
- FChE
* RE: Linux AMI 2011.09.1.x86_64-ebs issue: kernel aki-825ea7eb build-id munging
From: Gafton, Cristian @ 2011-10-27 21:25 UTC
To: Frank Ch. Eigler; +Cc: systemtap
Thanks for the report, Frank - I will try to do some digging into it over the weekend and see if I can figure out where the build-ids are getting clobbered. If any additional information/memories come back to you please let me know. Sorry for not getting it fully right, still.
Cristian
* Re: Linux AMI 2011.09.1.x86_64-ebs issue: kernel aki-825ea7eb build-id munging
From: Mark Wielaard @ 2011-10-27 22:56 UTC
To: Frank Ch. Eigler; +Cc: gafton, systemtap
On Thu, 2011-10-27 at 16:55 -0400, Frank Ch. Eigler wrote:
> On the current default AMI
> kernel (2.6.35.14-95.38.amzn1.x86_64), there is a mismatch.
>
> One can see this by hex-dumping /sys/kernel/notes on a running instance,
> and contrasting it with
> % readelf -x .notes /usr/lib/debug/lib/modules/`uname -r`/vmlinux
> from the corresponding debuginfo. The last bunch of bytes are supposed
> to be identical.
> [...]
> I seem to recall a kernel makefile (or perhaps elfutils) problem that
> resulted in a problem like this before. IIRC, it was some sort of problem
> during the vmlinux debuginfo stripping stage. Unfortunately, I can't find
> a link to the fix of the actual problem.
>
> cc:'ing our team to see if someone's memories can be jogged.
It doesn't immediately ring a bell [*].
Just to be sure: does /sys/kernel/notes match
readelf -x .notes /boot/vmlinuz-`uname -r` ?
Is there anything else about /boot/vmlinuz-`uname -r`
vs /usr/lib/debug/lib/modules/`uname -r`/vmlinux that might indicate a
mismatch? Or does everything look fine if you just hack out the stap
build-id safety-check?
Cheers,
Mark
[*] There was https://bugzilla.redhat.com/show_bug.cgi?id=590947
"debugedit vs modsign changes build ID", but that should only
impact kernel modules, not the vmlinuz image itself.