public inbox for binutils@sourceware.org
* working with debug symbols
From: Nikos Karampatziakis @ 2010-05-19  5:48 UTC
  To: binutils

Hi everyone,

I am working on a project where, ideally, we would like to be able to
say for every byte in the .text section of an (ELF or PE) executable,
whether it is part of an instruction (i.e. whether it's
code or data). The executables will be compiled with gcc and debugging
information. At which granularity can we extract this information from
the debug symbols? I know it's possible to get back the addresses of
the function entry points. I assume we can also get the address of the
first instruction of most lines in a source file. I would appreciate
some pointers on how to do that. However, a way to tag each address as
code or data would be even better. Finally, will any of these break if
the debug symbols are stored outside the executable?

Thank you,
Nikos


* Re: working with debug symbols
From: Nick Clifton @ 2010-05-19 11:18 UTC
  To: Nikos Karampatziakis; +Cc: binutils

Hi Nikos,

> I am working on a project where, ideally, we would like to be able to
> say for every byte in the .text section of an (ELF or PE) executable,
> whether it is part of an instruction (i.e. whether it's
> code or data). The executables will be compiled with gcc and debugging
> information. At which granularity can we extract this information from
> the debug symbols?

It depends upon the architecture that you are examining.  For example it 
would normally be reasonable to assume that every byte inside a function 
is part of an instruction, and so given a function's start address and 
its length you can make reasonable estimates as to the location of every 
instruction byte.  But if the compiler for the target architecture 
places constant pools inside functions then the assumption does not hold.
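
A minimal sketch of the function-range approach described above, in Python.
It assumes the (start, size) pairs have already been extracted from the
function symbols by some other tool; `tag_bytes` and its parameters are
hypothetical names used only for illustration. Bytes outside every function
are tagged "unknown" rather than assumed to be data, and any constant pool
placed inside a function would be mislabeled as code.

```python
def tag_bytes(text_start, text_size, functions):
    """Tag every byte of .text as 'code' or 'unknown'.

    functions: iterable of (start_address, size) pairs, for example taken
    from function symbols. Assumes no constant pools inside functions.
    """
    tags = ["unknown"] * text_size
    text_end = text_start + text_size
    for start, size in functions:
        # Clamp each function range to the bounds of the section.
        lo = max(start, text_start)
        hi = min(start + size, text_end)
        for addr in range(lo, hi):
            tags[addr - text_start] = "code"
    return tags
```

Padding between functions stays "unknown" here, since a start address and a
length say nothing about bytes that belong to no function.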

> However, a way to tag each address as
> code or data would be even better.

Have you looked at the ARM port's use of mapping symbols?  The compiler 
inserts these into the assembler output to indicate a change between 
data and instructions (and between different types of instructions) and 
then tools like the debugger can look for these special symbols to 
determine the nature of any given byte.
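
As a rough illustration of how a tool might consume mapping symbols, the
Python sketch below classifies an address from a list of (address, name)
pairs. It assumes the ARM ELF naming convention ($a for ARM code, $t for
Thumb code, $d for data, optionally followed by a suffix such as ".1") and
assumes the symbols for one section have already been read from the symbol
table; `classify` and `KINDS` are hypothetical names.

```python
import bisect

# ARM ELF mapping symbol prefixes: $a = ARM code, $t = Thumb code, $d = data.
KINDS = {"$a": "arm", "$t": "thumb", "$d": "data"}

def classify(addr, mapping_symbols):
    """Return the kind of the byte at addr, or None if nothing precedes it.

    mapping_symbols: list of (address, name) pairs for a single section.
    Each mapping symbol applies from its address up to the next one.
    """
    syms = sorted((a, n) for a, n in mapping_symbols if n[:2] in KINDS)
    addrs = [a for a, _ in syms]
    # Find the last mapping symbol at or before addr.
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None
    return KINDS[syms[i][1][:2]]
```

A real tool would sort the symbols once per section instead of on every
call; the sort is kept inside the function only to keep the sketch short.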

> Finally, will any of these break if
> the debug symbols are stored outside the executable?

That should not matter, provided that whatever tool you are creating is 
able to follow the information in the .gnu_debuglink section to the 
external debug information.
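
For reference, the contents of a .gnu_debuglink section are a
NUL-terminated file name, zero padding up to the next 4-byte boundary, and
a 4-byte CRC of the separate debug file. The Python sketch below decodes
that layout from the raw section bytes. How the section bytes are obtained
is left out, the function names are hypothetical, and the checksum is
assumed to be the same CRC-32 that zlib computes.

```python
import struct
import zlib

def parse_debuglink(section, byteorder="little"):
    """Split raw .gnu_debuglink contents into (filename, crc32)."""
    end = section.index(b"\x00")          # end of the file name
    name = section[:end].decode("utf-8")
    crc_off = (end + 1 + 3) & ~3          # CRC starts at next 4-byte boundary
    fmt = "<I" if byteorder == "little" else ">I"
    (crc,) = struct.unpack_from(fmt, section, crc_off)
    return name, crc

def debug_file_matches(debug_bytes, crc):
    """Check a candidate debug file against the recorded CRC.

    Assumption: the recorded checksum equals zlib's CRC-32 of the file.
    """
    return (zlib.crc32(debug_bytes) & 0xFFFFFFFF) == crc
```

The section stores only a file name, not a path, so the tool still has to
decide which directories to search for the debug file.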

Cheers
   Nick


* Re: working with debug symbols
From: Nikos Karampatziakis @ 2010-05-19 13:45 UTC
  To: Nick Clifton; +Cc: binutils

Hi Nick,

On Wed, May 19, 2010 at 7:18 AM, Nick Clifton <nickc@redhat.com> wrote:
> Hi Nikos,
>
>> I am working on a project where, ideally, we would like to be able to
>> say for every byte in the .text section of an (ELF or PE) executable,
>> whether it is part of an instruction (i.e. whether it's
>> code or data). The executables will be compiled with gcc and debugging
>> information. At which granularity can we extract this information from
>> the debug symbols?
>
> It depends upon the architecture that you are examining.  For example it
> would normally be reasonable to assume that every byte inside a function is
> part of an instruction, and so given a function's start address and its
> length you can make reasonable estimates as to the location of every
> instruction byte.  But if the compiler for the target architecture places
> constant pools inside functions then the assumption does not hold.
>

We are interested in x86 for now. Are the jump tables for switch
statements stored inside the functions?

Best,
-Nikos


* Re: working with debug symbols
From: Nick Clifton @ 2010-05-19 15:31 UTC
  To: Nikos Karampatziakis; +Cc: binutils

Hi Nikos,

> We are interested in x86 for now. Are the jump tables for switch
> statements stored inside the functions?

Probably not for you, but there are a few cases where they are. 
(Specifically when generating code for a 64-bit x86 Mach-O target, or a 
32-bit x86 target on a system where the assembler does not support the 
GOTOFF relocation in data sections).  You can easily compile a small 
test application to find out.

Cheers
   Nick

