Re: Transparent imports of partial units

public inbox for elfutils@sourceware.org

* Re: Transparent imports of partial units
@ 2014-11-07 14:08 Mark Wielaard
From: Mark Wielaard @ 2014-11-07 14:08 UTC (permalink / raw)
  To: elfutils-devel

On Fri, 2014-10-31 at 15:48 -0700, Josh Stone wrote:
> On 10/31/2014 11:26 AM, Petr Machata wrote:
> > Mark Wielaard <mjw@redhat.com> writes:
> > 
> >> So, assuming we can use the long int padding field as a Dwarf_Die pointer,
> > 
> > In other words, do we want to keep backward compatibility on LLP64
> > architectures?  I know Windows uses this model; do we care?  Any
> > Unices with this model out there?  FWIW, x32 and (MIPS) n32 are both
> > ILP32 ABIs.
> 
> For reference:
> 
>   /* DIE information.  */
>   typedef struct
>   {
>     /* The offset can be computed from the address.  */
>     void *addr;
>     struct Dwarf_CU *cu;
>     Dwarf_Abbrev *abbrev;
>     // XXX We'll see what other information will be needed.
>     long int padding__;
>   } Dwarf_Die;
> 
> Even if that padding is only 4 bytes on LLP64, wouldn't the whole struct
> still be 32 bytes for alignment?  So you may be able to cheat this one,
> turn it into a pointer anyway.
> 
> But I'm not sure if things like struct copies will include alignment
> padding.  Probably not, so then this wouldn't work after all...
> 
> Anyway, at least this only cites Windows 64 as being LLP64:
> https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models
> 
> And cppref agrees, though it notes some early Unix systems were ILP64.
> http://en.cppreference.com/w/cpp/language/types#Data_models

But ILP64 wouldn't be a problem since long and pointer have the same
size. So changing the long to a pointer would really only be an ABI
issue on 64-bit Windows (even 16-bit and 32-bit Windows should be fine).

> But really, I think it's safe to ignore Windows issues in elfutils...
> 
> :)

Agreed.

Cheers,

Mark

* Re: Transparent imports of partial units
@ 2014-10-31 22:48 Josh Stone
From: Josh Stone @ 2014-10-31 22:48 UTC (permalink / raw)
  To: elfutils-devel

On 10/31/2014 11:26 AM, Petr Machata wrote:
> Mark Wielaard <mjw@redhat.com> writes:
> 
>> So, assuming we can use the long int padding field as a Dwarf_Die pointer,
> 
> In other words, do we want to keep backward compatibility on LLP64
> architectures?  I know Windows uses this model; do we care?  Any
> Unices with this model out there?  FWIW, x32 and (MIPS) n32 are both
> ILP32 ABIs.

For reference:

  /* DIE information.  */
  typedef struct
  {
    /* The offset can be computed from the address.  */
    void *addr;
    struct Dwarf_CU *cu;
    Dwarf_Abbrev *abbrev;
    // XXX We'll see what other information will be needed.
    long int padding__;
  } Dwarf_Die;

Even if that padding is only 4 bytes on LLP64, wouldn't the whole struct
still be 32 bytes for alignment?  So you may be able to cheat this one,
turn it into a pointer anyway.

But I'm not sure if things like struct copies will include alignment
padding.  Probably not, so then this wouldn't work after all...
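The size half of that question can be checked with a minimal sketch.
Everything below is a hypothetical stand-in, not the real libdw
definition: the members are plain void pointers, and an int32_t models
the 4-byte long of LLP64 so that the check also means something on an
LP64 host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of Dwarf_Die as an LLP64 compiler would lay it out:
   three pointers followed by a 4-byte integer.  */
struct die_llp64_like
{
  void *addr;
  void *cu;
  void *abbrev;
  int32_t padding__;
};

/* Hypothetical variant with the last field turned into a pointer.  */
struct die_with_pointer
{
  void *addr;
  void *cu;
  void *abbrev;
  void *import;
};

/* Nonzero if swapping the 4-byte field for a pointer keeps the size.  */
static int
same_size_after_swap (void)
{
  return sizeof (struct die_llp64_like) == sizeof (struct die_with_pointer);
}

/* Number of tail padding bytes that follow the last member.  */
static size_t
tail_padding (void)
{
  return sizeof (struct die_llp64_like)
	 - (offsetof (struct die_llp64_like, padding__) + sizeof (int32_t));
}

/* Aborts if the layout assumption does not hold on the host.  */
static void
check_layout (void)
{
  assert (same_size_after_swap ());
}
```

A matching size only settles allocation.  The C standard leaves the
values of padding bytes unspecified when a structure is stored, so a
struct assignment in code built against the old header need not carry
the tail bytes along, which is the caveat raised above.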

Anyway, at least this only cites Windows 64 as being LLP64:
https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models

And cppref agrees, though it notes some early Unix systems were ILP64.
http://en.cppreference.com/w/cpp/language/types#Data_models

For fun, I tried to cross-compile to x86_64-w64-mingw32 from F20.  I had
to use the portable branch for lack of __thread, and then I still ran
into a bunch of headers that had no suitable provider in the repos.

$ make -k |& grep '#include' | sort -u
 #include <argp.h>
 #include <ar.h>
 #include <byteswap.h>
 #include <endian.h>
 #include <features.h>
 #include <fnmatch.h>
 #include <fts.h>
 #include <obstack.h>
 #include <stdio_ext.h>
 #include <sys/mman.h>

That doesn't mean it's impossible to find those headers, just that
they're not in Fedora's mingw packages.  But really, I think it's safe
to ignore Windows issues in elfutils...

:)

* Re: Transparent imports of partial units
@ 2014-10-31 18:26 Petr Machata
From: Petr Machata @ 2014-10-31 18:26 UTC (permalink / raw)
  To: elfutils-devel

Mark Wielaard <mjw@redhat.com> writes:

> So, assuming we can use the long int padding field as a Dwarf_Die pointer,

In other words, do we want to keep backward compatibility on LLP64
architectures?  I know Windows uses this model; do we care?  Any
Unices with this model out there?  FWIW, x32 and (MIPS) n32 are both
ILP32 ABIs.

If this is a concern, we can keep it as is and use the value as an index
into an internal array of cached DIEs instead of a raw pointer.

Actually, we may want to go this way also because it lets us take only
part of the remaining space and save the rest for a rainy day.  But the
savings can't be too aggressive.  We can save a bit by storing import
points at the target partial unit instead of putting them all together
at the Dwarf, but the space complexity will still be essentially
O(number of DIEs in the Dwarf).  I'm reluctantly proposing 24 bits as a
"good enough for everybody" number.  That would allow us to index 16
million import paths to any partial units, and save 8 bits for whatever
else we need.  But I'm not quite convinced myself: I have Dwarfs with 16
million DIEs in my personal dwgrep test bench.
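A minimal sketch of that split, assuming the field is treated as a
32-bit quantity (the width of long on ILP32 and LLP64).  The names
IMPORT_INDEX_BITS, pack_import, import_index and import_spare are
hypothetical, and the conversion to and from the actual long int field
is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical split: low 24 bits index the cache of import points,
   high 8 bits stay reserved for later use.  */
#define IMPORT_INDEX_BITS 24
#define IMPORT_INDEX_MASK ((UINT32_C (1) << IMPORT_INDEX_BITS) - 1)

/* Combine a cache index and the spare byte into one 32-bit value.  */
static uint32_t
pack_import (uint32_t index, uint8_t spare)
{
  assert (index <= IMPORT_INDEX_MASK);	/* at most 2^24 - 1 */
  return ((uint32_t) spare << IMPORT_INDEX_BITS) | index;
}

/* Extract the cache index from a packed value.  */
static uint32_t
import_index (uint32_t packed)
{
  return packed & IMPORT_INDEX_MASK;
}

/* Extract the spare byte from a packed value.  */
static uint8_t
import_spare (uint32_t packed)
{
  return (uint8_t) (packed >> IMPORT_INDEX_BITS);
}
```

With 24 bits the largest usable index is 2^24 - 1 = 16777215, which is
where the "16 million" figure comes from.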

> each Dwarf_Die returned by dwarf_child_aggregate (ugly name!) that came

Not only dwarf_child_aggregate, also dwarf_siblingof_aggregate.  I think
it's perfectly valid to have an import point in the middle of a sibling
chain, or to have several import points in one chain.

> through an imported_unit DIE would point back to that imported_unit and
> dwarf_sibling_aggregate would propagate that back pointer to each
> sibling. And when there are no more siblings and the back pointer is set,
> then dwarf_sibling_aggregate would continue at the imported DIE pointed
> to and take the next sibling there?

That's the idea.

> I don't immediately see any drawbacks. The only thing I can think of is
> that we might want to provide an equal/identity function for Dwarf_Die,
> in case people want to know whether two "raw" DIEs are really the same
> (although people can already do that given the addr and CU fields I see
> now).

Yeah, they would just have to throw our pointer/index thingy into the
mix.  Having an interface for this seems more prudent, and would allow
us to tweak the exact representation should the need arise (such as when
we run out of space).
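Such an interface could look roughly like the sketch below.  The struct
and both function names are hypothetical; the real Dwarf_Die has typed
cu and abbrev members, and no such comparison function exists in libdw
as far as this thread shows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Dwarf_Die, with the padding field already
   repurposed as an import pointer/index.  */
struct toy_die
{
  void *addr;
  void *cu;
  void *abbrev;
  long int import;
};

/* True if both refer to the same raw DIE, regardless of how it was
   reached.  */
static bool
toy_die_same_raw (const struct toy_die *a, const struct toy_die *b)
{
  assert (a != NULL && b != NULL);
  return a->addr == b->addr && a->cu == b->cu;
}

/* True if both refer to the same raw DIE reached through the same
   import point.  */
static bool
toy_die_equal (const struct toy_die *a, const struct toy_die *b)
{
  return toy_die_same_raw (a, b) && a->import == b->import;
}
```

Keeping the comparison behind a function means callers never look at
the import field themselves, so its representation can change later.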

Thanks,
Petr

* Re: Transparent imports of partial units
@ 2014-10-31 14:40 Mark Wielaard
From: Mark Wielaard @ 2014-10-31 14:40 UTC (permalink / raw)
  To: elfutils-devel

On Tue, 2014-10-14 at 02:24 +0200, Petr Machata wrote:
> It seems inefficient to have to replicate this logic in each client.  We
> have interfaces for transparent integration of attributes, why not do
> something similar for partial unit imports?

Agreed. I updated systemtap some time ago and there were surprisingly
many places that needed to be updated. I hope I caught them all.

> I don't have any particular plan, but at least dwarf_child and
> dwarf_siblingof seem like they should have transparent-importing twins.
> My abstract ideas count on caching DIEs of the integration points and
> referencing them through the now-unused padding field of Dwarf_Die.
> Then when dwarf_siblingof hits the end of a DIE chain, it checks this
> field and possibly continues the iteration there.

That already sounds like an (abstract) plan :)
So, assuming we can use the long int padding field as a Dwarf_Die pointer,
each Dwarf_Die returned by dwarf_child_aggregate (ugly name!) that came
through an imported_unit DIE would point back to that imported_unit and
dwarf_sibling_aggregate would propagate that back pointer to each
sibling. And when there are no more siblings and the back pointer is set,
then dwarf_sibling_aggregate would continue at the imported DIE pointed
to and take the next sibling there?
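That scheme can be sketched on a toy tree.  Nothing below is libdw API:
struct node, struct die, settle, child_aggregate and sibling_aggregate
are hypothetical names, the import field is modelled as a pointer to a
cached import point, and the fixed-size cache is only there to keep the
sketch short.

```c
#include <assert.h>
#include <stddef.h>

enum { TAG_PLAIN, TAG_IMPORT };

/* Toy raw DIE.  For TAG_IMPORT, target is the root of the partial unit.  */
struct node
{
  int tag;
  int id;
  const struct node *child, *sibling, *target;
};

/* Toy Dwarf_Die: a raw DIE plus a back pointer to the cached import
   point it was reached through (NULL if none).  */
struct die
{
  const struct node *raw;
  const struct die *import;
};

#define CACHE_MAX 16
static struct die cache[CACHE_MAX];
static size_t cache_used;

/* Move *d to the next non-import DIE: descend into import points and,
   at the end of a chain, resume after the import point we came through.
   Returns 0 if a DIE was found, 1 if there are no more.  */
static int
settle (struct die *d)
{
  for (;;)
    if (d->raw == NULL)
      {
	if (d->import == NULL)
	  return 1;
	const struct die *ip = d->import;
	d->raw = ip->raw->sibling;
	d->import = ip->import;
      }
    else if (d->raw->tag == TAG_IMPORT)
      {
	assert (cache_used < CACHE_MAX);
	struct die *ip = &cache[cache_used++];
	*ip = *d;
	d->raw = ip->raw->target->child;
	d->import = ip;
      }
    else
      return 0;
}

/* First child.  The chain is local to the parent, so no back pointer.  */
static int
child_aggregate (const struct die *parent, struct die *result)
{
  result->raw = parent->raw->child;
  result->import = NULL;
  return settle (result);
}

/* Next sibling; the back pointer is propagated from d.  */
static int
sibling_aggregate (const struct die *d, struct die *result)
{
  result->raw = d->raw->sibling;
  result->import = d->import;
  return settle (result);
}

/* Store the ids of all children of parent, imports resolved, into ids.  */
static size_t
collect_children (const struct node *parent, int *ids, size_t max)
{
  struct die p = { parent, NULL }, d;
  size_t n = 0;
  for (int rc = child_aggregate (&p, &d); rc == 0 && n < max;
       rc = sibling_aggregate (&d, &d))
    ids[n++] = d.raw->id;
  return n;
}
```

Caching the import point as a whole struct die, back pointer included,
is what lets nested imports unwind one level at a time.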

I don't immediately see any drawbacks. The only thing I can think of is
that we might want to provide an equal/identity function for Dwarf_Die,
in case people want to know whether two "raw" DIEs are really the same
(although people can already do that given the addr and CU fields I see
now).

> On top of the above, there are places in libdw where dwarf_child and
> dwarf_siblingof are called.  E.g. in dwarf_aggregate_size, there could
> in theory be an import point between DW_TAG_enumeration_type and its
> DW_TAG_enumerators.  That's not currently handled.  It seems like in
> these sorts of cases, we should use the new interfaces.

Yes, this only happens to work currently because dwz only uses import
DIEs as "top-level" CU children, so the DIEs you are feeding these
functions probably don't have any children that are imported units. But
if they do in the future, things will break unexpectedly.

Thanks,

Mark

* Transparent imports of partial units
@ 2014-10-14  0:24 Petr Machata
From: Petr Machata @ 2014-10-14  0:24 UTC (permalink / raw)
  To: elfutils-devel

Hi there,

with dwz out there in the wild and being used, clients will generally
have to deal with handling of partial units.  (They always kinda had to,
but now they have to for real.)  In theory, every time that you call
dwarf_child or dwarf_siblingof, you should do a little dance of checking
whether the DIE happens to be DW_TAG_imported_unit, and if yes, locate
that unit and look at its children, possibly recursively resolving more
imported units.  You then should retract along the same lines when
dwarf_siblingof hits the wall.
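For illustration, here is that dance on a toy tree rather than on real
libdw types.  struct node, visit_fn, for_each_child and collect_id are
hypothetical names; a real client would use dwarf_child,
dwarf_siblingof and dwarf_tag, and would follow the import DIE's
reference to the partial unit instead of a target pointer.

```c
#include <assert.h>
#include <stddef.h>

enum { TAG_PLAIN, TAG_IMPORT };

/* Toy DIE.  For TAG_IMPORT, target is the root of the partial unit.  */
struct node
{
  int tag;
  int id;
  const struct node *child, *sibling, *target;
};

typedef void visit_fn (const struct node *die, void *arg);

/* Visit every child of parent.  An import point is replaced by the
   children of the unit it imports, recursively; returning from the
   recursive call is the "retract" step.  */
static void
for_each_child (const struct node *parent, visit_fn *visit, void *arg)
{
  assert (parent != NULL);
  for (const struct node *d = parent->child; d != NULL; d = d->sibling)
    if (d->tag == TAG_IMPORT)
      for_each_child (d->target, visit, arg);
    else
      visit (d, arg);
}

/* Example visitor: record the ids in visiting order.  */
struct id_list
{
  int ids[16];
  size_t n;
};

static void
collect_id (const struct node *die, void *arg)
{
  struct id_list *list = arg;
  if (list->n < 16)
    list->ids[list->n++] = die->id;
}
```

Every client that walks children needs some variant of this loop, which
is the duplication in question.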

It seems inefficient to have to replicate this logic in each client.  We
have interfaces for transparent integration of attributes, why not do
something similar for partial unit imports?  Currently partial units are
only handled in __libdw_visit_scopes.

I don't have any particular plan, but at least dwarf_child and
dwarf_siblingof seem like they should have transparent-importing twins.
My abstract ideas count on caching DIEs of the integration points and
referencing them through the now-unused padding field of Dwarf_Die.
Then when dwarf_siblingof hits the end of a DIE chain, it checks this
field and possibly continues the iteration there.

On top of the above, there are places in libdw where dwarf_child and
dwarf_siblingof are called.  E.g. in dwarf_aggregate_size, there could
in theory be an import point between DW_TAG_enumeration_type and its
DW_TAG_enumerators.  That's not currently handled.  It seems like in
these sorts of cases, we should use the new interfaces.

Opinions?

Thanks,
Petr
