* [Bug default/25449] Factor out compilation units
From: vries at gcc dot gnu.org @ 2020-01-01  0:00 UTC (permalink / raw)
  To: dwz

https://sourceware.org/bugzilla/show_bug.cgi?id=25449

--- Comment #1 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #0)
> So, as a first step we could do an optimization in DWZ to look at all the
> items that are selected to be moved into a PU, and decide whether we can
> transform the PU into a CU and drop the imports.

One of the requirements probably has to be that the items are from a single
language.

> An #include directive appearing outside any other declarations is a good
> candidate to be represented using DW_TAG_compile_unit.
> 
> However, an #include appearing inside a C++ namespace declaration or a
> function, for example, is not a good candidate because the entities included
> are not necessarily file level entities.

The appendix suggests DIEs in a namespace are not good candidates, but I think
what that tries to say is that if we originally have a DIE in a namespace:
...
DIE2: compilation unit B
  DIE3: namespace bla
    DIE1
...
and do some factoring out like so:
...
DIE0: factored-out unit A
  DIE1
DIE2: compilation unit B
  DIE3: namespace bla
    DIE4: import DIE0
...
the factored-out unit cannot use DW_TAG_compile_unit, because DIE1 is not a
globally visible entry.

However, dwz generates this type of partial unit:
...
DIE0: partial unit A
  DIE3: namespace bla
    DIE1
DIE2: compilation unit B
  DIE4: import DIE0
...
which basically works around this problem, and I don't see a reason here why
unit A can't be a compilation unit.
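
A minimal Python sketch of the workaround above, assuming a toy model in which
a DIE is a plain dict. The helper name wrap_in_scope and the dict layout are
made up for illustration and are not dwz internals. It re-creates the
enclosing namespaces inside the factored-out unit, so that DIE1 keeps its
original scope there and the import can sit at the top level of unit B:

```python
def wrap_in_scope(die, namespaces, unit_tag="DW_TAG_partial_unit"):
    """Return a factored-out unit holding `die` nested in copies of its scopes.

    `namespaces` lists the enclosing namespace names, outermost first.
    This is a toy model: real DIEs carry many more attributes.
    """
    node = die
    # Wrap from the innermost namespace outwards.
    for name in reversed(namespaces):
        node = {"tag": "DW_TAG_namespace", "name": name, "children": [node]}
    return {"tag": unit_tag, "children": [node]}


unit_a = wrap_in_scope({"name": "DIE1"}, ["bla"])
```

Passing unit_tag="DW_TAG_compile_unit" gives the variant suggested above,
where unit A is a compilation unit.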

-- 
You are receiving this mail because:
You are on the CC list for the bug.

* [Bug default/25449] Factor out compilation units
From: vries at gcc dot gnu.org @ 2020-01-01  0:00 UTC (permalink / raw)
  To: dwz

https://sourceware.org/bugzilla/show_bug.cgi?id=25449

--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #0)
> So, there also is an option to tag the created units with
> DW_TAG_compile_unit instead of DW_TAG_partial_unit, which means no
> requirement to create DW_TAG_imported_unit/DW_AT_import for such units,
> which means better compression.

The bit of "no requirement to create DW_TAG_imported_unit/DW_AT_import for such
units" is not entirely trivial.

In appendix E we find:
...
Use of DW_TAG_imported_unit

A DW_TAG_imported_unit debugging information entry has an DW_AT_import
attribute referencing a DW_TAG_compile_unit or DW_TAG_partial_unit debugging
information entry.

A DW_TAG_imported_unit debugging information entry refers to a
DW_TAG_compile_unit or DW_TAG_partial_unit debugging information entry to
specify that the DW_TAG_compile_unit or DW_TAG_partial_unit contents logically
appear at the point of the DW_TAG_imported_unit entry.
...

So, it's possible to do an import of a compile unit.

But in the first C++ example in E.1, the import statement for the compilation
unit is missing, while in the first Fortran example in E.1, the import statement
for the partial unit is included.

Furthermore, at 3.1.1 Normal and Partial Compilation Unit Entries, we have:
...
A compilation unit entry owns debugging information entries that represent all
or part of the declarations made in the corresponding compilation. In the case
of a partial compilation unit, the containing scope of its owned declarations
is indicated by imported unit entries in one or more other compilation unit
entries that refer to that partial compilation unit.
...

A bit of explanation about when import is used and when not occurs here in E.1
"C example":
...
The C++ example in this Section might appear to be equally valid as a C
example. However, it is prudent to include a DW_TAG_imported_unit in the
primary unit (see Figure 84) with an DW_AT_import attribute that refers to the
proper unit in the section group.

The C rules for consistency of global (file scope) symbols across compilations
are less strict than for C++; inclusion of the import unit attribute assures
that the declarations of the proper section group are considered before
declarations from other compilations.
...

So, the gist of this seems to be:
- factored out partial unit: needs import
- factored out compilation unit:
  - prudent to import from C compilation unit (but we could have, for
    instance, a command line option to not do this, and see what breaks)
  - not required from C++ compilation unit
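
The rules in this list can be written down as a small decision function. This
is only a sketch in Python: the function name, the language strings and the
prudent_c_import flag (standing in for the command line option mentioned
above) are assumptions, and the conservative default for other languages is
a guess rather than something the appendix states.

```python
def needs_import(unit_tag, importer_language, prudent_c_import=True):
    """Decide whether an importing CU needs a DW_TAG_imported_unit entry."""
    if unit_tag == "DW_TAG_partial_unit":
        # A partial unit only gets its context from the importing location.
        return True
    if importer_language == "C++":
        # Contents of a factored-out compilation unit are globally visible.
        return False
    if importer_language == "C":
        # Prudent by default; the flag models the proposed opt-out.
        return prudent_c_import
    # Assumption: stay conservative for languages the thread does not cover.
    return True
```

For example, needs_import("DW_TAG_compile_unit", "C++") returns False, while
the same call with "C" returns True unless prudent_c_import is switched off.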

* [Bug default/25449] New: Factor out compilation units
From: vries at gcc dot gnu.org @ 2020-01-01  0:00 UTC (permalink / raw)
  To: dwz

https://sourceware.org/bugzilla/show_bug.cgi?id=25449

            Bug ID: 25449
           Summary: Factor out compilation units
           Product: dwz
           Version: unspecified
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: default
          Assignee: nobody at sourceware dot org
          Reporter: vries at gcc dot gnu.org
                CC: dwz at sourceware dot org
  Target Milestone: ---

The dwarf standard contains "Appendix E -- DWARF Compression and Duplicate
Elimination (informative)", describing a technique on how to generate smaller
debug information:
...
to break up the debug information of a compilation into separate normal and
partial compilation units, each consisting of one or more sections. By
arranging that a sufficiently similar partitioning occurs in other
compilations, a suitable system linker can delete redundant groups of sections
when combining object files.
...

DWZ implements this scheme, but with the approach (described in pre-link terms
in the appendix) applied post-link.

It does this by:
- moving common DIEs into partial units (tagged with DW_TAG_partial_unit),
- generating DW_TAG_imported_unit/DW_AT_import to import the partial units
  into the compilation units which originally contained the DIEs
- referencing DIEs in partial units using DW_FORM_ref_addr, when referenced
  from the originally containing compilation units or other partial units.
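
As a rough illustration of the first two steps, here is a toy Python model.
It is not how dwz is implemented: DIEs are reduced to hashable labels, all
shared DIEs are lumped into a single partial unit (a real tool would
presumably partition them by the set of referencing CUs), and the names
factor_out_common and PU0 are invented for the example.

```python
from collections import defaultdict


def factor_out_common(cus, pu_name="PU0"):
    """Move DIEs that occur in more than one CU into one partial unit.

    `cus` maps a CU name to its list of DIE labels. Returns the partial
    unit and the rewritten CUs, where each CU that lost a DIE gains an
    import entry pointing at the partial unit.
    """
    owners = defaultdict(set)
    for cu, dies in cus.items():
        for die in dies:
            owners[die].add(cu)
    common = sorted(die for die, users in owners.items() if len(users) > 1)
    partial_unit = {"tag": "DW_TAG_partial_unit", "name": pu_name, "children": common}

    new_cus = {}
    for cu, dies in cus.items():
        kept = [die for die in dies if die not in common]
        if len(kept) != len(dies):
            # Something was moved out of this CU, so it must import the PU.
            kept.insert(0, ("DW_TAG_imported_unit", pu_name))
        new_cus[cu] = kept
    return partial_unit, new_cus


cus = {
    "a.c": ["int", "struct foo", "fa"],
    "b.c": ["int", "struct foo", "fb"],
    "c.c": ["fc"],
}
pu, new_cus = factor_out_common(cus)
```

Here "int" and "struct foo" end up in the partial unit, a.c and b.c each gain
one import entry, and c.c is left untouched.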

The appendix does have a section, though, on "Use of DW_TAG_compile_unit versus
DW_TAG_partial_unit":
...
A section group compilation unit that uses DW_TAG_compile_unit is like any
other compilation unit, in that its contents are evaluated by consumers as
though it were an ordinary compilation unit.

An #include directive appearing outside any other declarations is a good
candidate to be represented using DW_TAG_compile_unit.

However, an #include appearing inside a C++ namespace declaration or a
function, for example, is not a good candidate because the entities included
are not necessarily file level entities.

<SNIP>

Consequently a compiler must use DW_TAG_partial_unit (instead of
DW_TAG_compile_unit) in a section group whenever the section group contents are
not necessarily globally visible.

This directs consumers to ignore that compilation unit when scanning top level
declarations and definitions.

The DW_TAG_partial_unit compilation unit will be referenced from elsewhere and
the referencing locations give the appropriate context for interpreting the
partial compilation unit.
...

So, there also is an option to tag the created units with DW_TAG_compile_unit
instead of DW_TAG_partial_unit, which means no requirement to create
DW_TAG_imported_unit/DW_AT_import for such units, which means better
compression.

The first C++ example in the appendix shows this situation, and states:
...
This example uses DW_TAG_compile_unit for the section group, implying that the
contents of the compilation unit are globally visible (in accordance with C++
language rules). DW_TAG_partial_unit is not needed for the same reason.
...

So, as a first step we could do an optimization in DWZ to look at all the items
that are selected to be moved into a PU, and decide whether we can transform
the PU into a CU and drop the imports.
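
A sketch of what such a check might look like, in Python for brevity. The
Item class and its file_level flag are hypothetical stand-ins for whatever
dwz would actually inspect; the only criterion modelled is the one from the
appendix, that the contents must be globally visible.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    file_level: bool  # hypothetical: True if the DIE was a direct child of its CU


def choose_unit_tag(items):
    """Return (tag, imports_needed) for a set of items selected for a PU."""
    if items and all(item.file_level for item in items):
        # Everything is globally visible: a CU works and imports can go.
        return "DW_TAG_compile_unit", False
    # Otherwise fall back to the current behaviour.
    return "DW_TAG_partial_unit", True
```

For example, choose_unit_tag([Item("int", True), Item("local_t", False)])
keeps the partial unit, because one item is not a file level entity.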

* [Bug default/25449] Factor out compilation units
From: vries at gcc dot gnu.org @ 2020-01-01  0:00 UTC (permalink / raw)
  To: dwz

https://sourceware.org/bugzilla/show_bug.cgi?id=25449

--- Comment #3 from Tom de Vries <vries at gcc dot gnu.org> ---
(In reply to Tom de Vries from comment #2)
> So, the gist of this seems to be:
> - factored out partial unit: needs import
> - factored out compilation unit:
>   - prudent to import from C compilation unit (but we can have f.i. a
>     command line option to not do this, and see what breaks)

Alternatively, we can try to prove we don't need the import.  If all the
elements in the factored out compilation unit are uniquely named in link scope,
there's no confusion about which is meant, and we don't need the import.
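
In Python, such a proof could be sketched as a simple count over all names
with link scope. The function name and its inputs are assumptions: it
presumes we can list every globally visible name across the link, with the
factored-out names included once each.

```python
from collections import Counter


def import_can_be_dropped(factored_names, link_scope_names):
    """True if every factored-out name occurs exactly once in link scope."""
    counts = Counter(link_scope_names)
    return all(counts[name] == 1 for name in factored_names)
```

With link scope names ["foo", "bar", "baz"], factoring out ["foo", "bar"]
passes the check; a second "foo" anywhere in the link makes it fail.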

>   - not required from C++ compilation unit

Conversely, this may cause problems because there may be different DIEs with
the same globally unique name which are not structurally equivalent. This
happens for instance with member function templates, where a DIE in one CU
representing a named struct can have extra members representing the various
member function template instantiations in the CU, making the DIE potentially
different from other DIEs representing the same named struct in other CUs. [
Which is why we want a --odr-mode=unify option. ]
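
To make the problem concrete, here is a toy Python check that flags struct
names whose member lists differ between CUs. The tuple layout and the name
odr_conflicts are invented for illustration, and comparing member lists is
a much cruder notion of structural equivalence than a real tool would need.

```python
def odr_conflicts(structs):
    """Return the struct names whose member lists differ between CUs.

    `structs` is an iterable of (cu, name, members) tuples.
    """
    first_shape = {}
    conflicts = set()
    for _cu, name, members in structs:
        shape = tuple(members)
        if name in first_shape and first_shape[name] != shape:
            conflicts.add(name)
        first_shape.setdefault(name, shape)
    return sorted(conflicts)


structs = [
    ("a.cc", "S", ["x", "f<int>"]),   # S with one template instantiation
    ("b.cc", "S", ["x", "f<char>"]),  # same S, different instantiation
    ("a.cc", "T", ["y"]),
    ("b.cc", "T", ["y"]),
]
conflicting = odr_conflicts(structs)
```

Here only S is reported: both CUs name the same struct, but the DIEs are not
structurally equivalent because of the differing instantiations.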
