public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* LTO Dead Field Elimination
@ 2020-07-24 15:43 Erick Ochoa
  2020-07-27 12:36 ` Richard Biener
  2020-07-27 15:11 ` David Brown
  0 siblings, 2 replies; 8+ messages in thread
From: Erick Ochoa @ 2020-07-24 15:43 UTC (permalink / raw)
  To: GCC Development
  Cc: Christoph Müllner, Philipp Tomsich, Tamar Christina,
	Kevin Smith, Richard Biener, Martin Jambor, Jan Hubicka,
	Gary Oblock, Ashok Bhat, mliska, kyrylo.tkachov

This patchset brings back struct reorg to GCC.

We’ve been working on improving cache utilization recently and would 
like to share our current implementation to receive some feedback on it.

Essentially, we’ve implemented the following components:

     Type-based escape analysis to determine if we can reorganize a type 
at link-time

     Dead-field elimination to remove unused fields of a struct at 
link-time

The type-based escape analysis provides a list of types, that are not 
visible outside of the current linking unit (e.g. parameter types of 
external functions).

The dead-field elimination pass analyses non-escaping structs for fields 
that are not used in the linking unit and thus can be removed. The 
resulting struct has a smaller memory footprint, which allows for a 
higher cache utilization.

As a side-effect a couple of new infrastructure code has been written 
(e.g. a type walker, which we were really missing in GCC), which can be 
of course reused for other passes as well.

We’ve prepared a patchset in the following branch:

   refs/vendors/ARM/heads/arm-struct-reorg-wip

We’ve also added a subsection in the GCC internals document to allow 
other compiler devs to better understand our design and implementation. 
A generated PDF can be found here:

    https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F
    https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download

    page: 719

We’ve been testing the pass against a range of in-tree tests and 
real-life applications (e.g. all SPEC CPU2017 C benchmarks). For 
testing, please see testing subsection in the gcc internals we prepared.

Currently we see the following limitations:

* It is not a "true" ipa pass yes. That is, we can only succeed with 
-flto-partition=none.
* Currently it is not safe to use -fipa-sra.
* Brace constructors not supported now. We handle this gracefully.
* Only C as of now.
* Results of sizeof() and offsetof() are generated in the compiler 
frontend and thus can’t be changed later at link time. There are a 
couple of ideas to resolve this, but that’s currently unimplemented.
* At this point we’d like to thank the GCC community for their patient 
help so far on the mailing list and in other channels. And we ask for 
your support in terms of feedback, comments and testing.

Thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: LTO Dead Field Elimination
  2020-07-24 15:43 LTO Dead Field Elimination Erick Ochoa
@ 2020-07-27 12:36 ` Richard Biener
  2020-07-27 12:52   ` Erick Ochoa
  2020-07-27 12:59   ` Christoph Müllner
  2020-07-27 15:11 ` David Brown
  1 sibling, 2 replies; 8+ messages in thread
From: Richard Biener @ 2020-07-27 12:36 UTC (permalink / raw)
  To: Erick Ochoa
  Cc: GCC Development, Christoph Müllner, Philipp Tomsich,
	Tamar Christina, Kevin Smith, Martin Jambor, Jan Hubicka,
	Gary Oblock, Ashok Bhat, Martin Liška, Kyrylo Tkachov

On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa
<erick.ochoa@theobroma-systems.com> wrote:
>
> This patchset brings back struct reorg to GCC.
>
> We’ve been working on improving cache utilization recently and would
> like to share our current implementation to receive some feedback on it.
>
> Essentially, we’ve implemented the following components:
>
>      Type-based escape analysis to determine if we can reorganize a type
> at link-time
>
>      Dead-field elimination to remove unused fields of a struct at
> link-time
>
> The type-based escape analysis provides a list of types, that are not
> visible outside of the current linking unit (e.g. parameter types of
> external functions).
>
> The dead-field elimination pass analyses non-escaping structs for fields
> that are not used in the linking unit and thus can be removed. The
> resulting struct has a smaller memory footprint, which allows for a
> higher cache utilization.
>
> As a side-effect a couple of new infrastructure code has been written
> (e.g. a type walker, which we were really missing in GCC), which can be
> of course reused for other passes as well.
>
> We’ve prepared a patchset in the following branch:
>
>    refs/vendors/ARM/heads/arm-struct-reorg-wip

Just had some time to peek into this.  Ugh.  The code doesn't look like
GCC code looks :/  It doesn't help to have one set of files per C++ class (25!).
The code itself is undocumented - it's hard to understand what the purpose
of all the Walker stuff is.

You also didn't seem to know walk_tree () nor walk_gimple* ().

Take as example - I figured to look for IPA pass entries, then I see

+
+static void
+collect_types ()
+{
+  GimpleTypeCollector collector;
+  collector.walk ();
+  collector.print_collected ();
+  ptrset_t types = collector.get_pointer_set ();
+  GimpleCaster caster (types);
+  caster.walk ();
+  if (flag_print_cast_analysis)
+    caster.print_reasons ();
+  ptrset_t casting = caster.get_sets ();
+  fix_escaping_types_in_set (casting);
+  GimpleAccesser accesser;
+  accesser.walk ();
+  if (flag_print_access_analysis)
+    accesser.print_accesses ();
+  record_field_map_t record_field_map = accesser.get_map ();
+  TypeIncompleteEquality equality;
+  bool has_fields_that_can_be_deleted = false;
+  typedef std::set<unsigned> field_offsets_t;

there's no comments (not even file-level) that explains how type escape
is computed.

Sorry, but this isn't even close to be coarsely reviewable.

> We’ve also added a subsection in the GCC internals document to allow
> other compiler devs to better understand our design and implementation.
> A generated PDF can be found here:
>
>     https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F
>     https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download
>
>     page: 719
>
> We’ve been testing the pass against a range of in-tree tests and
> real-life applications (e.g. all SPEC CPU2017 C benchmarks). For
> testing, please see testing subsection in the gcc internals we prepared.
>
> Currently we see the following limitations:
>
> * It is not a "true" ipa pass yes. That is, we can only succeed with
> -flto-partition=none.
> * Currently it is not safe to use -fipa-sra.
> * Brace constructors not supported now. We handle this gracefully.
> * Only C as of now.
> * Results of sizeof() and offsetof() are generated in the compiler
> frontend and thus can’t be changed later at link time. There are a
> couple of ideas to resolve this, but that’s currently unimplemented.
> * At this point we’d like to thank the GCC community for their patient
> help so far on the mailing list and in other channels. And we ask for
> your support in terms of feedback, comments and testing.
>
> Thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: LTO Dead Field Elimination
  2020-07-27 12:36 ` Richard Biener
@ 2020-07-27 12:52   ` Erick Ochoa
  2020-07-27 13:02     ` Jakub Jelinek
  2020-07-27 12:59   ` Christoph Müllner
  1 sibling, 1 reply; 8+ messages in thread
From: Erick Ochoa @ 2020-07-27 12:52 UTC (permalink / raw)
  To: Richard Biener
  Cc: GCC Development, Christoph Müllner, Philipp Tomsich,
	Tamar Christina, Kevin Smith, Martin Jambor, Jan Hubicka,
	Gary Oblock, Ashok Bhat, Martin Liška, Kyrylo Tkachov



On 27/07/2020 14:36, Richard Biener wrote:
> On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa
> <erick.ochoa@theobroma-systems.com> wrote:
>>
>> This patchset brings back struct reorg to GCC.
>>
>> We’ve been working on improving cache utilization recently and would
>> like to share our current implementation to receive some feedback on it.
>>
>> Essentially, we’ve implemented the following components:
>>
>>       Type-based escape analysis to determine if we can reorganize a type
>> at link-time
>>
>>       Dead-field elimination to remove unused fields of a struct at
>> link-time
>>
>> The type-based escape analysis provides a list of types, that are not
>> visible outside of the current linking unit (e.g. parameter types of
>> external functions).
>>
>> The dead-field elimination pass analyses non-escaping structs for fields
>> that are not used in the linking unit and thus can be removed. The
>> resulting struct has a smaller memory footprint, which allows for a
>> higher cache utilization.
>>
>> As a side-effect a couple of new infrastructure code has been written
>> (e.g. a type walker, which we were really missing in GCC), which can be
>> of course reused for other passes as well.
>>
>> We’ve prepared a patchset in the following branch:
>>
>>     refs/vendors/ARM/heads/arm-struct-reorg-wip
> 
> Just had some time to peek into this.  Ugh.  The code doesn't look like
> GCC code looks :/  It doesn't help to have one set of files per C++ class (25!).
> The code itself is undocumented - it's hard to understand what the purpose
> of all the Walker stuff is.
> 
> You also didn't seem to know walk_tree () nor walk_gimple* ().
> 
> Take as example - I figured to look for IPA pass entries, then I see
> 
> +
> +static void
> +collect_types ()
> +{
> +  GimpleTypeCollector collector;
> +  collector.walk ();
> +  collector.print_collected ();
> +  ptrset_t types = collector.get_pointer_set ();
> +  GimpleCaster caster (types);
> +  caster.walk ();
> +  if (flag_print_cast_analysis)
> +    caster.print_reasons ();
> +  ptrset_t casting = caster.get_sets ();
> +  fix_escaping_types_in_set (casting);
> +  GimpleAccesser accesser;
> +  accesser.walk ();
> +  if (flag_print_access_analysis)
> +    accesser.print_accesses ();
> +  record_field_map_t record_field_map = accesser.get_map ();
> +  TypeIncompleteEquality equality;
> +  bool has_fields_that_can_be_deleted = false;
> +  typedef std::set<unsigned> field_offsets_t;
> 
> there's no comments (not even file-level) that explains how type escape
> is computed.
> 
> Sorry, but this isn't even close to be coarsely reviewable.

Thanks Richard!

I will work on making this more readable. I understand your comments and 
you are right. I am currently in the process of writing a human-readable 
PDF with an overview of this. There is already a (somewhat hurried) 
version of this PDF in the GCC internals link we shared. This should 
answer most of the high level questions about how the type escape 
analysis is computed. However, yes, there is still work to be done. I 
will start looking into walk_tree and walk_gimple* and see if the 
gimple_walker I wrote is redundant and substitute it.

Thanks for the input!

> 
>> We’ve also added a subsection in the GCC internals document to allow
>> other compiler devs to better understand our design and implementation.
>> A generated PDF can be found here:
>>
>>      https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F
>>      https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download
>>
>>      page: 719
>>
>> We’ve been testing the pass against a range of in-tree tests and
>> real-life applications (e.g. all SPEC CPU2017 C benchmarks). For
>> testing, please see testing subsection in the gcc internals we prepared.
>>
>> Currently we see the following limitations:
>>
>> * It is not a "true" ipa pass yes. That is, we can only succeed with
>> -flto-partition=none.
>> * Currently it is not safe to use -fipa-sra.
>> * Brace constructors not supported now. We handle this gracefully.
>> * Only C as of now.
>> * Results of sizeof() and offsetof() are generated in the compiler
>> frontend and thus can’t be changed later at link time. There are a
>> couple of ideas to resolve this, but that’s currently unimplemented.
>> * At this point we’d like to thank the GCC community for their patient
>> help so far on the mailing list and in other channels. And we ask for
>> your support in terms of feedback, comments and testing.
>>
>> Thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: LTO Dead Field Elimination
  2020-07-27 12:36 ` Richard Biener
  2020-07-27 12:52   ` Erick Ochoa
@ 2020-07-27 12:59   ` Christoph Müllner
  2020-07-27 13:12     ` Richard Biener
  2020-07-27 13:15     ` David Edelsohn
  1 sibling, 2 replies; 8+ messages in thread
From: Christoph Müllner @ 2020-07-27 12:59 UTC (permalink / raw)
  To: Richard Biener, Erick Ochoa
  Cc: GCC Development, Philipp Tomsich, Tamar Christina, Kevin Smith,
	Martin Jambor, Jan Hubicka, Gary Oblock, Ashok Bhat,
	Martin Liška, Kyrylo Tkachov

Hi Richard,

On 7/27/20 2:36 PM, Richard Biener wrote:
> On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa
> <erick.ochoa@theobroma-systems.com> wrote:
>>
>> This patchset brings back struct reorg to GCC.
>>
>> We’ve been working on improving cache utilization recently and would
>> like to share our current implementation to receive some feedback on it.
>>
>> Essentially, we’ve implemented the following components:
>>
>>      Type-based escape analysis to determine if we can reorganize a type
>> at link-time
>>
>>      Dead-field elimination to remove unused fields of a struct at
>> link-time
>>
>> The type-based escape analysis provides a list of types, that are not
>> visible outside of the current linking unit (e.g. parameter types of
>> external functions).
>>
>> The dead-field elimination pass analyses non-escaping structs for fields
>> that are not used in the linking unit and thus can be removed. The
>> resulting struct has a smaller memory footprint, which allows for a
>> higher cache utilization.
>>
>> As a side-effect a couple of new infrastructure code has been written
>> (e.g. a type walker, which we were really missing in GCC), which can be
>> of course reused for other passes as well.
>>
>> We’ve prepared a patchset in the following branch:
>>
>>    refs/vendors/ARM/heads/arm-struct-reorg-wip
> 
> Just had some time to peek into this.  Ugh.  The code doesn't look like
> GCC code looks :/  It doesn't help to have one set of files per C++ class (25!).

Any suggestions how to best structure these?
Are there some coding guidelines in the GCC project,
which can help us to match the expectation?

> The code itself is undocumented - it's hard to understand what the purpose
> of all the Walker stuff is.
> 
> You also didn't seem to know walk_tree () nor walk_gimple* ().

True, we were not aware of that code.
Thanks for pointing to that code.
We will have a look.

> Take as example - I figured to look for IPA pass entries, then I see
> 
> +
> +static void
> +collect_types ()
> +{
> +  GimpleTypeCollector collector;
> +  collector.walk ();
> +  collector.print_collected ();
> +  ptrset_t types = collector.get_pointer_set ();
> +  GimpleCaster caster (types);
> +  caster.walk ();
> +  if (flag_print_cast_analysis)
> +    caster.print_reasons ();
> +  ptrset_t casting = caster.get_sets ();
> +  fix_escaping_types_in_set (casting);
> +  GimpleAccesser accesser;
> +  accesser.walk ();
> +  if (flag_print_access_analysis)
> +    accesser.print_accesses ();
> +  record_field_map_t record_field_map = accesser.get_map ();
> +  TypeIncompleteEquality equality;
> +  bool has_fields_that_can_be_deleted = false;
> +  typedef std::set<unsigned> field_offsets_t;
> 
> there's no comments (not even file-level) that explains how type escape
> is computed.
> 
> Sorry, but this isn't even close to be coarsely reviewable.

Sad to hear.
We'll work on the input that you provided and provide a new version.

Thanks,
Christoph

> 
>> We’ve also added a subsection in the GCC internals document to allow
>> other compiler devs to better understand our design and implementation.
>> A generated PDF can be found here:
>>
>>     https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F
>>     https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download
>>
>>     page: 719
>>
>> We’ve been testing the pass against a range of in-tree tests and
>> real-life applications (e.g. all SPEC CPU2017 C benchmarks). For
>> testing, please see testing subsection in the gcc internals we prepared.
>>
>> Currently we see the following limitations:
>>
>> * It is not a "true" ipa pass yes. That is, we can only succeed with
>> -flto-partition=none.
>> * Currently it is not safe to use -fipa-sra.
>> * Brace constructors not supported now. We handle this gracefully.
>> * Only C as of now.
>> * Results of sizeof() and offsetof() are generated in the compiler
>> frontend and thus can’t be changed later at link time. There are a
>> couple of ideas to resolve this, but that’s currently unimplemented.
>> * At this point we’d like to thank the GCC community for their patient
>> help so far on the mailing list and in other channels. And we ask for
>> your support in terms of feedback, comments and testing.
>>
>> Thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: LTO Dead Field Elimination
  2020-07-27 12:52   ` Erick Ochoa
@ 2020-07-27 13:02     ` Jakub Jelinek
  0 siblings, 0 replies; 8+ messages in thread
From: Jakub Jelinek @ 2020-07-27 13:02 UTC (permalink / raw)
  To: Erick Ochoa
  Cc: Richard Biener, Kevin Smith, GCC Development, Gary Oblock,
	Ashok Bhat, Philipp Tomsich, Jan Hubicka, Christoph Müllner

On Mon, Jul 27, 2020 at 02:52:21PM +0200, Erick Ochoa wrote:
> I will work on making this more readable. I understand your comments and you
> are right. I am currently in the process of writing a human-readable PDF
> with an overview of this. There is already a (somewhat hurried) version of
> this PDF in the GCC internals link we shared. This should answer most of the
> high level questions about how the type escape analysis is computed.
> However, yes, there is still work to be done. I will start looking into
> walk_tree and walk_gimple* and see if the gimple_walker I wrote is redundant
> and substitute it.

The overview doesn't belong in some on the side document, but should be in
the source (usually at the start of the source file that contains the pass).
And then each function should be documented.

I'd also like to stress what has been seen several times in the past, but is
being not taken into account.  One of the reasons why old ipa struct reorg
has been removed is that it worked on the types granularity, if it is safe
to adjust all references to all objects of a certain type, then do it,
otherwise give up.  That is fundamentally wrong, it can't be useful except
by pure luck on some benchmark.
Instead, it should work on the object (or sets of those) granularity.
If the analysis phase determines all accesses to certain object (automatic
or global variable, or e.g. memory allocation) can be adjusted by using a
different type for it (with either fields reordered, or some removed etc.),
then it should be optimized regardless of whether other objects can be
optimized similarly or not.  In other cases, it will not be possible to
optimize a single object on its own, e.g. some function is called with
an address of one of 10 objects, so in that case put those 10 objects into a
set and either optimize those together if possible, or punt on those
together.  But only very seldom all objects of certain type can be optimized
the same way, and even if they do, perhaps it is better to optimize some of
them even further (e.g. you find none of the objects of type XYZ use field
abc, and decide to adjust the type and remove the field.  But perhaps 50% of
objects of that type also don't use def field and if those were optimized
separately, you could optimize that too).

	Jakub


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: LTO Dead Field Elimination
  2020-07-27 12:59   ` Christoph Müllner
@ 2020-07-27 13:12     ` Richard Biener
  2020-07-27 13:15     ` David Edelsohn
  1 sibling, 0 replies; 8+ messages in thread
From: Richard Biener @ 2020-07-27 13:12 UTC (permalink / raw)
  To: Christoph Müllner
  Cc: Erick Ochoa, GCC Development, Philipp Tomsich, Tamar Christina,
	Kevin Smith, Martin Jambor, Jan Hubicka, Gary Oblock, Ashok Bhat,
	Martin Liška, Kyrylo Tkachov

On Mon, Jul 27, 2020 at 2:59 PM Christoph Müllner
<christoph.muellner@theobroma-systems.com> wrote:
>
> Hi Richard,
>
> On 7/27/20 2:36 PM, Richard Biener wrote:
> > On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa
> > <erick.ochoa@theobroma-systems.com> wrote:
> >>
> >> This patchset brings back struct reorg to GCC.
> >>
> >> We’ve been working on improving cache utilization recently and would
> >> like to share our current implementation to receive some feedback on it.
> >>
> >> Essentially, we’ve implemented the following components:
> >>
> >>      Type-based escape analysis to determine if we can reorganize a type
> >> at link-time
> >>
> >>      Dead-field elimination to remove unused fields of a struct at
> >> link-time
> >>
> >> The type-based escape analysis provides a list of types, that are not
> >> visible outside of the current linking unit (e.g. parameter types of
> >> external functions).
> >>
> >> The dead-field elimination pass analyses non-escaping structs for fields
> >> that are not used in the linking unit and thus can be removed. The
> >> resulting struct has a smaller memory footprint, which allows for a
> >> higher cache utilization.
> >>
> >> As a side-effect a couple of new infrastructure code has been written
> >> (e.g. a type walker, which we were really missing in GCC), which can be
> >> of course reused for other passes as well.
> >>
> >> We’ve prepared a patchset in the following branch:
> >>
> >>    refs/vendors/ARM/heads/arm-struct-reorg-wip
> >
> > Just had some time to peek into this.  Ugh.  The code doesn't look like
> > GCC code looks :/  It doesn't help to have one set of files per C++ class (25!).
>
> Any suggestions how to best structure these?

As "bad" as it sounds, put everything into one file (maybe separate out
type escape analysis from the actual transform).  Add a toplevel comment
per file explaining things.

> Are there some coding guidelines in the GCC project,
> which can help us to match the expectation?

Look at existing passes, otherwise there's mostly conventions on
formatting.

> > The code itself is undocumented - it's hard to understand what the purpose
> > of all the Walker stuff is.
> >
> > You also didn't seem to know walk_tree () nor walk_gimple* ().
>
> True, we were not aware of that code.
> Thanks for pointing to that code.
> We will have a look.
>
> > Take as example - I figured to look for IPA pass entries, then I see
> >
> > +
> > +static void
> > +collect_types ()
> > +{
> > +  GimpleTypeCollector collector;
> > +  collector.walk ();
> > +  collector.print_collected ();
> > +  ptrset_t types = collector.get_pointer_set ();
> > +  GimpleCaster caster (types);
> > +  caster.walk ();
> > +  if (flag_print_cast_analysis)
> > +    caster.print_reasons ();
> > +  ptrset_t casting = caster.get_sets ();
> > +  fix_escaping_types_in_set (casting);
> > +  GimpleAccesser accesser;
> > +  accesser.walk ();
> > +  if (flag_print_access_analysis)
> > +    accesser.print_accesses ();
> > +  record_field_map_t record_field_map = accesser.get_map ();
> > +  TypeIncompleteEquality equality;
> > +  bool has_fields_that_can_be_deleted = false;
> > +  typedef std::set<unsigned> field_offsets_t;
> >
> > there's no comments (not even file-level) that explains how type escape
> > is computed.
> >
> > Sorry, but this isn't even close to be coarsely reviewable.
>
> Sad to hear.
> We'll work on the input that you provided and provide a new version.
>
> Thanks,
> Christoph
>
> >
> >> We’ve also added a subsection in the GCC internals document to allow
> >> other compiler devs to better understand our design and implementation.
> >> A generated PDF can be found here:
> >>
> >>     https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F
> >>     https://cloud.theobroma-systems.com/s/aWwxPiDJ3nCgc7F/download
> >>
> >>     page: 719
> >>
> >> We’ve been testing the pass against a range of in-tree tests and
> >> real-life applications (e.g. all SPEC CPU2017 C benchmarks). For
> >> testing, please see testing subsection in the gcc internals we prepared.
> >>
> >> Currently we see the following limitations:
> >>
> >> * It is not a "true" ipa pass yes. That is, we can only succeed with
> >> -flto-partition=none.
> >> * Currently it is not safe to use -fipa-sra.
> >> * Brace constructors not supported now. We handle this gracefully.
> >> * Only C as of now.
> >> * Results of sizeof() and offsetof() are generated in the compiler
> >> frontend and thus can’t be changed later at link time. There are a
> >> couple of ideas to resolve this, but that’s currently unimplemented.
> >> * At this point we’d like to thank the GCC community for their patient
> >> help so far on the mailing list and in other channels. And we ask for
> >> your support in terms of feedback, comments and testing.
> >>
> >> Thanks!

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: LTO Dead Field Elimination
  2020-07-27 12:59   ` Christoph Müllner
  2020-07-27 13:12     ` Richard Biener
@ 2020-07-27 13:15     ` David Edelsohn
  1 sibling, 0 replies; 8+ messages in thread
From: David Edelsohn @ 2020-07-27 13:15 UTC (permalink / raw)
  To: Christoph Müllner
  Cc: Richard Biener, Erick Ochoa, Kevin Smith, GCC Development,
	Gary Oblock, Ashok Bhat, Philipp Tomsich, Jan Hubicka

On Mon, Jul 27, 2020 at 9:03 AM Christoph Müllner
<christoph.muellner@theobroma-systems.com> wrote:
>
> Hi Richard,
>
> On 7/27/20 2:36 PM, Richard Biener wrote:
> > On Fri, Jul 24, 2020 at 5:43 PM Erick Ochoa
> > <erick.ochoa@theobroma-systems.com> wrote:
> >>
> >> This patchset brings back struct reorg to GCC.
> >>
> >> We’ve been working on improving cache utilization recently and would
> >> like to share our current implementation to receive some feedback on it.
> >>
> >> Essentially, we’ve implemented the following components:
> >>
> >>      Type-based escape analysis to determine if we can reorganize a type
> >> at link-time
> >>
> >>      Dead-field elimination to remove unused fields of a struct at
> >> link-time
> >>
> >> The type-based escape analysis provides a list of types, that are not
> >> visible outside of the current linking unit (e.g. parameter types of
> >> external functions).
> >>
> >> The dead-field elimination pass analyses non-escaping structs for fields
> >> that are not used in the linking unit and thus can be removed. The
> >> resulting struct has a smaller memory footprint, which allows for a
> >> higher cache utilization.
> >>
> >> As a side-effect a couple of new infrastructure code has been written
> >> (e.g. a type walker, which we were really missing in GCC), which can be
> >> of course reused for other passes as well.
> >>
> >> We’ve prepared a patchset in the following branch:
> >>
> >>    refs/vendors/ARM/heads/arm-struct-reorg-wip
> >
> > Just had some time to peek into this.  Ugh.  The code doesn't look like
> > GCC code looks :/  It doesn't help to have one set of files per C++ class (25!).
>
> Any suggestions how to best structure these?
> Are there some coding guidelines in the GCC project,
> which can help us to match the expectation?

Thanks for working on this feature in GCC.  The coding standards for
the GNU Project and for GCC are:

https://www.gnu.org/prep/standards/
https://gcc.gnu.org/codingconventions.html

As others have mentioned, the documentation should be in the source
code itself.  An external design document is nice, but the code should
seemlessly integrate with the rest of the GCC source code, not a
separate, self-contained set of files bolted onto the side.

Thanks, David

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: LTO Dead Field Elimination
  2020-07-24 15:43 LTO Dead Field Elimination Erick Ochoa
  2020-07-27 12:36 ` Richard Biener
@ 2020-07-27 15:11 ` David Brown
  1 sibling, 0 replies; 8+ messages in thread
From: David Brown @ 2020-07-27 15:11 UTC (permalink / raw)
  To: Erick Ochoa, GCC Development
  Cc: Kevin Smith, Gary Oblock, Ashok Bhat, Philipp Tomsich,
	Jan Hubicka, Christoph Müllner

On 24/07/2020 17:43, Erick Ochoa wrote:
> This patchset brings back struct reorg to GCC.
> 
> We’ve been working on improving cache utilization recently and would
> like to share our current implementation to receive some feedback on it.
> 
> Essentially, we’ve implemented the following components:
> 
>     Type-based escape analysis to determine if we can reorganize a type
> at link-time
> 
>     Dead-field elimination to remove unused fields of a struct at link-time
> 
> The type-based escape analysis provides a list of types, that are not
> visible outside of the current linking unit (e.g. parameter types of
> external functions).
> 
> The dead-field elimination pass analyses non-escaping structs for fields
> that are not used in the linking unit and thus can be removed. The
> resulting struct has a smaller memory footprint, which allows for a
> higher cache utilization.
> 

I am very much a lurker on this list, and not a gcc developer - I am
just an enthusiastic user.  So my opinion here might not even be worth
the apocryphal two cents, and I won't feel insulted if someone says I
don't know what I am talking about here.

With that disclaimer out of the way, my immediate reaction to this idea
is "Why?".

What is the aim of this feature?  I can see it making many things a lot
more complicated, but I can't see it being of correspondingly
significant benefit.

Do you have reason to suppose that there really are many structs in use
in real code, for which there are fields that aren't used, and for which
removing those fields makes a /significant/ improvement to the
efficiency of the code?

If I have the history correct, gcc used to have an optimisation that
would let it re-order fields in a struct to reduce padding.  This got
removed in later versions because it made things so complicated in other
parts of compilation and linking, especially if some parts of the code
are compiled with different options.  Why would these new features not
suffer from exactly the same kinds of issues?

I worry that this kind of global optimisation has so many knock-on
effects that it simply is not worth the cost.  Remember that it is not
just in writing these patches that work must be done - people are going
to have to maintain the code for ever after, and it could mean changes
or limitations in other parts of gcc.

As I see it, there is a simpler path with much lower cost.  When I am
writing code, if I have poor struct layout or unnecessary fields, I
don't want the compiler to re-organise things behind my back.  I want to
be /told/.  Either I want the field there (perhaps I haven't written the
code that uses it yet), or I've made a mistake in my code.

My recommendation here would be to keep the analysis part - that could
be useful for other features too.  But drop the optimisation part -
replace it with a warning.  Tell the programmer, and let /them/ decide
whether or not the field is unnecessary.  Let the programmer make the
decision and the change - then you get all the benefits of the
optimisation with none of the risks or complications.

David


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-07-27 15:11 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-24 15:43 LTO Dead Field Elimination Erick Ochoa
2020-07-27 12:36 ` Richard Biener
2020-07-27 12:52   ` Erick Ochoa
2020-07-27 13:02     ` Jakub Jelinek
2020-07-27 12:59   ` Christoph Müllner
2020-07-27 13:12     ` Richard Biener
2020-07-27 13:15     ` David Edelsohn
2020-07-27 15:11 ` David Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).