public inbox for gcc-patches@gcc.gnu.org
* [WIP] Re-introduce 'TREE_USED' in tree streaming
@ 2023-09-15  9:20 Thomas Schwinge
  2023-09-15 10:11 ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Schwinge @ 2023-09-15  9:20 UTC
  To: Richard Biener, Jan Hubicka, gcc-patches; +Cc: Tobias Burnus, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 2647 bytes --]

Hi!

Now, that was another quirky debug session: in
'gcc/omp-low.cc:create_omp_child_function' we clearly do set
'TREE_USED (t) = 1;' for '.omp_data_i', which ends up as formal parameter
for outlined '[...]._omp_fn.[...]' functions, pointing to the "OMP blob".
Yet, in offloading compilation, I only ever got '!TREE_USED' for the
formal parameter '.omp_data_i'.  This greatly disturbs an nvptx back end
expand-time transformation that I have implemented, which is active
'if (!TREE_USED ([formal parameter]))'.

After checking along all the host-side OMP handling, eventually (in
hindsight: "obvious"...) I found that, "simply", we're not streaming
'TREE_USED'!  With that changed (see attached
"Re-introduce 'TREE_USED' in tree streaming"; no visible changes in
x86_64-pc-linux-gnu and powerpc64le-unknown-linux-gnu 'make check'), my
issue was quickly addressed -- if not for the question *why* 'TREE_USED'
isn't streamed (..., and apparently, that's a problem only for my
case..?), and then I found that it's *intentionally been removed*
in one-decade-old commit ee03e71d472a3f73cbc1a132a284309f36565972
(Subversion r200151) "Re-write LTO type merging again, do tree merging".

At this point, I need help: is this OK to re-introduce unconditionally,
or in some conditionalized form (but, "ugh..."), or be done differently
altogether in the nvptx back end (is 'TREE_USED' considered "stale" at
some point in the compilation pipeline?), or do we need some logic in
tree stream read-in (?) to achieve the same thing that removing
'TREE_USED' streaming apparently did achieve, or yet something else?
Indeed, from a quick look, most use of 'TREE_USED' seems to be "early",
but I saw no reason that it couldn't be used "late", either?

Original discussion "not streaming and comparing TREE_USED":
<https://inbox.sourceware.org/alpine.LNX.2.00.1306131614000.26078@zhemvz.fhfr.qr>
"[RFC] Re-write LTO type merging again, do tree merging", continued
<https://inbox.sourceware.org/alpine.LNX.2.00.1306141240340.6998@zhemvz.fhfr.qr>
"Re-write LTO type merging again, do tree merging".


In 2013, offloading compilation was just around the corner --
<https://inbox.sourceware.org/1375103926.7129.7694.camel@triegel.csb>
"Summary of the Accelerator BOF at Cauldron" -- and you easily could've
foreseen this issue, no?  ;-P


Regards
 Thomas


-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

[-- Attachment #2: 0001-WIP-Re-introduce-TREE_USED-in-tree-streaming.patch --]
[-- Type: text/x-diff, Size: 2052 bytes --]

From cba6e4a8ec3b8718de7857b90d0137ae82f381fb Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 15 Sep 2023 00:14:13 +0200
Subject: [PATCH] [WIP] Re-introduce 'TREE_USED' in tree streaming

I have an nvptx back end expand-time transformation implemented that is active
'if (!TREE_USED ([formal parameter]))'.  Now I found that per one-decade-old
commit ee03e71d472a3f73cbc1a132a284309f36565972 (Subversion r200151)
"Re-write LTO type merging again, do tree merging", 'TREE_USED' has
*intentionally been removed* from tree streaming.  That means, in nvptx
offloading compilation, every formal parameter (like for outlined
'[...]._omp_fn.[...]' functions the one that's pointing to the "OMP blob",
'.omp_data_i', for example) is considered unused, and thus mis-optimized.
---
 gcc/tree-streamer-in.cc  | 1 +
 gcc/tree-streamer-out.cc | 1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/tree-streamer-in.cc b/gcc/tree-streamer-in.cc
index 5bead0c3c6a..f82374e60a5 100644
--- a/gcc/tree-streamer-in.cc
+++ b/gcc/tree-streamer-in.cc
@@ -132,6 +132,7 @@ unpack_ts_base_value_fields (struct bitpack_d *bp, tree expr)
     TYPE_ARTIFICIAL (expr) = (unsigned) bp_unpack_value (bp, 1);
   else
     TREE_NO_WARNING (expr) = (unsigned) bp_unpack_value (bp, 1);
+  TREE_USED (expr) = (unsigned) bp_unpack_value (bp, 1);
   TREE_NOTHROW (expr) = (unsigned) bp_unpack_value (bp, 1);
   TREE_STATIC (expr) = (unsigned) bp_unpack_value (bp, 1);
   if (TREE_CODE (expr) != TREE_BINFO)
diff --git a/gcc/tree-streamer-out.cc b/gcc/tree-streamer-out.cc
index ff9694e17dd..74f969478cf 100644
--- a/gcc/tree-streamer-out.cc
+++ b/gcc/tree-streamer-out.cc
@@ -105,6 +105,7 @@ pack_ts_base_value_fields (struct bitpack_d *bp, tree expr)
     bp_pack_value (bp, TYPE_ARTIFICIAL (expr), 1);
   else
     bp_pack_value (bp, TREE_NO_WARNING (expr), 1);
+  bp_pack_value (bp, TREE_USED (expr), 1);
   bp_pack_value (bp, TREE_NOTHROW (expr), 1);
   bp_pack_value (bp, TREE_STATIC (expr), 1);
   if (TREE_CODE (expr) != TREE_BINFO)
-- 
2.34.1



* Re: [WIP] Re-introduce 'TREE_USED' in tree streaming
  2023-09-15  9:20 [WIP] Re-introduce 'TREE_USED' in tree streaming Thomas Schwinge
@ 2023-09-15 10:11 ` Richard Biener
  2023-09-15 13:01   ` Thomas Schwinge
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2023-09-15 10:11 UTC
  To: Thomas Schwinge; +Cc: Jan Hubicka, gcc-patches, Tobias Burnus, Jakub Jelinek

On Fri, Sep 15, 2023 at 11:20 AM Thomas Schwinge
<thomas@codesourcery.com> wrote:
> At this point, I need help: is this OK to re-introduce unconditionally,
> or in some conditionalized form (but, "ugh..."), or be done differently
> altogether in the nvptx back end (is 'TREE_USED' considered "stale" at
> some point in the compilation pipeline?), or do we need some logic in
> tree stream read-in (?) to achieve the same thing that removing
> 'TREE_USED' streaming apparently did achieve, or yet something else?
> Indeed, from a quick look, most use of 'TREE_USED' seems to be "early",
> but I saw no reason that it couldn't be used "late", either?

TREE_USED is considered stale: it doesn't reflect reality and is used with
different semantics throughout the pass pipeline.  So it doesn't make much
sense to stream it, also because doing so would needlessly cause divergence
between TUs during tree merging.  We definitely do not want to stream
TREE_USED for every tree.
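
As a toy illustration of the merging concern (not GCC code: 'ToyDecl' and
'merged_count' are hypothetical names, and real tree merging is far more
involved), the sketch below treats two nodes as mergeable only if all of
their streamed fields compare equal.  If a per-TU "used" bit is part of what
gets streamed, otherwise identical declarations from two TUs no longer
unify.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Toy stand-in for a streamed tree node; this is NOT GCC's 'tree'.
struct ToyDecl {
  std::string name;
  std::string type;
  bool used;  // per-TU flag, only loosely analogous to TREE_USED
};

// Count the distinct nodes left after "merging": nodes whose streamed
// fields compare equal collapse into a single node.  When 'stream_used'
// is false, the flag is left out of the comparison, as if never streamed.
std::size_t merged_count(const std::vector<ToyDecl>& decls, bool stream_used) {
  std::set<std::tuple<std::string, std::string, bool>> unique;
  for (const ToyDecl& d : decls)
    unique.emplace(d.name, d.type, stream_used ? d.used : false);
  return unique.size();
}
```

With the same declaration marked used in one TU and unused in another, the
toy merger keeps two copies when the flag is streamed and one copy when it
is not.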

Why would you guard anything late on TREE_USED?  If you want to know
whether a formal parameter is "used" (used in code generation?  used in the
source?) you have to compute this property.  As you can see, using TREE_USED
is fragile.



* Re: [WIP] Re-introduce 'TREE_USED' in tree streaming
  2023-09-15 10:11 ` Richard Biener
@ 2023-09-15 13:01   ` Thomas Schwinge
  2023-09-15 13:05     ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Thomas Schwinge @ 2023-09-15 13:01 UTC
  To: Richard Biener; +Cc: Jan Hubicka, gcc-patches, Tobias Burnus, Jakub Jelinek

Hi!

On 2023-09-15T12:11:44+0200, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> TREE_USED is considered stale, it doesn't reflect reality and is used with
> different semantics throughout the pass pipeline

Aha, thanks.  Any suggestion about how to update 'gcc/tree.h:TREE_USED',
for next time, to detail at which stages the properties indicated there
are meaningful?  (..., and we shall also add some such comment in the two
tree streamer functions.)

> so it doesn't make much sense
> to stream it also because it will needlessly cause divergence between TUs
> during tree merging.

Right, that's what I'd assumed from quickly skimming the 2013 discussion.

> So we definitely do not want to stream TREE_USED for
> every tree.
>
> Why would you guard anything late on TREE_USED?  If you want to know
> whether a formal parameter is "used" (used in code generation?  used in the
> source?) you have to compute this property.  As you can see using TREE_USED
> is fragile.

The issue is: for function call outgoing/incoming arguments, the nvptx
back end has (to use) a mechanism different from usual targets.  For the
latter, the incoming arguments are readily available in registers or on
the stack, without requiring emission of any setup instructions.  For
nvptx, we have to generate boilerplate code for every function incoming
argument, to load the argument value into a local register.  (These
registers are then, at least for '-O0', spilled to and restored from the
stack frame before the first actual use -- if there's any use at all.)

This generates some bulky PTX code, which goes so far that we run into
timeout or OOM-killed 'ptxas' for 'gcc.c-torture/compile/limits-fndefn.c'
at '-O0', for example, where we've got half a million lines of
boilerplate PTX code.  That one certainly is a rogue test case, but I
then found that if I conditionalize emission of that incoming argument
setup code on 'TREE_USED' of the respective element of the chain of
'DECL_ARGUMENTS', then I do get the desired behavior: zero-instructions
'limits-fndefn.S'.  So this "late" use of 'TREE_USED' does work -- just
that, as discussed, 'TREE_USED' isn't available in the offloading
setting.  ;-)
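
The effect of conditionalizing the setup code can be sketched with a toy
model.  This is not the nvptx back end: 'ToyParam' and
'emit_incoming_arg_setup' are hypothetical names, and the emitted strings are
placeholders rather than real PTX.  It only shows how the amount of
boilerplate depends on the number of used parameters once unused ones are
skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy formal parameter; 'used' stands in for whatever "is this parameter
// used?" property the back end ends up consulting.
struct ToyParam {
  std::string name;
  bool used;
};

// Emit one placeholder setup line per incoming argument.  With
// 'skip_unused' set, parameters that are never used get no setup code.
std::vector<std::string> emit_incoming_arg_setup(
    const std::vector<ToyParam>& params, bool skip_unused) {
  std::vector<std::string> out;
  for (std::size_t i = 0; i < params.size(); ++i) {
    if (skip_unused && !params[i].used)
      continue;
    out.push_back("load incoming argument " + std::to_string(i) + " (" +
                  params[i].name + ") into a local register");
  }
  return out;
}
```

For a function with many parameters and an empty body, the skipping variant
emits nothing at all, which corresponds to the zero-instructions outcome
described above.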

I'll look into computing "unused" locally, before/for nvptx expand time.
(To make the '-O0' case work, I figure this has to happen early, instead
of later DCEing the mess that we generated earlier.)  Any quick
suggestions?  My naïve first idea would be to simply scan, in
'TARGET_FUNCTION_INCOMING_ARG', whether the corresponding element of
'DECL_ARGUMENTS' is used in the function, or maybe do that once for all
'DECL_ARGUMENTS' in 'INIT_CUMULATIVE_INCOMING_ARGS'.
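
A minimal sketch of the "do that once for all parameters" variant, again as
a toy model rather than GCC code: 'ToyStmt' and 'compute_param_used' are
hypothetical, and a real implementation would walk the function's actual IL
instead of lists of operand names.  The point is that a single pass over the
body answers the question for every parameter, instead of rescanning the body
once per parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Toy statement: just the names of the operands it references.
struct ToyStmt {
  std::vector<std::string> operands;
};

// Return, for each parameter (by position), whether any statement in the
// body references it.  The body is scanned exactly once.
std::vector<bool> compute_param_used(const std::vector<std::string>& params,
                                     const std::vector<ToyStmt>& body) {
  std::unordered_map<std::string, std::size_t> index;
  for (std::size_t i = 0; i < params.size(); ++i)
    index.emplace(params[i], i);

  std::vector<bool> used(params.size(), false);
  for (const ToyStmt& stmt : body)
    for (const std::string& op : stmt.operands) {
      auto it = index.find(op);
      if (it != index.end())
        used[it->second] = true;
    }
  return used;
}
```

The per-parameter variant would give the same answers but costs one body
scan per parameter, which matters for functions with very many parameters.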


Regards
 Thomas




* Re: [WIP] Re-introduce 'TREE_USED' in tree streaming
  2023-09-15 13:01   ` Thomas Schwinge
@ 2023-09-15 13:05     ` Richard Biener
  2023-09-15 13:10       ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Richard Biener @ 2023-09-15 13:05 UTC
  To: Thomas Schwinge; +Cc: Jan Hubicka, gcc-patches, Tobias Burnus, Jakub Jelinek

On Fri, Sep 15, 2023 at 3:01 PM Thomas Schwinge <thomas@codesourcery.com> wrote:
> I'll look into computing "unused" locally, before/for nvptx expand time.
> (To make the '-O0' case work, I figure this has to happen early, instead
> of later DCEing the mess that we generated earlier.)  Any quick
> suggestions?  My naïve first idea would be to simply in
> 'TARGET_FUNCTION_INCOMING_ARG' scan if the corresponding element of
> 'DECL_ARGUMENTS' is used in the function, or maybe do that once for all
> 'DECL_ARGUMENTS' in 'INIT_CUMULATIVE_INCOMING_ARGS'.

RTL expansion re-computes TREE_USED (well, it computes something into
it related to use), but it does so only for BLOCK scope variables and
local decls.  I suppose extending it to also re-compute TREE_USED for
formal parameters should be straightforward.

Richard.



* Re: [WIP] Re-introduce 'TREE_USED' in tree streaming
  2023-09-15 13:05     ` Richard Biener
@ 2023-09-15 13:10       ` Richard Biener
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2023-09-15 13:10 UTC
  To: Thomas Schwinge; +Cc: Jan Hubicka, gcc-patches, Tobias Burnus, Jakub Jelinek

On Fri, Sep 15, 2023 at 3:05 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, Sep 15, 2023 at 3:01 PM Thomas Schwinge <thomas@codesourcery.com> wrote:
> > I'll look into computing "unused" locally, before/for nvptx expand time.
> > (To make the '-O0' case work, I figure this has to happen early, instead
> > of later DCEing the mess that we generated earlier.)  Any quick
> > suggestions?  My naïve first idea would be to simply in
> > 'TARGET_FUNCTION_INCOMING_ARG' scan if the corresponding element of
> > 'DECL_ARGUMENTS' is used in the function, or maybe do that once for all
> > 'DECL_ARGUMENTS' in 'INIT_CUMULATIVE_INCOMING_ARGS'.
>
> RTL expansion re-computes TREE_USED (well, it computes something into
> it related to use), but it does so only for BLOCK scope variables and
> local decls.
> I suppose extending it to also re-compute TREE_USED for formal parameters
> should be straight-forward.

Btw, it does sound somewhat like premature optimization for the
limits-fndefn testcase, doesn't it?



end of thread, other threads:[~2023-09-15 13:10 UTC | newest]

Thread overview: 5+ messages
2023-09-15  9:20 [WIP] Re-introduce 'TREE_USED' in tree streaming Thomas Schwinge
2023-09-15 10:11 ` Richard Biener
2023-09-15 13:01   ` Thomas Schwinge
2023-09-15 13:05     ` Richard Biener
2023-09-15 13:10       ` Richard Biener
