public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)'
@ 2021-03-15 18:31 Thomas Schwinge
  2021-03-15 19:17 ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Schwinge @ 2021-03-15 18:31 UTC (permalink / raw)
  To: gcc

Hi!

First time I'm using this API -- so the error certainly may be on my
side.  ;-)

What I'm doing, is a 'walk_gimple_seq', and in that one's
'callback_stmt', call 'walk_stmt_load_store_addr_ops', to collect
variable load/store/address-taken instances.  This did seem quite
straight-forward, given the description; 'gcc/gimple-walk.c':

    /* For the statement STMT call the callbacks VISIT_LOAD, VISIT_STORE and
       VISIT_ADDR if non-NULL on loads, store and address-taken operands
       passing the STMT, the base of the operand, the operand itself containing
       the base and DATA to it.  The base will be either a decl, an indirect
       reference (including TARGET_MEM_REF) or the argument of an address
       expression.
       Returns the results of these callbacks or'ed.  */

    bool
    walk_stmt_load_store_addr_ops (gimple *stmt, void *data,
                                   walk_stmt_load_store_addr_fn visit_load,
                                   walk_stmt_load_store_addr_fn visit_store,
                                   walk_stmt_load_store_addr_fn visit_addr)
    { [...] }

Indeed, given (Fortran) 'zzz = 1', we produce GIMPLE:

    gimple_assign <real_cst, zzz, 1.0e+0, NULL, NULL>

..., and calling 'walk_stmt_load_store_addr_ops' on that, I see, as
expected, the 'visit_store' callback invoked, with 'rhs' and 'arg':
'<var_decl zzz>'.

However, given (Fortran) 'zzz = r + r2', we produce GIMPLE:

    gimple_assign <plus_expr, zzz, r, r2, NULL>

..., and calling 'walk_stmt_load_store_addr_ops' on that, I see,
unexpectedly, no callback at all invoked: neither 'visit_load', nor
'visit_store' (nor 'visit_address', obviously).

From a quick look at 'gcc/gimple-walk.c:walk_stmt_load_store_addr_ops',
this seems to intentionally be implemented in this way -- but I don't
understand the rationale?


Instead of 'walk_gimple_seq' -> 'callback_stmt' ->
'walk_stmt_load_store_addr_ops', do I need to use 'walk_gimple_seq' ->
'callback_op' -> "something"?


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)'
  2021-03-15 18:31 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)' Thomas Schwinge
@ 2021-03-15 19:17 ` Richard Biener
  2021-03-16 15:02   ` Thomas Schwinge
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2021-03-15 19:17 UTC (permalink / raw)
  To: gcc, Thomas Schwinge

On March 15, 2021 7:31:46 PM GMT+01:00, Thomas Schwinge <thomas@codesourcery.com> wrote:
>Hi!
>
>First time I'm using this API -- so the error certainly may be on my
>side.  ;-)
>
>What I'm doing, is a 'walk_gimple_seq', and in that one's
>'callback_stmt', call 'walk_stmt_load_store_addr_ops', to collect
>variable load/store/address-taken instances.  This did seem quite
>straight-forward, given the description; 'gcc/gimple-walk.c':
>
>/* For the statement STMT call the callbacks VISIT_LOAD, VISIT_STORE
>and
>      VISIT_ADDR if non-NULL on loads, store and address-taken operands
>passing the STMT, the base of the operand, the operand itself
>containing
>  the base and DATA to it.  The base will be either a decl, an indirect
>     reference (including TARGET_MEM_REF) or the argument of an address
>       expression.
>       Returns the results of these callbacks or'ed.  */
>
>    bool
>    walk_stmt_load_store_addr_ops (gimple *stmt, void *data,
>                               walk_stmt_load_store_addr_fn visit_load,
>                              walk_stmt_load_store_addr_fn visit_store,
>                               walk_stmt_load_store_addr_fn visit_addr)
>    { [...] }
>
>Indeed, given (Fortran) 'zzz = 1', we produce GIMPLE:
>
>    gimple_assign <real_cst, zzz, 1.0e+0, NULL, NULL>
>
>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see, as
>expected, the 'visit_store' callback invoked, with 'rhs' and 'arg':
>'<var_decl zzz>'.
>
>However, given (Fortran) 'zzz = r + r2', we produce GIMPLE:
>
>    gimple_assign <plus_expr, zzz, r, r2, NULL>
>
>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see,
>unexpectedly, no callback at all invoked: neither 'visit_load', nor
>'visit_store' (nor 'visit_address', obviously).

The variables involved are registers. You only get called on memory operands. 

From a quick look at 'gcc/gimple-walk.c:walk_stmt_load_store_addr_ops',
>this seems to intentionally be implemented in this way -- but I don't
>understand the rationale?
>
>
>Instead of 'walk_gimple_seq' -> 'callback_stmt' ->
>'walk_stmt_load_store_addr_ops', do I need to use 'walk_gimple_seq' ->
>'callback_op' -> "something"?

Yes, if you want to visit register sets and uses as a well. Note you'll also see constants that way. 

Richard. 

>
>Grüße
> Thomas
>-----------------
>Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München
>Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung,
>Frank Thürauf


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)'
  2021-03-15 19:17 ` Richard Biener
@ 2021-03-16 15:02   ` Thomas Schwinge
  2021-03-16 15:16     ` Richard Biener
  2021-03-16 15:25     ` Michael Matz
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Schwinge @ 2021-03-16 15:02 UTC (permalink / raw)
  To: Richard Biener, gcc

Hi!

Richard, thanks for your answer.  I'll need to look into this more; two
questions already:

On 2021-03-15T20:17:17+0100, Richard Biener via Gcc <gcc@gcc.gnu.org> wrote:
> On March 15, 2021 7:31:46 PM GMT+01:00, Thomas Schwinge <thomas@codesourcery.com> wrote:
>>First time I'm using this API -- so the error certainly may be on my
>>side.  ;-)
>>
>>What I'm doing, is a 'walk_gimple_seq', and in that one's
>>'callback_stmt', call 'walk_stmt_load_store_addr_ops', to collect
>>variable load/store/address-taken instances.  This did seem quite
>>straight-forward, given the description; 'gcc/gimple-walk.c':
>>
>>/* For the statement STMT call the callbacks VISIT_LOAD, VISIT_STORE
>>and
>>      VISIT_ADDR if non-NULL on loads, store and address-taken operands
>>passing the STMT, the base of the operand, the operand itself
>>containing
>>  the base and DATA to it.  The base will be either a decl, an indirect
>>     reference (including TARGET_MEM_REF) or the argument of an address
>>       expression.
>>       Returns the results of these callbacks or'ed.  */
>>
>>    bool
>>    walk_stmt_load_store_addr_ops (gimple *stmt, void *data,
>>                               walk_stmt_load_store_addr_fn visit_load,
>>                              walk_stmt_load_store_addr_fn visit_store,
>>                               walk_stmt_load_store_addr_fn visit_addr)
>>    { [...] }
>>
>>Indeed, given (Fortran) 'zzz = 1', we produce GIMPLE:
>>
>>    gimple_assign <real_cst, zzz, 1.0e+0, NULL, NULL>
>>
>>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see, as
>>expected, the 'visit_store' callback invoked, with 'rhs' and 'arg':
>>'<var_decl zzz>'.
>>
>>However, given (Fortran) 'zzz = r + r2', we produce GIMPLE:
>>
>>    gimple_assign <plus_expr, zzz, r, r2, NULL>
>>
>>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see,
>>unexpectedly, no callback at all invoked: neither 'visit_load', nor
>>'visit_store' (nor 'visit_address', obviously).
>
> The variables involved are registers. You only get called on memory operands.

How would I have told that from the 'walk_stmt_load_store_addr_ops'
function description?  (How to improve that one "to reflect relatity"?)

But 'zzz' surely is the same in 'zzz = 1' vs. 'zzz = r + r2' -- for the
former I *do* see the 'visit_store' callback invoked, for the latter I
don't?

>From a quick look at 'gcc/gimple-walk.c:walk_stmt_load_store_addr_ops',
>>this seems to intentionally be implemented in this way -- but I don't
>>understand the rationale?
>>
>>
>>Instead of 'walk_gimple_seq' -> 'callback_stmt' ->
>>'walk_stmt_load_store_addr_ops', do I need to use 'walk_gimple_seq' ->
>>'callback_op' -> "something"?
>
> Yes, if you want to visit register sets and uses as a well. Note you'll also see constants that way.

I'll look into that; in particular to figure out "something" for what I
need: load/store/address-taken.


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)'
  2021-03-16 15:02   ` Thomas Schwinge
@ 2021-03-16 15:16     ` Richard Biener
  2021-03-16 15:25     ` Michael Matz
  1 sibling, 0 replies; 10+ messages in thread
From: Richard Biener @ 2021-03-16 15:16 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: GCC Development

On Tue, Mar 16, 2021 at 4:02 PM Thomas Schwinge <thomas@codesourcery.com> wrote:
>
> Hi!
>
> Richard, thanks for your answer.  I'll need to look into this more; two
> questions already:
>
> On 2021-03-15T20:17:17+0100, Richard Biener via Gcc <gcc@gcc.gnu.org> wrote:
> > On March 15, 2021 7:31:46 PM GMT+01:00, Thomas Schwinge <thomas@codesourcery.com> wrote:
> >>First time I'm using this API -- so the error certainly may be on my
> >>side.  ;-)
> >>
> >>What I'm doing, is a 'walk_gimple_seq', and in that one's
> >>'callback_stmt', call 'walk_stmt_load_store_addr_ops', to collect
> >>variable load/store/address-taken instances.  This did seem quite
> >>straight-forward, given the description; 'gcc/gimple-walk.c':
> >>
> >>/* For the statement STMT call the callbacks VISIT_LOAD, VISIT_STORE
> >>and
> >>      VISIT_ADDR if non-NULL on loads, store and address-taken operands
> >>passing the STMT, the base of the operand, the operand itself
> >>containing
> >>  the base and DATA to it.  The base will be either a decl, an indirect
> >>     reference (including TARGET_MEM_REF) or the argument of an address
> >>       expression.
> >>       Returns the results of these callbacks or'ed.  */
> >>
> >>    bool
> >>    walk_stmt_load_store_addr_ops (gimple *stmt, void *data,
> >>                               walk_stmt_load_store_addr_fn visit_load,
> >>                              walk_stmt_load_store_addr_fn visit_store,
> >>                               walk_stmt_load_store_addr_fn visit_addr)
> >>    { [...] }
> >>
> >>Indeed, given (Fortran) 'zzz = 1', we produce GIMPLE:
> >>
> >>    gimple_assign <real_cst, zzz, 1.0e+0, NULL, NULL>
> >>
> >>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see, as
> >>expected, the 'visit_store' callback invoked, with 'rhs' and 'arg':
> >>'<var_decl zzz>'.
> >>
> >>However, given (Fortran) 'zzz = r + r2', we produce GIMPLE:
> >>
> >>    gimple_assign <plus_expr, zzz, r, r2, NULL>
> >>
> >>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see,
> >>unexpectedly, no callback at all invoked: neither 'visit_load', nor
> >>'visit_store' (nor 'visit_address', obviously).
> >
> > The variables involved are registers. You only get called on memory operands.
>
> How would I have told that from the 'walk_stmt_load_store_addr_ops'
> function description?  (How to improve that one "to reflect relatity"?)

Hmm, from the function name which mentions 'load' and 'store'? ;)

For example we have

static inline bool
gimple_store_p (const gimple *gs)
{
  tree lhs = gimple_get_lhs (gs);
  return lhs && !is_gimple_reg (lhs);
}


> But 'zzz' surely is the same in 'zzz = 1' vs. 'zzz = r + r2' -- for the
> former I *do* see the 'visit_store' callback invoked, for the latter I
> don't?
>
> >>From a quick look at 'gcc/gimple-walk.c:walk_stmt_load_store_addr_ops',
> >>this seems to intentionally be implemented in this way -- but I don't
> >>understand the rationale?
> >>
> >>
> >>Instead of 'walk_gimple_seq' -> 'callback_stmt' ->
> >>'walk_stmt_load_store_addr_ops', do I need to use 'walk_gimple_seq' ->
> >>'callback_op' -> "something"?
> >
> > Yes, if you want to visit register sets and uses as a well. Note you'll also see constants that way.
>
> I'll look into that; in particular to figure out "something" for what I
> need: load/store/address-taken.

Well, indeed.  The question is what you are doing the exercise for ;)
(and thus what you actually need visited)

Richard.

>
> Grüße
>  Thomas
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)'
  2021-03-16 15:02   ` Thomas Schwinge
  2021-03-16 15:16     ` Richard Biener
@ 2021-03-16 15:25     ` Michael Matz
  2021-03-16 23:24       ` Thomas Schwinge
  2021-03-17  7:42       ` Richard Biener
  1 sibling, 2 replies; 10+ messages in thread
From: Michael Matz @ 2021-03-16 15:25 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Richard Biener, gcc

Hello,

On Tue, 16 Mar 2021, Thomas Schwinge wrote:

> >>Indeed, given (Fortran) 'zzz = 1', we produce GIMPLE:
> >>
> >>    gimple_assign <real_cst, zzz, 1.0e+0, NULL, NULL>
> >>
> >>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see, as
> >>expected, the 'visit_store' callback invoked, with 'rhs' and 'arg':
> >>'<var_decl zzz>'.
> >>
> >>However, given (Fortran) 'zzz = r + r2', we produce GIMPLE:
> >>
> >>    gimple_assign <plus_expr, zzz, r, r2, NULL>

But that's pre-ssa form.  After writing into SSA 'zzz' will be replaced by 
an SSA name, and the actual store into 'zzz' will happen in a store 
instruction.

> >>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see,
> >>unexpectedly, no callback at all invoked: neither 'visit_load', nor
> >>'visit_store' (nor 'visit_address', obviously).
> >
> > The variables involved are registers. You only get called on memory operands.
> 
> How would I have told that from the 'walk_stmt_load_store_addr_ops'
> function description?  (How to improve that one "to reflect relatity"?)
> 
> But 'zzz' surely is the same in 'zzz = 1' vs. 'zzz = r + r2' -- for the
> former I *do* see the 'visit_store' callback invoked, for the latter I
> don't?

The walk_gimple functions are intended to be used on the SSA form of 
gimple (i.e. the one that it is in most of the time).  And in that it's 
not the case that 'zzz = 1' and 'zzz = r + r2' are similar.  The former 
can have memory as the lhs (that includes static variables, or indirection 
through pointers), the latter can not.  The lhs of a binary statement is 
always an SSA name.  A write to an SSA name is not a store, which is why 
it's not walked for walk_stmt_load_store_addr_ops.

Maybe it helps to look at simple C examples:

% cat x.c
int zzz;
void foo(void) { zzz = 1; }
void bar(int i) { zzz = i + 1; }
% gcc -c x.c -fdump-tree-ssa-vops
% cat x.c.*ssa
foo ()
{
  <bb 2> :
  # .MEM_2 = VDEF <.MEM_1(D)>
  zzz = 1;
  # VUSE <.MEM_2>
  return;
}

bar (int i)
{
  int _1;

  <bb 2> :
  _1 = i_2(D) + 1;
  # .MEM_4 = VDEF <.MEM_3(D)>
  zzz = _1;
  # VUSE <.MEM_4>
  return;

}

See how the instruction writing to zzz (a global, and hence memory) is 
going through a temporary for the addition in bar?  This will always be 
the case when the expression is arithmetic.

In SSA form gimple only very few instruction types can be stores, namely 
calls and stores like above (where the RHS is an unary tree).  If you want 
to capture writes into SSA names as well (which are more appropriately 
thought of as 'setting the ssa name' or 'associating the ssa name with the 
rhs value') you need the per-operand callback indeed.  But that depends on 
what you actually want to do.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)'
  2021-03-16 15:25     ` Michael Matz
@ 2021-03-16 23:24       ` Thomas Schwinge
  2021-03-17  7:46         ` Richard Biener
  2021-03-17  7:42       ` Richard Biener
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Schwinge @ 2021-03-16 23:24 UTC (permalink / raw)
  To: Michael Matz; +Cc: Richard Biener, gcc, Jakub Jelinek, Frederik Harwath

Hi!

Thanks, Michael, and again Richard for your quick responses.

On 2021-03-16T15:25:10+0000, Michael Matz <matz@suse.de> wrote:
> On Tue, 16 Mar 2021, Thomas Schwinge wrote:
>
>> >>Indeed, given (Fortran) 'zzz = 1', we produce GIMPLE:
>> >>
>> >>    gimple_assign <real_cst, zzz, 1.0e+0, NULL, NULL>
>> >>
>> >>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see, as
>> >>expected, the 'visit_store' callback invoked, with 'rhs' and 'arg':
>> >>'<var_decl zzz>'.
>> >>
>> >>However, given (Fortran) 'zzz = r + r2', we produce GIMPLE:
>> >>
>> >>    gimple_assign <plus_expr, zzz, r, r2, NULL>
>
> But that's pre-ssa form.  After writing into SSA 'zzz' will be replaced by
> an SSA name, and the actual store into 'zzz' will happen in a store
> instruction.

Oh, I see, and...

>> >>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see,
>> >>unexpectedly, no callback at all invoked: neither 'visit_load', nor
>> >>'visit_store' (nor 'visit_address', obviously).
>> >
>> > The variables involved are registers. You only get called on memory operands.
>>
>> How would I have told that from the 'walk_stmt_load_store_addr_ops'
>> function description?  (How to improve that one "to reflect relatity"?)
>>
>> But 'zzz' surely is the same in 'zzz = 1' vs. 'zzz = r + r2' -- for the
>> former I *do* see the 'visit_store' callback invoked, for the latter I
>> don't?
>
> The walk_gimple functions are intended to be used on the SSA form of
> gimple (i.e. the one that it is in most of the time).

Yes, "most of the time", but actually not in my case: I'm doing stuff
right after gimplification (before OMP lowering)...  So that's the
"detail" I was missing -- sorry for not mentioning that right away.  :-|

As 'walk_gimple_[...]' are used during gimplification, OMP lowering,
supposedly they're fine to use in non-SSA form -- but evidently some of
the helper functions are not.  (Might there be a way to add
'gcc_checking_assert (gimple_in_ssa_p (cfun))' or similar for that?
Putting that into 'walk_stmt_load_store_addr_ops' triggers a lot, as
called from 'gcc/cgraphbuild.c:cgraph_node::record_stmt_references', for
example.)

> And in that it's
> not the case that 'zzz = 1' and 'zzz = r + r2' are similar.  The former
> can have memory as the lhs (that includes static variables, or indirection
> through pointers), the latter can not.  The lhs of a binary statement is
> always an SSA name.  A write to an SSA name is not a store, which is why
> it's not walked for walk_stmt_load_store_addr_ops.
>
> Maybe it helps to look at simple C examples: [...]

I see, many thanks for reminding me about these items!

> If you want
> to capture writes into SSA names as well ([...])
> you need the per-operand callback indeed.

What I actually need is loads from/uses of actual variables (and I didn't
see 'walk_stmt_load_store_addr_ops' give me these).

> But that depends on
> what you actually want to do.

This is a prototype/"initial hack" for a very simple implementation of
<https://gcc.gnu.org/PR90591> "Avoid unnecessary data transfer out of OMP
construct".  (... to be completed and posted later.)  So simple that it
will easily fail (gracefully, of course), but yet is effective for a lot
of real-world code:

    subroutine [...]
    [...]
    real([...])   xx        !temporary variable, for distance calculation
    [...]
    !$acc kernels pcopyin(x, zii) reduction(+:eva) ! implicit 'copy(xx)' for scalar used inside region; established during gimplification
          do 100 i=0,n-1
            evx=0.0d0
            do 90 j=0,n-1
              xx=abs(x(1,i)-x(1,j))
    [...]
    !$acc end kernels
    [...]
    ['xx' never used here]
    end subroutine [...]

Inside 'kernels', we'd like to automatically parallelize loops (we've
basically got that working; analysis by Graphite etc.), but the problem
is that given "implicit 'copy(xx)' for scalar used inside region", when
Graphite later looks at the outlined 'kernels' region's function, it must
assume that 'xx' is still live after the OpenACC 'kernels' construct --
and thus cannot treat it as a thread-private temporary, cannot
parallelize.

Now, walking each function backwards (!), I'm taking note of any
variables' uses, and if I reach an 'kernels' construct, but have not seen
a use of 'xx', I may then optimize 'copy(xx)' -> 'firstprivate(xx)',
enabling Graphite to do its thing.  (Other such optimizations may be
added later.)  (This is inspired by Jakub's commit
1a80d6b87d81c3f336ab199a901cf80ae349c335 "re PR tree-optimization/68128
(A huge regression in Parboil v2.5 OpenMP CUTCP test (2.5 times lower
performance))".)

I've now got a simple 'callback_op', which for '!is_lhs' looks at
'get_base_address ([op])', and if that 'var' is contained in the set of
current candidates (initialized per containg 'bind's, which we enter
first, even if walking a 'gimple_seq' backwards), removes that 'var' as a
candidate for such optimization.  (Plus some "details", of couse.)  This
seems to work fine, as far as I can tell.  :-)


Of course, the eventual IPA-based solution (see PR90591, PR94693, etc.)
will be much better -- but we need something now.


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)'
  2021-03-16 15:25     ` Michael Matz
  2021-03-16 23:24       ` Thomas Schwinge
@ 2021-03-17  7:42       ` Richard Biener
  2021-03-17 13:27         ` Michael Matz
  1 sibling, 1 reply; 10+ messages in thread
From: Richard Biener @ 2021-03-17  7:42 UTC (permalink / raw)
  To: Michael Matz; +Cc: Thomas Schwinge, GCC Development

On Tue, Mar 16, 2021 at 4:25 PM Michael Matz <matz@suse.de> wrote:
>
> Hello,
>
> On Tue, 16 Mar 2021, Thomas Schwinge wrote:
>
> > >>Indeed, given (Fortran) 'zzz = 1', we produce GIMPLE:
> > >>
> > >>    gimple_assign <real_cst, zzz, 1.0e+0, NULL, NULL>
> > >>
> > >>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see, as
> > >>expected, the 'visit_store' callback invoked, with 'rhs' and 'arg':
> > >>'<var_decl zzz>'.
> > >>
> > >>However, given (Fortran) 'zzz = r + r2', we produce GIMPLE:
> > >>
> > >>    gimple_assign <plus_expr, zzz, r, r2, NULL>
>
> But that's pre-ssa form.  After writing into SSA 'zzz' will be replaced by
> an SSA name, and the actual store into 'zzz' will happen in a store
> instruction.
>
> > >>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see,
> > >>unexpectedly, no callback at all invoked: neither 'visit_load', nor
> > >>'visit_store' (nor 'visit_address', obviously).
> > >
> > > The variables involved are registers. You only get called on memory operands.
> >
> > How would I have told that from the 'walk_stmt_load_store_addr_ops'
> > function description?  (How to improve that one "to reflect relatity"?)
> >
> > But 'zzz' surely is the same in 'zzz = 1' vs. 'zzz = r + r2' -- for the
> > former I *do* see the 'visit_store' callback invoked, for the latter I
> > don't?
>
> The walk_gimple functions are intended to be used on the SSA form of
> gimple (i.e. the one that it is in most of the time).

Actually they are fine to use pre-SSA.  They just even pre-SSA distinguish
between registers and memory.  That's what gimplification honors as well,
in 'zzz = r + r2'  all operands are registers, otherwise GIMPLE requires loads
and stores split into separate stmts not doing any computation.

It's just less obivous in the dumps (compared to SSA name dumping).

Richard.

>  And in that it's
> not the case that 'zzz = 1' and 'zzz = r + r2' are similar.  The former
> can have memory as the lhs (that includes static variables, or indirection
> through pointers), the latter can not.  The lhs of a binary statement is
> always an SSA name.  A write to an SSA name is not a store, which is why
> it's not walked for walk_stmt_load_store_addr_ops.
>
> Maybe it helps to look at simple C examples:
>
> % cat x.c
> int zzz;
> void foo(void) { zzz = 1; }
> void bar(int i) { zzz = i + 1; }
> % gcc -c x.c -fdump-tree-ssa-vops
> % cat x.c.*ssa
> foo ()
> {
>   <bb 2> :
>   # .MEM_2 = VDEF <.MEM_1(D)>
>   zzz = 1;
>   # VUSE <.MEM_2>
>   return;
> }
>
> bar (int i)
> {
>   int _1;
>
>   <bb 2> :
>   _1 = i_2(D) + 1;
>   # .MEM_4 = VDEF <.MEM_3(D)>
>   zzz = _1;
>   # VUSE <.MEM_4>
>   return;
>
> }
>
> See how the instruction writing to zzz (a global, and hence memory) is
> going through a temporary for the addition in bar?  This will always be
> the case when the expression is arithmetic.
>
> In SSA form gimple only very few instruction types can be stores, namely
> calls and stores like above (where the RHS is an unary tree).  If you want
> to capture writes into SSA names as well (which are more appropriately
> thought of as 'setting the ssa name' or 'associating the ssa name with the
> rhs value') you need the per-operand callback indeed.  But that depends on
> what you actually want to do.
>
>
> Ciao,
> Michael.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)'
  2021-03-16 23:24       ` Thomas Schwinge
@ 2021-03-17  7:46         ` Richard Biener
  2021-03-18 12:32           ` Thomas Schwinge
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2021-03-17  7:46 UTC (permalink / raw)
  To: Thomas Schwinge
  Cc: Michael Matz, GCC Development, Jakub Jelinek, Frederik Harwath

On Wed, Mar 17, 2021 at 12:25 AM Thomas Schwinge
<thomas@codesourcery.com> wrote:
>
> Hi!
>
> Thanks, Michael, and again Richard for your quick responses.
>
> On 2021-03-16T15:25:10+0000, Michael Matz <matz@suse.de> wrote:
> > On Tue, 16 Mar 2021, Thomas Schwinge wrote:
> >
> >> >>Indeed, given (Fortran) 'zzz = 1', we produce GIMPLE:
> >> >>
> >> >>    gimple_assign <real_cst, zzz, 1.0e+0, NULL, NULL>
> >> >>
> >> >>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see, as
> >> >>expected, the 'visit_store' callback invoked, with 'rhs' and 'arg':
> >> >>'<var_decl zzz>'.
> >> >>
> >> >>However, given (Fortran) 'zzz = r + r2', we produce GIMPLE:
> >> >>
> >> >>    gimple_assign <plus_expr, zzz, r, r2, NULL>
> >
> > But that's pre-ssa form.  After writing into SSA 'zzz' will be replaced by
> > an SSA name, and the actual store into 'zzz' will happen in a store
> > instruction.
>
> Oh, I see, and...
>
> >> >>..., and calling 'walk_stmt_load_store_addr_ops' on that, I see,
> >> >>unexpectedly, no callback at all invoked: neither 'visit_load', nor
> >> >>'visit_store' (nor 'visit_address', obviously).
> >> >
> >> > The variables involved are registers. You only get called on memory operands.
> >>
> >> How would I have told that from the 'walk_stmt_load_store_addr_ops'
> >> function description?  (How to improve that one "to reflect relatity"?)
> >>
> >> But 'zzz' surely is the same in 'zzz = 1' vs. 'zzz = r + r2' -- for the
> >> former I *do* see the 'visit_store' callback invoked, for the latter I
> >> don't?
> >
> > The walk_gimple functions are intended to be used on the SSA form of
> > gimple (i.e. the one that it is in most of the time).
>
> Yes, "most of the time", but actually not in my case: I'm doing stuff
> right after gimplification (before OMP lowering)...  So that's the
> "detail" I was missing -- sorry for not mentioning that right away.  :-|
>
> As 'walk_gimple_[...]' are used during gimplification, OMP lowering,
> supposedly they're fine to use in non-SSA form -- but evidently some of
> the helper functions are not.  (Might there be a way to add
> 'gcc_checking_assert (gimple_in_ssa_p (cfun))' or similar for that?
> Putting that into 'walk_stmt_load_store_addr_ops' triggers a lot, as
> called from 'gcc/cgraphbuild.c:cgraph_node::record_stmt_references', for
> example.)
>
> > And in that it's
> > not the case that 'zzz = 1' and 'zzz = r + r2' are similar.  The former
> > can have memory as the lhs (that includes static variables, or indirection
> > through pointers), the latter can not.  The lhs of a binary statement is
> > always an SSA name.  A write to an SSA name is not a store, which is why
> > it's not walked for walk_stmt_load_store_addr_ops.
> >
> > Maybe it helps to look at simple C examples: [...]
>
> I see, many thanks for reminding me about these items!
>
> > If you want
> > to capture writes into SSA names as well ([...])
> > you need the per-operand callback indeed.
>
> What I actually need is loads from/uses of actual variables (and I didn't
> see 'walk_stmt_load_store_addr_ops' give me these).
>
> > But that depends on
> > what you actually want to do.
>
> This is a prototype/"initial hack" for a very simple implementation of
> <https://gcc.gnu.org/PR90591> "Avoid unnecessary data transfer out of OMP
> construct".  (... to be completed and posted later.)  So simple that it
> will easily fail (gracefully, of course), but yet is effective for a lot
> of real-world code:
>
>     subroutine [...]
>     [...]
>     real([...])   xx        !temporary variable, for distance calculation
>     [...]
>     !$acc kernels pcopyin(x, zii) reduction(+:eva) ! implicit 'copy(xx)' for scalar used inside region; established during gimplification
>           do 100 i=0,n-1
>             evx=0.0d0
>             do 90 j=0,n-1
>               xx=abs(x(1,i)-x(1,j))
>     [...]
>     !$acc end kernels
>     [...]
>     ['xx' never used here]
>     end subroutine [...]
>
> Inside 'kernels', we'd like to automatically parallelize loops (we've
> basically got that working; analysis by Graphite etc.), but the problem
> is that given "implicit 'copy(xx)' for scalar used inside region", when
> Graphite later looks at the outlined 'kernels' region's function, it must
> assume that 'xx' is still live after the OpenACC 'kernels' construct --
> and thus cannot treat it as a thread-private temporary, cannot
> parallelize.
>
> Now, walking each function backwards (!), I'm taking note of any
> variables' uses, and if I reach an 'kernels' construct, but have not seen
> a use of 'xx', I may then optimize 'copy(xx)' -> 'firstprivate(xx)',
> enabling Graphite to do its thing.  (Other such optimizations may be
> added later.)  (This is inspired by Jakub's commit
> 1a80d6b87d81c3f336ab199a901cf80ae349c335 "re PR tree-optimization/68128
> (A huge regression in Parboil v2.5 OpenMP CUTCP test (2.5 times lower
> performance))".)
>
> I've now got a simple 'callback_op', which for '!is_lhs' looks at
> 'get_base_address ([op])', and if that 'var' is contained in the set of
> current candidates (initialized per containg 'bind's, which we enter
> first, even if walking a 'gimple_seq' backwards), removes that 'var' as a
> candidate for such optimization.  (Plus some "details", of couse.)  This
> seems to work fine, as far as I can tell.  :-)

It might then still fail for x = a[var] when you are interested in 'var'.

I think you want to use walk_gimple_stmt and provide walk_tree_fn which
will recurse into the complex tree operands (also making get_base_address
unnecessary).

>
> Of course, the eventual IPA-based solution (see PR90591, PR94693, etc.)
> will be much better -- but we need something now.
>
>
> Grüße
>  Thomas
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)'
  2021-03-17  7:42       ` Richard Biener
@ 2021-03-17 13:27         ` Michael Matz
  0 siblings, 0 replies; 10+ messages in thread
From: Michael Matz @ 2021-03-17 13:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: Thomas Schwinge, GCC Development

Hello,

On Wed, 17 Mar 2021, Richard Biener wrote:

> > The walk_gimple functions are intended to be used on the SSA form of 
> > gimple (i.e. the one that it is in most of the time).
> 
> Actually they are fine to use pre-SSA.

Structurally, sure.

> They just even pre-SSA distinguish between registers and memory.  

And that's of course the thing.

I probably should have used a different term, but used "SSA rewriting" to 
name the point where this distinction really starts to matter.  Before it 
a binary gimple statement could conceivably contain a non-register in the 
LHS (perhaps not right now, but there's nothing that would inherently 
break with that), and then would include a store that 
walk_stmt_load_store_addr_ops would "miss".

So, yeah, using SSA as name for that was sloppy, it's gimple itself that 
has the invariant of only registers in binary statements.


Ciao,
Michael.

> That's what gimplification honors as well, in 'zzz = r + r2' all 
> operands are registers, otherwise GIMPLE requires loads and stores split 
> into separate stmts not doing any computation.
> 
> It's just less obivous in the dumps (compared to SSA name dumping).
> 
> Richard.
> 
> >  And in that it's
> > not the case that 'zzz = 1' and 'zzz = r + r2' are similar.  The former
> > can have memory as the lhs (that includes static variables, or indirection
> > through pointers), the latter can not.  The lhs of a binary statement is
> > always an SSA name.  A write to an SSA name is not a store, which is why
> > it's not walked for walk_stmt_load_store_addr_ops.
> >
> > Maybe it helps to look at simple C examples:
> >
> > % cat x.c
> > int zzz;
> > void foo(void) { zzz = 1; }
> > void bar(int i) { zzz = i + 1; }
> > % gcc -c x.c -fdump-tree-ssa-vops
> > % cat x.c.*ssa
> > foo ()
> > {
> >   <bb 2> :
> >   # .MEM_2 = VDEF <.MEM_1(D)>
> >   zzz = 1;
> >   # VUSE <.MEM_2>
> >   return;
> > }
> >
> > bar (int i)
> > {
> >   int _1;
> >
> >   <bb 2> :
> >   _1 = i_2(D) + 1;
> >   # .MEM_4 = VDEF <.MEM_3(D)>
> >   zzz = _1;
> >   # VUSE <.MEM_4>
> >   return;
> >
> > }
> >
> > See how the instruction writing to zzz (a global, and hence memory) is
> > going through a temporary for the addition in bar?  This will always be
> > the case when the expression is arithmetic.
> >
> > In SSA form gimple only very few instruction types can be stores, namely
> > calls and stores like above (where the RHS is an unary tree).  If you want
> > to capture writes into SSA names as well (which are more appropriately
> > thought of as 'setting the ssa name' or 'associating the ssa name with the
> > rhs value') you need the per-operand callback indeed.  But that depends on
> > what you actually want to do.
> >
> >
> > Ciao,
> > Michael.
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)'
  2021-03-17  7:46         ` Richard Biener
@ 2021-03-18 12:32           ` Thomas Schwinge
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Schwinge @ 2021-03-18 12:32 UTC (permalink / raw)
  To: Richard Biener; +Cc: Michael Matz, gcc, Jakub Jelinek, Frederik Harwath

Hi!

On 2021-03-17T08:46:16+0100, Richard Biener <richard.guenther@gmail.com> wrote:
> On Wed, Mar 17, 2021 at 12:25 AM Thomas Schwinge
> <thomas@codesourcery.com> wrote:
>> This is a prototype/"initial hack" for a very simple implementation of
>> <https://gcc.gnu.org/PR90591> "Avoid unnecessary data transfer out of OMP
>> construct".  (... to be completed and posted later.)  So simple that it
>> will easily fail (gracefully, of course), but yet is effective for a lot
>> of real-world code:
>>
>>     subroutine [...]
>>     [...]
>>     real([...])   xx        !temporary variable, for distance calculation
>>     [...]
>>     !$acc kernels pcopyin(x, zii) reduction(+:eva) ! implicit 'copy(xx)' for scalar used inside region; established during gimplification
>>           do 100 i=0,n-1
>>             evx=0.0d0
>>             do 90 j=0,n-1
>>               xx=abs(x(1,i)-x(1,j))
>>     [...]
>>     !$acc end kernels
>>     [...]
>>     ['xx' never used here]
>>     end subroutine [...]
>>
>> Inside 'kernels', we'd like to automatically parallelize loops (we've
>> basically got that working; analysis by Graphite etc.), but the problem
>> is that given "implicit 'copy(xx)' for scalar used inside region", when
>> Graphite later looks at the outlined 'kernels' region's function, it must
>> assume that 'xx' is still live after the OpenACC 'kernels' construct --
>> and thus cannot treat it as a thread-private temporary, cannot
>> parallelize.
>>
>> Now, walking each function backwards (!), I'm taking note of any
>> variables' uses, and if I reach an 'kernels' construct, but have not seen
>> a use of 'xx', I may then optimize 'copy(xx)' -> 'firstprivate(xx)',
>> enabling Graphite to do its thing.  (Other such optimizations may be
>> added later.)  (This is inspired by Jakub's commit
>> 1a80d6b87d81c3f336ab199a901cf80ae349c335 "re PR tree-optimization/68128
>> (A huge regression in Parboil v2.5 OpenMP CUTCP test (2.5 times lower
>> performance))".)
>>
>> I've now got a simple 'callback_op', which for '!is_lhs' looks at
>> 'get_base_address ([op])', and if that 'var' is contained in the set of
>> current candidates (initialized per containg 'bind's, which we enter
>> first, even if walking a 'gimple_seq' backwards), removes that 'var' as a
>> candidate for such optimization.  (Plus some "details", of couse.)  This
>> seems to work fine, as far as I can tell.  :-)
>
> It might then still fail for x = a[var] when you are interested in 'var'.

That already works as expected:

    gimple_assign <array_ref, x, a[var], NULL, NULL>

..., and in the debug log ('OP' is 'get_base_address (OPERAND)'), I see:

    ##### LOOKING INTO OPERAND:  <array_ref 0x7ffff67af3b8>
    ###### OP:  <var_decl 0x7ffff7feeb40 a>
    ####### WASN'T CANDIDATE
    ##### LOOKING INTO OPERAND:  <var_decl 0x7ffff7feeb40 a>
    ###### OP:  <var_decl 0x7ffff7feeb40 a>
    ####### WASN'T CANDIDATE
    ##### LOOKING INTO OPERAND:  <var_decl 0x7ffff7feebd0 var>
    ###### OP:  <var_decl 0x7ffff7feebd0 var>
    ####### NO LONGER CANDIDATE
    ##### LOOKING INTO OPERAND:  <var_decl 0x7ffff7feec60 x>
    ###### IGNORED: IS_LHS

Notice 'var' "NO LONGER CANDIDATE".

Well, I cross-checked, and I'm getting this behavior (deep scanning)
because I'm (intentionally) using (default) '*walk_subtrees = 1' in my
'callback_op'.  Indeed if I specify '*walk_subtrees = 0', the debug log
changes as follows:

     ##### LOOKING INTO OPERAND:  <array_ref 0x7ffff67af3b8>
     ###### OP:  <var_decl 0x7ffff7feeb40 a>
     ####### WASN'T CANDIDATE
    -##### LOOKING INTO OPERAND:  <var_decl 0x7ffff7feeb40 a>
    -###### OP:  <var_decl 0x7ffff7feeb40 a>
    -####### WASN'T CANDIDATE
    -##### LOOKING INTO OPERAND:  <var_decl 0x7ffff7feebd0 var>
    -###### OP:  <var_decl 0x7ffff7feebd0 var>
    -####### NO LONGER CANDIDATE
     ##### LOOKING INTO OPERAND:  <var_decl 0x7ffff7feec60 x>
     ###### IGNORED: IS_LHS

..., and your projected mis-optimization:

    +##### OPTIMIZING map(force_tofrom:var [len: 4]) -> firstprivate(var)

If I continue to use (default) '*walk_subtrees = 1', then I might not
actually need to use 'get_base_address', because we get all the
individual sub operands visited.

> I think you want to use walk_gimple_stmt and provide walk_tree_fn which
> will recurse into the complex tree operands (also making get_base_address
> unnecessary).

I'm already using 'walk_gimple_stmt' via 'walk_gimple_seq', and I'll
(later) look into removing 'get_base_address', and also look up
'walk_tree_fn'.  Ah, wait, what you suggested with 'walk_tree_fn' is
actually what I've already got; in my last email I said "I've now got a
simple 'callback_op' [...]".  Seems we're converging!  ;-)


Grüße
 Thomas


>> Of course, the eventual IPA-based solution (see PR90591, PR94693, etc.)
>> will be much better -- but we need something now.
>>
>>
>> Grüße
>>  Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-03-18 12:32 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-15 18:31 'walk_stmt_load_store_addr_ops' for non-'gimple_assign_single_p (stmt)' Thomas Schwinge
2021-03-15 19:17 ` Richard Biener
2021-03-16 15:02   ` Thomas Schwinge
2021-03-16 15:16     ` Richard Biener
2021-03-16 15:25     ` Michael Matz
2021-03-16 23:24       ` Thomas Schwinge
2021-03-17  7:46         ` Richard Biener
2021-03-18 12:32           ` Thomas Schwinge
2021-03-17  7:42       ` Richard Biener
2021-03-17 13:27         ` Michael Matz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).