public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
@ 2023-09-27 15:45 ` tnfchris at gcc dot gnu.org
  2023-09-28 17:48 ` tkoenig at gcc dot gnu.org
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2023-09-27 15:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

Tamar Christina <tnfchris at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tnfchris at gcc dot gnu.org,
                   |                            |toon at gcc dot gnu.org

--- Comment #6 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
This is the ticket I meant toon.

Do you or Thomas have any ideas how we can inline this?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
  2023-09-27 15:45 ` [Bug fortran/90608] Inline non-scalar minloc/maxloc calls tnfchris at gcc dot gnu.org
@ 2023-09-28 17:48 ` tkoenig at gcc dot gnu.org
  2023-10-11 11:48 ` mikael at gcc dot gnu.org
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 13+ messages in thread
From: tkoenig at gcc dot gnu.org @ 2023-09-28 17:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

Thomas Koenig <tkoenig at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mikael at gcc dot gnu.org

--- Comment #7 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #6)
> This is the ticket I meant toon.
> 
> Do you or Thomas have any ideas how we can inline this?

Two options, in principle.

One option is to extend use of the scalarizer for these cases. Mikael knows
this best, I am putting him in CC:

The other is to use front-end optimization and basically generate DO
loops, like we do for matmul.  A lot of the infrastructure is in place
in frontend-passes.cc, but there would be some restirctions because
it is not possible to put DO loops everywhere.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
  2023-09-27 15:45 ` [Bug fortran/90608] Inline non-scalar minloc/maxloc calls tnfchris at gcc dot gnu.org
  2023-09-28 17:48 ` tkoenig at gcc dot gnu.org
@ 2023-10-11 11:48 ` mikael at gcc dot gnu.org
  2023-10-11 12:09 ` tnfchris at gcc dot gnu.org
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 13+ messages in thread
From: mikael at gcc dot gnu.org @ 2023-10-11 11:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

--- Comment #8 from Mikael Morin <mikael at gcc dot gnu.org> ---
Created attachment 56091
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56091&action=edit
Rough patch

Here is a rough patch to make the scalarizer support minloc calls.
It regresses on minloc_1.f90 at least, but I haven't be able to pinpoint the
problem in the original tree dump so far.

The problem could be with the initialization of loop iteration variables. The
existing code used for scalar minloc was versioning loops, that is it was using
too loops in a row in some cases.  With scalar minloc, the initialization of
the loop variable could just be disabled in the second loop, but if there is
more than one dimension as in the array case, this can't work. So the patch
above initializes the loop variables conditionally on a "loop_break" boolean
variable, which I hoped the optimizers would be able to remove.  Unfortunately,
this conditional initialization seems to confuse the optimizers a lot.

Anyway, the patch is there; not sure how much I can pursue this further in the
future.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
                   ` (2 preceding siblings ...)
  2023-10-11 11:48 ` mikael at gcc dot gnu.org
@ 2023-10-11 12:09 ` tnfchris at gcc dot gnu.org
  2023-10-11 16:30 ` mikael at gcc dot gnu.org
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2023-10-11 12:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

--- Comment #9 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Mikael Morin from comment #8)
> Created attachment 56091 [details]
> Rough patch
> 
> Here is a rough patch to make the scalarizer support minloc calls.
> It regresses on minloc_1.f90 at least, but I haven't be able to pinpoint the
> problem in the original tree dump so far.
> 
> The problem could be with the initialization of loop iteration variables.
> The existing code used for scalar minloc was versioning loops, that is it
> was using too loops in a row in some cases.  With scalar minloc, the
> initialization of the loop variable could just be disabled in the second
> loop, but if there is more than one dimension as in the array case, this
> can't work. So the patch above initializes the loop variables conditionally
> on a "loop_break" boolean variable, which I hoped the optimizers would be
> able to remove.  Unfortunately, this conditional initialization seems to
> confuse the optimizers a lot.
> 
> Anyway, the patch is there; not sure how much I can pursue this further in
> the future.

Thanks Mikael!

That's already plenty of help! I can try to debug further after I finish my
current patches.  Would it be ok if I ask questions when I do?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
                   ` (3 preceding siblings ...)
  2023-10-11 12:09 ` tnfchris at gcc dot gnu.org
@ 2023-10-11 16:30 ` mikael at gcc dot gnu.org
  2023-10-12 11:41 ` mikael at gcc dot gnu.org
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 13+ messages in thread
From: mikael at gcc dot gnu.org @ 2023-10-11 16:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

--- Comment #10 from Mikael Morin <mikael at gcc dot gnu.org> ---
(In reply to Mikael Morin from comment #8)
> (...) that is it was using too loops in a row in some cases. 
> 
... *two* loops in a row ...


(In reply to Tamar Christina from comment #9)
> 
> Thanks Mikael!
> 
> That's already plenty of help! I can try to debug further after I finish my
> current patches.  Would it be ok if I ask questions when I do?

Yes, of course.

I forgot to precise that the patch above supports only calls without any
optional args.  Allowing non-DIM arguments should work, as the code supporting
them is already there for the scalar case.  For the DIM argument, it's a bit
more work.  I'm not sure, but maybe the scalarizer support for reductions (SUM
or PRODUCT with DIM arg) can be used to support it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
                   ` (4 preceding siblings ...)
  2023-10-11 16:30 ` mikael at gcc dot gnu.org
@ 2023-10-12 11:41 ` mikael at gcc dot gnu.org
  2023-10-16 10:30 ` tnfchris at gcc dot gnu.org
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 13+ messages in thread
From: mikael at gcc dot gnu.org @ 2023-10-12 11:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

Mikael Morin <mikael at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #56091|0                           |1
        is obsolete|                            |

--- Comment #11 from Mikael Morin <mikael at gcc dot gnu.org> ---
Created attachment 56094
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56094&action=edit
Improved patch

This improved patch (still single argument only) passes the fortran regression
testsuite.

(In reply to Mikael Morin from comment #8)
> It regresses on minloc_1.f90 at least, but I haven't be able to pinpoint the
> problem in the original tree dump so far.
> 
The problem was an initialization of the result to the first element of the
array that the patch removed, which seemed useless to me but made a difference
in the questionable case where the array argument is filled with nans.

> The problem could be with the initialization of loop iteration variables.
> (...)
> Unfortunately, this conditional initialization seems to
> confuse the optimizers a lot.
> 
On closer look, the conditional initialization doesn't seem to be that
confusing (at least in the problematic case), as it's removed early (ccp1) in
the pipeline.  The loop iteration variables remain initialized with phis, but
that's because of the loops.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
                   ` (5 preceding siblings ...)
  2023-10-12 11:41 ` mikael at gcc dot gnu.org
@ 2023-10-16 10:30 ` tnfchris at gcc dot gnu.org
  2023-10-16 12:49 ` mikael at gcc dot gnu.org
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2023-10-16 10:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

--- Comment #12 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Mikael Morin from comment #11)
> Created attachment 56094 [details]
> Improved patch
> 
> This improved patch (still single argument only) passes the fortran
> regression testsuite.
> 

Awesome! Thanks! it looks like the benchmark always uses dim=1 or the mask
argument.

Can you give a hint into what I'd need to do to add the additional params?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
                   ` (6 preceding siblings ...)
  2023-10-16 10:30 ` tnfchris at gcc dot gnu.org
@ 2023-10-16 12:49 ` mikael at gcc dot gnu.org
  2023-10-25 17:02 ` mikael at gcc dot gnu.org
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 13+ messages in thread
From: mikael at gcc dot gnu.org @ 2023-10-16 12:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

--- Comment #13 from Mikael Morin <mikael at gcc dot gnu.org> ---
(In reply to Tamar Christina from comment #12)
> (In reply to Mikael Morin from comment #11)
> > Created attachment 56094 [details]
> > Improved patch
> > 
> > This improved patch (still single argument only) passes the fortran
> > regression testsuite.
> > 
> 
> Awesome! Thanks! it looks like the benchmark always uses dim=1 or the mask
> argument.
> 
> Can you give a hint into what I'd need to do to add the additional params?

For the mask argument, I hope it would just need
gfc_inline_intrinsic_function_p to return true to work.

For the dim argument, it's a bit more complicated. You can have a look at how
the dim argument support for sum and product was introduced here:
https://gcc.gnu.org/pipermail/fortran/2011-October/037574.html
To describe in a few words the needed bits:
 - gfc_walk_inline_intrinsic_function needs to collect the arrays involved in
scalarization.  In the case where dim is absent, there is just the result of
minloc and it can't be decomposed further.  If dim is present on the other
hand, the arrays are the non-dim arguments, and there is one nested loop
reducing those arrays' dimension by one.  One important thing to pay attention
for, the arrays must be present in the same order they will be consumed by the
gfc_conv_intrinsic_minmaxloc later.
 - All the calls to gfc_walk_expr in gfc_conv_intrinsic_minmaxloc should be
disabled in favor enter_nested_loop.
 - Setting of gfc_se::ss pointers should be disabled, as they should come
correct from initialization.
 - The call to gfc_cleanup_loop should be disabled

These 4 points are very similar to the sum/product patch mentioned above.
Additionally, one has to disable the changes to support the other cases of
{min,max}loc.  Possibly it's easier to start with an unpatched master and merge
the patches afterwards.

Anyway, I should be able to help, maybe by the end of this week, or next week.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
                   ` (7 preceding siblings ...)
  2023-10-16 12:49 ` mikael at gcc dot gnu.org
@ 2023-10-25 17:02 ` mikael at gcc dot gnu.org
  2023-10-25 17:05 ` tnfchris at gcc dot gnu.org
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 13+ messages in thread
From: mikael at gcc dot gnu.org @ 2023-10-25 17:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

Mikael Morin <mikael at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #56094|0                           |1
        is obsolete|                            |

--- Comment #14 from Mikael Morin <mikael at gcc dot gnu.org> ---
Created attachment 56313
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56313&action=edit
inline minloc with mask

This patch adds support for {min,max}loc with mask.
It is not 100% testsuite clean as there are (runtime) error messages that
regress slightly for maxloc_bounds_{4,5,6,7}.f90


(In reply to Mikael Morin from comment #11)
> 
> > The problem could be with the initialization of loop iteration variables.
> > (...)
> > Unfortunately, this conditional initialization seems to
> > confuse the optimizers a lot.
> > 
> On closer look, the conditional initialization doesn't seem to be that
> confusing (at least in the problematic case), as it's removed early (ccp1)
> in the pipeline.  The loop iteration variables remain initialized with phis,
> but that's because of the loops.

Unfortunately, this is true for rank 1 arrays, but not for higher ranks.
Constant values are slowly propagated to the phi arguments as optimization
passes are run, but no simplification of the control flow happens as soon as
multiple loop levels are involved.

Need to look into the dim argument next.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
                   ` (8 preceding siblings ...)
  2023-10-25 17:02 ` mikael at gcc dot gnu.org
@ 2023-10-25 17:05 ` tnfchris at gcc dot gnu.org
  2023-11-26 19:04 ` mikael at gcc dot gnu.org
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2023-10-25 17:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

--- Comment #15 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Mikael Morin from comment #14)
> Created attachment 56313 [details]
> inline minloc with mask
> 
> This patch adds support for {min,max}loc with mask.

Awesome, thank you!

> It is not 100% testsuite clean as there are (runtime) error messages that
> regress slightly for maxloc_bounds_{4,5,6,7}.f90
> 
> 
> (In reply to Mikael Morin from comment #11)
> > 
> > > The problem could be with the initialization of loop iteration variables.
> > > (...)
> > > Unfortunately, this conditional initialization seems to
> > > confuse the optimizers a lot.
> > > 
> > On closer look, the conditional initialization doesn't seem to be that
> > confusing (at least in the problematic case), as it's removed early (ccp1)
> > in the pipeline.  The loop iteration variables remain initialized with phis,
> > but that's because of the loops.
> 
> Unfortunately, this is true for rank 1 arrays, but not for higher ranks.
> Constant values are slowly propagated to the phi arguments as optimization
> passes are run, but no simplification of the control flow happens as soon as
> multiple loop levels are involved.
> 
> Need to look into the dim argument next.

It's very much appreciated! this should help greatly! Sorry I hadn't reply to
the previous message. Finishing up some work for stage-1.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
                   ` (9 preceding siblings ...)
  2023-10-25 17:05 ` tnfchris at gcc dot gnu.org
@ 2023-11-26 19:04 ` mikael at gcc dot gnu.org
  2023-11-26 20:16 ` anlauf at gcc dot gnu.org
  2024-01-10 17:39 ` tnfchris at gcc dot gnu.org
  12 siblings, 0 replies; 13+ messages in thread
From: mikael at gcc dot gnu.org @ 2023-11-26 19:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

Mikael Morin <mikael at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |mikael at gcc dot gnu.org

--- Comment #16 from Mikael Morin <mikael at gcc dot gnu.org> ---
This missed the gcc stage 1 deadline, but I'm still working on it.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
                   ` (10 preceding siblings ...)
  2023-11-26 19:04 ` mikael at gcc dot gnu.org
@ 2023-11-26 20:16 ` anlauf at gcc dot gnu.org
  2024-01-10 17:39 ` tnfchris at gcc dot gnu.org
  12 siblings, 0 replies; 13+ messages in thread
From: anlauf at gcc dot gnu.org @ 2023-11-26 20:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

--- Comment #17 from anlauf at gcc dot gnu.org ---
(In reply to Mikael Morin from comment #16)
> This missed the gcc stage 1 deadline, but I'm still working on it.

I always thought that the Fortran FE does not fall under this rule.

Why don't you proceed, and let's discuss later whether your patch is
safe enough for gcc-14?

Or are you touching parts outside of gcc/fortran/ ?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug fortran/90608] Inline non-scalar minloc/maxloc calls
       [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
                   ` (11 preceding siblings ...)
  2023-11-26 20:16 ` anlauf at gcc dot gnu.org
@ 2024-01-10 17:39 ` tnfchris at gcc dot gnu.org
  12 siblings, 0 replies; 13+ messages in thread
From: tnfchris at gcc dot gnu.org @ 2024-01-10 17:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90608

--- Comment #18 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Mikael Morin from comment #16)
> This missed the gcc stage 1 deadline, but I'm still working on it.

Thanks Mikael!  If I can help with anything do let me know :)

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2024-01-10 17:39 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-90608-4@http.gcc.gnu.org/bugzilla/>
2023-09-27 15:45 ` [Bug fortran/90608] Inline non-scalar minloc/maxloc calls tnfchris at gcc dot gnu.org
2023-09-28 17:48 ` tkoenig at gcc dot gnu.org
2023-10-11 11:48 ` mikael at gcc dot gnu.org
2023-10-11 12:09 ` tnfchris at gcc dot gnu.org
2023-10-11 16:30 ` mikael at gcc dot gnu.org
2023-10-12 11:41 ` mikael at gcc dot gnu.org
2023-10-16 10:30 ` tnfchris at gcc dot gnu.org
2023-10-16 12:49 ` mikael at gcc dot gnu.org
2023-10-25 17:02 ` mikael at gcc dot gnu.org
2023-10-25 17:05 ` tnfchris at gcc dot gnu.org
2023-11-26 19:04 ` mikael at gcc dot gnu.org
2023-11-26 20:16 ` anlauf at gcc dot gnu.org
2024-01-10 17:39 ` tnfchris at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).