public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c/97307] New: Optimization for pure vs. const function
@ 2020-10-06 16:46 benjamin.meier70 at gmail dot com
  2020-10-07  6:49 ` [Bug tree-optimization/97307] " rguenth at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: benjamin.meier70 at gmail dot com @ 2020-10-06 16:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97307

            Bug ID: 97307
           Summary: Optimization for pure vs. const function
           Product: gcc
           Version: 10.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: benjamin.meier70 at gmail dot com
  Target Milestone: ---

This bug report is based on this stackoverflow post:
https://stackoverflow.com/q/64034889/916672

The source code I use in this post, is also available here:
https://gcc.godbolt.org/z/dGvxnv

Given this C source code:

> int pure_f(int a, int b) __attribute__((pure));
> 
> int const_f(int a, int b) __attribute__((const));
> 
> int my_f(int a, int b) {
>     int x = pure_f(a, b);
>     if (a > 0) {
>         return x;
>     }
>     return a;
> }

If this is compiled with gcc with -O3, I would expect that the evaluation of
pure_f(a, b) is moved into the if. But it is not done:

> my_f(int, int):
>         push    r12
>         mov     r12d, edi
>         call    pure_f(int, int)
>         test    r12d, r12d
>         cmovg   r12d, eax
>         mov     eax, r12d
>         pop     r12
>         ret

On the other side, if const_f is called instead of pure_f, it is moved into the
if:


> my_f(int, int):
>         test    edi, edi
>         jg      .L4
>         mov     eax, edi
>         ret
> .L4:
>         jmp     const_f(int, int)


Why isn't this optimization applied for a pure function? From my understanding,
this should also be possible and it seems to be beneficial.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/97307] Optimization for pure vs. const function
  2020-10-06 16:46 [Bug c/97307] New: Optimization for pure vs. const function benjamin.meier70 at gmail dot com
@ 2020-10-07  6:49 ` rguenth at gcc dot gnu.org
  2020-10-07  8:44 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-10-07  6:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97307

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
          Component|middle-end                  |tree-optimization
     Ever confirmed|0                           |1
                 CC|                            |rguenth at gcc dot gnu.org
   Last reconfirmed|                            |2020-10-07

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
It's because we hit

      /* If this is a load then do not sink past any stores.
         ???  This is overly simple but cheap.  We basically look
         for an existing load with the same VUSE in the path to one
         of the sink candidate blocks and we adjust commondom to the
         nearest to commondom.  */
      if (gimple_vuse (stmt))
        {
...
          imm_use_iterator imm_iter;
          use_operand_p use_p;
          basic_block found = NULL;
          FOR_EACH_IMM_USE_FAST (use_p, imm_iter, gimple_vuse (stmt))
            {
              gimple *use_stmt = USE_STMT (use_p);
              basic_block bb = gimple_bb (use_stmt);
              /* For PHI nodes the block we know sth about
                 is the incoming block with the use.  */
              if (gimple_code (use_stmt) == GIMPLE_PHI)
                bb = EDGE_PRED (bb, PHI_ARG_INDEX_FROM_USE (use_p))->src;
              /* Any dominator of commondom would be ok with
                 adjusting commondom to that block.  */
              bb = nearest_common_dominator (CDI_DOMINATORS, bb, commondom);
              if (!found)
                found = bb;

where we do not ignore stmt itself (nor stmts in its same block).  We also
do not ignore the function return which directs us to BB2 as well.  The
heuristic looks quite misguided to me ...

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/97307] Optimization for pure vs. const function
  2020-10-06 16:46 [Bug c/97307] New: Optimization for pure vs. const function benjamin.meier70 at gmail dot com
  2020-10-07  6:49 ` [Bug tree-optimization/97307] " rguenth at gcc dot gnu.org
@ 2020-10-07  8:44 ` rguenth at gcc dot gnu.org
  2020-10-07 14:51 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-10-07  8:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97307

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org
             Status|NEW                         |ASSIGNED

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Testing a patch.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/97307] Optimization for pure vs. const function
  2020-10-06 16:46 [Bug c/97307] New: Optimization for pure vs. const function benjamin.meier70 at gmail dot com
  2020-10-07  6:49 ` [Bug tree-optimization/97307] " rguenth at gcc dot gnu.org
  2020-10-07  8:44 ` rguenth at gcc dot gnu.org
@ 2020-10-07 14:51 ` rguenth at gcc dot gnu.org
  2020-10-07 14:55 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-10-07 14:51 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97307

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
The patch will cause

FAIL: gcc.dg/vect/pr65947-3.c scan-tree-dump-times vect "LOOP VECTORIZED" 2

since the testcase has exactly such a pattern:

unsigned int
condition_reduction (unsigned int *a, unsigned int min_v, unsigned int *b)
{
  unsigned int last = N + 65;
  unsigned int aval;

  for (unsigned int i = 0; i < N; i++)
    {
      aval = a[i];
      if (b[i] < min_v)
        last = aval;
    }
  return last;
}

and sinking the aval = a[i] load causes if-conversion to no longer happen
because a[i] now appears to only conditionally trap.  It will be vectorized
with masked loads if those are available though, so I'm considering to
XFAIL it for the moment.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/97307] Optimization for pure vs. const function
  2020-10-06 16:46 [Bug c/97307] New: Optimization for pure vs. const function benjamin.meier70 at gmail dot com
                   ` (2 preceding siblings ...)
  2020-10-07 14:51 ` rguenth at gcc dot gnu.org
@ 2020-10-07 14:55 ` cvs-commit at gcc dot gnu.org
  2020-10-07 14:56 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2020-10-07 14:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97307

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:dae673abd37d400408959497e50fe1f3fbef5533

commit r11-3705-gdae673abd37d400408959497e50fe1f3fbef5533
Author: Richard Biener <rguenther@suse.de>
Date:   Wed Oct 7 10:42:12 2020 +0200

    tree-optimization/97307 - improve sinking of loads

    This improves the heuristics finding a sink location for loads that does
    not cross any store.

    2020-10-07  Richard Biener  <rguenther@suse.de>

            PR tree-optimization/97307
            * tree-ssa-sink.c (statement_sink_location): Change heuristic
            for not skipping stores to look for virtual definitions
            rather than uses.

            * gcc.dg/tree-ssa/ssa-sink-17.c: New testcase.
            * gcc.dg/vect/pr65947-3.c: XFAIL.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/97307] Optimization for pure vs. const function
  2020-10-06 16:46 [Bug c/97307] New: Optimization for pure vs. const function benjamin.meier70 at gmail dot com
                   ` (3 preceding siblings ...)
  2020-10-07 14:55 ` cvs-commit at gcc dot gnu.org
@ 2020-10-07 14:56 ` rguenth at gcc dot gnu.org
  2020-10-07 15:04 ` benjamin.meier70 at gmail dot com
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2020-10-07 14:56 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97307

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
      Known to work|                            |11.0
         Resolution|---                         |FIXED

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
Fixed on trunk.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/97307] Optimization for pure vs. const function
  2020-10-06 16:46 [Bug c/97307] New: Optimization for pure vs. const function benjamin.meier70 at gmail dot com
                   ` (4 preceding siblings ...)
  2020-10-07 14:56 ` rguenth at gcc dot gnu.org
@ 2020-10-07 15:04 ` benjamin.meier70 at gmail dot com
  2020-10-07 15:30 ` jakub at gcc dot gnu.org
  2020-10-07 17:27 ` rguenther at suse dot de
  7 siblings, 0 replies; 9+ messages in thread
From: benjamin.meier70 at gmail dot com @ 2020-10-07 15:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97307

--- Comment #6 from Benjamin B. Meier <benjamin.meier70 at gmail dot com> ---
Thanks for the super quick reaction:)!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/97307] Optimization for pure vs. const function
  2020-10-06 16:46 [Bug c/97307] New: Optimization for pure vs. const function benjamin.meier70 at gmail dot com
                   ` (5 preceding siblings ...)
  2020-10-07 15:04 ` benjamin.meier70 at gmail dot com
@ 2020-10-07 15:30 ` jakub at gcc dot gnu.org
  2020-10-07 17:27 ` rguenther at suse dot de
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2020-10-07 15:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97307

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> The patch will cause
> 
> FAIL: gcc.dg/vect/pr65947-3.c scan-tree-dump-times vect "LOOP VECTORIZED" 2
> 
> since the testcase has exactly such a pattern:
> 
> unsigned int
> condition_reduction (unsigned int *a, unsigned int min_v, unsigned int *b)
> {
>   unsigned int last = N + 65;
>   unsigned int aval;
> 
>   for (unsigned int i = 0; i < N; i++)
>     {
>       aval = a[i];
>       if (b[i] < min_v)
>         last = aval;
>     }
>   return last;
> }
> 
> and sinking the aval = a[i] load causes if-conversion to no longer happen
> because a[i] now appears to only conditionally trap.  It will be vectorized
> with masked loads if those are available though, so I'm considering to
> XFAIL it for the moment.

Isn't that an important real-world case though?  I mean, people who care a lot
about the vectorization often change code from the loads in the conditionals to
preloading the value before the condition, such that it can be vectorized and
now it won't (and masked loads are less common among architectures that can
vectorize such loops than non-masked ones).
So shouldn't we just sink that way calls before vectorization and after
vectorization everything?  Or have some way to tell ifcvt it can undo the
sinking.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/97307] Optimization for pure vs. const function
  2020-10-06 16:46 [Bug c/97307] New: Optimization for pure vs. const function benjamin.meier70 at gmail dot com
                   ` (6 preceding siblings ...)
  2020-10-07 15:30 ` jakub at gcc dot gnu.org
@ 2020-10-07 17:27 ` rguenther at suse dot de
  7 siblings, 0 replies; 9+ messages in thread
From: rguenther at suse dot de @ 2020-10-07 17:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97307

--- Comment #8 from rguenther at suse dot de <rguenther at suse dot de> ---
On October 7, 2020 5:30:14 PM GMT+02:00, "jakub at gcc dot gnu.org"
<gcc-bugzilla@gcc.gnu.org> wrote:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97307
>
>Jakub Jelinek <jakub at gcc dot gnu.org> changed:
>
>           What    |Removed                     |Added
>----------------------------------------------------------------------------
>               CC|                            |jakub at gcc dot gnu.org
>
>--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
>(In reply to Richard Biener from comment #3)
>> The patch will cause
>> 
>> FAIL: gcc.dg/vect/pr65947-3.c scan-tree-dump-times vect "LOOP
>VECTORIZED" 2
>> 
>> since the testcase has exactly such a pattern:
>> 
>> unsigned int
>> condition_reduction (unsigned int *a, unsigned int min_v, unsigned
>int *b)
>> {
>>   unsigned int last = N + 65;
>>   unsigned int aval;
>> 
>>   for (unsigned int i = 0; i < N; i++)
>>     {
>>       aval = a[i];
>>       if (b[i] < min_v)
>>         last = aval;
>>     }
>>   return last;
>> }
>> 
>> and sinking the aval = a[i] load causes if-conversion to no longer
>happen
>> because a[i] now appears to only conditionally trap.  It will be
>vectorized
>> with masked loads if those are available though, so I'm considering
>to
>> XFAIL it for the moment.
>
>Isn't that an important real-world case though?  I mean, people who
>care a lot
>about the vectorization often change code from the loads in the
>conditionals to
>preloading the value before the condition, such that it can be
>vectorized and
>now it won't (and masked loads are less common among architectures that
>can
>vectorize such loops than non-masked ones).
>So shouldn't we just sink that way calls before vectorization and after
>vectorization everything?  Or have some way to tell ifcvt it can undo
>the
>sinking.

It is a general issue we might want to address. Either by if converting this
earlier (phiopt, had some patches with the min/max stuff I dropped the ball on)
or by doing sinking only after loop opts /vectorization (like we moved
predcom). That we didn't sink here is an artifact of the testcase surroundings,
not the actual kernel.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-10-07 17:27 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-06 16:46 [Bug c/97307] New: Optimization for pure vs. const function benjamin.meier70 at gmail dot com
2020-10-07  6:49 ` [Bug tree-optimization/97307] " rguenth at gcc dot gnu.org
2020-10-07  8:44 ` rguenth at gcc dot gnu.org
2020-10-07 14:51 ` rguenth at gcc dot gnu.org
2020-10-07 14:55 ` cvs-commit at gcc dot gnu.org
2020-10-07 14:56 ` rguenth at gcc dot gnu.org
2020-10-07 15:04 ` benjamin.meier70 at gmail dot com
2020-10-07 15:30 ` jakub at gcc dot gnu.org
2020-10-07 17:27 ` rguenther at suse dot de

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).