public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/104526] New: [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0)
@ 2022-02-14 10:07 theodort at inf dot ethz.ch
  2022-02-14 10:12 ` [Bug tree-optimization/104526] " rguenth at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: theodort at inf dot ethz.ch @ 2022-02-14 10:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104526

            Bug ID: 104526
           Summary: [12 Regression] Dead Code Elimination Regression at
                    -O3 (trunk vs. 11.2.0)
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: theodort at inf dot ethz.ch
  Target Milestone: ---

cat case.c #138750
void foo(void);

static int a, b = 1, *c = &b;
int main() {
  for (; a; a--) {
    int d = 2 >> (1 / *c);
    if (!d)
      foo();
  }
}

gcc-58aeb75d4097010ad9bb72b964265b18ab284f93 (trunk) -O3 cannot eliminate the
call to foo, but gcc-11.2.0 -O3 can.
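
To see why the call is dead, note that 1 / x for any nonzero int x is -1, 0 or 1, so the shift count is tiny and 2 >> count cannot reach 0 for the defined counts. The following is only a sketch of that reasoning, not anything taken from GCC: the helper names d_for_divisor and d_is_never_zero are hypothetical, and the cases that are undefined in C (division by zero, negative shift count) are reported as -1 rather than evaluated.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: the value the test case would assign to d for a
   given divisor.  Returns -1 when the C expression would be undefined
   (division by zero, or a negative shift count).  */
static int d_for_divisor(int divisor)
{
    if (divisor == 0)
        return -1;                      /* 1 / 0 is undefined */
    int count = 1 / divisor;            /* always -1, 0 or 1 */
    assert(count >= -1 && count <= 1);
    if (count < 0)
        return -1;                      /* 2 >> -1 is undefined */
    return 2 >> count;                  /* 2 >> 0 == 2, 2 >> 1 == 1 */
}

/* Checks a sample of divisors: no defined result is ever 0, so the
   "if (!d) foo();" branch cannot be taken.  Returns 1 on success.  */
static int d_is_never_zero(void)
{
    const int samples[] = { INT_MIN, -2, -1, 1, 2, 3, 1000, INT_MAX };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (d_for_divisor(samples[i]) == 0)
            return 0;
    }
    return 1;
}
```

In the test case itself *c is b, which is 1, so the count is 1 and d is 1.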

gcc-58aeb75d4097010ad9bb72b964265b18ab284f93 (trunk) -O3 -S -o /dev/stdout case.c
--------- OUTPUT ---------
main:
.LFB0:
        .cfi_startproc
        movl    a(%rip), %eax
        testl   %eax, %eax
        je      .L12
        pushq   %rbx
        .cfi_def_cfa_offset 16
        .cfi_offset 3, -16
        xorl    %ebx, %ebx
.L2:
        movl    b(%rip), %ecx
        leal    1(%rcx), %eax
        cmpl    $2, %eax
        movl    $2, %eax
        cmova   %ebx, %ecx
        sarl    %cl, %eax
        testl   %eax, %eax
        je      .L3
        movl    $0, a(%rip)
.L10:
        xorl    %eax, %eax
        popq    %rbx
        .cfi_remember_state
        .cfi_def_cfa_offset 8
        ret
        .p2align 4,,10
        .p2align 3
.L3:
        .cfi_restore_state
        call    foo
        subl    $1, a(%rip)
        jne     .L2
        jmp     .L10
.L12:
        .cfi_def_cfa_offset 8
        .cfi_restore 3
        xorl    %eax, %eax
        ret
---------- END OUTPUT ---------


gcc-11.2.0 -O3 -S -o /dev/stdout case.c
--------- OUTPUT ---------
main:
.LFB0:
        .cfi_startproc
        movl    a(%rip), %eax
        testl   %eax, %eax
        je      .L2
        movl    $0, a(%rip)
.L2:
        xorl    %eax, %eax
        ret
---------- END OUTPUT ---------


Bisects to:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=c2b610e7c6c89fd422c5c31f01023bcddf3cf4a5

----- Build information -----
----- 58aeb75d4097010ad9bb72b964265b18ab284f93 (trunk)
Target: x86_64-pc-linux-gnu
Configured with: ../configure --disable-multilib --disable-bootstrap --enable-languages=c,c++
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 12.0.1 20220213 (experimental) (GCC)

----- releases/gcc-11.2.0
Target: x86_64-pc-linux-gnu
Configured with: ../configure --disable-multilib --disable-bootstrap --enable-languages=c,c++
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.2.0 (GCC)


* [Bug tree-optimization/104526] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0)
  2022-02-14 10:07 [Bug tree-optimization/104526] New: [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0) theodort at inf dot ethz.ch
@ 2022-02-14 10:12 ` rguenth at gcc dot gnu.org
  2022-02-14 13:41 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-02-14 10:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104526

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
   Target Milestone|---                         |12.0


* [Bug tree-optimization/104526] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0)
  2022-02-14 10:07 [Bug tree-optimization/104526] New: [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0) theodort at inf dot ethz.ch
  2022-02-14 10:12 ` [Bug tree-optimization/104526] " rguenth at gcc dot gnu.org
@ 2022-02-14 13:41 ` jakub at gcc dot gnu.org
  2022-02-14 14:33 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-02-14 13:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104526

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Started with r12-6924-gc2b610e7c6c89fd422c5c31f01023bcddf3cf4a5


* [Bug tree-optimization/104526] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0)
  2022-02-14 10:07 [Bug tree-optimization/104526] New: [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0) theodort at inf dot ethz.ch
  2022-02-14 10:12 ` [Bug tree-optimization/104526] " rguenth at gcc dot gnu.org
  2022-02-14 13:41 ` jakub at gcc dot gnu.org
@ 2022-02-14 14:33 ` jakub at gcc dot gnu.org
  2022-02-14 22:14 ` amacleod at redhat dot com
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-02-14 14:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104526

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aldyh at gcc dot gnu.org,
                   |                            |amacleod at redhat dot com
           Priority|P3                          |P1
   Last reconfirmed|                            |2022-02-14
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Seems like something EVRP should optimize.

The pre- r12-6924 IL was:
  c.0_1 = c;
  _2 = *c.0_1;
  # RANGE [-1, 1]
  _3 = 1 / _2;
  # RANGE [1, 2] NONZERO 3
  d_11 = 2 >> _3;
and evrp properly figured out those ranges, that 1 / int is [-1, 1] and
that 2 >> [-1, 1] is [1, 2].
But since r12-6924 the IL is:
  c.0_1 = c;
  _2 = *c.0_1;
  _11 = (unsigned int) _2;
  _12 = _11 + 1;
  _13 = _12 <= 2;
  _3 = _12 <= 2 ? _2 : 0;
  # RANGE [0, 2] NONZERO 3
  d_14 = 2 >> _3;
and the range for d_14 is too broad (includes 0) and no ranges are recorded for
the other SSA_NAMEs.
Now, _11 and _12 are of course VARYING, and because _13 is _Bool, it is also
VARYING.
The important missing part is that we don't realize that _12 <= 2 ? _2 : 0
implies a [-1, 1] range.  The _2 + 1U <= 2U form is a standard pattern for
encoding a range check.  Now if I rewrite the testcase by hand to:
void foo(void);

static int a, b = 1, *c = &b;
int main() {
  for (; a; a--) {
    int e;
    int ct = *c;
    if (ct + 1U <= 2U)
      e = ct;
    else
      e = 0;
    int d = 2 >> e;
    if (!d)
      foo();
  }
}
which is equivalent to doing the 1 / int PR95424 optimization by hand, but
instead of having it in a COND_EXPR do it in separate bbs, i.e.:
  c.0_1 = c;
  ct_12 = *c.0_1;
  ct.1_2 = (unsigned int) ct_12;
  _3 = ct.1_2 + 1;
  if (_3 <= 2)
    goto <bb 5>; [INV]
  else
    goto <bb 4>; [INV]

  <bb 4> :

  <bb 5> :
  # RANGE [-1, 1]
  # e_7 = PHI <ct_12(3), 0(4)>
  # RANGE [1, 2] NONZERO 3
  d_15 = 2 >> e_7;
then evrp handles it just fine.
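
The unsigned-compare idiom used in the rewritten test case can be checked in isolation. The sketch below is only an illustration with hypothetical helper names (in_range_idiom, in_range_plain, one_div_select, check_samples); it shows that ct + 1U <= 2U holds exactly when -1 <= ct <= 1, and that the select form equals 1 / ct for every nonzero sample.

```c
#include <assert.h>
#include <limits.h>

/* The idiom from the hand-rewritten test case.  x is converted to
   unsigned before the addition, so -1 becomes UINT_MAX and the sum
   wraps to 0; the test is true exactly when -1 <= x <= 1.  */
static int in_range_idiom(int x)
{
    return x + 1U <= 2U;
}

/* The same test written as two signed comparisons.  */
static int in_range_plain(int x)
{
    return x >= -1 && x <= 1;
}

/* Select form of 1 / x, mirroring "_12 <= 2 ? _2 : 0" in the IL.
   It matches 1 / x only for x != 0, where the division is defined.  */
static int one_div_select(int x)
{
    return in_range_idiom(x) ? x : 0;
}

/* Compares both forms on a handful of boundary values.  */
static void check_samples(void)
{
    const int samples[] = { INT_MIN, -3, -2, -1, 0, 1, 2, 3, INT_MAX };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        int x = samples[i];
        assert(in_range_idiom(x) == in_range_plain(x));
        if (x != 0)
            assert(one_div_select(x) == 1 / x);
    }
}
```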

So, Andrew/Aldy, how hard would it be to improve ranger COND_EXPR handling, so
that it essentially does what we do for the PHI cases?  I.e. from the COND_EXPR
condition, compute "assertion" if condition is true or if condition is false,
and use that on the COND_EXPR's second and third argument.
So for the
  _3 = _12 <= 2 ? _2 : 0;
comparison, for the second argument the condition must be true, which implies
that _2 must be in [-1, 1] there, while for the third argument the condition
must be false, but the argument is constant 0, so its range is [0, 0]; then just
union those 2 ranges.
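
The proposal can be pictured with plain closed intervals. This is a sketch under loud assumptions: struct interval, interval_union, cond_expr_range and report_example are hypothetical names, not ranger's API, and the union here is a simple convex hull, whereas a real range class may track several sub-ranges.

```c
#include <assert.h>

/* Hypothetical closed integer interval [lo, hi]; not GCC's range class.  */
struct interval { long lo, hi; };

/* Smallest single interval covering both inputs.  */
static struct interval interval_union(struct interval a, struct interval b)
{
    assert(a.lo <= a.hi && b.lo <= b.hi);
    struct interval r;
    r.lo = a.lo < b.lo ? a.lo : b.lo;
    r.hi = a.hi > b.hi ? a.hi : b.hi;
    return r;
}

/* Range of "cond ? op1 : op2": op1's range as refined by cond being
   true, op2's range as refined by cond being false, then unioned.  */
static struct interval cond_expr_range(struct interval op1_if_true,
                                       struct interval op2_if_false)
{
    return interval_union(op1_if_true, op2_if_false);
}

/* The case from the report: _3 = _12 <= 2 ? _2 : 0.  */
static struct interval report_example(void)
{
    struct interval op1 = { -1, 1 };   /* _2 when _12 <= 2 holds */
    struct interval op2 = { 0, 0 };    /* the constant 0 */
    return cond_expr_range(op1, op2);  /* [-1, 1] */
}
```

With _3 narrowed to [-1, 1], the range of the shift result no longer has to include 0.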

As this is a P1 regression, if we can fix it, would be nice to get it into GCC
12.


* [Bug tree-optimization/104526] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0)
  2022-02-14 10:07 [Bug tree-optimization/104526] New: [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0) theodort at inf dot ethz.ch
                   ` (2 preceding siblings ...)
  2022-02-14 14:33 ` jakub at gcc dot gnu.org
@ 2022-02-14 22:14 ` amacleod at redhat dot com
  2022-02-15 22:12 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: amacleod at redhat dot com @ 2022-02-14 22:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104526

--- Comment #3 from Andrew Macleod <amacleod at redhat dot com> ---
(In reply to Jakub Jelinek from comment #2)

> and evrp properly figured out those ranges, that 1 / int is [-1, 1] and
> that 2 >> [-1, 1] is [1, 2].
> But since r12-6924 the IL is:
>   c.0_1 = c;
>   _2 = *c.0_1;
>   _11 = (unsigned int) _2;
>   _12 = _11 + 1;
>   _13 = _12 <= 2;
>   _3 = _12 <= 2 ? _2 : 0;

> So, Andrew/Aldy, how hard would it be to improve ranger COND_EXPR handling,
> so that it essentially does what we do for the PHI cases?  I.e. from the
> COND_EXPR condition, compute "assertion" if condition is true or if
> condition is false, and use that on the COND_EXPR's second and third
> argument.
> So for the
>   _3 = _12 <= 2 ? _2 : 0;
> comparison, for second argument the condition must be true which implies that
> _2 must be there [-1, 1], while for the third argument the condition must be
> false, but the argument is constant 0, so range is [0, 0], then just union
> those 2 ranges.
> 
> As this is a P1 regression, if we can fix it, would be nice to get it into
> GCC 12.

I'm having a look. The bits are all there. Most of gori is stmt oriented, but I
may be able to invoke the components such that we evaluate the 2nd and 3rd
arguments as if they were on true/false edges to improve the results.


* [Bug tree-optimization/104526] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0)
  2022-02-14 10:07 [Bug tree-optimization/104526] New: [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0) theodort at inf dot ethz.ch
                   ` (3 preceding siblings ...)
  2022-02-14 22:14 ` amacleod at redhat dot com
@ 2022-02-15 22:12 ` cvs-commit at gcc dot gnu.org
  2022-02-15 22:13 ` amacleod at redhat dot com
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2022-02-15 22:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104526

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Andrew Macleod <amacleod@gcc.gnu.org>:

https://gcc.gnu.org/g:e15425e899e4a9eec768cf74aaf36cdbf1d29913

commit r12-7253-ge15425e899e4a9eec768cf74aaf36cdbf1d29913
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Mon Feb 14 19:43:40 2022 -0500

    Use GORI to evaluate arguments of a COND_EXPR.

    Provide an API into gori to perform a basic evaluation of the arguments of a
    COND_EXPR if they are in the dependency chain of the condition.

            PR tree-optimization/104526
            gcc/
            * gimple-range-fold.cc (fold_using_range::range_of_cond_expr): Call
            new routine.
            * gimple-range-gori.cc (range_def_chain::get_def_chain): Force a
            build of dependency chain if there isn't one.
            (gori_compute::condexpr_adjust): New.
            * gimple-range-gori.h (class gori_compute): New prototype.

            gcc/testsuite/
            * gcc.dg/pr104526.c: New.


* [Bug tree-optimization/104526] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0)
  2022-02-14 10:07 [Bug tree-optimization/104526] New: [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0) theodort at inf dot ethz.ch
                   ` (4 preceding siblings ...)
  2022-02-15 22:12 ` cvs-commit at gcc dot gnu.org
@ 2022-02-15 22:13 ` amacleod at redhat dot com
  2022-02-16 12:39 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: amacleod at redhat dot com @ 2022-02-15 22:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104526

Andrew Macleod <amacleod at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #5 from Andrew Macleod <amacleod at redhat dot com> ---
fixed.


* [Bug tree-optimization/104526] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0)
  2022-02-14 10:07 [Bug tree-optimization/104526] New: [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0) theodort at inf dot ethz.ch
                   ` (5 preceding siblings ...)
  2022-02-15 22:13 ` amacleod at redhat dot com
@ 2022-02-16 12:39 ` jakub at gcc dot gnu.org
  2022-02-16 14:05 ` amacleod at redhat dot com
  2022-02-16 14:07 ` jakub at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-02-16 12:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104526

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
+  tree type = TREE_TYPE (TREE_OPERAND (cond, 0));
+  if (type != TREE_TYPE (TREE_OPERAND (cond, 1)))
+    return false;
looks unnecessarily restrictive.
What tree-cfg.cc verification guarantees (and no need to check it in the ranger)
is what verify_gimple_comparison verifies, i.e. that
  /* For comparisons we do not have the operations type as the
     effective type the comparison is carried out in.  Instead
     we require that either the first operand is trivially
     convertible into the second, or the other way around.  */
  if (!useless_type_conversion_p (op0_type, op1_type)
      && !useless_type_conversion_p (op1_type, op0_type))
I think the ranger has to be prepared for non-pointer-equal type mismatches as
long as they are useless_type_conversion_p compatible, that can happen anywhere
in the IL, including even cases like different but useless_type_conversion_p
compatible types of binary operators like +, -, * etc.
So I'd just remove the
  if (type != TREE_TYPE (TREE_OPERAND (cond, 1)))
    return false;
lines.


* [Bug tree-optimization/104526] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0)
  2022-02-14 10:07 [Bug tree-optimization/104526] New: [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0) theodort at inf dot ethz.ch
                   ` (6 preceding siblings ...)
  2022-02-16 12:39 ` jakub at gcc dot gnu.org
@ 2022-02-16 14:05 ` amacleod at redhat dot com
  2022-02-16 14:07 ` jakub at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: amacleod at redhat dot com @ 2022-02-16 14:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104526

--- Comment #7 from Andrew Macleod <amacleod at redhat dot com> ---
On 2/16/22 07:39, jakub at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104526
>
> --- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> +  tree type = TREE_TYPE (TREE_OPERAND (cond, 0));
> +  if (type != TREE_TYPE (TREE_OPERAND (cond, 1)))
> +    return false;
> looks unnecessarily restrictive.
> What tree-cfg.cc verification guarantees (and no need to check it in the
> ranger)
> is what verify_gimple_comparison verifies, i.e. that
>    /* For comparisons we do not have the operations type as the
>       effective type the comparison is carried out in.  Instead
>       we require that either the first operand is trivially
>       convertible into the second, or the other way around.  */
>    if (!useless_type_conversion_p (op0_type, op1_type)
>        && !useless_type_conversion_p (op1_type, op0_type))
> I think the ranger has to be prepared for non-pointer-equal type mismatches as
> long as they are useless_type_conversion_p compatible, that can happen anywhere
> in the IL, including even cases like different but useless_type_conversion_p
> compatible types of binary operators like +, -, * etc.
> So I'd just remove the
>    if (type != TREE_TYPE (TREE_OPERAND (cond, 1)))
>      return false;
> lines.

The rest of ranger isn't this restrictive. It is satisfied by
range_compatible_p(), which boils down to "same precision, same sign".

I added it here to be super paranoid, so that I didn't get caught by
something unexpected later in the routine and cause an ICE in intersect
in the middle of building the kernel or something.  In hindsight, I
should have used range_compatible_p...
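
The difference between the two checks can be modelled outside GCC. This is a toy sketch only: struct int_type, same_type_node and compatible are hypothetical stand-ins, whereas GCC compares tree type nodes, and the "same precision, same sign" rule is taken from the description above rather than from the GCC sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an integral type; GCC uses tree nodes.  */
struct int_type {
    unsigned precision;   /* width in bits */
    bool is_unsigned;
};

/* Pointer-identity check, like "type != TREE_TYPE (...)" in the
   first version of the patch.  */
static bool same_type_node(const struct int_type *a, const struct int_type *b)
{
    return a == b;
}

/* Looser check as described above: same precision, same sign.  */
static bool compatible(const struct int_type *a, const struct int_type *b)
{
    assert(a && b);
    return a->precision == b->precision && a->is_unsigned == b->is_unsigned;
}
```

Two distinct type objects that are both 32-bit signed fail the identity check but pass the looser one, which is the situation comment #6 is concerned about.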

Are you OK with the following change?  I'll bootstrap and regression test...

Andrew


* [Bug tree-optimization/104526] [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0)
  2022-02-14 10:07 [Bug tree-optimization/104526] New: [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0) theodort at inf dot ethz.ch
                   ` (7 preceding siblings ...)
  2022-02-16 14:05 ` amacleod at redhat dot com
@ 2022-02-16 14:07 ` jakub at gcc dot gnu.org
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2022-02-16 14:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104526

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
LGTM, thanks.


end of thread, other threads:[~2022-02-16 14:07 UTC | newest]

Thread overview: 10+ messages
2022-02-14 10:07 [Bug tree-optimization/104526] New: [12 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 11.2.0) theodort at inf dot ethz.ch
2022-02-14 10:12 ` [Bug tree-optimization/104526] " rguenth at gcc dot gnu.org
2022-02-14 13:41 ` jakub at gcc dot gnu.org
2022-02-14 14:33 ` jakub at gcc dot gnu.org
2022-02-14 22:14 ` amacleod at redhat dot com
2022-02-15 22:12 ` cvs-commit at gcc dot gnu.org
2022-02-15 22:13 ` amacleod at redhat dot com
2022-02-16 12:39 ` jakub at gcc dot gnu.org
2022-02-16 14:05 ` amacleod at redhat dot com
2022-02-16 14:07 ` jakub at gcc dot gnu.org
