public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug ipa/116024] New: [14/15 Regression] unnecessary integer comparison(s) for a simple loop
@ 2024-07-21 14:33 artemiy at synopsys dot com
  2024-07-21 14:57 ` [Bug ipa/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3 pinskia at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: artemiy at synopsys dot com @ 2024-07-21 14:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116024

            Bug ID: 116024
           Summary: [14/15 Regression] unnecessary integer comparison(s)
                    for a simple loop
           Product: gcc
           Version: 14.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: ipa
          Assignee: unassigned at gcc dot gnu.org
          Reporter: artemiy at synopsys dot com
                CC: jh at suse dot cz
  Target Milestone: ---
            Target: riscv32-unknown-elf

gcc 14 and later emits unnecessary instructions when compiling the i() function
in the following code:
$ cat test.i
# 0 "test.c"
# 0 "<built-in>"
# 0 "<command-line>"
# 1 "test.c"
typedef enum { a, b } c;
extern char d, e;
c f()
{
  if (d)
    return a;
  d = e;
  return b;
}
void i()
{
  int l = 2;
  while (l <= 2)
    if (f() == a)
      l++;
}

$ riscv32-unknown-elf-gcc test.i -S -O3 -fno-inline -Wall -Wextra
$ cat test.s
[snip]
i:
        addi    sp,sp,-16
        sw      s0,8(sp)
        sw      s1,4(sp)
        sw      ra,12(sp)
        li      s1,3
        li      s0,2
.L6:
        call    f
        sub     a0,s1,a0
        beq     a0,s0,.L6
[snip]

gcc 13.2.0 doesn’t load any immediates and just emits a branch-if-not-zero:

[snip]
i:
        addi    sp,sp,-16
        sw      ra,12(sp)
.L6:
        call    f
        bne     a0,zero,.L6
[snip]

I have bisected the introduction of this behavior down to:

commit 53ba8d669550d3a1f809048428b97ca607f95cf5
Author: Jan Hubicka <jh@suse.cz>
Date:   Mon Nov 20 19:35:53 2023 +0100

    inter-procedural value range propagation

I can't be too sure, but I think what ends up happening is that the ipa-cp pass
propagates the range of return values of f() ({0,1}), then later on phiopt2
uses this to expand the PHI node to something like 3 - f() == 2:

 <bb 3> [local count: 955630224]:
  # DEBUG BEGIN_STMT
  _1 = f ();
  _6 = (int) _1;
  _8 = 3 - _6;

  <bb 4> [local count: 1073741824]:
  # l_2 = PHI <_8(3), 2(2)>
  # DEBUG l => l_2
  # DEBUG BEGIN_STMT
  if (l_2 == 2)
    goto <bb 3>; [89.00%]
  else
    goto <bb 5>; [11.00%]

And the aforementioned expression never gets simplified back to f() != 0,
resulting in more complicated assembly code.  Indeed, specifying -fno-ipa-vrp
(or -fno-ssa-phiopt) works as a temporary workaround.

godbolt for convenience: https://godbolt.org/z/xr1oTrW51
gcc -v output: 
$ riscv32-unknown-elf-gcc -v
Using built-in specs.
COLLECT_GCC=riscv32-unknown-elf-gcc
COLLECT_LTO_WRAPPER=/home/art/work/install/riscv-gnu-toolchain/libexec/gcc/riscv32-unknown-elf/15.0.0/lto-wrapper
Target: riscv32-unknown-elf
Configured with: /home/art/work/src/gcc/configure --target=riscv32-unknown-elf
--prefix=/home/art/work/install/riscv-gnu-toolchain --disable-shared
--disable-threads --enable-languages=c,c++ --with-pkgversion=g80c37335baf
--with-system-zlib --enable-tls --with-newlib
--with-sysroot=/home/art/work/install/riscv-gnu-toolchain/riscv32-unknown-elf
--with-native-system-header-dir=/include --disable-libmudflap --disable-libssp
--disable-libquadmath --disable-libgomp --disable-nls
--disable-tm-clone-registry --src=/home/art/work/src/gcc --enable-multilib
--with-abi=ilp32 --with-arch=rv32gc --with-tune=rocket --with-isa-spec=20191213
'CFLAGS_FOR_TARGET=-Os    ' 'CXXFLAGS_FOR_TARGET=-Os    '
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 15.0.0 20240721 (experimental) (g80c37335baf)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug ipa/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3
  2024-07-21 14:33 [Bug ipa/116024] New: [14/15 Regression] unnecessary integer comparison(s) for a simple loop artemiy at synopsys dot com
@ 2024-07-21 14:57 ` pinskia at gcc dot gnu.org
  2024-07-21 14:58 ` [Bug middle-end/116024] " sjames at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-07-21 14:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116024

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
It is VRP which is going funny.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug middle-end/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3
  2024-07-21 14:33 [Bug ipa/116024] New: [14/15 Regression] unnecessary integer comparison(s) for a simple loop artemiy at synopsys dot com
  2024-07-21 14:57 ` [Bug ipa/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3 pinskia at gcc dot gnu.org
@ 2024-07-21 14:58 ` sjames at gcc dot gnu.org
  2024-07-21 15:00 ` [Bug tree-optimization/116024] " pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: sjames at gcc dot gnu.org @ 2024-07-21 14:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116024

Sam James <sjames at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sjames at gcc dot gnu.org
          Component|ipa                         |middle-end

--- Comment #2 from Sam James <sjames at gcc dot gnu.org> ---
sorry, I hadn't seen you'd changed component

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3
  2024-07-21 14:33 [Bug ipa/116024] New: [14/15 Regression] unnecessary integer comparison(s) for a simple loop artemiy at synopsys dot com
  2024-07-21 14:57 ` [Bug ipa/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3 pinskia at gcc dot gnu.org
  2024-07-21 14:58 ` [Bug middle-end/116024] " sjames at gcc dot gnu.org
@ 2024-07-21 15:00 ` pinskia at gcc dot gnu.org
  2024-07-21 17:15 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-07-21 15:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116024

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
             Blocks|85316                       |
   Last reconfirmed|                            |2024-07-21
                 CC|hubicka at gcc dot gnu.org,        |pinskia at gcc dot gnu.org
                   |jh at suse dot cz                  |
          Component|middle-end                  |tree-optimization
   Target Milestone|---                         |14.2

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note I don't think IPA is the issue of exporting (correctly) the return value
range. The problem is VRP decides to create `a == 0 ? 2 : 3` which gets
converted into `3 - a` (which then vrp again converts into `a == 0 ? 2 : 3`).

But then `(3 - a) == 2` is never optimized to just `a == 0` by match or
something else.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85316
[Bug 85316] [meta-bug] VRP range propagation missed cases

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3
  2024-07-21 14:33 [Bug ipa/116024] New: [14/15 Regression] unnecessary integer comparison(s) for a simple loop artemiy at synopsys dot com
                   ` (2 preceding siblings ...)
  2024-07-21 15:00 ` [Bug tree-optimization/116024] " pinskia at gcc dot gnu.org
@ 2024-07-21 17:15 ` pinskia at gcc dot gnu.org
  2024-07-22  9:23 ` artemiy at synopsys dot com
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-07-21 17:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116024

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
So the missed optimizations that VRP/PHIopt/match expose are:
```
unsigned f(void);
int i1(void)
{
  int l = 2;
  l = 3 - (int)f();
  return l <= 2; // f() > 0
}
int i2(void)
{
  unsigned l = 2;
  l = 3 - (unsigned)f();
  return l <= 2; // f() - 1 < 3
}

int i3(void)
{
  unsigned l = 2;
  l = 3 + (unsigned)f();
  return l <= 2; // f() > -4u
}

int i4(void)
{
  int l = 2;
  l = 3 + (int)f();
  return l <= 2; // f() < 0
}

```

This can be done even without knowing the range of `f()` here. and these is
done by clang already. Note GCC does handle i4 is already.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3
  2024-07-21 14:33 [Bug ipa/116024] New: [14/15 Regression] unnecessary integer comparison(s) for a simple loop artemiy at synopsys dot com
                   ` (3 preceding siblings ...)
  2024-07-21 17:15 ` pinskia at gcc dot gnu.org
@ 2024-07-22  9:23 ` artemiy at synopsys dot com
  2024-07-22 14:01 ` rguenth at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: artemiy at synopsys dot com @ 2024-07-22  9:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116024

--- Comment #5 from Artemiy Volkov <artemiy at synopsys dot com> ---
Hi Andrew, thank you for the breakdown.  For i1() (the case applicable to the
initial bug report) something like this seems to fix the issue:

diff --git a/gcc/match.pd b/gcc/match.pd
index cf359b0ec0f..8ab6d47e278 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -8773,2 +8773,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)

+/* Transform comparisons of the form C1 - X CMP C2 to X - C1 CMP -C2.  */
+(for cmp (lt le gt ge eq ne)
+     rcmp (gt ge lt le eq ne)
+  (simplify
+   (cmp (minus INTEGER_CST@0 @1) INTEGER_CST@2)
+   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1)))
+     (rcmp (minus @1 @0) (negate @2)))))
+
 /* Canonicalizations of BIT_FIELD_REFs.  */

Would it make sense for this ticket to be assigned to me so I could refine and
post the above patch as well as tackle i2() and i3() (should those be extracted
to a separate PR or is it fine to fix all three under this PR)?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3
  2024-07-21 14:33 [Bug ipa/116024] New: [14/15 Regression] unnecessary integer comparison(s) for a simple loop artemiy at synopsys dot com
                   ` (4 preceding siblings ...)
  2024-07-22  9:23 ` artemiy at synopsys dot com
@ 2024-07-22 14:01 ` rguenth at gcc dot gnu.org
  2024-07-23 10:02 ` artemiy at synopsys dot com
  2024-08-01  9:41 ` jakub at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-07-22 14:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116024

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Artemiy Volkov from comment #5)
> Hi Andrew, thank you for the breakdown.  For i1() (the case applicable to
> the initial bug report) something like this seems to fix the issue:
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index cf359b0ec0f..8ab6d47e278 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -8773,2 +8773,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  
> +/* Transform comparisons of the form C1 - X CMP C2 to X - C1 CMP -C2.  */
> +(for cmp (lt le gt ge eq ne)
> +     rcmp (gt ge lt le eq ne)
> +  (simplify
> +   (cmp (minus INTEGER_CST@0 @1) INTEGER_CST@2)
> +   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1)))
> +     (rcmp (minus @1 @0) (negate @2)))))
> +
>  /* Canonicalizations of BIT_FIELD_REFs.  */
> 
> Would it make sense for this ticket to be assigned to me so I could refine
> and post the above patch as well as tackle i2() and i3() (should those be
> extracted to a separate PR or is it fine to fix all three under this PR)?

I don't think this is correct for types with undefined behavior on overflow
because you can't negate INT_MIN.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3
  2024-07-21 14:33 [Bug ipa/116024] New: [14/15 Regression] unnecessary integer comparison(s) for a simple loop artemiy at synopsys dot com
                   ` (5 preceding siblings ...)
  2024-07-22 14:01 ` rguenth at gcc dot gnu.org
@ 2024-07-23 10:02 ` artemiy at synopsys dot com
  2024-08-01  9:41 ` jakub at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: artemiy at synopsys dot com @ 2024-07-23 10:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116024

--- Comment #7 from Artemiy Volkov <artemiy at synopsys dot com> ---
(In reply to Richard Biener from comment #6)
> (In reply to Artemiy Volkov from comment #5)
> > Hi Andrew, thank you for the breakdown.  For i1() (the case applicable to
> > the initial bug report) something like this seems to fix the issue:
> > 
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index cf359b0ec0f..8ab6d47e278 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -8773,2 +8773,10 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >  
> > +/* Transform comparisons of the form C1 - X CMP C2 to X - C1 CMP -C2.  */
> > +(for cmp (lt le gt ge eq ne)
> > +     rcmp (gt ge lt le eq ne)
> > +  (simplify
> > +   (cmp (minus INTEGER_CST@0 @1) INTEGER_CST@2)
> > +   (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1)))
> > +     (rcmp (minus @1 @0) (negate @2)))))
> > +
> >  /* Canonicalizations of BIT_FIELD_REFs.  */
> > 
> > Would it make sense for this ticket to be assigned to me so I could refine
> > and post the above patch as well as tackle i2() and i3() (should those be
> > extracted to a separate PR or is it fine to fix all three under this PR)?
> 
> I don't think this is correct for types with undefined behavior on overflow
> because you can't negate INT_MIN.

Fair enough.  How about something like:

diff --git a/gcc/match.pd b/gcc/match.pd
index cf359b0ec0f..897e01c39a3 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -8773,2 +8773,37 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)

+/* Transform comparisons of the form C1 - X CMP C2 to X CMP C1 - C2.  */
+(for cmp (lt le gt ge eq ne)
+     rcmp (gt ge lt le eq ne)
+  (simplify
+   (cmp (minus INTEGER_CST@0 @1) INTEGER_CST@2)
+   (if (!TREE_OVERFLOW (@0) && !TREE_OVERFLOW (@2)
+       && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@1)))
+     (with { tree res = int_const_binop (MINUS_EXPR, @0, @2); }
+      (if (TREE_OVERFLOW (res))
+       (with
+        {
+          fold_overflow_warning (("assuming signed overflow does not occur "
+                                  "when simplifying conditional to constant"),
+                                WARN_STRICT_OVERFLOW_CONDITIONAL);
+        }
+        (switch
+         (if (cmp == NE_EXPR)
+          { constant_boolean_node (true, type); })
+         (if (cmp == EQ_EXPR)
+          { constant_boolean_node (false, type); })
+         {
+           bool less = cmp == LE_EXPR || cmp == LT_EXPR;
+           bool ovf_high = wi::gt_p (wi::to_wide (@0), 0,
+                                     TYPE_SIGN (TREE_TYPE (@0)));
+           constant_boolean_node (less == ovf_high, type);
+         }))
+       (with
+        {
+          fold_overflow_warning (("assuming signed overflow does not occur "
+                                  "when changing C1 - X cmp C2 to "
+                                  "X cmp C1 - C2"),
+                                WARN_STRICT_OVERFLOW_COMPARISON);
+        }
+        (rcmp @1 { res; })))))))
+
 /* Canonicalizations of BIT_FIELD_REFs.  */

This passes a full dejagnu run plus a couple of newly crafted test cases around
INT_MIN and INT_MAX. (I made it mostly by analogy with the "X +- C1 CMP C2”
case directly above it.  I did not understand why that one is split into the
equality and relational parts with stricter pre-conditions for the former (the
old fold-const.c code that came before it didn't have this separation); so for
the new pattern I decided to keep them unified and just require
TYPE_OVERFLOW_UNDEFINED().)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug tree-optimization/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3
  2024-07-21 14:33 [Bug ipa/116024] New: [14/15 Regression] unnecessary integer comparison(s) for a simple loop artemiy at synopsys dot com
                   ` (6 preceding siblings ...)
  2024-07-23 10:02 ` artemiy at synopsys dot com
@ 2024-08-01  9:41 ` jakub at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-08-01  9:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116024

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|14.2                        |14.3

--- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC 14.2 is being released, retargeting bugs to GCC 14.3.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-08-01  9:41 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-07-21 14:33 [Bug ipa/116024] New: [14/15 Regression] unnecessary integer comparison(s) for a simple loop artemiy at synopsys dot com
2024-07-21 14:57 ` [Bug ipa/116024] [14/15 Regression] unnecessary integer comparison(s) for a simple loop since r14-5628-g53ba8d669550d3 pinskia at gcc dot gnu.org
2024-07-21 14:58 ` [Bug middle-end/116024] " sjames at gcc dot gnu.org
2024-07-21 15:00 ` [Bug tree-optimization/116024] " pinskia at gcc dot gnu.org
2024-07-21 17:15 ` pinskia at gcc dot gnu.org
2024-07-22  9:23 ` artemiy at synopsys dot com
2024-07-22 14:01 ` rguenth at gcc dot gnu.org
2024-07-23 10:02 ` artemiy at synopsys dot com
2024-08-01  9:41 ` jakub at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).