public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/113838] New: regression of redundant load operation introduced by -fno-tree-forwprop introduce
@ 2024-02-08 17:37 absoler at smail dot nju.edu.cn
  2024-02-08 17:50 ` [Bug tree-optimization/113838] " jakub at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: absoler at smail dot nju.edu.cn @ 2024-02-08 17:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838

            Bug ID: 113838
           Summary: regression of redundant load operation introduced by
                    -fno-tree-forwprop introduce
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: absoler at smail dot nju.edu.cn
  Target Milestone: ---

hi, I have found for the following code, with -O2 option, gcc-10.2.0 will
generate a redundant load, and gcc-13.2.0 won't. However, with an extra flag "
-fno-tree-forwprop", gcc-13.2.0 will produce the same bad code

full code https://godbolt.org/z/objsWGnY6

```
func_37() ...

    l_50[1] = &g_51;
    if (((*l_50[1]) ^= g_26[5][3][0]))
    { /* block id: 13 */
        int32_t **l_52[5] = {&l_50[2],&l_50[2],&l_50[2],&l_50[2],&l_50[2]};
        int i;
        (*l_52[4]) = (((void*)0 != &g_51) , &g_36[3][4]);
        return p_39;
    }
    else
    { /* block id: 16 */
        int32_t *l_53 = &g_54[0][1][1];
        return l_53;
    }
```
```
func_37():
  401d74:       mov    0x364e(%rip),%edx        # 4053c8 <g_26+0x1a8>
func_11():
  401d7a:       mov    %al,0x33d9(%rip)        # 405159 <g_44+0x9>
func_37():
  401d80:       mov    0x33c2(%rip),%eax        # 405148 <g_51>
  401d86:       xor    %eax,%edx
  401d88:       cmp    0x363a(%rip),%eax        # 4053c8 <g_26+0x1a8>
  401d8e:       mov    $0x40510c,%eax
  401d93:       mov    %edx,0x33af(%rip)        # 405148 <g_51>
  401d99:       mov    $0x4051a4,%edx
  401d9e:       cmovne %rdx,%rax
```

the second load of g_26[5][3][0], i.e. "cmp    0x363a(%rip),%eax" can be
optimized away. The better code generated by gcc-13.2.0 is:
```
func_37():
  401e40:       mov    0x3582(%rip),%eax        # 4053c8 <g_26+0x1a8>
  401e46:       mov    %edx,%ecx
  401e48:       xor    %eax,%ecx
  401e4a:       cmp    %eax,%edx
  401e4c:       mov    $0x40510c,%edx
  401e51:       mov    $0x4051a4,%eax
  401e56:       cmove  %rdx,%rax
```

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug tree-optimization/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
  2024-02-08 17:37 [Bug tree-optimization/113838] New: regression of redundant load operation introduced by -fno-tree-forwprop introduce absoler at smail dot nju.edu.cn
@ 2024-02-08 17:50 ` jakub at gcc dot gnu.org
  2024-02-09  1:11 ` [Bug target/113838] " pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-02-08 17:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Disabling optimizations and then wondering why optimizations didn't happen is
too weird.  Don't do that.  Such options are intended for debugging GCC, or
perhaps working around some compiler or application bug, but performance of
generated code should be only judged without such options.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
  2024-02-08 17:37 [Bug tree-optimization/113838] New: regression of redundant load operation introduced by -fno-tree-forwprop introduce absoler at smail dot nju.edu.cn
  2024-02-08 17:50 ` [Bug tree-optimization/113838] " jakub at gcc dot gnu.org
@ 2024-02-09  1:11 ` pinskia at gcc dot gnu.org
  2024-02-09  3:37 ` absoler at smail dot nju.edu.cn
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-02-09  1:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64
          Component|tree-optimization           |target
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The difference from the gimple level IR:
```
  _14 = g_26[5][3][0];
  _15 = (int) _14;
  _16 = _13 ^ _15;
  g_51 = _16;
  if (_13 != _15)
```

vs:
```
  _14 = g_26[5][3][0];
  _15 = (int) _14;
  _16 = _13 ^ _15;
  g_51 = _16;
  if (_16 != 0)
    goto <bb 4>; [50.00%]
  else
    goto <bb 3>; [50.00%]
```


This is expected behavior even for the x86_64 target

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
  2024-02-08 17:37 [Bug tree-optimization/113838] New: regression of redundant load operation introduced by -fno-tree-forwprop introduce absoler at smail dot nju.edu.cn
  2024-02-08 17:50 ` [Bug tree-optimization/113838] " jakub at gcc dot gnu.org
  2024-02-09  1:11 ` [Bug target/113838] " pinskia at gcc dot gnu.org
@ 2024-02-09  3:37 ` absoler at smail dot nju.edu.cn
  2024-02-09  3:38 ` absoler at smail dot nju.edu.cn
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: absoler at smail dot nju.edu.cn @ 2024-02-09  3:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838

--- Comment #3 from absoler at smail dot nju.edu.cn ---
The gimple ir has no problem, but `_13` is replaced with g_26[5][3][0] in the
follow-up process, this shouldn't be expected behavior.

We question this option because we found in an older version of gcc (10.2.0),
only the O2 option is needed to produce the same bad code, so we worry about
there's a hidden un-fixed problem and it's re-triggered by this option.

Besides, the bad binary code introduce more load operation than the source code
(without optimization), so we thought it's necessary to check it regardless of
which optimization is disabled.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
  2024-02-08 17:37 [Bug tree-optimization/113838] New: regression of redundant load operation introduced by -fno-tree-forwprop introduce absoler at smail dot nju.edu.cn
                   ` (2 preceding siblings ...)
  2024-02-09  3:37 ` absoler at smail dot nju.edu.cn
@ 2024-02-09  3:38 ` absoler at smail dot nju.edu.cn
  2024-02-09  3:39 ` absoler at smail dot nju.edu.cn
  2024-02-09  7:59 ` xry111 at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: absoler at smail dot nju.edu.cn @ 2024-02-09  3:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838

--- Comment #4 from absoler at smail dot nju.edu.cn ---
@(In reply to Jakub Jelinek from comment #1)
> Disabling optimizations and then wondering why optimizations didn't happen
> is too weird.  Don't do that.  Such options are intended for debugging GCC,
> or perhaps working around some compiler or application bug, but performance
> of generated code should be only judged without such options.

The gimple ir has no problem, but `_13` is replaced with g_26[5][3][0] in the
follow-up process, this shouldn't be expected behavior.

We question this option because we found in an older version of gcc (10.2.0),
only the O2 option is needed to produce the same bad code, so we worry about
there's a hidden un-fixed problem and it's re-triggered by this option.

Besides, the bad binary code introduce more load operation than the source code
(without optimization), so we thought it's necessary to check it regardless of
which optimization is disabled.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
  2024-02-08 17:37 [Bug tree-optimization/113838] New: regression of redundant load operation introduced by -fno-tree-forwprop introduce absoler at smail dot nju.edu.cn
                   ` (3 preceding siblings ...)
  2024-02-09  3:38 ` absoler at smail dot nju.edu.cn
@ 2024-02-09  3:39 ` absoler at smail dot nju.edu.cn
  2024-02-09  7:59 ` xry111 at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: absoler at smail dot nju.edu.cn @ 2024-02-09  3:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838

--- Comment #5 from absoler at smail dot nju.edu.cn ---
(In reply to Andrew Pinski from comment #2)
> The difference from the gimple level IR:
> ```
>   _14 = g_26[5][3][0];
>   _15 = (int) _14;
>   _16 = _13 ^ _15;
>   g_51 = _16;
>   if (_13 != _15)
> ```
> 
> vs:
> ```
>   _14 = g_26[5][3][0];
>   _15 = (int) _14;
>   _16 = _13 ^ _15;
>   g_51 = _16;
>   if (_16 != 0)
>     goto <bb 4>; [50.00%]
>   else
>     goto <bb 3>; [50.00%]
> ```
> 
> 
> This is expected behavior even for the x86_64 target

The gimple ir has no problem, but `_13` is replaced with g_26[5][3][0] in the
follow-up process, this shouldn't be expected behavior.

We question this option because we found in an older version of gcc (10.2.0),
only the O2 option is needed to produce the same bad code, so we worry about
there's a hidden un-fixed problem and it's re-triggered by this option.

Besides, the bad binary code introduce more load operation than the source code
(without optimization), so we thought it's necessary to check it regardless of
which optimization is disabled.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug target/113838] regression of redundant load operation introduced by -fno-tree-forwprop introduce
  2024-02-08 17:37 [Bug tree-optimization/113838] New: regression of redundant load operation introduced by -fno-tree-forwprop introduce absoler at smail dot nju.edu.cn
                   ` (4 preceding siblings ...)
  2024-02-09  3:39 ` absoler at smail dot nju.edu.cn
@ 2024-02-09  7:59 ` xry111 at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: xry111 at gcc dot gnu.org @ 2024-02-09  7:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113838

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xry111 at gcc dot gnu.org

--- Comment #6 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to absoler from comment #3)
> The gimple ir has no problem, but `_13` is replaced with g_26[5][3][0] in
> the follow-up process, this shouldn't be expected behavior.
> 
> We question this option because we found in an older version of gcc
> (10.2.0), only the O2 option is needed to produce the same bad code, so we
> worry about there's a hidden un-fixed problem and it's re-triggered by this
> option.
> 
> Besides, the bad binary code introduce more load operation than the source
> code (without optimization), so we thought it's necessary to check it
> regardless of which optimization is disabled.

(In reply to absoler from comment #5)
> (In reply to Andrew Pinski from comment #2)
> > The difference from the gimple level IR:
> > ```
> >   _14 = g_26[5][3][0];
> >   _15 = (int) _14;
> >   _16 = _13 ^ _15;
> >   g_51 = _16;
> >   if (_13 != _15)
> > ```
> > 
> > vs:
> > ```
> >   _14 = g_26[5][3][0];
> >   _15 = (int) _14;
> >   _16 = _13 ^ _15;
> >   g_51 = _16;
> >   if (_16 != 0)
> >     goto <bb 4>; [50.00%]
> >   else
> >     goto <bb 3>; [50.00%]
> > ```
> > 
> > 
> > This is expected behavior even for the x86_64 target
> 
> The gimple ir has no problem, but `_13` is replaced with g_26[5][3][0] in
> the follow-up process, this shouldn't be expected behavior.

No.  GIMPLE pass knows nothing about register pressure, _13 is only a temporary
variable, not necessarily an register.

> We question this option because we found in an older version of gcc
> (10.2.0), only the O2 option is needed to produce the same bad code, so we
> worry about there's a hidden un-fixed problem and it's re-triggered by this
> option.

So are you expecting a bug must be fixed in at least two different passes and
any -fno-* option shouldn't regress the code?  No this won't happen.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-02-09  7:59 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-08 17:37 [Bug tree-optimization/113838] New: regression of redundant load operation introduced by -fno-tree-forwprop introduce absoler at smail dot nju.edu.cn
2024-02-08 17:50 ` [Bug tree-optimization/113838] " jakub at gcc dot gnu.org
2024-02-09  1:11 ` [Bug target/113838] " pinskia at gcc dot gnu.org
2024-02-09  3:37 ` absoler at smail dot nju.edu.cn
2024-02-09  3:38 ` absoler at smail dot nju.edu.cn
2024-02-09  3:39 ` absoler at smail dot nju.edu.cn
2024-02-09  7:59 ` xry111 at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).