public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/114545] New: [11/12/13/14 Regression] Missed optimization for CSE
@ 2024-04-01  9:20 652023330028 at smail dot nju.edu.cn
  2024-04-01 21:11 ` [Bug tree-optimization/114545] " pinskia at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: 652023330028 at smail dot nju.edu.cn @ 2024-04-01  9:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114545

            Bug ID: 114545
           Summary: [11/12/13/14 Regression] Missed optimization for CSE
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: 652023330028 at smail dot nju.edu.cn
  Target Milestone: ---

Hello, we noticed what may be a missed CSE opportunity for (t + c). Here is
the reduced code:

https://godbolt.org/z/f68qWa89s

int a, b, c;
void func() {
  int t=0;
  t = -a;
  b = -(t + c);
  a = t + c;
}

GCC -O3 -fwrapv:
func():
        mov     edx, DWORD PTR a[rip]
        mov     eax, DWORD PTR c[rip]
        mov     ecx, edx
        sub     ecx, eax
        sub     eax, edx
        mov     DWORD PTR b[rip], ecx
        mov     DWORD PTR a[rip], eax
        ret

Expected code (GCC 7.5):
func():
        mov     eax, DWORD PTR c[rip]
        sub     eax, DWORD PTR a[rip]
        mov     edx, eax
        mov     DWORD PTR a[rip], eax
        neg     edx
        mov     DWORD PTR b[rip], edx
        ret
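
For reference, the two code shapes above can be written back as C. This is only a sketch: unsigned 32-bit arithmetic stands in for the wrapping behaviour of -fwrapv, and the names `cse_form`, `two_sub_form`, `forms_agree`, and `check_some_inputs` are made up for illustration and are not part of the report.

```c
#include <assert.h>
#include <stdint.h>

/* Values stored to the globals b and a in the reduced test case. */
struct result { uint32_t b; uint32_t a; };

/* GCC 7 shape: compute t + c once (the CSE), then negate it for b. */
struct result cse_form(uint32_t a, uint32_t c) {
    uint32_t sum = c - a;                 /* t + c, with t = -a */
    struct result r = { 0u - sum, sum };  /* b = -(t + c), a = t + c */
    return r;
}

/* GCC 8+ shape: two subtractions that do not depend on each other. */
struct result two_sub_form(uint32_t a, uint32_t c) {
    struct result r = { a - c, c - a };
    return r;
}

/* Returns 1 when both shapes store the same values to b and a. */
int forms_agree(uint32_t a, uint32_t c) {
    struct result x = cse_form(a, c);
    struct result y = two_sub_form(a, c);
    return x.a == y.a && x.b == y.b;
}

/* Spot-check a few inputs, including ones that wrap around. */
void check_some_inputs(void) {
    assert(forms_agree(0u, 0u));
    assert(forms_agree(3u, 10u));
    assert(forms_agree(0x80000000u, 0u));
    assert(forms_agree(0xFFFFFFFFu, 5u));
}
```

Both shapes store the same values; they differ only in whether the second result is derived from the first or computed independently.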

Thank you very much for your time and effort! We look forward to hearing from
you.


* [Bug tree-optimization/114545] [11/12/13/14 Regression] Missed optimization for CSE
From: pinskia at gcc dot gnu.org @ 2024-04-01 21:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114545

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.5
           Keywords|                            |missed-optimization


* [Bug tree-optimization/114545] [11/12/13/14 Regression] Missed optimization for CSE
From: pinskia at gcc dot gnu.org @ 2024-04-01 21:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114545

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I am not sure this is worse.

In the GCC 7 case we have:
```
        sub     eax, DWORD PTR a[rip]
        mov     edx, eax
        ...
        neg     edx
```

While in GCC 8+ we get:
```
        movl    %edx, %ecx
        subl    %eax, %ecx
        subl    %edx, %eax
```

With GCC 8+, we have 2 independent subs and still a move. With GCC 7, we get
one sub followed by a move and a dependent neg. The latency of the GCC 8+
sequence will be lower than that of the GCC 7 one, because both subs can
execute at the same time and the mov (which only happens on x86_64) is removed
during register renaming.
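
The latency argument can be sketched with a toy dependency model. Everything in it is an assumption made for illustration rather than a measurement: each ALU operation is given a latency of 1 cycle, the register-to-register mov a latency of 0 (treated as eliminated at rename), the loads of a and c are left out because both sequences need them, and issue width is unlimited. The names `struct insn`, `critical_path`, `gcc7_seq`, and `gcc8_seq` are hypothetical.

```c
/* One instruction in a toy dependency graph. dep0/dep1 are indices of
   earlier instructions in the same array whose result is consumed,
   or -1 when there is no such register dependency. */
struct insn { const char *name; int latency; int dep0; int dep1; };

/* Longest dependency chain in cycles, assuming unlimited issue width.
   Handles at most 16 instructions; dependencies must point backwards. */
int critical_path(const struct insn *seq, int n) {
    int finish[16];
    int longest = 0;
    for (int i = 0; i < n && i < 16; i++) {
        int start = 0;
        if (seq[i].dep0 >= 0 && finish[seq[i].dep0] > start)
            start = finish[seq[i].dep0];
        if (seq[i].dep1 >= 0 && finish[seq[i].dep1] > start)
            start = finish[seq[i].dep1];
        finish[i] = start + seq[i].latency;
        if (finish[i] > longest)
            longest = finish[i];
    }
    return longest;
}

/* GCC 7 shape: sub, then a mov of its result, then a dependent neg. */
const struct insn gcc7_seq[] = {
    { "sub eax, a",   1, -1, -1 },
    { "mov edx, eax", 0,  0, -1 },
    { "neg edx",      1,  1, -1 },
};

/* GCC 8+ shape: a mov feeding one sub, plus a second independent sub. */
const struct insn gcc8_seq[] = {
    { "mov ecx, edx", 0, -1, -1 },
    { "sub ecx, eax", 1,  0, -1 },
    { "sub eax, edx", 1, -1, -1 },
};
```

Under these assumptions `critical_path(gcc7_seq, 3)` gives 2 cycles and `critical_path(gcc8_seq, 3)` gives 1 cycle. Real cores differ in latencies and issue resources, so the numbers only indicate the shape of the dependency chains.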



aarch64 produces for GCC 8+:
```
        adrp    x1, a
        adrp    x2, c
        adrp    x3, b
        ldr     w0, [x1, #:lo12:a]
        ldr     w2, [x2, #:lo12:c]
        sub     w4, w2, w0
        sub     w0, w0, w2
        str     w4, [x1, #:lo12:a]
        str     w0, [x3, #:lo12:b]
        ret
```

While before GCC 8 it produced:
```
        adrp    x1, a
        adrp    x0, c
        adrp    x2, b
        ldr     w3, [x1, #:lo12:a]
        ldr     w0, [x0, #:lo12:c]
        sub     w0, w0, w3
        str     w0, [x1, #:lo12:a]
        neg     w0, w0
        str     w0, [x2, #:lo12:b]
        ret
```

So the neg will issue with the first str, but if you have 2 store units and 2
ALUs, the GCC 8+ code is better.
So for superscalar cores, what GCC 8+ is doing is better, and even on in-order
cores GCC 8+ will still be better due to the 2 independent instructions.
I think it might only really make a difference for x86_64 at -Os/-Oz.


* [Bug tree-optimization/114545] [11/12/13/14 Regression] Missed optimization for CSE
From: law at gcc dot gnu.org @ 2024-04-05  2:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114545

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
                 CC|                            |law at gcc dot gnu.org

