public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/114559] New: After function inlining some optimizations missing
@ 2024-04-02  9:19 antoshkka at gmail dot com
  2024-04-03  0:21 ` [Bug middle-end/114559] " pinskia at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: antoshkka at gmail dot com @ 2024-04-02  9:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114559

            Bug ID: 114559
           Summary: After function inlining some optimizations missing
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: antoshkka at gmail dot com
  Target Milestone: ---

Consider the example:

template <class Func>
int AtomicUpdate(int& atomic, Func updater) {
  int old_value = atomic;
  while (true) {
    const int new_value = updater(int{old_value});
    if (old_value == new_value) return old_value;
    if (__atomic_compare_exchange_n(&atomic, &old_value, new_value, 1, 5, 5))
      return new_value;
  }
}

int AtomicMin(int& atomic, int value) {
  return AtomicUpdate(atomic, [value](int old_value) {
    return value < old_value ? value : old_value;
  });
}


With -O2 GCC produces the assembly:


AtomicMin(int&, int):
        mov     eax, DWORD PTR [rdi]
.L3:
        cmp     esi, eax
        mov     edx, eax
        cmovle  edx, esi
        jge     .L4
        lock cmpxchg    DWORD PTR [rdi], edx
        jne     .L3
.L1:
        mov     eax, edx
        ret
.L4:
        mov     edx, eax
        jmp     .L1


However, tighter assembly is possible:


AtomicMin(int&, int):                        # @AtomicMin(int&, int)
        mov     eax, dword ptr [rdi]
.LBB0_1:                                # =>This Inner Loop Header: Depth=1
        cmp     eax, esi
        jle     .LBB0_4
        lock            cmpxchg dword ptr [rdi], esi
        jne     .LBB0_1
        mov     eax, esi
.LBB0_4:
        ret
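
For comparison, the shorter listing corresponds roughly to the following hand-written loop. This is only a sketch: the names `AtomicMinByHand` and `DemoAtomicMinByHand` are made up for illustration, and the named `__ATOMIC_SEQ_CST` constant is used in place of the literal 5 from the example.

```cpp
#include <cassert>

// Hand-written loop mirroring the shorter listing (illustrative name,
// not part of the report).
int AtomicMinByHand(int& atomic, int value) {
  int old_value = atomic;
  while (true) {
    // Nothing to store when the current value is already the minimum.
    if (old_value <= value) return old_value;
    // Weak compare-exchange (4th argument 1) with seq_cst ordering on both
    // success and failure. On failure old_value is reloaded and we retry.
    if (__atomic_compare_exchange_n(&atomic, &old_value, value, 1,
                                    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
      return value;
  }
}

// Single-threaded sanity check of the semantics.
void DemoAtomicMinByHand() {
  int x = 5;
  assert(AtomicMinByHand(x, 3) == 3 && x == 3);  // 3 < 5, so 3 is stored
  assert(AtomicMinByHand(x, 7) == 3 && x == 3);  // 7 > 3, nothing is stored
}
```

The point is that `value` can be passed to the compare-exchange directly, since the early return already covers the case where the minimum is `old_value`.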


Note that manual inlining of the lambda improves the codegen:

int AtomicMin(int& atomic, int value) {
  int old_value = atomic;
  while (true) {
    const int new_value = (value < old_value ? value : old_value);
    if (old_value == new_value) return old_value;
    if (__atomic_compare_exchange_n(&atomic, &old_value, new_value, 1, 5, 5))
      return new_value;
  }
}

This results in:

AtomicMin(int&, int):
        mov     eax, DWORD PTR [rdi]
.L3:
        cmp     esi, eax
        mov     edx, eax
        cmovle  edx, esi
        jge     .L1
        lock cmpxchg    DWORD PTR [rdi], edx
        jne     .L3
.L1:
        mov     eax, edx
        ret


Godbolt playground: https://godbolt.org/z/G6YEGb15q


* [Bug middle-end/114559] After function inlining some optimizations missing
From: pinskia at gcc dot gnu.org @ 2024-04-03  0:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114559

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110199
   Last reconfirmed|                            |2024-04-03
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed. Reduced testcase that needs neither inlining nor atomics:
```
int g(int);
int h(int);

int f(int a, int b)
{
  while (true)
   {
      int t = a < b ? a : b; // MIN<a,b>
      if (b <= a)
        return a;
      // t below should be the old a, from before the assignment `a = g(a);`
      a = g(a);
      if (h(t))
        return t;
   }
}
```

As far as I can tell, this is basically a slightly more complex version of PR
110199, where the MIN (MAX) has 2 usages. Note that if we remove the usage of
t in `h(t)`, the trunk (due to PR 110199) is able to optimize it correctly to
`return a`.
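
To illustrate the expected simplification: once the early return has been passed, `a < b` holds, so the MIN is just the old value of `a`. The sketch below compares the reduced testcase with a hand-simplified form. The bodies of `g` and `h` are hypothetical stand-ins (the testcase only declares them), chosen so that the loops terminate, and `f_simplified` and `SameResults` are illustrative names.

```cpp
// Hypothetical stand-ins for the opaque g and h of the reduced testcase,
// chosen only so that both loops terminate.
int g(int x) { return x + 1; }
int h(int x) { return x % 3 == 0; }

// The reduced testcase as posted.
int f(int a, int b) {
  while (true) {
    int t = a < b ? a : b;  // MIN<a,b>
    if (b <= a) return a;
    a = g(a);
    if (h(t)) return t;
  }
}

// Hand-simplified form: past the early return a < b holds, so MIN<a,b>
// is a itself and no MIN needs to be computed.
int f_simplified(int a, int b) {
  while (true) {
    if (b <= a) return a;
    const int old_a = a;  // value of a before the assignment from g(a)
    a = g(a);
    if (h(old_a)) return old_a;
  }
}

// Compares both forms over a small grid of inputs.
bool SameResults() {
  for (int a = -10; a <= 10; ++a)
    for (int b = -10; b <= 10; ++b)
      if (f(a, b) != f_simplified(a, b)) return false;
  return true;
}
```

With these stand-ins, both functions agree on every input in the grid, which is the behavior one would hope the optimizer derives on its own.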


* [Bug tree-optimization/114559] [11/12/13/14 Regression] After function inlining some optimizations missing
From: pinskia at gcc dot gnu.org @ 2024-04-03  0:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114559

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |8.5.0
                 CC|                            |pinskia at gcc dot gnu.org
      Known to fail|                            |9.1.0
   Target Milestone|---                         |11.5
            Summary|After function inlining     |[11/12/13/14 Regression]
                   |some optimizations missing  |After function inlining
                   |                            |some optimizations missing

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note that the original testcase worked in GCC 8.5.0 and before, and so did the
reduced testcase with the C++ front-end, because the MIN/MAX expression was
not created early.
So I am going to mark this as a regression, because the user would not really
understand the difference.


* [Bug tree-optimization/114559] [11/12/13/14 Regression] After function inlining some optimizations missing
From: law at gcc dot gnu.org @ 2024-04-05  2:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114559

Jeffrey A. Law <law at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
                 CC|                            |law at gcc dot gnu.org
