[Bug rtl-optimization/48986] New: Missed optimization in atomic decrement on x86/x64

public inbox for gcc-bugs@sourceware.org

* [Bug rtl-optimization/48986] New: Missed optimization in atomic decrement on x86/x64
@ 2011-05-13 10:32 piotr.wyderski at gmail dot com
  2011-05-13 15:07 ` [Bug rtl-optimization/48986] " jakub at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: piotr.wyderski at gmail dot com @ 2011-05-13 10:32 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48986

           Summary: Missed optimization in atomic decrement on x86/x64
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: piotr.wyderski@gmail.com


Many uses of __sync_fetch_and_add() boil down to a
decrement operation followed by a check whether the result is
zero, in order to delete the pointee. The most natural
way to define it is:

bool xxx_decrement(int* p) {

   return __sync_fetch_and_add(p, -1) == 1;
}

void yyy(int* p) {

    if (xxx_decrement(p)) {

        delete p;
    }
}

Unfortunately, GCC compiles it in a literal way:

<__Z3yyyPi>:
  40edd0:    83 ec 0c                 sub    $0xc,%esp
  40edd3:    ba ff ff ff ff           mov    $0xffffffff,%edx
  40edd8:    8b 44 24 10              mov    0x10(%esp),%eax
  40eddc:    f0 0f c1 10              lock xadd %edx,(%eax)
  40ede0:    83 fa 01                 cmp    $0x1,%edx
  40ede3:    74 0b                    je     40edf0 <__Z3yyyPi+0x20>
  40ede5:    83 c4 0c                 add    $0xc,%esp
  40ede8:    c3                       ret    
  40ede9:    8d b4 26 00 00 00 00     lea    0x0(%esi,%eiz,1),%esi
  40edf0:    89 44 24 10              mov    %eax,0x10(%esp)
  40edf4:    83 c4 0c                 add    $0xc,%esp
  40edf7:    e9 24 03 00 00           jmp    40f120 <___wrap__ZdlPv>
  40edfc:    8d 74 26 00              lea    0x0(%esi,%eiz,1),%esi 

with the gist being:

  40edd3:    ba ff ff ff ff           mov    $0xffffffff,%edx
  40eddc:    f0 0f c1 10              lock xadd %edx,(%eax)
  40ede0:    83 fa 01                 cmp    $0x1,%edx
  40ede3:    74 0b                    je     40edf0 <__Z3yyyPi+0x20>

This special case should be handled by the optimizer and produce:

   lock sub $0x01,(%eax)
   je ...

or:

   lock dec (%eax)
   je ...

on platforms which do not suffer carry-chain dependency penalties
from dec (which leaves CF unmodified), e.g. some AMD chips.

Please note that this generalizes for any N:

   return __sync_fetch_and_add(p, -N) == N;

with a remark that for N != 1 the dec replacement can't be used.
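
The generalization rests on a simple identity: the old value equals N
exactly when the new value equals zero, so the zero flag set by a
lock sub can stand in for the compare. A minimal sketch of the two
equivalent source forms follows; the function names are hypothetical
and only serve the illustration, and nothing here claims what code a
particular GCC version emits for them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Form from the report: subtract n, then compare the OLD value with n. */
bool sub_hits_zero_old(int *p, int n)
{
    assert(p != NULL);
    return __sync_fetch_and_add(p, -n) == n;
}

/* Equivalent form: subtract n, then compare the NEW value with zero. */
bool sub_hits_zero_new(int *p, int n)
{
    assert(p != NULL);
    return __sync_add_and_fetch(p, -n) == 0;
}
```

Both helpers return the same answer for the same starting value; only
the second one tests something the flags of the subtraction already
describe.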



* [Bug rtl-optimization/48986] Missed optimization in atomic decrement on x86/x64
  2011-05-13 10:32 [Bug rtl-optimization/48986] New: Missed optimization in atomic decrement on x86/x64 piotr.wyderski at gmail dot com
@ 2011-05-13 15:07 ` jakub at gcc dot gnu.org
  2011-05-16 12:07 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-05-13 15:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48986

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2011.05.13 14:57:43
                 CC|                            |jakub at gcc dot gnu.org
         AssignedTo|unassigned at gcc dot       |jakub at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-05-13 14:57:43 UTC ---
I'll look at this.



* [Bug rtl-optimization/48986] Missed optimization in atomic decrement on x86/x64
  2011-05-13 10:32 [Bug rtl-optimization/48986] New: Missed optimization in atomic decrement on x86/x64 piotr.wyderski at gmail dot com
  2011-05-13 15:07 ` [Bug rtl-optimization/48986] " jakub at gcc dot gnu.org
@ 2011-05-16 12:07 ` jakub at gcc dot gnu.org
  2011-05-16 13:14 ` [Bug target/48986] " jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-05-16 12:07 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48986

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEW
                 CC|                            |uros at gcc dot gnu.org
         AssignedTo|jakub at gcc dot gnu.org    |unassigned at gcc dot
                   |                            |gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-05-16 11:26:51 UTC ---
On:
int
foo (int *p)
{
  return __sync_fetch_and_add (p, -1) == 1;
}

int
bar (int *p)
{
  return __sync_add_and_fetch (p, -1) == 0;
}

I get better generated code for the second routine if I do:
--- gcc/config/i386/sync.md.jj 2010-05-21 11:46:29.000000000 +0200
+++ gcc/config/i386/sync.md 2011-05-16 13:06:13.000000000 +0200
@@ -170,7 +170,7 @@
   [(match_operand:SWI 1 "memory_operand" "+m")] UNSPECV_XCHG))
    (set (match_dup 1)
         (plus:SWI (match_dup 1)
-                  (match_operand:SWI 2 "register_operand" "0")))
+                  (match_operand:SWI 2 "nonmemory_operand" "0")))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_XADD"
   "lock{%;} xadd{<imodesuffix>}\t{%0, %1|%1, %0}")

and for foo identical code, so maybe that change is always beneficial, allowing
the combiner and other early RTL passes to see a constant there instead of a REG.
Unfortunately, even with this change the combiner doesn't attempt to combine
this pattern with the following cmpsi_1 pattern, presumably because the
sync_old_addsi pattern isn't single_set.  I guess we could handle this during
expansion, but it would be a mess, or in some other pass (e.g. peephole2 or
something similar).  peephole2 might be too late though: by that time the
constant must already be loaded into some register, so we'd need to peephole2 3
insns, where the load of the constant might often not be the first one.



* [Bug target/48986] Missed optimization in atomic decrement on x86/x64
  2011-05-13 10:32 [Bug rtl-optimization/48986] New: Missed optimization in atomic decrement on x86/x64 piotr.wyderski at gmail dot com
  2011-05-13 15:07 ` [Bug rtl-optimization/48986] " jakub at gcc dot gnu.org
  2011-05-16 12:07 ` jakub at gcc dot gnu.org
@ 2011-05-16 13:14 ` jakub at gcc dot gnu.org
  2011-05-17  7:58 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-05-16 13:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48986

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-05-16 12:50:59 UTC ---
Created attachment 24252
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24252
gcc47-pr48986.patch

Untested patch using peephole2.



* [Bug target/48986] Missed optimization in atomic decrement on x86/x64
  2011-05-13 10:32 [Bug rtl-optimization/48986] New: Missed optimization in atomic decrement on x86/x64 piotr.wyderski at gmail dot com
                   ` (2 preceding siblings ...)
  2011-05-16 13:14 ` [Bug target/48986] " jakub at gcc dot gnu.org
@ 2011-05-17  7:58 ` jakub at gcc dot gnu.org
  2011-05-17  8:17 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-05-17  7:58 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48986

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-05-17 07:43:33 UTC ---
Fixed on the trunk; on the 4.6 branch there is just a small improvement for
__sync_add_and_fetch (p, -1) == 0.
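
For code that has to build with the 4.6 branch, a reference-count
release can be written directly in the __sync_add_and_fetch form
mentioned above. The following is only a sketch: struct obj and the
obj_new/obj_get/obj_put helpers are hypothetical names invented for
the example, not anything taken from GCC or from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted object, for illustration only. */
struct obj {
    int refs;
    int payload;
};

/* Allocate an object holding one reference; returns NULL on failure. */
struct obj *obj_new(int payload)
{
    struct obj *o = malloc(sizeof *o);
    if (o != NULL) {
        o->refs = 1;
        o->payload = payload;
    }
    return o;
}

/* Take an additional reference. */
void obj_get(struct obj *o)
{
    assert(o != NULL);
    __sync_add_and_fetch(&o->refs, 1);
}

/* Drop a reference; returns true if this call freed the object. */
bool obj_put(struct obj *o)
{
    assert(o != NULL);
    if (__sync_add_and_fetch(&o->refs, -1) == 0) {
        free(o);
        return true;
    }
    return false;
}
```

Comparing the new value against zero keeps the test in the shape the
comment above singles out, and it is semantically the same as the
fetch-and-add form compared against 1.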



* [Bug target/48986] Missed optimization in atomic decrement on x86/x64
  2011-05-13 10:32 [Bug rtl-optimization/48986] New: Missed optimization in atomic decrement on x86/x64 piotr.wyderski at gmail dot com
                   ` (3 preceding siblings ...)
  2011-05-17  7:58 ` jakub at gcc dot gnu.org
@ 2011-05-17  8:17 ` jakub at gcc dot gnu.org
  2011-05-17  8:41 ` jakub at gcc dot gnu.org
  2021-07-26  5:22 ` pinskia at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-05-17  8:17 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48986

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-05-17 07:42:38 UTC ---
Author: jakub
Date: Tue May 17 07:42:30 2011
New Revision: 173817

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173817
Log:
    PR target/48986
    * config/i386/sync.md (sync_old_add<mode>): Relax operand 2
    predicate to allow CONST_INT.

Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/config/i386/sync.md



* [Bug target/48986] Missed optimization in atomic decrement on x86/x64
  2011-05-13 10:32 [Bug rtl-optimization/48986] New: Missed optimization in atomic decrement on x86/x64 piotr.wyderski at gmail dot com
                   ` (4 preceding siblings ...)
  2011-05-17  8:17 ` jakub at gcc dot gnu.org
@ 2011-05-17  8:41 ` jakub at gcc dot gnu.org
  2021-07-26  5:22 ` pinskia at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: jakub at gcc dot gnu.org @ 2011-05-17  8:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48986

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-05-17 07:38:03 UTC ---
Author: jakub
Date: Tue May 17 07:37:59 2011
New Revision: 173816

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=173816
Log:
    PR target/48986
    * config/i386/sync.md (sync_old_add<mode>): Relax operand 2
    predicate to allow CONST_INT.
    (*sync_old_add_cmp<mode>): New insn and peephole2 for it.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/i386/sync.md



* [Bug target/48986] Missed optimization in atomic decrement on x86/x64
  2011-05-13 10:32 [Bug rtl-optimization/48986] New: Missed optimization in atomic decrement on x86/x64 piotr.wyderski at gmail dot com
                   ` (5 preceding siblings ...)
  2011-05-17  8:41 ` jakub at gcc dot gnu.org
@ 2021-07-26  5:22 ` pinskia at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-26  5:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=48986

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bcrl at kvack dot org

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
*** Bug 25230 has been marked as a duplicate of this bug. ***

