public inbox for gcc-bugs@sourceware.org
* [Bug target/110591] New: [i386] (Maybe) Missed optimisation: _cmpccxadd sets flags
@ 2023-07-07 17:13 thiago at kde dot org
  2023-07-10  2:21 ` [Bug target/110591] " crazylht at gmail dot com
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: thiago at kde dot org @ 2023-07-07 17:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110591

            Bug ID: 110591
           Summary: [i386] (Maybe) Missed optimisation: _cmpccxadd sets
                    flags
           Product: gcc
           Version: 13.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: thiago at kde dot org
  Target Milestone: ---

In:
// built with -mcmpccxadd (the intrinsic is only available when CMPCCXADD
// is enabled for the target)
#include <immintrin.h>

bool increment_if(int *ptr, int v)
{
    return _cmpccxadd_epi32(ptr, v, 1, _CMPCCX_Z) == v;
}

GCC generates (and current Clang does the same):

increment_if(int*, int):
        movl    $1, %edx
        movl    %esi, %eax
        cmpzxadd        %edx, %eax, (%rdi)
        cmpl    %eax, %esi
        sete    %al
        ret

The CMPccXADD instructions set EFLAGS according to the comparison of their
memory operand with the middle (register) operand, and that register receives
the previous value of the memory location whether or not the comparison
succeeds. The CMP instruction on the next line is therefore superfluous: it
sets the flags to exactly the values they already hold. This particular
example could instead be written:

        movl    $1, %edx
        cmpzxadd        %edx, %esi, (%rdi)
        sete    %al
        ret
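
For reference, here is a rough, non-atomic C++ model of the behaviour
described above (the function and variable names are my own, not from any
header or manual):

bool cmpzxadd_model(int *mem, int &reg, int addend)
{
    int old = *mem;              // the memory operand is always loaded
    bool zf = (old == reg);      // EFLAGS reflect *mem vs. the register
    if (zf)
        *mem = old + addend;     // the add happens only if "Z" holds
    reg = old;                   // the register always gets the old value
    return zf;                   // this ZF is what SETE/JE test afterwards
}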

Removing the MOV and the CMP saves two retire slots and one uop. This can be
done every time the result of the intrinsic is compared against the value
that was passed as the intrinsic's second parameter.
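
The same pattern can appear with other condition codes too; for example (my
own illustrative variant, assuming _CMPCCX_NZ from the same header), the
trailing compare should be just as redundant here:

bool increment_unless_equal(int *ptr, int v)
{
    // The result is again compared against the second argument, so the
    // flags left by the instruction already answer this question.
    return _cmpccxadd_epi32(ptr, v, 1, _CMPCCX_NZ) != v;
}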

However, in a real workload this function is likely to be inlined, in which
case the extra MOV may not be present at all and the CMP is likely to be
followed by a Jcc instead of a SETcc. In that case the CMP+Jcc pair would be
macro-fused, so removing the CMP would not save a uop. Moreover, this atomic
operation will likely take multiple cycles, and the conditional code after it
probably can't be speculated very well either.
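
As a hypothetical example of the inlined case, where the comparison feeds a
branch rather than a SETcc (the function name and loop body are purely
illustrative):

void spin_increment(int *counter, int expected)
{
    // After inlining, the compare feeds a JNE; CMP+JNE macro-fuse, so
    // dropping the CMP does not save a uop here.
    while (!increment_if(counter, expected))
        _mm_pause();             // any back-off / retry logic would do
}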

I'll leave it up to you to decide whether it's worth pursuing this.


Thread overview: 7+ messages
2023-07-07 17:13 [Bug target/110591] New: [i386] (Maybe) Missed optimisation: _cmpccxadd sets flags thiago at kde dot org
2023-07-10  2:21 ` [Bug target/110591] " crazylht at gmail dot com
2023-07-10  4:27 ` crazylht at gmail dot com
2023-07-10  6:22 ` crazylht at gmail dot com
2023-07-18  3:31 ` cvs-commit at gcc dot gnu.org
2023-07-18  3:32 ` crazylht at gmail dot com
2023-11-30 10:53 ` liuhongt at gcc dot gnu.org
