From: "thiago at kde dot org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/110591] New: [i386] (Maybe) Missed optimisation: _cmpccxadd sets flags
Date: Fri, 07 Jul 2023 17:13:32 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110591

            Bug ID: 110591
           Summary: [i386] (Maybe) Missed optimisation: _cmpccxadd sets flags
           Product: gcc
           Version: 13.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: thiago at kde dot org
  Target Milestone: ---

In:

#include <immintrin.h>
bool increment_if(int *ptr, int v)
{
    return _cmpccxadd_epi32(ptr, v, 1, _CMPCCX_Z) == v;
}

GCC generates (and current Clang does
the same):

increment_if(int*, int):
        movl    $1, %edx
        movl    %esi, %eax
        cmpzxadd        %edx, %eax, (%rdi)
        cmpl    %eax, %esi
        sete    %al
        ret

The CMPccXADD instructions set EFLAGS to the result of the comparison of their
memory operand to the middle one, which will get the current value of that
memory location whether the comparison succeeded or not. That means the CMP
instruction on the next line is superfluous, since it'll set the flags to
exactly what they are already set to. That means this particular example could
be written:

        movl    $1, %edx
        cmpzxadd        %edx, %esi, (%rdi)
        sete    %al
        ret

Saving 2 retire slots and 1 uop. This can be done every time the result of the
intrinsic is compared to the same value that was passed as the intrinsic's
second parameter.

However, in a real workload, this function is likely to be inlined, where the
extra MOV may not be present at all and the CMP is likely to be followed by a
Jcc instead of a SETcc. For the latter case, the CMP+Jcc would be macro-fused,
so there would be no 1-uop gain. Moreover, this atomic operation is likely
going to be multiple cycles long and the conditional code after it probably
can't be speculated very well either.

I'll leave it up to you to decide whether it's worth pursuing this.
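To illustrate why the CMP is redundant, here is a minimal, non-atomic C++ model of the semantics described above. It is only a sketch under stated assumptions: the names CmpxaddResult, model_cmpzxadd, increment_if_with_cmp, increment_if_with_flags and check_equivalence are hypothetical, and the real instruction performs the read-modify-write atomically, which this model does not attempt to reproduce.

```cpp
#include <cassert>

// Hypothetical model of the register and flag outputs of CMPZXADD.
struct CmpxaddResult {
    int old_value;  // value written back to the middle (compare) register
    bool zf;        // ZF as set by comparing the memory operand with `cmp`
};

// Non-atomic sketch: compare *mem with cmp, add `addend` if they are equal,
// and always report the value that was in memory beforehand.
CmpxaddResult model_cmpzxadd(int *mem, int cmp, int addend)
{
    const int old_value = *mem;
    const bool zf = (old_value == cmp);
    if (zf) {
        // Unsigned arithmetic avoids signed-overflow UB in this model.
        *mem = static_cast<int>(static_cast<unsigned>(old_value) +
                                static_cast<unsigned>(addend));
    }
    return {old_value, zf};
}

// What the compilers emit today: re-compare the returned value with v.
bool increment_if_with_cmp(int *ptr, int v)
{
    return model_cmpzxadd(ptr, v, 1).old_value == v;
}

// What the report proposes: reuse the flags the instruction already set.
bool increment_if_with_flags(int *ptr, int v)
{
    return model_cmpzxadd(ptr, v, 1).zf;
}

// Both variants must agree on the result and on the final memory contents.
void check_equivalence(int initial, int v)
{
    int a = initial;
    int b = initial;
    const bool ra = increment_if_with_cmp(&a, v);
    const bool rb = increment_if_with_flags(&b, v);
    assert(ra == rb);
    assert(a == b);
}
```

Calling check_equivalence with any pair of values exercises both the matching and non-matching paths; in this model the returned value equals v exactly when ZF is set, which is the property the proposed code sequence relies on.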