From: "hjl.tools at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/103069] New: cmpxchg isn't optimized
Date: Wed, 03 Nov 2021 19:08:30 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103069

            Bug ID: 103069
           Summary: cmpxchg isn't optimized
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hjl.tools at gmail dot com
                CC: crazylht at gmail dot com, wwwhhhyyy333 at gmail dot com
            Blocks: 103065
  Target Milestone: ---
            Target: i386,x86-64

From the CPU's point of view, getting a cache line for writing is more
expensive than reading.
See Appendix A.2 Spinlock in:

https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/xeon-lock-scaling-analysis-paper.pdf

The full compare and swap will grab the cache line exclusive and cause
excessive cache line bouncing.

[hjl@gnu-cfl-2 pr102566]$ cat e.c
int
f3 (int *a)
{
  return __atomic_fetch_or (a, 0x40000000, __ATOMIC_RELAXED);
}
[hjl@gnu-cfl-2 pr102566]$ gcc -S -O2 x.c
[hjl@gnu-cfl-2 pr102566]$ cat x.s
	.file	"x.c"
	.text
	.p2align 4
	.globl	foo
	.type	foo, @function
foo:
.LFB0:
	.cfi_startproc
	movl	v(%rip), %eax
.L2:
	movl	%eax, %ecx
	movl	%eax, %edx
	orl	$1, %ecx
	lock cmpxchgl	%ecx, v(%rip)
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
GCC should first emit a normal load, check the value, and jump to .L2 if
cmpxchgl may fail. Before jumping to .L2, PAUSE should be inserted to yield
the CPU to another hyperthread and to save power. It also serves to slightly
limit the rate of accesses on the processor interconnect.
	jne	.L2
	movl	%edx, %eax
	andl	$1, %eax
	ret
	.cfi_endproc
.LFE0:
	.size	foo, .-foo
	.ident	"GCC: (GNU) 11.2.1 20211019 (Red Hat 11.2.1-6)"
	.section	.note.GNU-stack,"",@progbits
[hjl@gnu-cfl-2 pr102566]$

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103065
[Bug 103065] [meta] atomic operations aren't optimized
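
For illustration only, the load/check/PAUSE sequence requested in the
annotated assembly above could be written by hand at the C level roughly as
follows. This is a sketch, not what GCC emits: fetch_or_relaxed_loop and
cpu_relax are hypothetical names, and the use of __builtin_ia32_pause
assumes an x86 target (it compiles to nothing elsewhere).

```c
/* Hypothetical helper: PAUSE on x86, nothing on other targets.  */
static inline void
cpu_relax (void)
{
#if defined (__i386__) || defined (__x86_64__)
  __builtin_ia32_pause ();
#endif
}

/* Hypothetical hand-written form of
   __atomic_fetch_or (a, mask, __ATOMIC_RELAXED).
   Returns the value *a held before the OR.  */
static int
fetch_or_relaxed_loop (int *a, int mask)
{
  int expected = __atomic_load_n (a, __ATOMIC_RELAXED);

  for (;;)
    {
      /* Normal load first: reading does not need the cache line in
         exclusive state.  */
      int current = __atomic_load_n (a, __ATOMIC_RELAXED);

      if (current != expected)
        {
          /* The cmpxchg would fail, so skip it: pick up the fresh
             value, PAUSE, and go around again.  */
          expected = current;
          cpu_relax ();
          continue;
        }

      /* Only now issue the locked compare and swap.  On failure the
         builtin stores the value it observed into expected.  */
      if (__atomic_compare_exchange_n (a, &expected, expected | mask,
                                       0, __ATOMIC_RELAXED,
                                       __ATOMIC_RELAXED))
        return expected;

      cpu_relax ();
    }
}
```

Called as fetch_or_relaxed_loop (&v, 1) with v == 2, this returns 2 and
leaves v == 3. The locked instruction is reached only when the preceding
plain load suggests it is likely to succeed.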