From: "jakub at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/103066] __sync_val_compare_and_swap/__sync_bool_compare_and_swap aren't optimized
Date: Fri, 05 Nov 2021 13:25:48 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103066

--- Comment #8 from Jakub Jelinek ---
(In reply to H.J. Lu from comment #7)
> Instead of generating:
>
>         movl    f(%rip), %eax
> .L2:
>         movd    %eax, %xmm0
>         addss   .LC0(%rip), %xmm0
>         movd    %xmm0, %edx
>         lock cmpxchgl   %edx, f(%rip)
>         jne     .L2
>         ret
>
> we want
>
>         movl    f(%rip), %eax
> .L2:
>         movd    %eax, %xmm0
>         addss   .LC0(%rip), %xmm0
>         movd    %xmm0, %edx
>         cmpl    f(%rip), %eax
>         jne     .L2
>         lock cmpxchgl   %edx, f(%rip)
>         jne     .L2
>         ret

No, certainly not.
The mov before or the remembered value from the previous lock cmpxchgl already has the right value unless the atomic memory is extremely contended, so you don't want to add the non-atomic comparison in between. Not to mention that the way you've written it totally breaks it, because if the memory is not equal to the expected value, you should get the current value. With the above code, if f is modified by another thread in between the initial movl f(%rip), %eax and cmpl f(%rip), %eax and never after it, it will loop forever.

I believe what the above paper is talking about should be addressed by users of these intrinsics if they care and if it is beneficial (e.g. depending on extra information on how much the lock etc. is contended; in OpenMP one has omp_sync_hint_* constants one can use in the hint clause to tell if the lock is contended, uncontended, unknown, speculative, non-speculative etc.).
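
To illustrate the two points above, here is a minimal sketch in C, not the code from the bug report. The names f_bits, add_one_bits, add_one_plain and add_one_prechecked are hypothetical, and the float is assumed to be stored as its 32-bit pattern so the integer builtins can be used without aliasing problems. The first loop reuses the value handed back by a failed compare-and-swap. The second shows how a user who knows the location is heavily contended might add a plain load before the locked instruction; the important detail is that the expected value is refreshed when the pre-check fails, which is exactly what the proposed assembly does not do.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float assumed to be 32 bits");

/* Hypothetical shared location: the bit pattern of a float. */
static uint32_t f_bits;

/* Compute the bit pattern of (value + 1.0f) from the bit pattern of value. */
static uint32_t add_one_bits(uint32_t bits)
{
    float v;
    memcpy(&v, &bits, sizeof v);
    v += 1.0f;
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

/* Plain CAS loop: a failed CAS returns the current contents, which
   become the expected value for the next attempt. */
static void add_one_plain(void)
{
    uint32_t expected = __atomic_load_n(&f_bits, __ATOMIC_RELAXED);
    for (;;) {
        uint32_t seen = __sync_val_compare_and_swap(&f_bits, expected,
                                                    add_one_bits(expected));
        if (seen == expected)
            return;
        expected = seen;
    }
}

/* User-level variant with a plain load before the locked instruction.
   Assumption: the caller knows the location is contended enough for
   this to pay off. */
static void add_one_prechecked(void)
{
    uint32_t expected = __atomic_load_n(&f_bits, __ATOMIC_RELAXED);
    for (;;) {
        uint32_t cur = __atomic_load_n(&f_bits, __ATOMIC_RELAXED);
        if (cur != expected) {
            /* Refresh the expected value; retrying with the stale one
               would never succeed if nobody writes the old value back. */
            expected = cur;
            continue;
        }
        uint32_t seen = __sync_val_compare_and_swap(&f_bits, expected,
                                                    add_one_bits(expected));
        if (seen == expected)
            return;
        expected = seen;
    }
}
```

Whether the second form is ever faster depends on the contention level, which is the kind of information only the caller has.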