From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 6620 invoked by alias); 14 Jan 2013 06:11:43 -0000
Received: (qmail 6536 invoked by uid 48); 14 Jan 2013 06:11:17 -0000
From: "andi-gcc at firstfloor dot org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/55966] New: __atomic_fetch_* generate wrong code for HLE
Date: Mon, 14 Jan 2013 06:11:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: andi-gcc at firstfloor dot org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields:
Message-ID:
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
Content-Type: text/plain; charset="UTF-8"
MIME-Version: 1.0
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2013-01/txt/msg01149.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55966

             Bug #: 55966
           Summary: __atomic_fetch_* generate wrong code for HLE
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: andi-gcc@firstfloor.org
            Target: x86_64-linux

__atomic_fetch_(and|xor|or|nand) sometimes generate a cmpxchg loop instead of
the direct instruction. nand always does that because there is no x86 nand
instruction. The others can in principle generate direct instructions, and do,
but not always.

When __ATOMIC_HLE_RELEASE or __ATOMIC_HLE_ACQUIRE is specified, the HLE prefix
is not generated.

Also, when the CMPXCHG loop is generated, a PAUSE is needed on the
unsuccessful path; otherwise performance will be poor.
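
For illustration only, here is a hand-written sketch of the kind of loop meant
above: a compare-and-swap retry loop for fetch-nand with a PAUSE on the failed
path. It is not what GCC emits. The names cpu_relax and fetch_nand_with_pause
are made up for this example, and it says nothing about where an HLE prefix
would have to go.

```c
/* Hypothetical helper: issue PAUSE on x86, do nothing elsewhere. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();
#endif
}

/* Same result as __atomic_fetch_nand (*p = ~(*p & mask), returns old value),
   written as an explicit compare-exchange loop. */
static unsigned fetch_nand_with_pause(unsigned *p, unsigned mask)
{
    unsigned old = __atomic_load_n(p, __ATOMIC_RELAXED);

    /* On failure, 'old' is refreshed with the current value of *p. */
    while (!__atomic_compare_exchange_n(p, &old, ~(old & mask),
                                        0 /* strong */,
                                        __ATOMIC_SEQ_CST, __ATOMIC_RELAXED)) {
        cpu_relax();   /* PAUSE only on the unsuccessful path */
    }
    return old;
}
```

The PAUSE sits inside the retry branch only, so the uncontended case pays
nothing for it.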
Generating correct code for a CMPXCHG HLE loop is tricky, and it may be better
to forbid the nand case. For the others, which can be implemented as a single
atomic operation, it would be better to ensure they always do that instead of
falling back to cmpxchg.

Testcase TBD.
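
Until a real testcase is attached, the following is a rough sketch (not the
reporter's testcase) of the two situations one might compare. It assumes that
the fallback to cmpxchg is triggered when the old value is actually used,
since "lock or" and "lock and" do not return it. The names lock_word,
or_result_unused, or_result_used and and_release are invented here, and the
HLE macros are guarded so the file still builds on compilers that lack them.

```c
/* Fall back to plain memory orders where the x86 HLE bits are not defined. */
#ifdef __ATOMIC_HLE_ACQUIRE
#define HLE_ACQ (__ATOMIC_ACQUIRE | __ATOMIC_HLE_ACQUIRE)
#define HLE_REL (__ATOMIC_RELEASE | __ATOMIC_HLE_RELEASE)
#else
#define HLE_ACQ __ATOMIC_ACQUIRE
#define HLE_REL __ATOMIC_RELEASE
#endif

static unsigned lock_word;

/* Old value discarded: a single locked "or" is sufficient. */
void or_result_unused(void)
{
    __atomic_fetch_or(&lock_word, 3u, HLE_ACQ);
}

/* Old value returned: x86 has no fetching "or", so a cmpxchg loop
   is the likely expansion. */
unsigned or_result_used(void)
{
    return __atomic_fetch_or(&lock_word, 3u, HLE_ACQ);
}

/* Release side: clear the same bits, old value discarded. */
void and_release(void)
{
    __atomic_fetch_and(&lock_word, ~3u, HLE_REL);
}
```

Compiling this with "gcc -O2 -S" on x86_64 and checking each function for an
xacquire/xrelease prefix (or the equivalent prefix bytes) should show whether
the prefix survives in both forms.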