From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 3381 invoked by alias); 14 Mar 2007 19:05:36 -0000
Received: (qmail 2666 invoked by uid 48); 14 Mar 2007 19:05:24 -0000
Date: Wed, 14 Mar 2007 19:05:00 -0000
Message-ID: <20070314190524.2665.qmail@sourceware.org>
X-Bugzilla-Reason: CC
References:
Subject: [Bug rtl-optimization/31170] cmpxchgq not emitted.
In-Reply-To:
Reply-To: gcc-bugzilla@gcc.gnu.org
To: gcc-bugs@gcc.gnu.org
From: "pluto at agmk dot net"
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
X-SW-Source: 2007-03/txt/msg01322.txt.bz2

------- Comment #2 from pluto at agmk dot net  2007-03-14 19:05 -------
(In reply to comment #1)
> ifcvt could do this. But is cmpxchgq really faster with its atomicity
> guarantee?

only `lock; cmpxchg' has an atomicity guarantee on SMP.

> They are all vector-path instructions, a compare - cmov sequence looks
> faster (8 cycle latency vs. 10 and also with less constraints on register
> allocation). Even the code we emit now:
>
> emit_cmpxchg:
> .LFB2:
>         movq    (%rdi), %rax
>         cmpq    %rsi, %rax
>         je      .L6
>         rep ; ret
>         .p2align 4,,7
> .L6:
>         movq    %rdx, (%rdi)
>         ret
>
> could be faster dependent on branch probability.

yes, it could be faster, but for -Os we could emit smaller, branchless code:

        movq      %rsi, %rax
        cmpxchgq  %rdx, (%rdi)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31170
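[Editor's note: for readers without the PR open, the non-atomic compare-exchange under discussion can be sketched in C roughly as below. The function name `emit_cmpxchg` is taken from the quoted assembly; the exact testcase in PR 31170 may differ. Under the x86-64 SysV ABI, `p`, `expected`, and `desired` arrive in %rdi, %rsi, and %rdx, matching the registers in the quoted code.]

```c
/* Plain (non-atomic) compare-exchange: return the old value at *p,
   and store 'desired' into *p only if the old value equals 'expected'.
   GCC emits the load/cmp/branch sequence quoted above; the comment
   suggests that at -Os a mov + cmpxchgq pair would be smaller, since
   an unlocked cmpxchgq has the same single-thread semantics.  */
long emit_cmpxchg(long *p, long expected, long desired)
{
    long cur = *p;          /* movq (%rdi), %rax        */
    if (cur == expected)    /* cmpq %rsi, %rax / je .L6 */
        *p = desired;       /* movq %rdx, (%rdi)        */
    return cur;
}
```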