From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 1817 invoked by alias); 24 Nov 2014 12:13:57 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 1140 invoked by uid 48); 24 Nov 2014 12:13:51 -0000
From: "filter-gcc at preshing dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/59448] Code generation doesn't respect C11 address-dependency
Date: Mon, 24 Nov 2014 12:13:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: unknown
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: filter-gcc at preshing dot com
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-11/txt/msg02719.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59448

--- Comment #23 from preshing ---
Hi, I went ahead and verified this bug using a cross-compiler built from GCC
4.9.2 sources. The bug indeed exists and happens when compiling for AArch64,
but not PowerPC. Andrew's patch fixes it (changing the first ldr instruction
to an ldar in this case). Full AArch64 assembly listings below.

I've also written a blog post on this subject in the hope of clarifying the
issue for anyone determined enough to make sense of it:
http://preshing.com/20141124/fixing-gccs-implementation-of-memory_order_consume

Andrew's patch, if it works the way I understand it, seems like the correct
thing for GCC to do until somebody figures out how to safely implement the
"efficient" compiler strategy for consume semantics.

I guess the next step is to run the test suite on a few platforms to make sure
there are no regressions, then submit?

Cheers,
Jeff

------------------
AArch64 listing of threadB() without Andrew's patch:

_Z7threadBv:
.LFB2304:
        .cfi_startproc
        adrp    x1, .LANCHOR0
        stp     x29, x30, [sp, -16]!
        .cfi_def_cfa_offset 16
        .cfi_offset 29, -16
        .cfi_offset 30, -8
        add     x1, x1, :lo12:.LANCHOR0
        add     x29, sp, 0
        .cfi_def_cfa_register 29
.L10:
        add     x0, x1, 8
        ldr     w0, [x0]
        cbz     w0, .L10
        ldr     w0, [x1]
        cmp     w0, 1
        bne     .L15
        str     wzr, [x1]
        add     x0, x1, 8
        stlr    wzr, [x0]
        b       .L10
.L15:
        adrp    x3, .LANCHOR1
        adrp    x0, .LC2
        adrp    x1, .LC1
        add     x3, x3, :lo12:.LANCHOR1
        add     x0, x0, :lo12:.LC2
        add     x1, x1, :lo12:.LC1
        mov     w2, 47
        add     x3, x3, 16
        bl      __assert_fail
        .cfi_endproc

------------------
AArch64 listing of threadB() with Andrew's patch:

_Z7threadBv:
.LFB2304:
        .cfi_startproc
        adrp    x1, .LANCHOR0
        stp     x29, x30, [sp, -16]!
        .cfi_def_cfa_offset 16
        .cfi_offset 29, -16
        .cfi_offset 30, -8
        add     x1, x1, :lo12:.LANCHOR0
        add     x29, sp, 0
        .cfi_def_cfa_register 29
.L10:
        add     x0, x1, 8
        ldar    w0, [x0]
        cbz     w0, .L10
        ldr     w0, [x1]
        cmp     w0, 1
        bne     .L15
        str     wzr, [x1]
        add     x0, x1, 8
        stlr    wzr, [x0]
        b       .L10
.L15:
        adrp    x3, .LANCHOR1
        adrp    x0, .LC2
        adrp    x1, .LC1
        add     x3, x3, :lo12:.LANCHOR1
        add     x0, x0, :lo12:.LC2
        add     x1, x1, :lo12:.LC1
        mov     w2, 47
        add     x3, x3, 16
        bl      __assert_fail
        .cfi_endproc
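
------------------
For readers who want to relate the listings to source code, here is a minimal
C++ sketch of the kind of function that could compile to a loop of this shape.
It is not the test case attached to the bug; the names g_guard, g_payload,
producer, consumer and run_handoff are all hypothetical, and the f - f index is
only an assumed way of making the payload access formally carry a dependency
from the consume load. In the unpatched listing the payload load uses [x1]
alone, so the loaded guard value no longer feeds the address; with a plain ldr
for the guard, that appears to leave nothing on AArch64 to order the two loads.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names throughout; this is an illustrative sketch,
// not the test case from the bug report.
static int g_payload[1] = {0};
static std::atomic<int> g_guard{0};

// Producer: write the payload, publish it with a release store,
// then wait until the consumer hands the slot back.
static void producer(int rounds) {
    for (int i = 0; i < rounds; ++i) {
        g_payload[0] = 1;
        g_guard.store(1, std::memory_order_release);
        while (g_guard.load(std::memory_order_acquire) != 0) {
            std::this_thread::yield();
        }
    }
}

// Consumer: spin on a consume load of the guard, then read the payload
// through an index derived from the loaded value.
static int consumer(int rounds) {
    int handled = 0;
    for (int i = 0; i < rounds; ++i) {
        int f;
        while ((f = g_guard.load(std::memory_order_consume)) == 0) {
            std::this_thread::yield();
        }
        // f - f is always 0, but in the source the index is computed from f,
        // so the access is meant to carry a dependency from the consume load.
        int *slot = &g_payload[f - f];
        assert(*slot == 1);
        ++handled;
        *slot = 0;
        g_guard.store(0, std::memory_order_release);
    }
    return handled;
}

// Runs both sides for a fixed number of rounds and returns how many
// rounds the consumer completed.
int run_handoff(int rounds) {
    int handled = 0;
    std::thread b([&] { handled = consumer(rounds); });
    producer(rounds);
    b.join();
    return handled;
}
```

Calling run_handoff(1000) from main should return 1000 on a conforming
implementation. A run that passes does not show the generated code is correct,
since the reordering may simply not occur on a given machine; comparing the
assembly, as in the listings above, is the more reliable check.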