Date: Tue, 18 Feb 2014 19:47:00 -0000
From: "Paul E. McKenney"
To: Linus Torvalds
Cc: Peter.Sewell@cl.cam.ac.uk, "mark.batty@cl.cam.ac.uk", Peter Zijlstra,
	Torvald Riegel, Will Deacon, Ramana Radhakrishnan, David Howells,
	"linux-arch@vger.kernel.org", Linux Kernel Mailing List,
	Andrew Morton, Ingo Molnar, "gcc@gcc.gnu.org"
Subject: Re: [RFC][PATCH 0/5] arch: atomic rework
Message-ID: <20140218194745.GV4250@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com

On Tue, Feb 18, 2014 at 10:49:27AM -0800, Linus Torvalds wrote:
> On Tue, Feb 18, 2014 at 10:21 AM, Peter Sewell wrote:
> >
> > This is a bit more subtle, because (on ARM and POWER) removing the
> > dependency and conditional branch is actually in general *not* equivalent
> > in the hardware, in a concurrent context.
>
> So I agree, but I think that's a generic issue with non-local memory
> ordering, and is not at all specific to the optimization wrt that
> "x?42:42" expression.
>
> If you have a value that you loaded with a non-relaxed load, and you
> pass that value off to a non-local function that you don't know what
> it does, in my opinion that implies that the compiler had better add
> the necessary serialization to say "whatever that other function does,
> we guarantee the semantics of the load".
>
> So on ppc, if you do a load with "consume" or "acquire" and then call
> another function without having had something in the caller that
> serializes the load, you'd better add the lwsync or whatever before
> the call. Exactly because the function call itself otherwise basically
> breaks the visibility into ordering.
> You've basically turned a
> load-with-ordering-guarantees into just an integer that you passed off
> to something that doesn't know about the ordering guarantees - and you
> need that "lwsync" in order to still guarantee the ordering.
>
> Tough titties. That's what a CPU with weak memory ordering semantics
> gets in order to have sufficient memory ordering.

And that is in fact what C11 compilers are supposed to do if the function
doesn't have the [[carries_dependency]] attribute on the corresponding
argument or return of the non-local function.  If the function is marked
with [[carries_dependency]], then the compiler has the information needed
in both compilations to make things work correctly.

							Thanx, Paul

> And I don't think it's actually a problem in practice. If you are
> doing loads with ordered semantics, you're not going to pass the
> result off willy-nilly to random functions (or you really *do* require
> the ordering, because the load that did the "acquire" was actually for
> a lock!
>
> So I really think that the "local optimization" is correct regardless.
>
>                Linus
>
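
To make the [[carries_dependency]] point above concrete, here is a minimal
sketch in C++11 attribute syntax. Every name in it (Node, g_ptr, publish,
read_payload_dep, read_payload_plain, consume_annotated, consume_plain) is
hypothetical and exists only for illustration. The comments describe what the
standard permits the compiler to do, not what any particular compiler emits;
as far as I know, most current compilers simply treat consume as acquire, in
which case both callers end up with the same ordering.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical payload type, for illustration only.
struct Node {
    int payload;
};

// Hypothetical shared pointer, published by a writer and read by consumers.
std::atomic<Node*> g_ptr{nullptr};

// Writer side: the release store orders the initialization of *n before
// the pointer becomes visible to readers.
void publish(Node* n) {
    assert(n != nullptr);
    g_ptr.store(n, std::memory_order_release);
}

// Annotated callee: the attribute tells the compiler, in both the caller's
// and the callee's compilation, that the dependency chain from the caller's
// consume load continues through this parameter.
int read_payload_dep(Node* p [[carries_dependency]]) {
    return p->payload;
}

// Unannotated callee: it knows nothing about the ordering of its argument.
int read_payload_plain(Node* p) {
    return p->payload;
}

// Caller of the annotated function: the standard allows the dependency to
// be handed across the call, so no extra fence is required at the call site.
int consume_annotated() {
    Node* p = g_ptr.load(std::memory_order_consume);
    return p ? read_payload_dep(p) : -1;
}

// Caller of the unannotated function: here the implementation is expected
// to preserve the ordering itself before the call (on ppc, the lwsync
// mentioned in the quoted text), because the callee cannot.
int consume_plain() {
    Node* p = g_ptr.load(std::memory_order_consume);
    return p ? read_payload_plain(p) : -1;
}
```

In a real program publish() would run in one thread and the consume_*()
functions in others. Called from a single thread, both consumers return -1
until a node has been published and that node's payload afterwards.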