From: Paolo Bonzini
Date: Fri, 09 Sep 2011 08:07:00 -0000
To: GCC Mailing List, Jakub Jelinek, Aldy Hernandez, amacleod@redhat.com
Subject: should sync builtins be full optimization barriers?
Message-ID: <4E69C942.3090808@gnu.org>

Hi all,

sync builtins are described in the documentation as being full memory barriers, with the possible exception of __sync_lock_test_and_set. However, GCC does not enforce that they are also full _optimization_ barriers: the RTL produced by the builtins does not in general include a memory optimization barrier such as a set of (mem/v:BLK (scratch:P)).
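To make the concern concrete, here is a minimal sketch in C of what a user has to write by hand if the builtins cannot be relied on as optimization barriers. It assumes GCC-style __sync builtins and inline asm; the names compiler_barrier, payload, ready, publish and try_consume are hypothetical and only serve the illustration.

```c
/* Hypothetical names throughout; GCC-style builtins and inline asm assumed. */

/* Compiler-only barrier: emits no instruction, but tells GCC that memory
   may have changed, so it must not cache values in registers or move
   memory accesses across this point. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

static int payload;   /* plain, non-atomic data */
static int ready;     /* flag handed over with a sync builtin */

/* Publish a value: write the payload, then set the flag atomically. */
static void publish(int value)
{
    payload = value;
    compiler_barrier();   /* keep the payload store above the builtin */
    __sync_bool_compare_and_swap(&ready, 0, 1);
    compiler_barrier();   /* keep later accesses below the builtin */
}

/* Consume: returns 1 and copies the payload to *out if one was published,
   returns 0 otherwise. */
static int try_consume(int *out)
{
    compiler_barrier();
    if (!__sync_bool_compare_and_swap(&ready, 1, 0))
        return 0;
    compiler_barrier();   /* keep the payload load below the builtin */
    *out = payload;
    return 1;
}
```

If the builtins were themselves optimization barriers, the explicit compiler_barrier() calls in this sketch would be redundant.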
The missing optimization barrier can cause problems with lock-free algorithms, for example this:

http://libdispatch.macosforge.org/trac/ticket/35

This can be solved either in generic code, by wrapping sync builtins (before and after) with an asm("":::"memory"), or in the individual machine descriptions, by adding a memory barrier in parallel to the locked instructions or with the ll/sc instructions.

Is the above analysis correct? Or should users insert explicit compiler barriers themselves?

Paolo