From mboxrd@z Thu Jan  1 00:00:00 1970
Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-help.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-help/>
List-Post: <mailto:gcc-help@gcc.gnu.org>
List-Help: <mailto:gcc-help-help@gcc.gnu.org>
Sender: gcc-help-owner@gcc.gnu.org
Message-ID: <51AEA657.9080607@yahoo.com>
Date: Wed, 05 Jun 2013 02:45:00 -0000
From: dw <limegreensocks@yahoo.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130509 Thunderbird/17.0.6
MIME-Version: 1.0
To: "gcc-help@gcc.gnu.org" <gcc-help@gcc.gnu.org>
Subject: Re: Question about __builtin_ia32_mfence and memory barriers
References: <51AE7119.5090000@yahoo.com> <CAKOQZ8yxRncKoRjLcnR5rZnkybtOTAtCoLo9f-OJyCFe47JWEw@mail.gmail.com>
In-Reply-To: <CAKOQZ8yxRncKoRjLcnR5rZnkybtOTAtCoLo9f-OJyCFe47JWEw@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-SW-Source: 2013-06/txt/msg00028.txt.bz2

 > A better choice these days is __atomic_thread_fence(__ATOMIC_SEQ_CST)
 > (or __atomic_signal_fence).

This sounded so promising. Unfortunately, it's not producing the results
I need.  I can put all of these statements in the code, and none of them
generates -any- fence instruction:

     __atomic_thread_fence(__ATOMIC_RELAXED);
     __atomic_thread_fence(__ATOMIC_CONSUME);
     __atomic_thread_fence(__ATOMIC_ACQUIRE);
     __atomic_thread_fence(__ATOMIC_RELEASE);
     __atomic_thread_fence(__ATOMIC_ACQ_REL);

     __atomic_signal_fence(__ATOMIC_RELAXED);
     __atomic_signal_fence(__ATOMIC_CONSUME);
     __atomic_signal_fence(__ATOMIC_ACQUIRE);
     __atomic_signal_fence(__ATOMIC_RELEASE);
     __atomic_signal_fence(__ATOMIC_ACQ_REL);
     __atomic_signal_fence(__ATOMIC_SEQ_CST);

And while I get an mfence instruction with this:

     __atomic_thread_fence(__ATOMIC_SEQ_CST);

It doesn't produce quite the same instruction ordering as:

   asm volatile ("mfence" ::: "memory");

That makes me think that whatever __ATOMIC_SEQ_CST means, it's not the
same as the "memory" clobber.  Also, I'm looking to support SFENCE and
LFENCE, which the __atomic fence builtins don't appear to emit at all.
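
A minimal sketch of the single-statement approach, assuming GCC-style
extended asm on an x86 target.  The helper names (fence_full, fence_store,
fence_load, publish, consume, demo) are made up for illustration and are
not part of GCC or any library.  Each helper puts the fence instruction
and the "memory" clobber in one asm statement, so there is no gap between
the compiler barrier and the hardware fence:

```c
#include <assert.h>

#if defined(__x86_64__) || defined(__i386__)
/* One asm statement each: the fence instruction plus a "memory" clobber. */
static inline void fence_full(void)  { __asm__ __volatile__("mfence" ::: "memory"); }
static inline void fence_store(void) { __asm__ __volatile__("sfence" ::: "memory"); }
static inline void fence_load(void)  { __asm__ __volatile__("lfence" ::: "memory"); }
#else
/* Assumption: on non-x86 targets fall back to the __atomic builtins so the
   sketch still compiles; these are not equivalents of SFENCE/LFENCE. */
static inline void fence_full(void)  { __atomic_thread_fence(__ATOMIC_SEQ_CST); }
static inline void fence_store(void) { __atomic_thread_fence(__ATOMIC_RELEASE); }
static inline void fence_load(void)  { __atomic_thread_fence(__ATOMIC_ACQUIRE); }
#endif

static int payload;   /* data being handed over */
static int ready;     /* flag set after the data is written */

/* Write the data, fence, then set the flag. */
static void publish(int value)
{
    payload = value;
    fence_store();
    ready = 1;
}

/* Return the data if the flag is set, otherwise -1. */
static int consume(void)
{
    if (!ready)
        return -1;
    fence_load();
    return payload;
}

/* Single-threaded smoke test: it only checks that the helpers build and do
   not disturb results; it does not demonstrate cross-thread ordering. */
int demo(void)
{
    assert(consume() == -1);
    publish(42);
    fence_full();
    return consume();
}
```

Whether this matches the ordering MSVC gives for _mm_sfence and _mm_lfence
is exactly the open question, so treat it as a starting point for comparing
generated asm rather than as a settled answer.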

 > I'm not clear on whether _mm_mfence is meant to be a compiler memory
 > barrier or not.

Every authoritative reference I have found is maddeningly silent on this 
point.

However, I have tried compiling x64 code with MSVC, and the instruction
ordering it produces for _mm_mfence is not the same as what it produces
for _mm_sfence.  In fact, the asm produced when using _mm_sfence bears a
striking similarity to what you get with just _WriteBarrier (which, of
course, emits no sfence instruction), and the asm for _mm_mfence looks
like what you get with _ReadWriteBarrier.

While I'm not prepared to call this conclusive evidence, it is becoming 
suspicious.

And apparently I'm not the only person who thinks there is a problem
here:

http://doxygen.reactos.org/dd/dcb/intrin__x86_8h_a0dee6d755a43d9f9d8072d6202b487db.html#a0dee6d755a43d9f9d8072d6202b487db

I was concerned about using 2 statements and hoping the compiler didn't
re-order any code around them.  I'm not convinced that using 3 statements
makes me feel any better.
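
To make the concern concrete, here is a guess at the general shape of a
three-statement fence: a compiler barrier, the fence builtin, and another
compiler barrier.  This is an illustration only and is not copied from the
ReactOS header; the names compiler_barrier, fence_three_statements and
store_then_fence are hypothetical.  Nothing in the source ties the three
statements together, which is the worry:

```c
/* Compiler-only barrier: emits no instruction, only a "memory" clobber. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" ::: "memory");
}

/* Three separate statements; correctness relies on the compiler keeping
   the fence between the two barriers. */
static inline void fence_three_statements(void)
{
    compiler_barrier();
#if defined(__SSE2__)
    __builtin_ia32_mfence();                 /* GCC's x86 mfence builtin */
#else
    /* Assumption: fallback so the sketch builds when SSE2 is unavailable. */
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
    compiler_barrier();
}

static int slot;

/* Store a value, fence, and read it back (single-threaded check only). */
int store_then_fence(int v)
{
    slot = v;
    fence_three_statements();
    return slot;
}
```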

dw