public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/59767] New: __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier
@ 2014-01-11  2:09 mikulas at artax dot karlin.mff.cuni.cz
  2015-07-02 16:35 ` [Bug target/59767] " jamrial at gmail dot com
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: mikulas at artax dot karlin.mff.cuni.cz @ 2014-01-11  2:09 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767

            Bug ID: 59767
           Summary: __atomic_load_n with __ATOMIC_SEQ_CST should result in
                    a memory barrier
           Product: gcc
           Version: 4.8.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mikulas at artax dot karlin.mff.cuni.cz
              Host: x86_64-unknown-linux-gnu
            Target: x86_64-unknown-linux-gnu
             Build: x86_64-unknown-linux-gnu

The following function:
int f(void)
{
        int x;
        return __atomic_load_n(&x, __ATOMIC_SEQ_CST);
}
results in this code when compiled with -O2:
f:
.LFB0:
        .cfi_startproc
        movl    -4(%rsp), %eax
        ret
        .cfi_endproc
.LFE0:

The flag "__ATOMIC_SEQ_CST" should imply a full memory barrier before and after
the operation, but there is no barrier in the generated code.

The bug also exists in 32-bit mode and in x32 mode.

You should place a "mfence" instruction before the movl instruction that loads
the value.

I believe that it is not necessary to place the mfence instruction after the
memory load, but you should better check the ia32 specification for that.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/59767] __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier
  2014-01-11  2:09 [Bug target/59767] New: __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier mikulas at artax dot karlin.mff.cuni.cz
@ 2015-07-02 16:35 ` jamrial at gmail dot com
  2015-07-06 14:21 ` amacleod at redhat dot com
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: jamrial at gmail dot com @ 2015-07-02 16:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767

James Almer <jamrial at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jamrial at gmail dot com

--- Comment #1 from James Almer <jamrial at gmail dot com> ---
This still happens with gcc 4.9 and gcc 5.

int foo(int *bar)
{
    return *bar;
}

and

int foo(int *bar)
{
    return __atomic_load_n(bar, __ATOMIC_SEQ_CST);
}

Generate the exact same assembly. The latter should add an mfence before the
mov instruction.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/59767] __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier
  2014-01-11  2:09 [Bug target/59767] New: __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier mikulas at artax dot karlin.mff.cuni.cz
  2015-07-02 16:35 ` [Bug target/59767] " jamrial at gmail dot com
@ 2015-07-06 14:21 ` amacleod at redhat dot com
  2015-07-06 15:00 ` mikulas at artax dot karlin.mff.cuni.cz
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: amacleod at redhat dot com @ 2015-07-06 14:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767

Andrew Macleod <amacleod at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amacleod at redhat dot com

--- Comment #2 from Andrew Macleod <amacleod at redhat dot com> ---
I do not believe the lock is needed.  See
https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

GCC will emit the fence on all stores which then synchronize with loads.  If we
did emit a fence on the loads, then we wouldn't need the fence on stores.   It
is likely there will be more loads than stores, so the current approach is
usually more efficient.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/59767] __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier
  2014-01-11  2:09 [Bug target/59767] New: __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier mikulas at artax dot karlin.mff.cuni.cz
  2015-07-02 16:35 ` [Bug target/59767] " jamrial at gmail dot com
  2015-07-06 14:21 ` amacleod at redhat dot com
@ 2015-07-06 15:00 ` mikulas at artax dot karlin.mff.cuni.cz
  2015-07-06 16:16 ` amacleod at redhat dot com
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: mikulas at artax dot karlin.mff.cuni.cz @ 2015-07-06 15:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767

--- Comment #3 from mikulas at artax dot karlin.mff.cuni.cz ---
The problem here is that we don't really know what does the standard specify.

People often suggest the Batty's paper Mathematizing C++ Concurrency (
http://www.cl.cam.ac.uk/~pes20/cpp/popl085ap-sewell.pdf ) as the explanation,
but the paper really describes a different memory model than the C++ standard
(citing section 4: "Unfortunately N3092 allows the following nonsequentially
consistent execution of the SB example with SC atomics (initialisation writes,
such as (a) and (b), are non-atomic so that they need not be compiled with
memory fences): - We devised a stronger restriction on the values that may be
read by SC atomics, stated in §2.7, that does provide sequential consistency
here." - so it doesn't describe the standard, but something stronger that
authors have "devised")

You can look at this example https://gcc.gnu.org/ml/gcc/2014-02/msg00053.html .
The assert can fail - so it implies that __ATOMIC_SEQ_CST is not a full
barrier, it is somehow weaker. But how much weaker is it? Who knows?

I look at https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html and I would
really like to know "why is it so?" "Where does the standard specify this
mapping?" I couldn't really find an answer for that.
>From gcc-bugs-return-491575-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org Mon Jul 06 15:18:26 2015
Return-Path: <gcc-bugs-return-491575-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Delivered-To: listarch-gcc-bugs@gcc.gnu.org
Received: (qmail 89911 invoked by alias); 6 Jul 2015 15:18:26 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Delivered-To: mailing list gcc-bugs@gcc.gnu.org
Received: (qmail 89670 invoked by uid 55); 6 Jul 2015 15:18:17 -0000
From: "hjl at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/66749] [4.9/5/6] gcc.target/i386/addr-sel-1.c fails to merge array index into one instruction with -m32 -mregparm=3 or with -miamcu
Date: Mon, 06 Jul 2015 15:18:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 6.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: hjl at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 6.0
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID: <bug-66749-4-bMELDQwhSU@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-66749-4@http.gcc.gnu.org/bugzilla/>
References: <bug-66749-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-07/txt/msg00465.txt.bz2
Content-length: 1378

https://gcc.gnu.org/bugzilla/show_bug.cgi?idf749

--- Comment #6 from hjl at gcc dot gnu.org <hjl at gcc dot gnu.org> ---
Author: hjl
Date: Mon Jul  6 15:17:44 2015
New Revision: 225460

URL: https://gcc.gnu.org/viewcvs?rev"5460&root=gcc&view=rev
Log:
Add -march=iamcu to optimize for IA MCU

IA MCU is based on Intel Pentium ISA without x87 and passing parameters
in registers.  We want to optimize for IA MCU without changing existing
Pentium codegen.  This patch adds PROCESSOR_IAMCU for -march=iamcu,
which is based on -march=pentium with updated cost tables.

gcc/

        PR target/66749
        * config/i386/i386.c (iamcu_cost): New.
        (m_IAMCU): Likewise.
        (initial_ix86_arch_features): Disable X86_ARCH_CMOV for m_IAMCU.
        (processor_target_table): Add an entry for "iamcu".
        (processor_alias_table): Likewise.
        (ix86_issue_rate): Handle PROCESSOR_IAMCU.
        (ix86_adjust_cost): Likewise.
        (ia32_multipass_dfa_lookahead): Likewise.
        * config/i386/i386.h (processor_type): Add PROCESSOR_IAMCU.
        * config/i386/x86-tune.def: Updated for m_IAMCU.

gcc/testsuite/

        PR target/66749
        * gcc.target/i386/pr66749.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr66749.c
Modified:
    trunk/gcc/config/i386/i386.c
    trunk/gcc/config/i386/i386.h
    trunk/gcc/config/i386/x86-tune.def


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/59767] __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier
  2014-01-11  2:09 [Bug target/59767] New: __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier mikulas at artax dot karlin.mff.cuni.cz
                   ` (2 preceding siblings ...)
  2015-07-06 15:00 ` mikulas at artax dot karlin.mff.cuni.cz
@ 2015-07-06 16:16 ` amacleod at redhat dot com
  2015-07-06 17:06 ` mikulas at artax dot karlin.mff.cuni.cz
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: amacleod at redhat dot com @ 2015-07-06 16:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767

--- Comment #4 from Andrew Macleod <amacleod at redhat dot com> ---
I'm not sure where the problem is.  We interacted quite a bit as the model was
being developed. As I recall, it started with the standard, but they
strengthened some of the problem spots for a complete testable model. 

 To the best of my knowledge, this is the most efficient implementation that we
*know* to be safe, and that is why we implement it.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/59767] __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier
  2014-01-11  2:09 [Bug target/59767] New: __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier mikulas at artax dot karlin.mff.cuni.cz
                   ` (3 preceding siblings ...)
  2015-07-06 16:16 ` amacleod at redhat dot com
@ 2015-07-06 17:06 ` mikulas at artax dot karlin.mff.cuni.cz
  2015-07-06 18:40 ` amacleod at redhat dot com
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: mikulas at artax dot karlin.mff.cuni.cz @ 2015-07-06 17:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767

--- Comment #5 from mikulas at artax dot karlin.mff.cuni.cz ---
So, please pinpoint specific parahraph(s) in the standard that specify that
this behavior is acceptable.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/59767] __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier
  2014-01-11  2:09 [Bug target/59767] New: __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier mikulas at artax dot karlin.mff.cuni.cz
                   ` (4 preceding siblings ...)
  2015-07-06 17:06 ` mikulas at artax dot karlin.mff.cuni.cz
@ 2015-07-06 18:40 ` amacleod at redhat dot com
  2015-07-06 22:36 ` torvald at gcc dot gnu.org
  2015-07-09 11:40 ` amacleod at redhat dot com
  7 siblings, 0 replies; 9+ messages in thread
From: amacleod at redhat dot com @ 2015-07-06 18:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767

--- Comment #6 from Andrew Macleod <amacleod at redhat dot com> ---
The standard doesn't define what machines should generate what code. It defines
terms for observing effects that need to be adhered to. Their machine model was
created over a few years during the early stages of the memory model to help
observe and test those effects. To the best of our knowledge we think the
results from this model are an efficient and correct implementation within the
standards definition.

If you can provide a test case which demonstrates that this is not a correct
implementation based on observable effects (ie the load is observed out of
order somewhere), then we'd look at fixing it in GCC.

If you want to discuss whether parts are or aren't compliant without a test
case that fails, then you should contact the authors with your questions since
they can give you a far better answer than I ever could.


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/59767] __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier
  2014-01-11  2:09 [Bug target/59767] New: __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier mikulas at artax dot karlin.mff.cuni.cz
                   ` (5 preceding siblings ...)
  2015-07-06 18:40 ` amacleod at redhat dot com
@ 2015-07-06 22:36 ` torvald at gcc dot gnu.org
  2015-07-09 11:40 ` amacleod at redhat dot com
  7 siblings, 0 replies; 9+ messages in thread
From: torvald at gcc dot gnu.org @ 2015-07-06 22:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767

torvald at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |torvald at gcc dot gnu.org

--- Comment #8 from torvald at gcc dot gnu.org ---
(In reply to mikulas from comment #7)
> #include <stdatomic.h>
> 
> atomic_int a = ATOMIC_VAR_INIT(0);
> atomic_int b = ATOMIC_VAR_INIT(0);
> atomic_int p = ATOMIC_VAR_INIT(0);
> 
> int thread_1(void)
> {
>         atomic_store_explicit(&b, 1, memory_order_relaxed);
>         atomic_load_explicit(&p, memory_order_seq_cst);
>         return atomic_load_explicit(&a, memory_order_relaxed);
> }
> 
> int thread_2(void)
> {
>         atomic_store_explicit(&a, 1, memory_order_relaxed);
>         atomic_load_explicit(&p, memory_order_seq_cst);
>         return atomic_load_explicit(&b, memory_order_relaxed);
> }
> 
> See for example this. Suppose that thread_1 and thread_2 are executed
> concurrently. If memory_order_seq_cst were a proper full memory barrier, it
> would be impossible that both functions return 0.

memory_order_seq_cst is a memory order in the Standard's terminology.  Fences
are something else (ie, atomic_thread_fence()) , and parametrized by a memory
order.  A memory_order_seq_cst *memory access* does not have the same effects
as a memory_order_seq_cst fence.  See C++14 29.3p4-7; those paragraphs talk
about memory_order_seq_cst fences specifically, not about memory_order_seq_cst
operations in general.

If you want to make this example of Dekker synchronization correct, you need to
use fences instead of the accesses to p; alternatively, you need to use seq-cst
accesses for all the stores and loads to a and b, in which case there will be
HW fences added via the stores (as Andrew already pointed out).


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug target/59767] __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier
  2014-01-11  2:09 [Bug target/59767] New: __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier mikulas at artax dot karlin.mff.cuni.cz
                   ` (6 preceding siblings ...)
  2015-07-06 22:36 ` torvald at gcc dot gnu.org
@ 2015-07-09 11:40 ` amacleod at redhat dot com
  7 siblings, 0 replies; 9+ messages in thread
From: amacleod at redhat dot com @ 2015-07-09 11:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59767

Andrew Macleod <amacleod at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #9 from Andrew Macleod <amacleod at redhat dot com> ---
GCC emits the fence on stores, and thus loads don't need a fence.


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-07-09 11:40 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-11  2:09 [Bug target/59767] New: __atomic_load_n with __ATOMIC_SEQ_CST should result in a memory barrier mikulas at artax dot karlin.mff.cuni.cz
2015-07-02 16:35 ` [Bug target/59767] " jamrial at gmail dot com
2015-07-06 14:21 ` amacleod at redhat dot com
2015-07-06 15:00 ` mikulas at artax dot karlin.mff.cuni.cz
2015-07-06 16:16 ` amacleod at redhat dot com
2015-07-06 17:06 ` mikulas at artax dot karlin.mff.cuni.cz
2015-07-06 18:40 ` amacleod at redhat dot com
2015-07-06 22:36 ` torvald at gcc dot gnu.org
2015-07-09 11:40 ` amacleod at redhat dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).