[Bug middle-end/51766] New: [4.7 regression] sync_fetch_and

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity
@ 2012-01-05 15:01 dje at gcc dot gnu.org
  2012-01-05 15:02 ` [Bug middle-end/51766] " dje at gcc dot gnu.org
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: dje at gcc dot gnu.org @ 2012-01-05 15:01 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

             Bug #: 51766
           Summary: [4.7 regression] sync_fetch_and_xxx atomicity
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: dje@gcc.gnu.org


During the C++11 memory model conversion, expand_builtin_sync_operation was
updated.  __sync_fetch_and_xxx and __sync_xxx_and_fetch now invoke
expand_atomic_fetch_op with MEMMODEL_SEQUENTIAL_CST.

  return expand_atomic_fetch_op (target, mem, val, code, MEMMODEL_SEQ_CST,
                                 after);

This has changed the memory model semantics of the builtins and causes the
rs6000/POWER architecture to emit a heavy-weight "sync" instruction instead of
a light-weight "lwsync" instruction, which is a regression.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
@ 2012-01-05 15:02 ` dje at gcc dot gnu.org
  2012-01-09 15:40 ` rguenth at gcc dot gnu.org
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: dje at gcc dot gnu.org @ 2012-01-05 15:02 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

David Edelsohn <dje at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-01-05
                 CC|                            |rth at gcc dot gnu.org
      Known to work|                            |4.6.0, 4.6.1
   Target Milestone|---                         |4.7.0
     Ever Confirmed|0                           |1
      Known to fail|                            |4.7.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
  2012-01-05 15:02 ` [Bug middle-end/51766] " dje at gcc dot gnu.org
@ 2012-01-09 15:40 ` rguenth at gcc dot gnu.org
  2012-01-09 15:45 ` redi at gcc dot gnu.org
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-01-09 15:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

Richard Guenther <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1

--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-01-09 15:40:10 UTC ---
The docs of __sync_* say

This builtin is not a full barrier, but rather an @dfn{acquire barrier}.
This means that references after the builtin cannot move to (or be
speculated to) before the builtin, but previous memory stores may not
be globally visible yet, and previous memory loads may not yet be
satisfied.

But it is not exactly clear to which builtins this applies.  Thus, is
the intended behavior actually target depedent?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
  2012-01-05 15:02 ` [Bug middle-end/51766] " dje at gcc dot gnu.org
  2012-01-09 15:40 ` rguenth at gcc dot gnu.org
@ 2012-01-09 15:45 ` redi at gcc dot gnu.org
  2012-01-09 16:51 ` dje at gcc dot gnu.org
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: redi at gcc dot gnu.org @ 2012-01-09 15:45 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> 2012-01-09 15:45:06 UTC ---
(In reply to comment #1)
> The docs of __sync_* say
> 
> This builtin is not a full barrier, but rather an @dfn{acquire barrier}.
> This means that references after the builtin cannot move to (or be
> speculated to) before the builtin, but previous memory stores may not
> be globally visible yet, and previous memory loads may not yet be
> satisfied.
> 
> But it is not exactly clear to which builtins this applies.  Thus, is
> the intended behavior actually target depedent?

It refers to __sync_lock_test_and_set only (it says "this builtin" and follows
that one)

And "This builtin is not a full barrier, but rather a release barrier." refers
to __sync_lock_release.

All the others are full barriers.  It says above them "In most cases, these
builtins are considered a full barrier." and only __sync_lock_test_and_set and
__sync_lock_release specify different barrier semantics.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2012-01-09 15:45 ` redi at gcc dot gnu.org
@ 2012-01-09 16:51 ` dje at gcc dot gnu.org
  2012-01-10  9:43 ` rguenth at gcc dot gnu.org
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: dje at gcc dot gnu.org @ 2012-01-09 16:51 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

--- Comment #3 from David Edelsohn <dje at gcc dot gnu.org> 2012-01-09 16:49:10 UTC ---
> It says above them "In most cases, these
> builtins are considered a full barrier." and only __sync_lock_test_and_set and
> __sync_lock_release specify different barrier semantics.

The next sentence is:

"That is, no memory operand will be moved across the operation, either forward
or
backward."

Note that this refers to memory operands, not memory operations -- memory
stores and memory loads referenced in documentation of the other sync builtins.
 In other words, one could interpret "full memory barrier" as:

asm volatile ("" : : : "memory");

that refers to a GCC scheduling barrier.

The GCC documentation references Intel processors, which do not have have a
distinction between instructions for RELEASE, ACQ_REL and SEQ_CST semantics.

The basic problem is that the GCC builtins and atomic instruction semantics
were designed for Intel processors that do not provide the level of granularity
implemented in POWER processors.  The POWER port implemented lighter weight
ACQ_REL semantics. Retrofitting the original builtins on the new C++11 memory
model semantics and imposing SEQ_CST interpretation has changed the behavior
and performance on POWER, but not on other targets.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2012-01-09 16:51 ` dje at gcc dot gnu.org
@ 2012-01-10  9:43 ` rguenth at gcc dot gnu.org
  2012-01-10 14:40 ` dje at gcc dot gnu.org
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-01-10  9:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

--- Comment #4 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-01-10 09:43:16 UTC ---
(In reply to comment #3)
> > It says above them "In most cases, these
> > builtins are considered a full barrier." and only __sync_lock_test_and_set and
> > __sync_lock_release specify different barrier semantics.
> 
> The next sentence is:
> 
> "That is, no memory operand will be moved across the operation, either forward
> or
> backward."
> 
> Note that this refers to memory operands, not memory operations -- memory
> stores and memory loads referenced in documentation of the other sync builtins.
>  In other words, one could interpret "full memory barrier" as:
> 
> asm volatile ("" : : : "memory");
> 
> that refers to a GCC scheduling barrier.
> 
> The GCC documentation references Intel processors, which do not have have a
> distinction between instructions for RELEASE, ACQ_REL and SEQ_CST semantics.
> 
> The basic problem is that the GCC builtins and atomic instruction semantics
> were designed for Intel processors that do not provide the level of granularity
> implemented in POWER processors.  The POWER port implemented lighter weight
> ACQ_REL semantics. Retrofitting the original builtins on the new C++11 memory
> model semantics and imposing SEQ_CST interpretation has changed the behavior
> and performance on POWER, but not on other targets.

But for more precise semantics we now have the __atomic_* builtins, right?
And the __sync_* ones are deprecated.  I don't see how we can preserve
old behavior for the __sync_* ones without adding a new target hook.
The documentation would need to be adjusted, of course, I'm not sure
that different atomic semantics are "useful" for these "portable"
synchronization primitives?

Thus, fixing the libstdc++ side seems worthwhile, but I'm not sure
with respect to the deprecated __sync builtins?


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2012-01-10  9:43 ` rguenth at gcc dot gnu.org
@ 2012-01-10 14:40 ` dje at gcc dot gnu.org
  2012-01-10 14:49 ` rguenth at gcc dot gnu.org
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: dje at gcc dot gnu.org @ 2012-01-10 14:40 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

--- Comment #5 from David Edelsohn <dje at gcc dot gnu.org> 2012-01-10 14:39:16 UTC ---
I understand that fixing __sync_* is a hassle.  This is why I opened a separate
bug for libstdc++.

While __sync_* is deprecated in favor of __atomic_*, use of __sync_* for
portability is fairly pervasive in FOSS applications that need it because of
its implementation in GCC.  Most programmers do not know about memory models
and do not care about memory models.  And it will take time for programmers to
switch to __atomic_*, if they even bother to choose a memory model and don't
introduce a bug.

The basic problem is MEMMODEL_SEQ_CST only makes a performance difference for
POWER and developers are going to continue to use __sync_* builtins for a
while.  This change in default behavior only hurts performance for applications
on POWER relative to all other architectures, which sucks. :-(

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2012-01-10 14:40 ` dje at gcc dot gnu.org
@ 2012-01-10 14:49 ` rguenth at gcc dot gnu.org
  2012-01-10 15:31 ` redi at gcc dot gnu.org
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: rguenth at gcc dot gnu.org @ 2012-01-10 14:49 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

--- Comment #6 from Richard Guenther <rguenth at gcc dot gnu.org> 2012-01-10 14:48:54 UTC ---
(In reply to comment #5)
> I understand that fixing __sync_* is a hassle.  This is why I opened a separate
> bug for libstdc++.
> 
> While __sync_* is deprecated in favor of __atomic_*, use of __sync_* for
> portability is fairly pervasive in FOSS applications that need it because of
> its implementation in GCC.  Most programmers do not know about memory models
> and do not care about memory models.  And it will take time for programmers to
> switch to __atomic_*, if they even bother to choose a memory model and don't
> introduce a bug.
> 
> The basic problem is MEMMODEL_SEQ_CST only makes a performance difference for
> POWER and developers are going to continue to use __sync_* builtins for a
> while.  This change in default behavior only hurts performance for applications
> on POWER relative to all other architectures, which sucks. :-(

Yes, I see that.  But my question is - did a developer reading the
documentation
get _correct_ code on POWER (which uses a laxer memory model than documented!)
in all cases?  Or can you construct a testcase that works fine on IA64
while surprisingly (after reading docs) does not work on POWER?  Thus, didn't
we simply fix a wrong-code bug (albeit by producing slower code)?


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2012-01-10 14:49 ` rguenth at gcc dot gnu.org
@ 2012-01-10 15:31 ` redi at gcc dot gnu.org
  2012-01-10 18:09 ` dje at gcc dot gnu.org
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: redi at gcc dot gnu.org @ 2012-01-10 15:31 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

--- Comment #7 from Jonathan Wakely <redi at gcc dot gnu.org> 2012-01-10 15:29:09 UTC ---
(In reply to comment #3)
> > It says above them "In most cases, these
> > builtins are considered a full barrier." and only __sync_lock_test_and_set and
> > __sync_lock_release specify different barrier semantics.
> 
> The next sentence is:
> 
> "That is, no memory operand will be moved across the operation, either forward
> or
> backward."

It continues:
"Further, instructions will be issued as necessary to prevent the processor
from speculating loads across the operation and from queuing stores after the
operation."

Doesn't that rule out lwsync, which allows loads to be moved to before the
store preceding the lwsync?


Isn't the fact users will continue to use the __sync builtins all the more
reason they should work as documented?


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2012-01-10 15:31 ` redi at gcc dot gnu.org
@ 2012-01-10 18:09 ` dje at gcc dot gnu.org
  2012-01-10 18:20 ` redi at gcc dot gnu.org
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: dje at gcc dot gnu.org @ 2012-01-10 18:09 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

--- Comment #8 from David Edelsohn <dje at gcc dot gnu.org> 2012-01-10 18:08:23 UTC ---
For the way that programmers use __sync_* builtins, release or acquire-release
semantics are sufficient.  As we see in libstdc++, release semantics are overly
strict when incrementing the reference, as opposed to destroying an object.

Again, there is no cost to Intel specifying sequential consistency as opposed
to a slightly weaker memory model.  Intel chose a memory model that matched and
benefited their architecture.  IBM should have the freedom to choose memory
models that benefit its architectures.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (8 preceding siblings ...)
  2012-01-10 18:09 ` dje at gcc dot gnu.org
@ 2012-01-10 18:20 ` redi at gcc dot gnu.org
  2012-01-10 18:24 ` redi at gcc dot gnu.org
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: redi at gcc dot gnu.org @ 2012-01-10 18:20 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

--- Comment #9 from Jonathan Wakely <redi at gcc dot gnu.org> 2012-01-10 18:20:00 UTC ---
(In reply to comment #8)
> For the way that programmers use __sync_* builtins, release or acquire-release
> semantics are sufficient.

Are you claiming you know how all programmers have used those builtins?

>  As we see in libstdc++, release semantics are overly
> strict when incrementing the reference, as opposed to destroying an object.
> 
> Again, there is no cost to Intel specifying sequential consistency as opposed
> to a slightly weaker memory model.  Intel chose a memory model that matched and
> benefited their architecture.  IBM should have the freedom to choose memory
> models that benefit its architectures.

How or why Intel chose those semantics is not really relevant, the fact is that
with the exception of lock_test_and_set and lock_release the __sync builtins
are documented to be full barriers, and always have been documented as full
barriers, both in GCC's and Intel's docs. If someone uses them in their code
relying on the fact they'll get a full barrier, then someone else runs that
code on POWER and there's a bug because it isn't a full barrier, who is to
blame for the bug?

That said, I don't have any personal interest in what they do on POWER, as long
as I'm not asked to deal with any libstdc++ bugs resulting from the builtins
not behaving as documented.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (9 preceding siblings ...)
  2012-01-10 18:20 ` redi at gcc dot gnu.org
@ 2012-01-10 18:24 ` redi at gcc dot gnu.org
  2012-01-10 18:34 ` dje at gcc dot gnu.org
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: redi at gcc dot gnu.org @ 2012-01-10 18:24 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

--- Comment #10 from Jonathan Wakely <redi at gcc dot gnu.org> 2012-01-10 18:22:01 UTC ---
(In reply to comment #8)
> IBM should have the freedom to choose memory
> models that benefit its architectures.

I meant to add that the new __atomic builtins give that freedom, by design.

Changing the documented behaviour of the __sync builtins to do something they
weren't designed for doesn't seem like a good idea - if they're inefficient
then convince people not to use them.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (10 preceding siblings ...)
  2012-01-10 18:24 ` redi at gcc dot gnu.org
@ 2012-01-10 18:34 ` dje at gcc dot gnu.org
  2012-01-10 18:47 ` jakub at gcc dot gnu.org
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: dje at gcc dot gnu.org @ 2012-01-10 18:34 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

--- Comment #11 from David Edelsohn <dje at gcc dot gnu.org> 2012-01-10 18:33:56 UTC ---
Jonathan,

I understand that the new __atomic_* builtins provide that flexibility.

I also am pointing out that GCC followed Intel semantics and Intel chose
semantics to benefit them.  Why should GCC prefer and impose semantics that
advantage a particular architecture or vendor?

You can argue that the __sync_* definition is the playing field and the rule
and POWER has to deal with it.  And I am responding that the builtins were
implemented appropriate for POWER and now GCC has moved the goal post.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (11 preceding siblings ...)
  2012-01-10 18:34 ` dje at gcc dot gnu.org
@ 2012-01-10 18:47 ` jakub at gcc dot gnu.org
  2012-01-10 19:14 ` amacleod at redhat dot com
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: jakub at gcc dot gnu.org @ 2012-01-10 18:47 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> 2012-01-10 18:46:59 UTC ---
If the PPC implementation of __sync_fetch_and_* and __sync_*_and_fetch was
intentionally implemented to violate the documentation, then the documentation
should have been adjusted at that point to note that on PPC it behaves
differently.  Otherwise it should be just considered a bug in the
implementation that it didn't implement the documented semantics.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (12 preceding siblings ...)
  2012-01-10 18:47 ` jakub at gcc dot gnu.org
@ 2012-01-10 19:14 ` amacleod at redhat dot com
  2012-01-10 20:26 ` dje at gcc dot gnu.org
  2012-01-12 20:41 ` dje at gcc dot gnu.org
  15 siblings, 0 replies; 17+ messages in thread
From: amacleod at redhat dot com @ 2012-01-10 19:14 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

Andrew Macleod <amacleod at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amacleod at redhat dot com

--- Comment #13 from Andrew Macleod <amacleod at redhat dot com> 2012-01-10 19:13:44 UTC ---
I don't think it matters what the original rationale was for implementing
__sync. They are documented as being seq-cst.

I would argue that power implementing them not as seq-cst is not truly
compliant.  It shouldn't be too difficult to write a test case using __sync
which expects seq-cst as documented that would pass on all targets except
power.

It may be that power-specific programmers have been taking advantage of this,
but doesn't it has to have the potential to break generic code?... (lucky TM
only needed relaxed atomics!)

That said... it would be possible to simply have an optional macro with a
default definition of 
  #define __SYNC_LEGACY_MODE   MEMMODEL_SEQ_CST
and if a port like power wants something different, it can define
  #define __SYNC_LEGACY_MODE   MEMMODEL_ACQ_REL

then use that value instead of MEMMODEL_SEQ_CST for legacy __sync routines.

Would that satisfy all your requirements?  shouldn't really need any other
changes then.  Although as Jakub says, it should then be clearly documented.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (13 preceding siblings ...)
  2012-01-10 19:14 ` amacleod at redhat dot com
@ 2012-01-10 20:26 ` dje at gcc dot gnu.org
  2012-01-12 20:41 ` dje at gcc dot gnu.org
  15 siblings, 0 replies; 17+ messages in thread
From: dje at gcc dot gnu.org @ 2012-01-10 20:26 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

--- Comment #14 from David Edelsohn <dje at gcc dot gnu.org> 2012-01-10 20:25:57 UTC ---
Jakub,

The POWER implementation of __sync_* was not written intentionally to violate
the documentation.  I don't think those of us who implemented the feature on
POWER realized the documentation was trying to require sequential consistency.

Given this clarification, the POWER maintainers need to figure out what
implementation is appropriate.

The lwsync implementation was more than sufficient for the common use of atomic
ops, like fetch_and_<op>.  That other architectures provide stronger semantics
is a bonus.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug middle-end/51766] [4.7 regression] sync_fetch_and_xxx atomicity
  2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
                   ` (14 preceding siblings ...)
  2012-01-10 20:26 ` dje at gcc dot gnu.org
@ 2012-01-12 20:41 ` dje at gcc dot gnu.org
  15 siblings, 0 replies; 17+ messages in thread
From: dje at gcc dot gnu.org @ 2012-01-12 20:41 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51766

David Edelsohn <dje at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

--- Comment #15 from David Edelsohn <dje at gcc dot gnu.org> 2012-01-12 20:33:57 UTC ---
It looks like POWER needs to accept and deal with the sequentially consistent
semantics now specified for the __sync_* intrinsics, so I am closing this bug
as WONTFIX.  But we still need to fix the libstdc++ regression created by this
change.


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2012-01-12 20:41 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-05 15:01 [Bug middle-end/51766] New: [4.7 regression] sync_fetch_and_xxx atomicity dje at gcc dot gnu.org
2012-01-05 15:02 ` [Bug middle-end/51766] " dje at gcc dot gnu.org
2012-01-09 15:40 ` rguenth at gcc dot gnu.org
2012-01-09 15:45 ` redi at gcc dot gnu.org
2012-01-09 16:51 ` dje at gcc dot gnu.org
2012-01-10  9:43 ` rguenth at gcc dot gnu.org
2012-01-10 14:40 ` dje at gcc dot gnu.org
2012-01-10 14:49 ` rguenth at gcc dot gnu.org
2012-01-10 15:31 ` redi at gcc dot gnu.org
2012-01-10 18:09 ` dje at gcc dot gnu.org
2012-01-10 18:20 ` redi at gcc dot gnu.org
2012-01-10 18:24 ` redi at gcc dot gnu.org
2012-01-10 18:34 ` dje at gcc dot gnu.org
2012-01-10 18:47 ` jakub at gcc dot gnu.org
2012-01-10 19:14 ` amacleod at redhat dot com
2012-01-10 20:26 ` dje at gcc dot gnu.org
2012-01-12 20:41 ` dje at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).