From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugs-return-492358-listarch-gcc-bugs=gcc.gnu.org@gcc.gnu.org>
Received: (qmail 65715 invoked by alias); 15 Jul 2015 16:05:10 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-bugs.gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-help@gcc.gnu.org>
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 65688 invoked by uid 48); 15 Jul 2015 16:05:06 -0000
From: "tkoeppe at google dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/66881] New: Possibly inefficient std::atomic<int> codegen on x86 for simple arithmetic
Date: Wed, 15 Jul 2015 16:05:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: middle-end
X-Bugzilla-Version: 4.9.2
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: tkoeppe at google dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter target_milestone
Message-ID: <bug-66881-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-07/txt/msg01248.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66881

            Bug ID: 66881
           Summary: Possibly inefficient std::atomic<int> codegen on x86
                    for simple arithmetic
           Product: gcc
           Version: 4.9.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoeppe at google dot com
  Target Milestone: ---

Consider these two simple versions of addition:

  #include <atomic>

  std::atomic<int> x;
  int y;

  void f(int a) {
    x.store(x.load(std::memory_order_relaxed) + a, std::memory_order_relaxed);
  }

  void g(int a) {
    y += a;
  }

GCC generates the following assembly:

  f(int):
        mov     eax, DWORD PTR x[rip]
        add     edi, eax
        mov     DWORD PTR x[rip], edi
        ret

  g(int):
        add     DWORD PTR y[rip], edi
        ret

Now, it is clear to me that the correct atomic codegen for store() and load()
is "mov", as it appears here, but why aren't the two consecutive operations
folded into a single "add" with a memory operand, as in g()? Aren't the
semantics and the memory ordering the same? The x86 memory model treats (most)
reads and writes as strongly ordered; doesn't that apply to the read and write
produced by "add", too?
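
To be explicit about what is being asked: f() is a separate relaxed load and
relaxed store, not an atomic read-modify-write, so the question concerns a
plain "add", not a locked one. A minimal sketch of the distinction follows;
the names counter, add_load_store and add_rmw are hypothetical and used only
for illustration.

```cpp
#include <atomic>

std::atomic<int> counter{0};  // hypothetical name, stands in for x above

// Same shape as f(): a relaxed load followed by a relaxed store.
// Each access is atomic on its own, but the pair is not one atomic
// update: a concurrent store between the two can be overwritten.
void add_load_store(int a) {
  counter.store(counter.load(std::memory_order_relaxed) + a,
                std::memory_order_relaxed);
}

// A single atomic read-modify-write. This is a different operation
// from f(); on x86 it is expected to need a lock-prefixed instruction.
void add_rmw(int a) {
  counter.fetch_add(a, std::memory_order_relaxed);
}
```

Only the first form is the subject of this report; the second is shown purely
for contrast.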

(My original motivation came from a variant of this with floats, where the
non-atomic code executed noticeably faster, even though I would have expected
the two to produce the same machine code.)
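
For reference, the float variant could look roughly like the sketch below.
This is a reconstruction by analogy with the int example, not the original
code; the names xf, yf, ff and gf are assumptions.

```cpp
#include <atomic>

std::atomic<float> xf{0.0f};  // hypothetical float counterpart of x
float yf = 0.0f;              // hypothetical float counterpart of y

// Relaxed load, add, relaxed store: same pattern as f(), with float.
void ff(float a) {
  xf.store(xf.load(std::memory_order_relaxed) + a, std::memory_order_relaxed);
}

// Plain non-atomic addition: same pattern as g(), with float.
void gf(float a) {
  yf += a;
}
```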