From: "marc.mutz at hotmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug c++/104480] New: [trunk] Combining stores across memory locations might violate [intro.memory]/3
Date: Thu, 10 Feb 2022 08:13:21 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104480

            Bug ID: 104480
           Summary: [trunk] Combining stores across memory locations might
                    violate [intro.memory]/3
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marc.mutz at hotmail dot com
  Target Milestone: ---

I'm not sure whether GCC trunk just became much smarter, or introduced a
regression. Sorry if it's the former.

Consider:

    // https://gcc.godbolt.org/z/ch8rTob7c
    struct S1 {
        int a1 : 16;
        int a2 : 16;
    };
    struct S2 {
        short a1;
        short a2;
    };

    extern char x;

    template <typename T>
    void f(T &t) {
        t.a1 = x;
        t.a2 = x + 1;
    }

    template void f(S1 &);
    template void f(S2 &);

All GCC versions up to 11.2 use two movw instructions to implement both f()
instantiations. GCC trunk now uses a single movl in both instantiations.
That's clearly allowed for f<S1>() by [intro.memory]/3, because adjacent
bit-fields share one memory location, but it's less clear that it's an
allowed optimisation for S2, because a1 and a2 are two separate memory
locations there. Clang, in fact, produces different code for the two
instantiations.

Of course, GCC might be the clever kid here and realize that it can combine
the writes, because that's a valid observable sequence. But an object of type
S2, having alignment 2, may cross a cache-line boundary, in which case the
movl might not be atomic, even on x86, and then a different core may observe
the writes out of order, which probably shouldn't happen.
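
To make the cache-line concern concrete, here is a minimal sketch (not part of the original report) showing that a correctly aligned S2 can span two cache lines. The 64-byte line size and the helper names `kLine`, `straddles_line` and `s2_can_straddle` are assumptions made for illustration. The result also depends on the ABI giving S2 a size larger than its alignment, as the usual x86-64 ABIs do (size 4, alignment 2).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

struct S2 { short a1; short a2; };

// Assumed cache-line size; 64 bytes is typical for current x86 parts.
constexpr std::size_t kLine = 64;

// True if the byte range [addr, addr + size) touches more than one line.
constexpr bool straddles_line(std::uintptr_t addr, std::size_t size,
                              std::size_t line = kLine) {
    return size != 0 && (addr / line) != ((addr + size - 1) / line);
}

// Places an S2 at the last suitably aligned offset before a line boundary
// and reports whether the object spans that boundary.
inline bool s2_can_straddle() {
    alignas(kLine) static unsigned char buf[2 * kLine];
    constexpr std::size_t offset = kLine - alignof(S2);
    static_assert(offset % alignof(S2) == 0,
                  "offset must satisfy S2's alignment");
    S2 *p = new (buf + offset) S2{};
    // The placement address is valid for S2, so nothing here is misaligned.
    assert(reinterpret_cast<std::uintptr_t>(p) % alignof(S2) == 0);
    return straddles_line(reinterpret_cast<std::uintptr_t>(p), sizeof(S2));
}
```

With the assumed layout, the object sits at offset 62 of a 64-byte-aligned buffer, so a1 lies in one line and a2 in the next. A single 4-byte store to that address then covers both lines, which is the case the report worries about.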