From: "scw-gcc at google dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/110773] New: [Aarch64] crash (SIGBUS) due to atomic instructions on under-aligned memory
Date: Fri, 21 Jul 2023 23:46:41 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110773

            Bug ID: 110773
           Summary: [Aarch64] crash (SIGBUS) due to atomic instructions on under-aligned memory
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: scw-gcc at google dot com
  Target Milestone: ---

This reproduces in versions as far back as godbolt has ARM64 gcc (5.4).

The following code snippet has two 4-byte-aligned, 8-byte-sized objects, `fp1` and `fp2`. Their placement in `storage` guarantees that they are 12 bytes apart, so one of them is 8-byte aligned and the other is not (it is still 4-byte aligned, though).

=====
struct FloatPair {
  float f1;
  float f2;
};

struct Storage {
  FloatPair fp1;
  float padding;
  FloatPair fp2;
} storage;

float f() {
  FloatPair fp1, fp2;
  __atomic_load(&storage.fp1, &fp1, __ATOMIC_SEQ_CST);
  __atomic_load(&storage.fp2, &fp2, __ATOMIC_SEQ_CST);
  return fp1.f1 + fp1.f2 + fp2.f1 + fp2.f2;
}
=====

Godbolt with GCC and Clang: https://godbolt.org/z/P9rbTePnG

GCC uses two `ldar` instructions for the loads, while Clang makes calls to libatomic. The GCC codegen crashes on AArch64 machines (tested on Cavium ThunderX2 as well as Neoverse-N1).

AArch64 allows unaligned memory access, except for atomic operations:
https://developer.arm.com/documentation/ddi0596/2021-03/Shared-Pseudocode/AArch64-Functions?lang=en#AArch64.CheckAlignment.4

For memory reads, it uses the access size as the required alignment (unless it's operating on a pair of 64-bit registers):
https://developer.arm.com/documentation/ddi0596/2021-03/Shared-Pseudocode/AArch64-Functions?lang=en#impl-aarch64.Mem.read.3

In this case, `FloatPair` has size 8 and fits in one 64-bit register, so the 64-bit `ldar` can only be used on 8-byte-aligned reads. One of the two calls will thus violate the alignment requirement.

Potentially related: GCC also uses single atomic RMW instructions when available, regardless of alignment. To trigger this, one has to construct unaligned pointers, so I'm not sure if it's a problem.

=====
struct Storage {
  char c;
  int i;
} __attribute__((packed)) storage;

int inc() {
  return __atomic_add_fetch(&storage.i, 1, __ATOMIC_SEQ_CST);
}
=====

Needs -march=armv8.1-a (LSE) to reproduce: https://godbolt.org/z/qKM1fGbrj

In my testing, even using the clang codegen it still crashes inside libatomic, though.
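
The layout claims behind both snippets can be checked without touching any atomics. The following is only a sketch, assuming the usual ABI where `float` and `int` are 4 bytes with 4-byte alignment. `PackedStorage`, `is_aligned` and `exactly_one_8byte_aligned` are names made up for illustration and are not part of the original test cases; `PackedStorage` stands in for the packed `Storage` of the second snippet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct FloatPair {
  float f1;
  float f2;
};

struct Storage {
  FloatPair fp1;
  float padding;
  FloatPair fp2;
} storage;

// Hypothetical stand-in for the packed struct in the second snippet.
struct __attribute__((packed)) PackedStorage {
  char c;
  int i;
};

// Layout facts the report relies on (assumes 4-byte float and int).
static_assert(sizeof(FloatPair) == 8, "FloatPair fits one 64-bit register");
static_assert(alignof(FloatPair) == 4, "FloatPair is only 4-byte aligned");
static_assert(offsetof(Storage, fp2) - offsetof(Storage, fp1) == 12,
              "fp1 and fp2 are 12 bytes apart");
static_assert(offsetof(PackedStorage, i) == 1,
              "packed int member starts at an odd offset");
static_assert(alignof(PackedStorage) == 1,
              "packed struct has no alignment guarantee");

// Hypothetical helper: true if p is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Two 4-byte-aligned addresses that are 12 bytes apart differ by 4
// modulo 8, so exactly one of them lies on an 8-byte boundary.
inline bool exactly_one_8byte_aligned(const Storage& s) {
  return is_aligned(&s.fp1, 8) != is_aligned(&s.fp2, 8);
}
```

Wherever the linker places `storage`, `exactly_one_8byte_aligned(storage)` should return true, which is why one of the two 64-bit `ldar` loads is expected to hit an under-aligned address.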