public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
From: "peter at cordes dot ca" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug sanitizer/84508] Load of misaligned address using _mm_load_sd
Date: Sat, 26 Mar 2022 23:35:31 +0000	[thread overview]
Message-ID: <bug-84508-4-rTU1s3p7Pp@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-84508-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84508

Peter Cordes <peter at cordes dot ca> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |peter at cordes dot ca

--- Comment #14 from Peter Cordes <peter at cordes dot ca> ---
This bug is mis-categorized; it's not a sanitizer bug, it's a bug in the
implementation _mm_load_ss / sd.

It currently derefs the  `float const*` arg directly, which is not
strict-aliasing or alignment safe.  alignof(float) is 4, but Intel's
documentation for this API still says "mem_addr does not need to be aligned on
any particular boundary."

_mm_load_ss (float const *__P)
{
  return _mm_set_ss (*__P);
}


As discussed on PR99754 _mm_load_si32(const void*) *is* strict-aliasing and
alignment safe.  But it only existed recently, and GCC11's implementation of it
is buggy (shuffling the element to the wrong place).  Before that, one safe way
to do a 32-bit SIMD load is with _mm_load_ss and _mm_castps_si128.  Or it was
supposed to be safe, but isn't!!

Clang uses a packed may_alias struct containing a float to get a safe load
done.  Another way would be casting the pointer to

typdef float aliasing_unaligned_f32 __attribute__((aligned(1),may_alias));

This is similar to what we do with __m32_u for use in aliasing-safe integer
load/store, except we define that as int with
vector_size(4),may_alias,aligned(1) for some reason.  Perhaps influenced by
__m64_u which is a vector of 2 ints.

MSVC is like gcc -fno-strict-aliasing, so however it handles intrinsics,
they're always aliasing-safe.

I'm not 100% sure about what ICC formally guarantees, but in practice it
doesn't move aliasing short*  stores across a _mm_load_ss( (float*)pshort )
load.
https://godbolt.org/z/6s76v71xz  I didn't test with _mm_store_ss aliasing with
short loads, only vice versa.

So GCC is the odd one out, out of the major 4 compilers that support Intel's
intrinsics API.  All our narrow load/store intrinsics should be strict-aliasing
and alignment safe, regardless of what pointer type they accept.

Intel's early design of taking float* and double* instead of void* could be
considered poor design.  Their naming with just load/store instead of
_mm_loadu_ss / storeu is also poor design, clearly motivated by the asm
differences rather than an actual intrinsic API difference.

In x86 asm, loads/stores narrower than 16 bytes never require alignment (unless
the AC bit is set in EFLAGS).  Assuming Intel modeled their intrinsics API
after their asm, then it makes sense to have load and loadu for ps and si128,
but only load/store with an implied lack of alignment for intrinsics that wrap
instructions like movlps / movhps / movss / movsd, and movd / movq, which do
narrower memory accesses.

That of course *doesn't* make sense in C terms, where it's always potentially a
problem to dereference misaligned pointers to narrow objects, even when
compiling for x86-64:
https://stackoverflow.com/questions/47510783/why-does-unaligned-access-to-mmaped-memory-sometimes-segfault-on-amd64
has an example and links some others, showing that compilers *don't* define the
behaviour of deref of misaligned pointers.

I'm pretty certain that Intel always intended their narrow load/store
intrinsics to not have any alignment requirements, like the asm instructions
that wrap them, but weren't thinking in C terms when naming them.  And were
sloppily in their choices of which ones to provide until decades later, since
it seems they thought that _mm_cvtsi32_si128(*x) was sufficient for a movd
load.  (Only the case on a compiler without strict-aliasing or alignment, since
the deref happens on the user's plain int*).

Anyway, hopefully this refutes the argument that _mm_load_sd should be aligned
because of the name, and clarifies what Intel might have been thinking when
naming these.

  parent reply	other threads:[~2022-03-26 23:35 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <bug-84508-4@http.gcc.gnu.org/bugzilla/>
2021-05-17  8:08 ` jakub at gcc dot gnu.org
2022-03-26 23:35 ` peter at cordes dot ca [this message]
2022-03-28  1:59 ` crazylht at gmail dot com
2022-03-28  2:09 ` pinskia at gcc dot gnu.org
2022-03-28  7:23 ` peter at cordes dot ca
2023-12-17 23:21 ` pinskia at gcc dot gnu.org
2024-05-09  7:43 ` cvs-commit at gcc dot gnu.org
2024-05-09  8:55 ` liuhongt at gcc dot gnu.org
2024-05-09 19:02 ` cvs-commit at gcc dot gnu.org
2024-05-29 18:45 ` pcordes at gmail dot com
2024-05-29 19:01 ` noloader at gmail dot com
2024-05-29 19:13 ` pcordes at gmail dot com
2024-05-30  6:01 ` liuhongt at gcc dot gnu.org
2024-05-30  6:33 ` liuhongt at gcc dot gnu.org
2024-05-30 18:38 ` pcordes at gmail dot com

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-84508-4-rTU1s3p7Pp@http.gcc.gnu.org/bugzilla/ \
    --to=gcc-bugzilla@gcc.gnu.org \
    --cc=gcc-bugs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).