public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access
@ 2024-03-03  7:03 akihiko.odaki at daynix dot com
  2024-03-03  7:10 ` [Bug sanitizer/114217] " pinskia at gcc dot gnu.org
                   ` (13 more replies)
  0 siblings, 14 replies; 15+ messages in thread
From: akihiko.odaki at daynix dot com @ 2024-03-03  7:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

            Bug ID: 114217
           Summary: -fsanitize=alignment false positive with intended
                    unaligned struct member access
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: sanitizer
          Assignee: unassigned at gcc dot gnu.org
          Reporter: akihiko.odaki at daynix dot com
                CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
                    jakub at gcc dot gnu.org, kcc at gcc dot gnu.org
  Target Milestone: ---

-fsanitize=alignment generates a false positive error for an intended unaligned
struct member access. The intention of unaligned struct member access is
expressed with __builtin_memcpy() as done by QEMU or packed struct access as
done by Linux. GCC translates such a construct to code to access memory
unaligned for architectures like rv64gc as intended but also emits code to
enforce the alignment.

The relevant code of QEMU is at:
https://gitlab.com/qemu-project/qemu/-/blob/v8.2.1/include/qemu/bswap.h?ref_type=tags
The relevant code of Linux is at:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/asm-generic/unaligned.h?h=v6.7

FYI, this issue is reproducible also with clang 17.0.1, and I'm going to open
an issue for it, too.

To reproduce the issue, compile the code shown below with -O2
-fsanitize=alignment for rv64gc:

#include <stdint.h>

typedef uint64_t u64;

/*
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/linux/compiler_attributes.h?h=v6.7
*/

/*
 *   gcc:
https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-packed-type-attribute
 * clang:
https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#index-packed-variable-attribute
 */
#define __packed                        __attribute__((__packed__))

/*
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/asm-generic/unaligned.h?h=v6.7
*/

#define __get_unaligned_t(type, ptr) ({                                        
\
        const struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr);     
\
        __pptr->x;                                                             
\
})

#define __put_unaligned_t(type, val, ptr) do {                                 
\
        struct { type x; } __packed *__pptr = (typeof(__pptr))(ptr);           
\
        __pptr->x = (val);                                                     
\
} while (0)

#define get_unaligned(ptr)      __get_unaligned_t(typeof(*(ptr)), (ptr))
#define put_unaligned(val, ptr) __put_unaligned_t(typeof(*(ptr)), (val), (ptr))

/*
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/btrfs/inode.c?h=v6.7
*/

struct dir_entry {
        u64 ino;
        u64 offset;
        unsigned type;
        int name_len;
};

/*
 * This function is intended to perform an unaligned access.
 * GCC emits code for an unaligned operation as intended,
 * but also emits code to assert alignment.
 */
u64 f(struct dir_entry *entry)
{
    return get_unaligned(&entry->offset);
}

/*
 * This function is intended to perform an aligned access.
 * GCC emits code for an aligned operation,
 * and emits code to assert alignment.
 */
u64 g(struct dir_entry *entry)
{
    return entry->offset;
}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
@ 2024-03-03  7:10 ` pinskia at gcc dot gnu.org
  2024-03-03  7:15 ` pinskia at gcc dot gnu.org
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-03  7:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
>but also emits code to assert alignment.


Yes because the code is broken still.

The alignment is not about when the access happens but rather when the pointer
is casted to.

So in this case when passing in the argument to f, the argument entry should be
aligned to what the `struct dir_entry` is aligned to; otherwise it is undefined
code.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
  2024-03-03  7:10 ` [Bug sanitizer/114217] " pinskia at gcc dot gnu.org
@ 2024-03-03  7:15 ` pinskia at gcc dot gnu.org
  2024-03-03  7:19 ` akihiko.odaki at daynix dot com
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-03  7:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
That is if we have:

void f(void)
{
  char t[sizeof(int)] __attribute__((aligned(1)));
  int *a = (int*)&t;
  //
}

The above code is undefined even if you have not accessed via *a at all.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
  2024-03-03  7:10 ` [Bug sanitizer/114217] " pinskia at gcc dot gnu.org
  2024-03-03  7:15 ` pinskia at gcc dot gnu.org
@ 2024-03-03  7:19 ` akihiko.odaki at daynix dot com
  2024-03-03  7:22 ` pinskia at gcc dot gnu.org
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: akihiko.odaki at daynix dot com @ 2024-03-03  7:19 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

--- Comment #3 from Akihiko Odaki <akihiko.odaki at daynix dot com> ---
(In reply to Andrew Pinski from comment #1)
> >but also emits code to assert alignment.
> 
> 
> Yes because the code is broken still.
> 
> The alignment is not about when the access happens but rather when the
> pointer is casted to.
> 
> So in this case when passing in the argument to f, the argument entry should
> be aligned to what the `struct dir_entry` is aligned to; otherwise it is
> undefined code.


I had a similar thought when I faced the same issue before and didn't report it
then, but this time I realized GCC still emits code to perform slow unaligned
access for such a construct. Whichever is right, to assume an aligned or
unaligned access, it is not consistent.

Theoretically, it also makes sense to emit unaligned memory access for such a
construct instead of ignoring it when -fsanitize=address, but I'm worried that
such a change will break too many things.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
                   ` (2 preceding siblings ...)
  2024-03-03  7:19 ` akihiko.odaki at daynix dot com
@ 2024-03-03  7:22 ` pinskia at gcc dot gnu.org
  2024-03-03  7:29 ` akihiko.odaki at daynix dot com
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-03  7:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/asm-generic/unaligned.h?h=v6.7

is correct except it should not expose get_unaligned/put_unaligned since the
undefined code happens way before.

The problem is with the btrfs code in btrfs_filldir:
```
static int btrfs_filldir(void *addr, int entries, struct dir_context *ctx)
{
        while (entries--) {
                struct dir_entry *entry = addr; /// THIS IS BROKEN and causes
the -fsanitize=alignment error
                char *name = (char *)(entry + 1);

                ctx->pos = get_unaligned(&entry->offset);
                if (!dir_emit(ctx, name, get_unaligned(&entry->name_len),
                                         get_unaligned(&entry->ino),
                                         get_unaligned(&entry->type)))
                        return 1;
                addr += sizeof(struct dir_entry) +
                        get_unaligned(&entry->name_len);
                ctx->pos++;
        }
        return 0;
}
```

Added comment on where the error comes from. The get_unaligned macro really
should not be used here. What should be used here is an unaligned version of
`struct dir_entry` instead.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
                   ` (3 preceding siblings ...)
  2024-03-03  7:22 ` pinskia at gcc dot gnu.org
@ 2024-03-03  7:29 ` akihiko.odaki at daynix dot com
  2024-03-03  7:46 ` akihiko.odaki at daynix dot com
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: akihiko.odaki at daynix dot com @ 2024-03-03  7:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

--- Comment #5 from Akihiko Odaki <akihiko.odaki at daynix dot com> ---
(In reply to Andrew Pinski from comment #4)
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/
> include/asm-generic/unaligned.h?h=v6.7
> 
> is correct except it should not expose get_unaligned/put_unaligned since the
> undefined code happens way before.
> 
> The problem is with the btrfs code in btrfs_filldir:
> ```
> static int btrfs_filldir(void *addr, int entries, struct dir_context *ctx)
> {
> 	while (entries--) {
> 		struct dir_entry *entry = addr; /// THIS IS BROKEN and causes the
> -fsanitize=alignment error
> 		char *name = (char *)(entry + 1);
> 
> 		ctx->pos = get_unaligned(&entry->offset);
> 		if (!dir_emit(ctx, name, get_unaligned(&entry->name_len),
> 					 get_unaligned(&entry->ino),
> 					 get_unaligned(&entry->type)))
> 			return 1;
> 		addr += sizeof(struct dir_entry) +
> 			get_unaligned(&entry->name_len);
> 		ctx->pos++;
> 	}
> 	return 0;
> }
> ```
> 
> Added comment on where the error comes from. The get_unaligned macro really
> should not be used here. What should be used here is an unaligned version of
> `struct dir_entry` instead.

I understand the idea. What I don't get is that GCC still emits code for
unaligned memory access in such a case. It is just a waste of performance if
GCC doesn't provide a guarantee that the unaligned access is performed in such
a case and is not optimal.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
                   ` (4 preceding siblings ...)
  2024-03-03  7:29 ` akihiko.odaki at daynix dot com
@ 2024-03-03  7:46 ` akihiko.odaki at daynix dot com
  2024-03-03 19:01 ` jakub at gcc dot gnu.org
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: akihiko.odaki at daynix dot com @ 2024-03-03  7:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

--- Comment #6 from Akihiko Odaki <akihiko.odaki at daynix dot com> ---
(In reply to Andrew Pinski from comment #4)
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/
> include/asm-generic/unaligned.h?h=v6.7
> 
> is correct except it should not expose get_unaligned/put_unaligned since the
> undefined code happens way before.
> 
> The problem is with the btrfs code in btrfs_filldir:
> ```
> static int btrfs_filldir(void *addr, int entries, struct dir_context *ctx)
> {
> 	while (entries--) {
> 		struct dir_entry *entry = addr; /// THIS IS BROKEN and causes the
> -fsanitize=alignment error
> 		char *name = (char *)(entry + 1);
> 
> 		ctx->pos = get_unaligned(&entry->offset);
> 		if (!dir_emit(ctx, name, get_unaligned(&entry->name_len),
> 					 get_unaligned(&entry->ino),
> 					 get_unaligned(&entry->type)))
> 			return 1;
> 		addr += sizeof(struct dir_entry) +
> 			get_unaligned(&entry->name_len);
> 		ctx->pos++;
> 	}
> 	return 0;
> }
> ```
> 
> Added comment on where the error comes from. The get_unaligned macro really
> should not be used here. What should be used here is an unaligned version of
> `struct dir_entry` instead.

With looking at this comment, I did another experiment to see if it's specific
to struct members, 
Also, note that this behavior of UBSan is specific to struct members. Think of
the following functions:

u64 f2(u64 *offset)
{
    return get_unaligned(offset);
}

u64 g2(u64 *offset)
{
    return *offset;
}

f2() and g2() correspond to f() and g(). The only difference is that it does
not involve struct member access. Nevertheless, GCC changes its behavior and
doesn't emit alignment checks for f2().

If casting a pointer with a strict alignment requirement to one with a relaxed
alignment requirement doesn't relax the alignment requirement, UBSan should
emit an error for f2().

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
                   ` (5 preceding siblings ...)
  2024-03-03  7:46 ` akihiko.odaki at daynix dot com
@ 2024-03-03 19:01 ` jakub at gcc dot gnu.org
  2024-03-04  5:26 ` akihiko.odaki at daynix dot com
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-03-03 19:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
GCC actually doesn't diagnose on mere pointer assignment, but what triggers the
alignment check is
&entry->offset
even when the code later on just takes its address, entry must be sufficiently
aligned, otherwise entry->offset is invalid.
Under standard C rules, already forming the pointer would be UB, so somewhere
in the caller when you prepare what to pass to the f function.

If you want something that will still be invalid C,
but will not trigger UBSAN errors, then e.g.
unsigned long long h(struct dir_entry *entry)
{
    return get_unaligned((unsigned long long *) (((char *) entry) + offsetof
(struct dir_entry, offset)));
}
will do.
If you want something that will be valid even in C, don't pass struct dir_entry
*entry
argument, but void *entry instead, and use e.g.
__get_unaligned_t(__typeof(((struct dir_entry *)0)->offset), ((char
*)entry)+offsetof(struct dir_entry, offset)))
You can surely hide that all under some macro.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
                   ` (6 preceding siblings ...)
  2024-03-03 19:01 ` jakub at gcc dot gnu.org
@ 2024-03-04  5:26 ` akihiko.odaki at daynix dot com
  2024-03-04  7:46 ` jakub at gcc dot gnu.org
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: akihiko.odaki at daynix dot com @ 2024-03-04  5:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

--- Comment #8 from Akihiko Odaki <akihiko.odaki at daynix dot com> ---
(In reply to Jakub Jelinek from comment #7)
> GCC actually doesn't diagnose on mere pointer assignment, but what triggers
> the alignment check is
> &entry->offset
> even when the code later on just takes its address, entry must be
> sufficiently aligned, otherwise entry->offset is invalid.
> Under standard C rules, already forming the pointer would be UB, so
> somewhere in the caller when you prepare what to pass to the f function.
> 
> If you want something that will still be invalid C,
> but will not trigger UBSAN errors, then e.g.
> unsigned long long h(struct dir_entry *entry)
> {
>     return get_unaligned((unsigned long long *) (((char *) entry) + offsetof
> (struct dir_entry, offset)));
> }
> will do.

It would certainly workaround the issue, but it's only dirtier and brings no
benefit except suppressed UBSan errors. Why not allow
get_unaligned(&entry->offset) when UBSan does not complain about that hack?

> If you want something that will be valid even in C, don't pass struct
> dir_entry *entry
> argument, but void *entry instead, and use e.g.
> __get_unaligned_t(__typeof(((struct dir_entry *)0)->offset), ((char
> *)entry)+offsetof(struct dir_entry, offset)))
> You can surely hide that all under some macro.

The definition still involves UB for ((struct dir_entry *)0)->offset. Perhaps
__typeof() may be considered as an exception, but what if offsetof() is defined
as follows?

#define offsetof(T, x) ((uintptr_t)&(((T *)0)->x))

GCC does provide __builtin_offsetof(), but I think definitions of offsetof()
like this are still prevalent, and expected to work although it's UB. If GCC
tolerates this kind of trick, why not tolerate get_unaligned(&entry->offset)?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
                   ` (7 preceding siblings ...)
  2024-03-04  5:26 ` akihiko.odaki at daynix dot com
@ 2024-03-04  7:46 ` jakub at gcc dot gnu.org
  2024-03-04  7:54 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-03-04  7:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jsm28 at gcc dot gnu.org

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Akihiko Odaki from comment #8)
> It would certainly workaround the issue, but it's only dirtier and brings no
> benefit except suppressed UBSan errors. Why not allow
> get_unaligned(&entry->offset) when UBSan does not complain about that hack?

The purpose of the sanitizer is to find undefined behavior, and it found one.
If you don't want it to find undefined behavior, don't use it.

> > If you want something that will be valid even in C, don't pass struct
> > dir_entry *entry
> > argument, but void *entry instead, and use e.g.
> > __get_unaligned_t(__typeof(((struct dir_entry *)0)->offset), ((char
> > *)entry)+offsetof(struct dir_entry, offset)))
> > You can surely hide that all under some macro.
> 
> The definition still involves UB for ((struct dir_entry *)0)->offset.
> Perhaps __typeof() may be considered as an exception, but what if offsetof()
> is defined as follows?
> 
> #define offsetof(T, x) ((uintptr_t)&(((T *)0)->x))

typeof nor sizeof operators don't evaluate their arguments unless the
expressions have VLA type.  So, if there would be undefined behavior if they
were evaluated, it doesn't matter because they aren't.  So, e.g.

void
foo (int n, int *p)
{
  typedef int A[n];
  A b[4];
  typedef int C[10];
  C d[4];
  int i = 0;
  int j = 0;
  typeof (b[++i]) e;
  typeof (d[++j]) f;
  typeof (d[100][4]) g;
  p[0] = i;
  p[1] = j;
}

doesn't invoke undefined behavior and sets p[0] to 1 and p[1] to 0, b[++i] had
VLA type, so it got evaluated, the other 2 didn't, so ++j wasn't evaluated and
the d[100]
UB wasn't triggered, as if the program did if (n == 54) d[100][4]; and n at
runtime was
never 54.  __typeof is a GNU extension but behaves the same way.

offsetof is a standard C macro.  It is not defined in the standard as
dereferencing NULL pointer, but as if there was an object of the T type and it
computed the difference between the address of the x member of it and the start
of the object.
If offsetof is defined as
#define offsetof(T, x) ((size_t)(uintptr_t)&(((T *)0)->x))
by the implementation (and yes, e.g. GCC < 4 used to define it similarly), then
obviously the compiler can't assume there is undefined behavior on such
expressions,
but it didn't at that point, there was no UBSAN, such expressions evaluated to
a constant expression early before anything could have considered there might
be UB and after it was just the resulting constant.
If the implementation doesn't have some exception that considers expressions
like
&(((T *)0)->x) valid, then it can't use such offsetof definition and needs to
use something else, like GCC uses __builtin_offsetof.  That said, GCC AFAIK
still treats expressions like the above as offsetof-like (because there has
been a lot of such code in the wild for decades), folds them to constant right
away and doesn't complain about it with ubsan; clang does complain about it
though and doesn't treat it that way.
Anyway, if you use something like ((size_t)(uintptr_t)&(((T *)0)->x)), per C
rules you are invoking UB, while if you use offsetof(T, x), it is well defined.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
                   ` (8 preceding siblings ...)
  2024-03-04  7:46 ` jakub at gcc dot gnu.org
@ 2024-03-04  7:54 ` jakub at gcc dot gnu.org
  2024-03-04  8:11 ` akihiko.odaki at daynix dot com
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-03-04  7:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
BTW, with compilers from the last decade or so, it would be much better idea to
just use standard memcpy for get_unaligned/put_unaligned rather than messing
around with pointers to packed types.  The compiler should optimize it
correctly to unaligned loads or stores itself when one operand is cast from
pointer to certain integral type and the size is the size of that type.  Sure,
you'd need to still use the GNU ({ ... }) extension
#define __get_unaligned_t(type, ptr) ({                                        
\
        type __get_unaligned_t_x;                                              
\
        memcpy(&__get_unaligned_t_x, (ptr), sizeof (__get_unaligned_t_x));     
\
        __get_unaligned_t_x;                                                   
\
})
For the get_unaligned from structure element you'd then have something like
get_unaligned_member(type, member, ptr) macro where it would use
typeof (((type *)0)->member) and offsetof (type, member) and memcpy.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
                   ` (9 preceding siblings ...)
  2024-03-04  7:54 ` jakub at gcc dot gnu.org
@ 2024-03-04  8:11 ` akihiko.odaki at daynix dot com
  2024-03-04  8:35 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: akihiko.odaki at daynix dot com @ 2024-03-04  8:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

--- Comment #11 from Akihiko Odaki <akihiko.odaki at daynix dot com> ---
(In reply to Jakub Jelinek from comment #9)
> (In reply to Akihiko Odaki from comment #8)
> > It would certainly workaround the issue, but it's only dirtier and brings no
> > benefit except suppressed UBSan errors. Why not allow
> > get_unaligned(&entry->offset) when UBSan does not complain about that hack?
> 
> The purpose of the sanitizer is to find undefined behavior, and it found one.
> If you don't want it to find undefined behavior, don't use it.
> 
> > > If you want something that will be valid even in C, don't pass struct
> > > dir_entry *entry
> > > argument, but void *entry instead, and use e.g.
> > > __get_unaligned_t(__typeof(((struct dir_entry *)0)->offset), ((char
> > > *)entry)+offsetof(struct dir_entry, offset)))
> > > You can surely hide that all under some macro.
> > 
> > The definition still involves UB for ((struct dir_entry *)0)->offset.
> > Perhaps __typeof() may be considered as an exception, but what if offsetof()
> > is defined as follows?
> > 
> > #define offsetof(T, x) ((uintptr_t)&(((T *)0)->x))
> 
> typeof nor sizeof operators don't evaluate their arguments unless the
> expressions have VLA type.  So, if there would be undefined behavior if they
> were evaluated, it doesn't matter because they aren't.  So, e.g.
> 
> void
> foo (int n, int *p)
> {
>   typedef int A[n];
>   A b[4];
>   typedef int C[10];
>   C d[4];
>   int i = 0;
>   int j = 0;
>   typeof (b[++i]) e;
>   typeof (d[++j]) f;
>   typeof (d[100][4]) g;
>   p[0] = i;
>   p[1] = j;
> }
> 
> doesn't invoke undefined behavior and sets p[0] to 1 and p[1] to 0, b[++i]
> had VLA type, so it got evaluated, the other 2 didn't, so ++j wasn't
> evaluated and the d[100]
> UB wasn't triggered, as if the program did if (n == 54) d[100][4]; and n at
> runtime was
> never 54.  __typeof is a GNU extension but behaves the same way.
> 
> offsetof is a standard C macro.  It is not defined in the standard as
> dereferencing NULL pointer, but as if there was an object of the T type and
> it computed the difference between the address of the x member of it and the
> start of the object.
> If offsetof is defined as
> #define offsetof(T, x) ((size_t)(uintptr_t)&(((T *)0)->x))
> by the implementation (and yes, e.g. GCC < 4 used to define it similarly),
> then obviously the compiler can't assume there is undefined behavior on such
> expressions,
> but it didn't at that point, there was no UBSAN, such expressions evaluated
> to a constant expression early before anything could have considered there
> might be UB and after it was just the resulting constant.
> If the implementation doesn't have some exception that considers expressions
> like
> &(((T *)0)->x) valid, then it can't use such offsetof definition and needs
> to use something else, like GCC uses __builtin_offsetof.  That said, GCC
> AFAIK still treats expressions like the above as offsetof-like (because
> there has been a lot of such code in the wild for decades), folds them to
> constant right away and doesn't complain about it with ubsan; clang does
> complain about it though and doesn't treat it that way.
> Anyway, if you use something like ((size_t)(uintptr_t)&(((T *)0)->x)), per C
> rules you are invoking UB, while if you use offsetof(T, x), it is well
> defined.

So there are two constructs invoking UBs but ignored by UBSan: 1) construction
of misaligned pointers that do not involve a struct member, and 2) pointer
arithmetic with a member of an invalid struct pointer as done by old
offsetof().

If UBSan catches get_unaligned(&entry->offset), why not catch those constructs
too? Perhaps 2) may not be relevant for UBSan since it's done at compile time,
but 1) is.

Whether 1) should be caught is a more complicated question. From a certain
viewpoint, it should be caught since it's UB, and future GCC and other
compilers may introduce incompatible optimization; there will be real bugs if
that happens.

On the other hand, the current GCC does not perform such an optimization, and
clang does not either at least when a packed struct is used or
__builtin_memcpy() is used and its argument is casted to void * or char *
(c.f., https://github.com/llvm/llvm-project/issues/83710). In that sense,
catching 1) brings only noises.

In any case, the answer to the question of whether 1) should be caught should
not be different from whether get_unaligned(&entry->offset) should be caught.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
                   ` (10 preceding siblings ...)
  2024-03-04  8:11 ` akihiko.odaki at daynix dot com
@ 2024-03-04  8:35 ` jakub at gcc dot gnu.org
  2024-03-04  8:45 ` akihiko.odaki at daynix dot com
  2024-03-04 21:48 ` i at maskray dot me
  13 siblings, 0 replies; 15+ messages in thread
From: jakub at gcc dot gnu.org @ 2024-03-04  8:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

--- Comment #12 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Akihiko Odaki from comment #11)
> So there are two constructs invoking UBs but ignored by UBSan: 1)

That is an understatement. UBSan is a best effort which diagnoses some forms of
undefined behavior.  There are tons of undefined behavior UBSan doesn't catch,
most importantly e.g. aliasing violations, but far from limited to just that.
If a program is diagnostic free with -fsanitize=undefined,address , it doesn't
mean it is UB free, but the goal is that if there is diagnostic, there is a
real UB in the program.

You are basically asking for the PR80797 fix to be reverted just because you
aren't willing to fix UB in your code.  That is not going to happen, we've been
diagnosing this for almost 7 years now, I think clang even for 11 years, it is
a real UB and other projects have been able to cope with it.  By reverting the
change new UB in other programs couldn't be discovered.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
                   ` (11 preceding siblings ...)
  2024-03-04  8:35 ` jakub at gcc dot gnu.org
@ 2024-03-04  8:45 ` akihiko.odaki at daynix dot com
  2024-03-04 21:48 ` i at maskray dot me
  13 siblings, 0 replies; 15+ messages in thread
From: akihiko.odaki at daynix dot com @ 2024-03-04  8:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

--- Comment #13 from Akihiko Odaki <akihiko.odaki at daynix dot com> ---
(In reply to Jakub Jelinek from comment #12)
> (In reply to Akihiko Odaki from comment #11)
> > So there are two constructs invoking UBs but ignored by UBSan: 1)
> 
> That is an understatement. UBSan is a best effort which diagnoses some forms
> of undefined behavior.  There are tons of undefined behavior UBSan doesn't
> catch,
> most importantly e.g. aliasing violations, but far from limited to just that.
> If a program is diagnostic free with -fsanitize=undefined,address , it
> doesn't mean it is UB free, but the goal is that if there is diagnostic,
> there is a real UB in the program.

Right. I just listed the two relevant constructs and don't intend to say they
are the only case that UBSan doesn't catch.

> 
> You are basically asking for the PR80797 fix to be reverted just because you
> aren't willing to fix UB in your code.  That is not going to happen, we've
> been diagnosing this for almost 7 years now, I think clang even for 11
> years, it is a real UB and other projects have been able to cope with it. 
> By reverting the change new UB in other programs couldn't be discovered.

I'm not asking for reverting PR80797. See f() and f2() I wrote earlier:
u64 f(struct dir_entry *entry)
{
    return get_unaligned(&entry->offset);
}

u64 f2(u64 *offset)
{
    return get_unaligned(offset);
}

Both f(NULL) and f2(NULL) should be caught and there should be no discrepancy
in behavior for these two functions. However, there is a discrepancy when it
comes to -fsanitize=alignment.

I'm not saying ignoring UB in f() is the only sensible option either. Catching
UBs in both f() and f2() is a logical option.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Bug sanitizer/114217] -fsanitize=alignment false positive with intended unaligned struct member access
  2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
                   ` (12 preceding siblings ...)
  2024-03-04  8:45 ` akihiko.odaki at daynix dot com
@ 2024-03-04 21:48 ` i at maskray dot me
  13 siblings, 0 replies; 15+ messages in thread
From: i at maskray dot me @ 2024-03-04 21:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114217

Fangrui Song <i at maskray dot me> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |i at maskray dot me

--- Comment #14 from Fangrui Song <i at maskray dot me> ---
I agree with Jakub and Andrew.

The relevant rules: C11 6.3.2.3 says

> An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
>
> A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. ...

C++ [expr.static.cast]p14 says of conversions from a misaligned pointer:

> A prvalue of type “pointer to cv1 void” can be converted to a prvalue of type “pointer to cv2 T”, where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. ...

Which is allowed to be an invalid pointer value, which the compiler is then
permitted to give whatever semantics we like, such as disallowing it being
passed to memcpy.

---

memcpy is preferred for expressing an unaligned access.

typedef struct dir_entry dir_entry_u __attribute__((aligned(1)));
// In C++, there is an alternative: using dir_entry_u
__attribute__((aligned(1))) = dir_entry;

u64 gu(dir_entry_u *entry)
{
    return entry->offset;
}

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2024-03-04 21:48 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-03  7:03 [Bug sanitizer/114217] New: -fsanitize=alignment false positive with intended unaligned struct member access akihiko.odaki at daynix dot com
2024-03-03  7:10 ` [Bug sanitizer/114217] " pinskia at gcc dot gnu.org
2024-03-03  7:15 ` pinskia at gcc dot gnu.org
2024-03-03  7:19 ` akihiko.odaki at daynix dot com
2024-03-03  7:22 ` pinskia at gcc dot gnu.org
2024-03-03  7:29 ` akihiko.odaki at daynix dot com
2024-03-03  7:46 ` akihiko.odaki at daynix dot com
2024-03-03 19:01 ` jakub at gcc dot gnu.org
2024-03-04  5:26 ` akihiko.odaki at daynix dot com
2024-03-04  7:46 ` jakub at gcc dot gnu.org
2024-03-04  7:54 ` jakub at gcc dot gnu.org
2024-03-04  8:11 ` akihiko.odaki at daynix dot com
2024-03-04  8:35 ` jakub at gcc dot gnu.org
2024-03-04  8:45 ` akihiko.odaki at daynix dot com
2024-03-04 21:48 ` i at maskray dot me

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).