public inbox for gcc-bugs@sourceware.org
* [Bug c/114080] New: Inconsistent handling of 128bit ints between GCC versions
@ 2024-02-23 15:32 rapier at psc dot edu
  2024-02-23 15:37 ` [Bug c/114080] Inconsistent handling of unaligned " jakub at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: rapier at psc dot edu @ 2024-02-23 15:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114080

            Bug ID: 114080
           Summary: Inconsistent handling of 128bit ints between GCC
                    versions
           Product: gcc
           Version: 13.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rapier at psc dot edu
  Target Milestone: ---

I'm attempting to XOR two 128-bit ints with GCC 13.2.1 and consistently get a
segfault when optimization is set to -O2. This doesn't happen with older
versions of GCC. As an aside, what we are trying to do is XOR a stream of data
as quickly as possible, so using 128-bit ints reduces the number of XORs we
need to perform.

I've been using the following MWE to test this:
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>

/* just informative */
void printAlignment(void *ptr, char *label) {
        for (int ta = 64; ta > 0; ta /= 2)
                if ((uint64_t) ptr % ta == 0) {
                        printf("%s is %3u-bit aligned (%p)\n", label, ta * 8, ptr);
                        return;
                }
}

/* offset is the desired alignment in BYTES */
/* startptr exists to free it later */
void * misaligned_128bit_malloc(size_t offset, void **startptr) {
        *startptr = malloc(16 + offset); /* 16 bytes = 128 bits */
        void * ret = *startptr + 1; /* force minimal misalignment */
        while ((uint64_t) ret % offset != 0) /* iterate until aligned */
                ret = ret + 1;
        return ret;
}

int main() {
        __uint128_t *dstp, *srcp, *bufp;
        void *dst, *src, *buf;

        dstp = misaligned_128bit_malloc(16, &dst);
        srcp = misaligned_128bit_malloc(4,  &src);
        bufp = misaligned_128bit_malloc(8,  &buf);

        printAlignment(dstp, "dst");
        printAlignment(srcp, "src");
        printAlignment(bufp, "buf");

        *dstp = 0;
        /* fill in some dummy data */
        for(int i=0; i<16; i++) ((uint8_t *) srcp)[i] = 0x10;
        for(int i=0; i<16; i++) ((uint8_t *) bufp)[i] = i << 3;

        printf("src: 0x%016lx%016lx\n", (uint64_t) (*srcp >> 64), (uint64_t) (*srcp));
        printf("buf: 0x%016lx%016lx\n", (uint64_t) (*bufp >> 64), (uint64_t) (*bufp));
        printf("dst: 0x%016lx%016lx\n", (uint64_t) (*dstp >> 64), (uint64_t) (*dstp));

        printf("xoring...\n");
        fflush(stdout);
        *dstp = *srcp ^ *bufp;
        printf("success!\n");

        printf("dst: 0x%016lx%016lx\n", (uint64_t) (*dstp >> 64), (uint64_t) (*dstp));

        free(dst);
        free(src);
        free(buf);

        return 0;
}

Results:
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
~/test$ gcc -O2 mwe.c -o mwe
$ ./mwe
dst is 128-bit aligned (0x5637185eb2b0)
src is  32-bit aligned (0x5637185eb2d4)
buf is  64-bit aligned (0x5637185eb2f8)
src: 0x10101010101010101010101010101010
buf: 0x78706860585048403830282018100800
dst: 0x00000000000000000000000000000000
xoring...
success!
dst: 0x68607870484058502820383008001810

gcc version 13.2.1 20231011 (Red Hat 13.2.1-4) (GCC)
$ gcc -O2 mwe.c -o mwe
$ ./mwe
dst is 128-bit aligned (0x1cbc2b0)
src is  32-bit aligned (0x1cbc2d4)
buf is  64-bit aligned (0x1cbc2f8)
src: 0x10101010101010101010101010101010
buf: 0x78706860585048403830282018100800
dst: 0x00000000000000000000000000000000
xoring...
Segmentation fault (core dumped)

gcc version 13.2.1 20231011 (Red Hat 13.2.1-4) (GCC)
$ gcc -O0 mwe.c -o mwe
$ ./mwe
dst is 128-bit aligned (0xb022b0)
src is  32-bit aligned (0xb022d4)
buf is  64-bit aligned (0xb022f8)
src: 0x10101010101010101010101010101010
buf: 0x78706860585048403830282018100800
dst: 0x00000000000000000000000000000000
xoring...
success!
dst: 0x68607870484058502820383008001810

I don't know if this is a bug in 13.2.1 or if this might be undefined behaviour
that is now being enforced with a segfault. I've looked through the release
documents for 13.2.1 and didn't see anything that seems to indicate the latter,
but I might have missed it.

Any help or insight you could provide would be appreciated. 

Thanks for your time, 
Chris

* [Bug c/114080] Inconsistent handling of unaligned 128bit ints between GCC versions
From: jakub at gcc dot gnu.org @ 2024-02-23 15:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114080

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
That is undefined behavior, __int128/__int128_t/__uint128_t needs 16-byte
alignment.
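
For illustration, here is a minimal sketch of how that requirement can be
checked at run time. It assumes a C11 compiler that provides __uint128_t (GCC
or Clang on a 64-bit target); the helper name is_aligned_for_u128 is
hypothetical and does not come from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when p satisfies the alignment the
 * compiler requires for __uint128_t (16 bytes on x86-64). Only the address
 * is inspected; the pointer is never dereferenced. */
static bool is_aligned_for_u128(const void *p)
{
    return ((uintptr_t)p % _Alignof(__uint128_t)) == 0;
}
```

With the addresses printed by the MWE, such a check would accept dst
(0x...2b0) and reject src (0x...2d4) and buf (0x...2f8).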

* [Bug c/114080] Inconsistent handling of unaligned 128bit ints between GCC versions
From: pinskia at gcc dot gnu.org @ 2024-02-23 15:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114080

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |INVALID

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The alignment requirement for int128_t has always been 16 bytes.
So if you cast an unaligned pointer to an int128_t pointer, you run into C
undefined behavior.

It just happens that GCC now uses the SSE registers for operations such as XOR,
and the SSE loads it emits trap on unaligned addresses.

* [Bug c/114080] Inconsistent handling of unaligned 128bit ints between GCC versions
From: jakub at gcc dot gnu.org @ 2024-02-23 15:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114080

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The testcase segfaults since r13-1607-gc3ed9e0d6e96d8697e4bab994f8acbc5506240ee,
when the backend started using vector instructions more aggressively for
operations like the 128-bit logical ops, but that doesn't change the fact that
the testcase is invalid.
Don't do that.

* [Bug c/114080] Inconsistent handling of unaligned 128bit ints between GCC versions
From: redi at gcc dot gnu.org @ 2024-02-23 15:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114080

--- Comment #4 from Jonathan Wakely <redi at gcc dot gnu.org> ---
You could have checked this very easily using -fsanitize=undefined just like it
asks you to at https://gcc.gnu.org/bugs/ and at the top of the page when you
created this bug.

dst is 512-bit aligned (0x10112c0)
src is  32-bit aligned (0x10112e4)
buf is  64-bit aligned (0x1011308)
al.c:41:72: runtime error: load of misaligned address 0x0000010112e4 for type '__int128 unsigned', which requires 16 byte alignment
0x0000010112e4: note: pointer points here
  00 00 00 00 10 10 10 10  10 10 10 10 10 10 10 10  10 10 10 10 00 00 00 00  21 00 00 00 00 00 00 00
              ^
al.c:41:46: runtime error: load of misaligned address 0x0000010112e4 for type '__int128 unsigned', which requires 16 byte alignment
0x0000010112e4: note: pointer points here
  00 00 00 00 10 10 10 10  10 10 10 10 10 10 10 10  10 10 10 10 00 00 00 00  21 00 00 00 00 00 00 00
              ^
src: 0x10101010101010101010101010101010
al.c:42:72: runtime error: load of misaligned address 0x000001011308 for type '__int128 unsigned', which requires 16 byte alignment
0x000001011308: note: pointer points here
 00 00 00 00  00 08 10 18 20 28 30 38  40 48 50 58 60 68 70 78  11 04 00 00 00 00 00 00  73 72 63 3a
              ^
al.c:42:46: runtime error: load of misaligned address 0x000001011308 for type '__int128 unsigned', which requires 16 byte alignment
0x000001011308: note: pointer points here
 00 00 00 00  00 08 10 18 20 28 30 38  40 48 50 58 60 68 70 78  11 04 00 00 00 00 00 00  73 72 63 3a
              ^
buf: 0x78706860585048403830282018100800
dst: 0x00000000000000000000000000000000
xoring...
al.c:47:10: runtime error: load of misaligned address 0x0000010112e4 for type '__int128 unsigned', which requires 16 byte alignment
0x0000010112e4: note: pointer points here
  00 00 00 00 10 10 10 10  10 10 10 10 10 10 10 10  10 10 10 10 00 00 00 00  21 00 00 00 00 00 00 00
              ^
al.c:47:18: runtime error: load of misaligned address 0x000001011308 for type '__int128 unsigned', which requires 16 byte alignment
0x000001011308: note: pointer points here
 00 00 00 00  00 08 10 18 20 28 30 38  40 48 50 58 60 68 70 78  11 04 00 00 00 00 00 00  78 6f 72 69
              ^
success!
dst: 0x68607870484058502820383008001810

* [Bug c/114080] Inconsistent handling of unaligned 128bit ints between GCC versions
From: rapier at psc dot edu @ 2024-02-23 16:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114080

--- Comment #5 from Chris Rapier <rapier at psc dot edu> ---
So what you are saying is that behaviour *has* changed and what was a valid
operation for 15 years is now invalid. I'm not mad about that. I just needed to
know.

* [Bug c/114080] Inconsistent handling of unaligned 128bit ints between GCC versions
From: jakub at gcc dot gnu.org @ 2024-02-23 16:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114080

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Chris Rapier from comment #5)
> So what you are saying is that behaviour *has* changed and what was a valid
> operation for 15 years is now invalid. I'm not mad about that. I just needed
> to know.

No.  Please read the above comments again.  The testcase has always been
invalid, but invoking undefined behavior doesn't mean you always get a
segfault; that would then be defined behavior.  It can just as well seem to do
what the programmer expected it to do, or do something completely different.
See e.g. https://blog.regehr.org/archives/213 for more details.

* [Bug c/114080] Inconsistent handling of unaligned 128bit ints between GCC versions
From: pinskia at gcc dot gnu.org @ 2024-02-23 16:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114080

--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Chris Rapier from comment #5)
> So what you are saying is that behaviour *has* changed and what was a valid
> operation for 15 years is now invalid. I'm not mad about that. I just needed
> to know.

Behavior didn't change; it was always undefined. -fsanitize=undefined has been
catching it for almost the last 10 years (since r5-2363-g944fa280bc92d1).

* [Bug c/114080] Inconsistent handling of unaligned 128bit ints between GCC versions
From: rapier at psc dot edu @ 2024-02-23 16:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114080

--- Comment #8 from Chris Rapier <rapier at psc dot edu> ---
My apologies for misunderstanding and for coming across as aggressive in my
last response. This section of the code is about 15 years old, so it obviously
hasn't been subject to a close enough review until now. We'll be working on
fixing that. I really do want to thank you all for the insight. This has been
very helpful and, hopefully, we'll have a performant fix soon.

Thanks again for this. 

Chris

* [Bug c/114080] Inconsistent handling of unaligned 128bit ints between GCC versions
From: jakub at gcc dot gnu.org @ 2024-02-23 17:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114080

--- Comment #9 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note that most reasonably recent compilers handle a small constant-size memcpy
as an efficient way to load/store unaligned values, and it is also portable.
So, instead of
  *dstp = *srcp ^ *bufp;
if all of those can be unaligned, use
  __uint128_t t1, t2;
  memcpy (&t1, srcp, sizeof (t1));
  memcpy (&t2, bufp, sizeof (t2));
  t1 = t1 ^ t2;
  memcpy (dstp, &t1, sizeof (t1));
which should result in decent code (unless at -O0, of course).
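
As a sketch of how the memcpy approach above might extend to the stream-XOR
use case from the original report, the function below processes 16 bytes at a
time and finishes any remainder byte by byte. The name xor_stream and its
signature are assumptions made for illustration, not code from the reporter's
project, and it assumes dst does not overlap src or buf.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: dst[i] = src[i] ^ buf[i] for i in [0, n).
 * None of the three pointers needs any particular alignment, because every
 * 128-bit access goes through memcpy into a properly aligned local. */
static void xor_stream(uint8_t *dst, const uint8_t *src,
                       const uint8_t *buf, size_t n)
{
    size_t i = 0;

    /* Main loop: one 128-bit XOR per 16 bytes of input. */
    for (; i + 16 <= n; i += 16) {
        __uint128_t t1, t2;
        memcpy(&t1, src + i, sizeof t1);
        memcpy(&t2, buf + i, sizeof t2);
        t1 ^= t2;
        memcpy(dst + i, &t1, sizeof t1);
    }

    /* Tail: fewer than 16 bytes remain. */
    for (; i < n; i++)
        dst[i] = src[i] ^ buf[i];
}
```

Because the memcpy size is a compile-time constant, an optimizing compiler is
free to turn each call into a single unaligned load or store, which is the
effect the comment above describes.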
