public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug middle-end/116383] New: Value from __atomic_store not forwarded to non-atomic load at same address
@ 2024-08-15 13:22 redbeard0531 at gmail dot com
  2024-08-15 16:21 ` [Bug middle-end/116383] " pinskia at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: redbeard0531 at gmail dot com @ 2024-08-15 13:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116383

            Bug ID: 116383
           Summary: Value from __atomic_store not forwarded to non-atomic
                    load at same address
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: redbeard0531 at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/1bbjoc87n

int test(int* i, int val) {
    __atomic_store_n(i, val, __ATOMIC_RELAXED);
    return *i;
}

The non-atomic load should be able to directly use the value stored by the
atomic store, but instead GCC issues a new load:

        mov     DWORD PTR [rdi], esi
        mov     eax, DWORD PTR [rdi]
        ret

Clang recognizes that the load is unnecessary and propagates the value:

        mov     eax, esi
        mov     dword ptr [rdi], esi
        ret

In addition to simply being an unnecessary load, there is an additionally
penalty in most CPUs from accessing reading a value still in the CPU's store
buffer which it almost certainly would be in this case. And of course this also
disables further optimizations eg DSE and value propagation where the compiler
knows something about the value.

void blocking_further_optimizations(int* i) {
    if (test(i, 1) == 0) {
        __builtin_abort();
    }
}

generates the following with gcc

        mov     DWORD PTR [rdi], 1
        mov     edx, DWORD PTR [rdi]
        test    edx, edx
        je      .L5
        ret
blocking_further_optimizations(int*) [clone .cold]:
.L5:
        push    rax
        call    abort

And this much better output with clang

        mov     dword ptr [rdi], 1
        ret

While I'm using a relaxed store here to show that gcc doesn't apply the
optimization in that case, I think the optimization should apply regardless of
memory ordering (and clang seems to agree). Also while the minimal example code
is contrived, there are several real-world use cases where this pattern can
come up. I would expect it in cases where there is a single writer thread but
many reader threads. The writes and off-thread reads need to use __atomic ops
to avoid data races, but on-thread reads should be safe using ordinary loads,
and you would want them to be optimized as much as possible.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug middle-end/116383] Value from __atomic_store not forwarded to non-atomic load at same address
  2024-08-15 13:22 [Bug middle-end/116383] New: Value from __atomic_store not forwarded to non-atomic load at same address redbeard0531 at gmail dot com
@ 2024-08-15 16:21 ` pinskia at gcc dot gnu.org
  2024-08-19 10:30 ` redi at gcc dot gnu.org
  2024-08-19 10:30 ` redi at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-08-15 16:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116383

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Severity|normal                      |enhancement

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I am not sure this is a wanted optimization.
Note I also think the original code is questionable and maybe even has a race
condition in it.

>but on-thread reads should be safe using ordinary loads, 

I don't think that is valid thing to do.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug middle-end/116383] Value from __atomic_store not forwarded to non-atomic load at same address
  2024-08-15 13:22 [Bug middle-end/116383] New: Value from __atomic_store not forwarded to non-atomic load at same address redbeard0531 at gmail dot com
  2024-08-15 16:21 ` [Bug middle-end/116383] " pinskia at gcc dot gnu.org
@ 2024-08-19 10:30 ` redi at gcc dot gnu.org
  2024-08-19 10:30 ` redi at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: redi at gcc dot gnu.org @ 2024-08-19 10:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116383

--- Comment #2 from Jonathan Wakely <redi at gcc dot gnu.org> ---
It's valid iff no other thread does any writes that are potentially concurrent,
e.g. if the other threads are only doing atomic_load on that variable.

And that's precisely why the optimization would be valid. If any other thread
is writing to the variable (atomically or not) the non-atomic load of *i would
be a data race, so undefined. So either no other thread is changing the value,
in which case we don't need to reload it, or the code has undefined behaviour,
in which case we don't need to reload it (because returning *any* value would
be valid, and returning the one we just wrote is better than returning garbage,
since that would be a valid result to get even if an atomic load had been used
instead).

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [Bug middle-end/116383] Value from __atomic_store not forwarded to non-atomic load at same address
  2024-08-15 13:22 [Bug middle-end/116383] New: Value from __atomic_store not forwarded to non-atomic load at same address redbeard0531 at gmail dot com
  2024-08-15 16:21 ` [Bug middle-end/116383] " pinskia at gcc dot gnu.org
  2024-08-19 10:30 ` redi at gcc dot gnu.org
@ 2024-08-19 10:30 ` redi at gcc dot gnu.org
  2 siblings, 0 replies; 4+ messages in thread
From: redi at gcc dot gnu.org @ 2024-08-19 10:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116383

Jonathan Wakely <redi at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2024-08-19

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-08-19 10:30 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-08-15 13:22 [Bug middle-end/116383] New: Value from __atomic_store not forwarded to non-atomic load at same address redbeard0531 at gmail dot com
2024-08-15 16:21 ` [Bug middle-end/116383] " pinskia at gcc dot gnu.org
2024-08-19 10:30 ` redi at gcc dot gnu.org
2024-08-19 10:30 ` redi at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).