public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/110573] New: MIPS64: Enhancement PR of load of pointer to atomic
@ 2023-07-06 13:58 luke.geeson at cs dot ucl.ac.uk
  2023-07-06 14:02 ` [Bug middle-end/110573] " luke.geeson at cs dot ucl.ac.uk
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-07-06 13:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110573

            Bug ID: 110573
           Summary: MIPS64: Enhancement PR of load of pointer to atomic
           Product: gcc
           Version: 13.1.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luke.geeson at cs dot ucl.ac.uk
                CC: luke.geeson at cs dot ucl.ac.uk
  Target Milestone: ---
            Target: MIPS64

This is my first report for GCC, so please forgive me if I make a mistake. This
is an enhancement report: the behaviour of the program is correct, but one
instruction could be removed, which would make the output consistent with the
non-atomic variant of the code below.

Consider the following code, compiled with GCC 13.1.0 for MIPS64
(https://godbolt.org/z/as68sEWda):

```
void P1() {
  int r0;
  r0 = *y;
  if (r0 == 1) {
    atomic_store_explicit(x, 1, memory_order_release);
  }

  *P1_r0 = r0;
}
```
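For readers who want to compile the snippet locally, here is a self-contained sketch. The declarations of `x`, `y` and `P1_r0` are assumptions: the report does not show them, and they are only inferred from the `%got_disp` loads in the generated code, which suggest global pointers. The Compiler Explorer link remains the authoritative source.

```c
#include <stdatomic.h>

/* Assumed declarations: not shown in the report. The generated code
   loads x and P1_r0 through the GOT and then dereferences them, so
   they are taken to be global pointers here. */
atomic_int *x;   /* pointer to the atomic location (assumed type) */
int *y;          /* pointer to a plain int read by P1 (assumed type) */
int *P1_r0;      /* where P1 records the value it observed (assumed type) */

void P1(void) {
  int r0 = *y;
  if (r0 == 1) {
    /* Release store: on MIPS64 this is the `sync` + `sw` pair in .L6. */
    atomic_store_explicit(x, 1, memory_order_release);
  }
  *P1_r0 = r0;
}
```

Compiling this with the flags given in the report should produce code of the same shape as on Compiler Explorer, provided the assumed declarations match the original.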
When compiled with `-O1 -pthread -std=c11 -g -c`, the unconditional branch
`b .L7` at the end of `.L6` loads a pointer to `P1_r0` in its delay slot. The
same load also appears on the line directly above `.L7`, on the other path, so
the instruction is emitted twice.


```
                                            #... (code in if branch)
        ld      $3,%got_disp(x)($5)

        ld      $3,%got_disp(P1_r0)($5)     # ld P1_r0 on branch taken
.L7:
        ld      $3,0($3)
        jr      $31
        sw      $2,0($3)

.L6:
        ld      $3,0($3)
        sync
        li      $4,1                        # 0x1
        sw      $4,0($3)
        b       .L7
        ld      $3,%got_disp(P1_r0)($5)     # ld P1_r0 on branch not taken
```


The two `ld`s of `P1_r0` could be merged into a single one, thus saving one
instruction:


```
                                            #... (code in if branch)
        ld      $3,%got_disp(x)($5)
.L7:
        ld      $3,0($3)
        jr      $31
        sw      $2,0($3)

.L6:    ld      $3,%got_disp(P1_r0)($5)
        ld      $3,0($3)
        sync
        li      $4,1                        # 0x1
        b       .L7
        sw      $4,0($3)
```

The above optimisation is already performed if x is non-atomic (see
https://godbolt.org/z/8dhxvsE18).
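As a rough sketch of that non-atomic variant, assuming it differs only in that `x` points to a plain `int` and the release store becomes an ordinary assignment (the exact source is behind the link and is not reproduced here; the declarations are assumptions, as before):

```c
/* Assumed declarations: not shown in the report. */
int *x;       /* plain (non-atomic) location in this variant (assumed) */
int *y;       /* pointer to a plain int read by P1 (assumed type) */
int *P1_r0;   /* where P1 records the value it observed (assumed type) */

void P1(void) {
  int r0 = *y;
  if (r0 == 1) {
    /* Plain store: no memory ordering is requested, so no `sync`. */
    *x = 1;
  }
  *P1_r0 = r0;
}
```

The only source-level difference from the atomic version is the store inside the `if`, which is why one would expect the surrounding loads of `P1_r0` to be scheduled the same way in both.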

The optimisation can also be applied at `-O2`
(https://godbolt.org/z/8aMj6xqTq).

I hope this helps.

Thread overview: 7+ messages
2023-07-06 13:58 [Bug middle-end/110573] New: MIPS64: Enhancement PR of load of pointer to atomic luke.geeson at cs dot ucl.ac.uk
2023-07-06 14:02 ` [Bug middle-end/110573] " luke.geeson at cs dot ucl.ac.uk
2023-07-06 17:30 ` pinskia at gcc dot gnu.org
2023-07-06 17:31 ` pinskia at gcc dot gnu.org
2023-07-06 23:44 ` [Bug rtl-optimization/110573] branch delay slots are not filled with atomic stores luke.geeson at cs dot ucl.ac.uk
2023-07-06 23:45 ` luke.geeson at cs dot ucl.ac.uk
2023-07-06 23:49 ` pinskia at gcc dot gnu.org
