[Bug middle-end/110573] New: MIPS64: Enhancement PR of load of pointer to atomic

public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed

* [Bug middle-end/110573] New: MIPS64: Enhancement PR of load of pointer to atomic
@ 2023-07-06 13:58 luke.geeson at cs dot ucl.ac.uk
  2023-07-06 14:02 ` [Bug middle-end/110573] " luke.geeson at cs dot ucl.ac.uk
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-07-06 13:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110573

            Bug ID: 110573
           Summary: MIPS64: Enhancement PR of load of pointer to atomic
           Product: gcc
           Version: 13.1.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luke.geeson at cs dot ucl.ac.uk
                CC: luke.geeson at cs dot ucl.ac.uk
  Target Milestone: ---
            Target: MIPS64

This is my first report for GCC so please forgive me if I make a mistake. This
is an enhancement report - the behaviour of the program is ok, but an
instruction could be removed to be consistent with the non-atomic variant of
the below code.

Consider the code in GCC 13.1.0 built for MIPS64
(https://godbolt.org/z/as68sEWda)

```
void P1() {
  int r0;
  r0 = *y;
  if(r0 == (1))   {
    atomic_store_explicit(x,1,memory_order_release);  }

  *P1_r0 = r0;
}
```
When compiled using `-O1 -pthread -std=c11 -g -c` the branch to label L7 loads
a pointer to `P1_r0` using the delay slot. Likewise `P1_r0` is loaded the line
above L7 when the branch is taken. 


```
                                            #... (code in if branch)
        ld      $3,%got_disp(x)($5)

        ld      $3,%got_disp(P1_r0)($5).    # ld P1_r0 on branch taken
.L7:
        ld      $3,0($3)
        jr      $31
        sw      $2,0($3)

.L6:
        ld      $3,0($3)
        sync
        li      $4,1                        # 0x1
        sw      $4,0($3)
        b       .L7
        ld      $3,%got_disp(P1_r0)($5).  # ld P1_r0 on branch not taken
```


The ld could be moved into L7, thus saving one instruction:


```
                                            #... (code in if branch)
        ld      $3,%got_disp(x)($5)
.L7:
        ld      $3,0($3)
        jr      $31
        sw      $2,0($3)

.L6:    ld      $3,%got_disp(P1_r0)($5). 
        ld      $3,0($3)
        sync
        li      $4,1                        # 0x1
        b       .L7
        sw      $4,0($3)
```

The above optimisation already occurs if x is non-atomic (see
https://godbolt.org/z/8dhxvsE18)

The optimisation can also be applied for `-O2`
(https://godbolt.org/z/8aMj6xqTq)
as well.

I hope this helps.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug middle-end/110573] MIPS64: Enhancement PR of load of pointer to atomic
  2023-07-06 13:58 [Bug middle-end/110573] New: MIPS64: Enhancement PR of load of pointer to atomic luke.geeson at cs dot ucl.ac.uk
@ 2023-07-06 14:02 ` luke.geeson at cs dot ucl.ac.uk
  2023-07-06 17:30 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-07-06 14:02 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110573

--- Comment #1 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
My apologies - I should have put the ld on the line with L7:
```
.L7:    ld      $3,%got_disp(P1_r0)($5). 

```

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug middle-end/110573] MIPS64: Enhancement PR of load of pointer to atomic
  2023-07-06 13:58 [Bug middle-end/110573] New: MIPS64: Enhancement PR of load of pointer to atomic luke.geeson at cs dot ucl.ac.uk
  2023-07-06 14:02 ` [Bug middle-end/110573] " luke.geeson at cs dot ucl.ac.uk
@ 2023-07-06 17:30 ` pinskia at gcc dot gnu.org
  2023-07-06 17:31 ` pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-07-06 17:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110573

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
volatile (atomics) stores are not considered for branch delay slots.

https://inbox.sourceware.org/gcc/3077458.gu9Dx72xcF@arcturus.home/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug middle-end/110573] MIPS64: Enhancement PR of load of pointer to atomic
  2023-07-06 13:58 [Bug middle-end/110573] New: MIPS64: Enhancement PR of load of pointer to atomic luke.geeson at cs dot ucl.ac.uk
  2023-07-06 14:02 ` [Bug middle-end/110573] " luke.geeson at cs dot ucl.ac.uk
  2023-07-06 17:30 ` pinskia at gcc dot gnu.org
@ 2023-07-06 17:31 ` pinskia at gcc dot gnu.org
  2023-07-06 23:44 ` [Bug rtl-optimization/110573] branch delay slots are not filled with atomic stores luke.geeson at cs dot ucl.ac.uk
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-07-06 17:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110573

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
See
https://inbox.sourceware.org/gcc/d7787b3f-9450-5642-ffac-21cf36176073@redhat.com/
also.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug rtl-optimization/110573] branch delay slots are not filled with atomic stores
  2023-07-06 13:58 [Bug middle-end/110573] New: MIPS64: Enhancement PR of load of pointer to atomic luke.geeson at cs dot ucl.ac.uk
                   ` (2 preceding siblings ...)
  2023-07-06 17:31 ` pinskia at gcc dot gnu.org
@ 2023-07-06 23:44 ` luke.geeson at cs dot ucl.ac.uk
  2023-07-06 23:45 ` luke.geeson at cs dot ucl.ac.uk
  2023-07-06 23:49 ` pinskia at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-07-06 23:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110573

--- Comment #4 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
Ah so since atomics are treated as volatile (like LLVM) instructions that
access them cannot inhabit a delay slot. Is it still valid to treat atomics as
volatile?

Consider the following MIPS litmus test:
```
{ %x0=x; %y0=y; %y1=y; %x1=x; }
 P0           | P1           ;
 lw $2,0(%x0) | lw $2,0(%y1) ;
 ori $3,$0,1  | ori $3,$0,1  ;
 sw $3,0(%y0) | sw $3,0(%x1) ;

exists (0:$2=1 /\ 1:$2=1)
```
When run under the mips model we do not observe the outcome in the exists
clause:
```
0:$2=0; 1:$2=0;
0:$2=0; 1:$2=1;
0:$2=1; 1:$2=0;
```
That is, from an ordering perspective it is unlikely that unexpected behaviours
can occur - in this case putting sw in a delay slot should be ok (the same
doesn't hold for RISC-V/Arm models of course).

I understand treating atomics as volatile has historical precedent but a case
can be made, at least on modern architectures and with improved understanding
of models, that atomics are not volatile and more optimisations can be applied.
What do you think?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug rtl-optimization/110573] branch delay slots are not filled with atomic stores
  2023-07-06 13:58 [Bug middle-end/110573] New: MIPS64: Enhancement PR of load of pointer to atomic luke.geeson at cs dot ucl.ac.uk
                   ` (3 preceding siblings ...)
  2023-07-06 23:44 ` [Bug rtl-optimization/110573] branch delay slots are not filled with atomic stores luke.geeson at cs dot ucl.ac.uk
@ 2023-07-06 23:45 ` luke.geeson at cs dot ucl.ac.uk
  2023-07-06 23:49 ` pinskia at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-07-06 23:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110573

--- Comment #5 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
For the record the %registers are symbolic - simply replace them with concrete
ones containing the location x,y,etc...

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug rtl-optimization/110573] branch delay slots are not filled with atomic stores
  2023-07-06 13:58 [Bug middle-end/110573] New: MIPS64: Enhancement PR of load of pointer to atomic luke.geeson at cs dot ucl.ac.uk
                   ` (4 preceding siblings ...)
  2023-07-06 23:45 ` luke.geeson at cs dot ucl.ac.uk
@ 2023-07-06 23:49 ` pinskia at gcc dot gnu.org
  5 siblings, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-07-06 23:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110573

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Luke Geeson from comment #4)
> I understand treating atomics as volatile has historical precedent but a
> case can be made, at least on modern architectures and with improved
> understanding of models, that atomics are not volatile and more
> optimisations can be applied.
> What do you think?

Not really. The problem is you will need to add a new kind of memory access
type on the RTL level, this is not something which can be done without getting
things wrong and/or forgetting to update every place that might change
volatileness (including the scheduler which itself getting right is hard).
So treating them as volatile memory access on the RTL level is the easiest and
best form here.

Now on the gimple level, they are treated as a function call which itself is
another can of worms.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-07-06 23:49 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-06 13:58 [Bug middle-end/110573] New: MIPS64: Enhancement PR of load of pointer to atomic luke.geeson at cs dot ucl.ac.uk
2023-07-06 14:02 ` [Bug middle-end/110573] " luke.geeson at cs dot ucl.ac.uk
2023-07-06 17:30 ` pinskia at gcc dot gnu.org
2023-07-06 17:31 ` pinskia at gcc dot gnu.org
2023-07-06 23:44 ` [Bug rtl-optimization/110573] branch delay slots are not filled with atomic stores luke.geeson at cs dot ucl.ac.uk
2023-07-06 23:45 ` luke.geeson at cs dot ucl.ac.uk
2023-07-06 23:49 ` pinskia at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).