public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores
@ 2023-08-30 20:23 luke.geeson at cs dot ucl.ac.uk
  2023-08-30 20:35 ` [Bug target/111246] " luke.geeson at cs dot ucl.ac.uk
                   ` (17 more replies)
  0 siblings, 18 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-30 20:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

            Bug ID: 111246
           Summary: PPC64 Sequentially Consistent Load allows Reordering
                    of Stores
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: luke.geeson at cs dot ucl.ac.uk
  Target Milestone: ---

Consider the following litmus test that captures bad behaviour:

```
C test

{ *x = 0; *y = 0; } // fixed initial state where x and y are 0

// Concurrent threads
P0 (atomic_int* y,atomic_int* x) {
  atomic_store_explicit(x,2,memory_order_relaxed);
  atomic_thread_fence(memory_order_consume);
  atomic_store_explicit(y,1,memory_order_seq_cst);
}

P1 (atomic_int* y,atomic_int* x) {
  int r0 = atomic_load_explicit(y,memory_order_seq_cst);
  atomic_thread_fence(memory_order_relaxed);
  atomic_store_explicit(x,1,memory_order_relaxed);
}

exists ([x]=2 /\ 1:r0=1)

```
where 'P0:r0 = 0' means thread P0, local variable r0 has value 0. [x] means x
is a global.

When simulating this test under the C/C++ model from its initial state, the
outcome of execution in the exists clause is forbidden by the source model. The
allowed outcomes are:
```
{ P1:r0=0; [x]=1; }
{ P1:r0=0; [x]=2; }
{ P1:r0=1; [x]=1; }
```
When compiling to target PPC64le with GCC trunk `-c -g -O1 -pthread --std=c11
-ffreestanding -mabi=elfv1` (https://godbolt.org/z/xj9W7nr9G), the compiled
program has the following outcomes when simulated under the PPC model:
```
{ P1:r0=0; [x]=1; }                           
{ P1:r0=0; [x]=2; }                                                             
{ P1:r0=1; [x]=1; }    
{ P1:r0=1; [x]=2; } <--- forbidden by source model, bug!
```
which occurs since the `b L0x50; isync` pattern allows the memory effect of the
load of y on thread P1 to be reordered after the store to x on P1. We observe
the execution:

{ P1:r0=0;[x]=0} -> P1:W[x]=1 -> P0:W[x]=2 -> P0:W[y]=1 -> P1:R[y]=0 -> {
P1:r0=1; [x]=2; } 

where the local variable `r0` is stored in register r9 in the compiled code.

The problem is isync forces the instructions to finish, but not their memory
effects. Hence the forbidden outcome { P1:r0=1; [x]=2; } is allowed under the
ppc model.

```
PPC test

{ [P1_r0]=0;[x]=0;[y]=0;
  uint64_t %P0_x=x; uint64_t %P0_y=y;
  uint64_t %P1_P1_r0=P1_r0;uint64_t %P1_x=x;
  uint64_t %P1_y=y }
(*****************************************************************)
(*                 the Telechat toolsuite                        *)
(*                                                               *)
(* Luke Geeson, University College London, UK.                   *)
(*                                                               *)
(*****************************************************************)
  P0                 |  P1                    ;
   li r10,2          |   sync                 ;
   stw r10,0(%P0_x)  |   lwz r9,0(%P1_y)      ;
   lwsync            |   cmpw r9,r9           ;
   sync              |   b   L0x50            ;
   li r10,1          |  L0x50: isync          ;
   stw r10,0(%P0_y)  |   li r8,1              ;
                     |   stw r8,0(%P1_x)      ;
                     |   stw r9,0(%P1_P1_r0)  ;

exists ([x]=2 /\ P1_r0=1)  // YES under PPC model

```
We have observed this behaviour with several hundred Message Passing and 'S'
pattern tests (the above is an S pattern test).

To fix this, we advise replacing the isync with lwfence to forbid the buggy
behaviour:

```
PPC test-fixed

{ [P1_r0]=0;[x]=0;[y]=0;
  uint64_t %P0_x=x; uint64_t %P0_y=y;
  uint64_t %P1_P1_r0=P1_r0;uint64_t %P1_x=x;
  uint64_t %P1_y=y }
(*****************************************************************)
(*                 the Telechat toolsuite                        *)
(*                                                               *)
(* Luke Geeson, University College London, UK.                   *)
(*                                                               *)
(*****************************************************************)
  P0                 |  P1                    ;
   li r10,2          |   sync                 ;
   stw r10,0(%P0_x)  |   lwz r9,0(%P1_y)      ;
   lwsync            |   cmpw r9,r9           ;
   sync              |   b   L0x50            ;
   li r10,1          |  L0x50: lwfence        ;
   stw r10,0(%P0_y)  |   li r8,1              ;
                     |   stw r8,0(%P1_x)      ;
                     |   stw r9,0(%P1_P1_r0)  ;


exists ([x]=2 /\ P1_r0=1)  // NO under PPC model

```
which no longer allows the buggy outcome under the ppc model:
```
{ P1:r0=0; [x]=1; }                           
{ P1:r0=0; [x]=2; }                                                             
{ P1:r0=1; [x]=1; } 
```

I'm happy to discuss this as needed.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
@ 2023-08-30 20:35 ` luke.geeson at cs dot ucl.ac.uk
  2023-08-30 20:38 ` pinskia at gcc dot gnu.org
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-30 20:35 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #1 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
Edit: The godbolt link points to the same bug in LLVM I am reporting, here is
the GCC godbolt link https://godbolt.org/z/Tan5jrvKa

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
  2023-08-30 20:35 ` [Bug target/111246] " luke.geeson at cs dot ucl.ac.uk
@ 2023-08-30 20:38 ` pinskia at gcc dot gnu.org
  2023-08-30 20:47 ` luke.geeson at cs dot ucl.ac.uk
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-30 20:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |powerpc64le

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
You are sure this is correct.
x is a relaxed order so the store could be reorded before all other atomics
even.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
  2023-08-30 20:35 ` [Bug target/111246] " luke.geeson at cs dot ucl.ac.uk
  2023-08-30 20:38 ` pinskia at gcc dot gnu.org
@ 2023-08-30 20:47 ` luke.geeson at cs dot ucl.ac.uk
  2023-08-30 20:49 ` luke.geeson at cs dot ucl.ac.uk
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-30 20:47 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #3 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
According to the latest C/C++ and PPC models, yes. 

If x was non-atomic, then this would be a racy (UB) test, but making x atomic
with relaxed order is well-defined according to the C model (and therefore any
racy behaviour should not occur).

In any case, a sequentially consistent load should not allow re-ordering. We
reported (and are patching) a similar bug in LLVM for AArch64
(https://github.com/llvm/llvm-project/issues/62652)

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (2 preceding siblings ...)
  2023-08-30 20:47 ` luke.geeson at cs dot ucl.ac.uk
@ 2023-08-30 20:49 ` luke.geeson at cs dot ucl.ac.uk
  2023-08-31 13:41 ` dje at gcc dot gnu.org
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-30 20:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #4 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
Add Wilco also addressed this for AArch64 in GCC:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108891

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (3 preceding siblings ...)
  2023-08-30 20:49 ` luke.geeson at cs dot ucl.ac.uk
@ 2023-08-31 13:41 ` dje at gcc dot gnu.org
  2023-08-31 13:49 ` luke.geeson at cs dot ucl.ac.uk
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: dje at gcc dot gnu.org @ 2023-08-31 13:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

David Edelsohn <dje at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-08-31

--- Comment #5 from David Edelsohn <dje at gcc dot gnu.org> ---
There is no lwfence instruction in PowerPC.  We will need to examine what code
generation change is necessary.  Did you intend to suggest lwsync?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (4 preceding siblings ...)
  2023-08-31 13:41 ` dje at gcc dot gnu.org
@ 2023-08-31 13:49 ` luke.geeson at cs dot ucl.ac.uk
  2023-08-31 16:23 ` dje at gcc dot gnu.org
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-31 13:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #6 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
Apologies - I've been thinking in syncs and fences too much!

Yes I mean `lwsync`, for clarity I repeat the above so you know it is the
correct fix:
```
lukegeeson@machine:~/Dev/tv-dev/herdtools7$  cat test.litmus
PPC test-fixed

{ [P1_r0]=0;[x]=0;[y]=0;
  uint64_t %P0_x=x; uint64_t %P0_y=y;
  uint64_t %P1_P1_r0=P1_r0;uint64_t %P1_x=x;
  uint64_t %P1_y=y }
(*****************************************************************)
(*                 the Telechat toolsuite                        *)
(*                                                               *)
(* Luke Geeson, University College London, UK.                   *)
(*                                                               *)
(*****************************************************************)
  P0                 |  P1                    ;
   li r10,2          |   sync                 ;
   stw r10,0(%P0_x)  |   lwz r9,0(%P1_y)      ;
   lwsync            |   cmpw r9,r9           ;
   sync              |   b   L0x50            ;
   li r10,1          |  L0x50: lwsync         ;
   stw r10,0(%P0_y)  |   li r8,1              ;
                     |   stw r8,0(%P1_x)      ;
                     |   stw r9,0(%P1_P1_r0)  ;


exists ([x]=2 /\ P1_r0=1)
lukegeeson@machine:~/Dev/tv-dev/herdtools7$  ./_build/install/default/bin/herd7
-model ppc.cat -I herd/libdir test.litmus
Test test-fixed Allowed
States 3
[P1_r0]=0; [x]=1;
[P1_r0]=0; [x]=2;
[P1_r0]=1; [x]=1;
No
Witnesses
Positive: 0 Negative: 3
Condition exists ([x]=2 /\ [P1_r0]=1)
Observation test-fixed Never 0 3
Time test-fixed 0.01
Hash=b215018fe694934d3b5fd1dc5eef48e9
```

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (5 preceding siblings ...)
  2023-08-31 13:49 ` luke.geeson at cs dot ucl.ac.uk
@ 2023-08-31 16:23 ` dje at gcc dot gnu.org
  2023-08-31 18:04 ` luke.geeson at cs dot ucl.ac.uk
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: dje at gcc dot gnu.org @ 2023-08-31 16:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #7 from David Edelsohn <dje at gcc dot gnu.org> ---
Have you reached out to Paul McKenney (now at Meta) who suggested the
instruction sequences to implement the C/C++ memory for PowerPC?
https://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2745.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (6 preceding siblings ...)
  2023-08-31 16:23 ` dje at gcc dot gnu.org
@ 2023-08-31 18:04 ` luke.geeson at cs dot ucl.ac.uk
  2023-08-31 18:06 ` luke.geeson at cs dot ucl.ac.uk
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-31 18:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #8 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
I have not, but I will contact him and link this discussion. In the meantime, I
read that page and provide some more testing. Consider the passage:
```
bc;isync: this is a very low-overhead and very weak form of memory fence. A
specific set of preceding loads on which the bc (branch conditional)
instruction depends are guaranteed to have completed before any subsequent
instruction begins execution
```
Note how the set of loads depend on a conditional load - in the test above GCC
emits an unconditional branch - and so this passage does not apply. We can see
this if we compare the output of the test with and without a conditional
branch:
```
$ cat test.litmus
PPC test

{ [P1_r0]=0;[x]=0;[y]=0;
  uint64_t %P0_x=x; uint64_t %P0_y=y;
  uint64_t %P1_P1_r0=P1_r0;uint64_t %P1_x=x;
  uint64_t %P1_y=y }
(*****************************************************************)
(*                 the Telechat toolsuite                        *)
(*                                                               *)
(* Luke Geeson, University College London, UK.                   *)
(*                                                               *)
(*****************************************************************)
  P0                 |  P1                    ;
   li r10,2          |   sync                 ;
   stw r10,0(%P0_x)  |   lwz r9,0(%P1_y)      ;
   lwsync            |   cmpw r9,r9           ;
   sync              |   b   L0x50            ;
   li r10,1          |  L0x50: isync          ;
   stw r10,0(%P0_y)  |   li r8,1              ;
                     |   stw r8,0(%P1_x)      ;
                     |   stw r9,0(%P1_P1_r0)  ;


exists ([x]=2 /\ P1_r0=1)

$ herd7 -model ppc.cat -I herd/libdir test.litmus
Test test Allowed
States 4
[P1_r0]=0; [x]=1;
[P1_r0]=0; [x]=2;
[P1_r0]=1; [x]=1;
[P1_r0]=1; [x]=2;
Ok
Witnesses
Positive: 1 Negative: 3
Condition exists ([x]=2 /\ [P1_r0]=1)
Observation test Sometimes 1 3
Time test 0.00
Hash=490e848f65aae63641db024a0fc82aa2
```
Then we make the branch a conditional branch:
```
PPC test

{ [P1_r0]=0;[x]=0;[y]=0;
  uint64_t %P0_x=x; uint64_t %P0_y=y;
  uint64_t %P1_P1_r0=P1_r0;uint64_t %P1_x=x;
  uint64_t %P1_y=y }
(*****************************************************************)
(*                 the Telechat toolsuite                        *)
(*                                                               *)
(* Luke Geeson, University College London, UK.                   *)
(*                                                               *)
(*****************************************************************)
  P0                 |  P1                    ;
   li r10,2          |   sync                 ;
   stw r10,0(%P0_x)  |   lwz r9,0(%P1_y)      ;
   lwsync            |   cmpw r9,r9           ;
   sync              |   beq   L0x50          ;
   li r10,1          |  L0x50: isync          ;
   stw r10,0(%P0_y)  |   li r8,1              ;
                     |   stw r8,0(%P1_x)      ;
                     |   stw r9,0(%P1_P1_r0)  ;


exists ([x]=2 /\ P1_r0=1)%
lukegeeson@LA:~/Dev/tv-dev/herdtools7|825f7fd5 ⇒ 
./_build/install/default/bin/herd7 -model ppc.cat -I herd/libdir test.litmus
Test test Allowed
States 3
[P1_r0]=0; [x]=1;
[P1_r0]=0; [x]=2;
[P1_r0]=1; [x]=1;
No
Witnesses
Positive: 0 Negative: 3
Condition exists ([x]=2 /\ [P1_r0]=1)
Observation test Never 0 3
Time test 0.00
Hash=52d2f37ed1564ba98544ff790020b69a
```

As you can see the buggy behaviour disappears when b is changed to bc. So there
are two solutions:

- Make isync a lwsync (better maps to the intent of sequential consistency,
perf costs)
- Make b a conditional branch bee (restores the intent of Paul's work, but adds
a control dependency where there was none in the source program - I am not a
fan of a mismatch of such dependencies).

I hope this helps.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (7 preceding siblings ...)
  2023-08-31 18:04 ` luke.geeson at cs dot ucl.ac.uk
@ 2023-08-31 18:06 ` luke.geeson at cs dot ucl.ac.uk
  2023-08-31 18:25 ` dje at gcc dot gnu.org
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-31 18:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #9 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
apologies typos:
bc -> bee
bee -> beq

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (8 preceding siblings ...)
  2023-08-31 18:06 ` luke.geeson at cs dot ucl.ac.uk
@ 2023-08-31 18:25 ` dje at gcc dot gnu.org
  2023-08-31 18:54 ` luke.geeson at cs dot ucl.ac.uk
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: dje at gcc dot gnu.org @ 2023-08-31 18:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #10 from David Edelsohn <dje at gcc dot gnu.org> ---
If I compile your testcase with either GCC 11.3 or GCC trunk, GCC produces

P1:
.LFB1:
        .cfi_startproc
        .localentry     P1,1
        pld 9,.LANCHOR0+8@pcrel
        sync
        lwz 9,0(9)
        cmpw 0,9,9                <*** compare
        bne- 0,$+4                <*** branch conditional
        isync                     <*** isync
        pld 10,.LANCHOR0@pcrel
        li 8,1
        stw 8,0(10)
        pld 10,.LANCHOR0+16@pcrel
        stw 9,0(10)
        blr

which contains the branch conditional that should satisfy the processor memory
model.  I'm not seeing the unconditional branch that you see in your example.

Your link to Compiler Explorer code generated by Clang also shows a branch
conditional

P1:                                     # @P1
        .quad   .Lfunc_begin1
        .quad   .TOC.@tocbase
        .quad   0
.Lfunc_begin1:
        addis 3, 2, y@toc@ha
        ld 3, y@toc@l(3)
        sync
        li 5, 1
        lwz 3, 0(3)
        cmpd 7, 3, 3          <*** compare
        bne- 7, .+4           <*** branch conditional
        isync                 <*** isync

What is generating or parsing or interpreting the unconditional branch shown in
your pasted code?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (9 preceding siblings ...)
  2023-08-31 18:25 ` dje at gcc dot gnu.org
@ 2023-08-31 18:54 ` luke.geeson at cs dot ucl.ac.uk
  2023-08-31 18:58 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-31 18:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #11 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
I'm using

`powerpc64le-linux-gnu-objdump -Dr --disassemble --section=.text
--no-show-raw-insn`. 

installed using `binutils-powerpc64le-linux-gnu` under Ubuntu 20:04 (in a
docker container running on an Arm AArch64 machine). I realise I should be
testing using trunk binutils, but I provide the objdump info regardless:
```
$ powerpc64le-linux-gnu-objdump --version
GNU objdump (GNU Binutils for Ubuntu) 2.34
Copyright (C) 2020 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) any later version.
This program has absolutely no warranty.
```
Do you have a trunk build of binutils to hand to test? Godbolt doesn't support
objdump. I will try to install binutils from source to reproduce.

I have a custom tool that parses that and turns the disassembly into the below
(The custom tool is correct as far as I can see.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (10 preceding siblings ...)
  2023-08-31 18:54 ` luke.geeson at cs dot ucl.ac.uk
@ 2023-08-31 18:58 ` pinskia at gcc dot gnu.org
  2023-08-31 19:12 ` luke.geeson at cs dot ucl.ac.uk
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-31 18:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #12 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Luke Geeson from comment #11)
> Do you have a trunk build of binutils to hand to test? Godbolt doesn't
> support objdump. I will try to install binutils from source to reproduce.

Yes it does.
Use the option "Compile to binary object" under the output option.

This is what I see with that option:
 lwz     r9,0(r9)
 cmpw    r9,r9
 bne     44 <P1+0x18>
 isync
 lis     r10,0

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (11 preceding siblings ...)
  2023-08-31 18:58 ` pinskia at gcc dot gnu.org
@ 2023-08-31 19:12 ` luke.geeson at cs dot ucl.ac.uk
  2023-08-31 20:00 ` dje at gcc dot gnu.org
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-31 19:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #13 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
Please bear with me whilst I figure out what has happened here.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (12 preceding siblings ...)
  2023-08-31 19:12 ` luke.geeson at cs dot ucl.ac.uk
@ 2023-08-31 20:00 ` dje at gcc dot gnu.org
  2023-08-31 20:49 ` luke.geeson at cs dot ucl.ac.uk
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: dje at gcc dot gnu.org @ 2023-08-31 20:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #14 from David Edelsohn <dje at gcc dot gnu.org> ---
The conditional branch always will proceed to the next instruction, so the code
that you showed in the PR is a correct "optimization" of the original code, but
the processor does execute the conditional branch in the original code.  Is the
simulator performing an optimization?  Something seems to have transformed the
code.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (13 preceding siblings ...)
  2023-08-31 20:00 ` dje at gcc dot gnu.org
@ 2023-08-31 20:49 ` luke.geeson at cs dot ucl.ac.uk
  2023-08-31 20:50 ` luke.geeson at cs dot ucl.ac.uk
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-31 20:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #15 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
I am sorry to you all - I have wasted your time. It was a bug in the
translation tool. 

In the future I will make absolutely sure that I check everything before
submitting a bug - this is an embarrassing mistake.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (14 preceding siblings ...)
  2023-08-31 20:49 ` luke.geeson at cs dot ucl.ac.uk
@ 2023-08-31 20:50 ` luke.geeson at cs dot ucl.ac.uk
  2023-08-31 20:52 ` dje at gcc dot gnu.org
  2023-08-31 20:55 ` luke.geeson at cs dot ucl.ac.uk
  17 siblings, 0 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-31 20:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

Luke Geeson <luke.geeson at cs dot ucl.ac.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|NEW                         |RESOLVED

--- Comment #16 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
closing as invalid

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (15 preceding siblings ...)
  2023-08-31 20:50 ` luke.geeson at cs dot ucl.ac.uk
@ 2023-08-31 20:52 ` dje at gcc dot gnu.org
  2023-08-31 20:55 ` luke.geeson at cs dot ucl.ac.uk
  17 siblings, 0 replies; 19+ messages in thread
From: dje at gcc dot gnu.org @ 2023-08-31 20:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #17 from David Edelsohn <dje at gcc dot gnu.org> ---
I'm glad that we have confirmed that the GCC and LLVM code generation for
PowerPC are correct for the memory model. And now your translation tool is more
accurate.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Bug target/111246] PPC64 Sequentially Consistent Load allows Reordering of Stores
  2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
                   ` (16 preceding siblings ...)
  2023-08-31 20:52 ` dje at gcc dot gnu.org
@ 2023-08-31 20:55 ` luke.geeson at cs dot ucl.ac.uk
  17 siblings, 0 replies; 19+ messages in thread
From: luke.geeson at cs dot ucl.ac.uk @ 2023-08-31 20:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111246

--- Comment #18 from Luke Geeson <luke.geeson at cs dot ucl.ac.uk> ---
Indeed that is a plus - I have 167k concurrency tests that I feed through
LLVM/GCC -O1/2/3/fast and for GCC -Og that test compilation from the C/C++ to
PPC memory models. If you are interested I can provide them publicly (although
I understand given the above that you may like to wait until I have ensured
there are no more issues).

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2023-08-31 20:55 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-30 20:23 [Bug target/111246] New: PPC64 Sequentially Consistent Load allows Reordering of Stores luke.geeson at cs dot ucl.ac.uk
2023-08-30 20:35 ` [Bug target/111246] " luke.geeson at cs dot ucl.ac.uk
2023-08-30 20:38 ` pinskia at gcc dot gnu.org
2023-08-30 20:47 ` luke.geeson at cs dot ucl.ac.uk
2023-08-30 20:49 ` luke.geeson at cs dot ucl.ac.uk
2023-08-31 13:41 ` dje at gcc dot gnu.org
2023-08-31 13:49 ` luke.geeson at cs dot ucl.ac.uk
2023-08-31 16:23 ` dje at gcc dot gnu.org
2023-08-31 18:04 ` luke.geeson at cs dot ucl.ac.uk
2023-08-31 18:06 ` luke.geeson at cs dot ucl.ac.uk
2023-08-31 18:25 ` dje at gcc dot gnu.org
2023-08-31 18:54 ` luke.geeson at cs dot ucl.ac.uk
2023-08-31 18:58 ` pinskia at gcc dot gnu.org
2023-08-31 19:12 ` luke.geeson at cs dot ucl.ac.uk
2023-08-31 20:00 ` dje at gcc dot gnu.org
2023-08-31 20:49 ` luke.geeson at cs dot ucl.ac.uk
2023-08-31 20:50 ` luke.geeson at cs dot ucl.ac.uk
2023-08-31 20:52 ` dje at gcc dot gnu.org
2023-08-31 20:55 ` luke.geeson at cs dot ucl.ac.uk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).