[PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64
@ 2024-02-28 12:23 Kito Cheng
  2024-02-28 14:57 ` Jeff Law
  0 siblings, 1 reply; 6+ messages in thread
From: Kito Cheng @ 2024-02-28 12:23 UTC (permalink / raw)
  To: gcc-patches, kito.cheng, palmer, pinskia, patrick, jeffreyalaw; +Cc: Kito Cheng

atomic_compare_and_swapsi will use lr.w and sc.w to do the atomic operation on
RV64, however lr.w is doing sign extend to DI and compare instruction only have
DI mode on RV64, so the expected value should be sign extend before compare as
well, so that we can get right compare result.

gcc/ChangeLog:

	PR target/114130
	* config/riscv/sync.md (atomic_compare_and_swap<mode>): Sign
	extend the expected value if needed.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/pr114130.c: New.
---
 gcc/config/riscv/sync.md                  |  9 +++++++++
 gcc/testsuite/gcc.target/riscv/pr114130.c | 12 ++++++++++++
 2 files changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr114130.c

diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
index 54bb0a66518..6f0b5aae08d 100644
--- a/gcc/config/riscv/sync.md
+++ b/gcc/config/riscv/sync.md
@@ -353,6 +353,15 @@
    (match_operand:SI 7 "const_int_operand" "")] ;; mod_f
   "TARGET_ATOMIC"
 {
+  if (word_mode != <MODE>mode && operands[3] != const0_rtx)
+    {
+      /* We don't have SI mode compare on RV64, so we need to make sure expected
+	 value is sign-extended.  */
+      rtx tmp0 = gen_reg_rtx (word_mode);
+      emit_insn (gen_extend_insn (tmp0, operands[3], word_mode, <MODE>mode, 0));
+      operands[3] = simplify_gen_subreg (<MODE>mode, tmp0, word_mode, 0);
+    }
+
   emit_insn (gen_atomic_cas_value_strong<mode> (operands[1], operands[2],
 						operands[3], operands[4],
 						operands[6], operands[7]));
diff --git a/gcc/testsuite/gcc.target/riscv/pr114130.c b/gcc/testsuite/gcc.target/riscv/pr114130.c
new file mode 100644
index 00000000000..647e27dab32
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr114130.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64" } */
+#include <stdint-gcc.h>
+
+void foo(uint32_t *p) {
+    uintptr_t x = *(uintptr_t *)p;
+    uint32_t e = !p ? 0 : (uintptr_t)p >> 1;
+    uint32_t d = (uintptr_t)x;
+    __atomic_compare_exchange(p, &e, &d, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
+}
+
+/* { dg-final { scan-assembler-bound {sext.w\t} >= 1 } } */
-- 
2.34.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64
  2024-02-28 12:23 [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64 Kito Cheng
@ 2024-02-28 14:57 ` Jeff Law
  2024-02-28 15:02   ` Palmer Dabbelt
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff Law @ 2024-02-28 14:57 UTC (permalink / raw)
  To: Kito Cheng, gcc-patches, kito.cheng, palmer, pinskia, patrick



On 2/28/24 05:23, Kito Cheng wrote:
> atomic_compare_and_swapsi will use lr.w and sc.w to do the atomic operation on
> RV64, however lr.w is doing sign extend to DI and compare instruction only have
> DI mode on RV64, so the expected value should be sign extend before compare as
> well, so that we can get right compare result.
> 
> gcc/ChangeLog:
> 
> 	PR target/114130
> 	* config/riscv/sync.md (atomic_compare_and_swap<mode>): Sign
> 	extend the expected value if needed.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/pr114130.c: New.
Nearly rejected this as I think the description was a bit ambiguous and 
I thought you were extending the result of the lr.w.  But it's actually 
the other value you're ensuring gets properly extended.

OK.

Jeff


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64
  2024-02-28 14:57 ` Jeff Law
@ 2024-02-28 15:02   ` Palmer Dabbelt
  2024-02-28 17:36     ` Patrick O'Neill
  0 siblings, 1 reply; 6+ messages in thread
From: Palmer Dabbelt @ 2024-02-28 15:02 UTC (permalink / raw)
  To: jeffreyalaw
  Cc: kito.cheng, gcc-patches, Kito Cheng, pinskia, Patrick O'Neill

On Wed, 28 Feb 2024 06:57:53 PST (-0800), jeffreyalaw@gmail.com wrote:
>
>
> On 2/28/24 05:23, Kito Cheng wrote:
>> atomic_compare_and_swapsi will use lr.w and sc.w to do the atomic operation on
>> RV64, however lr.w is doing sign extend to DI and compare instruction only have
>> DI mode on RV64, so the expected value should be sign extend before compare as
>> well, so that we can get right compare result.
>>
>> gcc/ChangeLog:
>>
>> 	PR target/114130
>> 	* config/riscv/sync.md (atomic_compare_and_swap<mode>): Sign
>> 	extend the expected value if needed.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	* gcc.target/riscv/pr114130.c: New.
> Nearly rejected this as I think the description was a bit ambiguous and
> I thought you were extending the result of the lr.w.  But it's actually
> the other value you're ensuring gets properly extended.

I had the same response, but after reading it I'm not quite sure how to 
say it better.

> OK.

I was looking at the code to try and ask if we have the same bug for the 
short inline CAS routines, but I've got to run to some meetings...

>
> Jeff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64
  2024-02-28 15:02   ` Palmer Dabbelt
@ 2024-02-28 17:36     ` Patrick O'Neill
  2024-02-28 20:27       ` Palmer Dabbelt
  0 siblings, 1 reply; 6+ messages in thread
From: Patrick O'Neill @ 2024-02-28 17:36 UTC (permalink / raw)
  To: Palmer Dabbelt; +Cc: kito.cheng, gcc-patches, Kito Cheng, pinskia, jeffreyalaw


On 2/28/24 07:02, Palmer Dabbelt wrote:
> On Wed, 28 Feb 2024 06:57:53 PST (-0800), jeffreyalaw@gmail.com wrote:
>>
>>
>> On 2/28/24 05:23, Kito Cheng wrote:
>>> atomic_compare_and_swapsi will use lr.w and sc.w to do the atomic 
>>> operation on
>>> RV64, however lr.w is doing sign extend to DI and compare 
>>> instruction only have
>>> DI mode on RV64, so the expected value should be sign extend before 
>>> compare as
>>> well, so that we can get right compare result.
>>>
>>> gcc/ChangeLog:
>>>
>>>     PR target/114130
>>>     * config/riscv/sync.md (atomic_compare_and_swap<mode>): Sign
>>>     extend the expected value if needed.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>     * gcc.target/riscv/pr114130.c: New.
>> Nearly rejected this as I think the description was a bit ambiguous and
>> I thought you were extending the result of the lr.w.  But it's actually
>> the other value you're ensuring gets properly extended.
>
> I had the same response, but after reading it I'm not quite sure how 
> to say it better.
>
>> OK.
>
> I was looking at the code to try and ask if we have the same bug for 
> the short inline CAS routines, but I've got to run to some meetings...

I don't think subword AMO CAS is impacted.

As part of the CAS we mask both the expected value [2] and the retrieved 
value[1] before comparing.

- Patrick

[1]: 
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l495
[2]: 
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l459

>
>>
>> Jeff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64
  2024-02-28 17:36     ` Patrick O'Neill
@ 2024-02-28 20:27       ` Palmer Dabbelt
  2024-02-29  3:07         ` Kito Cheng
  0 siblings, 1 reply; 6+ messages in thread
From: Palmer Dabbelt @ 2024-02-28 20:27 UTC (permalink / raw)
  To: Patrick O'Neill
  Cc: kito.cheng, gcc-patches, Kito Cheng, pinskia, jeffreyalaw

On Wed, 28 Feb 2024 09:36:38 PST (-0800), Patrick O'Neill wrote:
>
> On 2/28/24 07:02, Palmer Dabbelt wrote:
>> On Wed, 28 Feb 2024 06:57:53 PST (-0800), jeffreyalaw@gmail.com wrote:
>>>
>>>
>>> On 2/28/24 05:23, Kito Cheng wrote:
>>>> atomic_compare_and_swapsi will use lr.w and sc.w to do the atomic
>>>> operation on
>>>> RV64, however lr.w is doing sign extend to DI and compare
>>>> instruction only have
>>>> DI mode on RV64, so the expected value should be sign extend before
>>>> compare as
>>>> well, so that we can get right compare result.
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>     PR target/114130
>>>>     * config/riscv/sync.md (atomic_compare_and_swap<mode>): Sign
>>>>     extend the expected value if needed.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>     * gcc.target/riscv/pr114130.c: New.
>>> Nearly rejected this as I think the description was a bit ambiguous and
>>> I thought you were extending the result of the lr.w.  But it's actually
>>> the other value you're ensuring gets properly extended.
>>
>> I had the same response, but after reading it I'm not quite sure how
>> to say it better.

Maybe something like

    atomic_compare_and_swapsi will use lr.w to do obtain the original value, 
    which sign extends to DI.  RV64 only has DI comparisons, so we also need 
    to sign extend the expected value to DI as otherwise the comparison will 
    fail when the expected value has the 32nd bit set.

would do it?  Either way

Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>

as I've managed to convince myself it's correct.  We should probably 
backport this one, the bug has likely been around for a while.

>>
>>> OK.
>>
>> I was looking at the code to try and ask if we have the same bug for
>> the short inline CAS routines, but I've got to run to some meetings...
>
> I don't think subword AMO CAS is impacted.
>
> As part of the CAS we mask both the expected value [2] and the retrieved
> value[1] before comparing.

I'm always a bit lost when it comes to bit arithmetic, but I think it's 
OK.  It smells like it's being a little loose with the 
extensions/comparisons, but just looking at some generated code for this 
simple case:

    void foo(uint16_t *p, uint16_t *e, uint16_t *d) {
        __atomic_compare_exchange(p, e, d, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
    }

    foo:
            lhu     a3,0(a2)
            lhu     a2,0(a1)
            andi    a4,a0,3
            li      a5,65536
            slliw   a4,a4,3
            addiw   a5,a5,-1
            sllw    a5,a5,a4
            sllw    a3,a3,a4
            sllw    a7,a2,a4
            andi    a0,a0,-4
            and     a3,a3,a5
            not     t1,a5
            and     a7,a7,a5
            1:
            lr.w    a6, 0(a0)
            and     t3, a6, a5    // Both a6 (from the lr.w) and a5 
				  // (from the sllw) are sign extended, 
				  // so the result in t3 is sign extended.
            bne     t3, a7, 1f    // a7 is also sign extended (before 
				  // and after the masking above), so 
				  // it's safe for comparison
            and     t3, a6, t1
            or      t3, t3, a3
            sc.w    t3, t3, 0(a0) // The top bits of t3 end up polluted 
	                          // with sign extension, but it doesn't 
				  // matter because of the sc.w.
            bnez    t3, 1b
            1:
            sraw    a6,a6,a4
            slliw   a2,a2,16
            slliw   a5,a6,16
            sraiw   a2,a2,16
            sraiw   a5,a5,16
            subw    a5,a5,a2
            beq     a5,zero,.L1
            sh      a6,0(a1)
    .L1:
            ret

So I think we're OK -- that masking of a7 looks redundant here, but I 
don't think we could get away with just

    diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
    index 54bb0a66518..15956940032 100644
    --- a/gcc/config/riscv/sync.md
    +++ b/gcc/config/riscv/sync.md
    @@ -456,7 +456,6 @@ (define_expand "atomic_cas_value_strong<mode>"
       riscv_lshift_subword (<MODE>mode, o, shift, &shifted_o);
       riscv_lshift_subword (<MODE>mode, n, shift, &shifted_n);
    
    -  emit_move_insn (shifted_o, gen_rtx_AND (SImode, shifted_o, mask));
       emit_move_insn (shifted_n, gen_rtx_AND (SImode, shifted_n, mask));
    
       enum memmodel model_success = (enum memmodel) INTVAL (operands[4]);

because we'd need the masking for when we don't know the high bits are 
safe pre-shift.  So maybe some sort of simplify call could help out 
there, but I bet it's not really worth bothering -- the bookeeping 
doesn't generally matter that much around AMOs.

> - Patrick
>
> [1]:
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l495
> [2]:
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l459
>
>>
>>>
>>> Jeff

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64
  2024-02-28 20:27       ` Palmer Dabbelt
@ 2024-02-29  3:07         ` Kito Cheng
  0 siblings, 0 replies; 6+ messages in thread
From: Kito Cheng @ 2024-02-29  3:07 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: Patrick O'Neill, kito.cheng, gcc-patches, pinskia, jeffreyalaw

Committed with Palmer's suggestions for the commit message, also I
plan to back port that to 11, 12 and 13 release branches :)

On Thu, Feb 29, 2024 at 4:27 AM Palmer Dabbelt <palmer@dabbelt.com> wrote:
>
> On Wed, 28 Feb 2024 09:36:38 PST (-0800), Patrick O'Neill wrote:
> >
> > On 2/28/24 07:02, Palmer Dabbelt wrote:
> >> On Wed, 28 Feb 2024 06:57:53 PST (-0800), jeffreyalaw@gmail.com wrote:
> >>>
> >>>
> >>> On 2/28/24 05:23, Kito Cheng wrote:
> >>>> atomic_compare_and_swapsi will use lr.w and sc.w to do the atomic
> >>>> operation on
> >>>> RV64, however lr.w is doing sign extend to DI and compare
> >>>> instruction only have
> >>>> DI mode on RV64, so the expected value should be sign extend before
> >>>> compare as
> >>>> well, so that we can get right compare result.
> >>>>
> >>>> gcc/ChangeLog:
> >>>>
> >>>>     PR target/114130
> >>>>     * config/riscv/sync.md (atomic_compare_and_swap<mode>): Sign
> >>>>     extend the expected value if needed.
> >>>>
> >>>> gcc/testsuite/ChangeLog:
> >>>>
> >>>>     * gcc.target/riscv/pr114130.c: New.
> >>> Nearly rejected this as I think the description was a bit ambiguous and
> >>> I thought you were extending the result of the lr.w.  But it's actually
> >>> the other value you're ensuring gets properly extended.
> >>
> >> I had the same response, but after reading it I'm not quite sure how
> >> to say it better.
>
> Maybe something like
>
>     atomic_compare_and_swapsi will use lr.w to do obtain the original value,
>     which sign extends to DI.  RV64 only has DI comparisons, so we also need
>     to sign extend the expected value to DI as otherwise the comparison will
>     fail when the expected value has the 32nd bit set.
>
> would do it?  Either way
>
> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
>
> as I've managed to convince myself it's correct.  We should probably
> backport this one, the bug has likely been around for a while.
>
> >>
> >>> OK.
> >>
> >> I was looking at the code to try and ask if we have the same bug for
> >> the short inline CAS routines, but I've got to run to some meetings...
> >
> > I don't think subword AMO CAS is impacted.
> >
> > As part of the CAS we mask both the expected value [2] and the retrieved
> > value[1] before comparing.
>
> I'm always a bit lost when it comes to bit arithmetic, but I think it's
> OK.  It smells like it's being a little loose with the
> extensions/comparisons, but just looking at some generated code for this
> simple case:
>
>     void foo(uint16_t *p, uint16_t *e, uint16_t *d) {
>         __atomic_compare_exchange(p, e, d, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
>     }
>
>     foo:
>             lhu     a3,0(a2)
>             lhu     a2,0(a1)
>             andi    a4,a0,3
>             li      a5,65536
>             slliw   a4,a4,3
>             addiw   a5,a5,-1
>             sllw    a5,a5,a4
>             sllw    a3,a3,a4
>             sllw    a7,a2,a4
>             andi    a0,a0,-4
>             and     a3,a3,a5
>             not     t1,a5
>             and     a7,a7,a5
>             1:
>             lr.w    a6, 0(a0)
>             and     t3, a6, a5    // Both a6 (from the lr.w) and a5
>                                   // (from the sllw) are sign extended,
>                                   // so the result in t3 is sign extended.
>             bne     t3, a7, 1f    // a7 is also sign extended (before
>                                   // and after the masking above), so
>                                   // it's safe for comparison
>             and     t3, a6, t1
>             or      t3, t3, a3
>             sc.w    t3, t3, 0(a0) // The top bits of t3 end up polluted
>                                   // with sign extension, but it doesn't
>                                   // matter because of the sc.w.
>             bnez    t3, 1b
>             1:
>             sraw    a6,a6,a4
>             slliw   a2,a2,16
>             slliw   a5,a6,16
>             sraiw   a2,a2,16
>             sraiw   a5,a5,16
>             subw    a5,a5,a2
>             beq     a5,zero,.L1
>             sh      a6,0(a1)
>     .L1:
>             ret
>
> So I think we're OK -- that masking of a7 looks redundant here, but I
> don't think we could get away with just
>
>     diff --git a/gcc/config/riscv/sync.md b/gcc/config/riscv/sync.md
>     index 54bb0a66518..15956940032 100644
>     --- a/gcc/config/riscv/sync.md
>     +++ b/gcc/config/riscv/sync.md
>     @@ -456,7 +456,6 @@ (define_expand "atomic_cas_value_strong<mode>"
>        riscv_lshift_subword (<MODE>mode, o, shift, &shifted_o);
>        riscv_lshift_subword (<MODE>mode, n, shift, &shifted_n);
>
>     -  emit_move_insn (shifted_o, gen_rtx_AND (SImode, shifted_o, mask));
>        emit_move_insn (shifted_n, gen_rtx_AND (SImode, shifted_n, mask));
>
>        enum memmodel model_success = (enum memmodel) INTVAL (operands[4]);
>
> because we'd need the masking for when we don't know the high bits are
> safe pre-shift.  So maybe some sort of simplify call could help out
> there, but I bet it's not really worth bothering -- the bookeeping
> doesn't generally matter that much around AMOs.
>
> > - Patrick
> >
> > [1]:
> > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l495
> > [2]:
> > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/config/riscv/sync.md;h=54bb0a66518ae353fa4ed640339213bf5da6682c;hb=refs/heads/master#l459
> >
> >>
> >>>
> >>> Jeff

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-02-29  3:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-02-28 12:23 [PATCH] RISC-V: Fix __atomic_compare_exchange with 32 bit value on RV64 Kito Cheng
2024-02-28 14:57 ` Jeff Law
2024-02-28 15:02   ` Palmer Dabbelt
2024-02-28 17:36     ` Patrick O'Neill
2024-02-28 20:27       ` Palmer Dabbelt
2024-02-29  3:07         ` Kito Cheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).