Is there a way to tell GCC not to reorder a specific instruction?

public inbox for gcc@gcc.gnu.org

* Is there a way to tell GCC not to reorder a specific instruction?
@ 2020-09-29 10:47 夏 晋
  2020-09-29 12:00 ` Richard Biener
  2020-09-29 19:45 ` Jim Wilson
  0 siblings, 2 replies; 10+ messages in thread
From: 夏 晋 @ 2020-09-29 10:47 UTC (permalink / raw)
  To: gcc

Hi everyone,
I am trying to set "vlen" after the add and multiply, as shown in the following code:
➜
vf32 x3,x4;
void foo1(float16_t* input, float16_t* output, int vlen){
    vf32 add = x3 + x4;
    vf32 mul = x3 * x4;
    __builtin_riscv_vlen(vlen);  //<----
    storevf(&output[0], add);
    storevf(&output[4], mul);
}
but after compilation, the "vlen" write has been moved ahead of the add and multiply:
➜
foo1:
    lui     a5,%hi(.LANCHOR0)
    addi    a5,a5,%lo(.LANCHOR0)
    addi    a4,a5,64
    vfld    v0,a5
    vfld    v1,a4
    csrw    vlen,a2  //<----
    vfadd   v2,v0,v1
    addi    a5,a1,8
    vfmul   v0,v0,v1
    vfst    v2,a1
    vfst    v0,a5
    ret
I have also tried adding barriers, in the following two ways:
➜
#define barrier() __asm__ __volatile__("": : :"memory")
vf32 x3,x4;
void foo1(float16_t* input, float16_t* output, int vlen){
    vf32 add = x3 + x4;
    vf32 mul = x3 * x4;
    barrier();
    __builtin_riscv_vlen(vlen);
    barrier();
    storevf(&output[0], add);
    storevf(&output[4], mul);
}
➜
vf32 x3,x4;
void foo1(float16_t* input, float16_t* output, int vlen){
    vf32 add = x3 + x4;
    vf32 mul = x3 * x4;
    __asm__ __volatile__ ("csrw\tvlen,%0" : : "rJ"(vlen) : "memory");
    storevf(&output[0], add);
    storevf(&output[4], mul);
}
Both variants produce the same incorrect assembly.
=======
But if the add and multiply use different operands, as in:
➜
vf32 x1,x2;
vf32 x3,x4;
void foo1(float16_t* input, float16_t* output, int vlen){
    vf32 add = x3 + x4;
    vf32 mul = x1 * x2;
    __builtin_riscv_vlen(vlen);
    storevf(&output[0], add);
    storevf(&output[4], mul);
}
the assembly is correct:
➜
foo1:
    lui     a5,%hi(.LANCHOR0)
    addi    a5,a5,%lo(.LANCHOR0)
    addi    a0,a5,64
    addi    a3,a5,128
    addi    a4,a5,192
    vfld    v1,a5
    vfld    v3,a0
    vfld    v0,a3
    vfld    v2,a4
    vfadd   v1,v1,v3
    vfmul   v0,v0,v2
    csrw    vlen,a2  //<----
    addi    a5,a1,8
    vfst    v1,a1
    vfst    v0,a5
    ret

Is there another way to write the code, or a GCC option, that deals with this issue?
Any suggestion would be appreciated. Thank you very much!

Best,
Jin

* Re: Is there a way to tell GCC not to reorder a specific instruction?
  2020-09-29 10:47 Is there a way to tell GCC not to reorder a specific instruction? 夏 晋
@ 2020-09-29 12:00 ` Richard Biener
  2020-09-29 13:59   ` Re: " 夏 晋
  2020-09-29 19:45 ` Jim Wilson
  1 sibling, 1 reply; 10+ messages in thread
From: Richard Biener @ 2020-09-29 12:00 UTC (permalink / raw)
  To: 夏 晋; +Cc: gcc

On Tue, Sep 29, 2020 at 12:55 PM 夏 晋 via Gcc <gcc@gcc.gnu.org> wrote:
>
> Hi everyone,
> I tried to set the "vlen" after the add & multi, as shown in the following code:
> ➜
> vf32 x3,x4;
> void foo1(float16_t* input, float16_t* output, int vlen){
>     vf32 add = x3 + x4;
>     vf32 mul = x3 * x4;
>     __builtin_riscv_vlen(vlen);  //<----
>     storevf(&output[0], add);
>     storevf(&output[4], mul);
> }
> but after compilation, the "vlen" is reordered:
> ➜
> foo1:
>     lui     a5,%hi(.LANCHOR0)
>     addi    a5,a5,%lo(.LANCHOR0)
>     addi    a4,a5,64
>     vfld    v0,a5
>     vfld    v1,a4
>     csrw    vlen,a2  //<----
>     vfadd   v2,v0,v1
>     addi    a5,a1,8
>     vfmul   v0,v0,v1
>     vfst    v2,a1
>     vfst    v0,a5
>     ret
> And I've tried to add some barrier code shown as the following:
> ➜
> #define barrier() __asm__ __volatile__("": : :"memory")
> vf32 x3,x4;
> void foo1(float16_t* input, float16_t* output, int vlen){
>     vf32 add = x3 + x4;
>     vf32 mul = x3 * x4;
>     barrier();
>     __builtin_riscv_vlen(vlen);
>     barrier();
>     storevf(&output[0], add);
>     storevf(&output[4], mul);
> }
> ➜
> vf32 x3,x4;
> void foo1(float16_t* input, float16_t* output, int vlen){
>     vf32 add = x3 + x4;
>     vf32 mul = x3 * x4;
>     __asm__ __volatile__ ("csrw\tvlen,%0" : : "rJ"(vlen) : "memory");
>     storevf(&output[0], add);
>     storevf(&output[4], mul);
> }
> Both methods compiled out the same false assembly.
> =======
> But if I tried the code like: (add & multi are using different operands)
> ➜
> vf32 x1,x2;
> vf32 x3,x4;
> void foo1(float16_t* input, float16_t* output, int vlen){
>     vf32 add = x3 + x4;
>     vf32 mul = x1 * x2;
>     __builtin_riscv_vlen(vlen);
>     storevf(&output[0], add);
>     storevf(&output[4], mul);
> }
> the assembly will be right:
> ➜
> foo1:
>     lui     a5,%hi(.LANCHOR0)
>     addi    a5,a5,%lo(.LANCHOR0)
>     addi    a0,a5,64
>     addi    a3,a5,128
>     addi    a4,a5,192
>     vfld    v1,a5
>     vfld    v3,a0
>     vfld    v0,a3
>     vfld    v2,a4
>     vfadd   v1,v1,v3
>     vfmul   v0,v0,v2
>     csrw    vlen,a2  <----
>     addi    a5,a1,8
>     vfst    v1,a1
>     vfst    v0,a5
>     ret
>
> Is there any other way for coding or other option for gcc compilation to deal with this issue.
> Any suggestion would be appreciated. Thank you very much!

You need to present GCC with a data dependence that prevents the re-ordering,
for example by adding inputs/outputs for add/mul like

asm volatile ("csrw\tvlen, %4" : "=r" (add), "=r" (mul) : "0" (add),
"1" (mul), "rJ" (vlen));

Richard.

> Best,
> Jin
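
The data-dependence approach suggested above can be tried out on any GCC target by using an empty asm template in place of the csrw instruction. What follows is a minimal sketch under stated assumptions, not a tested RISC-V solution: ORDER_POINT2 and foo1_scalar are hypothetical names, plain int stands in for the vf32 vector type, and the "r" constraint is only appropriate for values held in general registers (a vector port would need its own constraint letter).

```c
/* Hypothetical stand-in for the "csrw vlen" instruction: the asm template is
   empty, so this compiles on any GCC target. The "+r" operands tell the
   compiler that the statement both reads and rewrites a and b, so their
   computation must finish before it and their later uses must come after. */
#define ORDER_POINT2(a, b) __asm__ __volatile__("" : "+r"(a), "+r"(b))

/* Scalar analogue of foo1 from the thread (int instead of vf32). */
void foo1_scalar(int x3, int x4, int *output)
{
    int add = x3 + x4;
    int mul = x3 * x4;
    ORDER_POINT2(add, mul);   /* the mode-setting instruction would go here */
    output[0] = add;
    output[1] = mul;
}
```

Writing "+r"(add) is shorthand for an "=r"(add) output paired with a matching "0"(add) input. Every value that has to be complete before the ordering point must be listed explicitly, which is the limitation raised in the follow-up message.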

* Re: Is there a way to tell GCC not to reorder a specific instruction?
  2020-09-29 12:00 ` Richard Biener
@ 2020-09-29 13:59   ` 夏 晋
  0 siblings, 0 replies; 10+ messages in thread
From: 夏 晋 @ 2020-09-29 13:59 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc

Hi Richard,
Thank you for your reply.
If we use
asm volatile ("csrw\tvlen, %4" : "=r" (add), "=r" (mul) : "0" (add), "1" (mul), "rJ" (vlen));
we need to know every calculation that comes before the "vlen" setting.
For example, if we add a "sub" like:
vf32 sub = x3 - x4;
then we also need to add an output and a matching input operand for sub.
Is there a way to ignore the steps before the "vlen" setting, so that the solution is more general?

Best,
Jin

* Re: Is there a way to tell GCC not to reorder a specific instruction?
  2020-09-29 10:47 Is there a way to tell GCC not to reorder a specific instruction? 夏 晋
  2020-09-29 12:00 ` Richard Biener
@ 2020-09-29 19:45 ` Jim Wilson
  2020-09-30  2:22   ` Re: " 夏 晋
  2020-09-30  6:40   ` Richard Biener
  1 sibling, 2 replies; 10+ messages in thread
From: Jim Wilson @ 2020-09-29 19:45 UTC (permalink / raw)
  To: 夏 晋; +Cc: gcc

On Tue, Sep 29, 2020 at 3:47 AM 夏 晋 via Gcc <gcc@gcc.gnu.org> wrote:
> I tried to set the "vlen" after the add & multi, as shown in the following code:

> vf32 x3,x4;
> void foo1(float16_t* input, float16_t* output, int vlen){
>     vf32 add = x3 + x4;
>     vf32 mul = x3 * x4;
>     __builtin_riscv_vlen(vlen);  //<----
>     storevf(&output[0], add);
>     storevf(&output[4], mul);
> }

Not clear what __builtin_riscv_vlen is doing, or what exactly your
target is, but the gcc port I did for the RISC-V draft V extension
creates new fake vector type and vector length registers, like the
existing fake fp and arg pointer registers, and the vsetvl{i}
instruction sets the fake vector type and vector length registers, and
all vector instructions read the fake vector type and vector length
registers.  That creates the dependence between the instructions that
prevents reordering.  It is a little more complicated than that, as
you can have more than one vsetvl{i} instruction setting different
vector type and/or vector length values, so we have to match on the
expected values to make sure that vector instructions are tied to the
right vsetvl{i} instruction.  This is a work in progress, but overall
it is working pretty well.  This requires changes to the gcc port, as
you have to add the new fake registers in gcc/config/riscv/riscv.h.
This isn't something you can do with macros and extended asms.

See for instance
    https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/Krhw8--wmi4/m/-3IPvT7JCgAJ

Jim
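
The dependence that the fake registers create can be illustrated with a toy model. This is only a sketch of the general idea, not the data structures or the scheduler that GCC uses: struct insn, the REG_* masks, and can_reorder are hypothetical names invented for the example.

```c
#include <stdbool.h>

/* Toy model, not GCC internals: each instruction lists the registers it
   reads and writes as bit masks. REG_VL plays the role of the fake vector
   length register described above. */
enum {
    REG_V0 = 1 << 0,
    REG_V1 = 1 << 1,
    REG_V2 = 1 << 2,
    REG_VL = 1 << 3
};

struct insn {
    const char *name;
    unsigned reads;
    unsigned writes;
};

/* Two instructions may swap places only if neither one writes a register
   that the other reads or writes. */
bool can_reorder(const struct insn *a, const struct insn *b)
{
    bool raw = (a->writes & b->reads) != 0;    /* read after write  */
    bool war = (a->reads & b->writes) != 0;    /* write after read  */
    bool waw = (a->writes & b->writes) != 0;   /* write after write */
    return !(raw || war || waw);
}
```

If a vector add lists REG_VL among its reads, can_reorder refuses to swap it with an instruction that writes REG_VL. Without that read nothing ties the two together, which is consistent with the reordering seen in the original post.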

* Re: Is there a way to tell GCC not to reorder a specific instruction?
  2020-09-29 19:45 ` Jim Wilson
@ 2020-09-30  2:22   ` 夏 晋
  2020-09-30 19:44     ` Jim Wilson
  2020-09-30  6:40   ` Richard Biener
  1 sibling, 1 reply; 10+ messages in thread
From: 夏 晋 @ 2020-09-30  2:22 UTC (permalink / raw)
  To: Jim Wilson; +Cc: gcc

Hi Jim,
Thank you for your reply. I tried the following code with GCC for the RVV extension and still hit the same issue.
➜
vint16m1_t foo3(vint16m1_t a, vint16m1_t b){
  vint16m1_t add = a+b;
  vint16m1_t mul = a*b;
  vsetvl_e8m1(32);
  return add + mul;
}

the assembly is:
➜
foo3:
        li      a4,32
        vl1r.v  v1,0(a1)
        vl1r.v  v3,0(a2)
        vsetvli a4,a4,e8,m1
        vsetvli x0,x0,e16,m1
        vadd.vv v2,v1,v3
        vmul.vv v1,v1,v3
        vadd.vv v1,v2,v1
        vs1r.v  v1,0(a0)
        ret

Unfortunately, the "vsetvl_e8m1" has been moved ahead of the add and multiply.
Have you ever encountered this problem? Has it been solved, and if so, how? Thanks again.

Best,
Jin

* Re: Is there a way to tell GCC not to reorder a specific instruction?
  2020-09-29 19:45 ` Jim Wilson
  2020-09-30  2:22   ` Re: " 夏 晋
@ 2020-09-30  6:40   ` Richard Biener
  2020-09-30 20:01     ` Jim Wilson
  1 sibling, 1 reply; 10+ messages in thread
From: Richard Biener @ 2020-09-30  6:40 UTC (permalink / raw)
  To: Jim Wilson; +Cc: 夏 晋, gcc

On Tue, Sep 29, 2020 at 9:46 PM Jim Wilson <jimw@sifive.com> wrote:
>
> On Tue, Sep 29, 2020 at 3:47 AM 夏 晋 via Gcc <gcc@gcc.gnu.org> wrote:
> > I tried to set the "vlen" after the add & multi, as shown in the following code:
>
> > vf32 x3,x4;
> > void foo1(float16_t* input, float16_t* output, int vlen){
> >     vf32 add = x3 + x4;
> >     vf32 mul = x3 * x4;
> >     __builtin_riscv_vlen(vlen);  //<----
> >     storevf(&output[0], add);
> >     storevf(&output[4], mul);
> > }
>
> Not clear what __builtin_riscv_vlen is doing, or what exactly your
> target is, but the gcc port I did for the RISC-V draft V extension
> creates new fake vector type and vector length registers, like the
> existing fake fp and arg pointer registers, and the vsetvl{i}
> instruction sets the fake vector type and vector length registers, and
> all vector instructions read the fake vector type and vector length
> registers.  That creates the dependence between the instructions that
> prevents reordering.  It is a little more complicated than that, as
> you can have more than one vsetvl{i} instruction setting different
> vector type and/or vector length values, so we have to match on the
> expected values to make sure that vector instructions are tied to the
> right vsetvl{i} instruction.  This is a work in progress, but overall
> it is working pretty well.  This requires changes to the gcc port, as
> you have to add the new fake registers in gcc/config/riscv/riscv.h.
> This isn't something you can do with macros and extended asms.

But this also doesn't work on GIMPLE.  On GIMPLE riscv_vlen would
be a barrier for code motion if you make it __attribute__((returns_twice))
since then abnormal edges distort the CFG in a way preventing such motion.

> See for instance
>     https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/Krhw8--wmi4/m/-3IPvT7JCgAJ
>
> Jim
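
The returns_twice suggestion above can be sketched with an ordinary function. The code below is a hedged illustration only: set_vlen, current_vlen, and foo1_scalar are hypothetical names, the function body is a plain store rather than a real csrw, and int stands in for the vector type. The thread does not confirm that this is sufficient for any particular port.

```c
/* Hypothetical stand-in for the mode register written by "csrw vlen". */
static int current_vlen;

/* returns_twice is the attribute GCC documents for setjmp-like functions.
   noinline keeps the call visible as a call. */
__attribute__((returns_twice, noinline))
void set_vlen(int vlen)
{
    current_vlen = vlen;
}

/* Scalar analogue of foo1 from the thread. */
int foo1_scalar(int x3, int x4, int vlen)
{
    int add = x3 + x4;
    int mul = x3 * x4;
    set_vlen(vlen);   /* call site that the attribute marks as setjmp-like */
    return add + mul;
}
```

The attribute changes how the compiler treats the call site, not what the function computes, so the result is the same with or without it. Whether the arithmetic actually stays ahead of the call has to be checked in the generated assembly.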

* Re: Is there a way to tell GCC not to reorder a specific instruction?
  2020-09-30  2:22   ` Re: " 夏 晋
@ 2020-09-30 19:44     ` Jim Wilson
  0 siblings, 0 replies; 10+ messages in thread
From: Jim Wilson @ 2020-09-30 19:44 UTC (permalink / raw)
  To: 夏 晋; +Cc: gcc

On Tue, Sep 29, 2020 at 7:22 PM 夏 晋 <ilyply2006@hotmail.com> wrote:
> vint16m1_t foo3(vint16m1_t a, vint16m1_t b){
>   vint16m1_t add = a+b;
>   vint16m1_t mul = a*b;
>   vsetvl_e8m1(32);
>   return add + mul;
> }

Taking another look at your example, you have type confusion.  Using
vsetvl to specify an element width of 8 does not magically convert
types into 8-bit vector types.  They are still 16-bit vector types and
will still result in 16-bit vector operations.  So your explicit
vsetvl_e8m1 is completely useless.

In the RISC-V V scheme, every vector operation emits an implicit
vsetvl instruction, and then we optimize away the redundant ones.  So
the add and mul at the start are emitting two vsetvl instructions.
Then you have an explicit vsetvl.  Then another add, which will emit
another implicit vsetvl.  The compiler reordered the arithmetic in
such a way that two of the implicit vsetvl instructions can be
optimized away.  That probably happened by accident.  But we don't
have support for optimizing away the useless explicit vsetvl, so it
remains.

Jim
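
The effect described above, where each vector operation asks for a vsetvl and the redundant ones are dropped, can be modelled in a few lines. This is a toy sketch and not the pass in the RVV port: struct vconfig and count_needed_vsetvl are hypothetical names, and the vl values used with it are invented for illustration.

```c
#include <stddef.h>

/* Toy model: a vector configuration is an element width plus a length. */
struct vconfig {
    int sew;   /* element width in bits */
    int vl;    /* requested vector length */
};

/* Given the configuration each operation asks for, in program order, count
   how many vsetvl instructions are really needed: one for the first
   operation, and one whenever the request differs from the configuration
   already in effect. */
size_t count_needed_vsetvl(const struct vconfig *ops, size_t n)
{
    size_t needed = 0;
    for (size_t i = 0; i < n; i++) {
        if (i == 0 || ops[i].sew != ops[i - 1].sew || ops[i].vl != ops[i - 1].vl)
            needed++;
    }
    return needed;
}
```

Feeding it foo3 in source order (16-bit add, 16-bit mul, explicit e8 setting, 16-bit add) gives three settings, while the emitted order (explicit e8 setting first, then the three 16-bit operations) gives two, matching the two vsetvli instructions in the assembly listing.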

* Re: Is there a way to tell GCC not to reorder a specific instruction?
  2020-09-30  6:40   ` Richard Biener
@ 2020-09-30 20:01     ` Jim Wilson
  2020-10-01  6:34       ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Jim Wilson @ 2020-09-30 20:01 UTC (permalink / raw)
  To: Richard Biener; +Cc: 夏 晋, gcc

On Tue, Sep 29, 2020 at 11:40 PM Richard Biener
<richard.guenther@gmail.com> wrote:
> But this also doesn't work on GIMPLE.  On GIMPLE riscv_vlen would
> be a barrier for code motion if you make it __attribute__((returns_twice))
> since then abnormal edges distort the CFG in a way preventing such motion.

At the gimple level, all vector operations have an implicit vsetvl, so
it doesn't matter much how they are sorted.  As long as they don't get
sorted across an explicit vsetvl that they depend on.  But the normal
way to use explicit vsetvl is to control a loop, and you can't move
dependent operations out of the loop, so it tends to work.  Setting
vsetvl in the middle of a basic block is less useful and less common,
and very unlikely to work unless you really know what you are doing.
Basically, RISC-V wasn't designed to work this way, and so you
probably shouldn't be writing your code this way.  There might be edge
cases where we aren't handling this right, as we aren't writing code
this way, and hence we aren't testing this support.  This is still a
work in progress.

Good RVV code should look more like this:

#include <riscv_vector.h>
#include <stddef.h>

void saxpy(size_t n, const float a, const float *x, float *y) {
  size_t l;

  vfloat32m8_t vx, vy;

  for (; (l = vsetvl_e32m8(n)) > 0; n -= l) {
    vx = vle32_v_f32m8(x);
    x += l;
    vy = vle32_v_f32m8(y);
    // vfmacc
    vy = a * vx + vy;
    vse32_v_f32m8(y, vy);
    y += l;
  }
}

We have a lot of examples in gcc/testsuite/gcc.target/riscv/rvv that
we are using for testing the vector support.

Jim
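
The loop structure of the saxpy example can be followed without an RVV toolchain by emulating it in scalar C. The sketch below is an assumption-laden stand-in, not RVV code: toy_vsetvl, TOY_VLMAX, and saxpy_stripmined are hypothetical names, and the inner for loop plays the part of one vector instruction working on l elements.

```c
#include <stddef.h>

#define TOY_VLMAX 8   /* hypothetical limit on elements handled per pass */

/* Stand-in for vsetvl: grants min(n, TOY_VLMAX) elements. */
static size_t toy_vsetvl(size_t n)
{
    return n < TOY_VLMAX ? n : TOY_VLMAX;
}

void saxpy_stripmined(size_t n, float a, const float *x, float *y)
{
    size_t l;

    /* Each trip asks how many elements it may process, handles exactly that
       many, then advances. The loop ends when nothing is left to grant. */
    for (; (l = toy_vsetvl(n)) > 0; n -= l) {
        for (size_t i = 0; i < l; i++)
            y[i] = a * x[i] + y[i];
        x += l;
        y += l;
    }
}
```

Because every operation in the loop body depends on the l returned at the top of the same iteration, the length setting naturally controls the work that follows it, which is the usage pattern the message above recommends.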

* Re: Is there a way to tell GCC not to reorder a specific instruction?
  2020-09-30 20:01     ` Jim Wilson
@ 2020-10-01  6:34       ` Richard Biener
  2020-10-01 16:59         ` Jim Wilson
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2020-10-01  6:34 UTC (permalink / raw)
  To: Jim Wilson; +Cc: 夏 晋, gcc

On Wed, Sep 30, 2020 at 10:01 PM Jim Wilson <jimw@sifive.com> wrote:
>
> On Tue, Sep 29, 2020 at 11:40 PM Richard Biener
> <richard.guenther@gmail.com> wrote:
> > But this also doesn't work on GIMPLE.  On GIMPLE riscv_vlen would
> > be a barrier for code motion if you make it __attribute__((returns_twice))
> > since then abnormal edges distort the CFG in a way preventing such motion.
>
> At the gimple level, all vector operations have an implicit vsetvl, so
> it doesn't matter much how they are sorted.  As long as they don't get
> sorted across an explicit vsetvl that they depend on.  But the normal
> way to use explicit vsetvl is to control a loop, and you can't move
> dependent operations out of the loop, so it tends to work.  Setting
> vsetvl in the middle of a basic block is less useful and less common,
> and very unlikely to work unless you really know what you are doing.
> Basically, RISC-V wasn't designed to work this way, and so you
> probably shouldn't be writing your code this way.  There might be edge
> cases where we aren't handling this right, as we aren't writing code
> this way, and hence we aren't testing this support.  This is still a
> work in progress.
>
> Good RVV code should look more like this:
>
> #include <riscv_vector.h>
> #include <stddef.h>
>
> void saxpy(size_t n, const float a, const float *x, float *y) {
>   size_t l;
>
>   vfloat32m8_t vx, vy;
>
>   for (; (l = vsetvl_e32m8(n)) > 0; n -= l) {
>     vx = vle32_v_f32m8(x);
>     x += l;
>     vy = vle32_v_f32m8(y);
>     // vfmacc
>     vy = a * vx + vy;
>     vse32_v_f32m8(y, vy);
>     y += l;
>   }
> }

Ah, ok - that makes sense.

> We have a lot of examples in gcc/testsuite/gcc.target/riscv/rvv that
> we are using for testing the vector support.

That doesn't seem to exist (but maybe it's just not on trunk yet).

Richard.

> Jim

* Re: Is there a way to tell GCC not to reorder a specific instruction?
  2020-10-01  6:34       ` Richard Biener
@ 2020-10-01 16:59         ` Jim Wilson
  0 siblings, 0 replies; 10+ messages in thread
From: Jim Wilson @ 2020-10-01 16:59 UTC (permalink / raw)
  To: Richard Biener; +Cc: 夏 晋, gcc

On Wed, Sep 30, 2020 at 11:35 PM Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Sep 30, 2020 at 10:01 PM Jim Wilson <jimw@sifive.com> wrote:
> > We have a lot of examples in gcc/testsuite/gcc.target/riscv/rvv that
> > we are using for testing the vector support.
>
> That doesn't seem to exist (but maybe it's just not on trunk yet).

The vector extension is still in draft form, and they are still making
major compatibility breaks.  There was yet another one about 3-4 weeks
ago.  I don't want to upstream anything until we have an officially
accepted V extension, at which point they will stop allowing
compatibility breaks.  If we upstream now, we would need some protocol
for how to handle unsupported experimental patches in mainline, and I
don't think that we have one.

So for now, the vector support is on a branch in the RISC-V
International github repo.
https://github.com/riscv/riscv-gnu-toolchain/tree/rvv-intrinsic
The gcc testcases specifically are here
https://github.com/riscv/riscv-gcc/tree/riscv-gcc-10.1-rvv-dev/gcc/testsuite/gcc.target/riscv/rvv
A lot of the testcases use macros so we can test every variation of an
instruction, and there is a large number of variations for most
instructions, so most of these testcases aren't very readable.  They
are just to verify that we can generate the instructions we expect.
Only the algorithm ones are readable, like saxpy, memcpy, strcpy.

Jim
