Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

public inbox for gcc-patches@gcc.gnu.org

From: Richard Sandiford <richard.sandiford@arm.com>
To: "Li, Pan2" <pan2.li@intel.com>
Cc: "盼 李" <incarnation.p.lee@outlook.com>,
	"incarnation.p.lee--- via Gcc-patches" <gcc-patches@gcc.gnu.org>,
	"juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>,
	"kito.cheng@sifive.com" <kito.cheng@sifive.com>,
	"rguenther@suse.de" <rguenther@suse.de>
Subject: Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment
Date: Wed, 01 Mar 2023 10:11:44 +0000
Message-ID: <mptedq8ok67.fsf@arm.com>
In-Reply-To: <MW5PR11MB5908CAEDD81742D2CDCB6CC2A9AC9@MW5PR11MB5908.namprd11.prod.outlook.com> (Pan2 Li's message of "Tue, 28 Feb 2023 14:07:15 +0000")

"Li, Pan2" <pan2.li@intel.com> writes:
> Hi Richard Sandiford,
>
> I just tried an overload for constant divisors, using the printf and the division function below, and it works as you described 😉
>
> printf ("    can_div_away_from_zero_p (mode_precision[E_%smode], "
>      "BITS_PER_UNIT, &mode_size[E_%smode]);\n", m->name, m->name);
>
> template<unsigned int N, typename Ca, typename Cb, typename Cq>
> inline typename if_nonpoly<Cb, bool>::type
> can_div_away_from_zero_p (const poly_int_pod<N, Ca> &a,
>                          Cb b,
>                          poly_int_pod<N, Cq> *quotient)
> {
>   if (!can_div_trunc_p (a, b, quotient))
>     return false;
>   if (maybe_ne (*quotient * b, a))
>     for (unsigned int i = 0; i < N; ++i)
>       quotient->coeffs[i] += (quotient->coeffs[i] < 0 ? -1 : 1);
>   return true;
> }
>
> But I have a question about the following case.
>
> Assume:
> a = [4, 4], b = 8.
>
> can_div_trunc_p checks whether the remainder is constant, i.e. whether a.coeffs[i] % 8 == 0 for every i >= 1. If the remainder is not constant, can_div_trunc_p leaves the quotient untouched and returns false.
>
> Thus, when a = [4, 4], can_div_away_from_zero_p leaves *quotient unchanged, i.e. mode_size[E_%smode] is unchanged in this case. However, the underlying mode_size will still be adjusted to the real byte size, and I am not sure whether this is by design or requires additional handling.
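
For readers following along, the a = [4, 4], b = 8 case can be reproduced with a small standalone model. This is only a sketch based on the behaviour described in the message above, not GCC's poly-int.h: `poly2`, `toy_can_div_trunc_p`, `try_div` and `demo` are hypothetical names, and the model fixes N at 2.

```cpp
#include <cassert>

// Hypothetical two-coefficient stand-in for poly_int:
// the value is coeffs[0] + coeffs[1] * X for a runtime X.
struct poly2 {
  long coeffs[2];
};

// Simplified model of truncating division by a constant divisor, following
// the description above: the remainder must be constant, so every
// coefficient with index >= 1 has to be an exact multiple of b.
// On failure the quotient is left untouched.
inline bool toy_can_div_trunc_p(const poly2 &a, long b, poly2 *quotient) {
  for (unsigned i = 1; i < 2; ++i)
    if (a.coeffs[i] % b != 0)
      return false;
  for (unsigned i = 0; i < 2; ++i)
    quotient->coeffs[i] = a.coeffs[i] / b;
  return true;
}

// Convenience wrapper: returns true on success and reports the quotient
// through q0/q1, which keep their previous values on failure.
inline bool try_div(long c0, long c1, long b, long &q0, long &q1) {
  poly2 a = {{c0, c1}};
  poly2 q = {{q0, q1}};
  bool ok = toy_can_div_trunc_p(a, b, &q);
  q0 = q.coeffs[0];
  q1 = q.coeffs[1];
  return ok;
}

inline void demo() {
  long q0 = -1, q1 = -1;              // sentinel values
  assert(!try_div(4, 4, 8, q0, q1));  // [4,4] / 8: 4 % 8 != 0, so it fails
  assert(q0 == -1 && q1 == -1);       // quotient was not modified
  assert(try_div(8, 8, 8, q0, q1));   // [8,8] / 8 == [1,1]
  assert(q0 == 1 && q1 == 1);
}
```

In this model a wrapper that first calls the truncating division and returns false on failure never reaches its rounding step for [4, 4], which matches the observation in the question.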

Is it right that, for RVV, a load or store of [4,4] will access [8,8]
bits, even when that means accessing fully-unused bytes?  E.g. 4+4X
when X=3 would be 16 bits/2 bytes of useful data, but a bitsize of
8+8X would be 32 bits/4 bytes.  So a store of [8,8] for a precision
of [4,4] would store 2 bytes beyond the end of the useful data when X==3?
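
The arithmetic in that example can be spelled out as a standalone sketch. It assumes 8 bits per byte and uses hypothetical helpers (`poly2`, `eval`, `bytes_for_bits`) that are not part of GCC; it only evaluates c0 + c1*X for a concrete X.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient poly_int: c0 + c1 * X.
struct poly2 {
  int64_t c0, c1;
};

// Evaluate the polynomial for a concrete runtime value of X.
constexpr int64_t eval(poly2 p, int64_t x) { return p.c0 + p.c1 * x; }

// Bytes needed to hold `bits` bits, rounding up (8 bits per byte assumed).
constexpr int64_t bytes_for_bits(int64_t bits) { return (bits + 7) / 8; }

// Precision [4,4] versus a storage bitsize of [8,8], both at X == 3.
constexpr int64_t useful_bits = eval({4, 4}, 3);  // 4 + 4*3
constexpr int64_t stored_bits = eval({8, 8}, 3);  // 8 + 8*3

static_assert(useful_bits == 16 && bytes_for_bits(useful_bits) == 2,
              "16 bits, i.e. 2 bytes of useful data");
static_assert(stored_bits == 32 && bytes_for_bits(stored_bits) == 4,
              "32 bits, i.e. 4 bytes accessed");
// The difference is the 2 bytes stored beyond the end of the useful data.
static_assert(bytes_for_bits(stored_bits) - bytes_for_bits(useful_bits) == 2,
              "2 bytes past the useful data");
```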

Richard

> Pan
>
> From: 盼 李 <incarnation.p.lee@outlook.com>
> Sent: Tuesday, February 28, 2023 5:59 PM
> To: Richard Sandiford <richard.sandiford@arm.com>; Li, Pan2 <pan2.li@intel.com>
> Cc: incarnation.p.lee--- via Gcc-patches <gcc-patches@gcc.gnu.org>; juzhe.zhong@rivai.ai; kito.cheng@sifive.com; rguenther@suse.de
> Subject: Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment
>
> Understood, thanks for the explanations and suggestions. Let me have a try and keep you posted.
>
> Pan
> ________________________________
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: Tuesday, February 28, 2023 17:50
> To: Li, Pan2 <pan2.li@intel.com>
> Cc: 盼 李 <incarnation.p.lee@outlook.com>; incarnation.p.lee--- via Gcc-patches <gcc-patches@gcc.gnu.org>; juzhe.zhong@rivai.ai; kito.cheng@sifive.com; rguenther@suse.de
> Subject: Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment
>
> "Li, Pan2" <pan2.li@intel.com> writes:
>> Hi Richard Sandiford,
>>
>> After some investigation, I am not sure whether it is possible to make this general without changing exact_div. We could add a function like the one below to get the unit poly for any N.
>>
>> template<unsigned int N, typename Ca>
>> inline POLY_CONST_RESULT (N, Ca, Ca)
>> normalize_to_unit (const poly_int_pod<N, Ca> &a)
>> {
>>   typedef POLY_CONST_COEFF (Ca, Ca) C;
>>
>>   poly_int<N, C> normalized = a;
>>
>>   if (normalized.is_constant())
>>     normalized.coeffs[0] = 1;
>>   else
>>     for (unsigned int i = 0; i < N; i++)
>>       POLY_SET_COEFF (C, normalized, i, 1);
>>
>>   return normalized;
>> }
>>
>> genmodes could then be adjusted as below to consume the unit poly.
>>
>>       printf ("    poly_uint16 unit_poly = "
>>              "normalize_to_unit (mode_precision[E_%smode]);\n", m->name);
>>       printf ("    if (known_lt (mode_precision[E_%smode], "
>>              "unit_poly * BITS_PER_UNIT))\n", m->name);
>>       printf ("      mode_size[E_%smode] = unit_poly;\n", m->name);
>>
>> I am not sure whether it is a good idea to introduce the normalization code above into exact_div, given that the comment on exact_div says "/* Return A / B, given that A is known to be a multiple of B.  */".
>
> My point was that we have multiple ways of dividing poly_ints:
>
> - exact_div, for when the caller knows that the result is always exact
> - can_div_trunc_p, for truncating division (round towards 0)
> - can_div_away_from_zero_p, for rounding away from 0
> - ...
>
> This is like how we have multiple division *_EXPRs on trees.
>
> Until now, exact_div was the correct choice for modes because vector
> modes didn't have padding.  We're now changing that, so my suggestion
> in the review was to change the division operation that we use.
> Rather than use exact_div, we should now use can_div_away_from_zero_p,
> which would have the effect of rounding the quotient up.
>
> Something like:
>
>       if (!can_div_away_from_zero_p (mode_precision[E_%smode], BITS_PER_UNIT,
>                                      &mode_size[E_%smode]))
>         gcc_unreachable ();
>
> But this will require a new overload of can_div_away_from_zero_p, since
> the existing one is for constant quotients rather than constant divisors.
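
As a scalar illustration of the rounding modes listed above (plain integers only, not poly_ints), the sketch below contrasts truncating division with division that rounds away from zero. `div_trunc` and `div_away_from_zero` are hypothetical helper names, not GCC functions.

```cpp
#include <cassert>

// Truncating division: rounds the quotient towards zero (plain C++ '/').
constexpr long div_trunc(long a, long b) { return a / b; }

// Division rounding away from zero: when there is a remainder, move the
// truncated quotient one step further in the direction of the true
// quotient's sign.
constexpr long div_away_from_zero(long a, long b) {
  return a / b + (a % b != 0 ? (((a < 0) != (b < 0)) ? -1 : 1) : 0);
}

// A 4-bit precision with 8 bits per byte: truncation gives a size of 0,
// rounding away from zero gives the 1 byte actually needed.
static_assert(div_trunc(4, 8) == 0, "truncates to 0");
static_assert(div_away_from_zero(4, 8) == 1, "rounds up to 1");
// Exact multiples are unaffected by the rounding mode.
static_assert(div_away_from_zero(16, 8) == 2, "exact quotient");
```

For non-negative precisions this is the same as rounding the quotient up, which is the effect described above for mode sizes.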
>
> Thanks,
> Richard
>
>>
>> Could you please share your opinion on this from an expert's perspective? Thank you!
>>
>> Pan
>>
>> From: 盼 李 <incarnation.p.lee@outlook.com>
>> Sent: Monday, February 27, 2023 11:13 PM
>> To: Richard Sandiford <richard.sandiford@arm.com>; incarnation.p.lee--- via Gcc-patches <gcc-patches@gcc.gnu.org>
>> Cc: juzhe.zhong@rivai.ai; kito.cheng@sifive.com; rguenther@suse.de; Li, Pan2 <pan2.li@intel.com>
>> Subject: Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment
>>
>> No problem, I hope you had a good holiday.
>>
>> Thanks for pointing this out; the if part cannot handle a poly_int with N > 2. As I understand it, we need to make it general for every N of poly_int.
>>
>> So I would like to double-check with you how to make it general. I suppose there will be a new function can_div_away_from_zero_p to replace the if (known_lt (,)) part in genmodes.cc, leaving exact_div unchanged (given the word "exact", I suppose we should not touch it), right? Then we would still need a poly_int with all coefficients set to 1 for N as the result when can_div_away_from_zero_p returns true.
>>
>> Thanks again for your expert suggestion, have a nice day 😉!
>>
>> Pan
>> ________________________________
>> From: Richard Sandiford <richard.sandiford@arm.com>
>> Sent: Monday, February 27, 2023 22:24
>> To: incarnation.p.lee--- via Gcc-patches <gcc-patches@gcc.gnu.org>
>> Cc: incarnation.p.lee@outlook.com; juzhe.zhong@rivai.ai; kito.cheng@sifive.com; rguenther@suse.de; pan2.li@intel.com
>> Subject: Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment
>>
>> Sorry for the slow reply, been away for a couple of weeks.
>>
>> "incarnation.p.lee--- via Gcc-patches" <gcc-patches@gcc.gnu.org> writes:
>>> From: Pan Li <pan2.li@intel.com>
>>>
>>>        Fix the rvv bool mode precision bug by adjusting the precision.
>>>        The bit size of vbool*_t will be adjusted to
>>>        [1, 2, 4, 8, 16, 32, 64] according to the rvv spec 1.0 ISA. The
>>>        adjusted mode precision of vbool*_t will help the underlying passes
>>>        make the right decisions for both correctness and optimization.
>>>
>>>        Given the sample code below:
>>>        void test_1(int8_t * restrict in, int8_t * restrict out)
>>>        {
>>>          vbool8_t v2 = *(vbool8_t*)in;
>>>          vbool16_t v5 = *(vbool16_t*)in;
>>>          *(vbool16_t*)(out + 200) = v5;
>>>          *(vbool8_t*)(out + 100) = v2;
>>>        }
>>>
>>>        Before the precision adjustment:
>>>        addi    a4,a1,100
>>>        vsetvli a5,zero,e8,m1,ta,ma
>>>        addi    a1,a1,200
>>>        vlm.v   v24,0(a0)
>>>        vsm.v   v24,0(a4)
>>>        // Need one vsetvli and vlm.v for correctness here.
>>>        vsm.v   v24,0(a1)
>>>
>>>        After the precision adjustment:
>>>        csrr    t0,vlenb
>>>        slli    t1,t0,1
>>>        csrr    a3,vlenb
>>>        sub     sp,sp,t1
>>>        slli    a4,a3,1
>>>        add     a4,a4,sp
>>>        sub     a3,a4,a3
>>>        vsetvli a5,zero,e8,m1,ta,ma
>>>        addi    a2,a1,200
>>>        vlm.v   v24,0(a0)
>>>        vsm.v   v24,0(a3)
>>>        addi    a1,a1,100
>>>        vsetvli a4,zero,e8,mf2,ta,ma
>>>        csrr    t0,vlenb
>>>        vlm.v   v25,0(a3)
>>>        vsm.v   v25,0(a2)
>>>        slli    t1,t0,1
>>>        vsetvli a5,zero,e8,m1,ta,ma
>>>        vsm.v   v24,0(a1)
>>>        add     sp,sp,t1
>>>        jr      ra
>>>
>>>        However, there may be some optimization opportunities after
>>>        the mode precision adjustment. They can be taken care of in
>>>        the RISC-V backend in separate PR(s).
>>>
>>>        PR 108185
>>>        PR 108654
>>>
>>> gcc/ChangeLog:
>>>
>>>        * config/riscv/riscv-modes.def (ADJUST_PRECISION):
>>>        * config/riscv/riscv.cc (riscv_v_adjust_precision):
>>>        * config/riscv/riscv.h (riscv_v_adjust_precision):
>>>        * genmodes.cc (ADJUST_PRECISION):
>>>        (emit_mode_adjustments):
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>        * gcc.target/riscv/pr108185-1.c: New test.
>>>        * gcc.target/riscv/pr108185-2.c: New test.
>>>        * gcc.target/riscv/pr108185-3.c: New test.
>>>        * gcc.target/riscv/pr108185-4.c: New test.
>>>        * gcc.target/riscv/pr108185-5.c: New test.
>>>        * gcc.target/riscv/pr108185-6.c: New test.
>>>        * gcc.target/riscv/pr108185-7.c: New test.
>>>        * gcc.target/riscv/pr108185-8.c: New test.
>>>
>>> Signed-off-by: Pan Li <pan2.li@intel.com>
>>> ---
>>>  gcc/config/riscv/riscv-modes.def            |  8 +++
>>>  gcc/config/riscv/riscv.cc                   | 12 ++++
>>>  gcc/config/riscv/riscv.h                    |  1 +
>>>  gcc/genmodes.cc                             | 25 ++++++-
>>>  gcc/testsuite/gcc.target/riscv/pr108185-1.c | 68 ++++++++++++++++++
>>>  gcc/testsuite/gcc.target/riscv/pr108185-2.c | 68 ++++++++++++++++++
>>>  gcc/testsuite/gcc.target/riscv/pr108185-3.c | 68 ++++++++++++++++++
>>>  gcc/testsuite/gcc.target/riscv/pr108185-4.c | 68 ++++++++++++++++++
>>>  gcc/testsuite/gcc.target/riscv/pr108185-5.c | 68 ++++++++++++++++++
>>>  gcc/testsuite/gcc.target/riscv/pr108185-6.c | 68 ++++++++++++++++++
>>>  gcc/testsuite/gcc.target/riscv/pr108185-7.c | 68 ++++++++++++++++++
>>>  gcc/testsuite/gcc.target/riscv/pr108185-8.c | 77 +++++++++++++++++++++
>>>  12 files changed, 598 insertions(+), 1 deletion(-)
>>>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-1.c
>>>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-2.c
>>>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-3.c
>>>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-4.c
>>>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-5.c
>>>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-6.c
>>>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-7.c
>>>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-8.c
>>>
>>> diff --git a/gcc/config/riscv/riscv-modes.def b/gcc/config/riscv/riscv-modes.def
>>> index d5305efa8a6..110bddce851 100644
>>> --- a/gcc/config/riscv/riscv-modes.def
>>> +++ b/gcc/config/riscv/riscv-modes.def
>>> @@ -72,6 +72,14 @@ ADJUST_BYTESIZE (VNx16BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
>>>  ADJUST_BYTESIZE (VNx32BI, riscv_vector_chunks * riscv_bytes_per_vector_chunk);
>>>  ADJUST_BYTESIZE (VNx64BI, riscv_v_adjust_nunits (VNx64BImode, 8));
>>>
>>> +ADJUST_PRECISION (VNx1BI, riscv_v_adjust_precision (VNx1BImode, 1));
>>> +ADJUST_PRECISION (VNx2BI, riscv_v_adjust_precision (VNx2BImode, 2));
>>> +ADJUST_PRECISION (VNx4BI, riscv_v_adjust_precision (VNx4BImode, 4));
>>> +ADJUST_PRECISION (VNx8BI, riscv_v_adjust_precision (VNx8BImode, 8));
>>> +ADJUST_PRECISION (VNx16BI, riscv_v_adjust_precision (VNx16BImode, 16));
>>> +ADJUST_PRECISION (VNx32BI, riscv_v_adjust_precision (VNx32BImode, 32));
>>> +ADJUST_PRECISION (VNx64BI, riscv_v_adjust_precision (VNx64BImode, 64));
>>> +
>>>  /*
>>>     | Mode        | MIN_VLEN=32 | MIN_VLEN=32 | MIN_VLEN=64 | MIN_VLEN=64 |
>>>     |             | LMUL        | SEW/LMUL    | LMUL        | SEW/LMUL    |
>>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>>> index de3e1f903c7..cbe66c0e35b 100644
>>> --- a/gcc/config/riscv/riscv.cc
>>> +++ b/gcc/config/riscv/riscv.cc
>>> @@ -1003,6 +1003,18 @@ riscv_v_adjust_nunits (machine_mode mode, int scale)
>>>    return scale;
>>>  }
>>>
>>> +/* Call from ADJUST_PRECISION in riscv-modes.def.  Return the correct
>>> +   PRECISION size for corresponding machine_mode.  */
>>> +
>>> +poly_int64
>>> +riscv_v_adjust_precision (machine_mode mode, int scale)
>>> +{
>>> +  if (riscv_v_ext_vector_mode_p (mode))
>>> +    return riscv_vector_chunks * scale;
>>> +
>>> +  return scale;
>>> +}
>>> +
>>>  /* Return true if X is a valid address for machine mode MODE.  If it is,
>>>     fill in INFO appropriately.  STRICT_P is true if REG_OK_STRICT is in
>>>     effect.  */
>>> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
>>> index 5bc7f2f467d..15b9317a8ce 100644
>>> --- a/gcc/config/riscv/riscv.h
>>> +++ b/gcc/config/riscv/riscv.h
>>> @@ -1025,6 +1025,7 @@ extern unsigned riscv_stack_boundary;
>>>  extern unsigned riscv_bytes_per_vector_chunk;
>>>  extern poly_uint16 riscv_vector_chunks;
>>>  extern poly_int64 riscv_v_adjust_nunits (enum machine_mode, int);
>>> +extern poly_int64 riscv_v_adjust_precision (enum machine_mode, int);
>>>  /* The number of bits and bytes in a RVV vector.  */
>>>  #define BITS_PER_RISCV_VECTOR (poly_uint16 (riscv_vector_chunks * riscv_bytes_per_vector_chunk * 8))
>>>  #define BYTES_PER_RISCV_VECTOR (poly_uint16 (riscv_vector_chunks * riscv_bytes_per_vector_chunk))
>>> diff --git a/gcc/genmodes.cc b/gcc/genmodes.cc
>>> index 2d418f09aab..12f4e6335e6 100644
>>> --- a/gcc/genmodes.cc
>>> +++ b/gcc/genmodes.cc
>>> @@ -114,6 +114,7 @@ static struct mode_adjust *adj_alignment;
>>>  static struct mode_adjust *adj_format;
>>>  static struct mode_adjust *adj_ibit;
>>>  static struct mode_adjust *adj_fbit;
>>> +static struct mode_adjust *adj_precision;
>>>
>>>  /* Mode class operations.  */
>>>  static enum mode_class
>>> @@ -819,6 +820,7 @@ make_vector_mode (enum mode_class bclass,
>>>  #define ADJUST_NUNITS(M, X)    _ADD_ADJUST (nunits, M, X, RANDOM, RANDOM)
>>>  #define ADJUST_BYTESIZE(M, X)  _ADD_ADJUST (bytesize, M, X, RANDOM, RANDOM)
>>>  #define ADJUST_ALIGNMENT(M, X) _ADD_ADJUST (alignment, M, X, RANDOM, RANDOM)
>>> +#define ADJUST_PRECISION(M, X) _ADD_ADJUST (precision, M, X, RANDOM, RANDOM)
>>>  #define ADJUST_FLOAT_FORMAT(M, X)    _ADD_ADJUST (format, M, X, FLOAT, FLOAT)
>>>  #define ADJUST_IBIT(M, X)  _ADD_ADJUST (ibit, M, X, ACCUM, UACCUM)
>>>  #define ADJUST_FBIT(M, X)  _ADD_ADJUST (fbit, M, X, FRACT, UACCUM)
>>> @@ -1829,7 +1831,15 @@ emit_mode_adjustments (void)
>>>              " (mode_precision[E_%smode], mode_nunits[E_%smode]);\n",
>>>              m->name, m->name);
>>>        printf ("    mode_precision[E_%smode] = ps * old_factor;\n", m->name);
>>> -      printf ("    mode_size[E_%smode] = exact_div (mode_precision[E_%smode],"
>>> +      /* Normalize the size to 1 if precision is less than BITS_PER_UNIT.  */
>>> +      printf ("    poly_uint16 size_one = "
>>> +           "mode_precision[E_%smode].is_constant ()\n", m->name);
>>> +      printf ("      ? poly_uint16 (1, 0) : poly_uint16 (1, 1);\n");
>>
>> Have you tried this on an x86_64 system?  I wouldn't expect it to work
>> because of the:
>>
>>   STATIC_ASSERT (N >= 2);
>>
>> in the poly_uint16 constructor.
>>
>>> +      printf ("    if (known_lt (mode_precision[E_%smode], "
>>> +           "size_one * BITS_PER_UNIT))\n", m->name);
>>> +      printf ("      mode_size[E_%smode] = size_one;\n", m->name);
>>> +      printf ("    else\n");
>>> +      printf ("      mode_size[E_%smode] = exact_div (mode_precision[E_%smode],"
>>
>> Now that the assert implicit in the original exact_div no longer holds,
>> I think we should instead generalise it to can_div_away_from_zero_p
>> (which will involve defining a new overload of can_div_away_from_zero_p).
>> I think that will give the same result as the code above for the cases
>> that the code above handles.  But it should be more general too.
>>
>> TBH, I'm still sceptical that this is all that is needed.  It seems
>> unlikely that we've been so good at writing vector support code that
>> we've made it work for precision < bitsize, despite that being an
>> unsupported combination until now.  But I guess we can fix problems
>> on a case-by-case basis.
>>
>> Thanks,
>> Richard
>>
>>>              " BITS_PER_UNIT);\n", m->name, m->name);
>>>        printf ("    mode_nunits[E_%smode] = ps;\n", m->name);
>>>        printf ("    adjust_mode_mask (E_%smode);\n", m->name);
>>> @@ -1963,6 +1973,19 @@ emit_mode_adjustments (void)
>>>      printf ("\n  /* %s:%d */\n  REAL_MODE_FORMAT (E_%smode) = %s;\n",
>>>            a->file, a->line, a->mode->name, a->adjustment);
>>>
>>> +  /* Adjust precision to the actual bits size.  */
>>> +  for (a = adj_precision; a; a = a->next)
>>> +    switch (a->mode->cl)
>>> +      {
>>> +     case MODE_VECTOR_BOOL:
>>> +       printf ("\n  /* %s:%d.  */\n  ps = %s;\n", a->file, a->line,
>>> +               a->adjustment);
>>> +       printf ("  mode_precision[E_%smode] = ps;\n", a->mode->name);
>>> +       break;
>>> +     default:
>>> +       break;
>>> +      }
>>> +
>>>    puts ("}");
>>>  }
>>>
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-1.c b/gcc/testsuite/gcc.target/riscv/pr108185-1.c
>>> new file mode 100644
>>> index 00000000000..e70960c5b6d
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-1.c
>>> @@ -0,0 +1,68 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool1_then_vbool2(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool1_t v1 = *(vbool1_t*)in;
>>> +    vbool2_t v2 = *(vbool2_t*)in;
>>> +
>>> +    *(vbool1_t*)(out + 100) = v1;
>>> +    *(vbool2_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool1_then_vbool4(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool1_t v1 = *(vbool1_t*)in;
>>> +    vbool4_t v2 = *(vbool4_t*)in;
>>> +
>>> +    *(vbool1_t*)(out + 100) = v1;
>>> +    *(vbool4_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool1_then_vbool8(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool1_t v1 = *(vbool1_t*)in;
>>> +    vbool8_t v2 = *(vbool8_t*)in;
>>> +
>>> +    *(vbool1_t*)(out + 100) = v1;
>>> +    *(vbool8_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool1_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool1_t v1 = *(vbool1_t*)in;
>>> +    vbool16_t v2 = *(vbool16_t*)in;
>>> +
>>> +    *(vbool1_t*)(out + 100) = v1;
>>> +    *(vbool16_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool1_then_vbool32(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool1_t v1 = *(vbool1_t*)in;
>>> +    vbool32_t v2 = *(vbool32_t*)in;
>>> +
>>> +    *(vbool1_t*)(out + 100) = v1;
>>> +    *(vbool32_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool1_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool1_t v1 = *(vbool1_t*)in;
>>> +    vbool64_t v2 = *(vbool64_t*)in;
>>> +
>>> +    *(vbool1_t*)(out + 100) = v1;
>>> +    *(vbool64_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 6 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 18 } } */
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-2.c b/gcc/testsuite/gcc.target/riscv/pr108185-2.c
>>> new file mode 100644
>>> index 00000000000..dcc7a644a88
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-2.c
>>> @@ -0,0 +1,68 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool2_then_vbool1(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool2_t v1 = *(vbool2_t*)in;
>>> +    vbool1_t v2 = *(vbool1_t*)in;
>>> +
>>> +    *(vbool2_t*)(out + 100) = v1;
>>> +    *(vbool1_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool2_then_vbool4(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool2_t v1 = *(vbool2_t*)in;
>>> +    vbool4_t v2 = *(vbool4_t*)in;
>>> +
>>> +    *(vbool2_t*)(out + 100) = v1;
>>> +    *(vbool4_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool2_then_vbool8(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool2_t v1 = *(vbool2_t*)in;
>>> +    vbool8_t v2 = *(vbool8_t*)in;
>>> +
>>> +    *(vbool2_t*)(out + 100) = v1;
>>> +    *(vbool8_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool2_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool2_t v1 = *(vbool2_t*)in;
>>> +    vbool16_t v2 = *(vbool16_t*)in;
>>> +
>>> +    *(vbool2_t*)(out + 100) = v1;
>>> +    *(vbool16_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool2_then_vbool32(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool2_t v1 = *(vbool2_t*)in;
>>> +    vbool32_t v2 = *(vbool32_t*)in;
>>> +
>>> +    *(vbool2_t*)(out + 100) = v1;
>>> +    *(vbool32_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool2_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool2_t v1 = *(vbool2_t*)in;
>>> +    vbool64_t v2 = *(vbool64_t*)in;
>>> +
>>> +    *(vbool2_t*)(out + 100) = v1;
>>> +    *(vbool64_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 6 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 17 } } */
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-3.c b/gcc/testsuite/gcc.target/riscv/pr108185-3.c
>>> new file mode 100644
>>> index 00000000000..3af0513e006
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-3.c
>>> @@ -0,0 +1,68 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool4_then_vbool1(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool4_t v1 = *(vbool4_t*)in;
>>> +    vbool1_t v2 = *(vbool1_t*)in;
>>> +
>>> +    *(vbool4_t*)(out + 100) = v1;
>>> +    *(vbool1_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool4_then_vbool2(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool4_t v1 = *(vbool4_t*)in;
>>> +    vbool2_t v2 = *(vbool2_t*)in;
>>> +
>>> +    *(vbool4_t*)(out + 100) = v1;
>>> +    *(vbool2_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool4_then_vbool8(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool4_t v1 = *(vbool4_t*)in;
>>> +    vbool8_t v2 = *(vbool8_t*)in;
>>> +
>>> +    *(vbool4_t*)(out + 100) = v1;
>>> +    *(vbool8_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool4_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool4_t v1 = *(vbool4_t*)in;
>>> +    vbool16_t v2 = *(vbool16_t*)in;
>>> +
>>> +    *(vbool4_t*)(out + 100) = v1;
>>> +    *(vbool16_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool4_then_vbool32(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool4_t v1 = *(vbool4_t*)in;
>>> +    vbool32_t v2 = *(vbool32_t*)in;
>>> +
>>> +    *(vbool4_t*)(out + 100) = v1;
>>> +    *(vbool32_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool4_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool4_t v1 = *(vbool4_t*)in;
>>> +    vbool64_t v2 = *(vbool64_t*)in;
>>> +
>>> +    *(vbool4_t*)(out + 100) = v1;
>>> +    *(vbool64_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 6 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 16 } } */
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-4.c b/gcc/testsuite/gcc.target/riscv/pr108185-4.c
>>> new file mode 100644
>>> index 00000000000..ea3c360d756
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-4.c
>>> @@ -0,0 +1,68 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool8_then_vbool1(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool8_t v1 = *(vbool8_t*)in;
>>> +    vbool1_t v2 = *(vbool1_t*)in;
>>> +
>>> +    *(vbool8_t*)(out + 100) = v1;
>>> +    *(vbool1_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool8_then_vbool2(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool8_t v1 = *(vbool8_t*)in;
>>> +    vbool2_t v2 = *(vbool2_t*)in;
>>> +
>>> +    *(vbool8_t*)(out + 100) = v1;
>>> +    *(vbool2_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool8_then_vbool4(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool8_t v1 = *(vbool8_t*)in;
>>> +    vbool4_t v2 = *(vbool4_t*)in;
>>> +
>>> +    *(vbool8_t*)(out + 100) = v1;
>>> +    *(vbool4_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool8_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool8_t v1 = *(vbool8_t*)in;
>>> +    vbool16_t v2 = *(vbool16_t*)in;
>>> +
>>> +    *(vbool8_t*)(out + 100) = v1;
>>> +    *(vbool16_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool8_then_vbool32(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool8_t v1 = *(vbool8_t*)in;
>>> +    vbool32_t v2 = *(vbool32_t*)in;
>>> +
>>> +    *(vbool8_t*)(out + 100) = v1;
>>> +    *(vbool32_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool8_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool8_t v1 = *(vbool8_t*)in;
>>> +    vbool64_t v2 = *(vbool64_t*)in;
>>> +
>>> +    *(vbool8_t*)(out + 100) = v1;
>>> +    *(vbool64_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 6 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 15 } } */
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-5.c b/gcc/testsuite/gcc.target/riscv/pr108185-5.c
>>> new file mode 100644
>>> index 00000000000..9fc659d2402
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-5.c
>>> @@ -0,0 +1,68 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool16_then_vbool1(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool16_t v1 = *(vbool16_t*)in;
>>> +    vbool1_t v2 = *(vbool1_t*)in;
>>> +
>>> +    *(vbool16_t*)(out + 100) = v1;
>>> +    *(vbool1_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool2(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool16_t v1 = *(vbool16_t*)in;
>>> +    vbool2_t v2 = *(vbool2_t*)in;
>>> +
>>> +    *(vbool16_t*)(out + 100) = v1;
>>> +    *(vbool2_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool4(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool16_t v1 = *(vbool16_t*)in;
>>> +    vbool4_t v2 = *(vbool4_t*)in;
>>> +
>>> +    *(vbool16_t*)(out + 100) = v1;
>>> +    *(vbool4_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool8(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool16_t v1 = *(vbool16_t*)in;
>>> +    vbool8_t v2 = *(vbool8_t*)in;
>>> +
>>> +    *(vbool16_t*)(out + 100) = v1;
>>> +    *(vbool8_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool32(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool16_t v1 = *(vbool16_t*)in;
>>> +    vbool32_t v2 = *(vbool32_t*)in;
>>> +
>>> +    *(vbool16_t*)(out + 100) = v1;
>>> +    *(vbool32_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool16_t v1 = *(vbool16_t*)in;
>>> +    vbool64_t v2 = *(vbool64_t*)in;
>>> +
>>> +    *(vbool16_t*)(out + 100) = v1;
>>> +    *(vbool64_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 6 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 14 } } */
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-6.c b/gcc/testsuite/gcc.target/riscv/pr108185-6.c
>>> new file mode 100644
>>> index 00000000000..98275e5267d
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-6.c
>>> @@ -0,0 +1,68 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool32_then_vbool1(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool32_t v1 = *(vbool32_t*)in;
>>> +    vbool1_t v2 = *(vbool1_t*)in;
>>> +
>>> +    *(vbool32_t*)(out + 100) = v1;
>>> +    *(vbool1_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool2(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool32_t v1 = *(vbool32_t*)in;
>>> +    vbool2_t v2 = *(vbool2_t*)in;
>>> +
>>> +    *(vbool32_t*)(out + 100) = v1;
>>> +    *(vbool2_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool4(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool32_t v1 = *(vbool32_t*)in;
>>> +    vbool4_t v2 = *(vbool4_t*)in;
>>> +
>>> +    *(vbool32_t*)(out + 100) = v1;
>>> +    *(vbool4_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool8(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool32_t v1 = *(vbool32_t*)in;
>>> +    vbool8_t v2 = *(vbool8_t*)in;
>>> +
>>> +    *(vbool32_t*)(out + 100) = v1;
>>> +    *(vbool8_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool32_t v1 = *(vbool32_t*)in;
>>> +    vbool16_t v2 = *(vbool16_t*)in;
>>> +
>>> +    *(vbool32_t*)(out + 100) = v1;
>>> +    *(vbool16_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool32_t v1 = *(vbool32_t*)in;
>>> +    vbool64_t v2 = *(vbool64_t*)in;
>>> +
>>> +    *(vbool32_t*)(out + 100) = v1;
>>> +    *(vbool64_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 6 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 13 } } */
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-7.c b/gcc/testsuite/gcc.target/riscv/pr108185-7.c
>>> new file mode 100644
>>> index 00000000000..8f6f0b11f09
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-7.c
>>> @@ -0,0 +1,68 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool64_then_vbool1(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool64_t v1 = *(vbool64_t*)in;
>>> +    vbool1_t v2 = *(vbool1_t*)in;
>>> +
>>> +    *(vbool64_t*)(out + 100) = v1;
>>> +    *(vbool1_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool2(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool64_t v1 = *(vbool64_t*)in;
>>> +    vbool2_t v2 = *(vbool2_t*)in;
>>> +
>>> +    *(vbool64_t*)(out + 100) = v1;
>>> +    *(vbool2_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool4(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool64_t v1 = *(vbool64_t*)in;
>>> +    vbool4_t v2 = *(vbool4_t*)in;
>>> +
>>> +    *(vbool64_t*)(out + 100) = v1;
>>> +    *(vbool4_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool8(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool64_t v1 = *(vbool64_t*)in;
>>> +    vbool8_t v2 = *(vbool8_t*)in;
>>> +
>>> +    *(vbool64_t*)(out + 100) = v1;
>>> +    *(vbool8_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool64_t v1 = *(vbool64_t*)in;
>>> +    vbool16_t v2 = *(vbool16_t*)in;
>>> +
>>> +    *(vbool64_t*)(out + 100) = v1;
>>> +    *(vbool16_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool32(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool64_t v1 = *(vbool64_t*)in;
>>> +    vbool32_t v2 = *(vbool32_t*)in;
>>> +
>>> +    *(vbool64_t*)(out + 100) = v1;
>>> +    *(vbool32_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 6 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 12 } } */
>>> diff --git a/gcc/testsuite/gcc.target/riscv/pr108185-8.c b/gcc/testsuite/gcc.target/riscv/pr108185-8.c
>>> new file mode 100644
>>> index 00000000000..d96959dd064
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/riscv/pr108185-8.c
>>> @@ -0,0 +1,77 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-march=rv64gcv -mabi=lp64 -O3" } */
>>> +
>>> +#include "riscv_vector.h"
>>> +
>>> +void
>>> +test_vbool1_then_vbool1(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool1_t v1 = *(vbool1_t*)in;
>>> +    vbool1_t v2 = *(vbool1_t*)in;
>>> +
>>> +    *(vbool1_t*)(out + 100) = v1;
>>> +    *(vbool1_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool2_then_vbool2(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool2_t v1 = *(vbool2_t*)in;
>>> +    vbool2_t v2 = *(vbool2_t*)in;
>>> +
>>> +    *(vbool2_t*)(out + 100) = v1;
>>> +    *(vbool2_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool4_then_vbool4(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool4_t v1 = *(vbool4_t*)in;
>>> +    vbool4_t v2 = *(vbool4_t*)in;
>>> +
>>> +    *(vbool4_t*)(out + 100) = v1;
>>> +    *(vbool4_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool8_then_vbool8(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool8_t v1 = *(vbool8_t*)in;
>>> +    vbool8_t v2 = *(vbool8_t*)in;
>>> +
>>> +    *(vbool8_t*)(out + 100) = v1;
>>> +    *(vbool8_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool16_then_vbool16(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool16_t v1 = *(vbool16_t*)in;
>>> +    vbool16_t v2 = *(vbool16_t*)in;
>>> +
>>> +    *(vbool16_t*)(out + 100) = v1;
>>> +    *(vbool16_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool32_then_vbool32(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool32_t v1 = *(vbool32_t*)in;
>>> +    vbool32_t v2 = *(vbool32_t*)in;
>>> +
>>> +    *(vbool32_t*)(out + 100) = v1;
>>> +    *(vbool32_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +void
>>> +test_vbool64_then_vbool64(int8_t * restrict in, int8_t * restrict out) {
>>> +    vbool64_t v1 = *(vbool64_t*)in;
>>> +    vbool64_t v2 = *(vbool64_t*)in;
>>> +
>>> +    *(vbool64_t*)(out + 100) = v1;
>>> +    *(vbool64_t*)(out + 200) = v2;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*m1,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf2,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf4,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x][0-9]+,\s*zero,\s*e8,\s*mf8,\s*ta,\s*ma} 1 } } */
>>> +/* { dg-final { scan-assembler-times {vlm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 7 } } */
>>> +/* { dg-final { scan-assembler-times {vsm\.v\s+v[0-9]+,\s*0\([a-x][0-9]+\)} 14 } } */

Thread overview: 48+ messages
2023-02-16 15:11 incarnation.p.lee
     [not found] ` <9800822AA73B1E3D+5F679DFB-633A-446F-BB7F-59ADEEE67E50@rivai.ai>
2023-02-17  7:18   ` Li, Pan2
2023-02-17  7:36   ` Richard Biener
2023-02-17  8:39     ` Li, Pan2
2023-02-21  6:36       ` Li, Pan2
2023-02-21  8:28         ` Kito Cheng
2023-02-24  5:08           ` juzhe.zhong
2023-02-24  7:21             ` Li, Pan2
2023-02-27  3:43               ` Li, Pan2
2023-02-27 14:24 ` Richard Sandiford
2023-02-27 15:13   ` 盼 李
2023-02-28  2:27     ` Li, Pan2
2023-02-28  9:50       ` Richard Sandiford
2023-02-28  9:59         ` 盼 李
2023-02-28 14:07           ` Li, Pan2
2023-03-01 10:11             ` Richard Sandiford [this message]
2023-03-01 10:46               ` juzhe.zhong
2023-03-01 10:55                 ` 盼 李
2023-03-01 11:11                   ` Richard Sandiford
2023-03-01 11:26                     ` 盼 李
2023-03-01 11:53                     ` 盼 李
2023-03-01 12:03                       ` Richard Sandiford
2023-03-01 12:13                         ` juzhe.zhong
2023-03-01 12:27                           ` 盼 李
2023-03-01 12:33                         ` Richard Biener
2023-03-01 12:56                           ` Pan Li
2023-03-01 13:11                             ` Richard Biener
2023-03-01 13:19                             ` Richard Sandiford
2023-03-01 13:26                               ` Richard Biener
2023-03-01 13:50                               ` juzhe.zhong
2023-03-01 13:59                                 ` Richard Biener
2023-03-01 14:03                                   ` Richard Biener
2023-03-01 14:19                                     ` juzhe.zhong
2023-03-01 15:42                                       ` Li, Pan2
2023-03-01 15:46                                         ` Pan Li
2023-03-01 16:14                                         ` Richard Sandiford
2023-03-01 22:53                                           ` juzhe.zhong
2023-03-02  6:07                                             ` Li, Pan2
2023-03-02  8:25                                             ` Richard Biener
2023-03-02  8:37                                               ` juzhe.zhong
2023-03-02  9:39                                                 ` Richard Sandiford
2023-03-02 10:19                                                   ` juzhe.zhong
     [not found]                               ` <2023030121501634323743@rivai.ai>
2023-03-01 13:52                                 ` juzhe.zhong
2023-03-02  5:55 ` [PATCH v2] " pan2.li
2023-03-02  9:43   ` Richard Sandiford
2023-03-02 14:46     ` Li, Pan2
2023-03-02 17:54       ` Richard Sandiford
2023-03-03  2:34         ` Li, Pan2
