[SVE] PR96463 - Optimise svld1rq from vectors

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [SVE] PR96463 - Optimise svld1rq from vectors
@ 2021-12-02 10:50 Prathamesh Kulkarni
  2021-12-02 17:41 ` Richard Sandiford
  0 siblings, 1 reply; 6+ messages in thread
From: Prathamesh Kulkarni @ 2021-12-02 10:50 UTC (permalink / raw)
  To: gcc Patches, Richard Sandiford

[-- Attachment #1: Type: text/plain, Size: 1009 bytes --]

Hi Richard,
I have attached a WIP untested patch for PR96463.
IIUC, the PR suggests to transform
lhs = svld1rq ({-1, -1, ...}, &v[0])
into:
lhs = vec_perm_expr<v, v, {0, 0, ...}>
if v is vector of 4 elements, and each element is 32 bits on little
endian target ?

I am sorry if this sounds like a silly question, but I am not sure how
to convert a vector of type int32x4_t into svint32_t ? In the patch, I
simply used NOP_EXPR (which I expected to fail), and gave type error
during gimple verification:

svint32_t
foo (int32x4_t x)
{
  return svld1rq (svptrue_b8 (), &x[0]);
}

transformed to:
EMERGENCY DUMP:

svint32_t foo (int32x4_t x)
{
  svint32_t _3;
  __Int32x4_t _4;

  <bb 2> :
  _4 = VEC_PERM_EXPR <x_5(D), x_5(D), { 0, 0, 0, 0 }>;
  _3 = (svint32_t) _4;
  return _3;

}

and ICE's with:
pr96463.c:8:1: error: invalid vector types in nop conversion
    8 | }
      | ^
svint32_t
__Int32x4_t
_3 = (svint32_t) _4;
during GIMPLE pass: ccp

Could you please suggest how to proceed ?

Thanks,
Prathamesh

[-- Attachment #2: pr96463-1.txt --]
[-- Type: text/plain, Size: 2370 bytes --]

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 02e42a71e5e..3834f33443a 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -44,6 +44,13 @@
 #include "aarch64-sve-builtins-shapes.h"
 #include "aarch64-sve-builtins-base.h"
 #include "aarch64-sve-builtins-functions.h"
+#include "print-tree.h"
+#include "gimple-pretty-print.h"
+
+/* ??? Including tree-ssanames.h requires including other header dependencies.
+       Just including the prototype for now.  */
+extern tree make_ssa_name_fn (struct function *, tree, gimple *,
+			      unsigned int version = 0);
 
 using namespace aarch64_sve;
 
@@ -1207,6 +1214,52 @@ public:
     insn_code icode = code_for_aarch64_sve_ld1rq (e.vector_mode (0));
     return e.use_contiguous_load_insn (icode);
   }
+
+  gimple *
+  fold (gimple_folder &f) const OVERRIDE
+  {
+    tree arg0 = gimple_call_arg (f.call, 0);
+    tree arg1 = gimple_call_arg (f.call, 1);
+
+    /* Transform:
+       lhs = svld1rq ({-1, -1, ... }, &v[0])
+       into:
+       tmp = vec_perm_expr<v, v, {0, 0, ...}>.
+       lhs = nop_expr tmp
+       on little endian target.  */
+
+    if (!BYTES_BIG_ENDIAN
+	&& integer_all_onesp (arg0)
+	&& TREE_CODE (arg1) == ADDR_EXPR)
+      {
+	tree t = TREE_OPERAND (arg1, 0);
+	if (TREE_CODE (t) == ARRAY_REF)
+	  {
+	    tree index = TREE_OPERAND (t, 1);
+	    t = TREE_OPERAND (t, 0);
+	    if (integer_zerop (index) && TREE_CODE (t) == VIEW_CONVERT_EXPR)
+	      {
+		t = TREE_OPERAND (t, 0);
+		tree vectype = TREE_TYPE (t);
+		if (VECTOR_TYPE_P (vectype)
+		    && known_eq (TYPE_VECTOR_SUBPARTS (vectype), 4u)
+		    && wi::to_wide (TYPE_SIZE (vectype)) == 128)
+		  {
+		    tree new_temp = ::make_ssa_name_fn (cfun, vectype, NULL);
+		    tree zero_vec = build_vector_from_val (vectype, index);
+		    gimple *g = gimple_build_assign (new_temp, VEC_PERM_EXPR, t, t, zero_vec);
+		    /* ??? How to convert between vector types if gimple_call_lhs (f.call) and
+			   new_temp have different types ?  */
+		    gimple *g2 = gimple_build_assign (gimple_call_lhs (f.call), NOP_EXPR, new_temp);
+		    gsi_insert_before (f.gsi, g, GSI_SAME_STMT);
+		    return g2;
+		  }
+	      }
+	  }
+      }
+
+    return NULL;
+  }
 };
 
 class svld1ro_impl : public load_replicate

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [SVE] PR96463 - Optimise svld1rq from vectors
  2021-12-02 10:50 [SVE] PR96463 - Optimise svld1rq from vectors Prathamesh Kulkarni
@ 2021-12-02 17:41 ` Richard Sandiford
  2021-12-07 11:45   ` Prathamesh Kulkarni
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Sandiford @ 2021-12-02 17:41 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: gcc Patches

Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> Hi Richard,
> I have attached a WIP untested patch for PR96463.
> IIUC, the PR suggests to transform
> lhs = svld1rq ({-1, -1, ...}, &v[0])
> into:
> lhs = vec_perm_expr<v, v, {0, 0, ...}>
> if v is vector of 4 elements, and each element is 32 bits on little
> endian target ?
>
> I am sorry if this sounds like a silly question, but I am not sure how
> to convert a vector of type int32x4_t into svint32_t ? In the patch, I
> simply used NOP_EXPR (which I expected to fail), and gave type error
> during gimple verification:

It should be possible in principle to have a VEC_PERM_EXPR in which
the operands are Advanced SIMD vectors and the result is an SVE vector.

E.g., the dup in the PR would be something like this:

foo (int32x4_t a)
{
  svint32_t _2;

  _2 = VEC_PERM_EXPR <x_1(D), x_1(D), { 0, 1, 2, 3, 0, 1, 2, 3, ... }>;
  return _2;
}

where the final operand can be built using:

  int source_nelts = TYPE_VECTOR_SUBPARTS (…rhs type…).to_constant ();
  vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (…lhs type…), source_nelts, 1);
  for (int i = 0; i < source_nelts; ++i)
    sel.quick_push (i);

I'm not sure how well-tested that combination is though.  It might need
changes to target-independent code.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [SVE] PR96463 - Optimise svld1rq from vectors
  2021-12-02 17:41 ` Richard Sandiford
@ 2021-12-07 11:45   ` Prathamesh Kulkarni
  2021-12-07 13:38     ` Richard Sandiford
  0 siblings, 1 reply; 6+ messages in thread
From: Prathamesh Kulkarni @ 2021-12-07 11:45 UTC (permalink / raw)
  To: Prathamesh Kulkarni, gcc Patches, richard.sandiford

[-- Attachment #1: Type: text/plain, Size: 2114 bytes --]

On Thu, 2 Dec 2021 at 23:11, Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> > Hi Richard,
> > I have attached a WIP untested patch for PR96463.
> > IIUC, the PR suggests to transform
> > lhs = svld1rq ({-1, -1, ...}, &v[0])
> > into:
> > lhs = vec_perm_expr<v, v, {0, 0, ...}>
> > if v is vector of 4 elements, and each element is 32 bits on little
> > endian target ?
> >
> > I am sorry if this sounds like a silly question, but I am not sure how
> > to convert a vector of type int32x4_t into svint32_t ? In the patch, I
> > simply used NOP_EXPR (which I expected to fail), and gave type error
> > during gimple verification:
>
> It should be possible in principle to have a VEC_PERM_EXPR in which
> the operands are Advanced SIMD vectors and the result is an SVE vector.
>
> E.g., the dup in the PR would be something like this:
>
> foo (int32x4_t a)
> {
>   svint32_t _2;
>
>   _2 = VEC_PERM_EXPR <x_1(D), x_1(D), { 0, 1, 2, 3, 0, 1, 2, 3, ... }>;
>   return _2;
> }
>
> where the final operand can be built using:
>
>   int source_nelts = TYPE_VECTOR_SUBPARTS (…rhs type…).to_constant ();
>   vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (…lhs type…), source_nelts, 1);
>   for (int i = 0; i < source_nelts; ++i)
>     sel.quick_push (i);
>
> I'm not sure how well-tested that combination is though.  It might need
> changes to target-independent code.
Hi Richard,
Thanks for the suggestions.
I tried the above approach in attached patch, but it still results in
ICE due to type mismatch:

pr96463.c: In function ‘foo’:
pr96463.c:8:1: error: type mismatch in ‘vec_perm_expr’
    8 | }
      | ^
svint32_t
int32x4_t
int32x4_t
svint32_t
_3 = VEC_PERM_EXPR <x_4(D), x_4(D), { 0, 1, 2, 3, ... }>;
during GIMPLE pass: ccp
dump file: pr96463.c.032t.ccp1
pr96463.c:8:1: internal compiler error: verify_gimple failed

Should we perhaps add another tree code, that "extends" a fixed-width
vector into it's VLA equivalent ?

Thanks,
Prathamesh
>
> Thanks,
> Richard

[-- Attachment #2: pr96463-2.txt --]
[-- Type: text/plain, Size: 2234 bytes --]

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 02e42a71e5e..b38c4641535 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -44,6 +44,8 @@
 #include "aarch64-sve-builtins-shapes.h"
 #include "aarch64-sve-builtins-base.h"
 #include "aarch64-sve-builtins-functions.h"
+#include "print-tree.h"
+#include "gimple-pretty-print.h"
 
 using namespace aarch64_sve;
 
@@ -1207,6 +1209,57 @@ public:
     insn_code icode = code_for_aarch64_sve_ld1rq (e.vector_mode (0));
     return e.use_contiguous_load_insn (icode);
   }
+
+  gimple *
+  fold (gimple_folder &f) const OVERRIDE
+  {
+    tree arg0 = gimple_call_arg (f.call, 0);
+    tree arg1 = gimple_call_arg (f.call, 1);
+
+    /* Transform:
+       lhs = svld1rq ({-1, -1, ... }, &v[0])
+       into:
+       tmp = vec_perm_expr<v, v, {0, 0, ...}>.
+       lhs = nop_expr tmp
+       on little endian target.  */
+
+    if (!BYTES_BIG_ENDIAN
+	&& integer_all_onesp (arg0)
+	&& TREE_CODE (arg1) == ADDR_EXPR)
+      {
+	tree t = TREE_OPERAND (arg1, 0);
+	if (TREE_CODE (t) == ARRAY_REF)
+	  {
+	    tree index = TREE_OPERAND (t, 1);
+	    t = TREE_OPERAND (t, 0);
+	    if (integer_zerop (index) && TREE_CODE (t) == VIEW_CONVERT_EXPR)
+	      {
+		t = TREE_OPERAND (t, 0);
+		tree vectype = TREE_TYPE (t);
+		if (VECTOR_TYPE_P (vectype)
+		    && known_eq (TYPE_VECTOR_SUBPARTS (vectype), 4u)
+		    && wi::to_wide (TYPE_SIZE (vectype)) == 128)
+		  {
+		    tree lhs = gimple_call_lhs (f.call);
+		    tree lhs_type = TREE_TYPE (lhs);
+		    int source_nelts = TYPE_VECTOR_SUBPARTS (vectype).to_constant ();
+		    vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (lhs_type), source_nelts, 1);
+		    for (int i = 0; i < source_nelts; i++)
+		      sel.quick_push (i);
+
+		    vec_perm_indices indices (sel, 1, source_nelts);
+		    if (!can_vec_perm_const_p (TYPE_MODE (lhs_type), indices))
+		      return NULL;
+
+		    tree mask = vec_perm_indices_to_tree (lhs_type, indices);
+		    return gimple_build_assign (lhs, VEC_PERM_EXPR, t, t, mask);
+		  }
+	      }
+	  }
+      }
+
+    return NULL;
+  }
 };
 
 class svld1ro_impl : public load_replicate

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [SVE] PR96463 - Optimise svld1rq from vectors
  2021-12-07 11:45   ` Prathamesh Kulkarni
@ 2021-12-07 13:38     ` Richard Sandiford
  2021-12-14  8:34       ` Prathamesh Kulkarni
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Sandiford @ 2021-12-07 13:38 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: gcc Patches, rguenther

Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> On Thu, 2 Dec 2021 at 23:11, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
>> > Hi Richard,
>> > I have attached a WIP untested patch for PR96463.
>> > IIUC, the PR suggests to transform
>> > lhs = svld1rq ({-1, -1, ...}, &v[0])
>> > into:
>> > lhs = vec_perm_expr<v, v, {0, 0, ...}>
>> > if v is vector of 4 elements, and each element is 32 bits on little
>> > endian target ?
>> >
>> > I am sorry if this sounds like a silly question, but I am not sure how
>> > to convert a vector of type int32x4_t into svint32_t ? In the patch, I
>> > simply used NOP_EXPR (which I expected to fail), and gave type error
>> > during gimple verification:
>>
>> It should be possible in principle to have a VEC_PERM_EXPR in which
>> the operands are Advanced SIMD vectors and the result is an SVE vector.
>>
>> E.g., the dup in the PR would be something like this:
>>
>> foo (int32x4_t a)
>> {
>>   svint32_t _2;
>>
>>   _2 = VEC_PERM_EXPR <x_1(D), x_1(D), { 0, 1, 2, 3, 0, 1, 2, 3, ... }>;
>>   return _2;
>> }
>>
>> where the final operand can be built using:
>>
>>   int source_nelts = TYPE_VECTOR_SUBPARTS (…rhs type…).to_constant ();
>>   vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (…lhs type…), source_nelts, 1);
>>   for (int i = 0; i < source_nelts; ++i)
>>     sel.quick_push (i);
>>
>> I'm not sure how well-tested that combination is though.  It might need
>> changes to target-independent code.
> Hi Richard,
> Thanks for the suggestions.
> I tried the above approach in attached patch, but it still results in
> ICE due to type mismatch:
>
> pr96463.c: In function ‘foo’:
> pr96463.c:8:1: error: type mismatch in ‘vec_perm_expr’
>     8 | }
>       | ^
> svint32_t
> int32x4_t
> int32x4_t
> svint32_t
> _3 = VEC_PERM_EXPR <x_4(D), x_4(D), { 0, 1, 2, 3, ... }>;
> during GIMPLE pass: ccp
> dump file: pr96463.c.032t.ccp1
> pr96463.c:8:1: internal compiler error: verify_gimple failed
>
> Should we perhaps add another tree code, that "extends" a fixed-width
> vector into it's VLA equivalent ?

No, I think this is just an extreme example of the combination not being
well-tested. :-)  Obviously it's worse than I thought.

I think accepting this kind of VEC_PERM_EXPR is still the way to go.
Richi, WDYT?

Thanks,
Richard

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [SVE] PR96463 - Optimise svld1rq from vectors
  2021-12-07 13:38     ` Richard Sandiford
@ 2021-12-14  8:34       ` Prathamesh Kulkarni
  2022-01-03 10:18         ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Prathamesh Kulkarni @ 2021-12-14  8:34 UTC (permalink / raw)
  To: Prathamesh Kulkarni, gcc Patches, rguenther, richard.sandiford

On Tue, 7 Dec 2021 at 19:08, Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> > On Thu, 2 Dec 2021 at 23:11, Richard Sandiford
> > <richard.sandiford@arm.com> wrote:
> >>
> >> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> >> > Hi Richard,
> >> > I have attached a WIP untested patch for PR96463.
> >> > IIUC, the PR suggests to transform
> >> > lhs = svld1rq ({-1, -1, ...}, &v[0])
> >> > into:
> >> > lhs = vec_perm_expr<v, v, {0, 0, ...}>
> >> > if v is vector of 4 elements, and each element is 32 bits on little
> >> > endian target ?
> >> >
> >> > I am sorry if this sounds like a silly question, but I am not sure how
> >> > to convert a vector of type int32x4_t into svint32_t ? In the patch, I
> >> > simply used NOP_EXPR (which I expected to fail), and gave type error
> >> > during gimple verification:
> >>
> >> It should be possible in principle to have a VEC_PERM_EXPR in which
> >> the operands are Advanced SIMD vectors and the result is an SVE vector.
> >>
> >> E.g., the dup in the PR would be something like this:
> >>
> >> foo (int32x4_t a)
> >> {
> >>   svint32_t _2;
> >>
> >>   _2 = VEC_PERM_EXPR <x_1(D), x_1(D), { 0, 1, 2, 3, 0, 1, 2, 3, ... }>;
> >>   return _2;
> >> }
> >>
> >> where the final operand can be built using:
> >>
> >>   int source_nelts = TYPE_VECTOR_SUBPARTS (…rhs type…).to_constant ();
> >>   vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (…lhs type…), source_nelts, 1);
> >>   for (int i = 0; i < source_nelts; ++i)
> >>     sel.quick_push (i);
> >>
> >> I'm not sure how well-tested that combination is though.  It might need
> >> changes to target-independent code.
> > Hi Richard,
> > Thanks for the suggestions.
> > I tried the above approach in attached patch, but it still results in
> > ICE due to type mismatch:
> >
> > pr96463.c: In function ‘foo’:
> > pr96463.c:8:1: error: type mismatch in ‘vec_perm_expr’
> >     8 | }
> >       | ^
> > svint32_t
> > int32x4_t
> > int32x4_t
> > svint32_t
> > _3 = VEC_PERM_EXPR <x_4(D), x_4(D), { 0, 1, 2, 3, ... }>;
> > during GIMPLE pass: ccp
> > dump file: pr96463.c.032t.ccp1
> > pr96463.c:8:1: internal compiler error: verify_gimple failed
> >
> > Should we perhaps add another tree code, that "extends" a fixed-width
> > vector into it's VLA equivalent ?
>
> No, I think this is just an extreme example of the combination not being
> well-tested. :-)  Obviously it's worse than I thought.
>
> I think accepting this kind of VEC_PERM_EXPR is still the way to go.
> Richi, WDYT?
Hi Richi, ping ?

Thanks,
Prathamesh
>
> Thanks,
> Richard

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [SVE] PR96463 - Optimise svld1rq from vectors
  2021-12-14  8:34       ` Prathamesh Kulkarni
@ 2022-01-03 10:18         ` Richard Biener
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Biener @ 2022-01-03 10:18 UTC (permalink / raw)
  To: Prathamesh Kulkarni; +Cc: gcc Patches, richard.sandiford

On Tue, 14 Dec 2021, Prathamesh Kulkarni wrote:

> On Tue, 7 Dec 2021 at 19:08, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
> >
> > Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> > > On Thu, 2 Dec 2021 at 23:11, Richard Sandiford
> > > <richard.sandiford@arm.com> wrote:
> > >>
> > >> Prathamesh Kulkarni <prathamesh.kulkarni@linaro.org> writes:
> > >> > Hi Richard,
> > >> > I have attached a WIP untested patch for PR96463.
> > >> > IIUC, the PR suggests to transform
> > >> > lhs = svld1rq ({-1, -1, ...}, &v[0])
> > >> > into:
> > >> > lhs = vec_perm_expr<v, v, {0, 0, ...}>
> > >> > if v is vector of 4 elements, and each element is 32 bits on little
> > >> > endian target ?
> > >> >
> > >> > I am sorry if this sounds like a silly question, but I am not sure how
> > >> > to convert a vector of type int32x4_t into svint32_t ? In the patch, I
> > >> > simply used NOP_EXPR (which I expected to fail), and gave type error
> > >> > during gimple verification:
> > >>
> > >> It should be possible in principle to have a VEC_PERM_EXPR in which
> > >> the operands are Advanced SIMD vectors and the result is an SVE vector.
> > >>
> > >> E.g., the dup in the PR would be something like this:
> > >>
> > >> foo (int32x4_t a)
> > >> {
> > >>   svint32_t _2;
> > >>
> > >>   _2 = VEC_PERM_EXPR <x_1(D), x_1(D), { 0, 1, 2, 3, 0, 1, 2, 3, ... }>;
> > >>   return _2;
> > >> }
> > >>
> > >> where the final operand can be built using:
> > >>
> > >>   int source_nelts = TYPE_VECTOR_SUBPARTS (…rhs type…).to_constant ();
> > >>   vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (…lhs type…), source_nelts, 1);
> > >>   for (int i = 0; i < source_nelts; ++i)
> > >>     sel.quick_push (i);
> > >>
> > >> I'm not sure how well-tested that combination is though.  It might need
> > >> changes to target-independent code.
> > > Hi Richard,
> > > Thanks for the suggestions.
> > > I tried the above approach in attached patch, but it still results in
> > > ICE due to type mismatch:
> > >
> > > pr96463.c: In function ‘foo’:
> > > pr96463.c:8:1: error: type mismatch in ‘vec_perm_expr’
> > >     8 | }
> > >       | ^
> > > svint32_t
> > > int32x4_t
> > > int32x4_t
> > > svint32_t
> > > _3 = VEC_PERM_EXPR <x_4(D), x_4(D), { 0, 1, 2, 3, ... }>;
> > > during GIMPLE pass: ccp
> > > dump file: pr96463.c.032t.ccp1
> > > pr96463.c:8:1: internal compiler error: verify_gimple failed
> > >
> > > Should we perhaps add another tree code, that "extends" a fixed-width
> > > vector into it's VLA equivalent ?
> >
> > No, I think this is just an extreme example of the combination not being
> > well-tested. :-)  Obviously it's worse than I thought.
> >
> > I think accepting this kind of VEC_PERM_EXPR is still the way to go.
> > Richi, WDYT?
> Hi Richi, ping ?

We check

    case VEC_PERM_EXPR:
      if (!useless_type_conversion_p (lhs_type, rhs1_type)
          || !useless_type_conversion_p (lhs_type, rhs2_type))
        {
          error ("type mismatch in %qs", code_name);

and LHS is svint32_t while x_4 is int32x4_t (the permutation type
can indeed be different - it needs to be integer for example).

The test is indeed unnecessarily strict if there are two vector
types that are not compatible but have the same element mode and the
same number of elements.

I guess we could check sth like

   if (!useless_type_conversion_p (TREE_TYPE (lhs_type), TREE_TYPE 
(rhs1_type))
       || !types_compatible_p (rhs1_type, rhs2_type))

instead - we later check TYPE_VECTOR_SUBPARTS so they match.  But
note that vec_perm_optab has a single mode only so I'm not sure
what mode we should pick at RTL expansion time for your quoted case
so I'm a bit nervous here.  Richard?

Richard. 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-01-03 10:18 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-02 10:50 [SVE] PR96463 - Optimise svld1rq from vectors Prathamesh Kulkarni
2021-12-02 17:41 ` Richard Sandiford
2021-12-07 11:45   ` Prathamesh Kulkarni
2021-12-07 13:38     ` Richard Sandiford
2021-12-14  8:34       ` Prathamesh Kulkarni
2022-01-03 10:18         ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).