public inbox for gcc-patches@gcc.gnu.org
From: Tamar Christina <Tamar.Christina@arm.com>
To: Richard Sandiford <Richard.Sandiford@arm.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	nd <nd@arm.com>, Richard Earnshaw <Richard.Earnshaw@arm.com>,
	Marcus Shawcroft <Marcus.Shawcroft@arm.com>,
	Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
Subject: RE: [PATCH]AArch64 relax predicate on load structure load instructions
Date: Wed, 8 Jun 2022 13:51:08 +0000	[thread overview]
Message-ID: <VI1PR08MB532510AAA956E28EDFAFBE3AFFA49@VI1PR08MB5325.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <mptbkv3igzp.fsf@arm.com>

> -----Original Message-----
> From: Richard Sandiford <richard.sandiford@arm.com>
> Sent: Wednesday, June 8, 2022 11:31 AM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>; Marcus Shawcroft
> <Marcus.Shawcroft@arm.com>; Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> Subject: Re: [PATCH]AArch64 relax predicate on load structure load
> instructions
> 
> Tamar Christina <tamar.christina@arm.com> writes:
> > Hi All,
> >
> > At some point in time we started lowering the ld1r instructions in gimple.
> >
> > That is:
> >
> > uint8x8_t f1(const uint8_t *in) {
> >     return vld1_dup_u8(&in[1]);
> > }
> >
> > generates at gimple:
> >
> >   _3 = MEM[(const uint8_t *)in_1(D) + 1B];
> >   _4 = {_3, _3, _3, _3, _3, _3, _3, _3};
> >
> > Which is good, but we then generate:
> >
> > f1:
> > 	ldr     b0, [x0, 1]
> > 	dup     v0.8b, v0.b[0]
> > 	ret
> >
> > instead of ld1r.
> >
> > The reason for this is that the load instructions have too restrictive
> > a predicate on them, which prevents combine from combining the
> > instructions because the predicate only accepts simple addressing
> > modes.
> >
> > This patch relaxes the predicate to accept any memory operand and
> > relies on LRA to legitimize the address when it needs to, as the
> > constraint still only allows the simple addressing modes.  Reload is
> > always able to legitimize to these.
> >
> > Secondly, since we are now actually generating more ld1r instructions,
> > it became clear that the lane instructions suffer from a similar issue.
> >
> > i.e.
> >
> > float32x4_t f2(const float32_t *in, float32x4_t a) {
> >     float32x4_t dup = vld1q_dup_f32(&in[1]);
> >     return vfmaq_laneq_f32 (a, a, dup, 1); }
> >
> > would generate ld1r + vector fmla instead of ldr + lane fmla.
> >
> > The reason for this is similar to the ld1r issue.  The predicate is
> > too restrictive in only accepting register operands but not memory.
> >
> > This relaxes it to accept register and/or memory operands while leaving
> > the constraint to only accept registers.  This will have LRA generate a
> > reload if needed, forcing the memory operand into a register using the
> > standard patterns.
> >
> > These two changes allow combine and reload to generate the right
> sequences.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> This is going against the general direction of travel, which is to make the
> instruction's predicates and conditions enforce the constraints as much as
> possible (making optimistic assumptions about pseudo registers).
> 
> The RA *can* deal with things like:
> 
>   (match_operand:M N "general_operand" "r")
> 
> but it's best avoided, for a few reasons:
> 
> (1) The fix-up will be done in LRA, so IRA will not see the temporary
>     registers.  This can make the allocation of those temporaries
>     suboptimal but (more importantly) it might require other
>     previously-allocated registers to be spilled late due to the
>     unexpected increase in register pressure.
> 
> (2) It ends up hiding instructions from the pre-RA optimisers.
> 
> (3) It can also prevent combine opportunities (as well as create them),
>     unless the loose predicates in an insn I are propagated to all
>     patterns that might result from combining I with something else.
> 
> It sounds like the first problem (not generating ld1r) could be fixed by (a)
> combining aarch64_simd_dup<mode> and *aarch64_simd_ld1r<mode>, so
> that the register and memory alternatives are in the same pattern and (b)
> using the merged instruction(s) to implement the vec_duplicate optab.
> Target-independent code should then make the address satisfy the
> predicate, simplifying the address where necessary.
> 

I think I am likely missing something here.  I assumed you wanted to use
the optab to split the addressing off from the mem expression so that the
combined insn matches.
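
Just so we're talking about the same thing, I'm reading (a) as a merged
pattern along the lines of the sketch below (untested, and the name,
alternatives and attributes are only illustrative):

;; Rough sketch only: the dup-from-register and ld1r forms merged into
;; one pattern, so the RA can choose either alternative.
(define_insn "*aarch64_simd_dup_or_ld1r<mode>"
  [(set (match_operand:VALL_F16 0 "register_operand" "=w,w")
	(vec_duplicate:VALL_F16
	  ;; Alternative 0 duplicates lane 0 of a SIMD register;
	  ;; alternative 1 loads and replicates straight from memory.
	  (match_operand:<VEL> 1 "nonimmediate_operand" "w,Utv")))]
  "TARGET_SIMD"
  "@
   dup\\t%0.<Vtype>, %1.<Vetype>[0]
   ld1r\\t{%0.<Vtype>}, %1"
  [(set_attr "type" "neon_dup<q>, neon_load1_all_lanes")]
)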

But in that case, why do you need to combine the two instructions?
I've tried it, and it doesn't work, since the vec_duplicate optab doesn't see
the mem as op1: in gimple the mem is not part of the duplicate.

So you still just see:

>>> dbgrtx (ops[1].value)
(subreg/s/v:QI (reg:SI 92 [ _3 ]) 0)

That is because the operand given as the argument to the dup is just an SSA_NAME.

If not, and you instead wanted the combined insn to accept

(set (reg:SI 92 [ _3 ])
    (zero_extend:SI (mem:QI (plus:DI (reg:DI 97)
                (const_int 1 [0x1])) [0 MEM[(const uint8_tD.4561 *)in_1(D) + 1B]+0 S1 A8])))

then that's also not possible without relaxing the combined predicates.  As far as I can tell,
if I'm not allowed to use LRA for this, the only thing that could work is an early split?
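
In case it helps, the early-split shape I have in mind is roughly the
following (again just a sketch, untested; it accepts any memory operand
and, before RA, forces an address the Utv constraint can't take into a
base register):

(define_insn_and_split "*aarch64_simd_ld1r<mode>_relaxed"
  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
	(vec_duplicate:VALL_F16
	  (match_operand:<VEL> 1 "memory_operand" "Utv")))]
  "TARGET_SIMD"
  "ld1r\\t{%0.<Vtype>}, %1"
  ;; Split as soon as the address isn't one of the simple forms.
  "&& !aarch64_simd_mem_operand_p (operands[1])"
  [(set (match_dup 0) (vec_duplicate:VALL_F16 (match_dup 1)))]
  {
    /* Move the address computation into a register so that what is
       left satisfies the Utv constraint.  */
    operands[1]
      = replace_equiv_address (operands[1],
			       force_reg (Pmode, XEXP (operands[1], 0)));
  }
  [(set_attr "type" "neon_load1_all_lanes")]
)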

Or do I have to modify store_constructor to try a variant that pushes in the
decl of an SSA_NAME first?

I guess this also only really works for ld1r; whenever we lower ld2(r) etc. we'll have the same
issue again...  But I suppose that's for the next person 😊

Thanks,
Tamar

> I'm not sure whether fixing the ld1r problem that way will avoid the
> vfmaq_laneq_f32 problem; let me know if not.
> 
> Thanks,
> Richard
> 
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	* config/aarch64/aarch64-simd.md (mul_lane<mode>3,
> mul_laneq<mode>3,
> > 	mul_n<mode>3, *aarch64_mul3_elt_to_64v2df,
> *aarch64_mla_elt<mode>,
> > 	*aarch64_mla_elt_<vswap_width_name><mode>,
> aarch64_mla_n<mode>,
> > 	*aarch64_mls_elt<mode>,
> *aarch64_mls_elt_<vswap_width_name><mode>,
> > 	aarch64_mls_n<mode>, *aarch64_fma4_elt<mode>,
> > 	*aarch64_fma4_elt_<vswap_width_name><mode>,
> > 	*aarch64_fma4_elt_from_dup<mode>,
> *aarch64_fma4_elt_to_64v2df,
> > 	*aarch64_fnma4_elt<mode>,
> *aarch64_fnma4_elt_<vswap_width_name><mode>,
> > 	*aarch64_fnma4_elt_from_dup<mode>,
> *aarch64_fnma4_elt_to_64v2df,
> > 	*aarch64_mulx_elt_<vswap_width_name><mode>,
> > 	*aarch64_mulx_elt<mode>, *aarch64_mulx_elt_from_dup<mode>,
> > 	*aarch64_vgetfmulx<mode>): Relax register_operand to
> > 	nonimmediate_operand.
> > 	(aarch64_simd_ld2<vstruct_elt>, aarch64_simd_ld2r<vstruct_elt>,
> > 	aarch64_vec_load_lanes<mode>_lane<vstruct_elt>,
> > 	vec_load_lanes<mode><vstruct_elt>,
> aarch64_simd_st2<vstruct_elt>,
> > 	aarch64_vec_store_lanes<mode>_lane<vstruct_elt>,
> > 	vec_store_lanes<mode><vstruct_elt>,
> aarch64_simd_ld3<vstruct_elt>,
> > 	aarch64_simd_ld3r<vstruct_elt>,
> > 	aarch64_vec_load_lanes<mode>_lane<vstruct_elt>,
> > 	vec_load_lanes<mode><vstruct_elt>,
> aarch64_simd_st3<vstruct_elt>,
> > 	aarch64_vec_store_lanes<mode>_lane<vstruct_elt>,
> > 	vec_store_lanes<mode><vstruct_elt>,
> aarch64_simd_ld4<vstruct_elt>,
> > 	aarch64_simd_ld4r<vstruct_elt>,
> > 	aarch64_vec_load_lanes<mode>_lane<vstruct_elt>,
> > 	vec_load_lanes<mode><vstruct_elt>,
> aarch64_simd_st4<vstruct_elt>,
> > 	aarch64_vec_store_lanes<mode>_lane<vstruct_elt>,
> > 	vec_store_lanes<mode><vstruct_elt>,
> aarch64_ld1_x3_<vstruct_elt>,
> > 	aarch64_ld1_x4_<vstruct_elt>, aarch64_st1_x2_<vstruct_elt>,
> > 	aarch64_st1_x3_<vstruct_elt>, aarch64_st1_x4_<vstruct_elt>,
> > 	aarch64_be_ld1<mode>, aarch64_be_st1<mode>,
> > 	aarch64_ld2<vstruct_elt>_dreg, aarch64_ld2<vstruct_elt>_dreg,
> > 	aarch64_ld3<vstruct_elt>_dreg, aarch64_ld3<vstruct_elt>_dreg,
> > 	aarch64_ld4<vstruct_elt>_dreg, aarch64_ld4<vstruct_elt>_dreg,
> > 	aarch64_st2<vstruct_elt>_dreg, aarch64_st2<vstruct_elt>_dreg,
> > 	aarch64_st3<vstruct_elt>_dreg, aarch64_st3<vstruct_elt>_dreg,
> > 	aarch64_st4<vstruct_elt>_dreg, aarch64_st4<vstruct_elt>_dreg,
> > 	*aarch64_simd_ld1r<mode>, aarch64_simd_ld1<vstruct_elt>_x2):
> Relax
> > 	aarch64_simd_struct_operand to memory_operand.
> > 	* config/aarch64/predicates.md (aarch64_simd_struct_operand):
> Remove.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 	* gcc.target/aarch64/vld1r.c: New test.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/config/aarch64/aarch64-simd.md
> > b/gcc/config/aarch64/aarch64-simd.md
> > index
> >
> be5c70bbb7520ae93d19c4a432ce34863e5b9a64..24e3274ddda2ea76c83571fa
> da8f
> > f4c953b752a1 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -712,7 +712,7 @@ (define_insn "mul_lane<mode>3"
> >         (mult:VMULD
> >  	 (vec_duplicate:VMULD
> >  	   (vec_select:<VEL>
> > -	     (match_operand:<VCOND> 2 "register_operand" "<h_con>")
> > +	     (match_operand:<VCOND> 2 "nonimmediate_operand"
> "<h_con>")
> >  	     (parallel [(match_operand:SI 3 "immediate_operand" "i")])))
> >  	 (match_operand:VMULD 1 "register_operand" "w")))]
> >    "TARGET_SIMD"
> > @@ -728,7 +728,7 @@ (define_insn "mul_laneq<mode>3"
> >       (mult:VMUL
> >         (vec_duplicate:VMUL
> >  	  (vec_select:<VEL>
> > -	    (match_operand:<VCONQ> 2 "register_operand" "<h_con>")
> > +	    (match_operand:<VCONQ> 2 "nonimmediate_operand"
> "<h_con>")
> >  	    (parallel [(match_operand:SI 3 "immediate_operand")])))
> >        (match_operand:VMUL 1 "register_operand" "w")))]
> >    "TARGET_SIMD"
> > @@ -743,7 +743,7 @@ (define_insn "mul_n<mode>3"
> >   [(set (match_operand:VMUL 0 "register_operand" "=w")
> >         (mult:VMUL
> >  	 (vec_duplicate:VMUL
> > -	   (match_operand:<VEL> 2 "register_operand" "<h_con>"))
> > +	   (match_operand:<VEL> 2 "nonimmediate_operand" "<h_con>"))
> >  	 (match_operand:VMUL 1 "register_operand" "w")))]
> >    "TARGET_SIMD"
> >    "<f>mul\t%0.<Vtype>, %1.<Vtype>, %2.<Vetype>[0]"; @@ -789,7 +789,7
> > @@ (define_insn "*aarch64_mul3_elt_to_64v2df"
> >    [(set (match_operand:DF 0 "register_operand" "=w")
> >       (mult:DF
> >         (vec_select:DF
> > -	 (match_operand:V2DF 1 "register_operand" "w")
> > +	 (match_operand:V2DF 1 "nonimmediate_operand" "w")
> >  	 (parallel [(match_operand:SI 2 "immediate_operand")]))
> >         (match_operand:DF 3 "register_operand" "w")))]
> >    "TARGET_SIMD"
> > @@ -1406,7 +1406,7 @@ (define_insn "*aarch64_mla_elt<mode>"
> >  	 (mult:VDQHS
> >  	   (vec_duplicate:VDQHS
> >  	      (vec_select:<VEL>
> > -		(match_operand:VDQHS 1 "register_operand" "<h_con>")
> > +		(match_operand:VDQHS 1 "nonimmediate_operand"
> "<h_con>")
> >  		  (parallel [(match_operand:SI 2 "immediate_operand")])))
> >  	   (match_operand:VDQHS 3 "register_operand" "w"))
> >  	 (match_operand:VDQHS 4 "register_operand" "0")))] @@ -1424,7
> > +1424,7 @@ (define_insn
> "*aarch64_mla_elt_<vswap_width_name><mode>"
> >  	 (mult:VDQHS
> >  	   (vec_duplicate:VDQHS
> >  	      (vec_select:<VEL>
> > -		(match_operand:<VSWAP_WIDTH> 1 "register_operand"
> "<h_con>")
> > +		(match_operand:<VSWAP_WIDTH> 1
> "nonimmediate_operand" "<h_con>")
> >  		  (parallel [(match_operand:SI 2 "immediate_operand")])))
> >  	   (match_operand:VDQHS 3 "register_operand" "w"))
> >  	 (match_operand:VDQHS 4 "register_operand" "0")))] @@ -1441,7
> > +1441,7 @@ (define_insn "aarch64_mla_n<mode>"
> >  	(plus:VDQHS
> >  	  (mult:VDQHS
> >  	    (vec_duplicate:VDQHS
> > -	      (match_operand:<VEL> 3 "register_operand" "<h_con>"))
> > +	      (match_operand:<VEL> 3 "nonimmediate_operand" "<h_con>"))
> >  	    (match_operand:VDQHS 2 "register_operand" "w"))
> >  	  (match_operand:VDQHS 1 "register_operand" "0")))]
> >   "TARGET_SIMD"
> > @@ -1466,7 +1466,7 @@ (define_insn "*aarch64_mls_elt<mode>"
> >  	 (mult:VDQHS
> >  	   (vec_duplicate:VDQHS
> >  	      (vec_select:<VEL>
> > -		(match_operand:VDQHS 1 "register_operand" "<h_con>")
> > +		(match_operand:VDQHS 1 "nonimmediate_operand"
> "<h_con>")
> >  		  (parallel [(match_operand:SI 2 "immediate_operand")])))
> >  	   (match_operand:VDQHS 3 "register_operand" "w"))))]
> >   "TARGET_SIMD"
> > @@ -1484,7 +1484,7 @@ (define_insn
> "*aarch64_mls_elt_<vswap_width_name><mode>"
> >  	 (mult:VDQHS
> >  	   (vec_duplicate:VDQHS
> >  	      (vec_select:<VEL>
> > -		(match_operand:<VSWAP_WIDTH> 1 "register_operand"
> "<h_con>")
> > +		(match_operand:<VSWAP_WIDTH> 1
> "nonimmediate_operand" "<h_con>")
> >  		  (parallel [(match_operand:SI 2 "immediate_operand")])))
> >  	   (match_operand:VDQHS 3 "register_operand" "w"))))]
> >   "TARGET_SIMD"
> > @@ -1501,7 +1501,7 @@ (define_insn "aarch64_mls_n<mode>"
> >  	  (match_operand:VDQHS 1 "register_operand" "0")
> >  	  (mult:VDQHS
> >  	    (vec_duplicate:VDQHS
> > -	      (match_operand:<VEL> 3 "register_operand" "<h_con>"))
> > +	      (match_operand:<VEL> 3 "nonimmediate_operand" "<h_con>"))
> >  	    (match_operand:VDQHS 2 "register_operand" "w"))))]
> >    "TARGET_SIMD"
> >    "mls\t%0.<Vtype>, %2.<Vtype>, %3.<Vetype>[0]"
> > @@ -2882,7 +2882,7 @@ (define_insn "*aarch64_fma4_elt<mode>"
> >      (fma:VDQF
> >        (vec_duplicate:VDQF
> >  	(vec_select:<VEL>
> > -	  (match_operand:VDQF 1 "register_operand" "<h_con>")
> > +	  (match_operand:VDQF 1 "nonimmediate_operand" "<h_con>")
> >  	  (parallel [(match_operand:SI 2 "immediate_operand")])))
> >        (match_operand:VDQF 3 "register_operand" "w")
> >        (match_operand:VDQF 4 "register_operand" "0")))] @@ -2899,7
> > +2899,7 @@ (define_insn
> "*aarch64_fma4_elt_<vswap_width_name><mode>"
> >      (fma:VDQSF
> >        (vec_duplicate:VDQSF
> >  	(vec_select:<VEL>
> > -	  (match_operand:<VSWAP_WIDTH> 1 "register_operand"
> "<h_con>")
> > +	  (match_operand:<VSWAP_WIDTH> 1 "nonimmediate_operand"
> "<h_con>")
> >  	  (parallel [(match_operand:SI 2 "immediate_operand")])))
> >        (match_operand:VDQSF 3 "register_operand" "w")
> >        (match_operand:VDQSF 4 "register_operand" "0")))] @@ -2915,7
> > +2915,7 @@ (define_insn "*aarch64_fma4_elt_from_dup<mode>"
> >    [(set (match_operand:VMUL 0 "register_operand" "=w")
> >      (fma:VMUL
> >        (vec_duplicate:VMUL
> > -	  (match_operand:<VEL> 1 "register_operand" "<h_con>"))
> > +	  (match_operand:<VEL> 1 "nonimmediate_operand" "<h_con>"))
> >        (match_operand:VMUL 2 "register_operand" "w")
> >        (match_operand:VMUL 3 "register_operand" "0")))]
> >    "TARGET_SIMD"
> > @@ -2927,7 +2927,7 @@ (define_insn "*aarch64_fma4_elt_to_64v2df"
> >    [(set (match_operand:DF 0 "register_operand" "=w")
> >      (fma:DF
> >  	(vec_select:DF
> > -	  (match_operand:V2DF 1 "register_operand" "w")
> > +	  (match_operand:V2DF 1 "nonimmediate_operand" "w")
> >  	  (parallel [(match_operand:SI 2 "immediate_operand")]))
> >        (match_operand:DF 3 "register_operand" "w")
> >        (match_operand:DF 4 "register_operand" "0")))] @@ -2957,7
> > +2957,7 @@ (define_insn "*aarch64_fnma4_elt<mode>"
> >          (match_operand:VDQF 3 "register_operand" "w"))
> >        (vec_duplicate:VDQF
> >  	(vec_select:<VEL>
> > -	  (match_operand:VDQF 1 "register_operand" "<h_con>")
> > +	  (match_operand:VDQF 1 "nonimmediate_operand" "<h_con>")
> >  	  (parallel [(match_operand:SI 2 "immediate_operand")])))
> >        (match_operand:VDQF 4 "register_operand" "0")))]
> >    "TARGET_SIMD"
> > @@ -2975,7 +2975,7 @@ (define_insn
> "*aarch64_fnma4_elt_<vswap_width_name><mode>"
> >          (match_operand:VDQSF 3 "register_operand" "w"))
> >        (vec_duplicate:VDQSF
> >  	(vec_select:<VEL>
> > -	  (match_operand:<VSWAP_WIDTH> 1 "register_operand"
> "<h_con>")
> > +	  (match_operand:<VSWAP_WIDTH> 1 "nonimmediate_operand"
> "<h_con>")
> >  	  (parallel [(match_operand:SI 2 "immediate_operand")])))
> >        (match_operand:VDQSF 4 "register_operand" "0")))]
> >    "TARGET_SIMD"
> > @@ -2992,7 +2992,7 @@ (define_insn
> "*aarch64_fnma4_elt_from_dup<mode>"
> >        (neg:VMUL
> >          (match_operand:VMUL 2 "register_operand" "w"))
> >        (vec_duplicate:VMUL
> > -	(match_operand:<VEL> 1 "register_operand" "<h_con>"))
> > +	(match_operand:<VEL> 1 "nonimmediate_operand" "<h_con>"))
> >        (match_operand:VMUL 3 "register_operand" "0")))]
> >    "TARGET_SIMD"
> >    "fmls\t%0.<Vtype>, %2.<Vtype>, %1.<Vetype>[0]"
> > @@ -3003,7 +3003,7 @@ (define_insn "*aarch64_fnma4_elt_to_64v2df"
> >    [(set (match_operand:DF 0 "register_operand" "=w")
> >      (fma:DF
> >        (vec_select:DF
> > -	(match_operand:V2DF 1 "register_operand" "w")
> > +	(match_operand:V2DF 1 "nonimmediate_operand" "w")
> >  	(parallel [(match_operand:SI 2 "immediate_operand")]))
> >        (neg:DF
> >          (match_operand:DF 3 "register_operand" "w")) @@ -4934,7
> > +4934,7 @@ (define_insn
> "*aarch64_mulx_elt_<vswap_width_name><mode>"
> >  	 [(match_operand:VDQSF 1 "register_operand" "w")
> >  	  (vec_duplicate:VDQSF
> >  	   (vec_select:<VEL>
> > -	    (match_operand:<VSWAP_WIDTH> 2 "register_operand" "w")
> > +	    (match_operand:<VSWAP_WIDTH> 2 "nonimmediate_operand"
> "w")
> >  	    (parallel [(match_operand:SI 3 "immediate_operand" "i")])))]
> >  	 UNSPEC_FMULX))]
> >    "TARGET_SIMD"
> > @@ -4953,7 +4953,7 @@ (define_insn "*aarch64_mulx_elt<mode>"
> >  	 [(match_operand:VDQF 1 "register_operand" "w")
> >  	  (vec_duplicate:VDQF
> >  	   (vec_select:<VEL>
> > -	    (match_operand:VDQF 2 "register_operand" "w")
> > +	    (match_operand:VDQF 2 "nonimmediate_operand" "w")
> >  	    (parallel [(match_operand:SI 3 "immediate_operand" "i")])))]
> >  	 UNSPEC_FMULX))]
> >    "TARGET_SIMD"
> > @@ -4971,7 +4971,7 @@ (define_insn
> "*aarch64_mulx_elt_from_dup<mode>"
> >  	(unspec:VHSDF
> >  	 [(match_operand:VHSDF 1 "register_operand" "w")
> >  	  (vec_duplicate:VHSDF
> > -	    (match_operand:<VEL> 2 "register_operand" "<h_con>"))]
> > +	    (match_operand:<VEL> 2 "nonimmediate_operand" "<h_con>"))]
> >  	 UNSPEC_FMULX))]
> >    "TARGET_SIMD"
> >    "fmulx\t%0.<Vtype>, %1.<Vtype>, %2.<Vetype>[0]"; @@ -4987,7
> +4987,7
> > @@ (define_insn "*aarch64_vgetfmulx<mode>"
> >  	(unspec:<VEL>
> >  	 [(match_operand:<VEL> 1 "register_operand" "w")
> >  	  (vec_select:<VEL>
> > -	   (match_operand:VDQF 2 "register_operand" "w")
> > +	   (match_operand:VDQF 2 "nonimmediate_operand" "w")
> >  	    (parallel [(match_operand:SI 3 "immediate_operand" "i")]))]
> >  	 UNSPEC_FMULX))]
> >    "TARGET_SIMD"
> > @@ -6768,7 +6768,7 @@ (define_insn "*sqrt<mode>2"
> >  (define_insn "aarch64_simd_ld2<vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_2Q 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_2Q [
> > -	  (match_operand:VSTRUCT_2Q 1 "aarch64_simd_struct_operand"
> "Utv")]
> > +	  (match_operand:VSTRUCT_2Q 1 "memory_operand" "Utv")]
> >  	  UNSPEC_LD2))]
> >    "TARGET_SIMD"
> >    "ld2\\t{%S0.<Vtype> - %T0.<Vtype>}, %1"
> > @@ -6778,7 +6778,7 @@ (define_insn "aarch64_simd_ld2<vstruct_elt>"
> >  (define_insn "aarch64_simd_ld2r<vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_2QD 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_2QD [
> > -	  (match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> > +	  (match_operand:BLK 1 "memory_operand" "Utv")]
> >            UNSPEC_LD2_DUP))]
> >    "TARGET_SIMD"
> >    "ld2r\\t{%S0.<Vtype> - %T0.<Vtype>}, %1"
> > @@ -6788,7 +6788,7 @@ (define_insn "aarch64_simd_ld2r<vstruct_elt>"
> >  (define_insn "aarch64_vec_load_lanes<mode>_lane<vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_2QD 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_2QD [
> > -		(match_operand:BLK 1 "aarch64_simd_struct_operand"
> "Utv")
> > +		(match_operand:BLK 1 "memory_operand" "Utv")
> >  		(match_operand:VSTRUCT_2QD 2 "register_operand" "0")
> >  		(match_operand:SI 3 "immediate_operand" "i")]
> >  		UNSPEC_LD2_LANE))]
> > @@ -6804,7 +6804,7 @@ (define_insn
> "aarch64_vec_load_lanes<mode>_lane<vstruct_elt>"
> >  (define_expand "vec_load_lanes<mode><vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_2Q 0 "register_operand")
> >  	(unspec:VSTRUCT_2Q [
> > -		(match_operand:VSTRUCT_2Q 1
> "aarch64_simd_struct_operand")]
> > +		(match_operand:VSTRUCT_2Q 1 "memory_operand")]
> >  		UNSPEC_LD2))]
> >    "TARGET_SIMD"
> >  {
> > @@ -6822,7 +6822,7 @@ (define_expand
> "vec_load_lanes<mode><vstruct_elt>"
> >  })
> >
> >  (define_insn "aarch64_simd_st2<vstruct_elt>"
> > -  [(set (match_operand:VSTRUCT_2Q 0 "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_2Q 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_2Q [
> >  		(match_operand:VSTRUCT_2Q 1 "register_operand" "w")]
> >                  UNSPEC_ST2))]
> > @@ -6833,7 +6833,7 @@ (define_insn "aarch64_simd_st2<vstruct_elt>"
> >
> >  ;; RTL uses GCC vector extension indices, so flip only for assembly.
> >  (define_insn "aarch64_vec_store_lanes<mode>_lane<vstruct_elt>"
> > -  [(set (match_operand:BLK 0 "aarch64_simd_struct_operand" "=Utv")
> > +  [(set (match_operand:BLK 0 "memory_operand" "=Utv")
> >  	(unspec:BLK [(match_operand:VSTRUCT_2QD 1 "register_operand"
> "w")
> >  		     (match_operand:SI 2 "immediate_operand" "i")]
> >  		     UNSPEC_ST2_LANE))]
> > @@ -6847,7 +6847,7 @@ (define_insn
> "aarch64_vec_store_lanes<mode>_lane<vstruct_elt>"
> >  )
> >
> >  (define_expand "vec_store_lanes<mode><vstruct_elt>"
> > -  [(set (match_operand:VSTRUCT_2Q 0 "aarch64_simd_struct_operand")
> > +  [(set (match_operand:VSTRUCT_2Q 0 "memory_operand")
> >  	(unspec:VSTRUCT_2Q [(match_operand:VSTRUCT_2Q 1
> "register_operand")]
> >                     UNSPEC_ST2))]
> >    "TARGET_SIMD"
> > @@ -6868,7 +6868,7 @@ (define_expand
> "vec_store_lanes<mode><vstruct_elt>"
> >  (define_insn "aarch64_simd_ld3<vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_3Q 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_3Q [
> > -	  (match_operand:VSTRUCT_3Q 1 "aarch64_simd_struct_operand"
> "Utv")]
> > +	  (match_operand:VSTRUCT_3Q 1 "memory_operand" "Utv")]
> >  	  UNSPEC_LD3))]
> >    "TARGET_SIMD"
> >    "ld3\\t{%S0.<Vtype> - %U0.<Vtype>}, %1"
> > @@ -6878,7 +6878,7 @@ (define_insn "aarch64_simd_ld3<vstruct_elt>"
> >  (define_insn "aarch64_simd_ld3r<vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_3QD 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_3QD [
> > -	  (match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> > +	  (match_operand:BLK 1 "memory_operand" "Utv")]
> >            UNSPEC_LD3_DUP))]
> >    "TARGET_SIMD"
> >    "ld3r\\t{%S0.<Vtype> - %U0.<Vtype>}, %1"
> > @@ -6888,7 +6888,7 @@ (define_insn "aarch64_simd_ld3r<vstruct_elt>"
> >  (define_insn "aarch64_vec_load_lanes<mode>_lane<vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_3QD 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_3QD [
> > -		(match_operand:BLK 1 "aarch64_simd_struct_operand"
> "Utv")
> > +		(match_operand:BLK 1 "memory_operand" "Utv")
> >  		(match_operand:VSTRUCT_3QD 2 "register_operand" "0")
> >  		(match_operand:SI 3 "immediate_operand" "i")]
> >  		UNSPEC_LD3_LANE))]
> > @@ -6904,7 +6904,7 @@ (define_insn
> "aarch64_vec_load_lanes<mode>_lane<vstruct_elt>"
> >  (define_expand "vec_load_lanes<mode><vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_3Q 0 "register_operand")
> >  	(unspec:VSTRUCT_3Q [
> > -		(match_operand:VSTRUCT_3Q 1
> "aarch64_simd_struct_operand")]
> > +		(match_operand:VSTRUCT_3Q 1 "memory_operand")]
> >  		UNSPEC_LD3))]
> >    "TARGET_SIMD"
> >  {
> > @@ -6922,7 +6922,7 @@ (define_expand
> "vec_load_lanes<mode><vstruct_elt>"
> >  })
> >
> >  (define_insn "aarch64_simd_st3<vstruct_elt>"
> > -  [(set (match_operand:VSTRUCT_3Q 0 "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_3Q 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_3Q [(match_operand:VSTRUCT_3Q 1
> "register_operand" "w")]
> >                     UNSPEC_ST3))]
> >    "TARGET_SIMD"
> > @@ -6932,7 +6932,7 @@ (define_insn "aarch64_simd_st3<vstruct_elt>"
> >
> >  ;; RTL uses GCC vector extension indices, so flip only for assembly.
> >  (define_insn "aarch64_vec_store_lanes<mode>_lane<vstruct_elt>"
> > -  [(set (match_operand:BLK 0 "aarch64_simd_struct_operand" "=Utv")
> > +  [(set (match_operand:BLK 0 "memory_operand" "=Utv")
> >  	(unspec:BLK [(match_operand:VSTRUCT_3QD 1 "register_operand"
> "w")
> >  		     (match_operand:SI 2 "immediate_operand" "i")]
> >  		     UNSPEC_ST3_LANE))]
> > @@ -6946,7 +6946,7 @@ (define_insn
> "aarch64_vec_store_lanes<mode>_lane<vstruct_elt>"
> >  )
> >
> >  (define_expand "vec_store_lanes<mode><vstruct_elt>"
> > -  [(set (match_operand:VSTRUCT_3Q 0 "aarch64_simd_struct_operand")
> > +  [(set (match_operand:VSTRUCT_3Q 0 "memory_operand")
> >  	(unspec:VSTRUCT_3Q [
> >  		(match_operand:VSTRUCT_3Q 1 "register_operand")]
> >                  UNSPEC_ST3))]
> > @@ -6968,7 +6968,7 @@ (define_expand
> "vec_store_lanes<mode><vstruct_elt>"
> >  (define_insn "aarch64_simd_ld4<vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_4Q 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_4Q [
> > -	  (match_operand:VSTRUCT_4Q 1 "aarch64_simd_struct_operand"
> "Utv")]
> > +	  (match_operand:VSTRUCT_4Q 1 "memory_operand" "Utv")]
> >  	  UNSPEC_LD4))]
> >    "TARGET_SIMD"
> >    "ld4\\t{%S0.<Vtype> - %V0.<Vtype>}, %1"
> > @@ -6978,7 +6978,7 @@ (define_insn "aarch64_simd_ld4<vstruct_elt>"
> >  (define_insn "aarch64_simd_ld4r<vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_4QD 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_4QD [
> > -	  (match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> > +	  (match_operand:BLK 1 "memory_operand" "Utv")]
> >            UNSPEC_LD4_DUP))]
> >    "TARGET_SIMD"
> >    "ld4r\\t{%S0.<Vtype> - %V0.<Vtype>}, %1"
> > @@ -6988,7 +6988,7 @@ (define_insn "aarch64_simd_ld4r<vstruct_elt>"
> >  (define_insn "aarch64_vec_load_lanes<mode>_lane<vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_4QD 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_4QD [
> > -		(match_operand:BLK 1 "aarch64_simd_struct_operand"
> "Utv")
> > +		(match_operand:BLK 1 "memory_operand" "Utv")
> >  		(match_operand:VSTRUCT_4QD 2 "register_operand" "0")
> >  		(match_operand:SI 3 "immediate_operand" "i")]
> >  		UNSPEC_LD4_LANE))]
> > @@ -7004,7 +7004,7 @@ (define_insn
> "aarch64_vec_load_lanes<mode>_lane<vstruct_elt>"
> >  (define_expand "vec_load_lanes<mode><vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_4Q 0 "register_operand")
> >  	(unspec:VSTRUCT_4Q [
> > -		(match_operand:VSTRUCT_4Q 1
> "aarch64_simd_struct_operand")]
> > +		(match_operand:VSTRUCT_4Q 1 "memory_operand")]
> >  		UNSPEC_LD4))]
> >    "TARGET_SIMD"
> >  {
> > @@ -7022,7 +7022,7 @@ (define_expand
> "vec_load_lanes<mode><vstruct_elt>"
> >  })
> >
> >  (define_insn "aarch64_simd_st4<vstruct_elt>"
> > -  [(set (match_operand:VSTRUCT_4Q 0 "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_4Q 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_4Q [
> >  		(match_operand:VSTRUCT_4Q 1 "register_operand" "w")]
> >                  UNSPEC_ST4))]
> > @@ -7033,7 +7033,7 @@ (define_insn "aarch64_simd_st4<vstruct_elt>"
> >
> >  ;; RTL uses GCC vector extension indices, so flip only for assembly.
> >  (define_insn "aarch64_vec_store_lanes<mode>_lane<vstruct_elt>"
> > -  [(set (match_operand:BLK 0 "aarch64_simd_struct_operand" "=Utv")
> > +  [(set (match_operand:BLK 0 "memory_operand" "=Utv")
> >  	(unspec:BLK [(match_operand:VSTRUCT_4QD 1 "register_operand"
> "w")
> >  		     (match_operand:SI 2 "immediate_operand" "i")]
> >  		     UNSPEC_ST4_LANE))]
> > @@ -7047,7 +7047,7 @@ (define_insn
> "aarch64_vec_store_lanes<mode>_lane<vstruct_elt>"
> >  )
> >
> >  (define_expand "vec_store_lanes<mode><vstruct_elt>"
> > -  [(set (match_operand:VSTRUCT_4Q 0 "aarch64_simd_struct_operand")
> > +  [(set (match_operand:VSTRUCT_4Q 0 "memory_operand")
> >  	(unspec:VSTRUCT_4Q [(match_operand:VSTRUCT_4Q 1
> "register_operand")]
> >                     UNSPEC_ST4))]
> >    "TARGET_SIMD"
> > @@ -7138,7 +7138,7 @@ (define_expand "aarch64_ld1x3<vstruct_elt>"
> >  (define_insn "aarch64_ld1_x3_<vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_3QD 0 "register_operand" "=w")
> >          (unspec:VSTRUCT_3QD
> > -	  [(match_operand:VSTRUCT_3QD 1 "aarch64_simd_struct_operand"
> "Utv")]
> > +	  [(match_operand:VSTRUCT_3QD 1 "memory_operand" "Utv")]
> >  	  UNSPEC_LD1))]
> >    "TARGET_SIMD"
> >    "ld1\\t{%S0.<Vtype> - %U0.<Vtype>}, %1"
> > @@ -7158,7 +7158,7 @@ (define_expand "aarch64_ld1x4<vstruct_elt>"
> >  (define_insn "aarch64_ld1_x4_<vstruct_elt>"
> >    [(set (match_operand:VSTRUCT_4QD 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_4QD
> > -	  [(match_operand:VSTRUCT_4QD 1 "aarch64_simd_struct_operand"
> "Utv")]
> > +	  [(match_operand:VSTRUCT_4QD 1 "memory_operand" "Utv")]
> >  	UNSPEC_LD1))]
> >    "TARGET_SIMD"
> >    "ld1\\t{%S0.<Vtype> - %V0.<Vtype>}, %1"
> > @@ -7176,7 +7176,7 @@ (define_expand "aarch64_st1x2<vstruct_elt>"
> >  })
> >
> >  (define_insn "aarch64_st1_x2_<vstruct_elt>"
> > -  [(set (match_operand:VSTRUCT_2QD 0 "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_2QD 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_2QD
> >  		[(match_operand:VSTRUCT_2QD 1 "register_operand" "w")]
> >  		UNSPEC_ST1))]
> > @@ -7196,7 +7196,7 @@ (define_expand "aarch64_st1x3<vstruct_elt>"
> >  })
> >
> >  (define_insn "aarch64_st1_x3_<vstruct_elt>"
> > -  [(set (match_operand:VSTRUCT_3QD 0 "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_3QD 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_3QD
> >  		[(match_operand:VSTRUCT_3QD 1 "register_operand" "w")]
> >  		UNSPEC_ST1))]
> > @@ -7216,7 +7216,7 @@ (define_expand "aarch64_st1x4<vstruct_elt>"
> >  })
> >
> >  (define_insn "aarch64_st1_x4_<vstruct_elt>"
> > -  [(set (match_operand:VSTRUCT_4QD 0 "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_4QD 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_4QD
> >  		[(match_operand:VSTRUCT_4QD 1 "register_operand" "w")]
> >  		UNSPEC_ST1))]
> > @@ -7268,7 +7268,7 @@ (define_insn "*aarch64_movv8di"
> >  (define_insn "aarch64_be_ld1<mode>"
> >    [(set (match_operand:VALLDI_F16 0	"register_operand" "=w")
> >  	(unspec:VALLDI_F16 [(match_operand:VALLDI_F16 1
> > -			     "aarch64_simd_struct_operand" "Utv")]
> > +			     "memory_operand" "Utv")]
> >  	UNSPEC_LD1))]
> >    "TARGET_SIMD"
> >    "ld1\\t{%0<Vmtype>}, %1"
> > @@ -7276,7 +7276,7 @@ (define_insn "aarch64_be_ld1<mode>"
> >  )
> >
> >  (define_insn "aarch64_be_st1<mode>"
> > -  [(set (match_operand:VALLDI_F16 0 "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VALLDI_F16 0 "memory_operand" "=Utv")
> >  	(unspec:VALLDI_F16 [(match_operand:VALLDI_F16 1
> "register_operand" "w")]
> >  	UNSPEC_ST1))]
> >    "TARGET_SIMD"
> > @@ -7551,7 +7551,7 @@ (define_expand
> "aarch64_ld<nregs>r<vstruct_elt>"
> >  (define_insn "aarch64_ld2<vstruct_elt>_dreg"
> >    [(set (match_operand:VSTRUCT_2DNX 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_2DNX [
> > -	  (match_operand:VSTRUCT_2DNX 1
> "aarch64_simd_struct_operand" "Utv")]
> > +	  (match_operand:VSTRUCT_2DNX 1 "memory_operand" "Utv")]
> >  	  UNSPEC_LD2_DREG))]
> >    "TARGET_SIMD"
> >    "ld2\\t{%S0.<Vtype> - %T0.<Vtype>}, %1"
> > @@ -7561,7 +7561,7 @@ (define_insn "aarch64_ld2<vstruct_elt>_dreg"
> >  (define_insn "aarch64_ld2<vstruct_elt>_dreg"
> >    [(set (match_operand:VSTRUCT_2DX 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_2DX [
> > -	  (match_operand:VSTRUCT_2DX 1 "aarch64_simd_struct_operand"
> "Utv")]
> > +	  (match_operand:VSTRUCT_2DX 1 "memory_operand" "Utv")]
> >  	  UNSPEC_LD2_DREG))]
> >    "TARGET_SIMD"
> >    "ld1\\t{%S0.1d - %T0.1d}, %1"
> > @@ -7571,7 +7571,7 @@ (define_insn "aarch64_ld2<vstruct_elt>_dreg"
> >  (define_insn "aarch64_ld3<vstruct_elt>_dreg"
> >    [(set (match_operand:VSTRUCT_3DNX 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_3DNX [
> > -	  (match_operand:VSTRUCT_3DNX 1
> "aarch64_simd_struct_operand" "Utv")]
> > +	  (match_operand:VSTRUCT_3DNX 1 "memory_operand" "Utv")]
> >  	  UNSPEC_LD3_DREG))]
> >    "TARGET_SIMD"
> >    "ld3\\t{%S0.<Vtype> - %U0.<Vtype>}, %1"
> > @@ -7581,7 +7581,7 @@ (define_insn "aarch64_ld3<vstruct_elt>_dreg"
> >  (define_insn "aarch64_ld3<vstruct_elt>_dreg"
> >    [(set (match_operand:VSTRUCT_3DX 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_3DX [
> > -	  (match_operand:VSTRUCT_3DX 1 "aarch64_simd_struct_operand"
> "Utv")]
> > +	  (match_operand:VSTRUCT_3DX 1 "memory_operand" "Utv")]
> >  	  UNSPEC_LD3_DREG))]
> >    "TARGET_SIMD"
> >    "ld1\\t{%S0.1d - %U0.1d}, %1"
> > @@ -7591,7 +7591,7 @@ (define_insn "aarch64_ld3<vstruct_elt>_dreg"
> >  (define_insn "aarch64_ld4<vstruct_elt>_dreg"
> >    [(set (match_operand:VSTRUCT_4DNX 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_4DNX [
> > -	  (match_operand:VSTRUCT_4DNX 1
> "aarch64_simd_struct_operand" "Utv")]
> > +	  (match_operand:VSTRUCT_4DNX 1 "memory_operand" "Utv")]
> >  	  UNSPEC_LD4_DREG))]
> >    "TARGET_SIMD"
> >    "ld4\\t{%S0.<Vtype> - %V0.<Vtype>}, %1"
> > @@ -7601,7 +7601,7 @@ (define_insn "aarch64_ld4<vstruct_elt>_dreg"
> >  (define_insn "aarch64_ld4<vstruct_elt>_dreg"
> >    [(set (match_operand:VSTRUCT_4DX 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_4DX [
> > -	  (match_operand:VSTRUCT_4DX 1 "aarch64_simd_struct_operand"
> "Utv")]
> > +	  (match_operand:VSTRUCT_4DX 1 "memory_operand" "Utv")]
> >  	  UNSPEC_LD4_DREG))]
> >    "TARGET_SIMD"
> >    "ld1\\t{%S0.1d - %V0.1d}, %1"
> > @@ -7841,7 +7841,7 @@ (define_insn
> "aarch64_rev<REVERSE:rev_op><mode>"
> >  )
> >
> >  (define_insn "aarch64_st2<vstruct_elt>_dreg"
> > -  [(set (match_operand:VSTRUCT_2DNX 0
> "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_2DNX 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_2DNX [
> >  		(match_operand:VSTRUCT_2DNX 1 "register_operand" "w")]
> >  		UNSPEC_ST2))]
> > @@ -7851,7 +7851,7 @@ (define_insn "aarch64_st2<vstruct_elt>_dreg"
> >  )
> >
> >  (define_insn "aarch64_st2<vstruct_elt>_dreg"
> > -  [(set (match_operand:VSTRUCT_2DX 0 "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_2DX 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_2DX [
> >  		(match_operand:VSTRUCT_2DX 1 "register_operand" "w")]
> >  		UNSPEC_ST2))]
> > @@ -7861,7 +7861,7 @@ (define_insn "aarch64_st2<vstruct_elt>_dreg"
> >  )
> >
> >  (define_insn "aarch64_st3<vstruct_elt>_dreg"
> > -  [(set (match_operand:VSTRUCT_3DNX 0
> "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_3DNX 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_3DNX [
> >  		(match_operand:VSTRUCT_3DNX 1 "register_operand" "w")]
> >  		UNSPEC_ST3))]
> > @@ -7871,7 +7871,7 @@ (define_insn "aarch64_st3<vstruct_elt>_dreg"
> >  )
> >
> >  (define_insn "aarch64_st3<vstruct_elt>_dreg"
> > -  [(set (match_operand:VSTRUCT_3DX 0 "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_3DX 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_3DX [
> >  		(match_operand:VSTRUCT_3DX 1 "register_operand" "w")]
> >  		UNSPEC_ST3))]
> > @@ -7881,7 +7881,7 @@ (define_insn "aarch64_st3<vstruct_elt>_dreg"
> >  )
> >
> >  (define_insn "aarch64_st4<vstruct_elt>_dreg"
> > -  [(set (match_operand:VSTRUCT_4DNX 0
> "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_4DNX 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_4DNX [
> >  		(match_operand:VSTRUCT_4DNX 1 "register_operand" "w")]
> >  		UNSPEC_ST4))]
> > @@ -7891,7 +7891,7 @@ (define_insn "aarch64_st4<vstruct_elt>_dreg"
> >  )
> >
> >  (define_insn "aarch64_st4<vstruct_elt>_dreg"
> > -  [(set (match_operand:VSTRUCT_4DX 0 "aarch64_simd_struct_operand"
> > "=Utv")
> > +  [(set (match_operand:VSTRUCT_4DX 0 "memory_operand" "=Utv")
> >  	(unspec:VSTRUCT_4DX [
> >  		(match_operand:VSTRUCT_4DX 1 "register_operand" "w")]
> >  		UNSPEC_ST4))]
> > @@ -7974,7 +7974,7 @@ (define_expand "vec_init<mode><Vhalf>"
> >  (define_insn "*aarch64_simd_ld1r<mode>"
> >    [(set (match_operand:VALL_F16 0 "register_operand" "=w")
> >  	(vec_duplicate:VALL_F16
> > -	  (match_operand:<VEL> 1 "aarch64_simd_struct_operand" "Utv")))]
> > +	  (match_operand:<VEL> 1 "memory_operand" "Utv")))]
> >    "TARGET_SIMD"
> >    "ld1r\\t{%0.<Vtype>}, %1"
> >    [(set_attr "type" "neon_load1_all_lanes")] @@ -7983,7 +7983,7 @@
> > (define_insn "*aarch64_simd_ld1r<mode>"
> >  (define_insn "aarch64_simd_ld1<vstruct_elt>_x2"
> >    [(set (match_operand:VSTRUCT_2QD 0 "register_operand" "=w")
> >  	(unspec:VSTRUCT_2QD [
> > -	    (match_operand:VSTRUCT_2QD 1 "aarch64_simd_struct_operand"
> "Utv")]
> > +	    (match_operand:VSTRUCT_2QD 1 "memory_operand" "Utv")]
> >  	    UNSPEC_LD1))]
> >    "TARGET_SIMD"
> >    "ld1\\t{%S0.<Vtype> - %T0.<Vtype>}, %1"
> > diff --git a/gcc/config/aarch64/predicates.md
> > b/gcc/config/aarch64/predicates.md
> > index
> >
> c308015ac2c13d24cd6bcec71247ec45df8cf5e6..6b70a364530c8108457091bfec
> 12
> > fe549f722149 100644
> > --- a/gcc/config/aarch64/predicates.md
> > +++ b/gcc/config/aarch64/predicates.md
> > @@ -494,10 +494,6 @@ (define_predicate
> "aarch64_simd_reg_or_minus_one"
> >    (ior (match_operand 0 "register_operand")
> >         (match_operand 0 "aarch64_simd_imm_minus_one")))
> >
> > -(define_predicate "aarch64_simd_struct_operand"
> > -  (and (match_code "mem")
> > -       (match_test "TARGET_SIMD && aarch64_simd_mem_operand_p
> (op)")))
> > -
> >  ;; Like general_operand but allow only valid SIMD addressing modes.
> >  (define_predicate "aarch64_simd_general_operand"
> >    (and (match_operand 0 "general_operand") diff --git
> > a/gcc/testsuite/gcc.target/aarch64/vld1r.c
> > b/gcc/testsuite/gcc.target/aarch64/vld1r.c
> > new file mode 100644
> > index
> >
> 0000000000000000000000000000000000000000..72c505c403e9e239771379b7ca
> dd
> > 8a9473f06113
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/vld1r.c
> > @@ -0,0 +1,26 @@
> > +/* { dg-do compile } */
> > +/* { dg-additional-options "-O" } */
> > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } }
> > +} */
> > +
> > +#include <arm_neon.h>
> > +
> > +/*
> > +** f1:
> > +** 	add	x0, x0, 1
> > +** 	ld1r	{v0.8b}, \[x0\]
> > +** 	ret
> > +*/
> > +uint8x8_t f1(const uint8_t *in) {
> > +    return vld1_dup_u8(&in[1]);
> > +}
> > +
> > +/*
> > +** f2:
> > +** 	ldr	s1, \[x0, 4\]
> > +** 	fmla	v0.4s, v0.4s, v1.s\[0\]
> > +** 	ret
> > +*/
> > +float32x4_t f2(const float32_t *in, float32x4_t a) {
> > +    float32x4_t dup = vld1q_dup_f32(&in[1]);
> > +    return vfmaq_laneq_f32 (a, a, dup, 1); }


Thread overview: 12+ messages
2022-06-08  9:38 Tamar Christina
2022-06-08 10:31 ` Richard Sandiford
2022-06-08 13:51   ` Tamar Christina [this message]
2022-06-08 14:35     ` Richard Sandiford
2022-06-09  7:42       ` Tamar Christina
2022-06-09  8:22         ` Richard Sandiford
2022-06-09  8:43           ` Tamar Christina
2022-06-13  8:00       ` Richard Biener
2022-06-13  8:26         ` Richard Sandiford
2022-06-13  8:38           ` Richard Biener
2022-06-13  9:51             ` Tamar Christina
2022-06-13 11:50               ` Richard Biener
