public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Sandiford <richard.sandiford@arm.com>
To: Tamar Christina <Tamar.Christina@arm.com>
Cc: "gcc-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	 nd <nd@arm.com>,  Richard Earnshaw <Richard.Earnshaw@arm.com>,
	 Marcus Shawcroft <Marcus.Shawcroft@arm.com>,
	 "ktkachov\@gcc.gnu.org" <ktkachov@gcc.gnu.org>
Subject: Re: [PATCH 3/4]AArch64: add new alternative with early clobber to patterns
Date: Wed, 22 May 2024 12:24:12 +0100	[thread overview]
Message-ID: <mptr0dujdib.fsf@arm.com> (raw)
In-Reply-To: <VI1PR08MB532599F27C034913034459B6FFEB2@VI1PR08MB5325.eurprd08.prod.outlook.com> (Tamar Christina's message of "Wed, 22 May 2024 11:00:00 +0000")

Tamar Christina <Tamar.Christina@arm.com> writes:
>> -----Original Message-----
>> From: Richard Sandiford <richard.sandiford@arm.com>
>> Sent: Wednesday, May 22, 2024 10:48 AM
>> To: Tamar Christina <Tamar.Christina@arm.com>
>> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>; Richard Earnshaw
>> <Richard.Earnshaw@arm.com>; Marcus Shawcroft
>> <Marcus.Shawcroft@arm.com>; ktkachov@gcc.gnu.org
>> Subject: Re: [PATCH 3/4]AArch64: add new alternative with early clobber to
>> patterns
>> 
>> Tamar Christina <tamar.christina@arm.com> writes:
>> > Hi All,
>> >
>> > This patch adds new alternatives to the patterns which are affected.  The new
>> > alternatives with the conditional early clobbers are added before the normal
>> > ones in order for LRA to prefer them in the event that we have enough free
>> > registers to accommodate them.
>> >
>> > In case register pressure is too high the normal alternatives will be preferred
>> > before a reload is considered as we rather have the tie than a spill.
>> >
>> > Tests are in the next patch.
>> >
>> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>> >
>> > Ok for master?
>> >
>> > Thanks,
>> > Tamar
>> >
>> > gcc/ChangeLog:
>> >
>> > 	* config/aarch64/aarch64-sve.md (and<mode>3,
>> > 	@aarch64_pred_<optab><mode>_z, *<optab><mode>3_cc,
>> > 	*<optab><mode>3_ptest, aarch64_pred_<nlogical><mode>_z,
>> > 	*<nlogical><mode>3_cc, *<nlogical><mode>3_ptest,
>> > 	aarch64_pred_<logical_nn><mode>_z, *<logical_nn><mode>3_cc,
>> > 	*<logical_nn><mode>3_ptest, @aarch64_pred_cmp<cmp_op><mode>,
>> > 	*cmp<cmp_op><mode>_cc, *cmp<cmp_op><mode>_ptest,
>> > 	@aarch64_pred_cmp<cmp_op><mode>_wide,
>> > 	*aarch64_pred_cmp<cmp_op><mode>_wide_cc,
>> > 	*aarch64_pred_cmp<cmp_op><mode>_wide_ptest,
>> @aarch64_brk<brk_op>,
>> > 	*aarch64_brk<brk_op>_cc, *aarch64_brk<brk_op>_ptest,
>> > 	@aarch64_brk<brk_op>, *aarch64_brkn_cc, *aarch64_brkn_ptest,
>> > 	*aarch64_brk<brk_op>_cc, *aarch64_brk<brk_op>_ptest,
>> > 	aarch64_rdffr_z, *aarch64_rdffr_z_ptest, *aarch64_rdffr_ptest,
>> > 	*aarch64_rdffr_z_cc, *aarch64_rdffr_cc): Add new early clobber
>> > 	alternative.
>> > 	* config/aarch64/aarch64-sve2.md
>> > 	(@aarch64_pred_<sve_int_op><mode>): Likewise.
>> >
>> > ---
>> > diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-
>> sve.md
>> > index
>> e3085c0c636f1317409bbf3b5fbaf5342a2df1f6..8fdc1bc3cd43acfcd675a18350c
>> 297428c85fe46 100644
>> > --- a/gcc/config/aarch64/aarch64-sve.md
>> > +++ b/gcc/config/aarch64/aarch64-sve.md
>> > @@ -1161,8 +1161,10 @@ (define_insn "aarch64_rdffr_z"
>> >  	  (reg:VNx16BI FFRT_REGNUM)
>> >  	  (match_operand:VNx16BI 1 "register_operand")))]
>> >    "TARGET_SVE && TARGET_NON_STREAMING"
>> > -  {@ [ cons: =0, 1   ]
>> > -     [ Upa     , Upa ] rdffr\t%0.b, %1/z
>> > +  {@ [ cons: =0, 1  ; attrs: pred_clobber ]
>> > +     [ &Upa    , Upa; yes                 ] rdffr\t%0.b, %1/z
>> > +     [ ?Upa    , Upa; yes                 ] ^
>> > +     [ Upa     , Upa; *                   ] ^
>> >    }
>> >  )
>> 
>> Sorry for not explaining it very well, but in the previous review I suggested:
>> 
>> > The gather-like approach would be something like:
>> >
>> >      [ &Upa     , Upl , w , <sve_imm_con>; yes                 ]
>> cmp<cmp_op>\t%0.<Vetype>, %1/z, %3.<Vetype>, #%4
>> >      [ ?Upl     , 0   , w , <sve_imm_con>; yes                 ] ^
>> >      [ Upa      , Upl , w , <sve_imm_con>; no                  ] ^
>> >      [ &Upa     , Upl , w , w            ; yes                 ] cmp<cmp_op>\t%0.<Vetype>, %1/z,
>> %3.<Vetype>, %4.<Vetype>
>> >      [ ?Upl     , 0   , w , w            ; yes                 ] ^
>> >      [ Upa      , Upl , w , w            ; no                  ] ^
>> >
>> > with:
>> >
>> >   (define_attr "pred_clobber" "any,no,yes" (const_string "any"))
>> 
>> (with emphasis on the last line).  What I didn't say explicitly is
>> that "no" should require !TARGET_SVE_PRED_CLOBBER.
>> 
>> The premise of that review was that we shouldn't enable things like:
>> 
>>      [ Upa      , Upl , w , w            ; no                  ] ^
>> 
>> for TARGET_SVE_PRED_CLOBBER since it contradicts the earlyclobber
>> alternative.  So we should enable either the pred_clobber=yes
>> alternatives or the pred_clobber=no alternatives, but not both.
>> 
>> The default "any" is then for other non-predicate instructions that
>> don't care about TARGET_SVE_PRED_CLOBBER either way.
>> 
>> In contrast, this patch makes pred_clobber=yes enable the alternatives
>> that correctly describe the restriction (good!) but then also enables
>> the normal alternatives too, which IMO makes the semantics unclear.
>
> Sure, the reason I still had that is because this ICEs under high register
> pressure:
>
>   {@ [ cons: =0 , 1   , 3 , 4            ; attrs: pred_clobber ]
>      [ &Upa     , Upl , w , <sve_imm_con>; yes                 ] cmp<cmp_op>\t%0.<Vetype>, %1/z, %3.<Vetype>, #%4
>      [ ?Upa     , 0   , w , <sve_imm_con>; yes                 ] ^
>      [ Upa      , Upl , w , <sve_imm_con>; no                  ] ^
>      [ &Upa     , Upl , w , w            ; yes                 ] cmp<cmp_op>\t%0.<Vetype>, %1/z, %3.<Vetype>, %4.<Vetype>
>      [ ?Upa     , 0   , w , w            ; yes                 ] ^
>      [ Upa      , Upl , w , w            ; no                  ] ^
>   }
>
> So now in the `yes` case reload does:
>
>          Considering alt=0 of insn 10:   (0) =&Upa  (1) Upl  (3) w  (4) vsd
>             0 Small class reload: reject+=3
>             0 Non input pseudo reload: reject++
>             0 Early clobber: reject++
>             Bad operand -- refuse
>          Considering alt=1 of insn 10:   (0) ?Upa  (1) 0  (3) w  (4) vsd
>             Staticly defined alt reject+=6
>             0 Small class reload: reject+=3
>             0 Non input pseudo reload: reject++
>             1 Dying matched operand reload: reject++
>             1 Small class reload: reject+=3
>             Bad operand -- refuse
>          Considering alt=3 of insn 10:   (0) &Upa  (1) Upl  (3) w  (4) w
>             0 Small class reload: reject+=3
>             0 Non input pseudo reload: reject++
>             0 Early clobber: reject++
>           overall=11,losers=1,rld_nregs=1
>          Considering alt=4 of insn 10:   (0) ?Upa  (1) 0  (3) w  (4) w
>             Staticly defined alt reject+=6
>             0 Small class reload: reject+=3
>             0 Non input pseudo reload: reject++
>             overall=16,losers=1 -- refuse
>       Choosing alt 3 in insn 10:  (0) &Upa  (1) Upl  (3) w  (4) w {aarch64_pred_cmplovnx8hi}
>
> And the penalty of alt=4 makes it pick alt=3 even though it doesn't have the free registers
> for it. alt=4 would have worked.

By "high register pressure", do you mean if predicate registers are
disabled using -ffixed?  If so, that's ok in itself.  That kind of
ICE shouldn't occur in real use.

> I believe this now follows exactly what was suggested:
>
> 1. provide an early clobber alternative
> 2. provide an explicit tie alternative with an increase in cost for using it
> 3. provide a general/normal alternative that is only enabled when the first two aren't.
>
> Having read the email a number of times.. did I somehow miss something?

But how is (3) arranged?  It looks like the normal alternative is enabled
unconditionally, in the sense that the "enabled" attribute is always "yes".

Thanks,
Richard

  reply	other threads:[~2024-05-22 11:24 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-22  9:29 Tamar Christina
2024-05-22  9:47 ` Richard Sandiford
2024-05-22 11:00   ` Tamar Christina
2024-05-22 11:24     ` Richard Sandiford [this message]
2024-05-28  9:38       ` Tamar Christina
2024-05-30 20:12         ` Richard Sandiford
  -- strict thread matches above, loose matches on Subject: below --
2024-05-15 10:28 [PATCH 0/4]AArch64: support conditional early clobbers on certain operations Tamar Christina
2024-05-15 10:29 ` [PATCH 3/4]AArch64: add new alternative with early clobber to patterns Tamar Christina

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=mptr0dujdib.fsf@arm.com \
    --to=richard.sandiford@arm.com \
    --cc=Marcus.Shawcroft@arm.com \
    --cc=Richard.Earnshaw@arm.com \
    --cc=Tamar.Christina@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=ktkachov@gcc.gnu.org \
    --cc=nd@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).