From: Richard Sandiford
To: Tamar Christina
Cc: gcc-patches@gcc.gnu.org, nd, Richard Earnshaw, Marcus Shawcroft, ktkachov@gcc.gnu.org
Subject: Re: [PATCH 2/4]AArch64: add new tuning param and attribute for
 enabling conditional early clobber
Date: Thu, 30 May 2024 15:59:04 +0100
In-Reply-To: (Tamar Christina's message of "Tue, 28 May 2024 10:37:33 +0100")

Tamar Christina writes:
>> -----Original Message-----
>> From: Tamar Christina
>> Sent: Wednesday, May 22, 2024 10:29 AM
>> To: Richard Sandiford
>> Cc: gcc-patches@gcc.gnu.org; nd; Richard Earnshaw; Marcus Shawcroft;
>> ktkachov@gcc.gnu.org
>> Subject: RE: [PATCH 2/4]AArch64: add new tuning param and attribute for
>> enabling conditional early clobber
>>
>> >
>> > Sorry for the bike-shedding, but how about something like "avoid_pred_rmw"?
>> > (I'm open to other suggestions.) Just looking for something that describes
>> > either the architecture or the end result that we want to achieve.
>> > And preferable something fairly short :)
>> >
>> > avoid_* would be consistent with the existing "avoid_cross_loop_fma".
>> >
>> > > +
>> > >  #undef AARCH64_EXTRA_TUNING_OPTION
>> > > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
>> > > index bbf11faaf4b4340956094a983f8b0dc2649b2d27..76a18dd511f40ebb58ed12d56b46c74084ba7c3c 100644
>> > > --- a/gcc/config/aarch64/aarch64.h
>> > > +++ b/gcc/config/aarch64/aarch64.h
>> > > @@ -495,6 +495,11 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF;
>> > >     enabled through +gcs.  */
>> > >  #define TARGET_GCS (AARCH64_ISA_GCS)
>> > >
>> > > +/* Prefer different predicate registers for the output of a predicated operation over
>> > > +   re-using an existing input predicate.  */
>> > > +#define TARGET_SVE_PRED_CLOBBER (TARGET_SVE \
>> > > +                                 && (aarch64_tune_params.extra_tuning_flags \
>> > > +                                     & AARCH64_EXTRA_TUNE_EARLY_CLOBBER_SVE_PRED_DEST))
>> > >
>> > >  /* Standard register usage.  */
>> > >
>> > > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> > > index dbde066f7478bec51a8703b017ea553aa98be309..1ecd1a2812969504bd5114a53473b478c5ddba82 100644
>> > > --- a/gcc/config/aarch64/aarch64.md
>> > > +++ b/gcc/config/aarch64/aarch64.md
>> > > @@ -445,6 +445,10 @@ (define_enum_attr "arch" "arches" (const_string "any"))
>> > >  ;; target-independent code.
>> > >  (define_attr "is_call" "no,yes" (const_string "no"))
>> > >
>> > > +;; Indicates whether we want to enable the pattern with an optional early
>> > > +;; clobber for SVE predicates.
>> > > +(define_attr "pred_clobber" "no,yes" (const_string "no"))
>> > > +
>> > >  ;; [For compatibility with Arm in pipeline models]
>> > >  ;; Attribute that specifies whether or not the instruction touches fp
>> > >  ;; registers.
>> > > @@ -461,7 +465,8 @@ (define_attr "fp" "no,yes"
>> > >  (define_attr "arch_enabled" "no,yes"
>> > >    (if_then_else
>> > >      (ior
>> > > -        (eq_attr "arch" "any")
>> > > +        (and (eq_attr "arch" "any")
>> > > +             (eq_attr "pred_clobber" "no"))
>> > >
>> > >          (and (eq_attr "arch" "rcpc8_4")
>> > >               (match_test "AARCH64_ISA_RCPC8_4"))
>> > > @@ -488,7 +493,10 @@ (define_attr "arch_enabled" "no,yes"
>> > >               (match_test "TARGET_SVE"))
>> > >
>> > >          (and (eq_attr "arch" "sme")
>> > > -             (match_test "TARGET_SME")))
>> > > +             (match_test "TARGET_SME"))
>> > > +
>> > > +        (and (eq_attr "pred_clobber" "yes")
>> > > +             (match_test "TARGET_SVE_PRED_CLOBBER")))
>> >
>> > IMO it'd be better to handle pred_clobber separately from arch, as a new
>> > top-level AND:
>> >
>> >   (and
>> >     (ior
>> >       (eq_attr "pred_clobber" "no")
>> >       (match_test "!TARGET_..."))
>> >     (ior
>> >       ...existing arch tests...))
>> >
>> >
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>         * config/aarch64/aarch64-tuning-flags.def
>         (AVOID_PRED_RMW): New.
>         * config/aarch64/aarch64.h (TARGET_SVE_PRED_CLOBBER): New.
>         * config/aarch64/aarch64.md (pred_clobber): New.
>         (arch_enabled): Use it.
>
> -- inline copy of patch --
>
> diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def b/gcc/config/aarch64/aarch64-tuning-flags.def
> index d5bcaebce770f0b217aac783063d39135f754c77..a9f48f5d3d4ea32fbf53086ba21eab4bc65b6dcb 100644
> --- a/gcc/config/aarch64/aarch64-tuning-flags.def
> +++ b/gcc/config/aarch64/aarch64-tuning-flags.def
> @@ -48,4 +48,8 @@ AARCH64_EXTRA_TUNING_OPTION ("avoid_cross_loop_fma", AVOID_CROSS_LOOP_FMA)
>
>  AARCH64_EXTRA_TUNING_OPTION ("fully_pipelined_fma", FULLY_PIPELINED_FMA)
>
> +/* Enable is the target prefers to use a fresh register for predicate outputs
> +   rather than re-use an input predicate register.  */
> +AARCH64_EXTRA_TUNING_OPTION ("avoid_pred_rmw", AVOID_PRED_RMW)
> +
>  #undef AARCH64_EXTRA_TUNING_OPTION
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index bbf11faaf4b4340956094a983f8b0dc2649b2d27..e7669e65d7dae5df2ba42c265079b1856a5c382b 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -495,6 +495,11 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE = AARCH64_FL_SM_OFF;
>     enabled through +gcs.  */
>  #define TARGET_GCS (AARCH64_ISA_GCS)
>
> +/* Prefer different predicate registers for the output of a predicated operation over
> +   re-using an existing input predicate.  */

Formatting nit (sorry for not noticing last time):

/* Prefer different predicate registers for the output of a predicated
   operation over re-using an existing input predicate.  */

(avoiding an extra space after "/*" and wrapping at 80 columns).

OK with that change, thanks.

Richard

> +#define TARGET_SVE_PRED_CLOBBER (TARGET_SVE \
> +                                 && (aarch64_tune_params.extra_tuning_flags \
> +                                     & AARCH64_EXTRA_TUNE_AVOID_PRED_RMW))
>
>  /* Standard register usage.  */
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index dbde066f7478bec51a8703b017ea553aa98be309..a7da3c01617eb8411029c7d2e32f13fa2cc1c833 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -445,6 +445,10 @@ (define_enum_attr "arch" "arches" (const_string "any"))
>  ;; target-independent code.
>  (define_attr "is_call" "no,yes" (const_string "no"))
>
> +;; Indicates whether we want to enable the pattern with an optional early
> +;; clobber for SVE predicates.
> +(define_attr "pred_clobber" "any,no,yes" (const_string "any"))
> +
>  ;; [For compatibility with Arm in pipeline models]
>  ;; Attribute that specifies whether or not the instruction touches fp
>  ;; registers.
> @@ -460,7 +464,17 @@ (define_attr "fp" "no,yes"
>
>  (define_attr "arch_enabled" "no,yes"
>    (if_then_else
> -    (ior
> +    (and
> +      (ior
> +        (and
> +          (eq_attr "pred_clobber" "no")
> +          (match_test "!TARGET_SVE_PRED_CLOBBER"))
> +        (and
> +          (eq_attr "pred_clobber" "yes")
> +          (match_test "TARGET_SVE_PRED_CLOBBER"))
> +        (eq_attr "pred_clobber" "any"))
> +
> +      (ior
>          (eq_attr "arch" "any")
>
>          (and (eq_attr "arch" "rcpc8_4")
> @@ -488,7 +502,7 @@ (define_attr "arch_enabled" "no,yes"
>               (match_test "TARGET_SVE"))
>
>          (and (eq_attr "arch" "sme")
> -             (match_test "TARGET_SME")))
> +             (match_test "TARGET_SME"))))
>      (const_string "yes")
>      (const_string "no")))
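
For anyone reading the final arch_enabled hunk, the enable condition can be modelled as a small truth table. This is only a sketch in plain C, not GCC code: the enum, the function names, the bit value chosen for the tuning flag, and the reduction of all the existing arch tests to a single arch_ok boolean are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for AARCH64_EXTRA_TUNE_AVOID_PRED_RMW; the bit
   position is an assumption made for this sketch only.  */
#define EXTRA_TUNE_AVOID_PRED_RMW (1u << 0)

/* Models the three values of the "pred_clobber" attribute.  */
enum pred_clobber { PRED_CLOBBER_ANY, PRED_CLOBBER_NO, PRED_CLOBBER_YES };

/* Same shape as TARGET_SVE_PRED_CLOBBER: SVE must be available and the
   tuning flag must be set.  */
static bool
target_sve_pred_clobber (bool have_sve, unsigned int extra_tuning_flags)
{
  return have_sve && (extra_tuning_flags & EXTRA_TUNE_AVOID_PRED_RMW) != 0;
}

/* Same shape as the new top-level AND in "arch_enabled": the pred_clobber
   test and the arch test must both pass.  ARCH_OK stands for the result of
   the existing (ior ...) over the "arch" attribute.  */
static bool
arch_enabled (enum pred_clobber attr, bool pred_clobber_tuning, bool arch_ok)
{
  bool pred_ok = (attr == PRED_CLOBBER_NO && !pred_clobber_tuning)
                 || (attr == PRED_CLOBBER_YES && pred_clobber_tuning)
                 || attr == PRED_CLOBBER_ANY;
  return pred_ok && arch_ok;
}

/* Walks the cases that matter for alternative selection.  */
static void
check_truth_table (void)
{
  /* "any" alternatives ignore the tuning flag.  */
  assert (arch_enabled (PRED_CLOBBER_ANY, false, true));
  assert (arch_enabled (PRED_CLOBBER_ANY, true, true));
  /* "no" alternatives are only live when the tuning is off.  */
  assert (arch_enabled (PRED_CLOBBER_NO, false, true));
  assert (!arch_enabled (PRED_CLOBBER_NO, true, true));
  /* "yes" alternatives are only live when the tuning is on.  */
  assert (arch_enabled (PRED_CLOBBER_YES, true, true));
  assert (!arch_enabled (PRED_CLOBBER_YES, false, true));
  /* A failing arch test disables the alternative regardless.  */
  assert (!arch_enabled (PRED_CLOBBER_ANY, true, false));
}
```

Making "any" the default means patterns that never mention pred_clobber behave exactly as before, which appears to be why the attribute gained a third value relative to the first version of the patch.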