public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/8] aarch64: Add new flags for existing features
@ 2024-10-04 17:50 Andrew Carlotti
  2024-10-04 17:51 ` [PATCH 1/8] aarch64: Use PAUTH instead of V8_3A in some places Andrew Carlotti
                   ` (11 more replies)
  0 siblings, 12 replies; 25+ messages in thread
From: Andrew Carlotti @ 2024-10-04 17:50 UTC
  To: gcc-patches; +Cc: Richard Sandiford

This patch series adds seven new flags for features that were previously
available in GCC only as part of an architecture version.  It also fixes one
other place where a check used an architecture version instead of a feature
flag.

Bootstrapped and regression tested as a whole on aarch64.  I additionally ran
the cpunative tests after each patch in the series.  Ok for master?
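
To illustrate the difference between checking an architecture version and
checking a feature flag, here is a minimal sketch in C.  The bitmask layout and
every name in it are hypothetical and exist only for this example; it is not
how GCC represents ISA flags.  It models the kind of check that patch 1 changes
from !TARGET_ARMV8_3 to !TARGET_PAUTH.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch only; not GCC's representation.  */
enum
{
  FL_V8A = 1u << 0,
  FL_V8_3A = 1u << 1,
  FL_PAUTH = 1u << 2
};

/* Model of -march=armv8.3-a: the architecture version implies the feature.  */
const uint32_t isa_armv8_3_a = FL_V8A | FL_V8_3A | FL_PAUTH;

/* Model of -march=armv8-a+pauth: the feature without the version.  */
const uint32_t isa_armv8_a_pauth = FL_V8A | FL_PAUTH;

/* Version-based check: only true when Armv8.3-A itself is selected.  */
bool
retaa_available_by_version (uint32_t isa)
{
  return (isa & FL_V8_3A) != 0;
}

/* Feature-based check: true whenever the feature bit is set, however it
   came to be enabled.  */
bool
retaa_available_by_feature (uint32_t isa)
{
  return (isa & FL_PAUTH) != 0;
}
```

In this model the two checks agree for armv8.3-a, but only the feature-based
check reports the instruction as available for armv8-a+pauth.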


* [PATCH 1/8] aarch64: Use PAUTH instead of V8_3A in some places
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
@ 2024-10-04 17:51 ` Andrew Carlotti
  2024-10-08 15:46   ` Richard Sandiford
  2024-10-04 17:52 ` [PATCH 2/8] aarch64: Add new +fcma flag Andrew Carlotti
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Andrew Carlotti @ 2024-10-04 17:51 UTC
  To: gcc-patches; +Cc: Richard Sandiford

gcc/ChangeLog:

	* config/aarch64/aarch64.cc
	(aarch64_expand_epilogue): Use TARGET_PAUTH.
	* config/aarch64/aarch64.md: Update comment.


diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index e7bb3278a27eca44c46afd26069d608218198a54..cf1107127fd5d9e12ad42441528666bf6b733f73 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -10042,12 +10042,12 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall)
 	1) Sibcalls don't return in a normal way, so if we're about to call one
 	   we must authenticate.
 
-	2) The RETAA instruction is not available before ARMv8.3-A, so if we are
-	   generating code for !TARGET_ARMV8_3 we can't use it and must
+	2) The RETAA instruction is not available without FEAT_PAuth, so if we
+	   are generating code for !TARGET_PAUTH we can't use it and must
 	   explicitly authenticate.
     */
   if (aarch64_return_address_signing_enabled ()
-      && (sibcall || !TARGET_ARMV8_3))
+      && (sibcall || !TARGET_PAUTH))
     {
       switch (aarch64_ra_sign_key)
 	{
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index c54b29cd64b9e0dc6c6d12735049386ccedc5408..0940a84f9295ee2bc07282b150095fdb5af11a4d 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -7672,10 +7672,10 @@
 )
 
 ;; Pointer authentication patterns are always provided.  In architecture
-;; revisions prior to ARMv8.3-A these HINT instructions operate as NOPs.
+;; revisions prior to FEAT_PAuth these HINT instructions operate as NOPs.
 ;; This lets the user write portable software which authenticates pointers
-;; when run on something which implements ARMv8.3-A, and which runs
-;; correctly, but does not authenticate pointers, where ARMv8.3-A is not
+;; when run on something which implements FEAT_PAuth, and which runs
+;; correctly, but does not authenticate pointers, where FEAT_PAuth is not
 ;; implemented.
 
 ;; Signing/Authenticating R30 using SP as the salt.


* [PATCH 2/8] aarch64: Add new +fcma flag
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
  2024-10-04 17:51 ` [PATCH 1/8] aarch64: Use PAUTH instead of V8_3A in some places Andrew Carlotti
@ 2024-10-04 17:52 ` Andrew Carlotti
  2024-10-08 16:18   ` Richard Sandiford
  2024-10-04 17:53 ` [PATCH 3/8] aarch64: Add new +jscvt flag Andrew Carlotti
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 25+ messages in thread
From: Andrew Carlotti @ 2024-10-04 17:52 UTC
  To: gcc-patches; +Cc: Richard Sandiford

This makes +fcma a dependency of +sve, which means that the fcma
intrinsics can finally be supported on a64fx.

Also add fcma to the Features list in several cpunative testcases that
incorrectly included sve without fcma.
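
As a rough illustration of what the new dependency means, the sketch below
models "enabling a flag also enables everything it requires".  The dependency
pairs mirror the fcma and sve entries in this patch; the bit values, the table
layout and the function name are assumptions made for this example and are not
GCC's implementation.  Dependencies of simd and fp16 themselves are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignments for this sketch only.  */
enum
{
  EXT_SIMD = 1u << 0,
  EXT_F16 = 1u << 1,
  EXT_FCMA = 1u << 2,
  EXT_SVE = 1u << 3
};

typedef struct
{
  uint32_t flag;  /* The extension being described.  */
  uint32_t needs; /* Extensions it requires directly.  */
} ext_dep;

/* Direct dependencies, mirroring the entries in this patch.  */
static const ext_dep deps[] = {
  { EXT_FCMA, EXT_SIMD },
  { EXT_SVE, EXT_SIMD | EXT_F16 | EXT_FCMA },
};

/* Return REQUESTED plus everything it requires, directly or indirectly.
   Iterates to a fixed point, so the order of the table does not matter.  */
uint32_t
enable_with_deps (uint32_t requested)
{
  uint32_t enabled = requested;
  int changed;
  do
    {
      changed = 0;
      for (size_t i = 0; i < sizeof deps / sizeof deps[0]; i++)
	if ((enabled & deps[i].flag)
	    && (enabled & deps[i].needs) != deps[i].needs)
	  {
	    enabled |= deps[i].needs;
	    changed = 1;
	  }
    }
  while (changed);
  return enabled;
}
```

Under this model, requesting only EXT_SVE yields a set that also contains
EXT_FCMA, which matches the cpunative Features lists now naming fcma wherever
sve appears.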

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_3A): Add FCMA.
	* config/aarch64/aarch64-option-extensions.def (FCMA): New flag.
	(SVE): Add FCMA dependency.
	* config/aarch64/aarch64.h (TARGET_COMPLEX): Use new flag.
	* config/aarch64/arm_neon.h: Use new flag for fcma intrinsics.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/info_15: Add fcma to Features.
	* gcc.target/aarch64/cpunative/info_16: Ditto.
	* gcc.target/aarch64/cpunative/info_17: Ditto.
	* gcc.target/aarch64/cpunative/info_8: Ditto.
	* gcc.target/aarch64/cpunative/info_9: Ditto.


diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def
index 4634b272e28006b5c6c2d6705a2f1010cbd9ab9b..fadf9c36b03865a3af9b25888a50f5bf3abe37b7 100644
--- a/gcc/config/aarch64/aarch64-arches.def
+++ b/gcc/config/aarch64/aarch64-arches.def
@@ -33,7 +33,7 @@
 AARCH64_ARCH("armv8-a",       generic_armv8_a,   V8A,       8,  (SIMD))
 AARCH64_ARCH("armv8.1-a",     generic_armv8_a,   V8_1A,     8,  (V8A, LSE, CRC, RDMA))
 AARCH64_ARCH("armv8.2-a",     generic_armv8_a,   V8_2A,     8,  (V8_1A))
-AARCH64_ARCH("armv8.3-a",     generic_armv8_a,   V8_3A,     8,  (V8_2A, PAUTH, RCPC))
+AARCH64_ARCH("armv8.3-a",     generic_armv8_a,   V8_3A,     8,  (V8_2A, PAUTH, RCPC, FCMA))
 AARCH64_ARCH("armv8.4-a",     generic_armv8_a,   V8_4A,     8,  (V8_3A, F16FML, DOTPROD, FLAGM))
 AARCH64_ARCH("armv8.5-a",     generic_armv8_a,   V8_5A,     8,  (V8_4A, SB, SSBS, PREDRES))
 AARCH64_ARCH("armv8.6-a",     generic_armv8_a,   V8_6A,     8,  (V8_5A, I8MM, BF16))
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index 8279f5a76eae7d787b8126044c5b4b4b78e97324..12640ed970d0475b9e28f1c4f1c6295e88e1ab97 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -151,6 +151,8 @@ AARCH64_OPT_EXTENSION("fp16fml", F16FML, (), (F16), (), "asimdfhm")
 
 AARCH64_FMV_FEATURE("fp16fml", FP16FML, (F16FML))
 
+AARCH64_OPT_FMV_EXTENSION("fcma", FCMA, (SIMD), (), (), "fcma")
+
 AARCH64_OPT_FMV_EXTENSION("rcpc", RCPC, (), (), (), "lrcpc")
 
 AARCH64_OPT_FMV_EXTENSION("rcpc3", RCPC3, (RCPC), (), (), "lrcpc3")
@@ -163,7 +165,7 @@ AARCH64_OPT_FMV_EXTENSION("bf16", BF16, (FP), (SIMD), (), "bf16")
 
 AARCH64_FMV_FEATURE("rpres", RPRES, ())
 
-AARCH64_OPT_FMV_EXTENSION("sve", SVE, (SIMD, F16), (), (), "sve")
+AARCH64_OPT_FMV_EXTENSION("sve", SVE, (SIMD, F16, FCMA), (), (), "sve")
 
 AARCH64_OPT_EXTENSION("f32mm", F32MM, (SVE), (), (), "f32mm")
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 030cffb17606c1062af62398dd631bae50b448af..0c3d7baf7c85e54f7dd63fedb2da80d654c9ea50 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -364,7 +364,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
 #define TARGET_JSCVT	(TARGET_FLOAT && TARGET_ARMV8_3)
 
 /* Armv8.3-a Complex number extension to AdvSIMD extensions.  */
-#define TARGET_COMPLEX (TARGET_SIMD && TARGET_ARMV8_3)
+#define TARGET_COMPLEX AARCH64_HAVE_ISA (FCMA)
 
 /* Floating-point rounding instructions from Armv8.5-a.  */
 #define TARGET_FRINT (AARCH64_HAVE_ISA (V8_5A) && TARGET_FLOAT)
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index e376685489da055029def6b661132b5154886b57..0ab511a884126821ecae7d2fc7c1a3427bdfe5ac 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -27015,7 +27015,7 @@ vbcaxq_s64 (int64x2_t __a, int64x2_t __b, int64x2_t __c)
 /* AdvSIMD Complex numbers intrinsics.  */
 
 #pragma GCC push_options
-#pragma GCC target ("arch=armv8.3-a")
+#pragma GCC target ("+nothing+fcma")
 
 #pragma GCC push_options
 #pragma GCC target ("+fp16")
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_15 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_15
index 6b425ea201351247c7273718d9e1e52cae62b342..1a31a75d6b4842846ad6d9476df23aae5ef72f83 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/info_15
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_15
@@ -1,6 +1,6 @@
 processor	: 0
 BogoMIPS	: 100.00
-Features	: Lorem ipsum dolor sit ametd rebum expetendis per at Dolor lucilius referrentur ei mei virtute eruditi eum ne Iisque verter svesm4 asimd fp sve sve2 fphp asimdhp sm3 sm4
+Features	: Lorem ipsum dolor sit ametd rebum expetendis per at Dolor lucilius referrentur ei mei virtute eruditi eum ne Iisque verter svesm4 asimd fp sve sve2 fphp asimdhp sm3 sm4 fcma
 CPU implementer	: 0x41
 CPU architecture: 8
 CPU variant	: 0x0
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_16 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_16
index 26f01c4962489ab116450dd55717e4db345fdaee..cdff314be73842b434fe39ecaf5bddbb778320ce 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/info_16
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_16
@@ -1,6 +1,6 @@
 processor	: 0
 BogoMIPS	: 100.00
-Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp
+Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma
 CPU implementer	: 0xfe
 CPU architecture: 8
 CPU variant	: 0x0
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_17 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_17
index 26f01c4962489ab116450dd55717e4db345fdaee..cdff314be73842b434fe39ecaf5bddbb778320ce 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/info_17
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_17
@@ -1,6 +1,6 @@
 processor	: 0
 BogoMIPS	: 100.00
-Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp
+Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma
 CPU implementer	: 0xfe
 CPU architecture: 8
 CPU variant	: 0x0
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_8 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_8
index 76da16c57b545c0cf72bf96e8a56f502ecc55073..37a488946b16c5fd05434a36d58b0af4d7221c04 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/info_8
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_8
@@ -1,6 +1,6 @@
 processor	: 0
 BogoMIPS	: 100.00
-Features	: asimd sve fp fphp asimdhp
+Features	: asimd sve fp fphp asimdhp fcma
 CPU implementer	: 0x41
 CPU architecture: 8
 CPU variant	: 0x0
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_9 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_9
index 14703dd1d0bf0c6543484d34950dc91778483b67..171ba498feabbb5ea2d392bc8ad0b11f156895ed 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/info_9
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_9
@@ -1,6 +1,6 @@
 processor	: 0
 BogoMIPS	: 100.00
-Features	: asimd fp svesm4 sve sve2 fphp asimdhp sm3 sm4
+Features	: asimd fp svesm4 sve sve2 fphp asimdhp sm3 sm4 fcma
 CPU implementer	: 0x41
 CPU architecture: 8
 CPU variant	: 0x0


* [PATCH 3/8] aarch64: Add new +jscvt flag
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
  2024-10-04 17:51 ` [PATCH 1/8] aarch64: Use PAUTH instead of V8_3A in some places Andrew Carlotti
  2024-10-04 17:52 ` [PATCH 2/8] aarch64: Add new +fcma flag Andrew Carlotti
@ 2024-10-04 17:53 ` Andrew Carlotti
  2024-10-04 17:53 ` [PATCH 4/8] aarch64: Add new +frintts flag Andrew Carlotti
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Andrew Carlotti @ 2024-10-04 17:53 UTC
  To: gcc-patches; +Cc: Richard Sandiford

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_3A): Add JSCVT.
	* config/aarch64/aarch64-option-extensions.def (JSCVT): New flag.
	* config/aarch64/aarch64.h (TARGET_JSCVT): Use new flag.
	* config/aarch64/arm_acle.h: Use new flag for jscvt intrinsics.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add jscvt to
	expected feature string.
	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.


diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def
index fadf9c36b03865a3af9b25888a50f5bf3abe37b7..c93c5c39d69ee497f1da3dd398b0353a3f99be8c 100644
--- a/gcc/config/aarch64/aarch64-arches.def
+++ b/gcc/config/aarch64/aarch64-arches.def
@@ -33,7 +33,7 @@
 AARCH64_ARCH("armv8-a",       generic_armv8_a,   V8A,       8,  (SIMD))
 AARCH64_ARCH("armv8.1-a",     generic_armv8_a,   V8_1A,     8,  (V8A, LSE, CRC, RDMA))
 AARCH64_ARCH("armv8.2-a",     generic_armv8_a,   V8_2A,     8,  (V8_1A))
-AARCH64_ARCH("armv8.3-a",     generic_armv8_a,   V8_3A,     8,  (V8_2A, PAUTH, RCPC, FCMA))
+AARCH64_ARCH("armv8.3-a",     generic_armv8_a,   V8_3A,     8,  (V8_2A, PAUTH, RCPC, FCMA, JSCVT))
 AARCH64_ARCH("armv8.4-a",     generic_armv8_a,   V8_4A,     8,  (V8_3A, F16FML, DOTPROD, FLAGM))
 AARCH64_ARCH("armv8.5-a",     generic_armv8_a,   V8_5A,     8,  (V8_4A, SB, SSBS, PREDRES))
 AARCH64_ARCH("armv8.6-a",     generic_armv8_a,   V8_6A,     8,  (V8_5A, I8MM, BF16))
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index 12640ed970d0475b9e28f1c4f1c6295e88e1ab97..c3663998c55b9ce4113dcce57bdea5980073d73c 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -151,6 +151,8 @@ AARCH64_OPT_EXTENSION("fp16fml", F16FML, (), (F16), (), "asimdfhm")
 
 AARCH64_FMV_FEATURE("fp16fml", FP16FML, (F16FML))
 
+AARCH64_OPT_FMV_EXTENSION("jscvt", JSCVT, (FP), (), (), "jscvt")
+
 AARCH64_OPT_FMV_EXTENSION("fcma", FCMA, (SIMD), (), (), "fcma")
 
 AARCH64_OPT_FMV_EXTENSION("rcpc", RCPC, (), (), (), "lrcpc")
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 0c3d7baf7c85e54f7dd63fedb2da80d654c9ea50..864f2d438479a74c9ada80577b37b2aa86085d02 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -361,7 +361,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
 #define TARGET_ARMV8_3	AARCH64_HAVE_ISA (V8_3A)
 
 /* Javascript conversion instruction from Armv8.3-a.  */
-#define TARGET_JSCVT	(TARGET_FLOAT && TARGET_ARMV8_3)
+#define TARGET_JSCVT	AARCH64_HAVE_ISA (JSCVT)
 
 /* Armv8.3-a Complex number extension to AdvSIMD extensions.  */
 #define TARGET_COMPLEX AARCH64_HAVE_ISA (FCMA)
diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h
index ab4e7e60e046a9e9c81237de2ca5463c3d4f96ca..0f06bde6c50261208d03985b6614d1983b535efb 100644
--- a/gcc/config/aarch64/arm_acle.h
+++ b/gcc/config/aarch64/arm_acle.h
@@ -119,7 +119,7 @@ __revl (unsigned long __value)
 }
 
 #pragma GCC push_options
-#pragma GCC target ("arch=armv8.3-a")
+#pragma GCC target ("+nothing+jscvt")
 __extension__ extern __inline int32_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __jcvt (double __a)
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
index 1d90e9ec9d971ae0f085fd832099058488c817b8..603ee48d584b8085755b577e09a6e7d6abbb5623 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
@@ -7,7 +7,7 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+lse\+dotprod\+rdma\+crc\+fp16fml\+rcpc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
+/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
 
 /* Check that an Armv8-A core doesn't fall apart on extensions without midr
    values.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
index 17050a0b72c98ecfd87ec5f7f522cce4db9efc16..e0ba97fb6e9a2969b8122ca0315ef73f16983045 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
@@ -7,7 +7,7 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+lse\+dotprod\+rdma\+crc\+fp16fml\+rcpc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
+/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
 
 /* Check that an Armv8-A core doesn't fall apart on extensions without midr
    values and that it enables optional features.  */


* [PATCH 4/8] aarch64: Add new +frintts flag
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
                   ` (2 preceding siblings ...)
  2024-10-04 17:53 ` [PATCH 3/8] aarch64: Add new +jscvt flag Andrew Carlotti
@ 2024-10-04 17:53 ` Andrew Carlotti
  2024-10-04 17:53 ` [PATCH 5/8] aarch64: Add new +flagm2 flag Andrew Carlotti
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Andrew Carlotti @ 2024-10-04 17:53 UTC
  To: gcc-patches; +Cc: Richard Sandiford

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_5A): Add FRINTTS.
	* config/aarch64/aarch64-option-extensions.def (FRINTTS): New flag.
	* config/aarch64/aarch64.h (TARGET_FRINT): Use new flag.
	* config/aarch64/arm_acle.h: Use new flag for frintts intrinsics.
	* config/aarch64/arm_neon.h: Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add frintts to
	expected feature string.
	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.


diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def
index c93c5c39d69ee497f1da3dd398b0353a3f99be8c..668e7833bd81a7d8795df022f205ca7ca0d0ddef 100644
--- a/gcc/config/aarch64/aarch64-arches.def
+++ b/gcc/config/aarch64/aarch64-arches.def
@@ -35,7 +35,7 @@ AARCH64_ARCH("armv8.1-a",     generic_armv8_a,   V8_1A,     8,  (V8A, LSE, CRC,
 AARCH64_ARCH("armv8.2-a",     generic_armv8_a,   V8_2A,     8,  (V8_1A))
 AARCH64_ARCH("armv8.3-a",     generic_armv8_a,   V8_3A,     8,  (V8_2A, PAUTH, RCPC, FCMA, JSCVT))
 AARCH64_ARCH("armv8.4-a",     generic_armv8_a,   V8_4A,     8,  (V8_3A, F16FML, DOTPROD, FLAGM))
-AARCH64_ARCH("armv8.5-a",     generic_armv8_a,   V8_5A,     8,  (V8_4A, SB, SSBS, PREDRES))
+AARCH64_ARCH("armv8.5-a",     generic_armv8_a,   V8_5A,     8,  (V8_4A, SB, SSBS, PREDRES, FRINTTS))
 AARCH64_ARCH("armv8.6-a",     generic_armv8_a,   V8_6A,     8,  (V8_5A, I8MM, BF16))
 AARCH64_ARCH("armv8.7-a",     generic_armv8_a,   V8_7A,     8,  (V8_6A))
 AARCH64_ARCH("armv8.8-a",     generic_armv8_a,   V8_8A,     8,  (V8_7A, MOPS))
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index c3663998c55b9ce4113dcce57bdea5980073d73c..505f1fb721c64e4b55b52baf465024a57c68ab98 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -159,6 +159,8 @@ AARCH64_OPT_FMV_EXTENSION("rcpc", RCPC, (), (), (), "lrcpc")
 
 AARCH64_OPT_FMV_EXTENSION("rcpc3", RCPC3, (RCPC), (), (), "lrcpc3")
 
+AARCH64_OPT_FMV_EXTENSION("frintts", FRINTTS, (FP), (), (), "frint")
+
 AARCH64_OPT_FMV_EXTENSION("i8mm", I8MM, (SIMD), (), (), "i8mm")
 
 /* An explicit +bf16 implies +simd, but +bf16+nosimd still enables scalar BF16
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 864f2d438479a74c9ada80577b37b2aa86085d02..41430466b50bf223bf008c753d24f57570c1f2e5 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -367,7 +367,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
 #define TARGET_COMPLEX AARCH64_HAVE_ISA (FCMA)
 
 /* Floating-point rounding instructions from Armv8.5-a.  */
-#define TARGET_FRINT (AARCH64_HAVE_ISA (V8_5A) && TARGET_FLOAT)
+#define TARGET_FRINT AARCH64_HAVE_ISA (FRINTTS)
 
 /* TME instructions are enabled.  */
 #define TARGET_TME AARCH64_HAVE_ISA (TME)
diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h
index 0f06bde6c50261208d03985b6614d1983b535efb..617f261e1ba24acc77527b42eacb5233410689b8 100644
--- a/gcc/config/aarch64/arm_acle.h
+++ b/gcc/config/aarch64/arm_acle.h
@@ -130,7 +130,7 @@ __jcvt (double __a)
 #pragma GCC pop_options
 
 #pragma GCC push_options
-#pragma GCC target ("arch=armv8.5-a")
+#pragma GCC target ("+nothing+frintts")
 __extension__ extern __inline float
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __rint32zf (float __a)
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 0ab511a884126821ecae7d2fc7c1a3427bdfe5ac..2ffbffbac855ac2c47ad6416e7d2683c4ac2ab53 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -27678,7 +27678,7 @@ vfmlslq_laneq_high_f16 (float32x4_t __r, float16x8_t __a, float16x8_t __b,
 #pragma GCC pop_options
 
 #pragma GCC push_options
-#pragma GCC target ("arch=armv8.5-a")
+#pragma GCC target ("+nothing+simd+frintts")
 
 __extension__ extern __inline float32x2_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
index 603ee48d584b8085755b577e09a6e7d6abbb5623..aa70d1d22b8299befcd81a696f051eb72997d548 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
@@ -7,7 +7,7 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
+/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
 
 /* Check that an Armv8-A core doesn't fall apart on extensions without midr
    values.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
index e0ba97fb6e9a2969b8122ca0315ef73f16983045..ccd5d0d9bb7d7bf722bcffcc14c46d88d3223cf3 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
@@ -7,7 +7,7 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
+/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
 
 /* Check that an Armv8-A core doesn't fall apart on extensions without midr
    values and that it enables optional features.  */


* [PATCH 5/8] aarch64: Add new +flagm2 flag
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
                   ` (3 preceding siblings ...)
  2024-10-04 17:53 ` [PATCH 4/8] aarch64: Add new +frintts flag Andrew Carlotti
@ 2024-10-04 17:53 ` Andrew Carlotti
  2024-10-04 17:54 ` [PATCH 6/8] aarch64: Add new +rcpc2 flag Andrew Carlotti
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Andrew Carlotti @ 2024-10-04 17:53 UTC
  To: gcc-patches; +Cc: Richard Sandiford

GCC does not currently emit the axflag or xaflag instructions, so this
primarily affects the flags passed through to the assembler.
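
The testsuite changes in this patch replace flagm with flagm2 in the expected
.arch string.  One way to picture that, sketched below in C, is a printer that
omits any extension already required by another enabled extension.  This is a
toy model under that assumption: the table, bit values and function name are
hypothetical, only direct dependencies are considered, and GCC's actual
string-building code is not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit assignments for this sketch only.  */
enum
{
  EXT_FLAGM = 1u << 0,
  EXT_FLAGM2 = 1u << 1,
  EXT_LSE = 1u << 2
};

typedef struct
{
  const char *name; /* Name as written after '+'.  */
  uint32_t flag;    /* The extension being described.  */
  uint32_t needs;   /* Extensions it requires directly.  */
} ext_info;

static const ext_info exts[] = {
  { "flagm", EXT_FLAGM, 0 },
  { "flagm2", EXT_FLAGM2, EXT_FLAGM },
  { "lse", EXT_LSE, 0 },
};

#define N_EXTS (sizeof exts / sizeof exts[0])

/* Write "+name" into BUF for each enabled extension that no other enabled
   extension directly requires.  BUFLEN must be at least 1.  */
void
minimal_ext_string (uint32_t enabled, char *buf, size_t buflen)
{
  buf[0] = '\0';
  for (size_t i = 0; i < N_EXTS; i++)
    {
      if (!(enabled & exts[i].flag))
	continue;

      /* Skip this extension if another enabled one already requires it.  */
      int implied = 0;
      for (size_t j = 0; j < N_EXTS; j++)
	if (j != i && (enabled & exts[j].flag)
	    && (exts[j].needs & exts[i].flag))
	  implied = 1;
      if (implied)
	continue;

      strncat (buf, "+", buflen - strlen (buf) - 1);
      strncat (buf, exts[i].name, buflen - strlen (buf) - 1);
    }
}
```

With flagm, flagm2 and lse all enabled, this model prints "+flagm2+lse"; with
only flagm and lse enabled it prints "+flagm+lse".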

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_5A): Add FLAGM2.
	* config/aarch64/aarch64-option-extensions.def (FLAGM2): New flag.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add flagm2 to
	expected feature string instead of flagm.
	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.


diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def
index 668e7833bd81a7d8795df022f205ca7ca0d0ddef..84782d55089650b5854c60497bc68f9564d6f90b 100644
--- a/gcc/config/aarch64/aarch64-arches.def
+++ b/gcc/config/aarch64/aarch64-arches.def
@@ -35,7 +35,7 @@ AARCH64_ARCH("armv8.1-a",     generic_armv8_a,   V8_1A,     8,  (V8A, LSE, CRC,
 AARCH64_ARCH("armv8.2-a",     generic_armv8_a,   V8_2A,     8,  (V8_1A))
 AARCH64_ARCH("armv8.3-a",     generic_armv8_a,   V8_3A,     8,  (V8_2A, PAUTH, RCPC, FCMA, JSCVT))
 AARCH64_ARCH("armv8.4-a",     generic_armv8_a,   V8_4A,     8,  (V8_3A, F16FML, DOTPROD, FLAGM))
-AARCH64_ARCH("armv8.5-a",     generic_armv8_a,   V8_5A,     8,  (V8_4A, SB, SSBS, PREDRES, FRINTTS))
+AARCH64_ARCH("armv8.5-a",     generic_armv8_a,   V8_5A,     8,  (V8_4A, SB, SSBS, PREDRES, FRINTTS, FLAGM2))
 AARCH64_ARCH("armv8.6-a",     generic_armv8_a,   V8_6A,     8,  (V8_5A, I8MM, BF16))
 AARCH64_ARCH("armv8.7-a",     generic_armv8_a,   V8_7A,     8,  (V8_6A))
 AARCH64_ARCH("armv8.8-a",     generic_armv8_a,   V8_8A,     8,  (V8_7A, MOPS))
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index 505f1fb721c64e4b55b52baf465024a57c68ab98..b73324abbeb6145b5a2c26fdb22f41de9b6045d9 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -103,6 +103,8 @@ AARCH64_OPT_FMV_EXTENSION("rng", RNG, (), (), (), "rng")
 
 AARCH64_OPT_FMV_EXTENSION("flagm", FLAGM, (), (), (), "flagm")
 
+AARCH64_OPT_FMV_EXTENSION("flagm2", FLAGM2, (FLAGM), (), (), "flagm2")
+
 AARCH64_OPT_FMV_EXTENSION("lse", LSE, (), (), (), "atomics")
 
 AARCH64_OPT_FMV_EXTENSION("fp", FP, (), (), (), "fp")
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
index aa70d1d22b8299befcd81a696f051eb72997d548..c1d5896e1eb0b3b48ac0c1eeb95a74c4b6ec9e85 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
@@ -7,7 +7,7 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
+/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
 
 /* Check that an Armv8-A core doesn't fall apart on extensions without midr
    values.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
index ccd5d0d9bb7d7bf722bcffcc14c46d88d3223cf3..4533a2bf5912dc609327b63164ba4577e98f9eec 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
@@ -7,7 +7,7 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-assembler {\.arch armv8-a\+flagm\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
+/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
 
 /* Check that an Armv8-A core doesn't fall apart on extensions without midr
    values and that it enables optional features.  */


* [PATCH 6/8] aarch64: Add new +rcpc2 flag
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
                   ` (4 preceding siblings ...)
  2024-10-04 17:53 ` [PATCH 5/8] aarch64: Add new +flagm2 flag Andrew Carlotti
@ 2024-10-04 17:54 ` Andrew Carlotti
  2024-10-04 17:54 ` [PATCH 7/8] aarch64: Add new +wfxt flag Andrew Carlotti
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Andrew Carlotti @ 2024-10-04 17:54 UTC
  To: gcc-patches; +Cc: Richard Sandiford

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_4A): Add RCPC2.
	* config/aarch64/aarch64-option-extensions.def
	(RCPC2): New flag.
	(RCPC3): Add RCPC2 dependency.
	* config/aarch64/aarch64.h (TARGET_RCPC2): Use new flag.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/native_cpu_21.c: Add rcpc2 to
	expected feature string instead of rcpc.
	* gcc.target/aarch64/cpunative/native_cpu_22.c: Ditto.


diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def
index 84782d55089650b5854c60497bc68f9564d6f90b..f182d3dc6c77bf63ab272ab1b5824c1523390e09 100644
--- a/gcc/config/aarch64/aarch64-arches.def
+++ b/gcc/config/aarch64/aarch64-arches.def
@@ -34,7 +34,7 @@ AARCH64_ARCH("armv8-a",       generic_armv8_a,   V8A,       8,  (SIMD))
 AARCH64_ARCH("armv8.1-a",     generic_armv8_a,   V8_1A,     8,  (V8A, LSE, CRC, RDMA))
 AARCH64_ARCH("armv8.2-a",     generic_armv8_a,   V8_2A,     8,  (V8_1A))
 AARCH64_ARCH("armv8.3-a",     generic_armv8_a,   V8_3A,     8,  (V8_2A, PAUTH, RCPC, FCMA, JSCVT))
-AARCH64_ARCH("armv8.4-a",     generic_armv8_a,   V8_4A,     8,  (V8_3A, F16FML, DOTPROD, FLAGM))
+AARCH64_ARCH("armv8.4-a",     generic_armv8_a,   V8_4A,     8,  (V8_3A, F16FML, DOTPROD, FLAGM, RCPC2))
 AARCH64_ARCH("armv8.5-a",     generic_armv8_a,   V8_5A,     8,  (V8_4A, SB, SSBS, PREDRES, FRINTTS, FLAGM2))
 AARCH64_ARCH("armv8.6-a",     generic_armv8_a,   V8_6A,     8,  (V8_5A, I8MM, BF16))
 AARCH64_ARCH("armv8.7-a",     generic_armv8_a,   V8_7A,     8,  (V8_6A))
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index b73324abbeb6145b5a2c26fdb22f41de9b6045d9..b929773eba176a391d6e9242067e4f63e4434637 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -159,7 +159,9 @@ AARCH64_OPT_FMV_EXTENSION("fcma", FCMA, (SIMD), (), (), "fcma")
 
 AARCH64_OPT_FMV_EXTENSION("rcpc", RCPC, (), (), (), "lrcpc")
 
-AARCH64_OPT_FMV_EXTENSION("rcpc3", RCPC3, (RCPC), (), (), "lrcpc3")
+AARCH64_OPT_FMV_EXTENSION("rcpc2", RCPC2, (RCPC), (), (), "ilrcpc")
+
+AARCH64_OPT_FMV_EXTENSION("rcpc3", RCPC3, (RCPC2), (), (), "lrcpc3")
 
 AARCH64_OPT_FMV_EXTENSION("frintts", FRINTTS, (FP), (), (), "frint")
 
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 41430466b50bf223bf008c753d24f57570c1f2e5..3ed1930d3e4ac9f250219a43aa91cb8ed123f53c 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -427,7 +427,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
 
 /* The RCPC2 extensions from Armv8.4-a that allow immediate offsets to LDAPR
    and sign-extending versions.*/
-#define TARGET_RCPC2 ((AARCH64_HAVE_ISA (V8_4A) && TARGET_RCPC) || TARGET_RCPC3)
+#define TARGET_RCPC2 AARCH64_HAVE_ISA (RCPC2)
 
 /* RCPC3 (Release Consistency) extensions, optional from Armv8.2-a.  */
 #define TARGET_RCPC3 AARCH64_HAVE_ISA (RCPC3)
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
index c1d5896e1eb0b3b48ac0c1eeb95a74c4b6ec9e85..904cdf452263961442f3ecc31cd1b6563130f9c7 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
@@ -7,7 +7,7 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
+/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc2\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
 
 /* Check that an Armv8-A core doesn't fall apart on extensions without midr
    values.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
index 4533a2bf5912dc609327b63164ba4577e98f9eec..feb959b11b0e383a5e1f3214d55f80f56d2605d4 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
@@ -7,7 +7,7 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
+/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc2\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
 
 /* Check that an Armv8-A core doesn't fall apart on extensions without midr
    values and that it enables optional features.  */

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 7/8] aarch64: Add new +wfxt flag
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
                   ` (5 preceding siblings ...)
  2024-10-04 17:54 ` [PATCH 6/8] aarch64: Add new +rcpc2 flag Andrew Carlotti
@ 2024-10-04 17:54 ` Andrew Carlotti
  2024-10-04 17:54 ` [PATCH 8/8] aarch64: Add new +xs flag Andrew Carlotti
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Andrew Carlotti @ 2024-10-04 17:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Sandiford

GCC does not currently emit the wfet or wfit instructions, so this
primarily affects the flags passed through to the assembler.

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_7A): Add WFXT.
	* config/aarch64/aarch64-option-extensions.def (WFXT): New flag.


diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def
index f182d3dc6c77bf63ab272ab1b5824c1523390e09..fa06377dda089c8a89628bc4cc66d54510346053 100644
--- a/gcc/config/aarch64/aarch64-arches.def
+++ b/gcc/config/aarch64/aarch64-arches.def
@@ -37,7 +37,7 @@ AARCH64_ARCH("armv8.3-a",     generic_armv8_a,   V8_3A,     8,  (V8_2A, PAUTH, R
 AARCH64_ARCH("armv8.4-a",     generic_armv8_a,   V8_4A,     8,  (V8_3A, F16FML, DOTPROD, FLAGM, RCPC2))
 AARCH64_ARCH("armv8.5-a",     generic_armv8_a,   V8_5A,     8,  (V8_4A, SB, SSBS, PREDRES, FRINTTS, FLAGM2))
 AARCH64_ARCH("armv8.6-a",     generic_armv8_a,   V8_6A,     8,  (V8_5A, I8MM, BF16))
-AARCH64_ARCH("armv8.7-a",     generic_armv8_a,   V8_7A,     8,  (V8_6A))
+AARCH64_ARCH("armv8.7-a",     generic_armv8_a,   V8_7A,     8,  (V8_6A, WFXT))
 AARCH64_ARCH("armv8.8-a",     generic_armv8_a,   V8_8A,     8,  (V8_7A, MOPS))
 AARCH64_ARCH("armv8.9-a",     generic_armv8_a,   V8_9A,     8,  (V8_8A, CSSC))
 AARCH64_ARCH("armv8-r",       generic_armv8_a,   V8R  ,     8,  (V8_4A))
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index b929773eba176a391d6e9242067e4f63e4434637..9781d48f63778d186b66427bae7deb2c01e14107 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -220,6 +220,8 @@ AARCH64_OPT_EXTENSION("pauth", PAUTH, (), (), (), "paca pacg")
 
 AARCH64_OPT_EXTENSION("ls64", LS64, (), (), (), "")
 
+AARCH64_OPT_FMV_EXTENSION("wfxt", WFXT, (), (), (), "wfxt")
+
 AARCH64_OPT_EXTENSION("sme-f64f64", SME_F64F64, (SME), (), (), "")
 
 AARCH64_FMV_FEATURE("sme-f64f64", SME_F64, (SME_F64F64))

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 8/8] aarch64: Add new +xs flag
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
                   ` (6 preceding siblings ...)
  2024-10-04 17:54 ` [PATCH 7/8] aarch64: Add new +wfxt flag Andrew Carlotti
@ 2024-10-04 17:54 ` Andrew Carlotti
  2024-10-04 20:19 ` [PATCH 0/8] aarch64: Add new flags for existing features Andrew Pinski
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Andrew Carlotti @ 2024-10-04 17:54 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Sandiford

GCC does not emit tlbi instructions, so this only affects the flags
passed through to the assembler.

gcc/ChangeLog:

	* config/aarch64/aarch64-arches.def (V8_7A): Add XS.
	* config/aarch64/aarch64-option-extensions.def (XS): New flag.


diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def
index fa06377dda089c8a89628bc4cc66d54510346053..66fe5cef0896847715d3b0a404ebabedfc82f34d 100644
--- a/gcc/config/aarch64/aarch64-arches.def
+++ b/gcc/config/aarch64/aarch64-arches.def
@@ -37,7 +37,7 @@ AARCH64_ARCH("armv8.3-a",     generic_armv8_a,   V8_3A,     8,  (V8_2A, PAUTH, R
 AARCH64_ARCH("armv8.4-a",     generic_armv8_a,   V8_4A,     8,  (V8_3A, F16FML, DOTPROD, FLAGM, RCPC2))
 AARCH64_ARCH("armv8.5-a",     generic_armv8_a,   V8_5A,     8,  (V8_4A, SB, SSBS, PREDRES, FRINTTS, FLAGM2))
 AARCH64_ARCH("armv8.6-a",     generic_armv8_a,   V8_6A,     8,  (V8_5A, I8MM, BF16))
-AARCH64_ARCH("armv8.7-a",     generic_armv8_a,   V8_7A,     8,  (V8_6A, WFXT))
+AARCH64_ARCH("armv8.7-a",     generic_armv8_a,   V8_7A,     8,  (V8_6A, WFXT, XS))
 AARCH64_ARCH("armv8.8-a",     generic_armv8_a,   V8_8A,     8,  (V8_7A, MOPS))
 AARCH64_ARCH("armv8.9-a",     generic_armv8_a,   V8_9A,     8,  (V8_8A, CSSC))
 AARCH64_ARCH("armv8-r",       generic_armv8_a,   V8R  ,     8,  (V8_4A))
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index 9781d48f63778d186b66427bae7deb2c01e14107..93adb556276c2379f50805d40d891229c87e1783 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -222,6 +222,8 @@ AARCH64_OPT_EXTENSION("ls64", LS64, (), (), (), "")
 
 AARCH64_OPT_FMV_EXTENSION("wfxt", WFXT, (), (), (), "wfxt")
 
+AARCH64_OPT_EXTENSION("xs", XS, (), (), (), "")
+
 AARCH64_OPT_EXTENSION("sme-f64f64", SME_F64F64, (SME), (), (), "")
 
 AARCH64_FMV_FEATURE("sme-f64f64", SME_F64, (SME_F64F64))

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/8] aarch64: Add new flags for existing features
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
                   ` (7 preceding siblings ...)
  2024-10-04 17:54 ` [PATCH 8/8] aarch64: Add new +xs flag Andrew Carlotti
@ 2024-10-04 20:19 ` Andrew Pinski
  2024-11-12 16:56 ` [PATCH 9/10] docs: Add new AArch64 flags Andrew Carlotti
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 25+ messages in thread
From: Andrew Pinski @ 2024-10-04 20:19 UTC (permalink / raw)
  To: Andrew Carlotti; +Cc: gcc-patches, Richard Sandiford

On Fri, Oct 4, 2024 at 10:51 AM Andrew Carlotti <andrew.carlotti@arm.com> wrote:
>
> This patch series adds 7 new flags for features that were previously available
> in GCC only as part of an architecture version.  It also fixes one other
> instance where an architecture version was used in a check instead of a feature
> flag.
>
> Bootstrapped and regression tested as a whole on aarch64.  I additionally ran
> the cpunative tests after each patch in the series.  Ok for master?

I think this is good, except that the documentation is not updated.
The feature flags are documented; see
https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#g_t-march-and--mcpu-Feature-Modifiers

Thanks,
Andrew

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/8] aarch64: Use PAUTH instead of V8_3A in some places
  2024-10-04 17:51 ` [PATCH 1/8] aarch64: Use PAUTH instead of V8_3A in some places Andrew Carlotti
@ 2024-10-08 15:46   ` Richard Sandiford
  2025-03-20 14:05     ` Alfie Richards
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Sandiford @ 2024-10-08 15:46 UTC (permalink / raw)
  To: Andrew Carlotti; +Cc: gcc-patches

Andrew Carlotti <andrew.carlotti@arm.com> writes:
> gcc/ChangeLog:
>
> 	* config/aarch64/aarch64.cc
> 	(aarch64_expand_epilogue): Use TARGET_PAUTH.
> 	* config/aarch64/aarch64.md: Update comment.
>
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index e7bb3278a27eca44c46afd26069d608218198a54..cf1107127fd5d9e12ad42441528666bf6b733f73 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -10042,12 +10042,12 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall)
>  	1) Sibcalls don't return in a normal way, so if we're about to call one
>  	   we must authenticate.
>  
> -	2) The RETAA instruction is not available before ARMv8.3-A, so if we are
> -	   generating code for !TARGET_ARMV8_3 we can't use it and must
> +	2) The RETAA instruction is not available without FEAT_PAuth, so if we
> +	   are generating code for !TARGET_PAUTH we can't use it and must
>  	   explicitly authenticate.
>      */
>    if (aarch64_return_address_signing_enabled ()
> -      && (sibcall || !TARGET_ARMV8_3))
> +      && (sibcall || !TARGET_PAUTH))
>      {
>        switch (aarch64_ra_sign_key)
>  	{
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index c54b29cd64b9e0dc6c6d12735049386ccedc5408..0940a84f9295ee2bc07282b150095fdb5af11a4d 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -7672,10 +7672,10 @@
>  )
>  
>  ;; Pointer authentication patterns are always provided.  In architecture
> -;; revisions prior to ARMv8.3-A these HINT instructions operate as NOPs.
> +;; revisions prior to FEAT_PAuth these HINT instructions operate as NOPs.

I suppose this should be something like "On targets that don't implement
FEAT_PAuth".  OK with that change, thanks.

Richard

>  ;; This lets the user write portable software which authenticates pointers
> -;; when run on something which implements ARMv8.3-A, and which runs
> -;; correctly, but does not authenticate pointers, where ARMv8.3-A is not
> +;; when run on something which implements FEAT_PAuth, and which runs
> +;; correctly, but does not authenticate pointers, where FEAT_PAuth is not
>  ;; implemented.
>  
>  ;; Signing/Authenticating R30 using SP as the salt.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/8] aarch64: Add new +fcma flag
  2024-10-04 17:52 ` [PATCH 2/8] aarch64: Add new +fcma flag Andrew Carlotti
@ 2024-10-08 16:18   ` Richard Sandiford
  2024-10-25 14:31     ` Andre Vieira (lists)
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Sandiford @ 2024-10-08 16:18 UTC (permalink / raw)
  To: Andrew Carlotti; +Cc: gcc-patches

Andrew Carlotti <andrew.carlotti@arm.com> writes:
> This includes +fcma as a dependency of +sve, and means that we can
> finally support fcma intrinsics on a64fx.
>
> Also add fcma to the Features list in several cpunative testcases that
> incorrectly included sve without fcma.
>
> gcc/ChangeLog:
>
> 	* config/aarch64/aarch64-arches.def (V8_3A): Add FCMA.
> 	* config/aarch64/aarch64-option-extensions.def (FCMA): New flag.
> 	(SVE): Add FCMA dependency.
> 	* config/aarch64/aarch64.h (TARGET_COMPLEX): Use new flag.
> 	* config/aarch64/arm_neon.h: Use new flag for fcma intrinsics.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/aarch64/cpunative/info_15: Add fcma to Features.
> 	* gcc.target/aarch64/cpunative/info_16: Ditto.
> 	* gcc.target/aarch64/cpunative/info_17: Ditto.
> 	* gcc.target/aarch64/cpunative/info_8: Ditto.
> 	* gcc.target/aarch64/cpunative/info_9: Ditto.

In addition to Andrew P's comment about documentation, doesn't this
mean that -mcpu=native will now emit +fcma .arch strings for
unrecognised CPUs (i.e. those for which we can't establish a
baseline beyond Armv8-A?).  E.g., I think:

processor	: 0
BogoMIPS	: 100.00
Features        : fp asimd atomics crc32 asimdrdm paca pacg lrcpc fcma
CPU implementer	: 0xaa
CPU architecture: 8
CPU variant	: 0xaa
CPU part	: 0xaa
CPU revision	: 0

should enable all the Armv8.3-A features that GCC is aware of after
this patch, but we emit:

        .arch armv8-a+lse+rdma+crc+fcma+rcpc+pauth

rather than:

        .arch armv8.3-a

And that could be a problem because binutils support for the +fcma
name is relatively recent (your patch from January this year).
Assembling with older versions of gas is likely to fail, regardless
of whether the code uses FCMA.

I think we might need to adjust the driver code so that it tries
to consolidate features into an architecture level where possible.

That does theoretically run the risk of making gas enable features that
GCC isn't aware of and that aren't available on the CPU (for cases where
Armv8.X-A+features looks like Armv8.Y-A to GCC, but isn't because of a
missing feature that GCC doesn't know about).  The current code gets
that wrong in the opposite direction, though, as the above example
shows: we wouldn't enable FCMA support even if the target has it.

I think it's ok for odd combinations like -march=armv8.3-a+nofcma
to reference fcma in the .arch string.  But for compatibility,
I think we need to avoid adding it to .arch strings for command lines
that previously worked with older binutils.
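
To make the cpuinfo example above concrete, here is a minimal sketch of how
a Features line could be turned into the extension suffix of the .arch
string.  It is an illustration under stated assumptions, not the driver's
actual code: the table is a hand-picked subset, the names Ext, exts and
detect_extensions are hypothetical, and the rule that every token in an
entry must be present (as in "paca pacg" for pauth) is assumed.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical subset of the extension table: flag name plus the
// cpuinfo token(s) that must all appear in the Features line.
struct Ext { const char *name; const char *cpuinfo; };

static const Ext exts[] = {
  {"lse", "atomics"}, {"rdma", "asimdrdm"}, {"crc", "crc32"},
  {"fcma", "fcma"},   {"rcpc", "lrcpc"},    {"pauth", "paca pacg"},
};

// Return the "+ext+ext..." suffix for the extensions found in FEATURES_LINE.
std::string detect_extensions (const std::string &features_line)
{
  // Split the Features line into a set of whitespace-separated tokens.
  std::set<std::string> present;
  std::istringstream in (features_line);
  for (std::string tok; in >> tok;)
    present.insert (tok);

  std::string out;
  for (const Ext &e : exts)
    {
      // An extension is detected only if all of its tokens are present.
      std::istringstream need (e.cpuinfo);
      bool ok = true;
      for (std::string tok; need >> tok;)
	ok = ok && present.count (tok) != 0;
      if (ok)
	out += std::string ("+") + e.name;
    }
  return out;
}
```

With the Features line from the example, this yields the per-feature suffix
that is appended to armv8-a, which is exactly the form that older versions
of gas would reject because of +fcma.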

Thanks,
Richard

> diff --git a/gcc/config/aarch64/aarch64-arches.def b/gcc/config/aarch64/aarch64-arches.def
> index 4634b272e28006b5c6c2d6705a2f1010cbd9ab9b..fadf9c36b03865a3af9b25888a50f5bf3abe37b7 100644
> --- a/gcc/config/aarch64/aarch64-arches.def
> +++ b/gcc/config/aarch64/aarch64-arches.def
> @@ -33,7 +33,7 @@
>  AARCH64_ARCH("armv8-a",       generic_armv8_a,   V8A,       8,  (SIMD))
>  AARCH64_ARCH("armv8.1-a",     generic_armv8_a,   V8_1A,     8,  (V8A, LSE, CRC, RDMA))
>  AARCH64_ARCH("armv8.2-a",     generic_armv8_a,   V8_2A,     8,  (V8_1A))
> -AARCH64_ARCH("armv8.3-a",     generic_armv8_a,   V8_3A,     8,  (V8_2A, PAUTH, RCPC))
> +AARCH64_ARCH("armv8.3-a",     generic_armv8_a,   V8_3A,     8,  (V8_2A, PAUTH, RCPC, FCMA))
>  AARCH64_ARCH("armv8.4-a",     generic_armv8_a,   V8_4A,     8,  (V8_3A, F16FML, DOTPROD, FLAGM))
>  AARCH64_ARCH("armv8.5-a",     generic_armv8_a,   V8_5A,     8,  (V8_4A, SB, SSBS, PREDRES))
>  AARCH64_ARCH("armv8.6-a",     generic_armv8_a,   V8_6A,     8,  (V8_5A, I8MM, BF16))
> diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
> index 8279f5a76eae7d787b8126044c5b4b4b78e97324..12640ed970d0475b9e28f1c4f1c6295e88e1ab97 100644
> --- a/gcc/config/aarch64/aarch64-option-extensions.def
> +++ b/gcc/config/aarch64/aarch64-option-extensions.def
> @@ -151,6 +151,8 @@ AARCH64_OPT_EXTENSION("fp16fml", F16FML, (), (F16), (), "asimdfhm")
>  
>  AARCH64_FMV_FEATURE("fp16fml", FP16FML, (F16FML))
>  
> +AARCH64_OPT_FMV_EXTENSION("fcma", FCMA, (SIMD), (), (), "fcma")
> +
>  AARCH64_OPT_FMV_EXTENSION("rcpc", RCPC, (), (), (), "lrcpc")
>  
>  AARCH64_OPT_FMV_EXTENSION("rcpc3", RCPC3, (RCPC), (), (), "lrcpc3")
> @@ -163,7 +165,7 @@ AARCH64_OPT_FMV_EXTENSION("bf16", BF16, (FP), (SIMD), (), "bf16")
>  
>  AARCH64_FMV_FEATURE("rpres", RPRES, ())
>  
> -AARCH64_OPT_FMV_EXTENSION("sve", SVE, (SIMD, F16), (), (), "sve")
> +AARCH64_OPT_FMV_EXTENSION("sve", SVE, (SIMD, F16, FCMA), (), (), "sve")
>  
>  AARCH64_OPT_EXTENSION("f32mm", F32MM, (SVE), (), (), "f32mm")
>  
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 030cffb17606c1062af62398dd631bae50b448af..0c3d7baf7c85e54f7dd63fedb2da80d654c9ea50 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -364,7 +364,7 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
>  #define TARGET_JSCVT	(TARGET_FLOAT && TARGET_ARMV8_3)
>  
>  /* Armv8.3-a Complex number extension to AdvSIMD extensions.  */
> -#define TARGET_COMPLEX (TARGET_SIMD && TARGET_ARMV8_3)
> +#define TARGET_COMPLEX AARCH64_HAVE_ISA (FCMA)
>  
>  /* Floating-point rounding instructions from Armv8.5-a.  */
>  #define TARGET_FRINT (AARCH64_HAVE_ISA (V8_5A) && TARGET_FLOAT)
> diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
> index e376685489da055029def6b661132b5154886b57..0ab511a884126821ecae7d2fc7c1a3427bdfe5ac 100644
> --- a/gcc/config/aarch64/arm_neon.h
> +++ b/gcc/config/aarch64/arm_neon.h
> @@ -27015,7 +27015,7 @@ vbcaxq_s64 (int64x2_t __a, int64x2_t __b, int64x2_t __c)
>  /* AdvSIMD Complex numbers intrinsics.  */
>  
>  #pragma GCC push_options
> -#pragma GCC target ("arch=armv8.3-a")
> +#pragma GCC target ("+nothing+fcma")
>  
>  #pragma GCC push_options
>  #pragma GCC target ("+fp16")
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_15 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_15
> index 6b425ea201351247c7273718d9e1e52cae62b342..1a31a75d6b4842846ad6d9476df23aae5ef72f83 100644
> --- a/gcc/testsuite/gcc.target/aarch64/cpunative/info_15
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_15
> @@ -1,6 +1,6 @@
>  processor	: 0
>  BogoMIPS	: 100.00
> -Features	: Lorem ipsum dolor sit ametd rebum expetendis per at Dolor lucilius referrentur ei mei virtute eruditi eum ne Iisque verter svesm4 asimd fp sve sve2 fphp asimdhp sm3 sm4
> +Features	: Lorem ipsum dolor sit ametd rebum expetendis per at Dolor lucilius referrentur ei mei virtute eruditi eum ne Iisque verter svesm4 asimd fp sve sve2 fphp asimdhp sm3 sm4 fcma
>  CPU implementer	: 0x41
>  CPU architecture: 8
>  CPU variant	: 0x0
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_16 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_16
> index 26f01c4962489ab116450dd55717e4db345fdaee..cdff314be73842b434fe39ecaf5bddbb778320ce 100644
> --- a/gcc/testsuite/gcc.target/aarch64/cpunative/info_16
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_16
> @@ -1,6 +1,6 @@
>  processor	: 0
>  BogoMIPS	: 100.00
> -Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp
> +Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma
>  CPU implementer	: 0xfe
>  CPU architecture: 8
>  CPU variant	: 0x0
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_17 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_17
> index 26f01c4962489ab116450dd55717e4db345fdaee..cdff314be73842b434fe39ecaf5bddbb778320ce 100644
> --- a/gcc/testsuite/gcc.target/aarch64/cpunative/info_17
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_17
> @@ -1,6 +1,6 @@
>  processor	: 0
>  BogoMIPS	: 100.00
> -Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp
> +Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 asimddp sve sve2 fphp asimdhp fcma
>  CPU implementer	: 0xfe
>  CPU architecture: 8
>  CPU variant	: 0x0
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_8 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_8
> index 76da16c57b545c0cf72bf96e8a56f502ecc55073..37a488946b16c5fd05434a36d58b0af4d7221c04 100644
> --- a/gcc/testsuite/gcc.target/aarch64/cpunative/info_8
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_8
> @@ -1,6 +1,6 @@
>  processor	: 0
>  BogoMIPS	: 100.00
> -Features	: asimd sve fp fphp asimdhp
> +Features	: asimd sve fp fphp asimdhp fcma
>  CPU implementer	: 0x41
>  CPU architecture: 8
>  CPU variant	: 0x0
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_9 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_9
> index 14703dd1d0bf0c6543484d34950dc91778483b67..171ba498feabbb5ea2d392bc8ad0b11f156895ed 100644
> --- a/gcc/testsuite/gcc.target/aarch64/cpunative/info_9
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_9
> @@ -1,6 +1,6 @@
>  processor	: 0
>  BogoMIPS	: 100.00
> -Features	: asimd fp svesm4 sve sve2 fphp asimdhp sm3 sm4
> +Features	: asimd fp svesm4 sve sve2 fphp asimdhp sm3 sm4 fcma
>  CPU implementer	: 0x41
>  CPU architecture: 8
>  CPU variant	: 0x0

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/8] aarch64: Add new +fcma flag
  2024-10-08 16:18   ` Richard Sandiford
@ 2024-10-25 14:31     ` Andre Vieira (lists)
  0 siblings, 0 replies; 25+ messages in thread
From: Andre Vieira (lists) @ 2024-10-25 14:31 UTC (permalink / raw)
  To: Andrew Carlotti, gcc-patches, richard.sandiford



On 08/10/2024 17:18, Richard Sandiford wrote:
> Andrew Carlotti <andrew.carlotti@arm.com> writes:
>> This includes +fcma as a dependency of +sve, and means that we can
>> finally support fcma intrinsics on a64fx.
>>
>> Also add fcma to the Features list in several cpunative testcases that
>> incorrectly included sve without fcma.
>>
>> gcc/ChangeLog:
>>
>> 	* config/aarch64/aarch64-arches.def (V8_3A): Add FCMA.
>> 	* config/aarch64/aarch64-option-extensions.def (FCMA): New flag.
>> 	(SVE): Add FCMA dependency.
>> 	* config/aarch64/aarch64.h (TARGET_COMPLEX): Use new flag.
>> 	* config/aarch64/arm_neon.h: Use new flag for fcma intrinsics.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 	* gcc.target/aarch64/cpunative/info_15: Add fcma to Features.
>> 	* gcc.target/aarch64/cpunative/info_16: Ditto.
>> 	* gcc.target/aarch64/cpunative/info_17: Ditto.
>> 	* gcc.target/aarch64/cpunative/info_8: Ditto.
>> 	* gcc.target/aarch64/cpunative/info_9: Ditto.
> 
> In addition to Andrew P's comment about documentation, doesn't this
> mean that -mcpu=native will now emit +fcma .arch strings for
> unrecognised CPUs (i.e. those for which we can't establish a
> baseline beyond Armv8-A?).  E.g., I think:
> 
> processor	: 0
> BogoMIPS	: 100.00
> Features        : fp asimd atomics crc32 asimdrdm paca pacg lrcpc fcma
> CPU implementer	: 0xaa
> CPU architecture: 8
> CPU variant	: 0xaa
> CPU part	: 0xaa
> CPU revision	: 0
> 
> should enable all the Armv8.3-A features that GCC is aware of after
> this patch, but we emit:
> 
>          .arch armv8-a+lse+rdma+crc+fcma+rcpc+pauth
> 
> rather than:
> 
>          .arch armv8.3-a
> 
> And that could be a problem because binutils support for the +fcma
> name is relatively recent (your patch from January this year).
> Assembling with older versions of gas is likely to fail, regardless
> of whether the code uses FCMA.
> 
> I think we might need to adjust the driver code so that it tries
> to consolidate features into an architecture level where possible.
> 
> That does theoretically run the risk of making gas enable features that
> GCC isn't aware of and that aren't available on the CPU (for cases where
> Armv8.X-A+features looks like Armv8.Y-A to GCC, but isn't because of a
> missing feature that GCC doesn't know about).  The current code gets
> that wrong in the opposite direction, though, as the above example
> shows: we wouldn't enable FCMA support even if the target has it.
> 

Being pedantic here helps, I think: gas doesn't 'enable' features, but
rather 'accepts' more than GCC needs it to. The distinction matters here
because gas will not suddenly generate a binary containing instructions
that cannot run on the target GCC was instructed to compile for (things
like FMV, target attributes, simd variants etc. aside).

So yeah I agree with Richard that we should still emit armv8.3 in this case.

Kind Regards,
Andre

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 9/10] docs: Add new AArch64 flags
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
                   ` (8 preceding siblings ...)
  2024-10-04 20:19 ` [PATCH 0/8] aarch64: Add new flags for existing features Andrew Pinski
@ 2024-11-12 16:56 ` Andrew Carlotti
  2024-11-26 13:29   ` Richard Sandiford
  2024-11-12 16:57 ` [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler Andrew Carlotti
  2024-11-12 17:02 ` [PATCH 0/10] aarch64: Add new flags for existing features Andrew Carlotti
  11 siblings, 1 reply; 25+ messages in thread
From: Andrew Carlotti @ 2024-11-12 16:56 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Sandiford, Andrew Pinski, Andre Vieira

gcc/ChangeLog:

	* doc/invoke.texi: Add new AArch64 flags.


diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 7146163d66d068522f5aa19f59badc1b05d05114..56186e98ca6a4d28d1c315746ade89cdc835219e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -21439,11 +21439,11 @@ and the features that they enable by default:
 @item @samp{armv8-a} @tab Armv8-A @tab @samp{+fp}, @samp{+simd}
 @item @samp{armv8.1-a} @tab Armv8.1-A @tab @samp{armv8-a}, @samp{+crc}, @samp{+lse}, @samp{+rdma}
 @item @samp{armv8.2-a} @tab Armv8.2-A @tab @samp{armv8.1-a}
-@item @samp{armv8.3-a} @tab Armv8.3-A @tab @samp{armv8.2-a}, @samp{+pauth}
-@item @samp{armv8.4-a} @tab Armv8.4-A @tab @samp{armv8.3-a}, @samp{+flagm}, @samp{+fp16fml}, @samp{+dotprod}
-@item @samp{armv8.5-a} @tab Armv8.5-A @tab @samp{armv8.4-a}, @samp{+sb}, @samp{+ssbs}, @samp{+predres}
+@item @samp{armv8.3-a} @tab Armv8.3-A @tab @samp{armv8.2-a}, @samp{+pauth}, @samp{+fcma}, @samp{+jscvt}
+@item @samp{armv8.4-a} @tab Armv8.4-A @tab @samp{armv8.3-a}, @samp{+flagm}, @samp{+fp16fml}, @samp{+dotprod}, @samp{+rcpc2}
+@item @samp{armv8.5-a} @tab Armv8.5-A @tab @samp{armv8.4-a}, @samp{+sb}, @samp{+ssbs}, @samp{+predres}, @samp{+frintts}, @samp{+flagm2}
 @item @samp{armv8.6-a} @tab Armv8.6-A @tab @samp{armv8.5-a}, @samp{+bf16}, @samp{+i8mm}
-@item @samp{armv8.7-a} @tab Armv8.7-A @tab @samp{armv8.6-a}
+@item @samp{armv8.7-a} @tab Armv8.7-A @tab @samp{armv8.6-a}, @samp{+wfxt}, @samp{+xs}
 @item @samp{armv8.8-a} @tab Armv8.8-a @tab @samp{armv8.7-a}, @samp{+mops}
 @item @samp{armv8.9-a} @tab Armv8.9-a @tab @samp{armv8.8-a}
 @item @samp{armv9-a} @tab Armv9-A @tab @samp{armv8.5-a}, @samp{+sve}, @samp{+sve2}
@@ -21779,6 +21779,8 @@ Enable the instructions to accelerate memory operations like @code{memcpy},
 @option{-march=armv8.8-a}
 @item flagm
 Enable the Flag Manipulation instructions Extension.
+@item flagm2
+Enable the FlagM2 flag conversion instructions.
 @item pauth
 Enable the Pointer Authentication Extension.
 @item cssc
@@ -21791,6 +21793,16 @@ Enable the FEAT_SME_I16I64 extension to SME.
 Enable the FEAT_SME_F64F64 extension to SME.
 @item sme2
 Enable the Scalable Matrix Extension 2.  This also enables SME instructions.
+@item fcma
+Enable the complex number SIMD extensions.
+@item jscvt
+Enable the @code{fjcvtzs} JavaScript conversion instruction.
+@item frintts
+Enable floating-point round to integral value instructions.
+@item wfxt
+Enable @code{wfet} and @code{wfit} instructions.
+@item xs
+Enable the XS memory attribute extension.
 @item lse128
 Enable the LSE128 128-bit atomic instructions extension.  This also
 enables LSE instructions.
@@ -21801,6 +21813,8 @@ This also enables the LSE128 extension.
 Enable support for Armv9.4-a Guarded Control Stack extension.
 @item the
 Enable support for Armv8.9-a/9.4-a translation hardening extension.
+@item rcpc2
+Enable the RCpc2 extension.
 @item rcpc3
 Enable the RCpc3 (Release Consistency) extension.
 @item fp8

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
                   ` (9 preceding siblings ...)
  2024-11-12 16:56 ` [PATCH 9/10] docs: Add new AArch64 flags Andrew Carlotti
@ 2024-11-12 16:57 ` Andrew Carlotti
  2024-11-25 23:26   ` Richard Sandiford
  2024-11-12 17:02 ` [PATCH 0/10] aarch64: Add new flags for existing features Andrew Carlotti
  11 siblings, 1 reply; 25+ messages in thread
From: Andrew Carlotti @ 2024-11-12 16:57 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Sandiford, Andrew Pinski, Andre Vieira

These new flags (+fcma, +jscvt, +rcpc2, +frintts, +flagm2, +wfxt and +xs)
were only recently added to the assembler.  To improve compatibility
with older assemblers, we try to avoid passing these new flags to the
assembler if we can express the targeted architecture without them.  We
do so by using an almost-equivalent architecture string with a higher
architecture version.

This should never reduce the set of instructions accepted by the
assembler.  It will make it more lenient in two cases:

1. Many system registers are currently gated behind architecture
versions instead of specific feature flags.  Increasing the base
architecture version may cause more system register accesses to be
accepted.

2. FEAT_XS doesn't have an HWCAP bit or cpuinfo entry.  We still want to
avoid passing +wfxt or +noxs to the assembler if possible, so we'll
instruct the assembler to accept FEAT_XS instructions as well whenever
the rest of the new features are enabled.
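
The promotion described above can be sketched in isolation as follows.
This is a simplified, self-contained model of the approach and not the
code in the patch: the Arch enum, the FL_* bits and the names has_all and
promote_arch are hypothetical stand-ins for GCC's aarch64_arch and
AARCH64_FL_* values, and only the Armv8.x-A levels are modelled.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the architecture levels and feature bits.
enum Arch { V8A, V8_1A, V8_2A, V8_3A, V8_4A, V8_5A, V8_6A, V8_7A };
enum : uint32_t {
  FL_FCMA = 1u << 0,    FL_JSCVT = 1u << 1,  FL_RCPC2 = 1u << 2,
  FL_FRINTTS = 1u << 3, FL_FLAGM2 = 1u << 4, FL_WFXT = 1u << 5,
  FL_XS = 1u << 6
};

// True if every bit in WANTED is set in FLAGS.
static bool has_all (uint32_t flags, uint32_t wanted)
{
  return (flags & wanted) == wanted;
}

// Raise ARCH one step at a time, stopping at the first level whose newly
// named features are not all enabled.  May set FL_XS in FLAGS.
Arch promote_arch (Arch arch, uint32_t &flags)
{
  if (!has_all (flags, FL_FCMA | FL_JSCVT))
    return arch;
  if (arch < V8_3A)
    arch = V8_3A;

  if (!has_all (flags, FL_RCPC2))
    return arch;
  if (arch == V8_3A)
    arch = V8_4A;

  if (!has_all (flags, FL_FRINTTS | FL_FLAGM2))
    return arch;
  if (arch == V8_4A)
    arch = V8_5A;

  if (!has_all (flags, FL_WFXT))
    return arch;
  if (arch == V8_5A || arch == V8_6A)
    {
      arch = V8_7A;
      // FEAT_XS cannot be detected natively, so assume it is present
      // rather than emit +noxs.
      flags |= FL_XS;
    }
  return arch;
}
```

Any feature that the promoted architecture enables by default but that is
absent from the flags would still have to be expressed as a +no<feature>
modifier when the final string is built; the sketch leaves that step out.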

gcc/ChangeLog:

	* common/config/aarch64/aarch64-common.cc
	(aarch64_get_arch_string_for_assembler): New.
	(aarch64_rewrite_march): New.
	(aarch64_rewrite_selected_cpu): Call new function.
	* config/aarch64/aarch64-elf.h (ASM_SPEC): Remove identity mapping.
	* config/aarch64/aarch64-protos.h
	(aarch64_get_arch_string_for_assembler): New.
	* config/aarch64/aarch64.cc
	(aarch64_declare_function_name): Call new function.
	(aarch64_start_file): Ditto.
	* config/aarch64/aarch64.h
	(EXTRA_SPEC_FUNCTIONS): Use new macro name.
	(MCPU_TO_MARCH_SPEC): Rename to...
	(MARCH_REWRITE_SPEC): ...this, and add new spec rule.
	(aarch64_rewrite_march): New declaration.
	(MCPU_TO_MARCH_SPEC_FUNCTIONS): Rename to...
	(MARCH_REWRITE_SPEC_FUNCTIONS): ...this, and add new function.
	(ASM_CPU_SPEC): Use new macro name.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/cpunative/native_cpu_21.c: Update check.
	* gcc.target/aarch64/cpunative/native_cpu_22.c: Update check.
	* gcc.target/aarch64/cpunative/info_27: New test.
	* gcc.target/aarch64/cpunative/info_28: New test.
	* gcc.target/aarch64/cpunative/info_29: New test.
	* gcc.target/aarch64/cpunative/native_cpu_27.c: New test.
	* gcc.target/aarch64/cpunative/native_cpu_28.c: New test.
	* gcc.target/aarch64/cpunative/native_cpu_29.c: New test.


diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc
index 2bfc597e333b6018970a9ee6e370a66b6d0960ef..717b3238be16f39a6fd1b4143662eb540ccf292d 100644
--- a/gcc/common/config/aarch64/aarch64-common.cc
+++ b/gcc/common/config/aarch64/aarch64-common.cc
@@ -371,6 +371,119 @@ aarch64_get_extension_string_for_isa_flags
   return outstr;
 }
 
+/* Generate an arch string to be passed to the assembler.
+
+   Several flags were added retrospectively for features that were previously
+   enabled only by specifying an architecture version.  We want to avoid
+   passing these flags to the assembler if possible, to improve compatibility
+   with older assemblers.  */
+
+std::string
+aarch64_get_arch_string_for_assembler (aarch64_arch arch,
+				       aarch64_feature_flags flags)
+{
+  if (!(flags & AARCH64_FL_FCMA) || !(flags & AARCH64_FL_JSCVT))
+    goto done;
+
+  if (arch == AARCH64_ARCH_V8A
+      || arch == AARCH64_ARCH_V8_1A
+      || arch == AARCH64_ARCH_V8_2A)
+    arch = AARCH64_ARCH_V8_3A;
+
+  if (!(flags & AARCH64_FL_RCPC2))
+    goto done;
+
+  if (arch == AARCH64_ARCH_V8_3A)
+    arch = AARCH64_ARCH_V8_4A;
+
+  if (!(flags & AARCH64_FL_FRINTTS) || !(flags & AARCH64_FL_FLAGM2))
+    goto done;
+
+  if (arch == AARCH64_ARCH_V8_4A)
+    arch = AARCH64_ARCH_V8_5A;
+
+  if (!(flags & AARCH64_FL_WFXT))
+    goto done;
+
+  if (arch == AARCH64_ARCH_V8_5A || arch == AARCH64_ARCH_V8_6A)
+    {
+      arch = AARCH64_ARCH_V8_7A;
+      /* We don't support native detection for FEAT_XS, so we'll assume it's
+	 present if the rest of these features are also present.  If we don't
+	 do this, then we would end up passing +noxs to the assembler.  */
+      flags |= AARCH64_FL_XS;
+    }
+done:
+
+  const struct arch_to_arch_name* a_to_an;
+  for (a_to_an = all_architectures;
+       a_to_an->arch != aarch64_no_arch;
+       a_to_an++)
+    {
+      if (a_to_an->arch == arch)
+	break;
+    }
+
+  std::string outstr = a_to_an->arch_name
+	+ aarch64_get_extension_string_for_isa_flags (flags, a_to_an->flags);
+
+  return outstr;
+}
+
+/* Called by the driver to rewrite a name passed to the -march
+   argument in preparation to be passed to the assembler.  The
+   names passed from the command line will be in ARGV, we want
+   to use the right-most argument, which should be in
+   ARGV[ARGC - 1].  ARGC should always be greater than 0.  */
+
+const char *
+aarch64_rewrite_march (int argc, const char **argv)
+{
+  gcc_assert (argc);
+  const char *name = argv[argc - 1];
+  std::string original_string (name);
+  std::string extension_str;
+  std::string base_name;
+  size_t extension_pos = original_string.find_first_of ('+');
+
+  /* Strip and save the extension string.  */
+  if (extension_pos != std::string::npos)
+    {
+      base_name = original_string.substr (0, extension_pos);
+      extension_str = original_string.substr (extension_pos,
+					      std::string::npos);
+    }
+  else
+    {
+      /* No extensions.  */
+      base_name = original_string;
+    }
+
+  const struct arch_to_arch_name* a_to_an;
+  for (a_to_an = all_architectures;
+       a_to_an->arch != aarch64_no_arch;
+       a_to_an++)
+    {
+      if (a_to_an->arch_name == base_name)
+	break;
+    }
+
+  /* We couldn't find that architecture name.  */
+  if (a_to_an->arch == aarch64_no_arch)
+    fatal_error (input_location, "unknown value %qs for %<-march%>", name);
+
+  aarch64_feature_flags flags = a_to_an->flags;
+  aarch64_parse_extension (extension_str.c_str (), &flags, NULL);
+
+  std::string outstr = aarch64_get_arch_string_for_assembler (a_to_an->arch,
+							      flags);
+
+  /* We are going to memory leak here, nobody elsewhere
+     in the callchain is going to clean up after us.  The alternative is
+     to allocate a static buffer, and assert that it is big enough for our
+     modified string, which seems much worse!  */
+  return xstrdup (outstr.c_str ());
+}
 /* Attempt to rewrite NAME, which has been passed on the command line
    as a -mcpu option to an equivalent -march value.  If we can do so,
    return the new string, otherwise return an error.  */
@@ -414,7 +527,7 @@ aarch64_rewrite_selected_cpu (const char *name)
 	break;
     }
 
-  /* We couldn't find that proceesor name, or the processor name we
+  /* We couldn't find that processor name, or the processor name we
      found does not map to an architecture we understand.  */
   if (p_to_a->arch == aarch64_no_arch
       || a_to_an->arch == aarch64_no_arch)
@@ -423,9 +536,8 @@ aarch64_rewrite_selected_cpu (const char *name)
   aarch64_feature_flags extensions = p_to_a->flags;
   aarch64_parse_extension (extension_str.c_str (), &extensions, NULL);
 
-  std::string outstr = a_to_an->arch_name
-	+ aarch64_get_extension_string_for_isa_flags (extensions,
-						      a_to_an->flags);
+  std::string outstr = aarch64_get_arch_string_for_assembler (a_to_an->arch,
+							      extensions);
 
   /* We are going to memory leak here, nobody elsewhere
      in the callchain is going to clean up after us.  The alternative is
diff --git a/gcc/config/aarch64/aarch64-elf.h b/gcc/config/aarch64/aarch64-elf.h
index b6fb7936789fed7fc07d61c6e10301f0c451ac5c..1e210ce3b8beacb0ded7b482694df10368b9a50b 100644
--- a/gcc/config/aarch64/aarch64-elf.h
+++ b/gcc/config/aarch64/aarch64-elf.h
@@ -136,7 +136,6 @@
 #define ASM_SPEC "\
 %{mbig-endian:-EB} \
 %{mlittle-endian:-EL} \
-%{march=*:-march=%*} \
 %(asm_cpu_spec)" \
 ASM_MABI_SPEC
 #endif
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 6ab41a21c75dbdb6bba7875408bc1aa6959c9033..6cf09b41e88cb4d029e4d38f722a4247b9f84328 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1165,6 +1165,8 @@ enum aarch_parse_opt_result aarch64_parse_extension (const char *,
 void aarch64_get_all_extension_candidates (auto_vec<const char *> *candidates);
 std::string aarch64_get_extension_string_for_isa_flags (aarch64_feature_flags,
 							aarch64_feature_flags);
+std::string aarch64_get_arch_string_for_assembler (aarch64_arch,
+						   aarch64_feature_flags);
 
 rtl_opt_pass *make_pass_aarch64_early_ra (gcc::context *);
 rtl_opt_pass *make_pass_fma_steering (gcc::context *);
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index e50bd43f3908916903fe724ec39ae137bc68dfad..9287329a76392034725080ae79060dcd16cfd753 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -1417,7 +1417,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
 #define HAVE_LOCAL_CPU_DETECT
 # define EXTRA_SPEC_FUNCTIONS                                           \
   { "local_cpu_detect", host_detect_local_cpu },                        \
-  MCPU_TO_MARCH_SPEC_FUNCTIONS
+  MARCH_REWRITE_SPEC_FUNCTIONS
 
 /* Rewrite -m{arch,cpu,tune}=native based on the host system information.
    When rewriting -march=native convert it into an -mcpu option if no other
@@ -1434,7 +1434,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
  { "tune", "%{!mcpu=*:%{!mtune=*:%{!march=native:-mtune=%(VALUE)}}}" },
 #else
 # define MCPU_MTUNE_NATIVE_SPECS ""
-# define EXTRA_SPEC_FUNCTIONS MCPU_TO_MARCH_SPEC_FUNCTIONS
+# define EXTRA_SPEC_FUNCTIONS MARCH_REWRITE_SPEC_FUNCTIONS
 # define CONFIG_TUNE_SPEC                                                \
   {"tune", "%{!mcpu=*:%{!mtune=*:-mtune=%(VALUE)}}"},
 #endif
@@ -1449,15 +1449,18 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
   {"cpu",  "%{!march=*:%{!mcpu=*:-mcpu=%(VALUE)}}" },   \
   CONFIG_TUNE_SPEC
 
-#define MCPU_TO_MARCH_SPEC \
-   " %{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}"
+#define MARCH_REWRITE_SPEC \
+   " %{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}" \
+   " %{march=*:-march=%:rewrite_march(%{march=*:%*})}"
 
 extern const char *aarch64_rewrite_mcpu (int argc, const char **argv);
-#define MCPU_TO_MARCH_SPEC_FUNCTIONS \
-  { "rewrite_mcpu", aarch64_rewrite_mcpu },
+extern const char *aarch64_rewrite_march (int argc, const char **argv);
+#define MARCH_REWRITE_SPEC_FUNCTIONS \
+  { "rewrite_mcpu", aarch64_rewrite_mcpu }, \
+  { "rewrite_march", aarch64_rewrite_march },
 
 #define ASM_CPU_SPEC \
-   MCPU_TO_MARCH_SPEC
+   MARCH_REWRITE_SPEC
 
 #define EXTRA_SPECS						\
   { "asm_cpu_spec",		ASM_CPU_SPEC }
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 81885d442ed3e7008aacc937c1a9305b7824bc7c..f32b90c188228fe980e247b4448ee3c1f3e7ddfc 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -24846,16 +24846,12 @@ aarch64_declare_function_name (FILE *stream, const char* name,
     targ_options = TREE_TARGET_OPTION (target_option_current_node);
   gcc_assert (targ_options);
 
-  const struct processor *this_arch
-    = aarch64_get_arch (targ_options->x_selected_arch);
-
   auto isa_flags = aarch64_get_asm_isa_flags (targ_options);
-  std::string extension
-    = aarch64_get_extension_string_for_isa_flags (isa_flags,
-						  this_arch->flags);
+  aarch64_arch arch = targ_options->x_selected_arch;
+  std::string to_print
+    = aarch64_get_arch_string_for_assembler (arch, isa_flags);
   /* Only update the assembler .arch string if it is distinct from the last
      such string we printed.  */
-  std::string to_print = this_arch->name + extension;
   if (to_print != aarch64_last_printed_arch_string)
     {
       asm_fprintf (asm_out_file, "\t.arch %s\n", to_print.c_str ());
@@ -24977,19 +24973,16 @@ aarch64_start_file (void)
   struct cl_target_option *default_options
     = TREE_TARGET_OPTION (target_option_default_node);
 
-  const struct processor *default_arch
-    = aarch64_get_arch (default_options->x_selected_arch);
+  aarch64_arch default_arch = default_options->x_selected_arch;
   auto default_isa_flags = aarch64_get_asm_isa_flags (default_options);
-  std::string extension
-    = aarch64_get_extension_string_for_isa_flags (default_isa_flags,
-						  default_arch->flags);
-
-   aarch64_last_printed_arch_string = default_arch->name + extension;
-   aarch64_last_printed_tune_string = "";
-   asm_fprintf (asm_out_file, "\t.arch %s\n",
-		aarch64_last_printed_arch_string.c_str ());
-
-   default_file_start ();
+  std::string arch_string
+    = aarch64_get_arch_string_for_assembler (default_arch, default_isa_flags);
+  aarch64_last_printed_arch_string = arch_string;
+  aarch64_last_printed_tune_string = "";
+  asm_fprintf (asm_out_file, "\t.arch %s\n",
+	       arch_string.c_str ());
+
+  default_file_start ();
 }
 
 /* Emit load exclusive.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_27 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_27
new file mode 100644
index 0000000000000000000000000000000000000000..1ca9354579f0b7fdd77e31857d744476529cd301
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_27
@@ -0,0 +1,8 @@
+processor	: 0
+BogoMIPS	: 100.00
+Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti paca pacg
+CPU implementer	: 0x41
+CPU architecture: 8
+CPU variant	: 0x0
+CPU part	: 0xd08
+CPU revision	: 2
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_28 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_28
new file mode 100644
index 0000000000000000000000000000000000000000..0c216abbb9e4d5c0273eaeb1824dc16e66b09c6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_28
@@ -0,0 +1,8 @@
+processor	: 0
+BogoMIPS	: 100.00
+Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti paca pacg
+CPU implementer	: 0x41
+CPU architecture: 8
+CPU variant	: 0x0
+CPU part	: 0xd08
+CPU revision	: 2
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_29 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_29
new file mode 100644
index 0000000000000000000000000000000000000000..308c06710902507fcf274aa61e2244937d4e227b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_29
@@ -0,0 +1,8 @@
+processor	: 0
+BogoMIPS	: 100.00
+Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti paca pacg wfxt
+CPU implementer	: 0x41
+CPU architecture: 8
+CPU variant	: 0x0
+CPU part	: 0xd08
+CPU revision	: 2
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
index 904cdf452263961442f3ecc31cd1b6563130f9c7..e56b9164024c7535d6b10f451b7bc0796e7bd161 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
@@ -7,7 +7,7 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc2\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
+/* { dg-final { scan-assembler {\.arch armv8\.5-a\+crc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+nopredres\+nopauth\n} } } */
 
 /* Check that an Armv8-A core doesn't fall apart on extensions without midr
    values.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
index feb959b11b0e383a5e1f3214d55f80f56d2605d4..db3df27a22ea9275ca303e911061f2c35d3ba722 100644
--- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
@@ -7,7 +7,7 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc2\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
+/* { dg-final { scan-assembler {\.arch armv8\.5-a\+crc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+nopredres\n} } } */
 
 /* Check that an Armv8-A core doesn't fall apart on extensions without midr
    values and that it enables optional features.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_27.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_27.c
new file mode 100644
index 0000000000000000000000000000000000000000..43df6a50706df8855d2e960e508778542d81e643
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_27.c
@@ -0,0 +1,10 @@
+/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
+/* { dg-set-compiler-env-var GCC_CPUINFO "$srcdir/gcc.target/aarch64/cpunative/info_27" } */
+/* { dg-additional-options "-mcpu=native" } */
+
+int main()
+{
+  return 0;
+}
+
+/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+rcpc2\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_28.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_28.c
new file mode 100644
index 0000000000000000000000000000000000000000..0e0e56f539433ea02c5c71c8c0bae5ddb256e962
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_28.c
@@ -0,0 +1,10 @@
+/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
+/* { dg-set-compiler-env-var GCC_CPUINFO "$srcdir/gcc.target/aarch64/cpunative/info_28" } */
+/* { dg-additional-options "-mcpu=native" } */
+
+int main()
+{
+  return 0;
+}
+
+/* { dg-final { scan-assembler {\.arch armv8\.3-a\+flagm2\+dotprod\+crc\+fp16fml\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_29.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_29.c
new file mode 100644
index 0000000000000000000000000000000000000000..9b07161b77d75cfec19aea01fcf2eb5ece91853a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_29.c
@@ -0,0 +1,10 @@
+/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
+/* { dg-set-compiler-env-var GCC_CPUINFO "$srcdir/gcc.target/aarch64/cpunative/info_29" } */
+/* { dg-additional-options "-mcpu=native" } */
+
+int main()
+{
+  return 0;
+}
+
+/* { dg-final { scan-assembler {\.arch armv8\.7-a\+crc\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+nopredres\n} } } */

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/10] aarch64: Add new flags for existing features
  2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
                   ` (10 preceding siblings ...)
  2024-11-12 16:57 ` [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler Andrew Carlotti
@ 2024-11-12 17:02 ` Andrew Carlotti
  11 siblings, 0 replies; 25+ messages in thread
From: Andrew Carlotti @ 2024-11-12 17:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Sandiford, Andrew Pinski, Andre Vieira

On Fri, Oct 04, 2024 at 06:50:36PM +0100, Andrew Carlotti wrote:
> This patch series adds 7 new flags for features that were previously available
> in GCC only as part of an architecture version.  It also fixes one other
> instance where an architecture version was used in a check instead of a feature
> flag.
> 
> Bootstrapped and regression tested as a whole on aarch64.  I additionally ran
> the cpunative tests after each patch in the series.  Ok for master?


I've added two more patches to address reviews (one for docs, one for -march
rewriting), and updated (but not reposted) the comment in patch 1.  I've
bootstrapped and regression tested this new series as a whole on a native
aarch64 build.  Is the full series of 10 patches ok for master?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler
  2024-11-12 16:57 ` [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler Andrew Carlotti
@ 2024-11-25 23:26   ` Richard Sandiford
  2025-01-07 12:11     ` Andrew Carlotti
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Sandiford @ 2024-11-25 23:26 UTC (permalink / raw)
  To: Andrew Carlotti; +Cc: gcc-patches, Andrew Pinski, Andre Vieira

Sorry for the slow review.

Andrew Carlotti <andrew.carlotti@arm.com> writes:
> These new flags (+fcma, +jscvt, +rcpc2, +flagm2, +frintts, +wfxt and +xs)
> were only recently added to the assembler.  To improve compatibility
> with older assemblers, we try to avoid passing these new flags to the
> assembler if we can express the targeted architecture without them.  We
> do so by using an almost-equivalent architecture string with a higher
> architecture version.
>
> This should never reduce the set of instructions accepted by the
> assembler.  It will make it more lenient in two cases:
>
> 1. Many system registers are currently gated behind architecture
> versions instead of specific feature flags.  Increasing the base
> architecture version may cause more system register accesses to be
> accepted.
>
> 2. FEAT_XS doesn't have an HWCAP bit or cpuinfo entry.  We still want to
> avoid passing +wfxt or +noxs to the assembler if possible, so we'll
> instruct the assembler to accept FEAT_XS instructions as well whenever
> the rest of the new features are enabled.
>
> gcc/ChangeLog:
>
> 	* common/config/aarch64/aarch64-common.cc
> 	(aarch64_get_arch_string_for_assembler): New.
> 	(aarch64_rewrite_march): New.
> 	(aarch64_rewrite_selected_cpu): Call new function.
> 	* config/aarch64/aarch64-elf.h (ASM_SPEC): Remove identity mapping.
> 	* config/aarch64/aarch64-protos.h
> 	(aarch64_get_arch_string_for_assembler): New.
> 	* config/aarch64/aarch64.cc
> 	(aarch64_declare_function_name): Call new function.
> 	(aarch64_start_file): Ditto.
> 	* config/aarch64/aarch64.h
> 	(EXTRA_SPEC_FUNCTIONS): Use new macro name.
> 	(MCPU_TO_MARCH_SPEC): Rename to...
> 	(MARCH_REWRITE_SPEC): ...this, and add new spec rule.
> 	(aarch64_rewrite_march): New declaration.
> 	(MCPU_TO_MARCH_SPEC_FUNCTIONS): Rename to...
> 	(MARCH_REWRITE_SPEC_FUNCTIONS): ...this, and add new function.
> 	(ASM_CPU_SPEC): Use new macro name.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/aarch64/cpunative/native_cpu_21.c: Update check.
> 	* gcc.target/aarch64/cpunative/native_cpu_22.c: Update check.
> 	* gcc.target/aarch64/cpunative/info_27: New test.
> 	* gcc.target/aarch64/cpunative/info_28: New test.
> 	* gcc.target/aarch64/cpunative/info_29: New test.
> 	* gcc.target/aarch64/cpunative/native_cpu_27.c: New test.
> 	* gcc.target/aarch64/cpunative/native_cpu_28.c: New test.
> 	* gcc.target/aarch64/cpunative/native_cpu_29.c: New test.
>
>
> diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc
> index 2bfc597e333b6018970a9ee6e370a66b6d0960ef..717b3238be16f39a6fd1b4143662eb540ccf292d 100644
> --- a/gcc/common/config/aarch64/aarch64-common.cc
> +++ b/gcc/common/config/aarch64/aarch64-common.cc
> @@ -371,6 +371,119 @@ aarch64_get_extension_string_for_isa_flags
>    return outstr;
>  }
>  
> +/* Generate an arch string to be passed to the assembler.
> +
> +   Several flags were added retrospectively for features that were previously
> +   enabled only by specifying an architecture version.  We want to avoid
> +   passing these flags to the assembler if possible, to improve compatibility
> +   with older assemblers.  */
> +
> +std::string
> +aarch64_get_arch_string_for_assembler (aarch64_arch arch,
> +				       aarch64_feature_flags flags)
> +{
> +  if (!(flags & AARCH64_FL_FCMA) || !(flags & AARCH64_FL_JSCVT))
> +    goto done;
> +
> +  if (arch == AARCH64_ARCH_V8A
> +      || arch == AARCH64_ARCH_V8_1A
> +      || arch == AARCH64_ARCH_V8_2A)
> +    arch = AARCH64_ARCH_V8_3A;
> +
> +  if (!(flags & AARCH64_FL_RCPC2))
> +    goto done;
> +
> +  if (arch == AARCH64_ARCH_V8_3A)
> +    arch = AARCH64_ARCH_V8_4A;
> +
> +  if (!(flags & AARCH64_FL_FRINTTS) || !(flags & AARCH64_FL_FLAGM2))
> +    goto done;
> +
> +  if (arch == AARCH64_ARCH_V8_4A)
> +    arch = AARCH64_ARCH_V8_5A;
> +
> +  if (!(flags & AARCH64_FL_WFXT))
> +    goto done;
> +
> +  if (arch == AARCH64_ARCH_V8_5A || arch == AARCH64_ARCH_V8_6A)
> +    {
> +      arch = AARCH64_ARCH_V8_7A;
> +      /* We don't support native detection for FEAT_XS, so we'll assume it's
> +	 present if the rest of these features are also present.  If we don't
> +	 do this, then we would end up passing +noxs to the assembler.  */
> +      flags |= AARCH64_FL_XS;
> +    }
> +done:
> +
> +  const struct arch_to_arch_name* a_to_an;
> +  for (a_to_an = all_architectures;
> +       a_to_an->arch != aarch64_no_arch;
> +       a_to_an++)
> +    {
> +      if (a_to_an->arch == arch)
> +	break;
> +    }
> +
> +  std::string outstr = a_to_an->arch_name
> +	+ aarch64_get_extension_string_for_isa_flags (flags, a_to_an->flags);
> +
> +  return outstr;
> +}

I was hoping we could do this in a table-driven way.  Experimenting
a bit locally (but only lightly tested), the following seems to work:

aarch64.h:

/* The set of all architecture flags.  */
constexpr auto AARCH64_FL_ARCHES ATTRIBUTE_UNUSED = aarch64_feature_flags (0)
#define AARCH64_ARCH(A, B, ARCH_IDENT, D, E) \
  | feature_deps::ARCH_IDENT ().flag
#include "config/aarch64/aarch64-arches.def"
;

aarch64-common.cc:

...
  const struct arch_to_arch_name *best = nullptr;
  for (auto *a_to_an = all_architectures;
       a_to_an->arch != aarch64_no_arch;
       a_to_an++)
    {
      /* Require the architecture to have all architecture flags in FLAGS.  */
      if ((~a_to_an->flags & flags & AARCH64_FL_ARCHES) != 0)
	continue;

      /* Skip architectures that add no new mandatory features.  */
      if (best && (a_to_an->flags & ~best->flags & ~AARCH64_FL_ARCHES) == 0)
	continue;

      /* Require FLAGS to include all mandatory extensions.  */
      if ((a_to_an->flags & ~flags & ~AARCH64_FL_ARCHES) != 0)
        continue;

      best = a_to_an;
    }
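
As a rough, self-contained illustration of how the three checks in the loop
above interact, here is a toy version outside GCC.  Everything in it is an
assumption made for the example: the flag bits, the three-entry table and the
names feature_flags, arch_entry and pick_best_arch are hypothetical, and the
real tables come from aarch64-arches.def.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins for aarch64_feature_flags and the AARCH64_FL_* bits.
using feature_flags = std::uint64_t;
constexpr feature_flags FL_V8A    = feature_flags{1} << 0;
constexpr feature_flags FL_V8_3A  = feature_flags{1} << 1;
constexpr feature_flags FL_V8_4A  = feature_flags{1} << 2;
constexpr feature_flags FL_FCMA   = feature_flags{1} << 8;
constexpr feature_flags FL_JSCVT  = feature_flags{1} << 9;
constexpr feature_flags FL_RCPC2  = feature_flags{1} << 10;

// The set of all architecture flags (the role of AARCH64_FL_ARCHES above).
constexpr feature_flags FL_ARCHES = FL_V8A | FL_V8_3A | FL_V8_4A;

struct arch_entry
{
  const char *name;
  feature_flags flags;  // architecture bits plus mandatory extensions
};

// Toy table, ordered from oldest to newest architecture.
static const std::vector<arch_entry> all_arches = {
  { "armv8-a",   FL_V8A },
  { "armv8.3-a", FL_V8A | FL_V8_3A | FL_FCMA | FL_JSCVT },
  { "armv8.4-a", FL_V8A | FL_V8_3A | FL_V8_4A | FL_FCMA | FL_JSCVT | FL_RCPC2 },
};

// Return the table entry chosen by the three checks, or nullptr if none fits.
const arch_entry *
pick_best_arch (feature_flags flags)
{
  const arch_entry *best = nullptr;
  for (const auto &a : all_arches)
    {
      // Require the architecture to have all architecture flags in FLAGS.
      if ((~a.flags & flags & FL_ARCHES) != 0)
	continue;

      // Skip architectures that add no new mandatory features.
      if (best && (a.flags & ~best->flags & ~FL_ARCHES) == 0)
	continue;

      // Require FLAGS to include all mandatory extensions.
      if ((a.flags & ~flags & ~FL_ARCHES) != 0)
	continue;

      best = &a;
    }
  return best;
}
```

With this toy table, armv8-a plus FCMA and JSCVT selects armv8.3-a, while
armv8-a plus FCMA alone stays at armv8-a.  Note that the toy version returns
nullptr when no entry passes all three checks (for example armv8.3-a with
JSCVT removed), so a caller would need some fallback; the fragment above does
not show how that case should be handled.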

> +
> +/* Called by the driver to rewrite a name passed to the -march
> +   argument in preparation to be passed to the assembler.  The
> +   names passed from the command line will be in ARGV, we want
> +   to use the right-most argument, which should be in
> +   ARGV[ARGC - 1].  ARGC should always be greater than 0.  */
> +
> +const char *
> +aarch64_rewrite_march (int argc, const char **argv)
> +{
> +  gcc_assert (argc);
> +  const char *name = argv[argc - 1];
> +  std::string original_string (name);
> +  std::string extension_str;
> +  std::string base_name;
> +  size_t extension_pos = original_string.find_first_of ('+');
> +
> +  /* Strip and save the extension string.  */
> +  if (extension_pos != std::string::npos)
> +    {
> +      base_name = original_string.substr (0, extension_pos);
> +      extension_str = original_string.substr (extension_pos,
> +					      std::string::npos);
> +    }
> +  else
> +    {
> +      /* No extensions.  */
> +      base_name = original_string;
> +    }
> +
> +  const struct arch_to_arch_name* a_to_an;
> +  for (a_to_an = all_architectures;
> +       a_to_an->arch != aarch64_no_arch;
> +       a_to_an++)
> +    {
> +      if (a_to_an->arch_name == base_name)
> +	break;
> +    }
> +
> +  /* We couldn't find that architecture name.  */
> +  if (a_to_an->arch == aarch64_no_arch)
> +    fatal_error (input_location, "unknown value %qs for %<-march%>", name);
> +
> +  aarch64_feature_flags flags = a_to_an->flags;
> +  aarch64_parse_extension (extension_str.c_str (), &flags, NULL);
> +
> +  std::string outstr = aarch64_get_arch_string_for_assembler (a_to_an->arch,
> +							      flags);
> +
> +  /* We are going to memory leak here, nobody elsewhere
> +     in the callchain is going to clean up after us.  The alternative is
> +     to allocate a static buffer, and assert that it is big enough for our
> +     modified string, which seems much worse!  */
> +  return xstrdup (outstr.c_str ());
> +}

This is going to seem like feature creep, sorry, but: rather than
duplicate the architecture parsing, could we instead move the march
and mcpu processing from aarch64.cc to here?  Specifically:

- aarch64_parse_arch
- aarch64_parse_cpu
- aarch64_validate_mcpu
- aarch64_validate_march
- aarch64_print_hint_for_*

This would mean making "struct processor" public, and so giving it
an aarch64_ name (or putting it in a namespace).  We'd also need to
remove the tuning information and use a separate table for that.
Still, I think it would be more robust than having two pieces of code
doing the same parsing.  It should also give a better UI, since the
driver parsing would give the same hints as the compiler proper.

(Only tested to the point of moving the code and linking the driver.)
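
For reference, the step that any shared parser would have to perform first is
the split of an option value such as "armv8.2-a+sve" into a base name and an
extension string, as the quoted aarch64_rewrite_march does.  A minimal
standalone sketch follows; the helper name split_march_value is hypothetical
and is not an existing GCC function.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Split VALUE at the first '+' into {base name, extension string}.
// The extension string keeps its leading '+', matching what the quoted
// patch passes on to aarch64_parse_extension.  With no '+', the extension
// string is empty.
std::pair<std::string, std::string>
split_march_value (const std::string &value)
{
  const std::size_t pos = value.find ('+');
  if (pos == std::string::npos)
    return { value, std::string () };
  return { value.substr (0, pos), value.substr (pos) };
}
```

Keeping this split in one place is what would let the driver and the compiler
proper report the same hints for an unknown base name or extension.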

Thanks,
Richard

>  /* Attempt to rewrite NAME, which has been passed on the command line
>     as a -mcpu option to an equivalent -march value.  If we can do so,
>     return the new string, otherwise return an error.  */
> @@ -414,7 +527,7 @@ aarch64_rewrite_selected_cpu (const char *name)
>  	break;
>      }
>  
> -  /* We couldn't find that proceesor name, or the processor name we
> +  /* We couldn't find that processor name, or the processor name we
>       found does not map to an architecture we understand.  */
>    if (p_to_a->arch == aarch64_no_arch
>        || a_to_an->arch == aarch64_no_arch)
> @@ -423,9 +536,8 @@ aarch64_rewrite_selected_cpu (const char *name)
>    aarch64_feature_flags extensions = p_to_a->flags;
>    aarch64_parse_extension (extension_str.c_str (), &extensions, NULL);
>  
> -  std::string outstr = a_to_an->arch_name
> -	+ aarch64_get_extension_string_for_isa_flags (extensions,
> -						      a_to_an->flags);
> +  std::string outstr = aarch64_get_arch_string_for_assembler (a_to_an->arch,
> +							      extensions);
>  
>    /* We are going to memory leak here, nobody elsewhere
>       in the callchain is going to clean up after us.  The alternative is
> diff --git a/gcc/config/aarch64/aarch64-elf.h b/gcc/config/aarch64/aarch64-elf.h
> index b6fb7936789fed7fc07d61c6e10301f0c451ac5c..1e210ce3b8beacb0ded7b482694df10368b9a50b 100644
> --- a/gcc/config/aarch64/aarch64-elf.h
> +++ b/gcc/config/aarch64/aarch64-elf.h
> @@ -136,7 +136,6 @@
>  #define ASM_SPEC "\
>  %{mbig-endian:-EB} \
>  %{mlittle-endian:-EL} \
> -%{march=*:-march=%*} \
>  %(asm_cpu_spec)" \
>  ASM_MABI_SPEC
>  #endif
> diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> index 6ab41a21c75dbdb6bba7875408bc1aa6959c9033..6cf09b41e88cb4d029e4d38f722a4247b9f84328 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -1165,6 +1165,8 @@ enum aarch_parse_opt_result aarch64_parse_extension (const char *,
>  void aarch64_get_all_extension_candidates (auto_vec<const char *> *candidates);
>  std::string aarch64_get_extension_string_for_isa_flags (aarch64_feature_flags,
>  							aarch64_feature_flags);
> +std::string aarch64_get_arch_string_for_assembler (aarch64_arch,
> +						   aarch64_feature_flags);
>  
>  rtl_opt_pass *make_pass_aarch64_early_ra (gcc::context *);
>  rtl_opt_pass *make_pass_fma_steering (gcc::context *);
> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index e50bd43f3908916903fe724ec39ae137bc68dfad..9287329a76392034725080ae79060dcd16cfd753 100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -1417,7 +1417,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
>  #define HAVE_LOCAL_CPU_DETECT
>  # define EXTRA_SPEC_FUNCTIONS                                           \
>    { "local_cpu_detect", host_detect_local_cpu },                        \
> -  MCPU_TO_MARCH_SPEC_FUNCTIONS
> +  MARCH_REWRITE_SPEC_FUNCTIONS
>  
>  /* Rewrite -m{arch,cpu,tune}=native based on the host system information.
>     When rewriting -march=native convert it into an -mcpu option if no other
> @@ -1434,7 +1434,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
>   { "tune", "%{!mcpu=*:%{!mtune=*:%{!march=native:-mtune=%(VALUE)}}}" },
>  #else
>  # define MCPU_MTUNE_NATIVE_SPECS ""
> -# define EXTRA_SPEC_FUNCTIONS MCPU_TO_MARCH_SPEC_FUNCTIONS
> +# define EXTRA_SPEC_FUNCTIONS MARCH_REWRITE_SPEC_FUNCTIONS
>  # define CONFIG_TUNE_SPEC                                                \
>    {"tune", "%{!mcpu=*:%{!mtune=*:-mtune=%(VALUE)}}"},
>  #endif
> @@ -1449,15 +1449,18 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
>    {"cpu",  "%{!march=*:%{!mcpu=*:-mcpu=%(VALUE)}}" },   \
>    CONFIG_TUNE_SPEC
>  
> -#define MCPU_TO_MARCH_SPEC \
> -   " %{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}"
> +#define MARCH_REWRITE_SPEC \
> +   " %{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}" \
> +   " %{march=*:-march=%:rewrite_march(%{march=*:%*})}"
>  
>  extern const char *aarch64_rewrite_mcpu (int argc, const char **argv);
> -#define MCPU_TO_MARCH_SPEC_FUNCTIONS \
> -  { "rewrite_mcpu", aarch64_rewrite_mcpu },
> +extern const char *aarch64_rewrite_march (int argc, const char **argv);
> +#define MARCH_REWRITE_SPEC_FUNCTIONS \
> +  { "rewrite_mcpu", aarch64_rewrite_mcpu }, \
> +  { "rewrite_march", aarch64_rewrite_march },
>  
>  #define ASM_CPU_SPEC \
> -   MCPU_TO_MARCH_SPEC
> +   MARCH_REWRITE_SPEC
>  
>  #define EXTRA_SPECS						\
>    { "asm_cpu_spec",		ASM_CPU_SPEC }
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 81885d442ed3e7008aacc937c1a9305b7824bc7c..f32b90c188228fe980e247b4448ee3c1f3e7ddfc 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -24846,16 +24846,12 @@ aarch64_declare_function_name (FILE *stream, const char* name,
>      targ_options = TREE_TARGET_OPTION (target_option_current_node);
>    gcc_assert (targ_options);
>  
> -  const struct processor *this_arch
> -    = aarch64_get_arch (targ_options->x_selected_arch);
> -
>    auto isa_flags = aarch64_get_asm_isa_flags (targ_options);
> -  std::string extension
> -    = aarch64_get_extension_string_for_isa_flags (isa_flags,
> -						  this_arch->flags);
> +  aarch64_arch arch = targ_options->x_selected_arch;
> +  std::string to_print
> +    = aarch64_get_arch_string_for_assembler (arch, isa_flags);
>    /* Only update the assembler .arch string if it is distinct from the last
>       such string we printed.  */
> -  std::string to_print = this_arch->name + extension;
>    if (to_print != aarch64_last_printed_arch_string)
>      {
>        asm_fprintf (asm_out_file, "\t.arch %s\n", to_print.c_str ());
> @@ -24977,19 +24973,16 @@ aarch64_start_file (void)
>    struct cl_target_option *default_options
>      = TREE_TARGET_OPTION (target_option_default_node);
>  
> -  const struct processor *default_arch
> -    = aarch64_get_arch (default_options->x_selected_arch);
> +  aarch64_arch default_arch = default_options->x_selected_arch;
>    auto default_isa_flags = aarch64_get_asm_isa_flags (default_options);
> -  std::string extension
> -    = aarch64_get_extension_string_for_isa_flags (default_isa_flags,
> -						  default_arch->flags);
> -
> -   aarch64_last_printed_arch_string = default_arch->name + extension;
> -   aarch64_last_printed_tune_string = "";
> -   asm_fprintf (asm_out_file, "\t.arch %s\n",
> -		aarch64_last_printed_arch_string.c_str ());
> -
> -   default_file_start ();
> +  std::string arch_string
> +    = aarch64_get_arch_string_for_assembler (default_arch, default_isa_flags);
> +  aarch64_last_printed_arch_string = arch_string;
> +  aarch64_last_printed_tune_string = "";
> +  asm_fprintf (asm_out_file, "\t.arch %s\n",
> +	       arch_string.c_str ());
> +
> +  default_file_start ();
>  }
>  
>  /* Emit load exclusive.  */
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_27 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_27
> new file mode 100644
> index 0000000000000000000000000000000000000000..1ca9354579f0b7fdd77e31857d744476529cd301
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_27
> @@ -0,0 +1,8 @@
> +processor	: 0
> +BogoMIPS	: 100.00
> +Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti paca pacg
> +CPU implementer	: 0x41
> +CPU architecture: 8
> +CPU variant	: 0x0
> +CPU part	: 0xd08
> +CPU revision	: 2
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_28 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_28
> new file mode 100644
> index 0000000000000000000000000000000000000000..0c216abbb9e4d5c0273eaeb1824dc16e66b09c6c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_28
> @@ -0,0 +1,8 @@
> +processor	: 0
> +BogoMIPS	: 100.00
> +Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti paca pacg
> +CPU implementer	: 0x41
> +CPU architecture: 8
> +CPU variant	: 0x0
> +CPU part	: 0xd08
> +CPU revision	: 2
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_29 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_29
> new file mode 100644
> index 0000000000000000000000000000000000000000..308c06710902507fcf274aa61e2244937d4e227b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_29
> @@ -0,0 +1,8 @@
> +processor	: 0
> +BogoMIPS	: 100.00
> +Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti paca pacg wfxt
> +CPU implementer	: 0x41
> +CPU architecture: 8
> +CPU variant	: 0x0
> +CPU part	: 0xd08
> +CPU revision	: 2
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
> index 904cdf452263961442f3ecc31cd1b6563130f9c7..e56b9164024c7535d6b10f451b7bc0796e7bd161 100644
> --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
> @@ -7,7 +7,7 @@ int main()
>    return 0;
>  }
>  
> -/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc2\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
> +/* { dg-final { scan-assembler {\.arch armv8\.5-a\+crc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+nopredres\+nopauth\n} } } */
>  
>  /* Check that an Armv8-A core doesn't fall apart on extensions without midr
>     values.  */
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
> index feb959b11b0e383a5e1f3214d55f80f56d2605d4..db3df27a22ea9275ca303e911061f2c35d3ba722 100644
> --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
> @@ -7,7 +7,7 @@ int main()
>    return 0;
>  }
>  
> -/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc2\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
> +/* { dg-final { scan-assembler {\.arch armv8\.5-a\+crc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+nopredres\n} } } */
>  
>  /* Check that an Armv8-A core doesn't fall apart on extensions without midr
>     values and that it enables optional features.  */
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_27.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_27.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..43df6a50706df8855d2e960e508778542d81e643
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_27.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
> +/* { dg-set-compiler-env-var GCC_CPUINFO "$srcdir/gcc.target/aarch64/cpunative/info_27" } */
> +/* { dg-additional-options "-mcpu=native" } */
> +
> +int main()
> +{
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+rcpc2\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_28.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_28.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..0e0e56f539433ea02c5c71c8c0bae5ddb256e962
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_28.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
> +/* { dg-set-compiler-env-var GCC_CPUINFO "$srcdir/gcc.target/aarch64/cpunative/info_28" } */
> +/* { dg-additional-options "-mcpu=native" } */
> +
> +int main()
> +{
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler {\.arch armv8\.3-a\+flagm2\+dotprod\+crc\+fp16fml\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
> diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_29.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_29.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..9b07161b77d75cfec19aea01fcf2eb5ece91853a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_29.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
> +/* { dg-set-compiler-env-var GCC_CPUINFO "$srcdir/gcc.target/aarch64/cpunative/info_29" } */
> +/* { dg-additional-options "-mcpu=native" } */
> +
> +int main()
> +{
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler {\.arch armv8\.7-a\+crc\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+nopredres\n} } } */

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 9/10] docs: Add new AArch64 flags
  2024-11-12 16:56 ` [PATCH 9/10] docs: Add new AArch64 flags Andrew Carlotti
@ 2024-11-26 13:29   ` Richard Sandiford
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Sandiford @ 2024-11-26 13:29 UTC (permalink / raw)
  To: Andrew Carlotti; +Cc: gcc-patches, Andrew Pinski, Andre Vieira

Andrew Carlotti <andrew.carlotti@arm.com> writes:
> gcc/ChangeLog:
>
> 	* doc/invoke.texi: Add new AArch64 flags.
>

OK, thanks.

Richard

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 7146163d66d068522f5aa19f59badc1b05d05114..56186e98ca6a4d28d1c315746ade89cdc835219e 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -21439,11 +21439,11 @@ and the features that they enable by default:
>  @item @samp{armv8-a} @tab Armv8-A @tab @samp{+fp}, @samp{+simd}
>  @item @samp{armv8.1-a} @tab Armv8.1-A @tab @samp{armv8-a}, @samp{+crc}, @samp{+lse}, @samp{+rdma}
>  @item @samp{armv8.2-a} @tab Armv8.2-A @tab @samp{armv8.1-a}
> -@item @samp{armv8.3-a} @tab Armv8.3-A @tab @samp{armv8.2-a}, @samp{+pauth}
> -@item @samp{armv8.4-a} @tab Armv8.4-A @tab @samp{armv8.3-a}, @samp{+flagm}, @samp{+fp16fml}, @samp{+dotprod}
> -@item @samp{armv8.5-a} @tab Armv8.5-A @tab @samp{armv8.4-a}, @samp{+sb}, @samp{+ssbs}, @samp{+predres}
> +@item @samp{armv8.3-a} @tab Armv8.3-A @tab @samp{armv8.2-a}, @samp{+pauth}, @samp{+fcma}, @samp{+jscvt}
> +@item @samp{armv8.4-a} @tab Armv8.4-A @tab @samp{armv8.3-a}, @samp{+flagm}, @samp{+fp16fml}, @samp{+dotprod}, @samp{+rcpc2}
> +@item @samp{armv8.5-a} @tab Armv8.5-A @tab @samp{armv8.4-a}, @samp{+sb}, @samp{+ssbs}, @samp{+predres}, @samp{+frintts}, @samp{+flagm2}
>  @item @samp{armv8.6-a} @tab Armv8.6-A @tab @samp{armv8.5-a}, @samp{+bf16}, @samp{+i8mm}
> -@item @samp{armv8.7-a} @tab Armv8.7-A @tab @samp{armv8.6-a}
> +@item @samp{armv8.7-a} @tab Armv8.7-A @tab @samp{armv8.6-a}, @samp{+wfxt}, @samp{+xs}
>  @item @samp{armv8.8-a} @tab Armv8.8-a @tab @samp{armv8.7-a}, @samp{+mops}
>  @item @samp{armv8.9-a} @tab Armv8.9-a @tab @samp{armv8.8-a}
>  @item @samp{armv9-a} @tab Armv9-A @tab @samp{armv8.5-a}, @samp{+sve}, @samp{+sve2}
> @@ -21779,6 +21779,8 @@ Enable the instructions to accelerate memory operations like @code{memcpy},
>  @option{-march=armv8.8-a}
>  @item flagm
>  Enable the Flag Manipulation instructions Extension.
> +@item flagm2
> +Enable the FlagM2 flag conversion instructions.
>  @item pauth
>  Enable the Pointer Authentication Extension.
>  @item cssc
> @@ -21791,6 +21793,16 @@ Enable the FEAT_SME_I16I64 extension to SME.
>  Enable the FEAT_SME_F64F64 extension to SME.
>  @item sme2
>  Enable the Scalable Matrix Extension 2.  This also enables SME instructions.
> +@item fcma
> +Enable the complex number SIMD extensions.
> +@item jscvt
> +Enable the @code{fjcvtzs} JavaScript conversion instruction.
> +@item frintts
> +Enable floating-point round to integral value instructions.
> +@item wfxt
> +Enable @code{wfet} and @code{wfit} instructions.
> +@item xs
> +Enable the XS memory attribute extension.
>  @item lse128
>  Enable the LSE128 128-bit atomic instructions extension.  This also
>  enables LSE instructions.
> @@ -21801,6 +21813,8 @@ This also enables the LSE128 extension.
>  Enable support for Armv9.4-a Guarded Control Stack extension.
>  @item the
>  Enable support for Armv8.9-a/9.4-a translation hardening extension.
> +@item rcpc2
> +Enable the RCpc2 extension.
>  @item rcpc3
>  Enable the RCpc3 (Release Consistency) extension.
>  @item fp8

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler
  2024-11-25 23:26   ` Richard Sandiford
@ 2025-01-07 12:11     ` Andrew Carlotti
  2025-01-07 17:50       ` Richard Sandiford
  0 siblings, 1 reply; 25+ messages in thread
From: Andrew Carlotti @ 2025-01-07 12:11 UTC (permalink / raw)
  To: gcc-patches, Andrew Pinski, Andre Vieira, richard.sandiford

On Mon, Nov 25, 2024 at 11:26:39PM +0000, Richard Sandiford wrote:
> Sorry for the slow review.
> 
> Andrew Carlotti <andrew.carlotti@arm.com> writes:
> > > These new flags (+fcma, +jscvt, +rcpc2, +flagm2, +frintts, +wfxt and +xs)
> > > were only recently added to the assembler.  To improve compatibility
> > > with older assemblers, we try to avoid passing these new flags to the
> > > assembler if we can express the targeted architecture without them.  We
> > do so by using an almost-equivalent architecture string with a higher
> > architecture version.
> >
> > This should never reduce the set of instructions accepted by the
> > assembler.  It will make it more lenient in two cases:
> >
> > 1. Many system registers are currently gated behind architecture
> > versions instead of specific feature flags.  Increasing the base
> > architecture version may cause more system register accesses to be
> > accepted.
> >
> > 2. FEAT_XS doesn't have an HWCAP bit or cpuinfo entry.  We still want to
> > avoid passing +wfxt or +noxs to the assembler if possible, so we'll
> > instruct the assembler to accept FEAT_XS instructions as well whenever
> > the rest of the new features are enabled.
> >
> > gcc/ChangeLog:
> >
> > 	* common/config/aarch64/aarch64-common.cc
> > 	(aarch64_get_arch_string_for_assembler): New.
> > 	(aarch64_rewrite_march): New.
> > 	(aarch64_rewrite_selected_cpu): Call new function.
> > 	* config/aarch64/aarch64-elf.h (ASM_SPEC): Remove identity mapping.
> > 	* config/aarch64/aarch64-protos.h
> > 	(aarch64_get_arch_string_for_assembler): New.
> > 	* config/aarch64/aarch64.cc
> > 	(aarch64_declare_function_name): Call new function.
> > 	(aarch64_start_file): Ditto.
> > 	* config/aarch64/aarch64.h
> > 	* config/aarch64/aarch64.h
> > 	(EXTRA_SPEC_FUNCTIONS): Use new macro name.
> > 	(MCPU_TO_MARCH_SPEC): Rename to...
> > 	(MARCH_REWRITE_SPEC): ...this, and add new spec rule.
> > 	(aarch64_rewrite_march): New declaration.
> > 	(MCPU_TO_MARCH_SPEC_FUNCTIONS): Rename to...
> > 	(MARCH_REWRITE_SPEC_FUNCTIONS): ...this, and add new function.
> > 	(ASM_CPU_SPEC): Use new macro name.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 	* gcc.target/aarch64/cpunative/native_cpu_21.c: Update check.
> > 	* gcc.target/aarch64/cpunative/native_cpu_22.c: Update check.
> > 	* gcc.target/aarch64/cpunative/info_27: New test.
> > 	* gcc.target/aarch64/cpunative/info_28: New test.
> > 	* gcc.target/aarch64/cpunative/info_29: New test.
> > 	* gcc.target/aarch64/cpunative/native_cpu_27.c: New test.
> > 	* gcc.target/aarch64/cpunative/native_cpu_28.c: New test.
> > 	* gcc.target/aarch64/cpunative/native_cpu_29.c: New test.
> >
> >
> > diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc
> > index 2bfc597e333b6018970a9ee6e370a66b6d0960ef..717b3238be16f39a6fd1b4143662eb540ccf292d 100644
> > --- a/gcc/common/config/aarch64/aarch64-common.cc
> > +++ b/gcc/common/config/aarch64/aarch64-common.cc
> > @@ -371,6 +371,119 @@ aarch64_get_extension_string_for_isa_flags
> >    return outstr;
> >  }
> >  
> > +/* Generate an arch string to be passed to the assembler.
> > +
> > +   Several flags were added retrospectively for features that were previously
> > +   enabled only by specifying an architecture version.  We want to avoid
> > +   passing these flags to the assembler if possible, to improve compatibility
> > +   with older assemblers.  */
> > +
> > +std::string
> > +aarch64_get_arch_string_for_assembler (aarch64_arch arch,
> > +				       aarch64_feature_flags flags)
> > +{
> > +  if (!(flags & AARCH64_FL_FCMA) || !(flags & AARCH64_FL_JSCVT))
> > +    goto done;
> > +
> > +  if (arch == AARCH64_ARCH_V8A
> > +      || arch == AARCH64_ARCH_V8_1A
> > +      || arch == AARCH64_ARCH_V8_2A)
> > +    arch = AARCH64_ARCH_V8_3A;
> > +
> > +  if (!(flags & AARCH64_FL_RCPC2))
> > +    goto done;
> > +
> > +  if (arch == AARCH64_ARCH_V8_3A)
> > +    arch = AARCH64_ARCH_V8_4A;
> > +
> > +  if (!(flags & AARCH64_FL_FRINTTS) || !(flags & AARCH64_FL_FLAGM2))
> > +    goto done;
> > +
> > +  if (arch == AARCH64_ARCH_V8_4A)
> > +    arch = AARCH64_ARCH_V8_5A;
> > +
> > +  if (!(flags & AARCH64_FL_WFXT))
> > +    goto done;
> > +
> > +  if (arch == AARCH64_ARCH_V8_5A || arch == AARCH64_ARCH_V8_6A)
> > +    {
> > +      arch = AARCH64_ARCH_V8_7A;
> > +      /* We don't support native detection for FEAT_XS, so we'll assume it's
> > +	 present if the rest of these features are also present.  If we don't
> > +	 do this, then we would end up passing +noxs to the assembler.  */
> > +      flags |= AARCH64_FL_XS;
> > +    }
> > +done:
> > +
> > +  const struct arch_to_arch_name* a_to_an;
> > +  for (a_to_an = all_architectures;
> > +       a_to_an->arch != aarch64_no_arch;
> > +       a_to_an++)
> > +    {
> > +      if (a_to_an->arch == arch)
> > +	break;
> > +    }
> > +
> > +  std::string outstr = a_to_an->arch_name
> > +	+ aarch64_get_extension_string_for_isa_flags (flags, a_to_an->flags);
> > +
> > +  return outstr;
> > +}
> 
> I was hoping we could do this in a table-driven way.  Experimenting
> a bit locally (but only lightly tested), the following seems to work:
> 
> aarch64.h:
> 
> /* The set of all architecture flags.  */
> constexpr auto AARCH64_FL_ARCHES ATTRIBUTE_UNUSED = aarch64_feature_flags (0)
> #define AARCH64_ARCH(A, B, ARCH_IDENT, D, E) \
>   | feature_deps::ARCH_IDENT ().flag
> #include "config/aarch64/aarch64-arches.def"
> ;
> 
> aarch64-common.cc:
> 
> ...
>   const struct arch_to_arch_name *best = nullptr;
>   for (auto *a_to_an = all_architectures;
>        a_to_an->arch != aarch64_no_arch;
>        a_to_an++)
>     {
>       /* Require the architecture to have all architecture flags in FLAGS.  */
>       if ((~a_to_an->flags & flags & AARCH64_FL_ARCHES) != 0)
> 	continue;
> 
>       /* Skip architectures that add no new mandatory features.  */
>       if (best && (a_to_an->flags & ~best->flags & ~AARCH64_FL_ARCHES) == 0)
> 	continue;
> 
>       /* Require FLAGS to include all mandatory extensions.  */
>       if ((a_to_an->flags & ~flags & ~AARCH64_FL_ARCHES) != 0)
>         continue;
> 
>       best = a_to_an;
>     }

There are some hypothetical cases in which your suggested approach wouldn't be
able to avoid the new feature flags, whereas my more targeted approach here
would.  I'm struggling to think of a realistic example though, as I can only
think of one core in which any of these features has been backported to an
earlier architecture version.  The closest bad examples I can find are:

- Using -mcpu=a64fx+nosve: This would canonicalise to armv8.2-a+f16+fcma with
  either approach.  If jscvt didn't exist, however, then my approach would
  instead give armv8.3-a+nopauth+norcpc.
- Running on an unrecognised future cpu with an unfortunate combination of
  features could also be an example, but that's probably unrealistic.

I also feel that this is a workaround that we should apply in moderation, and
only when we know it would actually help with compatibility issues.

However, if you'd still prefer to use a bigger table-driven hammer, then I can
change the patch to do that.
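
For illustration, the stepwise rule in the patch can be modelled outside GCC
roughly as below.  This is only a sketch: the Arch enum, the FL_* bits, the
Choice struct and pick_arch_for_assembler are hypothetical stand-ins for
aarch64_arch, the AARCH64_FL_* flags and
aarch64_get_arch_string_for_assembler, and it only covers armv8-a to
armv8.7-a.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for aarch64_arch and the AARCH64_FL_* bits.  The values
// are hypothetical; only the ordering of Arch matters here.
enum Arch { V8A, V8_1A, V8_2A, V8_3A, V8_4A, V8_5A, V8_6A, V8_7A };

enum : std::uint32_t {
  FL_FCMA    = 1u << 0,
  FL_JSCVT   = 1u << 1,
  FL_RCPC2   = 1u << 2,
  FL_FRINTTS = 1u << 3,
  FL_FLAGM2  = 1u << 4,
  FL_WFXT    = 1u << 5,
  FL_XS      = 1u << 6,
};

struct Choice { Arch arch; std::uint32_t flags; };

// True if every bit in WANTED is set in FLAGS.
static bool has_all (std::uint32_t flags, std::uint32_t wanted)
{
  return (flags & wanted) == wanted;
}

// Raise ARCH one step at a time, stopping at the first step whose new
// flags are not all present in FLAGS.
Choice pick_arch_for_assembler (Arch arch, std::uint32_t flags)
{
  if (!has_all (flags, FL_FCMA | FL_JSCVT))
    return {arch, flags};
  if (arch < V8_3A)
    arch = V8_3A;

  if (!has_all (flags, FL_RCPC2))
    return {arch, flags};
  if (arch == V8_3A)
    arch = V8_4A;

  if (!has_all (flags, FL_FRINTTS | FL_FLAGM2))
    return {arch, flags};
  if (arch == V8_4A)
    arch = V8_5A;

  if (!has_all (flags, FL_WFXT))
    return {arch, flags};
  if (arch == V8_5A || arch == V8_6A)
    {
      arch = V8_7A;
      // Assume FEAT_XS is present, so that +noxs is never needed.
      flags |= FL_XS;
    }
  return {arch, flags};
}
```

In this model, V8A with FL_FCMA | FL_JSCVT is raised to V8_3A, whereas V8A
with FL_FCMA alone is left unchanged, so +fcma would still have to be passed
to the assembler.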
 
> > +
> > +/* Called by the driver to rewrite a name passed to the -march
> > +   argument in preparation to be passed to the assembler.  The
> > > +   names passed from the command line will be in ARGV, we want
> > +   to use the right-most argument, which should be in
> > +   ARGV[ARGC - 1].  ARGC should always be greater than 0.  */
> > +
> > +const char *
> > +aarch64_rewrite_march (int argc, const char **argv)
> > +{
> > +  gcc_assert (argc);
> > +  const char *name = argv[argc - 1];
> > +  std::string original_string (name);
> > +  std::string extension_str;
> > +  std::string base_name;
> > +  size_t extension_pos = original_string.find_first_of ('+');
> > +
> > +  /* Strip and save the extension string.  */
> > +  if (extension_pos != std::string::npos)
> > +    {
> > +      base_name = original_string.substr (0, extension_pos);
> > +      extension_str = original_string.substr (extension_pos,
> > +					      std::string::npos);
> > +    }
> > +  else
> > +    {
> > +      /* No extensions.  */
> > +      base_name = original_string;
> > +    }
> > +
> > +  const struct arch_to_arch_name* a_to_an;
> > +  for (a_to_an = all_architectures;
> > +       a_to_an->arch != aarch64_no_arch;
> > +       a_to_an++)
> > +    {
> > +      if (a_to_an->arch_name == base_name)
> > +	break;
> > +    }
> > +
> > +  /* We couldn't find that architecture name.  */
> > +  if (a_to_an->arch == aarch64_no_arch)
> > +    fatal_error (input_location, "unknown value %qs for %<-march%>", name);
> > +
> > +  aarch64_feature_flags flags = a_to_an->flags;
> > +  aarch64_parse_extension (extension_str.c_str (), &flags, NULL);
> > +
> > +  std::string outstr = aarch64_get_arch_string_for_assembler (a_to_an->arch,
> > +							      flags);
> > +
> > +  /* We are going to memory leak here, nobody elsewhere
> > +     in the callchain is going to clean up after us.  The alternative is
> > +     to allocate a static buffer, and assert that it is big enough for our
> > +     modified string, which seems much worse!  */
> > +  return xstrdup (outstr.c_str ());
> > +}
> 
> This is going to seem like feature creep, sorry, but: rather than
> duplicate the architecture parsing, could we instead move the march
> and mcpu processing from aarch64.cc to here?  Specifically:
> 
> - aarch64_parse_arch
> - aarch64_parse_cpu
> - aarch64_validate_mcpu
> - aarch64_validate_march
> - aarch64_print_hint_for_*
> 
> This would mean making "struct processor" public, and so giving it
> an aarch64_ name (or putting it in a namespace).  We'd also need to
> remove the tuning information and use a separate table for that.
> Still, I think it would be more robust than having two pieces of code
> doing the same parsing.  It should also give a better UI, since the
> driver parsing would give the same hints as the compiler proper.
> 
> (Only tested to the point of moving the code and linking the driver.)

This seems like a reasonable improvement, though I do see why you'd call it
feature creep.
 
> Thanks,
> Richard
> 
> >  /* Attempt to rewrite NAME, which has been passed on the command line
> >     as a -mcpu option to an equivalent -march value.  If we can do so,
> >     return the new string, otherwise return an error.  */
> > @@ -414,7 +527,7 @@ aarch64_rewrite_selected_cpu (const char *name)
> >  	break;
> >      }
> >  
> > -  /* We couldn't find that proceesor name, or the processor name we
> > +  /* We couldn't find that processor name, or the processor name we
> >       found does not map to an architecture we understand.  */
> >    if (p_to_a->arch == aarch64_no_arch
> >        || a_to_an->arch == aarch64_no_arch)
> > @@ -423,9 +536,8 @@ aarch64_rewrite_selected_cpu (const char *name)
> >    aarch64_feature_flags extensions = p_to_a->flags;
> >    aarch64_parse_extension (extension_str.c_str (), &extensions, NULL);
> >  
> > -  std::string outstr = a_to_an->arch_name
> > -	+ aarch64_get_extension_string_for_isa_flags (extensions,
> > -						      a_to_an->flags);
> > +  std::string outstr = aarch64_get_arch_string_for_assembler (a_to_an->arch,
> > +							      extensions);
> >  
> >    /* We are going to memory leak here, nobody elsewhere
> >       in the callchain is going to clean up after us.  The alternative is
> > diff --git a/gcc/config/aarch64/aarch64-elf.h b/gcc/config/aarch64/aarch64-elf.h
> > index b6fb7936789fed7fc07d61c6e10301f0c451ac5c..1e210ce3b8beacb0ded7b482694df10368b9a50b 100644
> > --- a/gcc/config/aarch64/aarch64-elf.h
> > +++ b/gcc/config/aarch64/aarch64-elf.h
> > @@ -136,7 +136,6 @@
> >  #define ASM_SPEC "\
> >  %{mbig-endian:-EB} \
> >  %{mlittle-endian:-EL} \
> > -%{march=*:-march=%*} \
> >  %(asm_cpu_spec)" \
> >  ASM_MABI_SPEC
> >  #endif
> > diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> > index 6ab41a21c75dbdb6bba7875408bc1aa6959c9033..6cf09b41e88cb4d029e4d38f722a4247b9f84328 100644
> > --- a/gcc/config/aarch64/aarch64-protos.h
> > +++ b/gcc/config/aarch64/aarch64-protos.h
> > @@ -1165,6 +1165,8 @@ enum aarch_parse_opt_result aarch64_parse_extension (const char *,
> >  void aarch64_get_all_extension_candidates (auto_vec<const char *> *candidates);
> >  std::string aarch64_get_extension_string_for_isa_flags (aarch64_feature_flags,
> >  							aarch64_feature_flags);
> > +std::string aarch64_get_arch_string_for_assembler (aarch64_arch,
> > +						   aarch64_feature_flags);
> >  
> >  rtl_opt_pass *make_pass_aarch64_early_ra (gcc::context *);
> >  rtl_opt_pass *make_pass_fma_steering (gcc::context *);
> > diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> > index e50bd43f3908916903fe724ec39ae137bc68dfad..9287329a76392034725080ae79060dcd16cfd753 100644
> > --- a/gcc/config/aarch64/aarch64.h
> > +++ b/gcc/config/aarch64/aarch64.h
> > @@ -1417,7 +1417,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
> >  #define HAVE_LOCAL_CPU_DETECT
> >  # define EXTRA_SPEC_FUNCTIONS                                           \
> >    { "local_cpu_detect", host_detect_local_cpu },                        \
> > -  MCPU_TO_MARCH_SPEC_FUNCTIONS
> > +  MARCH_REWRITE_SPEC_FUNCTIONS
> >  
> >  /* Rewrite -m{arch,cpu,tune}=native based on the host system information.
> >     When rewriting -march=native convert it into an -mcpu option if no other
> > @@ -1434,7 +1434,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
> >   { "tune", "%{!mcpu=*:%{!mtune=*:%{!march=native:-mtune=%(VALUE)}}}" },
> >  #else
> >  # define MCPU_MTUNE_NATIVE_SPECS ""
> > -# define EXTRA_SPEC_FUNCTIONS MCPU_TO_MARCH_SPEC_FUNCTIONS
> > +# define EXTRA_SPEC_FUNCTIONS MARCH_REWRITE_SPEC_FUNCTIONS
> >  # define CONFIG_TUNE_SPEC                                                \
> >    {"tune", "%{!mcpu=*:%{!mtune=*:-mtune=%(VALUE)}}"},
> >  #endif
> > @@ -1449,15 +1449,18 @@ extern const char *host_detect_local_cpu (int argc, const char **argv);
> >    {"cpu",  "%{!march=*:%{!mcpu=*:-mcpu=%(VALUE)}}" },   \
> >    CONFIG_TUNE_SPEC
> >  
> > -#define MCPU_TO_MARCH_SPEC \
> > -   " %{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}"
> > +#define MARCH_REWRITE_SPEC \
> > +   " %{mcpu=*:-march=%:rewrite_mcpu(%{mcpu=*:%*})}" \
> > +   " %{march=*:-march=%:rewrite_march(%{march=*:%*})}"
> >  
> >  extern const char *aarch64_rewrite_mcpu (int argc, const char **argv);
> > -#define MCPU_TO_MARCH_SPEC_FUNCTIONS \
> > -  { "rewrite_mcpu", aarch64_rewrite_mcpu },
> > +extern const char *aarch64_rewrite_march (int argc, const char **argv);
> > +#define MARCH_REWRITE_SPEC_FUNCTIONS \
> > +  { "rewrite_mcpu", aarch64_rewrite_mcpu }, \
> > +  { "rewrite_march", aarch64_rewrite_march },
> >  
> >  #define ASM_CPU_SPEC \
> > -   MCPU_TO_MARCH_SPEC
> > +   MARCH_REWRITE_SPEC
> >  
> >  #define EXTRA_SPECS						\
> >    { "asm_cpu_spec",		ASM_CPU_SPEC }
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index 81885d442ed3e7008aacc937c1a9305b7824bc7c..f32b90c188228fe980e247b4448ee3c1f3e7ddfc 100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -24846,16 +24846,12 @@ aarch64_declare_function_name (FILE *stream, const char* name,
> >      targ_options = TREE_TARGET_OPTION (target_option_current_node);
> >    gcc_assert (targ_options);
> >  
> > -  const struct processor *this_arch
> > -    = aarch64_get_arch (targ_options->x_selected_arch);
> > -
> >    auto isa_flags = aarch64_get_asm_isa_flags (targ_options);
> > -  std::string extension
> > -    = aarch64_get_extension_string_for_isa_flags (isa_flags,
> > -						  this_arch->flags);
> > +  aarch64_arch arch = targ_options->x_selected_arch;
> > +  std::string to_print
> > +    = aarch64_get_arch_string_for_assembler (arch, isa_flags);
> >    /* Only update the assembler .arch string if it is distinct from the last
> >       such string we printed.  */
> > -  std::string to_print = this_arch->name + extension;
> >    if (to_print != aarch64_last_printed_arch_string)
> >      {
> >        asm_fprintf (asm_out_file, "\t.arch %s\n", to_print.c_str ());
> > @@ -24977,19 +24973,16 @@ aarch64_start_file (void)
> >    struct cl_target_option *default_options
> >      = TREE_TARGET_OPTION (target_option_default_node);
> >  
> > -  const struct processor *default_arch
> > -    = aarch64_get_arch (default_options->x_selected_arch);
> > +  aarch64_arch default_arch = default_options->x_selected_arch;
> >    auto default_isa_flags = aarch64_get_asm_isa_flags (default_options);
> > -  std::string extension
> > -    = aarch64_get_extension_string_for_isa_flags (default_isa_flags,
> > -						  default_arch->flags);
> > -
> > -   aarch64_last_printed_arch_string = default_arch->name + extension;
> > -   aarch64_last_printed_tune_string = "";
> > -   asm_fprintf (asm_out_file, "\t.arch %s\n",
> > -		aarch64_last_printed_arch_string.c_str ());
> > -
> > -   default_file_start ();
> > +  std::string arch_string
> > +    = aarch64_get_arch_string_for_assembler (default_arch, default_isa_flags);
> > +  aarch64_last_printed_arch_string = arch_string;
> > +  aarch64_last_printed_tune_string = "";
> > +  asm_fprintf (asm_out_file, "\t.arch %s\n",
> > +	       arch_string.c_str ());
> > +
> > +  default_file_start ();
> >  }
> >  
> >  /* Emit load exclusive.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_27 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_27
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..1ca9354579f0b7fdd77e31857d744476529cd301
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_27
> > @@ -0,0 +1,8 @@
> > +processor	: 0
> > +BogoMIPS	: 100.00
> > +Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti paca pacg
> > +CPU implementer	: 0x41
> > +CPU architecture: 8
> > +CPU variant	: 0x0
> > +CPU part	: 0xd08
> > +CPU revision	: 2
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_28 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_28
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..0c216abbb9e4d5c0273eaeb1824dc16e66b09c6c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_28
> > @@ -0,0 +1,8 @@
> > +processor	: 0
> > +BogoMIPS	: 100.00
> > +Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti paca pacg
> > +CPU implementer	: 0x41
> > +CPU architecture: 8
> > +CPU variant	: 0x0
> > +CPU part	: 0xd08
> > +CPU revision	: 2
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/info_29 b/gcc/testsuite/gcc.target/aarch64/cpunative/info_29
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..308c06710902507fcf274aa61e2244937d4e227b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/info_29
> > @@ -0,0 +1,8 @@
> > +processor	: 0
> > +BogoMIPS	: 100.00
> > +Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh bti paca pacg wfxt
> > +CPU implementer	: 0x41
> > +CPU architecture: 8
> > +CPU variant	: 0x0
> > +CPU part	: 0xd08
> > +CPU revision	: 2
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
> > index 904cdf452263961442f3ecc31cd1b6563130f9c7..e56b9164024c7535d6b10f451b7bc0796e7bd161 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_21.c
> > @@ -7,7 +7,7 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc2\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8\.5-a\+crc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+nopredres\+nopauth\n} } } */
> >  
> >  /* Check that an Armv8-A core doesn't fall apart on extensions without midr
> >     values.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
> > index feb959b11b0e383a5e1f3214d55f80f56d2605d4..db3df27a22ea9275ca303e911061f2c35d3ba722 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_22.c
> > @@ -7,7 +7,7 @@ int main()
> >    return 0;
> >  }
> >  
> > -/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+jscvt\+rcpc2\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
> > +/* { dg-final { scan-assembler {\.arch armv8\.5-a\+crc\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+nopredres\n} } } */
> >  
> >  /* Check that an Armv8-A core doesn't fall apart on extensions without midr
> >     values and that it enables optional features.  */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_27.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_27.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..43df6a50706df8855d2e960e508778542d81e643
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_27.c
> > @@ -0,0 +1,10 @@
> > +/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
> > +/* { dg-set-compiler-env-var GCC_CPUINFO "$srcdir/gcc.target/aarch64/cpunative/info_27" } */
> > +/* { dg-additional-options "-mcpu=native" } */
> > +
> > +int main()
> > +{
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-assembler {\.arch armv8-a\+flagm2\+lse\+dotprod\+rdma\+crc\+fp16fml\+rcpc2\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\+pauth\n} } } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_28.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_28.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..0e0e56f539433ea02c5c71c8c0bae5ddb256e962
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_28.c
> > @@ -0,0 +1,10 @@
> > +/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
> > +/* { dg-set-compiler-env-var GCC_CPUINFO "$srcdir/gcc.target/aarch64/cpunative/info_28" } */
> > +/* { dg-additional-options "-mcpu=native" } */
> > +
> > +int main()
> > +{
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-assembler {\.arch armv8\.3-a\+flagm2\+dotprod\+crc\+fp16fml\+frintts\+i8mm\+bf16\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+sb\+ssbs\n} } } */
> > diff --git a/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_29.c b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_29.c
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..9b07161b77d75cfec19aea01fcf2eb5ece91853a
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/cpunative/native_cpu_29.c
> > @@ -0,0 +1,10 @@
> > +/* { dg-do compile { target { { aarch64*-*-linux*} && native } } } */
> > +/* { dg-set-compiler-env-var GCC_CPUINFO "$srcdir/gcc.target/aarch64/cpunative/info_29" } */
> > +/* { dg-additional-options "-mcpu=native" } */
> > +
> > +int main()
> > +{
> > +  return 0;
> > +}
> > +
> > +/* { dg-final { scan-assembler {\.arch armv8\.7-a\+crc\+sve2-aes\+sve2-bitperm\+sve2-sha3\+sve2-sm4\+nopredres\n} } } */

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler
  2025-01-07 12:11     ` Andrew Carlotti
@ 2025-01-07 17:50       ` Richard Sandiford
  2025-01-09 18:00         ` Richard Sandiford
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Sandiford @ 2025-01-07 17:50 UTC (permalink / raw)
  To: Andrew Carlotti; +Cc: gcc-patches, Andrew Pinski, Andre Vieira

Andrew Carlotti <andrew.carlotti@arm.com> writes:
> On Mon, Nov 25, 2024 at 11:26:39PM +0000, Richard Sandiford wrote:
>> Sorry for the slow review.
>> 
>> Andrew Carlotti <andrew.carlotti@arm.com> writes:
>> > These new flags (+fcma, +jscvt, +rcpc2, +jscvt, +frintts, +wfxt and +xs)
>> > were only recently added to the assembler.  To improve compatibility
>> > with older assemblers, we try to avoid passing these new flags to the
>> > assembler if we can express the targeted architecture without them. We
>> > do so by using an almost-equivalent architecture string with a higher
>> > architecture version.
>> >
>> > This should never reduce the set of instructions accepted by the
>> > assembler.  It will make it more lenient in two cases:
>> >
>> > 1. Many system registers are currently gated behind architecture
>> > versions instead of specific feature flags.  Increasing the base
>> > architecture version may cause more system register accesses to be
>> > accepted.
>> >
>> > 2. FEAT_XS doesn't have an HWCAP bit or cpuinfo entry.  We still want to
>> > avoid passing +wfxt or +noxs to the assembler if possible, so we'll
>> > instruct the assembler to accept FEAT_XS instructions as well whenever
>> > the rest of the new features are enabled.
>> >
>> > gcc/ChangeLog:
>> >
>> > 	* common/config/aarch64/aarch64-common.cc
>> > 	(aarch64_get_arch_string_for_assembler): New.
>> > 	(aarch64_rewrite_march): New.
>> > 	(aarch64_rewrite_selected_cpu): Call new function.
>> > 	* config/aarch64/aarch64-elf.h (ASM_SPEC): Remove identity mapping.
>> > 	* config/aarch64/aarch64-protos.h
>> > 	(aarch64_get_arch_string_for_assembler): New.
>> > 	* config/aarch64/aarch64.cc
>> > 	(aarch64_declare_function_name): Call new function.
>> > 	(aarch64_start_file): Ditto.
>> > 	* config/aarch64/aarch64.h
>> > 	* config/aarch64/aarch64.h
>> > 	(EXTRA_SPEC_FUNCTIONS): Use new macro name.
>> > 	(MCPU_TO_MARCH_SPEC): Rename to...
>> > 	(MARCH_REWRITE_SPEC): ...this, and add new spec rule.
>> > 	(aarch64_rewrite_march): New declaration.
>> > 	(MCPU_TO_MARCH_SPEC_FUNCTIONS): Rename to...
>> > 	(MARCH_REWRITE_SPEC_FUNCTIONS): ...this, and add new function.
>> > 	(ASM_CPU_SPEC): Use new macro name.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> > 	* gcc.target/aarch64/cpunative/native_cpu_21.c: Update check.
>> > 	* gcc.target/aarch64/cpunative/native_cpu_22.c: Update check.
>> > 	* gcc.target/aarch64/cpunative/info_27: New test.
>> > 	* gcc.target/aarch64/cpunative/info_28: New test.
>> > 	* gcc.target/aarch64/cpunative/info_29: New test.
>> > 	* gcc.target/aarch64/cpunative/native_cpu_27.c: New test.
>> > 	* gcc.target/aarch64/cpunative/native_cpu_28.c: New test.
>> > 	* gcc.target/aarch64/cpunative/native_cpu_29.c: New test.
>> >
>> >
>> > diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc
>> > index 2bfc597e333b6018970a9ee6e370a66b6d0960ef..717b3238be16f39a6fd1b4143662eb540ccf292d 100644
>> > --- a/gcc/common/config/aarch64/aarch64-common.cc
>> > +++ b/gcc/common/config/aarch64/aarch64-common.cc
>> > @@ -371,6 +371,119 @@ aarch64_get_extension_string_for_isa_flags
>> >    return outstr;
>> >  }
>> >  
>> > +/* Generate an arch string to be passed to the assembler.
>> > +
>> > +   Several flags were added retrospectively for features that were previously
>> > +   enabled only by specifying an architecture version.  We want to avoid
>> > +   passing these flags to the assembler if possible, to improve compatibility
>> > +   with older assemblers.  */
>> > +
>> > +std::string
>> > +aarch64_get_arch_string_for_assembler (aarch64_arch arch,
>> > +				       aarch64_feature_flags flags)
>> > +{
>> > +  if (!(flags & AARCH64_FL_FCMA) || !(flags & AARCH64_FL_JSCVT))
>> > +    goto done;
>> > +
>> > +  if (arch == AARCH64_ARCH_V8A
>> > +      || arch == AARCH64_ARCH_V8_1A
>> > +      || arch == AARCH64_ARCH_V8_2A)
>> > +    arch = AARCH64_ARCH_V8_3A;
>> > +
>> > +  if (!(flags & AARCH64_FL_RCPC2))
>> > +    goto done;
>> > +
>> > +  if (arch == AARCH64_ARCH_V8_3A)
>> > +    arch = AARCH64_ARCH_V8_4A;
>> > +
>> > +  if (!(flags & AARCH64_FL_FRINTTS) || !(flags & AARCH64_FL_FLAGM2))
>> > +    goto done;
>> > +
>> > +  if (arch == AARCH64_ARCH_V8_4A)
>> > +    arch = AARCH64_ARCH_V8_5A;
>> > +
>> > +  if (!(flags & AARCH64_FL_WFXT))
>> > +    goto done;
>> > +
>> > +  if (arch == AARCH64_ARCH_V8_5A || arch == AARCH64_ARCH_V8_6A)
>> > +    {
>> > +      arch = AARCH64_ARCH_V8_7A;
>> > +      /* We don't support native detection for FEAT_XS, so we'll assume it's
>> > +	 present if the rest of these features are also present.  If we don't
>> > +	 do this, then we would end up passing +noxs to the assembler.  */
>> > +      flags |= AARCH64_FL_XS;
>> > +    }
>> > +done:
>> > +
>> > +  const struct arch_to_arch_name* a_to_an;
>> > +  for (a_to_an = all_architectures;
>> > +       a_to_an->arch != aarch64_no_arch;
>> > +       a_to_an++)
>> > +    {
>> > +      if (a_to_an->arch == arch)
>> > +	break;
>> > +    }
>> > +
>> > +  std::string outstr = a_to_an->arch_name
>> > +	+ aarch64_get_extension_string_for_isa_flags (flags, a_to_an->flags);
>> > +
>> > +  return outstr;
>> > +}
>> 
>> I was hoping we could do this in a table-driven way.  Experimenting
>> a bit locally (but only lightly tested), the following seems to work:
>> 
>> aarch64.h:
>> 
>> /* The set of all architecture flags.  */
>> constexpr auto AARCH64_FL_ARCHES ATTRIBUTE_UNUSED = aarch64_feature_flags (0)
>> #define AARCH64_ARCH(A, B, ARCH_IDENT, D, E) \
>>   | feature_deps::ARCH_IDENT ().flag
>> #include "config/aarch64/aarch64-arches.def"
>> ;
>> 
>> aarch64-common.cc:
>> 
>> ...
>>   const struct arch_to_arch_name *best = nullptr;
>>   for (auto *a_to_an = all_architectures;
>>        a_to_an->arch != aarch64_no_arch;
>>        a_to_an++)
>>     {
>>       /* Require the architecture to have all architecture flags in FLAGS.  */
>>       if ((~a_to_an->flags & flags & AARCH64_FL_ARCHES) != 0)
>> 	continue;
>> 
>>       /* Skip architectures that add no new mandatory features.  */
>>       if (best && (a_to_an->flags & ~best->flags & ~AARCH64_FL_ARCHES) == 0)
>> 	continue;
>> 
>>       /* Require FLAGS to include all mandatory extensions.  */
>>       if ((a_to_an->flags & ~flags & ~AARCH64_FL_ARCHES) != 0)
>>         continue;
>> 
>>       best = a_to_an;
>>     }
>
> There are some hypothetical cases in which your suggested approach wouldn't be
> able to avoid the new feature flag, whereas my more targeted approach here
> would.  I'm struggling to think of a realistic example though, as I can only
> think of one core in which any of these features have been backported to an
> earlier architecture version.  The closest bad examples I can find are:
>
> - Using -mcpu=a64fx+nosve: This would canonicalise to armv8.2-a+f16+fcma with
>   either approach.  If jscvt didn't exist, however, then my approach would
>   instead give armv8.3-a+nopauth+norcpc.
> - Running on an unrecognised future cpu with an unfortunate combination of
>   features could also be an example, but that's probably unrealistic.

Thanks for considering these cases.  But like you say, the first one
is counterfactual, in that we do have jscvt.  And personally, I think
it's less surprising to add +fcma in that case anyway, rather than have
+nosve bump the architecture version from armv8.2-a to armv8.3-a.
It seems odd for +no to increase the architecture level.

So IMO, the fact that we use the "correct" base architecture is a plus
rather than a minus.  This is more about choosing between multiple ways
of telling a direct truth, rather than trying to be indirect.

> I also feel that this is a work-around that we should be applying in moderation
> only when we know it would actually help with compatibility issues.

I can see that.  But I think it's easier to reason about, and less
surprising, if the behaviour stems from easily-understood rules,
rather than being a list of special cases.

Also, I think replacing -march=armv8-a+all+the+features+for+v8.8-a
with -march=armv8.8-a is an independent good.  So...

> However, if you'd still prefer to use a bigger table-driven hammer, then I can
> change the patch to do that.

...yeah, I would still prefer that, sorry. :)
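
For anyone following the thread, the three checks in the table-driven loop
quoted above can be tried out in isolation.  What follows is only a minimal
sketch under stated assumptions, not the GCC code: the flag bits, the
arch_entry struct, the four-entry table and the select_arch name are all
hypothetical stand-ins for aarch64_feature_flags, arch_to_arch_name,
all_architectures and the eventual driver function.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified flag bits; GCC's real aarch64_feature_flags differ.
constexpr uint64_t A_V8   = 1ull << 0;   // architecture-level bits
constexpr uint64_t A_V8_1 = 1ull << 1;
constexpr uint64_t A_V8_2 = 1ull << 2;
constexpr uint64_t A_V8_3 = 1ull << 3;
constexpr uint64_t ARCHES = A_V8 | A_V8_1 | A_V8_2 | A_V8_3;

constexpr uint64_t F_SIMD  = 1ull << 8;  // feature bits
constexpr uint64_t F_LSE   = 1ull << 9;
constexpr uint64_t F_RDMA  = 1ull << 10;
constexpr uint64_t F_CRC   = 1ull << 11;
constexpr uint64_t F_PAUTH = 1ull << 12;
constexpr uint64_t F_RCPC  = 1ull << 13;
constexpr uint64_t F_FCMA  = 1ull << 14;
constexpr uint64_t F_JSCVT = 1ull << 15;

// Each level carries its own architecture bit, the bits of the levels it
// builds on, and its mandatory features (an assumed, cut-down table).
constexpr uint64_t V8_FLAGS   = A_V8 | F_SIMD;
constexpr uint64_t V8_1_FLAGS = V8_FLAGS | A_V8_1 | F_LSE | F_RDMA | F_CRC;
constexpr uint64_t V8_2_FLAGS = V8_1_FLAGS | A_V8_2;
constexpr uint64_t V8_3_FLAGS
  = V8_2_FLAGS | A_V8_3 | F_PAUTH | F_RCPC | F_FCMA | F_JSCVT;

struct arch_entry { const char *name; uint64_t flags; };

const arch_entry all_arches[] = {
  { "armv8-a",   V8_FLAGS },
  { "armv8.1-a", V8_1_FLAGS },
  { "armv8.2-a", V8_2_FLAGS },
  { "armv8.3-a", V8_3_FLAGS },
};

// Return the name of the last table entry that passes all three checks,
// or an empty string if none does.
std::string
select_arch (uint64_t flags)
{
  const arch_entry *best = nullptr;
  for (const arch_entry &a : all_arches)
    {
      /* Require the architecture to have all architecture flags in FLAGS.  */
      if ((~a.flags & flags & ARCHES) != 0)
	continue;

      /* Skip architectures that add no new mandatory features.  */
      if (best && (a.flags & ~best->flags & ~ARCHES) == 0)
	continue;

      /* Require FLAGS to include all mandatory extensions.  */
      if ((a.flags & ~flags & ~ARCHES) != 0)
	continue;

      best = &a;
    }
  return best ? best->name : "";
}
```

With this assumed table, a flag set based on armv8-a that has every
armv8.3-a feature except jscvt selects armv8.1-a (armv8.2-a is skipped
because it adds no mandatory features here), and adding jscvt selects
armv8.3-a.  Any features left over would still have to be emitted as
explicit +ext modifiers on top of the selected base.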

>> > +
>> > +/* Called by the driver to rewrite a name passed to the -march
>> > +   argument in preparation to be passed to the assembler.  The
>> > +   names passed from the command line will be in ARGV, we want
>> > +   to use the right-most argument, which should be in
>> > +   ARGV[ARGC - 1].  ARGC should always be greater than 0.  */
>> > +
>> > +const char *
>> > +aarch64_rewrite_march (int argc, const char **argv)
>> > +{
>> > +  gcc_assert (argc);
>> > +  const char *name = argv[argc - 1];
>> > +  std::string original_string (name);
>> > +  std::string extension_str;
>> > +  std::string base_name;
>> > +  size_t extension_pos = original_string.find_first_of ('+');
>> > +
>> > +  /* Strip and save the extension string.  */
>> > +  if (extension_pos != std::string::npos)
>> > +    {
>> > +      base_name = original_string.substr (0, extension_pos);
>> > +      extension_str = original_string.substr (extension_pos,
>> > +					      std::string::npos);
>> > +    }
>> > +  else
>> > +    {
>> > +      /* No extensions.  */
>> > +      base_name = original_string;
>> > +    }
>> > +
>> > +  const struct arch_to_arch_name* a_to_an;
>> > +  for (a_to_an = all_architectures;
>> > +       a_to_an->arch != aarch64_no_arch;
>> > +       a_to_an++)
>> > +    {
>> > +      if (a_to_an->arch_name == base_name)
>> > +	break;
>> > +    }
>> > +
>> > +  /* We couldn't find that architecture name.  */
>> > +  if (a_to_an->arch == aarch64_no_arch)
>> > +    fatal_error (input_location, "unknown value %qs for %<-march%>", name);
>> > +
>> > +  aarch64_feature_flags flags = a_to_an->flags;
>> > +  aarch64_parse_extension (extension_str.c_str (), &flags, NULL);
>> > +
>> > +  std::string outstr = aarch64_get_arch_string_for_assembler (a_to_an->arch,
>> > +							      flags);
>> > +
>> > +  /* We are going to memory leak here, nobody elsewhere
>> > +     in the callchain is going to clean up after us.  The alternative is
>> > +     to allocate a static buffer, and assert that it is big enough for our
>> > +     modified string, which seems much worse!  */
>> > +  return xstrdup (outstr.c_str ());
>> > +}
>> 
>> This is going to seem like feature creep, sorry, but: rather than
>> duplicate the architecture parsing, could we instead move the march
>> and mcpu processing from aarch64.cc to here?  Specifically:
>> 
>> - aarch64_parse_arch
>> - aarch64_parse_cpu
>> - aarch64_validate_mcpu
>> - aarch64_validate_march
>> - aarch64_print_hint_for_*
>> 
>> This would mean making "struct processor" public, and so giving it
>> an aarch64_ name (or putting it in a namespace).  We'd also need to
>> remove the tuning information and use a separate table for that.
>> Still, I think it would be more robust than having two pieces of code
>> doing the same parsing.  It should also give a better UI, since the
>> driver parsing would give the same hints as the compiler proper.
>> 
>> (Only tested to the point of moving the code and linking the driver.)
>
> This seems like a reasonable improvement, though I do see the "seems like
> feature creep" remark.

I'm happy to do this refactor if that's easier, since I know you have
other urgent things to take care of.

Richard

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler
  2025-01-07 17:50       ` Richard Sandiford
@ 2025-01-09 18:00         ` Richard Sandiford
  2025-01-10 14:38           ` Andrew Carlotti
  0 siblings, 1 reply; 25+ messages in thread
From: Richard Sandiford @ 2025-01-09 18:00 UTC (permalink / raw)
  To: Andrew Carlotti
  Cc: gcc-patches, Andrew Pinski, Andre Vieira, richard.earnshaw, ktkachov

Richard Sandiford <richard.sandiford@arm.com> writes:
> Andrew Carlotti <andrew.carlotti@arm.com> writes:
>> On Mon, Nov 25, 2024 at 11:26:39PM +0000, Richard Sandiford wrote:
>>> Sorry for the slow review.
>>> 
>>> Andrew Carlotti <andrew.carlotti@arm.com> writes:
>>> > These new flags (+fcma, +jscvt, +rcpc2, +jscvt, +frintts, +wfxt and +xs)
>>> > were only recently added to the assembler.  To improve compatibility
>>> > with older assemblers, we try to avoid passing these new flags to the
>>> > assembler if we can express the targeted architecture without them. We
>>> > do so by using an almost-equivalent architecture string with a higher
>>> > architecture version.
>>> >
>>> > This should never reduce the set of instructions accepted by the
>>> > assembler.  It will make it more lenient in two cases:
>>> >
>>> > 1. Many system registers are currently gated behind architecture
>>> > versions instead of specific feature flags.  Increasing the base
>>> > architecture version may cause more system register accesses to be
>>> > accepted.
>>> >
>>> > 2. FEAT_XS doesn't have an HWCAP bit or cpuinfo entry.  We still want to
>>> > avoid passing +wfxt or +noxs to the assembler if possible, so we'll
>>> > instruct the assembler to accept FEAT_XS instructions as well whenever
>>> > the rest of the new features are enabled.
>>> >
>>> > gcc/ChangeLog:
>>> >
>>> > 	* common/config/aarch64/aarch64-common.cc
>>> > 	(aarch64_get_arch_string_for_assembler): New.
>>> > 	(aarch64_rewrite_march): New.
>>> > 	(aarch64_rewrite_selected_cpu): Call new function.
>>> > 	* config/aarch64/aarch64-elf.h (ASM_SPEC): Remove identity mapping.
>>> > 	* config/aarch64/aarch64-protos.h
>>> > 	(aarch64_get_arch_string_for_assembler): New.
>>> > 	* config/aarch64/aarch64.cc
>>> > 	(aarch64_declare_function_name): Call new function.
>>> > 	(aarch64_start_file): Ditto.
>>> > 	* config/aarch64/aarch64.h
>>> > 	* config/aarch64/aarch64.h
>>> > 	(EXTRA_SPEC_FUNCTIONS): Use new macro name.
>>> > 	(MCPU_TO_MARCH_SPEC): Rename to...
>>> > 	(MARCH_REWRITE_SPEC): ...this, and add new spec rule.
>>> > 	(aarch64_rewrite_march): New declaration.
>>> > 	(MCPU_TO_MARCH_SPEC_FUNCTIONS): Rename to...
>>> > 	(MARCH_REWRITE_SPEC_FUNCTIONS): ...this, and add new function.
>>> > 	(ASM_CPU_SPEC): Use new macro name.
>>> >
>>> > gcc/testsuite/ChangeLog:
>>> >
>>> > 	* gcc.target/aarch64/cpunative/native_cpu_21.c: Update check.
>>> > 	* gcc.target/aarch64/cpunative/native_cpu_22.c: Update check.
>>> > 	* gcc.target/aarch64/cpunative/info_27: New test.
>>> > 	* gcc.target/aarch64/cpunative/info_28: New test.
>>> > 	* gcc.target/aarch64/cpunative/info_29: New test.
>>> > 	* gcc.target/aarch64/cpunative/native_cpu_27.c: New test.
>>> > 	* gcc.target/aarch64/cpunative/native_cpu_28.c: New test.
>>> > 	* gcc.target/aarch64/cpunative/native_cpu_29.c: New test.
>>> >
>>> >
>>> > diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc
>>> > index 2bfc597e333b6018970a9ee6e370a66b6d0960ef..717b3238be16f39a6fd1b4143662eb540ccf292d 100644
>>> > --- a/gcc/common/config/aarch64/aarch64-common.cc
>>> > +++ b/gcc/common/config/aarch64/aarch64-common.cc
>>> > @@ -371,6 +371,119 @@ aarch64_get_extension_string_for_isa_flags
>>> >    return outstr;
>>> >  }
>>> >  
>>> > +/* Generate an arch string to be passed to the assembler.
>>> > +
>>> > +   Several flags were added retrospectively for features that were previously
>>> > +   enabled only by specifying an architecture version.  We want to avoid
>>> > +   passing these flags to the assembler if possible, to improve compatibility
>>> > +   with older assemblers.  */
>>> > +
>>> > +std::string
>>> > +aarch64_get_arch_string_for_assembler (aarch64_arch arch,
>>> > +				       aarch64_feature_flags flags)
>>> > +{
>>> > +  if (!(flags & AARCH64_FL_FCMA) || !(flags & AARCH64_FL_JSCVT))
>>> > +    goto done;
>>> > +
>>> > +  if (arch == AARCH64_ARCH_V8A
>>> > +      || arch == AARCH64_ARCH_V8_1A
>>> > +      || arch == AARCH64_ARCH_V8_2A)
>>> > +    arch = AARCH64_ARCH_V8_3A;
>>> > +
>>> > +  if (!(flags & AARCH64_FL_RCPC2))
>>> > +    goto done;
>>> > +
>>> > +  if (arch == AARCH64_ARCH_V8_3A)
>>> > +    arch = AARCH64_ARCH_V8_4A;
>>> > +
>>> > +  if (!(flags & AARCH64_FL_FRINTTS) || !(flags & AARCH64_FL_FLAGM2))
>>> > +    goto done;
>>> > +
>>> > +  if (arch == AARCH64_ARCH_V8_4A)
>>> > +    arch = AARCH64_ARCH_V8_5A;
>>> > +
>>> > +  if (!(flags & AARCH64_FL_WFXT))
>>> > +    goto done;
>>> > +
>>> > +  if (arch == AARCH64_ARCH_V8_5A || arch == AARCH64_ARCH_V8_6A)
>>> > +    {
>>> > +      arch = AARCH64_ARCH_V8_7A;
>>> > +      /* We don't support native detection for FEAT_XS, so we'll assume it's
>>> > +	 present if the rest of these features are also present.  If we don't
>>> > +	 do this, then we would end up passing +noxs to the assembler.  */
>>> > +      flags |= AARCH64_FL_XS;
>>> > +    }
>>> > +done:
>>> > +
>>> > +  const struct arch_to_arch_name* a_to_an;
>>> > +  for (a_to_an = all_architectures;
>>> > +       a_to_an->arch != aarch64_no_arch;
>>> > +       a_to_an++)
>>> > +    {
>>> > +      if (a_to_an->arch == arch)
>>> > +	break;
>>> > +    }
>>> > +
>>> > +  std::string outstr = a_to_an->arch_name
>>> > +	+ aarch64_get_extension_string_for_isa_flags (flags, a_to_an->flags);
>>> > +
>>> > +  return outstr;
>>> > +}
>>> 
>>> I was hoping we could do this in a table-driven way.  Experimenting
>>> a bit locally (but only lightly tested), the following seems to work:
>>> 
>>> aarch64.h:
>>> 
>>> /* The set of all architecture flags.  */
>>> constexpr auto AARCH64_FL_ARCHES ATTRIBUTE_UNUSED = aarch64_feature_flags (0)
>>> #define AARCH64_ARCH(A, B, ARCH_IDENT, D, E) \
>>>   | feature_deps::ARCH_IDENT ().flag
>>> #include "config/aarch64/aarch64-arches.def"
>>> ;
>>> 
>>> aarch64-common.cc:
>>> 
>>> ...
>>>   const struct arch_to_arch_name *best = nullptr;
>>>   for (auto *a_to_an = all_architectures;
>>>        a_to_an->arch != aarch64_no_arch;
>>>        a_to_an++)
>>>     {
>>>       /* Require the architecture to have all architecture flags in FLAGS.  */
>>>       if ((~a_to_an->flags & flags & AARCH64_FL_ARCHES) != 0)
>>> 	continue;
>>> 
>>>       /* Skip architectures that add no new mandatory features.  */
>>>       if (best && (a_to_an->flags & ~best->flags & ~AARCH64_FL_ARCHES) == 0)
>>> 	continue;
>>> 
>>>       /* Require FLAGS to include all mandatory extensions.  */
>>>       if ((a_to_an->flags & ~flags & ~AARCH64_FL_ARCHES) != 0)
>>>         continue;
>>> 
>>>       best = a_to_an;
>>>     }
>>
>> There are some hypothetical cases in which your suggested approach wouldn't be
>> able to avoid the new feature flag, whereas my more targeted approach here
>> would.  I'm struggling to think of a realistic example though, as I can only
>> think of one core in which any of these features have been backported to an
>> earlier architecture version.  The closest bad examples I can find are:
>>
>> - Using -mcpu=a64fx+nosve: This would canonicalise to armv8.2-a+f16+fcma with
>>   either approach.  If jscvt didn't exist, however, then my approach would
>>   instead give armv8.3-a+nopauth+norcpc.
>> - Running on an unrecognised future cpu with an unfortunate combination of
>>   features could also be an example, but that's probably unrealistic.
>
> Thanks for considering these cases.  But like you say, the first one
> is counterfactual, in that we do have jscvt.  And personally, I think
> it's less surprising to add +fcma in that case anyway, rather than have
> +nosve bump the architecture version from armv8.2-a to armv8.3-a.
> It seems odd for +no to increase the architecture level.
>
> So IMO, the fact that we use the "correct" base architecture is a plus
> rather than a minus.  This is more about choosing between multiple ways
> of telling a direct truth, rather than trying to be indirect.
>
>> I also feel that this is a work-around that we should be applying in moderation
>> only when we know it would actually help with compatibility issues.
>
> I can see that.  But I think it's easier to reason about, and less
> surprising, if the behaviour stems from easily-understood rules,
> rather than being a list of special cases.
>
> Also, I think replacing -march=armv8-a+all+the+features+for+v8.8-a
> with -march=armv8.8-a is an independent good.  So...
>
>> However, if you'd still prefer to use a bigger table-driven hammer, then I can
>> change the patch to do that.
>
> ...yeah, I would still prefer that, sorry. :)

To recap, the original problem that we're trying to solve is:

  In addition to Andrew P's comment about documentation, doesn't this
  mean that -mcpu=native will now emit +fcma .arch strings for
  unrecognised CPUs (i.e. those for which we can't establish a
  baseline beyond Armv8-A)?  E.g., I think:

  processor	: 0
  BogoMIPS	: 100.00
  Features        : fp asimd atomics crc32 asimdrdm paca pacg lrcpc fcma
  CPU implementer	: 0xaa
  CPU architecture: 8
  CPU variant	: 0xaa
  CPU part	: 0xaa
  CPU revision	: 0

  should enable all the Armv8.3-A features that GCC is aware of after
  this patch, but we emit:

          .arch armv8-a+lse+rdma+crc+fcma+rcpc+pauth

  rather than:

          .arch armv8.3-a

  And that could be a problem because binutils support for the +fcma
  name is relatively recent (your patch from January this year).
  Assembling with older versions of gas is likely to fail, regardless
  of whether the code uses FCMA.

  I think we might need to adjust the driver code so that it tries
  to consolidate features into an architecture level where possible.

However, Andrew C pointed out off-list that this wouldn't solve
the problem for Armv8.5-A and above, because predres is a mandatory
feature for Armv8.5-A, but it does not have a /proc/cpuinfo identifier.
We'd therefore never go beyond armv8.5-a and would add an explicit
+flagm2 (etc.) for Armv8.6-A and above, or for Armv9-A and above.
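
To make the predres point concrete, here is a minimal sketch of the
detection side.  It is an illustration under stated assumptions rather
than the driver code: the flag bits, the token table, parse_features and
has_v8_5_mandatory are hypothetical, and the mandatory set is only a
subset chosen for the example.  The point it shows is that no
/proc/cpuinfo token can ever set the predres bit, so a check that requires
every mandatory Armv8.5-A feature cannot pass on detected flags alone.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical feature bits, for illustration only.
constexpr uint64_t F_FLAGM2  = 1ull << 0;
constexpr uint64_t F_FRINTTS = 1ull << 1;
constexpr uint64_t F_SB      = 1ull << 2;
constexpr uint64_t F_SSBS    = 1ull << 3;
constexpr uint64_t F_PREDRES = 1ull << 4;

// Assumed subset of the features that Armv8.5-A makes mandatory.
constexpr uint64_t V8_5_MANDATORY
  = F_FLAGM2 | F_FRINTTS | F_SB | F_SSBS | F_PREDRES;

// Turn a cpuinfo "Features" value into flag bits.  The token spellings are
// taken from the Features lines in the tests; nothing maps to F_PREDRES
// because there is no cpuinfo identifier for it.
uint64_t
parse_features (const std::string &line)
{
  static const std::map<std::string, uint64_t> tokens = {
    { "flagm2", F_FLAGM2 },
    { "frint", F_FRINTTS },
    { "sb", F_SB },
    { "ssbs", F_SSBS },
  };

  std::istringstream in (line);
  std::string tok;
  uint64_t flags = 0;
  while (in >> tok)
    {
      auto it = tokens.find (tok);
      if (it != tokens.end ())
	flags |= it->second;
    }
  return flags;
}

// True if FLAGS contains every feature in the assumed mandatory set.
bool
has_v8_5_mandatory (uint64_t flags)
{
  return (V8_5_MANDATORY & ~flags) == 0;
}
```

Whatever the Features line contains, has_v8_5_mandatory
(parse_features (line)) stays false in this sketch; it only becomes true
if the predres bit is supplied from somewhere other than cpuinfo.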

And Armv8.5-A+ and Armv9-A+ are likely to be the only cases that
matter, in that it's relatively unlikely that new CPUs would be
below those baselines.

One option would have been to limit the procedure above to flags
that have /proc/cpuinfo identifiers, but that makes the dangerous
assumption that features without /proc/cpuinfo identifiers would
never be used by the compiler (only directly via intrinsics or asm).

Another alternative would be to go back to the original patch,
that sometimes goes to a higher architecture level and then
adds +no options.  But that feels dangerous too.  Going back to
the example above, it seems wrong in principle to emit
-march=armv8.3-a+stuff for A64FX.

So after all that, I suppose we're just going to have to accept that
anyone using -mcpu=native with GCC 15+ on a target that GCC doesn't
recognise will need to use binutils 2.42+.  It would be worth a mention
in the release notes, under the "Caveats" section.

Does anyone disagree?

Thanks,
Richard

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler
  2025-01-09 18:00         ` Richard Sandiford
@ 2025-01-10 14:38           ` Andrew Carlotti
  0 siblings, 0 replies; 25+ messages in thread
From: Andrew Carlotti @ 2025-01-10 14:38 UTC (permalink / raw)
  To: gcc-patches, Andrew Pinski, Andre Vieira, richard.earnshaw,
	ktkachov, richard.sandiford

On Thu, Jan 09, 2025 at 06:00:34PM +0000, Richard Sandiford wrote:
> Richard Sandiford <richard.sandiford@arm.com> writes:
> > Andrew Carlotti <andrew.carlotti@arm.com> writes:
> >> On Mon, Nov 25, 2024 at 11:26:39PM +0000, Richard Sandiford wrote:
> >>> Sorry for the slow review.
> >>> 
> >>> Andrew Carlotti <andrew.carlotti@arm.com> writes:
> >>> > These new flags (+fcma, +jscvt, +rcpc2, +jscvt, +frintts, +wfxt and +xs)
> >>> > were only recently added to the assembler.  To improve compatibility
> >>> > with older assemblers, we try to avoid passing these new flags to the
> >>> > assembler if we can express the targeted architecture without them. We
> >>> > do so by using an almost-equivalent architecture string with a higher
> >>> > architecture version.
> >>> >
> >>> > This should never reduce the set of instructions accepted by the
> >>> > assembler.  It will make it more lenient in two cases:
> >>> >
> >>> > 1. Many system registers are currently gated behind architecture
> >>> > versions instead of specific feature flags.  Increasing the base
> >>> > architecture version may cause more system register accesses to be
> >>> > accepted.
> >>> >
> >>> > 2. FEAT_XS doesn't have an HWCAP bit or cpuinfo entry.  We still want to
> >>> > avoid passing +wfxt or +noxs to the assembler if possible, so we'll
> >>> > instruct the assembler to accept FEAT_XS instructions as well whenever
> >>> > the rest of the new features are enabled.
> >>> >
> >>> > gcc/ChangeLog:
> >>> >
> >>> > 	* common/config/aarch64/aarch64-common.cc
> >>> > 	(aarch64_get_arch_string_for_assembler): New.
> >>> > 	(aarch64_rewrite_march): New.
> >>> > 	(aarch64_rewrite_selected_cpu): Call new function.
> >>> > 	* config/aarch64/aarch64-elf.h (ASM_SPEC): Remove identity mapping.
> >>> > 	* config/aarch64/aarch64-protos.h
> >>> > 	(aarch64_get_arch_string_for_assembler): New.
> >>> > 	* config/aarch64/aarch64.cc
> >>> > 	(aarch64_declare_function_name): Call new function.
> >>> > 	(aarch64_start_file): Ditto.
> >>> > 	* config/aarch64/aarch64.h
> >>> > 	* config/aarch64/aarch64.h
> >>> > 	(EXTRA_SPEC_FUNCTIONS): Use new macro name.
> >>> > 	(MCPU_TO_MARCH_SPEC): Rename to...
> >>> > 	(MARCH_REWRITE_SPEC): ...this, and add new spec rule.
> >>> > 	(aarch64_rewrite_march): New declaration.
> >>> > 	(MCPU_TO_MARCH_SPEC_FUNCTIONS): Rename to...
> >>> > 	(MARCH_REWRITE_SPEC_FUNCTIONS): ...this, and add new function.
> >>> > 	(ASM_CPU_SPEC): Use new macro name.
> >>> >
> >>> > gcc/testsuite/ChangeLog:
> >>> >
> >>> > 	* gcc.target/aarch64/cpunative/native_cpu_21.c: Update check.
> >>> > 	* gcc.target/aarch64/cpunative/native_cpu_22.c: Update check.
> >>> > 	* gcc.target/aarch64/cpunative/info_27: New test.
> >>> > 	* gcc.target/aarch64/cpunative/info_28: New test.
> >>> > 	* gcc.target/aarch64/cpunative/info_29: New test.
> >>> > 	* gcc.target/aarch64/cpunative/native_cpu_27.c: New test.
> >>> > 	* gcc.target/aarch64/cpunative/native_cpu_28.c: New test.
> >>> > 	* gcc.target/aarch64/cpunative/native_cpu_29.c: New test.
> >>> >
> >>> >
> >>> > diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc
> >>> > index 2bfc597e333b6018970a9ee6e370a66b6d0960ef..717b3238be16f39a6fd1b4143662eb540ccf292d 100644
> >>> > --- a/gcc/common/config/aarch64/aarch64-common.cc
> >>> > +++ b/gcc/common/config/aarch64/aarch64-common.cc
> >>> > @@ -371,6 +371,119 @@ aarch64_get_extension_string_for_isa_flags
> >>> >    return outstr;
> >>> >  }
> >>> >  
> >>> > +/* Generate an arch string to be passed to the assembler.
> >>> > +
> >>> > +   Several flags were added retrospectively for features that were previously
> >>> > +   enabled only by specifying an architecture version.  We want to avoid
> >>> > +   passing these flags to the assembler if possible, to improve compatibility
> >>> > +   with older assemblers.  */
> >>> > +
> >>> > +std::string
> >>> > +aarch64_get_arch_string_for_assembler (aarch64_arch arch,
> >>> > +				       aarch64_feature_flags flags)
> >>> > +{
> >>> > +  if (!(flags & AARCH64_FL_FCMA) || !(flags & AARCH64_FL_JSCVT))
> >>> > +    goto done;
> >>> > +
> >>> > +  if (arch == AARCH64_ARCH_V8A
> >>> > +      || arch == AARCH64_ARCH_V8_1A
> >>> > +      || arch == AARCH64_ARCH_V8_2A)
> >>> > +    arch = AARCH64_ARCH_V8_3A;
> >>> > +
> >>> > +  if (!(flags & AARCH64_FL_RCPC2))
> >>> > +    goto done;
> >>> > +
> >>> > +  if (arch == AARCH64_ARCH_V8_3A)
> >>> > +    arch = AARCH64_ARCH_V8_4A;
> >>> > +
> >>> > +  if (!(flags & AARCH64_FL_FRINTTS) || !(flags & AARCH64_FL_FLAGM2))
> >>> > +    goto done;
> >>> > +
> >>> > +  if (arch == AARCH64_ARCH_V8_4A)
> >>> > +    arch = AARCH64_ARCH_V8_5A;
> >>> > +
> >>> > +  if (!(flags & AARCH64_FL_WFXT))
> >>> > +    goto done;
> >>> > +
> >>> > +  if (arch == AARCH64_ARCH_V8_5A || arch == AARCH64_ARCH_V8_6A)
> >>> > +    {
> >>> > +      arch = AARCH64_ARCH_V8_7A;
> >>> > +      /* We don't support native detection for FEAT_XS, so we'll assume it's
> >>> > +	 present if the rest of these features are also present.  If we don't
> >>> > +	 do this, then we would end up passing +noxs to the assembler.  */
> >>> > +      flags |= AARCH64_FL_XS;
> >>> > +    }
> >>> > +done:
> >>> > +
> >>> > +  const struct arch_to_arch_name* a_to_an;
> >>> > +  for (a_to_an = all_architectures;
> >>> > +       a_to_an->arch != aarch64_no_arch;
> >>> > +       a_to_an++)
> >>> > +    {
> >>> > +      if (a_to_an->arch == arch)
> >>> > +	break;
> >>> > +    }
> >>> > +
> >>> > +  std::string outstr = a_to_an->arch_name
> >>> > +	+ aarch64_get_extension_string_for_isa_flags (flags, a_to_an->flags);
> >>> > +
> >>> > +  return outstr;
> >>> > +}
> >>> 
> >>> I was hoping we could do this in a table-driven way.  Experimenting
> >>> a bit locally (but only lightly tested), the following seems to work:
> >>> 
> >>> aarch64.h:
> >>> 
> >>> /* The set of all architecture flags.  */
> >>> constexpr auto AARCH64_FL_ARCHES ATTRIBUTE_UNUSED = aarch64_feature_flags (0)
> >>> #define AARCH64_ARCH(A, B, ARCH_IDENT, D, E) \
> >>>   | feature_deps::ARCH_IDENT ().flag
> >>> #include "config/aarch64/aarch64-arches.def"
> >>> ;
> >>> 
> >>> aarch64-common.cc:
> >>> 
> >>> ...
> >>>   const struct arch_to_arch_name *best = nullptr;
> >>>   for (auto *a_to_an = all_architectures;
> >>>        a_to_an->arch != aarch64_no_arch;
> >>>        a_to_an++)
> >>>     {
> >>>       /* Require the architecture to have all architecture flags in FLAGS.  */
> >>>       if ((~a_to_an->flags & flags & AARCH64_FL_ARCHES) != 0)
> >>> 	continue;
> >>> 
> >>>       /* Skip architectures that add no new mandatory features.  */
> >>>       if (best && (a_to_an->flags & ~best->flags & ~AARCH64_FL_ARCHES) == 0)
> >>> 	continue;
> >>> 
> >>>       /* Require FLAGS to include all mandatory extensions.  */
> >>>       if ((a_to_an->flags & ~flags & ~AARCH64_FL_ARCHES) != 0)
> >>>         continue;
> >>> 
> >>>       best = a_to_an;
> >>>     }
> >>
> >> There are some hypothetical cases in which your suggested approach wouldn't be
> >> able to avoid the new feature flag, whereas my more targeted approach here
> >> would.  I'm struggling to think of a realistic example though, as I can only
> >> think of one core in which any of these features have been backported to an
> >> earlier architecture version.  The closest bad examples I can find are:
> >>
> >> - Using -mcpu=a64fx+nosve: This would canonicalise to armv8.2-a+f16+fcma with
> >>   either approach.  If jscvt didn't exist, however, then my approach would
> >>   instead give armv8.3-a+nopauth+norcpc.
> >> - Running on an unrecognised future cpu with an unfortunate combination of
> >>   features could also be an example, but that's probably unrealistic.
> >
> > Thanks for considering these cases.  But like you say, the first one
> > is counterfactual, in that we do have jscvt.  And personally, I think
> > it's less surprising to add +fcma in that case anyway, rather than have
> > +nosve bump the architecture version from armv8.2-a to armv8.3-a.
> > It seems odd for +no to increase the architecture level.
> >
> > So IMO, the fact that we use the "correct" base architecture is a plus
> > rather than a minus.  This is more about choosing between multiple ways
> > of telling a direct truth, rather than trying to be indirect.
> >
> >> I also feel that this is a work-around that we should be applying in moderation
> >> only when we know it would actually help with compatibility issues.
> >
> > I can see that.  But I think it's easier to reason about, and less
> > surprising, if the behaviour stems from easily-understood rules,
> > rather than being a list of special cases.
> >
> > Also, I think replacing -march=armv8-a+all+the+features+for+v8.8-a
> > with -march=armv8.8-a is an independent good.  So...
> >
> >> However, if you'd still prefer to use a bigger table-driven hammer, then I can
> >> change the patch to do that.
> >
> > ...yeah, I would still prefer that, sorry. :)
> 
> To recap, the original problem that we're trying to solve is:
> 
>   In addition to Andrew P's comment about documentation, doesn't this
>   mean that -mcpu=native will now emit +fcma .arch strings for
>   unrecognised CPUs (i.e. those for which we can't establish a
>   baseline beyond Armv8-A?).  E.g., I think:
> 
>   processor	: 0
>   BogoMIPS	: 100.00
>   Features        : fp asimd atomics crc32 asimdrdm paca pacg lrcpc fcma
>   CPU implementer	: 0xaa
>   CPU architecture: 8
>   CPU variant	: 0xaa
>   CPU part	: 0xaa
>   CPU revision	: 0
> 
>   should enable all the Armv8.3-A features that GCC is aware of after
>   this patch, but we emit:
> 
>           .arch armv8-a+lse+rdma+crc+fcma+rcpc+pauth
> 
>   rather than:
> 
>           .arch armv8.3-a
> 
>   And that could be a problem because binutils support for the +fcma
>   name is relatively recent (your patch from January this year).
>   Assembling with older versions of gas is likely to fail, regardless
>   of whether the code uses FCMA.
> 
>   I think we might need to adjust the driver code so that it tries
>   to consolidate features into an architecture level where possible.
> 
> However, Andrew C pointed out off-list that this wouldn't solve
> the problem for Armv8.5-A and above, because predres is a mandatory
> feature for Armv8.5-A, but it does not have a /proc/cpuinfo identifier.
> We'd therefore never go beyond armv8.5-a and would add an explicit
> +flagm2 (etc.) for Armv8.6-A and above, or for Armv9-A and above.
> 
> And Armv8.5-A+ and Armv9-A+ are likely to be the only cases that
> matter, in that it's relatively unlikely that new CPUs would be
> below those baselines.
> 
> One option would have been to limit the procedure above to flags
> that have /proc/cpuinfo identifiers, but that makes the dangerous
> assumption that features without /proc/cpuinfo identifiers would
> never be used by the compiler (only directly via intrinsics or asm).
> 
> Another alternative would be to go back to the original patch,
> that sometimes goes to a higher architecture level and then
> adds +no options.  But that feels dangerous too.  Going back to
> the example above, it seems wrong in principle to emit
> -march=armv8.3-a+stuff for A64FX.
> 
> So after all that, I suppose we're just going to have to accept that
> anyone using -mcpu=native with GCC 15+ on a target that GCC doesn't
> recognise will need to use binutils 2.42+.  It would be worth a mention
> in the release notes, under the "Caveats" section.
> 
> Does anyone disagree?
> 
> Thanks,
> Richard

I've pushed the first 9 patches in the series to master.  I'll also repost a
refactored version of this patch 10/10 with the extra canonicalisation
removed - that will be a good cleanup and will be useful if we want to try
tweaking the assembler target strings in any other way.
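
For anyone following the table-driven selection discussed in the quoted
review, here is a minimal standalone sketch of the three loop conditions.
It is only an illustration: the flag bits, the toy_architectures table and
the names pick_best_arch and check_examples are hypothetical stand-ins, not
GCC's real aarch64_feature_flags, all_architectures or any existing
function.  The table contents are an assumption made for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

using feature_flags = std::uint64_t;

// Hypothetical bit assignments (assumption): low bits mark architecture
// levels, higher bits mark individual extensions.
constexpr feature_flags FL_V8A   = 1ull << 0;
constexpr feature_flags FL_V8_1A = 1ull << 1;
constexpr feature_flags FL_V8_2A = 1ull << 2;
constexpr feature_flags FL_V8_3A = 1ull << 3;
constexpr feature_flags FL_LSE   = 1ull << 8;
constexpr feature_flags FL_CRC   = 1ull << 9;
constexpr feature_flags FL_RDMA  = 1ull << 10;
constexpr feature_flags FL_PAUTH = 1ull << 11;
constexpr feature_flags FL_RCPC  = 1ull << 12;
constexpr feature_flags FL_FCMA  = 1ull << 13;
constexpr feature_flags FL_JSCVT = 1ull << 14;

// The set of all architecture-level flags in this toy model.
constexpr feature_flags FL_ARCHES = FL_V8A | FL_V8_1A | FL_V8_2A | FL_V8_3A;

struct arch_entry { const char *name; feature_flags flags; };

// Toy table (assumption): each level includes the previous level's flags.
constexpr feature_flags V8A_ALL   = FL_V8A;
constexpr feature_flags V8_1A_ALL = V8A_ALL | FL_V8_1A | FL_LSE | FL_CRC | FL_RDMA;
constexpr feature_flags V8_2A_ALL = V8_1A_ALL | FL_V8_2A;
constexpr feature_flags V8_3A_ALL = V8_2A_ALL | FL_V8_3A | FL_PAUTH | FL_RCPC
                                    | FL_FCMA | FL_JSCVT;

static const arch_entry toy_architectures[] = {
  { "armv8-a",   V8A_ALL },
  { "armv8.1-a", V8_1A_ALL },
  { "armv8.2-a", V8_2A_ALL },
  { "armv8.3-a", V8_3A_ALL },
};

// Return the name of the highest table entry that FLAGS fully covers,
// or an empty string if no entry qualifies.
std::string pick_best_arch (feature_flags flags)
{
  const arch_entry *best = nullptr;
  for (const arch_entry &a : toy_architectures)
    {
      // Require the architecture to have all architecture flags in FLAGS.
      if ((~a.flags & flags & FL_ARCHES) != 0)
        continue;
      // Skip architectures that add no new mandatory features.
      if (best && (a.flags & ~best->flags & ~FL_ARCHES) == 0)
        continue;
      // Require FLAGS to include all mandatory extensions.
      if ((a.flags & ~flags & ~FL_ARCHES) != 0)
        continue;
      best = &a;
    }
  return best ? best->name : "";
}

// Usage examples for the toy table above.
void check_examples ()
{
  feature_flags f = FL_V8A | FL_LSE | FL_CRC | FL_RDMA
                    | FL_PAUTH | FL_RCPC | FL_FCMA;
  // jscvt is missing, so the toy armv8.3-a entry is rejected.
  assert (pick_best_arch (f) == "armv8.1-a");
  // With every mandatory armv8.3-a extension present, the level is raised.
  assert (pick_best_arch (f | FL_JSCVT) == "armv8.3-a");
  // An armv8.2-a base with only +fcma stays at armv8.2-a.
  assert (pick_best_arch (V8_2A_ALL | FL_FCMA) == "armv8.2-a");
}
```

In this toy model a single missing mandatory extension is enough to stop
the level from being raised, which is the same shape as the predres
problem described in the recap for Armv8.5-A and above.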



* Re: [PATCH 1/8] aarch64: Use PAUTH instead of V8_3A in some places
  2024-10-08 15:46   ` Richard Sandiford
@ 2025-03-20 14:05     ` Alfie Richards
  2025-04-01  6:45       ` Alfie Richards
  0 siblings, 1 reply; 25+ messages in thread
From: Alfie Richards @ 2025-03-20 14:05 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Sandiford, Richard.Earnshaw, andrew.carlotti

Hi all,

This commit applies cleanly to GCC 14 and fixes PR119372.

Bootstrapped and regtested on aarch64-linux-gnu.

Okay for gcc 14 backport?

Alfie Richards

On 08/10/2024 16:46, Richard Sandiford wrote:
> Andrew Carlotti <andrew.carlotti@arm.com> writes:
>> gcc/ChangeLog:
>>
>> 	* config/aarch64/aarch64.cc
>> 	(aarch64_expand_epilogue): Use TARGET_PAUTH.
>> 	* config/aarch64/aarch64.md: Update comment.
>>
>>
>> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
>> index e7bb3278a27eca44c46afd26069d608218198a54..cf1107127fd5d9e12ad42441528666bf6b733f73 100644
>> --- a/gcc/config/aarch64/aarch64.cc
>> +++ b/gcc/config/aarch64/aarch64.cc
>> @@ -10042,12 +10042,12 @@ aarch64_expand_epilogue (rtx_call_insn *sibcall)
>>   	1) Sibcalls don't return in a normal way, so if we're about to call one
>>   	   we must authenticate.
>>   
>> -	2) The RETAA instruction is not available before ARMv8.3-A, so if we are
>> -	   generating code for !TARGET_ARMV8_3 we can't use it and must
>> +	2) The RETAA instruction is not available without FEAT_PAuth, so if we
>> +	   are generating code for !TARGET_PAUTH we can't use it and must
>>   	   explicitly authenticate.
>>       */
>>     if (aarch64_return_address_signing_enabled ()
>> -      && (sibcall || !TARGET_ARMV8_3))
>> +      && (sibcall || !TARGET_PAUTH))
>>       {
>>         switch (aarch64_ra_sign_key)
>>   	{
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> index c54b29cd64b9e0dc6c6d12735049386ccedc5408..0940a84f9295ee2bc07282b150095fdb5af11a4d 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -7672,10 +7672,10 @@
>>   )
>>   
>>   ;; Pointer authentication patterns are always provided.  In architecture
>> -;; revisions prior to ARMv8.3-A these HINT instructions operate as NOPs.
>> +;; revisions prior to FEAT_PAuth these HINT instructions operate as NOPs.
> 
> I suppose this should be something like "On targets that don't implement
> FEAT_PAuth".  OK with that change, thanks.
> 
> Richard
> 
>>   ;; This lets the user write portable software which authenticates pointers
>> -;; when run on something which implements ARMv8.3-A, and which runs
>> -;; correctly, but does not authenticate pointers, where ARMv8.3-A is not
>> +;; when run on something which implements FEAT_PAuth, and which runs
>> +;; correctly, but does not authenticate pointers, where FEAT_PAuth is not
>>   ;; implemented.
>>   
>>   ;; Signing/Authenticating R30 using SP as the salt.
> 
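
For readers of the quoted hunk, the epilogue condition can be sketched as
a plain predicate.  This is only an illustration under stated assumptions:
needs_explicit_auth and its boolean parameters are hypothetical stand-ins
for aarch64_return_address_signing_enabled (), the sibcall argument and
TARGET_PAUTH, and the sketch does not model the key selection done by the
switch on aarch64_ra_sign_key.

```cpp
#include <cassert>

// Hypothetical model of the condition in aarch64_expand_epilogue:
// an explicit authentication instruction is needed when return-address
// signing is enabled and either
//   1) we are about to make a sibcall, which does not return normally, or
//   2) FEAT_PAuth is not assumed, so the combined RETAA cannot be used.
bool needs_explicit_auth (bool signing_enabled, bool is_sibcall, bool has_pauth)
{
  return signing_enabled && (is_sibcall || !has_pauth);
}

// Usage examples covering the cases described in the comment above.
void check_examples ()
{
  // Signing disabled: nothing to authenticate.
  assert (!needs_explicit_auth (false, false, false));
  assert (!needs_explicit_auth (false, true, true));
  // Normal return with FEAT_PAuth: the combined return can be used.
  assert (!needs_explicit_auth (true, false, true));
  // Normal return without FEAT_PAuth: authenticate explicitly.
  assert (needs_explicit_auth (true, false, false));
  // Sibcalls always need explicit authentication when signing is on.
  assert (needs_explicit_auth (true, true, true));
}
```

The point of the patch is that the second operand tests the feature
(TARGET_PAUTH) rather than the architecture level (TARGET_ARMV8_3).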



* Re: [PATCH 1/8] aarch64: Use PAUTH instead of V8_3A in some places
  2025-03-20 14:05     ` Alfie Richards
@ 2025-04-01  6:45       ` Alfie Richards
  2025-04-02  9:02         ` Richard Sandiford
  0 siblings, 1 reply; 25+ messages in thread
From: Alfie Richards @ 2025-04-01  6:45 UTC (permalink / raw)
  To: Richard Sandiford, gcc-patches; +Cc: andrew.carlotti

Hi Richard,

Is this backport okay for GCC 14 as well?
(It applies cleanly to 14, but the patch for 12 and 13 required a minor edit.)

Alfie




* Re: [PATCH 1/8] aarch64: Use PAUTH instead of V8_3A in some places
  2025-04-01  6:45       ` Alfie Richards
@ 2025-04-02  9:02         ` Richard Sandiford
  0 siblings, 0 replies; 25+ messages in thread
From: Richard Sandiford @ 2025-04-02  9:02 UTC (permalink / raw)
  To: Alfie Richards; +Cc: gcc-patches, andrew.carlotti

Alfie Richards <alfie.richards@arm.com> writes:
> Hi Richard,
>
> Is this backport okay for GCC 14 as well?
> (It applies cleanly to 14, but the patch for 12 and 13 required a minor edit.)

Yeah, OK for GCC 14, thanks.

Richard

>
> Alfie


Thread overview: 25+ messages
2024-10-04 17:50 [PATCH 0/8] aarch64: Add new flags for existing features Andrew Carlotti
2024-10-04 17:51 ` [PATCH 1/8] aarch64: Use PAUTH instead of V8_3A in some places Andrew Carlotti
2024-10-08 15:46   ` Richard Sandiford
2025-03-20 14:05     ` Alfie Richards
2025-04-01  6:45       ` Alfie Richards
2025-04-02  9:02         ` Richard Sandiford
2024-10-04 17:52 ` [PATCH 2/8] aarch64: Add new +fcma flag Andrew Carlotti
2024-10-08 16:18   ` Richard Sandiford
2024-10-25 14:31     ` Andre Vieira (lists)
2024-10-04 17:53 ` [PATCH 3/8] aarch64: Add new +jscvt flag Andrew Carlotti
2024-10-04 17:53 ` [PATCH 4/8] aarch64: Add new +frintts flag Andrew Carlotti
2024-10-04 17:53 ` [PATCH 5/8] aarch64: Add new +flagm2 flag Andrew Carlotti
2024-10-04 17:54 ` [PATCH 6/8] aarch64: Add new +rcpc2 flag Andrew Carlotti
2024-10-04 17:54 ` [PATCH 7/8] aarch64: Add new +wfxt flag Andrew Carlotti
2024-10-04 17:54 ` [PATCH 8/8] aarch64: Add new +xs flag Andrew Carlotti
2024-10-04 20:19 ` [PATCH 0/8] aarch64: Add new flags for existing features Andrew Pinski
2024-11-12 16:56 ` [PATCH 9/10] docs: Add new AArch64 flags Andrew Carlotti
2024-11-26 13:29   ` Richard Sandiford
2024-11-12 16:57 ` [PATCH 10/10] aarch64: Try to avoid passing new flags to assembler Andrew Carlotti
2024-11-25 23:26   ` Richard Sandiford
2025-01-07 12:11     ` Andrew Carlotti
2025-01-07 17:50       ` Richard Sandiford
2025-01-09 18:00         ` Richard Sandiford
2025-01-10 14:38           ` Andrew Carlotti
2024-11-12 17:02 ` [PATCH 0/10] aarch64: Add new flags for existing features Andrew Carlotti
