public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc r9-9690] arm: reorder assembler architecture directives [PR101723]
@ 2021-08-23 14:41 Richard Earnshaw
  0 siblings, 0 replies; only message in thread
From: Richard Earnshaw @ 2021-08-23 14:41 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:04c568961e793a1d7ad86248b4ca929fc84acf8d

commit r9-9690-g04c568961e793a1d7ad86248b4ca929fc84acf8d
Author: Richard Earnshaw <rearnsha@arm.com>
Date:   Thu Jul 29 11:00:31 2021 +0100

    arm: reorder assembler architecture directives [PR101723]
    
    A change to the way gas interprets the .fpu directive in binutils-2.34
    means that issuing .fpu will clear any features set by .arch_extension
    that apply to the floating point or simd units.  This unfortunately
    causes problems for more recent versions of the architecture because
    we currently emit .arch, .arch_extension and .fpu directives at
    different times and try to suppress redundant changes.
    
    This change addresses this by firstly unifying all the places where we
    emit these directives to a single block of code and secondly
    (re)emitting all the directives if any changes have been made to the
    target options.  Whilst this is slightly more than the strict minimum
    it should be enough to catch all cases where a change could have
    happened.  The new code also emits the directives in the order: .arch,
    .fpu, .arch_extension.  This ensures that the additional architectural
    extensions are not removed by a later .fpu directive.
    
    Whilst writing this patch I also noticed that in the corner case where
    the last function to be compiled had a non-standard set of
    architecture flags, the assembler would add an incorrect set of
    derived attributes for the file as a whole.  Instead of reflecting the
    command-line options it would reflect the flags from the last file in
    the function.  To address this I've also added a call to re-emit the
    flags from the asm_file_end callback so the assembler will be in the
    correct state when it finishes processing the intput.
    
    There's some slight churn to the testsuite as a consequence of this,
    because previously we had a hack to suppress emitting a .fpu directive
    for one specific case, but with the new order this is no-longer
    necessary.
    
    gcc/ChangeLog:
    
            PR target/101723
            * config/arm/arm-cpus.in (quirk_no_asmcpu): New feature bit.
            (ALL_QUIRKS): Add it.
            (generic-armv7-a): Add quirk to suppress writing .cpu directive in
            asm output.
            * config/arm/arm.c (arm_identify_fpu_from_isa): New variable.
            (arm_last_printed_arch_string): Delete.
            (arm_last-printed_fpu_string): Delete.
            (arm_configure_build_target): If use of floating-point/SIMD is
            disabled, remove all fp/simd related features from the target ISA.
            (last_arm_targ_options): New variable.
            (arm_print_asm_arch_directives): Add new parameters.  Change order
            of emitted directives and handle all cases here.
            (arm_file_start): Always call arm_print_asm_arch_directives, move
            all generation of .arch/.arch_extension here.
            (arm_file_end): Call arm_print_asm_arch.
            (arm_declare_function_name): Call arm_print_asm_arch_directives
            instead of printing .arch/.fpu directives directly.
    
    gcc/testsuite/ChangeLog:
    
            PR target/101723
            * gcc.target/arm/attr-neon.c: Tighten scan-assembler tests.
            * gcc.target/arm/attr-neon2.c: Likewise.
            * gcc.target/arm/attr-neon3.c: Likewise.
            * gcc.target/arm/pr69245.c: Tighten scan-assembler match, but allow
            multiple instances.
            * gcc.target/arm/pragma_fpu_attribute.c: Likewise.
            * gcc.target/arm/pragma_fpu_attribute_2.c: Likewise.
    
    (cherry picked from commit c1cdabe3aab817d95a8db00a8b5e9f6bcdea936f)

Diff:
---
 gcc/config/arm/arm-cpus.in                         |   6 +-
 gcc/config/arm/arm.c                               | 173 ++++++++++-----------
 gcc/testsuite/gcc.target/arm/attr-neon.c           |   9 +-
 gcc/testsuite/gcc.target/arm/attr-neon2.c          |   4 +-
 gcc/testsuite/gcc.target/arm/attr-neon3.c          |   6 +-
 gcc/testsuite/gcc.target/arm/pr69245.c             |   6 +-
 .../gcc.target/arm/pragma_fpu_attribute.c          |   7 +-
 .../gcc.target/arm/pragma_fpu_attribute_2.c        |   7 +-
 8 files changed, 111 insertions(+), 107 deletions(-)

diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 37fcf2a9a8f..de2caec5497 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -180,6 +180,9 @@ define feature quirk_armv6kz
 # Cortex-M3 LDRD quirk.
 define feature quirk_cm3_ldrd
 
+# Don't use .cpu assembly directive
+define feature quirk_no_asmcpu
+
 # (Very) slow multiply operations.  Should probably be a tuning bit.
 define feature smallmul
 
@@ -277,7 +280,7 @@ define fgroup DOTPROD	NEON dotprod
 # architectures.
 # xscale isn't really a 'quirk', but it isn't an architecture either and we
 # need to ignore it for matching purposes.
-define fgroup ALL_QUIRKS   quirk_no_volatile_ce quirk_armv6kz quirk_cm3_ldrd xscale
+define fgroup ALL_QUIRKS   quirk_no_volatile_ce quirk_armv6kz quirk_cm3_ldrd xscale quirk_no_asmcpu
 
 # Architecture entries
 # format:
@@ -979,6 +982,7 @@ begin cpu generic-armv7-a
  cname genericv7a
  tune flags LDSCHED
  architecture armv7-a+fp
+ isa quirk_no_asmcpu
  option mp add mp
  option sec add sec
  option vfpv3-d16 add VFPv3 FP_DBL
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index f4ed21ebb50..58afc4a2647 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -78,10 +78,6 @@
 typedef struct minipool_node    Mnode;
 typedef struct minipool_fixup   Mfix;
 
-/* The last .arch and .fpu assembly strings that we printed.  */
-static std::string arm_last_printed_arch_string;
-static std::string arm_last_printed_fpu_string;
-
 void (*arm_lang_output_object_attributes_hook)(void);
 
 struct four_ints
@@ -325,6 +321,7 @@ static unsigned int arm_hard_regno_nregs (unsigned int, machine_mode);
 static bool arm_hard_regno_mode_ok (unsigned int, machine_mode);
 static bool arm_modes_tieable_p (machine_mode, machine_mode);
 static HOST_WIDE_INT arm_constant_alignment (const_tree, HOST_WIDE_INT);
+static const char *arm_identify_fpu_from_isa (sbitmap);
 \f
 /* Table of machine attributes.  */
 static const struct attribute_spec arm_attribute_table[] =
@@ -3350,6 +3347,11 @@ arm_configure_build_target (struct arm_build_target *target,
       bitmap_ior (target->isa, target->isa, fpu_bits);
     }
 
+  /* If we have the soft-float ABI, clear any feature bits relating to use of
+     floating-point operations.  They'll just confuse things later on.  */
+  if (arm_float_abi == ARM_FLOAT_ABI_SOFT)
+    bitmap_and_compl (target->isa, target->isa, isa_all_fpbits);
+
   if (!arm_selected_tune)
     arm_selected_tune = arm_selected_cpu;
   else /* Validate the features passed to -mtune.  */
@@ -26663,20 +26665,65 @@ arm_print_tune_info (void)
 	       (int) current_tune->sched_autopref);
 }
 
+/* The last set of target options used to emit .arch directives, etc.  This
+   could be a function-local static if it were not required to expose it as a
+   root to the garbage collector.  */
+static GTY(()) cl_target_option *last_asm_targ_options = NULL;
+
 /* Print .arch and .arch_extension directives corresponding to the
    current architecture configuration.  */
 static void
-arm_print_asm_arch_directives ()
+arm_print_asm_arch_directives (FILE *stream, cl_target_option *targ_options)
 {
+  arm_build_target build_target;
+  /* If the target options haven't changed since the last time we were called
+     there is nothing to do.  This should be sufficient to suppress the
+     majority of redundant work.  */
+  if (last_asm_targ_options == targ_options)
+    return;
+
+  last_asm_targ_options = targ_options;
+
+  build_target.isa = sbitmap_alloc (isa_num_bits);
+  arm_configure_build_target (&build_target, targ_options, false);
+
+  if (build_target.core_name
+      && !bitmap_bit_p (build_target.isa, isa_bit_quirk_no_asmcpu))
+    {
+      const char* truncated_name
+	= arm_rewrite_selected_cpu (build_target.core_name);
+      asm_fprintf (stream, "\t.cpu %s\n", truncated_name);
+    }
+
   const arch_option *arch
     = arm_parse_arch_option_name (all_architectures, "-march",
-				  arm_active_target.arch_name);
+				  build_target.arch_name);
   auto_sbitmap opt_bits (isa_num_bits);
 
   gcc_assert (arch);
 
-  asm_fprintf (asm_out_file, "\t.arch %s\n", arm_active_target.arch_name);
-  arm_last_printed_arch_string = arm_active_target.arch_name;
+  if (strcmp (build_target.arch_name, "armv7ve") == 0)
+    {
+      /* Keep backward compatability for assemblers which don't support
+	 armv7ve.  Fortunately, none of the following extensions are reset
+	 by a .fpu directive.  */
+      asm_fprintf (stream, "\t.arch armv7-a\n");
+      asm_fprintf (stream, "\t.arch_extension virt\n");
+      asm_fprintf (stream, "\t.arch_extension idiv\n");
+      asm_fprintf (stream, "\t.arch_extension sec\n");
+      asm_fprintf (stream, "\t.arch_extension mp\n");
+    }
+  else
+    asm_fprintf (stream, "\t.arch %s\n", build_target.arch_name);
+
+  /* The .fpu directive will reset any architecture extensions from the
+     assembler that relate to the fp/vector extensions.  So put this out before
+     any .arch_extension directives.  */
+  const char *fpu_name = (TARGET_SOFT_FLOAT
+			  ? "softvfp"
+			  : arm_identify_fpu_from_isa (build_target.isa));
+  asm_fprintf (stream, "\t.fpu %s\n", fpu_name);
+
   if (!arch->common.extensions)
     return;
 
@@ -26688,13 +26735,12 @@ arm_print_asm_arch_directives ()
 	{
 	  arm_initialize_isa (opt_bits, opt->isa_bits);
 
-	  /* If every feature bit of this option is set in the target
-	     ISA specification, print out the option name.  However,
-	     don't print anything if all the bits are part of the
-	     FPU specification.  */
-	  if (bitmap_subset_p (opt_bits, arm_active_target.isa)
+	  /* If every feature bit of this option is set in the target ISA
+	     specification, print out the option name.  However, don't print
+	     anything if all the bits are part of the FPU specification.  */
+	  if (bitmap_subset_p (opt_bits, build_target.isa)
 	      && !bitmap_subset_p (opt_bits, isa_all_fpubits_internal))
-	    asm_fprintf (asm_out_file, "\t.arch_extension %s\n", opt->name);
+	    asm_fprintf (stream, "\t.arch_extension %s\n", opt->name);
 	}
     }
 }
@@ -26704,42 +26750,23 @@ arm_file_start (void)
 {
   int val;
 
+  arm_print_asm_arch_directives
+    (asm_out_file, TREE_TARGET_OPTION (target_option_default_node));
+
   if (TARGET_BPABI)
     {
-      /* We don't have a specified CPU.  Use the architecture to
-	 generate the tags.
-
-	 Note: it might be better to do this unconditionally, then the
-	 assembler would not need to know about all new CPU names as
-	 they are added.  */
-      if (!arm_active_target.core_name)
-	{
-	  /* armv7ve doesn't support any extensions.  */
-	  if (strcmp (arm_active_target.arch_name, "armv7ve") == 0)
-	    {
-	      /* Keep backward compatability for assemblers
-		 which don't support armv7ve.  */
-	      asm_fprintf (asm_out_file, "\t.arch armv7-a\n");
-	      asm_fprintf (asm_out_file, "\t.arch_extension virt\n");
-	      asm_fprintf (asm_out_file, "\t.arch_extension idiv\n");
-	      asm_fprintf (asm_out_file, "\t.arch_extension sec\n");
-	      asm_fprintf (asm_out_file, "\t.arch_extension mp\n");
-	      arm_last_printed_arch_string = "armv7ve";
-	    }
-	  else
-	    arm_print_asm_arch_directives ();
-	}
-      else if (strncmp (arm_active_target.core_name, "generic", 7) == 0)
-	{
-	  asm_fprintf (asm_out_file, "\t.arch %s\n",
-		       arm_active_target.core_name + 8);
-	  arm_last_printed_arch_string = arm_active_target.core_name + 8;
-	}
-      else
+      /* If we have a named cpu, but we the assembler does not support that
+	 name via .cpu, put out a cpu name attribute; but don't do this if the
+	 name starts with the fictitious prefix, 'generic'.  */
+      if (arm_active_target.core_name
+	  && bitmap_bit_p (arm_active_target.isa, isa_bit_quirk_no_asmcpu)
+	  && strncmp (arm_active_target.core_name, "generic", 7) != 0)
 	{
 	  const char* truncated_name
 	    = arm_rewrite_selected_cpu (arm_active_target.core_name);
-	  asm_fprintf (asm_out_file, "\t.cpu %s\n", truncated_name);
+	  if (bitmap_bit_p (arm_active_target.isa, isa_bit_quirk_no_asmcpu))
+	    asm_fprintf (asm_out_file, "\t.eabi_attribute 5, \"%s\"\n",
+			 truncated_name);
 	}
 
       if (print_tune_info)
@@ -26807,6 +26834,13 @@ arm_file_end (void)
 {
   int regno;
 
+  /* Just in case the last function output in the assembler had non-default
+     architecture directives, we force the assembler state back to the default
+     set, so that any 'calculated' build attributes are based on the default
+     options rather than the special options for that function.  */
+  arm_print_asm_arch_directives
+    (asm_out_file, TREE_TARGET_OPTION (target_option_default_node));
+
   if (NEED_INDICATE_EXEC_STACK)
     /* Add .note.GNU-stack.  */
     file_end_indicate_exec_stack ();
@@ -31226,44 +31260,7 @@ arm_declare_function_name (FILE *stream, const char *name, tree decl)
     targ_options = TREE_TARGET_OPTION (target_option_current_node);
   gcc_assert (targ_options);
 
-  /* Only update the assembler .arch string if it is distinct from the last
-     such string we printed. arch_to_print is set conditionally in case
-     targ_options->x_arm_arch_string is NULL which can be the case
-     when cc1 is invoked directly without passing -march option.  */
-  std::string arch_to_print;
-  if (targ_options->x_arm_arch_string)
-    arch_to_print = targ_options->x_arm_arch_string;
-
-  if (arch_to_print != arm_last_printed_arch_string)
-    {
-      std::string arch_name
-	= arch_to_print.substr (0, arch_to_print.find ("+"));
-      asm_fprintf (asm_out_file, "\t.arch %s\n", arch_name.c_str ());
-      const arch_option *arch
-	= arm_parse_arch_option_name (all_architectures, "-march",
-				      targ_options->x_arm_arch_string);
-      auto_sbitmap opt_bits (isa_num_bits);
-
-      gcc_assert (arch);
-      if (arch->common.extensions)
-	{
-	  for (const struct cpu_arch_extension *opt = arch->common.extensions;
-	       opt->name != NULL;
-	       opt++)
-	    {
-	      if (!opt->remove)
-		{
-		  arm_initialize_isa (opt_bits, opt->isa_bits);
-		  if (bitmap_subset_p (opt_bits, arm_active_target.isa)
-		      && !bitmap_subset_p (opt_bits, isa_all_fpubits_internal))
-		    asm_fprintf (asm_out_file, "\t.arch_extension %s\n",
-				 opt->name);
-		}
-	     }
-	}
-
-      arm_last_printed_arch_string = arch_to_print;
-    }
+  arm_print_asm_arch_directives (stream, targ_options);
 
   fprintf (stream, "\t.syntax unified\n");
 
@@ -31281,16 +31278,6 @@ arm_declare_function_name (FILE *stream, const char *name, tree decl)
   else
     fprintf (stream, "\t.arm\n");
 
-  std::string fpu_to_print
-    = TARGET_SOFT_FLOAT
-	? "softvfp" : arm_identify_fpu_from_isa (arm_active_target.isa);
-
-  if (fpu_to_print != arm_last_printed_arch_string)
-    {
-      asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_to_print.c_str ());
-      arm_last_printed_fpu_string = fpu_to_print;
-    }
-
   if (TARGET_POKE_FUNCTION_NAME)
     arm_poke_function_name (stream, (const char *) name);
 }
diff --git a/gcc/testsuite/gcc.target/arm/attr-neon.c b/gcc/testsuite/gcc.target/arm/attr-neon.c
index 225fb8dc3db..e8e3086247d 100644
--- a/gcc/testsuite/gcc.target/arm/attr-neon.c
+++ b/gcc/testsuite/gcc.target/arm/attr-neon.c
@@ -1,7 +1,10 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_neon_ok } */
 /* { dg-options "-O2 -ftree-vectorize" } */
-/* { dg-add-options arm_neon arm_v8_vfp } */ /* The arm_v8_vfp adds -mfpu=fp-armv8 to the command line, overriding any -mfpu= option set by arm_neon, thus ensuring that the attributes below really are checked for correct fpu selection.  */
+/* { dg-add-options arm_neon arm_v8_vfp } */
+/* The arm_v8_vfp adds -mfpu=fp-armv8 to the command line, overriding any
+   -mfpu= option set by arm_neon, thus ensuring that the attributes below
+   really are checked for correct fpu selection.  */
 
 /* Verify that neon instructions are emitted once.  */
 void __attribute__ ((target("fpu=neon")))
@@ -18,6 +21,6 @@ f3(int n, int x[], int y[]) {
     y[i] = x[i] << 3;
 }
 
-/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */
-/* { dg-final { scan-assembler-times "\.fpu neon" 1 } } */
+/* { dg-final { scan-assembler-times "\.fpu\\s+vfp\n" 1 } } */
+/* { dg-final { scan-assembler-times "\.fpu\\s+neon\n" 1 } } */
 /* { dg-final { scan-assembler-times "vshl" 1 } } */
diff --git a/gcc/testsuite/gcc.target/arm/attr-neon2.c b/gcc/testsuite/gcc.target/arm/attr-neon2.c
index 29668256cf5..91cf4dde8db 100644
--- a/gcc/testsuite/gcc.target/arm/attr-neon2.c
+++ b/gcc/testsuite/gcc.target/arm/attr-neon2.c
@@ -27,8 +27,8 @@ my1 (int8x8_t __a, int8x8_t __b)
   return __a + __b;
 }
 
-/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */
-/* { dg-final { scan-assembler-times "\.fpu neon" 1 } } */
+/* { dg-final { scan-assembler {\.fpu\s+vfp\n} } } */
+/* { dg-final { scan-assembler {\.fpu\s+neon\n} } } */
 /* { dg-final { scan-assembler "vadd" } } */
 
 
diff --git a/gcc/testsuite/gcc.target/arm/attr-neon3.c b/gcc/testsuite/gcc.target/arm/attr-neon3.c
index 17e429ad739..0acb7f98dc6 100644
--- a/gcc/testsuite/gcc.target/arm/attr-neon3.c
+++ b/gcc/testsuite/gcc.target/arm/attr-neon3.c
@@ -31,8 +31,8 @@ my1 (int8x8_t __a, int8x8_t __b)
   return __a + __b;
 }
 
-/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */
-/* { dg-final { scan-assembler-times "\.fpu neon" 1 } } */
-/* { dg-final { scan-assembler-times "\.fpu crypto-neon-fp-armv8" 1 } } */
+/* { dg-final { scan-assembler {\.fpu\s+vfp\n} } } */
+/* { dg-final { scan-assembler {\.fpu\s+neon\n} } } */
+/* { dg-final { scan-assembler {\.fpu\s+crypto-neon-fp-armv8\n} } } */
 /* { dg-final { scan-assembler-times "vld1" 1 } } */
 /* { dg-final { scan-assembler-times "vadd" 1} } */
diff --git a/gcc/testsuite/gcc.target/arm/pr69245.c b/gcc/testsuite/gcc.target/arm/pr69245.c
index bd505187728..34b97a22e15 100644
--- a/gcc/testsuite/gcc.target/arm/pr69245.c
+++ b/gcc/testsuite/gcc.target/arm/pr69245.c
@@ -23,4 +23,8 @@ void fn2 ()
   d = b * c + a;
 }
 
-/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */
+/* Because we don't know the exact command-line options used to invoke the test
+   we cannot expect these tests to match exactly once.  But they must appear at
+   least once.  */
+/* { dg-final { scan-assembler "\.fpu\s+vfp\n" } } */
+/* { dg-final { scan-assembler "\.fpu\s+neon-vfpv4\n" } } */
diff --git a/gcc/testsuite/gcc.target/arm/pragma_fpu_attribute.c b/gcc/testsuite/gcc.target/arm/pragma_fpu_attribute.c
index 174be85f3f7..7e63cf53013 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_fpu_attribute.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_fpu_attribute.c
@@ -22,5 +22,8 @@ uint32_t restored ()
   return bar();
 }
 
-/* { dg-final { scan-assembler-times {\.fpu\s+vfpv4} 1 } } */
-/* { dg-final { scan-assembler-times {\.fpu\s+vfpv3-d16} 1 } } */
+/* We can't tell exactly how many times the following tests will match because
+   command-line options may cause additional instances to be generated, but
+   each must be present at least once.  */
+/* { dg-final { scan-assembler {\.fpu\s+vfpv4\n} } } */
+/* { dg-final { scan-assembler {\.fpu\s+vfpv3-d16\n} } } */
diff --git a/gcc/testsuite/gcc.target/arm/pragma_fpu_attribute_2.c b/gcc/testsuite/gcc.target/arm/pragma_fpu_attribute_2.c
index add40ddc6b8..398d8fff35c 100644
--- a/gcc/testsuite/gcc.target/arm/pragma_fpu_attribute_2.c
+++ b/gcc/testsuite/gcc.target/arm/pragma_fpu_attribute_2.c
@@ -25,5 +25,8 @@ uint32_t restored ()
   return bar();
 }
 
-/* { dg-final { scan-assembler-times {\.fpu\s+vfpv4} 1 } } */
-/* { dg-final { scan-assembler-times {\.fpu\s+vfpv3-d16} 1 } } */
+/* We can't tell exactly how many times the following tests will match because
+   command-line options may cause additional instances to be generated, but
+   each must be present at least once.  */
+/* { dg-final { scan-assembler {\.fpu\s+vfpv4\n} } } */
+/* { dg-final { scan-assembler {\.fpu\s+vfpv3-d16\n} } } */


^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2021-08-23 14:41 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-23 14:41 [gcc r9-9690] arm: reorder assembler architecture directives [PR101723] Richard Earnshaw

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).