[PATCH, ARM] Improve robustness of -mslow-flash-data

* [PATCH, ARM] Improve robustness of -mslow-flash-data
@ 2018-11-19 17:56 Thomas Preudhomme
  2018-11-20 10:23 ` Christophe Lyon
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Thomas Preudhomme @ 2018-11-19 17:56 UTC (permalink / raw)
  To: Kyrill Tkachov, Ramana Radhakrishnan, Richard Earnshaw, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 4671 bytes --]

Hi,

Current code to handle -mslow-flash-data in machine description files
suffers from a number of issues which this patch fixes:

1) The insn_and_split patterns in vfp.md that load a generic floating-point
constant via a GPR first and then move it to a VFP register are guarded by
!reload_completed, which is explicitly forbidden by the GCC internals
documentation, section 17.2 point 3;

2) A number of testcases in the testsuite ICE under -mslow-flash-data
when targeting the hardfloat ABI [1];

3) Instructions performing a load from the literal pool are not disabled.

These problems are addressed by 2 separate actions:

1) Making the splitters take a clobber and changing the expanders
accordingly to generate a mov with clobber in cases where a literal
pool would be used. The splitter can thus be enabled after reload since
it does not call gen_reg_rtx anymore;

2) Adding new predicates and constraints to disable literal pool loads
in existing instructions when -mslow-flash-data is in effect.
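
To make the "via GPR with MOV / MOVT" idea concrete, here is a minimal
host-side sketch in C. It is not GCC code: it only takes the 32-bit image of a
single-precision constant and splits it into the two 16-bit immediates that a
MOVW / MOVT pair would carry. The names float_bits, movw_movt and
split_for_movw_movt are hypothetical, and the sketch assumes the host float is
IEEE-754 binary32.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type, not part of GCC: the two 16-bit immediates that a
   MOVW / MOVT pair needs to build a 32-bit pattern in a core register.  */
struct movw_movt
{
  uint16_t low;   /* immediate for MOVW (bits 15:0)  */
  uint16_t high;  /* immediate for MOVT (bits 31:16) */
};

/* Reinterpret a float as its bit pattern (assumes IEEE-754 binary32).  */
static uint32_t
float_bits (float value)
{
  uint32_t bits;
  assert (sizeof (float) == sizeof (uint32_t));
  memcpy (&bits, &value, sizeof bits);
  return bits;
}

/* Split the bit pattern of VALUE into the MOVW and MOVT immediates.  */
static struct movw_movt
split_for_movw_movt (float value)
{
  uint32_t bits = float_bits (value);
  struct movw_movt halves = { (uint16_t) (bits & 0xffffu),
			      (uint16_t) (bits >> 16) };
  return halves;
}
```

For 1.0f (bit pattern 0x3f800000) this gives low = 0x0000 and high = 0x3f80.
Once the core register holds the pattern, a register-to-register move into the
VFP register completes the load without touching a literal pool.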

The patch also reworks the splitter for DFmode slightly to generate an
intermediate DI load instead of 2 intermediate SI loads, thus relying on
the existing DI splitters instead of redoing their job. Lastly, the
patch adds the missing arm_fp_ok effective target requirement to some of
the slow-flash-data testcases.
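
As a rough illustration of the DI intermediate value, the following standalone
C sketch mirrors the arithmetic used in the reworked DFmode splitters: two
32-bit words are zero-extended and merged into one 64-bit integer, with the
word order chosen by endianness. The helper names zero_extend_32 and
combine_words are hypothetical stand-ins (the patch itself uses zext_hwi and
BYTES_BIG_ENDIAN), and the sketch makes no claim about the order in which
real_to_target fills its buffer.

```c
#include <stdint.h>

/* Keep only the low 32 bits of WORD.  This guards against stray upper
   bits when long is wider than 32 bits on the host.  Stand-in for
   zext_hwi (word, 32).  */
static uint64_t
zero_extend_32 (long word)
{
  return (uint64_t) word & 0xffffffffu;
}

/* Merge two 32-bit words into one 64-bit value.  BYTES_BIG_ENDIAN_P
   plays the role of BYTES_BIG_ENDIAN: it selects which word ends up in
   the low half.  */
static uint64_t
combine_words (const long buf[2], int bytes_big_endian_p)
{
  int order = bytes_big_endian_p ? 1 : 0;
  uint64_t ival = zero_extend_32 (buf[order]);
  ival |= zero_extend_32 (buf[1 - order]) << 32;
  return ival;
}
```

With buf = { 0x11111111, 0x22222222 } the little-endian case yields
0x2222222211111111 and the big-endian case yields 0x1111111122222222. A single
DImode move of that value can then be left to the existing DI splitters.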

[1]
c-c++-common/Wunused-var-3.c
gcc.c-torture/compile/pr72771.c
gcc.c-torture/compile/vector-5.c
gcc.c-torture/compile/vector-6.c
gcc.c-torture/execute/20030914-1.c
gcc.c-torture/execute/20050316-1.c
gcc.c-torture/execute/pr59643.c
gcc.dg/builtin-tgmath-1.c
gcc.dg/debug/pr55730.c
gcc.dg/graphite/interchange-7.c
gcc.dg/pr56890-2.c
gcc.dg/pr68474.c
gcc.dg/pr80286.c
gcc.dg/torture/pr35227.c
gcc.dg/torture/pr65077.c
gcc.dg/torture/pr86363.c
g++.dg/torture/pr81112.C
g++.dg/torture/pr82985.C
g++.dg/warn/Wunused-var-7.C
and a lot more in libstdc++ in special_functions/*_comp_ellint_* and
special_functions/*_ellint_* directories.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-11-14  Thomas Preud'homme  <thomas.preudhomme@arm.com>

	* config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
	source is a constant that would be loaded by literal pool.
	(movsf expander): Generate a no_literal_pool_sf_immediate insn if
	-mslow-flash-data is present, targeting hardfloat ABI and source is a
	float constant that cannot be loaded via vmov.
	(movdf expander): Likewise but generate a no_literal_pool_df_immediate
	insn.
	(arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
	float constant that would be loaded by literal pool.
	(softfloat constant movsf splitter): Splitter for the above case.
	(movdf_soft_insn): Split if -mslow-flash-data and source is a float
	constant that would be loaded by literal pool.
	(softfloat constant movdf splitter): Splitter for the above case.
	* config/arm/constraints.md (Pz): Document existing constraint.
	(Ha): Define constraint.
	(Tu): Likewise.
	* config/arm/predicates.md (hard_sf_operand): New predicate.
	(hard_df_operand): Likewise.
	* config/arm/thumb2.md (thumb2_movsi_insn): Split if
	-mslow-flash-data and constant would be loaded by literal pool.
	* config/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable constant
	load in VFP register.
	(movdi_vfp): Likewise.
	(thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to
	prevent match for a constant load if -mslow-flash-data and constant
	cannot be loaded via vmov.  Adapt constraint accordingly by
	using Ha instead of E for generic floating-point constant load.
	(thumb2_movdf_vfp): Likewise using hard_df_operand predicate instead.
	(no_literal_pool_df_immediate): Add a clobber to use as the
	intermediate general purpose register and also enable it after reload
	but disable it if constant is a valid FP constant.  Add constraints and
	generate a DI intermediate load rather than 2 SI loads.
	(no_literal_pool_sf_immediate): Add a clobber to use as the
	intermediate general purpose register and also enable it after
	reload.

*** gcc/testsuite/ChangeLog ***

2018-11-14  Thomas Preud'homme  <thomas.preudhomme@arm.com>

	* gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok
	effective target.
	* gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
	* gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
	* gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.

Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to
softfloat and hardfloat ABI which showed no regression and some
FAIL->PASS for hardfloat ABI. Bootstrapped on Arm and Thumb-2 without any
regression. Compiled SPEC2k6 without -mslow-flash-data and checked that
code generation didn't change.

Is this ok for stage3?

Best regards,

Thomas

[-- Attachment #2: improve_robustness_mslow-flash-data.patch --]
[-- Type: text/x-patch, Size: 17630 bytes --]

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index a773518cefaf8451e77fead9e072ee8ef39f1eb8..a08298bbb9f93fc132aa64a206fad64dcda9ed65 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5831,6 +5831,11 @@
     case 1:
     case 2:
       return \"#\";
+    case 3:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      /* Fall through.  */
     default:
       return output_move_double (operands, true, NULL);
     }
@@ -6939,6 +6944,20 @@
 	     operands[1] = force_reg (SFmode, operands[1]);
         }
     }
+
+  /* Cannot load it directly, generate a load with clobber so that it can be
+     loaded via GPR with MOV / MOVT.  */
+  if (arm_disable_literal_pool
+      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
+      && CONST_DOUBLE_P (operands[1])
+      && TARGET_HARD_FLOAT
+      && !vfp3_const_double_rtx (operands[1]))
+    {
+      rtx clobreg = gen_reg_rtx (SFmode);
+      emit_insn (gen_no_literal_pool_sf_immediate (operands[0], operands[1],
+						   clobreg));
+      DONE;
+    }
   "
 )
 
@@ -6966,10 +6985,19 @@
    && TARGET_SOFT_FLOAT
    && (!MEM_P (operands[0])
        || register_operand (operands[1], SFmode))"
-  "@
-   mov%?\\t%0, %1
-   ldr%?\\t%0, %1\\t%@ float
-   str%?\\t%1, %0\\t%@ float"
+{
+  switch (which_alternative)
+    {
+    case 0: return \"mov%?\\t%0, %1\";
+    case 1:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      return \"ldr%?\\t%0, %1\\t%@ float\";
+    case 2: return \"str%?\\t%1, %0\\t%@ float\";
+    default: gcc_unreachable ();
+    }
+}
   [(set_attr "predicable" "yes")
    (set_attr "type" "mov_reg,load_4,store_4")
    (set_attr "arm_pool_range" "*,4096,*")
@@ -6978,6 +7006,21 @@
    (set_attr "thumb2_neg_pool_range" "*,0,*")]
 )
 
+;; Splitter for the above.
+(define_split
+  [(set (match_operand:SF 0 "s_register_operand")
+	(match_operand:SF 1 "const_double_operand"))]
+  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
+  [(const_int 0)]
+{
+  long buf;
+  real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
+  rtx cst = gen_int_mode (buf, SImode);
+  emit_move_insn (simplify_gen_subreg (SImode, operands[0], SFmode, 0), cst);
+  DONE;
+}
+)
+
 (define_expand "movdf"
   [(set (match_operand:DF 0 "general_operand" "")
 	(match_operand:DF 1 "general_operand" ""))]
@@ -6996,6 +7039,21 @@
 	    operands[1] = force_reg (DFmode, operands[1]);
         }
     }
+
+  /* Cannot load it directly, generate a load with clobber so that it can be
+     loaded via GPR with MOV / MOVT.  */
+  if (arm_disable_literal_pool
+      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
+      && CONSTANT_P (operands[1])
+      && TARGET_HARD_FLOAT
+      && !arm_const_double_rtx (operands[1])
+      && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1])))
+    {
+      rtx clobreg = gen_reg_rtx (DFmode);
+      emit_insn (gen_no_literal_pool_df_immediate (operands[0], operands[1],
+						   clobreg));
+      DONE;
+    }
   "
 )
 
@@ -7055,6 +7113,11 @@
     case 1:
     case 2:
       return \"#\";
+    case 3:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      /* Fall through.  */
     default:
       return output_move_double (operands, true, NULL);
     }
@@ -7066,6 +7129,24 @@
    (set_attr "arm_neg_pool_range" "*,*,*,1004,*")
    (set_attr "thumb2_neg_pool_range" "*,*,*,0,*")]
 )
+
+;; Splitter for the above.
+(define_split
+  [(set (match_operand:DF 0 "s_register_operand")
+	(match_operand:DF 1 "const_double_operand"))]
+  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
+  [(const_int 0)]
+{
+  long buf[2];
+  int order = BYTES_BIG_ENDIAN ? 1 : 0;
+  real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
+  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
+  ival |= (zext_hwi (buf[1 - order], 32) << 32);
+  rtx cst = gen_int_mode (ival, DImode);
+  emit_move_insn (simplify_gen_subreg (DImode, operands[0], DFmode, 0), cst);
+  DONE;
+}
+)
 \f
 
 ;; load- and store-multiple insns
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index 7576c6fc401fc5ce25245fa2b740db99169ce7ce..657e540816bdd82cddd23059dea2be19df7eb1bb 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -31,9 +31,10 @@
 ;; 'H' was previously used for FPA.
 
 ;; The following multi-letter normal constraints have been used:
-;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
+;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp,
+;;			 Dz, Tu
 ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
-;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
+;; in Thumb-2 state: Ha, Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py, Pz
 ;; in all states: Pf
 
 ;; The following memory constraints have been used:
@@ -234,6 +235,12 @@
  (and (match_code "const_double")
       (match_test "TARGET_32BIT && arm_const_double_rtx (op)")))
 
+(define_constraint "Ha"
+  "@internal In ARM / Thumb-2 a float constant iff literal pools are allowed."
+  (and (match_code "const_double")
+       (match_test "satisfies_constraint_E (op)")
+       (match_test "!arm_disable_literal_pool")))
+
 (define_constraint "Dz"
  "@internal
   In ARM/Thumb-2 state a vector of constant zeros."
@@ -351,6 +358,12 @@
        (match_test "TARGET_32BIT
 		    && vfp3_const_double_for_bits (op) > 0")))
 
+(define_constraint "Tu"
+  "@internal In ARM / Thumb-2 an integer constant iff literal pools are
+   allowed."
+  (and (match_test "CONSTANT_P (op)")
+       (match_test "!arm_disable_literal_pool")))
+
 (define_register_constraint "Ts" "(arm_restrict_it) ? LO_REGS : GENERAL_REGS"
  "For arm_restrict_it the core registers @code{r0}-@code{r7}.  GENERAL_REGS otherwise.")
 
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 7e198f9bce441c55913615e4c601a760d7e62c20..f73264cc2a07cacec5e7c4e31ce12299a1fadd0b 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -456,6 +456,24 @@
        (and (match_code "reg,subreg,mem")
 	    (match_operand 0 "nonimmediate_soft_df_operand"))))
 
+;; Predicate for thumb2_movsf_vfp.  Compared to general_operand, this
+;; forbids constant loaded via literal pool iff literal pools are disabled.
+(define_predicate "hard_sf_operand"
+  (and (match_operand 0 "general_operand")
+       (ior (not (match_code "const_double"))
+	    (not (match_test "arm_disable_literal_pool"))
+	    (match_test "satisfies_constraint_Dv (op)"))))
+
+;; Predicate for thumb2_movdf_vfp.  Compared to soft_df_operand used in
+;; movdf_soft_insn, this forbids constant loaded via literal pool iff
+;; literal pools are disabled.
+(define_predicate "hard_df_operand"
+  (and (match_operand 0 "soft_df_operand")
+       (ior (not (match_code "const_double"))
+	    (not (match_test "arm_disable_literal_pool"))
+	    (match_test "satisfies_constraint_Dy (op)")
+	    (match_test "satisfies_constraint_G (op)"))))
+
 (define_special_predicate "load_multiple_operation"
   (match_code "parallel")
 {
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index c42670f8643c3286bc5abf537d4fd0483cba68ac..727ceb9b37957efbc7ab8809f57e8825deb6b1df 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -252,16 +252,26 @@
   "TARGET_THUMB2 && !TARGET_IWMMXT && !TARGET_HARD_FLOAT
    && (   register_operand (operands[0], SImode)
        || register_operand (operands[1], SImode))"
-  "@
-   mov%?\\t%0, %1
-   mov%?\\t%0, %1
-   mov%?\\t%0, %1
-   mvn%?\\t%0, #%B1
-   movw%?\\t%0, %1
-   ldr%?\\t%0, %1
-   ldr%?\\t%0, %1
-   str%?\\t%1, %0
-   str%?\\t%1, %0"
+{
+  switch (which_alternative)
+    {
+    case 0:
+    case 1:
+    case 2:
+      return \"mov%?\\t%0, %1\";
+    case 3: return \"mvn%?\\t%0, #%B1\";
+    case 4: return \"movw%?\\t%0, %1\";
+    case 5:
+    case 6:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      return \"ldr%?\\t%0, %1\";
+    case 7:
+    case 8: return \"str%?\\t%1, %0\";
+    default: gcc_unreachable ();
+    }
+}
   [(set_attr "type" "mov_reg,mov_imm,mov_imm,mvn_imm,mov_imm,load_4,load_4,store_4,store_4")
    (set_attr "length" "2,4,2,4,4,4,4,4,4")
    (set_attr "predicable" "yes")
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index 611ebe2d83698e3129df6c55a03e4f5f33c891e7..f3d4f30cb53d82e2ffd2c4fcaad2cc873d97c24b 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -259,7 +259,7 @@
 ;; arm_restrict_it.
 (define_insn "*thumb2_movsi_vfp"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r, l,*hk,m, *m,*t, r,*t,*t,  *Uv")
-	(match_operand:SI 1 "general_operand"	   "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*Uvi,*t"))]
+	(match_operand:SI 1 "general_operand"	   "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*UvTu,*t"))]
   "TARGET_THUMB2 && TARGET_HARD_FLOAT
    && (   s_register_operand (operands[0], SImode)
        || s_register_operand (operands[1], SImode))"
@@ -276,6 +276,9 @@
       return \"movw%?\\t%0, %1\";
     case 5:
     case 6:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
       return \"ldr%?\\t%0, %1\";
     case 7:
     case 8:
@@ -305,7 +308,7 @@
 
 (define_insn "*movdi_vfp"
   [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,q,q,m,w,!r,w,w, Uv")
-       (match_operand:DI 1 "di_operand"              "r,rDa,Db,Dc,mi,mi,q,r,w,w,Uvi,w"))]
+	(match_operand:DI 1 "di_operand"	      "r,rDa,Db,Dc,mi,mi,q,r,w,w,UvTu,w"))]
   "TARGET_32BIT && TARGET_HARD_FLOAT
    && (   register_operand (operands[0], DImode)
        || register_operand (operands[1], DImode))
@@ -321,6 +324,10 @@
       return \"#\";
     case 4:
     case 5:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      /* Fall through.  */
     case 6:
       return output_move_double (operands, true, NULL);
     case 7:
@@ -587,7 +594,7 @@
 
 (define_insn "*thumb2_movsf_vfp"
   [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t  ,Uv,r ,m,t,r")
-	(match_operand:SF 1 "general_operand"	   " ?r,t,Dv,UvE,t, mE,r,t,r"))]
+	(match_operand:SF 1 "hard_sf_operand"	   " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
   "TARGET_THUMB2 && TARGET_HARD_FLOAT
    && (   s_register_operand (operands[0], SFmode)
        || s_register_operand (operands[1], SFmode))"
@@ -676,7 +683,7 @@
 
 (define_insn "*thumb2_movdf_vfp"
   [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w  ,Uv,r ,m,w,r")
-	(match_operand:DF 1 "soft_df_operand"		   " ?r,w,Dy,G,UvF,w, mF,r, w,r"))]
+	(match_operand:DF 1 "hard_df_operand"		   " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
   "TARGET_THUMB2 && TARGET_HARD_FLOAT
    && (   register_operand (operands[0], DFmode)
        || register_operand (operands[1], DFmode))"
@@ -1983,39 +1990,50 @@
 ;; Support for xD (single precision only) variants.
 ;; fmrrs, fmsrr
 
-;; Split an immediate DF move to two immediate SI moves.
+;; Load a DF immediate via GPR (where combinations of MOV and MOVT can be used)
+;; and then move it into a VFP register.
 (define_insn_and_split "no_literal_pool_df_immediate"
-  [(set (match_operand:DF 0 "s_register_operand" "")
-	(match_operand:DF 1 "const_double_operand" ""))]
-  "TARGET_THUMB2 && arm_disable_literal_pool
-  && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE
-       && vfp3_const_double_rtx (operands[1]))"
+  [(set (match_operand:DF 0 "s_register_operand" "=w")
+	(match_operand:DF 1 "const_double_operand" "F"))
+   (clobber (match_operand:DF 2 "s_register_operand" "=r"))]
+  "arm_disable_literal_pool
+   && TARGET_HARD_FLOAT
+   && !arm_const_double_rtx (operands[1])
+   && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1]))"
   "#"
-  "&& !reload_completed"
-  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
-   (set (subreg:SI (match_dup 1) 4) (match_dup 3))
-   (set (match_dup 0) (match_dup 1))]
-  "
+  ""
+  [(const_int 0)]
+{
   long buf[2];
+  int order = BYTES_BIG_ENDIAN ? 1 : 0;
   real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
-  operands[2] = GEN_INT ((int) buf[0]);
-  operands[3] = GEN_INT ((int) buf[1]);
-  operands[1] = gen_reg_rtx (DFmode);
-  ")
+  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
+  ival |= (zext_hwi (buf[1 - order], 32) << 32);
+  rtx cst = gen_int_mode (ival, DImode);
+  emit_move_insn (simplify_gen_subreg (DImode, operands[2], DFmode, 0), cst);
+  emit_move_insn (operands[0], operands[2]);
+  DONE;
+}
+)
 
-;; Split an immediate SF move to one immediate SI move.
+;; Load a SF immediate via GPR (where combinations of MOV and MOVT can be used)
+;; and then move it into a VFP register.
 (define_insn_and_split "no_literal_pool_sf_immediate"
-  [(set (match_operand:SF 0 "s_register_operand" "")
-	(match_operand:SF 1 "const_double_operand" ""))]
-  "TARGET_THUMB2 && arm_disable_literal_pool
-  && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))"
+  [(set (match_operand:SF 0 "s_register_operand" "=t")
+	(match_operand:SF 1 "const_double_operand" "E"))
+   (clobber (match_operand:SF 2 "s_register_operand" "=r"))]
+  "arm_disable_literal_pool
+   && TARGET_HARD_FLOAT
+   && !vfp3_const_double_rtx (operands[1])"
   "#"
-  "&& !reload_completed"
-  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
-   (set (match_dup 0) (match_dup 1))]
-  "
+  ""
+  [(const_int 0)]
+{
   long buf;
   real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
-  operands[2] = GEN_INT ((int) buf);
-  operands[1] = gen_reg_rtx (SFmode);
-  ")
+  rtx cst = gen_int_mode (buf, SImode);
+  emit_move_insn (simplify_gen_subreg (SImode, operands[2], SFmode, 0), cst);
+  emit_move_insn (operands[0], operands[2]);
+  DONE;
+}
+)
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
index 90bd44e27e5c53d34f2816f4d6320acbc1dc709b..231243759cfe486c390ca27f10bd06177f60bd43 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_cortex_m } */
 /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
 /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
 /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
 /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
index 5d9cd9c4df28837b81b2de48c25d38cdf2c15999..27e72ec20863866acdc5e7fea632bc6880678dfd 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_cortex_m } */
 /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
 /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
 /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
 /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
index 0eeddd5e6ec1f42a96fc6220277f9ecb7cad44f5..8dbe87a1e68d5eb2edfd8259948988fbe0658ced 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_cortex_m } */
 /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
 /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
 /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
 /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
index 7d52f3801b6d4b62b27833871ac830d6d077894d..b98eb7624e42b5a7f4a11c604c7d2826339bcfd5 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_cortex_m } */
 /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
 /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
 /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
 /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */

* Re: [PATCH, ARM] Improve robustness of -mslow-flash-data
  2018-11-19 17:56 [PATCH, ARM] Improve robustness of -mslow-flash-data Thomas Preudhomme
@ 2018-11-20 10:23 ` Christophe Lyon
  2018-11-26 10:01 ` [PATCH, ARM, ping] " Thomas Preudhomme
  2018-11-30 14:11 ` [PATCH, ARM] " Kyrill Tkachov
  2 siblings, 0 replies; 6+ messages in thread
From: Christophe Lyon @ 2018-11-20 10:23 UTC (permalink / raw)
  To: Thomas Preud'homme
  Cc: Kyrylo Tkachov, Ramana Radhakrishnan, Richard Earnshaw, gcc Patches

On Mon, 19 Nov 2018 at 18:56, Thomas Preudhomme
<thomas.preudhomme@foss.arm.com> wrote:
> [...]
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-11-14  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>
>         * gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok
>         effective target.
>         * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
>         * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
>         * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
>
> Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to
> softfloat and hardfloat ABI which showed no regression and some
> FAIL->PASS for hardfloat ABI. Bootstrapped on Arm and Thumb-2 without any
> regression. Compiled SPEC2k6 without -mslow-flash-data and checked that
> code generation didn't change.
>

FWIW, it's OK in my validations. I do not see the improvements you mention
because I do not have this exact configuration. On arm-eabi --with-cpu=cortex-m3,
I only see the above tests go PASS -> UNSUPPORTED, which is OK
(the reason being "-mfloat-abi=hard: selected processor lacks an FPU").

Thanks,

Christophe


> Is this ok for stage3?
>
> Best regards,
>
> Thomas

* Re: [PATCH, ARM, ping] Improve robustness of -mslow-flash-data
  2018-11-19 17:56 [PATCH, ARM] Improve robustness of -mslow-flash-data Thomas Preudhomme
  2018-11-20 10:23 ` Christophe Lyon
@ 2018-11-26 10:01 ` Thomas Preudhomme
  2018-11-30 14:11 ` [PATCH, ARM] " Kyrill Tkachov
  2 siblings, 0 replies; 6+ messages in thread
From: Thomas Preudhomme @ 2018-11-26 10:01 UTC (permalink / raw)
  To: Kyrill Tkachov, Ramana Radhakrishnan, Richard Earnshaw, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5639 bytes --]

Ping?

Best regards,

Thomas

On 19/11/2018 17:56, Thomas Preudhomme wrote:
> Hi,
> 
> Current code to handle -mslow-flash-data in machine description files
> suffers from a number of issues which this patch fixes:
> 
> 1) The insn_and_split patterns in vfp.md that load a generic floating-point
> constant via a GPR first and then move it to a VFP register are guarded by
> !reload_completed, which is explicitly forbidden by the GCC internals
> documentation, section 17.2 point 3;
> 
> 2) A number of testcases in the testsuite ICE under -mslow-flash-data
> when targeting the hardfloat ABI [1];
> 
> 3) Instructions performing a load from the literal pool are not disabled.
> 
> These problems are addressed by 2 separate actions:
> 
> 1) Making the splitters take a clobber and changing the expanders
> accordingly to generate a mov with clobber in cases where a literal
> pool would be used. The splitter can thus be enabled after reload since
> it does not call gen_reg_rtx anymore;
> 
> 2) Adding new predicates and constraints to disable literal pool loads
> in existing instructions when -mslow-flash-data is in effect.
> 
> The patch also reworks the splitter for DFmode slightly to generate an
> intermediate DI load instead of 2 intermediate SI loads, thus relying on
> the existing DI splitters instead of redoing their job. Lastly, the
> patch adds the missing arm_fp_ok effective target requirement to some of
> the slow-flash-data testcases.
> 
> [1]
> c-c++-common/Wunused-var-3.c
> gcc.c-torture/compile/pr72771.c
> gcc.c-torture/compile/vector-5.c
> gcc.c-torture/compile/vector-6.c
> gcc.c-torture/execute/20030914-1.c
> gcc.c-torture/execute/20050316-1.c
> gcc.c-torture/execute/pr59643.c
> gcc.dg/builtin-tgmath-1.c
> gcc.dg/debug/pr55730.c
> gcc.dg/graphite/interchange-7.c
> gcc.dg/pr56890-2.c
> gcc.dg/pr68474.c
> gcc.dg/pr80286.c
> gcc.dg/torture/pr35227.c
> gcc.dg/torture/pr65077.c
> gcc.dg/torture/pr86363.c
> g++.dg/torture/pr81112.C
> g++.dg/torture/pr82985.C
> g++.dg/warn/Wunused-var-7.C
> and a lot more in libstdc++ in special_functions/*_comp_ellint_* and
> special_functions/*_ellint_* directories.
> 
> ChangeLog entries are as follows:
> 
> *** gcc/ChangeLog ***
> 
> 2018-11-14  Thomas Preud'homme  <thomas.preudhomme@arm.com>
> 
>      * config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
>      source is a constant that would be loaded by literal pool.
>      (movsf expander): Generate a no_literal_pool_sf_immediate insn if
>      -mslow-flash-data is present, targeting hardfloat ABI and source is a
>      float constant that cannot be loaded via vmov.
>      (movdf expander): Likewise but generate a no_literal_pool_df_immediate
>      insn.
>      (arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
>      float constant that would be loaded by literal pool.
>      (softfloat constant movsf splitter): Splitter for the above case.
>      (movdf_soft_insn): Split if -mslow-flash-data and source is a float
>      constant that would be loaded by literal pool.
>      (softfloat constant movdf splitter): Splitter for the above case.
>      * config/arm/constraints.md (Pz): Document existing constraint.
>      (Ha): Define constraint.
>      (Tu): Likewise.
>      * config/arm/predicates.md (hard_sf_operand): New predicate.
>      (hard_df_operand): Likewise.
>      * config/arm/thumb2.md (thumb2_movsi_insn): Split if
>      -mslow-flash-data and constant would be loaded by literal pool.
>      * config/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable constant
>      load in VFP register.
>      (movdi_vfp): Likewise.
>      (thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to
>      prevent match for a constant load if -mslow-flash-data and constant
>      cannot be loaded via vmov.  Adapt constraint accordingly by
>      using Ha instead of E for generic floating-point constant load.
>      (thumb2_movdf_vfp): Likewise using hard_df_operand predicate instead.
>      (no_literal_pool_df_immediate): Add a clobber to use as the
>      intermediate general purpose register and also enable it after reload
>      but disable it if the constant is a valid FP constant.  Add constraints
>      and generate a DI intermediate load rather than 2 SI loads.
>      (no_literal_pool_sf_immediate): Add a clobber to use as the
>      intermediate general purpose register and also enable it after
>      reload.
> 
> *** gcc/testsuite/ChangeLog ***
> 
> 2018-11-14  Thomas Preud'homme  <thomas.preudhomme@arm.com>
> 
>      * gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok
>      effective target.
>      * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
>      * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
>      * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> 
> Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to
> softfloat and hardfloat ABI which showed no regression and some
> FAIL->PASS for hardfloat ABI. Bootstrapped on Arm and Thumb-2 without any
> regression. Compiled SPEC2k6 without -mslow-flash-data and checked that
> code generation didn't change.
> 
> Is this ok for stage3?
> 
> Best regards,
> 
> Thomas

[-- Attachment #2: improve_robustness_mslow-flash-data.patch --]
[-- Type: text/x-patch, Size: 17630 bytes --]

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index a773518cefaf8451e77fead9e072ee8ef39f1eb8..a08298bbb9f93fc132aa64a206fad64dcda9ed65 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5831,6 +5831,11 @@
     case 1:
     case 2:
       return \"#\";
+    case 3:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      /* Fall through.  */
     default:
       return output_move_double (operands, true, NULL);
     }
@@ -6939,6 +6944,20 @@
 	     operands[1] = force_reg (SFmode, operands[1]);
         }
     }
+
+  /* Cannot load it directly, generate a load with clobber so that it can be
+     loaded via GPR with MOV / MOVT.  */
+  if (arm_disable_literal_pool
+      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
+      && CONST_DOUBLE_P (operands[1])
+      && TARGET_HARD_FLOAT
+      && !vfp3_const_double_rtx (operands[1]))
+    {
+      rtx clobreg = gen_reg_rtx (SFmode);
+      emit_insn (gen_no_literal_pool_sf_immediate (operands[0], operands[1],
+						   clobreg));
+      DONE;
+    }
   "
 )
 
@@ -6966,10 +6985,19 @@
    && TARGET_SOFT_FLOAT
    && (!MEM_P (operands[0])
        || register_operand (operands[1], SFmode))"
-  "@
-   mov%?\\t%0, %1
-   ldr%?\\t%0, %1\\t%@ float
-   str%?\\t%1, %0\\t%@ float"
+{
+  switch (which_alternative)
+    {
+    case 0: return \"mov%?\\t%0, %1\";
+    case 1:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      return \"ldr%?\\t%0, %1\\t%@ float\";
+    case 2: return \"str%?\\t%1, %0\\t%@ float\";
+    default: gcc_unreachable ();
+    }
+}
   [(set_attr "predicable" "yes")
    (set_attr "type" "mov_reg,load_4,store_4")
    (set_attr "arm_pool_range" "*,4096,*")
@@ -6978,6 +7006,21 @@
    (set_attr "thumb2_neg_pool_range" "*,0,*")]
 )
 
+;; Splitter for the above.
+(define_split
+  [(set (match_operand:SF 0 "s_register_operand")
+	(match_operand:SF 1 "const_double_operand"))]
+  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
+  [(const_int 0)]
+{
+  long buf;
+  real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
+  rtx cst = gen_int_mode (buf, SImode);
+  emit_move_insn (simplify_gen_subreg (SImode, operands[0], SFmode, 0), cst);
+  DONE;
+}
+)
+
 (define_expand "movdf"
   [(set (match_operand:DF 0 "general_operand" "")
 	(match_operand:DF 1 "general_operand" ""))]
@@ -6996,6 +7039,21 @@
 	    operands[1] = force_reg (DFmode, operands[1]);
         }
     }
+
+  /* Cannot load it directly, generate a load with clobber so that it can be
+     loaded via GPR with MOV / MOVT.  */
+  if (arm_disable_literal_pool
+      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
+      && CONSTANT_P (operands[1])
+      && TARGET_HARD_FLOAT
+      && !arm_const_double_rtx (operands[1])
+      && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1])))
+    {
+      rtx clobreg = gen_reg_rtx (DFmode);
+      emit_insn (gen_no_literal_pool_df_immediate (operands[0], operands[1],
+						   clobreg));
+      DONE;
+    }
   "
 )
 
@@ -7055,6 +7113,11 @@
     case 1:
     case 2:
       return \"#\";
+    case 3:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      /* Fall through.  */
     default:
       return output_move_double (operands, true, NULL);
     }
@@ -7066,6 +7129,24 @@
    (set_attr "arm_neg_pool_range" "*,*,*,1004,*")
    (set_attr "thumb2_neg_pool_range" "*,*,*,0,*")]
 )
+
+;; Splitter for the above.
+(define_split
+  [(set (match_operand:DF 0 "s_register_operand")
+	(match_operand:DF 1 "const_double_operand"))]
+  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
+  [(const_int 0)]
+{
+  long buf[2];
+  int order = BYTES_BIG_ENDIAN ? 1 : 0;
+  real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
+  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
+  ival |= (zext_hwi (buf[1 - order], 32) << 32);
+  rtx cst = gen_int_mode (ival, DImode);
+  emit_move_insn (simplify_gen_subreg (DImode, operands[0], DFmode, 0), cst);
+  DONE;
+}
+)
 \f
 
 ;; load- and store-multiple insns
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index 7576c6fc401fc5ce25245fa2b740db99169ce7ce..657e540816bdd82cddd23059dea2be19df7eb1bb 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -31,9 +31,10 @@
 ;; 'H' was previously used for FPA.
 
 ;; The following multi-letter normal constraints have been used:
-;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
+;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp,
+;;			 Dz, Tu
 ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
-;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
+;; in Thumb-2 state: Ha, Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py, Pz
 ;; in all states: Pf
 
 ;; The following memory constraints have been used:
@@ -234,6 +235,12 @@
  (and (match_code "const_double")
       (match_test "TARGET_32BIT && arm_const_double_rtx (op)")))
 
+(define_constraint "Ha"
+  "@internal In ARM / Thumb-2 a float constant iff literal pools are allowed."
+  (and (match_code "const_double")
+       (match_test "satisfies_constraint_E (op)")
+       (match_test "!arm_disable_literal_pool")))
+
 (define_constraint "Dz"
  "@internal
   In ARM/Thumb-2 state a vector of constant zeros."
@@ -351,6 +358,12 @@
        (match_test "TARGET_32BIT
 		    && vfp3_const_double_for_bits (op) > 0")))
 
+(define_constraint "Tu"
+  "@internal In ARM / Thumb-2 an integer constant iff literal pools are
+   allowed."
+  (and (match_test "CONSTANT_P (op)")
+       (match_test "!arm_disable_literal_pool")))
+
 (define_register_constraint "Ts" "(arm_restrict_it) ? LO_REGS : GENERAL_REGS"
  "For arm_restrict_it the core registers @code{r0}-@code{r7}.  GENERAL_REGS otherwise.")
 
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 7e198f9bce441c55913615e4c601a760d7e62c20..f73264cc2a07cacec5e7c4e31ce12299a1fadd0b 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -456,6 +456,24 @@
        (and (match_code "reg,subreg,mem")
 	    (match_operand 0 "nonimmediate_soft_df_operand"))))
 
+;; Predicate for thumb2_movsf_vfp.  Compared to general_operand, this
+;; forbids constant loaded via literal pool iff literal pools are disabled.
+(define_predicate "hard_sf_operand"
+  (and (match_operand 0 "general_operand")
+       (ior (not (match_code "const_double"))
+	    (not (match_test "arm_disable_literal_pool"))
+	    (match_test "satisfies_constraint_Dv (op)"))))
+
+;; Predicate for thumb2_movdf_vfp.  Compared to soft_df_operand used in
+;; movdf_soft_insn, this forbids constant loaded via literal pool iff
+;; literal pools are disabled.
+(define_predicate "hard_df_operand"
+  (and (match_operand 0 "soft_df_operand")
+       (ior (not (match_code "const_double"))
+	    (not (match_test "arm_disable_literal_pool"))
+	    (match_test "satisfies_constraint_Dy (op)")
+	    (match_test "satisfies_constraint_G (op)"))))
+
 (define_special_predicate "load_multiple_operation"
   (match_code "parallel")
 {
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index c42670f8643c3286bc5abf537d4fd0483cba68ac..727ceb9b37957efbc7ab8809f57e8825deb6b1df 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -252,16 +252,26 @@
   "TARGET_THUMB2 && !TARGET_IWMMXT && !TARGET_HARD_FLOAT
    && (   register_operand (operands[0], SImode)
        || register_operand (operands[1], SImode))"
-  "@
-   mov%?\\t%0, %1
-   mov%?\\t%0, %1
-   mov%?\\t%0, %1
-   mvn%?\\t%0, #%B1
-   movw%?\\t%0, %1
-   ldr%?\\t%0, %1
-   ldr%?\\t%0, %1
-   str%?\\t%1, %0
-   str%?\\t%1, %0"
+{
+  switch (which_alternative)
+    {
+    case 0:
+    case 1:
+    case 2:
+      return \"mov%?\\t%0, %1\";
+    case 3: return \"mvn%?\\t%0, #%B1\";
+    case 4: return \"movw%?\\t%0, %1\";
+    case 5:
+    case 6:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      return \"ldr%?\\t%0, %1\";
+    case 7:
+    case 8: return \"str%?\\t%1, %0\";
+    default: gcc_unreachable ();
+    }
+}
   [(set_attr "type" "mov_reg,mov_imm,mov_imm,mvn_imm,mov_imm,load_4,load_4,store_4,store_4")
    (set_attr "length" "2,4,2,4,4,4,4,4,4")
    (set_attr "predicable" "yes")
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index 611ebe2d83698e3129df6c55a03e4f5f33c891e7..f3d4f30cb53d82e2ffd2c4fcaad2cc873d97c24b 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -259,7 +259,7 @@
 ;; arm_restrict_it.
 (define_insn "*thumb2_movsi_vfp"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r, l,*hk,m, *m,*t, r,*t,*t,  *Uv")
-	(match_operand:SI 1 "general_operand"	   "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*Uvi,*t"))]
+	(match_operand:SI 1 "general_operand"	   "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*UvTu,*t"))]
   "TARGET_THUMB2 && TARGET_HARD_FLOAT
    && (   s_register_operand (operands[0], SImode)
        || s_register_operand (operands[1], SImode))"
@@ -276,6 +276,9 @@
       return \"movw%?\\t%0, %1\";
     case 5:
     case 6:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
       return \"ldr%?\\t%0, %1\";
     case 7:
     case 8:
@@ -305,7 +308,7 @@
 
 (define_insn "*movdi_vfp"
   [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,q,q,m,w,!r,w,w, Uv")
-       (match_operand:DI 1 "di_operand"              "r,rDa,Db,Dc,mi,mi,q,r,w,w,Uvi,w"))]
+	(match_operand:DI 1 "di_operand"	      "r,rDa,Db,Dc,mi,mi,q,r,w,w,UvTu,w"))]
   "TARGET_32BIT && TARGET_HARD_FLOAT
    && (   register_operand (operands[0], DImode)
        || register_operand (operands[1], DImode))
@@ -321,6 +324,10 @@
       return \"#\";
     case 4:
     case 5:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      /* Fall through.  */
     case 6:
       return output_move_double (operands, true, NULL);
     case 7:
@@ -587,7 +594,7 @@
 
 (define_insn "*thumb2_movsf_vfp"
   [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t  ,Uv,r ,m,t,r")
-	(match_operand:SF 1 "general_operand"	   " ?r,t,Dv,UvE,t, mE,r,t,r"))]
+	(match_operand:SF 1 "hard_sf_operand"	   " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
   "TARGET_THUMB2 && TARGET_HARD_FLOAT
    && (   s_register_operand (operands[0], SFmode)
        || s_register_operand (operands[1], SFmode))"
@@ -676,7 +683,7 @@
 
 (define_insn "*thumb2_movdf_vfp"
   [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w  ,Uv,r ,m,w,r")
-	(match_operand:DF 1 "soft_df_operand"		   " ?r,w,Dy,G,UvF,w, mF,r, w,r"))]
+	(match_operand:DF 1 "hard_df_operand"		   " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
   "TARGET_THUMB2 && TARGET_HARD_FLOAT
    && (   register_operand (operands[0], DFmode)
        || register_operand (operands[1], DFmode))"
@@ -1983,39 +1990,50 @@
 ;; Support for xD (single precision only) variants.
 ;; fmrrs, fmsrr
 
-;; Split an immediate DF move to two immediate SI moves.
+;; Load a DF immediate via GPR (where combinations of MOV and MOVT can be used)
+;; and then move it into a VFP register.
 (define_insn_and_split "no_literal_pool_df_immediate"
-  [(set (match_operand:DF 0 "s_register_operand" "")
-	(match_operand:DF 1 "const_double_operand" ""))]
-  "TARGET_THUMB2 && arm_disable_literal_pool
-  && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE
-       && vfp3_const_double_rtx (operands[1]))"
+  [(set (match_operand:DF 0 "s_register_operand" "=w")
+	(match_operand:DF 1 "const_double_operand" "F"))
+   (clobber (match_operand:DF 2 "s_register_operand" "=r"))]
+  "arm_disable_literal_pool
+   && TARGET_HARD_FLOAT
+   && !arm_const_double_rtx (operands[1])
+   && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1]))"
   "#"
-  "&& !reload_completed"
-  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
-   (set (subreg:SI (match_dup 1) 4) (match_dup 3))
-   (set (match_dup 0) (match_dup 1))]
-  "
+  ""
+  [(const_int 0)]
+{
   long buf[2];
+  int order = BYTES_BIG_ENDIAN ? 1 : 0;
   real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
-  operands[2] = GEN_INT ((int) buf[0]);
-  operands[3] = GEN_INT ((int) buf[1]);
-  operands[1] = gen_reg_rtx (DFmode);
-  ")
+  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
+  ival |= (zext_hwi (buf[1 - order], 32) << 32);
+  rtx cst = gen_int_mode (ival, DImode);
+  emit_move_insn (simplify_gen_subreg (DImode, operands[2], DFmode, 0), cst);
+  emit_move_insn (operands[0], operands[2]);
+  DONE;
+}
+)
 
-;; Split an immediate SF move to one immediate SI move.
+;; Load a SF immediate via GPR (where combinations of MOV and MOVT can be used)
+;; and then move it into a VFP register.
 (define_insn_and_split "no_literal_pool_sf_immediate"
-  [(set (match_operand:SF 0 "s_register_operand" "")
-	(match_operand:SF 1 "const_double_operand" ""))]
-  "TARGET_THUMB2 && arm_disable_literal_pool
-  && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))"
+  [(set (match_operand:SF 0 "s_register_operand" "=t")
+	(match_operand:SF 1 "const_double_operand" "E"))
+   (clobber (match_operand:SF 2 "s_register_operand" "=r"))]
+  "arm_disable_literal_pool
+   && TARGET_HARD_FLOAT
+   && !vfp3_const_double_rtx (operands[1])"
   "#"
-  "&& !reload_completed"
-  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
-   (set (match_dup 0) (match_dup 1))]
-  "
+  ""
+  [(const_int 0)]
+{
   long buf;
   real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
-  operands[2] = GEN_INT ((int) buf);
-  operands[1] = gen_reg_rtx (SFmode);
-  ")
+  rtx cst = gen_int_mode (buf, SImode);
+  emit_move_insn (simplify_gen_subreg (SImode, operands[2], SFmode, 0), cst);
+  emit_move_insn (operands[0], operands[2]);
+  DONE;
+}
+)
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
index 90bd44e27e5c53d34f2816f4d6320acbc1dc709b..231243759cfe486c390ca27f10bd06177f60bd43 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_cortex_m } */
 /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
 /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
 /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
 /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
index 5d9cd9c4df28837b81b2de48c25d38cdf2c15999..27e72ec20863866acdc5e7fea632bc6880678dfd 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_cortex_m } */
 /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
 /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
 /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
 /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
index 0eeddd5e6ec1f42a96fc6220277f9ecb7cad44f5..8dbe87a1e68d5eb2edfd8259948988fbe0658ced 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_cortex_m } */
 /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
 /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
 /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
 /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
index 7d52f3801b6d4b62b27833871ac830d6d077894d..b98eb7624e42b5a7f4a11c604c7d2826339bcfd5 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_cortex_m } */
 /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
 /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
 /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
 /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */


* Re: [PATCH, ARM] Improve robustness of -mslow-flash-data
  2018-11-19 17:56 [PATCH, ARM] Improve robustness of -mslow-flash-data Thomas Preudhomme
  2018-11-20 10:23 ` Christophe Lyon
  2018-11-26 10:01 ` [PATCH, ARM, ping] " Thomas Preudhomme
@ 2018-11-30 14:11 ` Kyrill Tkachov
  2018-12-11 16:09   ` Thomas Preudhomme
  2 siblings, 1 reply; 6+ messages in thread
From: Kyrill Tkachov @ 2018-11-30 14:11 UTC (permalink / raw)
  To: Thomas Preudhomme, Ramana Radhakrishnan, Richard Earnshaw, gcc-patches

Hi Thomas,

On 19/11/18 17:56, Thomas Preudhomme wrote:
> Hi,
>
> Current code to handle -mslow-flash-data in machine description files
> suffers from a number of issues which this patch fixes:
>
> 1) The insn_and_split in vfp.md to load a generic floating-point
> constant via GPR first and move it to VFP register are guarded by
> !reload_completed which is forbidden explicitly in the GCC internals
> documentation section 17.2 point 3;
>
> 2) A number of testcases in the testsuite ICE under -mslow-flash-data
> when targeting the hardfloat ABI [1];
>
> 3) Instructions performing load from literal pool are not disabled.
>
> These problems are addressed by 2 separate actions:
>
> 1) Making the splitters take a clobber and changing the expanders
> accordingly to generate a mov with clobber in cases where a literal
> pool would be used. The splitter can thus be enabled after reload since
> it does not call gen_reg_rtx anymore;
>
> 2) Adding new predicates and constraints to disable literal pool loads
> in existing instructions when -mslow-flash-data is in effect.
>

Please split these into two separate patches so we can see more clearly which changes address which problem.

> The patch also reworks the splitter for DFmode slightly to generate an
> intermediate DI load instead of 2 intermediate SI loads, thus relying on
> the existing DI splitters instead of redoing their job. Finally, the
> patch adds the missing arm_fp_ok effective-target requirement to some of
> the slow-flash-data testcases.
>
> [1]
> c-c++-common/Wunused-var-3.c
> gcc.c-torture/compile/pr72771.c
> gcc.c-torture/compile/vector-5.c
> gcc.c-torture/compile/vector-6.c
> gcc.c-torture/execute/20030914-1.c
> gcc.c-torture/execute/20050316-1.c
> gcc.c-torture/execute/pr59643.c
> gcc.dg/builtin-tgmath-1.c
> gcc.dg/debug/pr55730.c
> gcc.dg/graphite/interchange-7.c
> gcc.dg/pr56890-2.c
> gcc.dg/pr68474.c
> gcc.dg/pr80286.c
> gcc.dg/torture/pr35227.c
> gcc.dg/torture/pr65077.c
> gcc.dg/torture/pr86363.c
> g++.dg/torture/pr81112.C
> g++.dg/torture/pr82985.C
> g++.dg/warn/Wunused-var-7.C
> and a lot more in libstdc++ in special_functions/*_comp_ellint_* and
> special_functions/*_ellint_* directories.
>
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2018-11-14  Thomas Preud'homme <thomas.preudhomme@arm.com>
>
>         * config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
>         source is a constant that would be loaded by literal pool.
>         (movsf expander): Generate a no_literal_pool_sf_immediate insn if
>         -mslow-flash-data is present, targeting hardfloat ABI and source is a
>         float constant that cannot be loaded via vmov.
>         (movdf expander): Likewise but generate a no_literal_pool_df_immediate
>         insn.
>         (arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
>         float constant that would be loaded by literal pool.
>         (softfloat constant movsf splitter): Splitter for the above case.
>         (movdf_soft_insn): Split if -mslow-flash-data and source is a float
>         constant that would be loaded by literal pool.
>         (softfloat constant movdf splitter): Splitter for the above case.
>         * config/arm/constraints.md (Pz): Document existing constraint.
>         (Ha): Define constraint.
>         (Tu): Likewise.
>         * config/arm/predicates.md (hard_sf_operand): New predicate.
>         (hard_df_operand): Likewise.
>         * config/arm/thumb2.md (thumb2_movsi_insn): Split if
>         -mslow-flash-data and constant would be loaded by literal pool.
>         * config/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable constant
>         load in VFP register.
>         (movdi_vfp): Likewise.
>         (thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to
>         prevent match for a constant load if -mslow-flash-data and constant
>         cannot be loaded via vmov.  Adapt constraint accordingly by
>         using Ha instead of E for generic floating-point constant load.
>         (thumb2_movdf_vfp): Likewise using hard_df_operand predicate instead.
>         (no_literal_pool_df_immediate): Add a clobber to use as the
>         intermediate general purpose register and also enable it after reload
>         but disable it if constant is a valid FP constant.  Add constraints and
>         generate a DI intermediate load rather than 2 SI loads.
>         (no_literal_pool_sf_immediate): Add a clobber to use as the
>         intermediate general purpose register and also enable it after
>         reload.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2018-11-14  Thomas Preud'homme <thomas.preudhomme@arm.com>
>
>         * gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok
>         effective target.
>         * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
>         * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
>         * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
>
> Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to
> softfloat and hardfloat ABI which showed no regression and some
> FAIL->PASS for hardfloat ABI. Bootstrapped on Arm and Thumb-2 without any
> regression. Compiled SPEC2k6 without -mslow-flash-data and checked that
> code generation didn't change.
>
> Is this ok for stage3?
>
> Best regards,
>
> Thomas

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index a773518cefaf8451e77fead9e072ee8ef39f1eb8..a08298bbb9f93fc132aa64a206fad64dcda9ed65 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -5831,6 +5831,11 @@
      case 1:
      case 2:
        return \"#\";
+    case 3:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      /* Fall through.  */
      default:
        return output_move_double (operands, true, NULL);
      }
@@ -6939,6 +6944,20 @@
  	     operands[1] = force_reg (SFmode, operands[1]);
          }
      }
+
+  /* Cannot load it directly, generate a load with clobber so that it can be
+     loaded via GPR with MOV / MOVT.  */
+  if (arm_disable_literal_pool
+      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
+      && CONST_DOUBLE_P (operands[1])
+      && TARGET_HARD_FLOAT
+      && !vfp3_const_double_rtx (operands[1]))
+    {
+      rtx clobreg = gen_reg_rtx (SFmode);
+      emit_insn (gen_no_literal_pool_sf_immediate (operands[0], operands[1],
+						   clobreg));
+      DONE;
+    }
    "
  )
  
@@ -6966,10 +6985,19 @@
     && TARGET_SOFT_FLOAT
     && (!MEM_P (operands[0])
         || register_operand (operands[1], SFmode))"
-  "@
-   mov%?\\t%0, %1
-   ldr%?\\t%0, %1\\t%@ float
-   str%?\\t%1, %0\\t%@ float"
+{
+  switch (which_alternative)
+    {
+    case 0: return \"mov%?\\t%0, %1\";
+    case 1:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      return \"ldr%?\\t%0, %1\\t%@ float\";
+    case 2: return \"str%?\\t%1, %0\\t%@ float\";
+    default: gcc_unreachable ();
+    }
+}
    [(set_attr "predicable" "yes")
     (set_attr "type" "mov_reg,load_4,store_4")
     (set_attr "arm_pool_range" "*,4096,*")
@@ -6978,6 +7006,21 @@
     (set_attr "thumb2_neg_pool_range" "*,0,*")]
  )
  
+;; Splitter for the above.
+(define_split
+  [(set (match_operand:SF 0 "s_register_operand")
+	(match_operand:SF 1 "const_double_operand"))]
+  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
+  [(const_int 0)]
+{
+  long buf;
+  real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
+  rtx cst = gen_int_mode (buf, SImode);
+  emit_move_insn (simplify_gen_subreg (SImode, operands[0], SFmode, 0), cst);
+  DONE;
+}
+)
+
  (define_expand "movdf"
    [(set (match_operand:DF 0 "general_operand" "")
  	(match_operand:DF 1 "general_operand" ""))]
@@ -6996,6 +7039,21 @@
  	    operands[1] = force_reg (DFmode, operands[1]);
          }
      }
+
+  /* Cannot load it directly, generate a load with clobber so that it can be
+     loaded via GPR with MOV / MOVT.  */
+  if (arm_disable_literal_pool
+      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
+      && CONSTANT_P (operands[1])
+      && TARGET_HARD_FLOAT
+      && !arm_const_double_rtx (operands[1])
+      && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1])))
+    {
+      rtx clobreg = gen_reg_rtx (DFmode);
+      emit_insn (gen_no_literal_pool_df_immediate (operands[0], operands[1],
+						   clobreg));
+      DONE;
+    }
    "
  )
  
@@ -7055,6 +7113,11 @@
      case 1:
      case 2:
        return \"#\";
+    case 3:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      /* Fall through.  */
      default:
        return output_move_double (operands, true, NULL);
      }
@@ -7066,6 +7129,24 @@
     (set_attr "arm_neg_pool_range" "*,*,*,1004,*")
     (set_attr "thumb2_neg_pool_range" "*,*,*,0,*")]
  )
+
+;; Splitter for the above.
+(define_split
+  [(set (match_operand:DF 0 "s_register_operand")
+	(match_operand:DF 1 "const_double_operand"))]
+  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
+  [(const_int 0)]
+{
+  long buf[2];
+  int order = BYTES_BIG_ENDIAN ? 1 : 0;
+  real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
+  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
+  ival |= (zext_hwi (buf[1 - order], 32) << 32);
+  rtx cst = gen_int_mode (ival, DImode);
+  emit_move_insn (simplify_gen_subreg (DImode, operands[0], DFmode, 0), cst);

This is the part I'm most hesitant about, especially for big-endian.
Did you run any armeb tests that exercise this?
Would you not want to use gen_highpart_mode/gen_lowpart that handles all the endianness-subreg subtleties for you?


Thanks,
Kyrill


+  DONE;
+}
+)
  
  
  ;; load- and store-multiple insns
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index 7576c6fc401fc5ce25245fa2b740db99169ce7ce..657e540816bdd82cddd23059dea2be19df7eb1bb 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -31,9 +31,10 @@
  ;; 'H' was previously used for FPA.
  
  ;; The following multi-letter normal constraints have been used:
-;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
+;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp,
+;;			 Dz, Tu
  ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
-;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
+;; in Thumb-2 state: Ha, Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py, Pz
  ;; in all states: Pf
  
  ;; The following memory constraints have been used:
@@ -234,6 +235,12 @@
   (and (match_code "const_double")
        (match_test "TARGET_32BIT && arm_const_double_rtx (op)")))
  
+(define_constraint "Ha"
+  "@internal In ARM / Thumb-2 a float constant iff literal pools are allowed."
+  (and (match_code "const_double")
+       (match_test "satisfies_constraint_E (op)")
+       (match_test "!arm_disable_literal_pool")))
+
  (define_constraint "Dz"
   "@internal
    In ARM/Thumb-2 state a vector of constant zeros."
@@ -351,6 +358,12 @@
         (match_test "TARGET_32BIT
  		    && vfp3_const_double_for_bits (op) > 0")))
  
+(define_constraint "Tu"
+  "@internal In ARM / Thumb-2 an integer constant iff literal pools are
+   allowed."
+  (and (match_test "CONSTANT_P (op)")
+       (match_test "!arm_disable_literal_pool")))
+
  (define_register_constraint "Ts" "(arm_restrict_it) ? LO_REGS : GENERAL_REGS"
   "For arm_restrict_it the core registers @code{r0}-@code{r7}.  GENERAL_REGS otherwise.")
  
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 7e198f9bce441c55913615e4c601a760d7e62c20..f73264cc2a07cacec5e7c4e31ce12299a1fadd0b 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -456,6 +456,24 @@
         (and (match_code "reg,subreg,mem")
  	    (match_operand 0 "nonimmediate_soft_df_operand"))))
  
+;; Predicate for thumb2_movsf_vfp.  Compared to general_operand, this
+;; forbids constant loaded via literal pool iff literal pools are disabled.
+(define_predicate "hard_sf_operand"
+  (and (match_operand 0 "general_operand")
+       (ior (not (match_code "const_double"))
+	    (not (match_test "arm_disable_literal_pool"))
+	    (match_test "satisfies_constraint_Dv (op)"))))
+
+;; Predicate for thumb2_movdf_vfp.  Compared to soft_df_operand used in
+;; movdf_soft_insn, this forbids constant loaded via literal pool iff
+;; literal pools are disabled.
+(define_predicate "hard_df_operand"
+  (and (match_operand 0 "soft_df_operand")
+       (ior (not (match_code "const_double"))
+	    (not (match_test "arm_disable_literal_pool"))
+	    (match_test "satisfies_constraint_Dy (op)")
+	    (match_test "satisfies_constraint_G (op)"))))
+
  (define_special_predicate "load_multiple_operation"
    (match_code "parallel")
  {
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index c42670f8643c3286bc5abf537d4fd0483cba68ac..727ceb9b37957efbc7ab8809f57e8825deb6b1df 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -252,16 +252,26 @@
    "TARGET_THUMB2 && !TARGET_IWMMXT && !TARGET_HARD_FLOAT
     && (   register_operand (operands[0], SImode)
         || register_operand (operands[1], SImode))"
-  "@
-   mov%?\\t%0, %1
-   mov%?\\t%0, %1
-   mov%?\\t%0, %1
-   mvn%?\\t%0, #%B1
-   movw%?\\t%0, %1
-   ldr%?\\t%0, %1
-   ldr%?\\t%0, %1
-   str%?\\t%1, %0
-   str%?\\t%1, %0"
+{
+  switch (which_alternative)
+    {
+    case 0:
+    case 1:
+    case 2:
+      return \"mov%?\\t%0, %1\";
+    case 3: return \"mvn%?\\t%0, #%B1\";
+    case 4: return \"movw%?\\t%0, %1\";
+    case 5:
+    case 6:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      return \"ldr%?\\t%0, %1\";
+    case 7:
+    case 8: return \"str%?\\t%1, %0\";
+    default: gcc_unreachable ();
+    }
+}
    [(set_attr "type" "mov_reg,mov_imm,mov_imm,mvn_imm,mov_imm,load_4,load_4,store_4,store_4")
     (set_attr "length" "2,4,2,4,4,4,4,4,4")
     (set_attr "predicable" "yes")
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index 611ebe2d83698e3129df6c55a03e4f5f33c891e7..f3d4f30cb53d82e2ffd2c4fcaad2cc873d97c24b 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -259,7 +259,7 @@
  ;; arm_restrict_it.
  (define_insn "*thumb2_movsi_vfp"
    [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r, l,*hk,m, *m,*t, r,*t,*t,  *Uv")
-	(match_operand:SI 1 "general_operand"	   "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*Uvi,*t"))]
+	(match_operand:SI 1 "general_operand"	   "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*UvTu,*t"))]
    "TARGET_THUMB2 && TARGET_HARD_FLOAT
     && (   s_register_operand (operands[0], SImode)
         || s_register_operand (operands[1], SImode))"
@@ -276,6 +276,9 @@
        return \"movw%?\\t%0, %1\";
      case 5:
      case 6:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
        return \"ldr%?\\t%0, %1\";
      case 7:
      case 8:
@@ -305,7 +308,7 @@
  
  (define_insn "*movdi_vfp"
    [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,q,q,m,w,!r,w,w, Uv")
-       (match_operand:DI 1 "di_operand"              "r,rDa,Db,Dc,mi,mi,q,r,w,w,Uvi,w"))]
+	(match_operand:DI 1 "di_operand"	      "r,rDa,Db,Dc,mi,mi,q,r,w,w,UvTu,w"))]
    "TARGET_32BIT && TARGET_HARD_FLOAT
     && (   register_operand (operands[0], DImode)
         || register_operand (operands[1], DImode))
@@ -321,6 +324,10 @@
        return \"#\";
      case 4:
      case 5:
+      /* Cannot load it directly, split to load it via MOV / MOVT.  */
+      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
+	return \"#\";
+      /* Fall through.  */
      case 6:
        return output_move_double (operands, true, NULL);
      case 7:
@@ -587,7 +594,7 @@
  
  (define_insn "*thumb2_movsf_vfp"
    [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t  ,Uv,r ,m,t,r")
-	(match_operand:SF 1 "general_operand"	   " ?r,t,Dv,UvE,t, mE,r,t,r"))]
+	(match_operand:SF 1 "hard_sf_operand"	   " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
    "TARGET_THUMB2 && TARGET_HARD_FLOAT
     && (   s_register_operand (operands[0], SFmode)
         || s_register_operand (operands[1], SFmode))"
@@ -676,7 +683,7 @@
  
  (define_insn "*thumb2_movdf_vfp"
    [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w  ,Uv,r ,m,w,r")
-	(match_operand:DF 1 "soft_df_operand"		   " ?r,w,Dy,G,UvF,w, mF,r, w,r"))]
+	(match_operand:DF 1 "hard_df_operand"		   " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
    "TARGET_THUMB2 && TARGET_HARD_FLOAT
     && (   register_operand (operands[0], DFmode)
         || register_operand (operands[1], DFmode))"
@@ -1983,39 +1990,50 @@
  ;; Support for xD (single precision only) variants.
  ;; fmrrs, fmsrr
  
-;; Split an immediate DF move to two immediate SI moves.
+;; Load a DF immediate via GPR (where combinations of MOV and MOVT can be used)
+;; and then move it into a VFP register.
  (define_insn_and_split "no_literal_pool_df_immediate"
-  [(set (match_operand:DF 0 "s_register_operand" "")
-	(match_operand:DF 1 "const_double_operand" ""))]
-  "TARGET_THUMB2 && arm_disable_literal_pool
-  && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE
-       && vfp3_const_double_rtx (operands[1]))"
+  [(set (match_operand:DF 0 "s_register_operand" "=w")
+	(match_operand:DF 1 "const_double_operand" "F"))
+   (clobber (match_operand:DF 2 "s_register_operand" "=r"))]
+  "arm_disable_literal_pool
+   && TARGET_HARD_FLOAT
+   && !arm_const_double_rtx (operands[1])
+   && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1]))"
    "#"
-  "&& !reload_completed"
-  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
-   (set (subreg:SI (match_dup 1) 4) (match_dup 3))
-   (set (match_dup 0) (match_dup 1))]
-  "
+  ""
+  [(const_int 0)]
+{
    long buf[2];
+  int order = BYTES_BIG_ENDIAN ? 1 : 0;
    real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
-  operands[2] = GEN_INT ((int) buf[0]);
-  operands[3] = GEN_INT ((int) buf[1]);
-  operands[1] = gen_reg_rtx (DFmode);
-  ")
+  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
+  ival |= (zext_hwi (buf[1 - order], 32) << 32);
+  rtx cst = gen_int_mode (ival, DImode);
+  emit_move_insn (simplify_gen_subreg (DImode, operands[2], DFmode, 0), cst);
+  emit_move_insn (operands[0], operands[2]);
+  DONE;
+}
+)
  
-;; Split an immediate SF move to one immediate SI move.
+;; Load a SF immediate via GPR (where combinations of MOV and MOVT can be used)
+;; and then move it into a VFP register.
  (define_insn_and_split "no_literal_pool_sf_immediate"
-  [(set (match_operand:SF 0 "s_register_operand" "")
-	(match_operand:SF 1 "const_double_operand" ""))]
-  "TARGET_THUMB2 && arm_disable_literal_pool
-  && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))"
+  [(set (match_operand:SF 0 "s_register_operand" "=t")
+	(match_operand:SF 1 "const_double_operand" "E"))
+   (clobber (match_operand:SF 2 "s_register_operand" "=r"))]
+  "arm_disable_literal_pool
+   && TARGET_HARD_FLOAT
+   && !vfp3_const_double_rtx (operands[1])"
    "#"
-  "&& !reload_completed"
-  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
-   (set (match_dup 0) (match_dup 1))]
-  "
+  ""
+  [(const_int 0)]
+{
    long buf;
    real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
-  operands[2] = GEN_INT ((int) buf);
-  operands[1] = gen_reg_rtx (SFmode);
-  ")
+  rtx cst = gen_int_mode (buf, SImode);
+  emit_move_insn (simplify_gen_subreg (SImode, operands[2], SFmode, 0), cst);
+  emit_move_insn (operands[0], operands[2]);
+  DONE;
+}
+)
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
index 90bd44e27e5c53d34f2816f4d6320acbc1dc709b..231243759cfe486c390ca27f10bd06177f60bd43 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
@@ -1,6 +1,7 @@
  /* { dg-do compile } */
  /* { dg-require-effective-target arm_cortex_m } */
  /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
  /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
  /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
  /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
index 5d9cd9c4df28837b81b2de48c25d38cdf2c15999..27e72ec20863866acdc5e7fea632bc6880678dfd 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
@@ -1,6 +1,7 @@
  /* { dg-do compile } */
  /* { dg-require-effective-target arm_cortex_m } */
  /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
  /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
  /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
  /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
index 0eeddd5e6ec1f42a96fc6220277f9ecb7cad44f5..8dbe87a1e68d5eb2edfd8259948988fbe0658ced 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
@@ -1,6 +1,7 @@
  /* { dg-do compile } */
  /* { dg-require-effective-target arm_cortex_m } */
  /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
  /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
  /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
  /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
index 7d52f3801b6d4b62b27833871ac830d6d077894d..b98eb7624e42b5a7f4a11c604c7d2826339bcfd5 100644
--- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
+++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
@@ -1,6 +1,7 @@
  /* { dg-do compile } */
  /* { dg-require-effective-target arm_cortex_m } */
  /* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_fp_ok } */
  /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
  /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
  /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */




* Re: [PATCH, ARM] Improve robustness of -mslow-flash-data
  2018-11-30 14:11 ` [PATCH, ARM] " Kyrill Tkachov
@ 2018-12-11 16:09   ` Thomas Preudhomme
  2018-12-14 16:14     ` Kyrill Tkachov
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Preudhomme @ 2018-12-11 16:09 UTC (permalink / raw)
  To: kyrylo.tkachov
  Cc: thomas.preudhomme, Ramana Radhakrishnan, Richard Earnshaw, gcc-patches

Hi Kyrill,

I've tested on armeb-none-eabi with -mslow-flash-data for both
-mfloat-abi=hard and -mfloat-abi=soft. Neither shows any regression and
the former shows some new PASSes.

Regarding the part you are hesitant about, the code was taken from
aarch64_reinterpret_float_as_int in config/aarch64/aarch64.c. I'm not
too keen on splitting the patch unless it's just for review (i.e. still
committed as one), since the changes really go together. The tighter
predicates and constraints prevent the normal patterns from matching
when -mslow-flash-data is in effect, while the new splitters and
expanders deal with loads under those circumstances.
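
For reference, the word-packing step in the DF splitters can be modelled
outside GCC as below. This is only a sketch: pack_df_words and its
parameters are hypothetical names, not GCC functions, and it assumes
that the two 32-bit words arrive in target word order, so that
buf[order] is the least significant word, as the index selection in the
patch implies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the packing step of the splitter: combine
   the two 32-bit words of a double, assumed to be given in target word
   order, into a single 64-bit integer image.  */
static uint64_t
pack_df_words (const uint32_t buf[2], int bytes_big_endian)
{
  assert (bytes_big_endian == 0 || bytes_big_endian == 1);
  /* Same index selection as in the patch: ORDER names the low word.  */
  int order = bytes_big_endian ? 1 : 0;
  uint64_t ival = (uint64_t) buf[order];
  ival |= (uint64_t) buf[1 - order] << 32;
  return ival;
}
```

Under that assumption, feeding the words of the same double in
little-endian order with bytes_big_endian == 0 and in big-endian order
with bytes_big_endian == 1 gives the same 64-bit value, which is why a
single DImode constant can be emitted for both endiannesses.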

Best regards,

Thomas

On Fri, 30 Nov 2018 at 14:11, Kyrill Tkachov
<kyrylo.tkachov@foss.arm.com> wrote:
>
> Hi Thomas,
>
> On 19/11/18 17:56, Thomas Preudhomme wrote:
> > Hi,
> >
> > Current code to handle -mslow-flash-data in machine description files
> > suffers from a number of issues which this patch fixes:
> >
> > 1) The insn_and_split patterns in vfp.md to load a generic floating-point
> > constant via GPR first and move it to VFP register are guarded by
> > !reload_completed which is forbidden explicitly in the GCC internals
> > documentation section 17.2 point 3;
> >
> > 2) A number of testcases in the testsuite ICE under -mslow-flash-data
> > when targeting the hardfloat ABI [1];
> >
> > 3) Instructions performing load from literal pool are not disabled.
> >
> > These problems are addressed by 2 separate actions:
> >
> > 1) Making the splitters take a clobber and changing the expanders
> > accordingly to generate a mov with clobber in cases where a literal
> > pool would be used. The splitter can thus be enabled after reload since
> > it does not call gen_reg_rtx anymore;
> >
> > 2) Adding new predicates and constraints to disable literal pool loads
> > in existing instructions when -mslow-flash-data is in effect.
> >
>
> Please split these into two separate patches so we can more clearly see which changes address which problem.
>
> > The patch also reworks the splitter for DFmode slightly to generate an
> > intermediate DI load instead of 2 intermediate SI loads, thus relying on
> > the existing DI splitters instead of redoing their job. Finally, the
> > patch adds the missing arm_fp_ok effective target to some of the
> > slow-flash-data testcases.
> >
> > [1]
> > c-c++-common/Wunused-var-3.c
> > gcc.c-torture/compile/pr72771.c
> > gcc.c-torture/compile/vector-5.c
> > gcc.c-torture/compile/vector-6.c
> > gcc.c-torture/execute/20030914-1.c
> > gcc.c-torture/execute/20050316-1.c
> > gcc.c-torture/execute/pr59643.c
> > gcc.dg/builtin-tgmath-1.c
> > gcc.dg/debug/pr55730.c
> > gcc.dg/graphite/interchange-7.c
> > gcc.dg/pr56890-2.c
> > gcc.dg/pr68474.c
> > gcc.dg/pr80286.c
> > gcc.dg/torture/pr35227.c
> > gcc.dg/torture/pr65077.c
> > gcc.dg/torture/pr86363.c
> > g++.dg/torture/pr81112.C
> > g++.dg/torture/pr82985.C
> > g++.dg/warn/Wunused-var-7.C
> > and a lot more in libstdc++ in special_functions/*_comp_ellint_* and
> > special_functions/*_ellint_* directories.
> >
> > ChangeLog entries are as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-11-14  Thomas Preud'homme <thomas.preudhomme@arm.com>
> >
> >         * config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
> >         source is a constant that would be loaded by literal pool.
> >         (movsf expander): Generate a no_literal_pool_sf_immediate insn if
> >         -mslow-flash-data is present, targeting hardfloat ABI and source is a
> >         float constant that cannot be loaded via vmov.
> >         (movdf expander): Likewise but generate a no_literal_pool_df_immediate
> >         insn.
> >         (arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
> >         float constant that would be loaded by literal pool.
> >         (softfloat constant movsf splitter): Splitter for the above case.
> >         (movdf_soft_insn): Split if -mslow-flash-data and source is a float
> >         constant that would be loaded by literal pool.
> >         (softfloat constant movdf splitter): Splitter for the above case.
> >         * config/arm/constraints.md (Pz): Document existing constraint.
> >         (Ha): Define constraint.
> >         (Tu): Likewise.
> >         * config/arm/predicates.md (hard_sf_operand): New predicate.
> >         (hard_df_operand): Likewise.
> >         * config/arm/thumb2.md (thumb2_movsi_insn): Split if
> >         -mslow-flash-data and constant would be loaded by literal pool.
> >         * config/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable constant
> >         load in VFP register.
> >         (movdi_vfp): Likewise.
> >         (thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to
> >         prevent match for a constant load if -mslow-flash-data and constant
> >         cannot be loaded via vmov.  Adapt constraint accordingly by
> >         using Ha instead of E for generic floating-point constant load.
> >         (thumb2_movdf_vfp): Likewise using hard_df_operand predicate instead.
> >         (no_literal_pool_df_immediate): Add a clobber to use as the
> >         intermediate general purpose register and also enable it after reload
> >         but disable it if the constant is a valid FP constant.  Add constraints and
> >         generate a DI intermediate load rather than 2 SI loads.
> >         (no_literal_pool_sf_immediate): Add a clobber to use as the
> >         intermediate general purpose register and also enable it after
> >         reload.
> >
> > *** gcc/testsuite/ChangeLog ***
> >
> > 2018-11-14  Thomas Preud'homme <thomas.preudhomme@arm.com>
> >
> >         * gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok
> >         effective target.
> >         * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
> >         * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
> >         * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
> >
> > Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to
> > softfloat and hardfloat ABI which showed no regression and some
> > FAIL->PASS for hardfloat ABI. Bootstrapped on Arm and Thumb-2 without any
> > regression. Compiled SPEC2k6 without -mslow-flash-data and checked that
> > code generation didn't change.
> >
> > Is this ok for stage3?
> >
> > Best regards,
> >
> > Thomas
>
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index a773518cefaf8451e77fead9e072ee8ef39f1eb8..a08298bbb9f93fc132aa64a206fad64dcda9ed65 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -5831,6 +5831,11 @@
>       case 1:
>       case 2:
>         return \"#\";
> +    case 3:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
> +      /* Fall through.  */
>       default:
>         return output_move_double (operands, true, NULL);
>       }
> @@ -6939,6 +6944,20 @@
>              operands[1] = force_reg (SFmode, operands[1]);
>           }
>       }
> +
> +  /* Cannot load it directly, generate a load with clobber so that it can be
> +     loaded via GPR with MOV / MOVT.  */
> +  if (arm_disable_literal_pool
> +      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
> +      && CONST_DOUBLE_P (operands[1])
> +      && TARGET_HARD_FLOAT
> +      && !vfp3_const_double_rtx (operands[1]))
> +    {
> +      rtx clobreg = gen_reg_rtx (SFmode);
> +      emit_insn (gen_no_literal_pool_sf_immediate (operands[0], operands[1],
> +                                                  clobreg));
> +      DONE;
> +    }
>     "
>   )
>
> @@ -6966,10 +6985,19 @@
>      && TARGET_SOFT_FLOAT
>      && (!MEM_P (operands[0])
>          || register_operand (operands[1], SFmode))"
> -  "@
> -   mov%?\\t%0, %1
> -   ldr%?\\t%0, %1\\t%@ float
> -   str%?\\t%1, %0\\t%@ float"
> +{
> +  switch (which_alternative)
> +    {
> +    case 0: return \"mov%?\\t%0, %1\";
> +    case 1:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
> +      return \"ldr%?\\t%0, %1\\t%@ float\";
> +    case 2: return \"str%?\\t%1, %0\\t%@ float\";
> +    default: gcc_unreachable ();
> +    }
> +}
>     [(set_attr "predicable" "yes")
>      (set_attr "type" "mov_reg,load_4,store_4")
>      (set_attr "arm_pool_range" "*,4096,*")
> @@ -6978,6 +7006,21 @@
>      (set_attr "thumb2_neg_pool_range" "*,0,*")]
>   )
>
> +;; Splitter for the above.
> +(define_split
> +  [(set (match_operand:SF 0 "s_register_operand")
> +       (match_operand:SF 1 "const_double_operand"))]
> +  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
> +  [(const_int 0)]
> +{
> +  long buf;
> +  real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
> +  rtx cst = gen_int_mode (buf, SImode);
> +  emit_move_insn (simplify_gen_subreg (SImode, operands[0], SFmode, 0), cst);
> +  DONE;
> +}
> +)
> +
>   (define_expand "movdf"
>     [(set (match_operand:DF 0 "general_operand" "")
>         (match_operand:DF 1 "general_operand" ""))]
> @@ -6996,6 +7039,21 @@
>             operands[1] = force_reg (DFmode, operands[1]);
>           }
>       }
> +
> +  /* Cannot load it directly, generate a load with clobber so that it can be
> +     loaded via GPR with MOV / MOVT.  */
> +  if (arm_disable_literal_pool
> +      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
> +      && CONSTANT_P (operands[1])
> +      && TARGET_HARD_FLOAT
> +      && !arm_const_double_rtx (operands[1])
> +      && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1])))
> +    {
> +      rtx clobreg = gen_reg_rtx (DFmode);
> +      emit_insn (gen_no_literal_pool_df_immediate (operands[0], operands[1],
> +                                                  clobreg));
> +      DONE;
> +    }
>     "
>   )
>
> @@ -7055,6 +7113,11 @@
>       case 1:
>       case 2:
>         return \"#\";
> +    case 3:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
> +      /* Fall through.  */
>       default:
>         return output_move_double (operands, true, NULL);
>       }
> @@ -7066,6 +7129,24 @@
>      (set_attr "arm_neg_pool_range" "*,*,*,1004,*")
>      (set_attr "thumb2_neg_pool_range" "*,*,*,0,*")]
>   )
> +
> +;; Splitter for the above.
> +(define_split
> +  [(set (match_operand:DF 0 "s_register_operand")
> +       (match_operand:DF 1 "const_double_operand"))]
> +  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
> +  [(const_int 0)]
> +{
> +  long buf[2];
> +  int order = BYTES_BIG_ENDIAN ? 1 : 0;
> +  real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
> +  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
> +  ival |= (zext_hwi (buf[1 - order], 32) << 32);
> +  rtx cst = gen_int_mode (ival, DImode);
> +  emit_move_insn (simplify_gen_subreg (DImode, operands[0], DFmode, 0), cst);
>
> This is the part I'm most hesitant about, especially for big-endian.
> Did you run any armeb tests that exercise this?
> Would you not want to use gen_highpart_mode/gen_lowpart, which handle all the endianness-subreg subtleties for you?
>
>
> Thanks,
> Kyrill
>
>
> +  DONE;
> +}
> +)
>
>
>   ;; load- and store-multiple insns
> diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
> index 7576c6fc401fc5ce25245fa2b740db99169ce7ce..657e540816bdd82cddd23059dea2be19df7eb1bb 100644
> --- a/gcc/config/arm/constraints.md
> +++ b/gcc/config/arm/constraints.md
> @@ -31,9 +31,10 @@
>   ;; 'H' was previously used for FPA.
>
>   ;; The following multi-letter normal constraints have been used:
> -;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
> +;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp,
> +;;                      Dz, Tu
>   ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
> -;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
> +;; in Thumb-2 state: Ha, Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py, Pz
>   ;; in all states: Pf
>
>   ;; The following memory constraints have been used:
> @@ -234,6 +235,12 @@
>    (and (match_code "const_double")
>         (match_test "TARGET_32BIT && arm_const_double_rtx (op)")))
>
> +(define_constraint "Ha"
> +  "@internal In ARM / Thumb-2 a float constant iff literal pools are allowed."
> +  (and (match_code "const_double")
> +       (match_test "satisfies_constraint_E (op)")
> +       (match_test "!arm_disable_literal_pool")))
> +
>   (define_constraint "Dz"
>    "@internal
>     In ARM/Thumb-2 state a vector of constant zeros."
> @@ -351,6 +358,12 @@
>          (match_test "TARGET_32BIT
>                     && vfp3_const_double_for_bits (op) > 0")))
>
> +(define_constraint "Tu"
> +  "@internal In ARM / Thumb-2 an integer constant iff literal pools are
> +   allowed."
> +  (and (match_test "CONSTANT_P (op)")
> +       (match_test "!arm_disable_literal_pool")))
> +
>   (define_register_constraint "Ts" "(arm_restrict_it) ? LO_REGS : GENERAL_REGS"
>    "For arm_restrict_it the core registers @code{r0}-@code{r7}.  GENERAL_REGS otherwise.")
>
> diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
> index 7e198f9bce441c55913615e4c601a760d7e62c20..f73264cc2a07cacec5e7c4e31ce12299a1fadd0b 100644
> --- a/gcc/config/arm/predicates.md
> +++ b/gcc/config/arm/predicates.md
> @@ -456,6 +456,24 @@
>          (and (match_code "reg,subreg,mem")
>             (match_operand 0 "nonimmediate_soft_df_operand"))))
>
> +;; Predicate for thumb2_movsf_vfp.  Compared to general_operand, this
> +;; forbids constant loaded via literal pool iff literal pools are disabled.
> +(define_predicate "hard_sf_operand"
> +  (and (match_operand 0 "general_operand")
> +       (ior (not (match_code "const_double"))
> +           (not (match_test "arm_disable_literal_pool"))
> +           (match_test "satisfies_constraint_Dv (op)"))))
> +
> +;; Predicate for thumb2_movdf_vfp.  Compared to soft_df_operand used in
> +;; movdf_soft_insn, this forbids constant loaded via literal pool iff
> +;; literal pools are disabled.
> +(define_predicate "hard_df_operand"
> +  (and (match_operand 0 "soft_df_operand")
> +       (ior (not (match_code "const_double"))
> +           (not (match_test "arm_disable_literal_pool"))
> +           (match_test "satisfies_constraint_Dy (op)")
> +           (match_test "satisfies_constraint_G (op)"))))
> +
>   (define_special_predicate "load_multiple_operation"
>     (match_code "parallel")
>   {
> diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
> index c42670f8643c3286bc5abf537d4fd0483cba68ac..727ceb9b37957efbc7ab8809f57e8825deb6b1df 100644
> --- a/gcc/config/arm/thumb2.md
> +++ b/gcc/config/arm/thumb2.md
> @@ -252,16 +252,26 @@
>     "TARGET_THUMB2 && !TARGET_IWMMXT && !TARGET_HARD_FLOAT
>      && (   register_operand (operands[0], SImode)
>          || register_operand (operands[1], SImode))"
> -  "@
> -   mov%?\\t%0, %1
> -   mov%?\\t%0, %1
> -   mov%?\\t%0, %1
> -   mvn%?\\t%0, #%B1
> -   movw%?\\t%0, %1
> -   ldr%?\\t%0, %1
> -   ldr%?\\t%0, %1
> -   str%?\\t%1, %0
> -   str%?\\t%1, %0"
> +{
> +  switch (which_alternative)
> +    {
> +    case 0:
> +    case 1:
> +    case 2:
> +      return \"mov%?\\t%0, %1\";
> +    case 3: return \"mvn%?\\t%0, #%B1\";
> +    case 4: return \"movw%?\\t%0, %1\";
> +    case 5:
> +    case 6:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
> +      return \"ldr%?\\t%0, %1\";
> +    case 7:
> +    case 8: return \"str%?\\t%1, %0\";
> +    default: gcc_unreachable ();
> +    }
> +}
>     [(set_attr "type" "mov_reg,mov_imm,mov_imm,mvn_imm,mov_imm,load_4,load_4,store_4,store_4")
>      (set_attr "length" "2,4,2,4,4,4,4,4,4")
>      (set_attr "predicable" "yes")
> diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
> index 611ebe2d83698e3129df6c55a03e4f5f33c891e7..f3d4f30cb53d82e2ffd2c4fcaad2cc873d97c24b 100644
> --- a/gcc/config/arm/vfp.md
> +++ b/gcc/config/arm/vfp.md
> @@ -259,7 +259,7 @@
>   ;; arm_restrict_it.
>   (define_insn "*thumb2_movsi_vfp"
>     [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r, l,*hk,m, *m,*t, r,*t,*t,  *Uv")
> -       (match_operand:SI 1 "general_operand"      "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*Uvi,*t"))]
> +       (match_operand:SI 1 "general_operand"      "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*UvTu,*t"))]
>     "TARGET_THUMB2 && TARGET_HARD_FLOAT
>      && (   s_register_operand (operands[0], SImode)
>          || s_register_operand (operands[1], SImode))"
> @@ -276,6 +276,9 @@
>         return \"movw%?\\t%0, %1\";
>       case 5:
>       case 6:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
>         return \"ldr%?\\t%0, %1\";
>       case 7:
>       case 8:
> @@ -305,7 +308,7 @@
>
>   (define_insn "*movdi_vfp"
>     [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,q,q,m,w,!r,w,w, Uv")
> -       (match_operand:DI 1 "di_operand"              "r,rDa,Db,Dc,mi,mi,q,r,w,w,Uvi,w"))]
> +       (match_operand:DI 1 "di_operand"              "r,rDa,Db,Dc,mi,mi,q,r,w,w,UvTu,w"))]
>     "TARGET_32BIT && TARGET_HARD_FLOAT
>      && (   register_operand (operands[0], DImode)
>          || register_operand (operands[1], DImode))
> @@ -321,6 +324,10 @@
>         return \"#\";
>       case 4:
>       case 5:
> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> +       return \"#\";
> +      /* Fall through.  */
>       case 6:
>         return output_move_double (operands, true, NULL);
>       case 7:
> @@ -587,7 +594,7 @@
>
>   (define_insn "*thumb2_movsf_vfp"
>     [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t  ,Uv,r ,m,t,r")
> -       (match_operand:SF 1 "general_operand"      " ?r,t,Dv,UvE,t, mE,r,t,r"))]
> +       (match_operand:SF 1 "hard_sf_operand"      " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
>     "TARGET_THUMB2 && TARGET_HARD_FLOAT
>      && (   s_register_operand (operands[0], SFmode)
>          || s_register_operand (operands[1], SFmode))"
> @@ -676,7 +683,7 @@
>
>   (define_insn "*thumb2_movdf_vfp"
>     [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w  ,Uv,r ,m,w,r")
> -       (match_operand:DF 1 "soft_df_operand"              " ?r,w,Dy,G,UvF,w, mF,r, w,r"))]
> +       (match_operand:DF 1 "hard_df_operand"              " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
>     "TARGET_THUMB2 && TARGET_HARD_FLOAT
>      && (   register_operand (operands[0], DFmode)
>          || register_operand (operands[1], DFmode))"
> @@ -1983,39 +1990,50 @@
>   ;; Support for xD (single precision only) variants.
>   ;; fmrrs, fmsrr
>
> -;; Split an immediate DF move to two immediate SI moves.
> +;; Load a DF immediate via GPR (where combinations of MOV and MOVT can be used)
> +;; and then move it into a VFP register.
>   (define_insn_and_split "no_literal_pool_df_immediate"
> -  [(set (match_operand:DF 0 "s_register_operand" "")
> -       (match_operand:DF 1 "const_double_operand" ""))]
> -  "TARGET_THUMB2 && arm_disable_literal_pool
> -  && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE
> -       && vfp3_const_double_rtx (operands[1]))"
> +  [(set (match_operand:DF 0 "s_register_operand" "=w")
> +       (match_operand:DF 1 "const_double_operand" "F"))
> +   (clobber (match_operand:DF 2 "s_register_operand" "=r"))]
> +  "arm_disable_literal_pool
> +   && TARGET_HARD_FLOAT
> +   && !arm_const_double_rtx (operands[1])
> +   && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1]))"
>     "#"
> -  "&& !reload_completed"
> -  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
> -   (set (subreg:SI (match_dup 1) 4) (match_dup 3))
> -   (set (match_dup 0) (match_dup 1))]
> -  "
> +  ""
> +  [(const_int 0)]
> +{
>     long buf[2];
> +  int order = BYTES_BIG_ENDIAN ? 1 : 0;
>     real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
> -  operands[2] = GEN_INT ((int) buf[0]);
> -  operands[3] = GEN_INT ((int) buf[1]);
> -  operands[1] = gen_reg_rtx (DFmode);
> -  ")
> +  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
> +  ival |= (zext_hwi (buf[1 - order], 32) << 32);
> +  rtx cst = gen_int_mode (ival, DImode);
> +  emit_move_insn (simplify_gen_subreg (DImode, operands[2], DFmode, 0), cst);
> +  emit_move_insn (operands[0], operands[2]);
> +  DONE;
> +}
> +)
>
> -;; Split an immediate SF move to one immediate SI move.
> +;; Load a SF immediate via GPR (where combinations of MOV and MOVT can be used)
> +;; and then move it into a VFP register.
>   (define_insn_and_split "no_literal_pool_sf_immediate"
> -  [(set (match_operand:SF 0 "s_register_operand" "")
> -       (match_operand:SF 1 "const_double_operand" ""))]
> -  "TARGET_THUMB2 && arm_disable_literal_pool
> -  && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))"
> +  [(set (match_operand:SF 0 "s_register_operand" "=t")
> +       (match_operand:SF 1 "const_double_operand" "E"))
> +   (clobber (match_operand:SF 2 "s_register_operand" "=r"))]
> +  "arm_disable_literal_pool
> +   && TARGET_HARD_FLOAT
> +   && !vfp3_const_double_rtx (operands[1])"
>     "#"
> -  "&& !reload_completed"
> -  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
> -   (set (match_dup 0) (match_dup 1))]
> -  "
> +  ""
> +  [(const_int 0)]
> +{
>     long buf;
>     real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
> -  operands[2] = GEN_INT ((int) buf);
> -  operands[1] = gen_reg_rtx (SFmode);
> -  ")
> +  rtx cst = gen_int_mode (buf, SImode);
> +  emit_move_insn (simplify_gen_subreg (SImode, operands[2], SFmode, 0), cst);
> +  emit_move_insn (operands[0], operands[2]);
> +  DONE;
> +}
> +)
> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
> index 90bd44e27e5c53d34f2816f4d6320acbc1dc709b..231243759cfe486c390ca27f10bd06177f60bd43 100644
> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
> @@ -1,6 +1,7 @@
>   /* { dg-do compile } */
>   /* { dg-require-effective-target arm_cortex_m } */
>   /* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
>   /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>   /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>   /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
> index 5d9cd9c4df28837b81b2de48c25d38cdf2c15999..27e72ec20863866acdc5e7fea632bc6880678dfd 100644
> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
> @@ -1,6 +1,7 @@
>   /* { dg-do compile } */
>   /* { dg-require-effective-target arm_cortex_m } */
>   /* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
>   /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>   /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>   /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
> index 0eeddd5e6ec1f42a96fc6220277f9ecb7cad44f5..8dbe87a1e68d5eb2edfd8259948988fbe0658ced 100644
> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
> @@ -1,6 +1,7 @@
>   /* { dg-do compile } */
>   /* { dg-require-effective-target arm_cortex_m } */
>   /* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
>   /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>   /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>   /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
> index 7d52f3801b6d4b62b27833871ac830d6d077894d..b98eb7624e42b5a7f4a11c604c7d2826339bcfd5 100644
> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
> @@ -1,6 +1,7 @@
>   /* { dg-do compile } */
>   /* { dg-require-effective-target arm_cortex_m } */
>   /* { dg-require-effective-target arm_thumb2_ok } */
> +/* { dg-require-effective-target arm_fp_ok } */
>   /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>   /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>   /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
>
>
>


* Re: [PATCH, ARM] Improve robustness of -mslow-flash-data
  2018-12-11 16:09   ` Thomas Preudhomme
@ 2018-12-14 16:14     ` Kyrill Tkachov
  0 siblings, 0 replies; 6+ messages in thread
From: Kyrill Tkachov @ 2018-12-14 16:14 UTC (permalink / raw)
  To: Thomas Preudhomme
  Cc: thomas.preudhomme, Ramana Radhakrishnan, Richard Earnshaw, gcc-patches

Hi Thomas,

On 11/12/18 16:09, Thomas Preudhomme wrote:
> Hi Kyrill,
>
> I've tested on armeb-none-eabi with -mslow-flash-data for both
> -mfloat-abi=hard and -mfloat-abi=soft. Both show no regression and the
> former shows some new PASS.
>
> Regarding the part you are hesitant about, the code was taken from
> aarch64_reinterpret_float_as_int in config/aarch64/aarch64.c. I'm not
> too keen on splitting the patch unless it's just for review (i.e. still
> committed as one) since the changes really go together. The tighter
> predicates and constraints prevent the normal patterns from matching
> when -mslow-flash-data is in effect, while the new splitters and
> expanders deal with loads under those circumstances.
>
> Best regards,
> Thomas

Ok. Thanks for the explanation.
Kyrill
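
For readers following the endianness question, here is a minimal standalone
sketch of the word-packing step used by the DF splitters. It is not GCC code:
pack_double_words and its big_endian parameter are hypothetical names, and it
assumes the two 32-bit words arrive in target memory order, as the patch
appears to expect from real_to_target.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GCC.  WORDS holds the two 32-bit halves
   of a double in target memory order (assumption: words[0] is the low half
   on little-endian and the high half on big-endian).  Returns the 64-bit
   integer image of the constant, with the low half in bits 0..31.  */
static uint64_t
pack_double_words (const uint32_t words[2], int big_endian)
{
  /* Index of the low half, mirroring "order" in the patch.  */
  int order = big_endian ? 1 : 0;

  /* Widen before shifting so the high half is not truncated.  */
  uint64_t ival = (uint64_t) words[order];
  ival |= (uint64_t) words[1 - order] << 32;
  return ival;
}
```

With this scheme the double 1.0 yields 0x3FF0000000000000 for both word
orders: from { 0x00000000, 0x3FF00000 } on little-endian and from
{ 0x3FF00000, 0x00000000 } on big-endian, so the later DImode move sees
the same integer either way.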

> On Fri, 30 Nov 2018 at 14:11, Kyrill Tkachov
> <kyrylo.tkachov@foss.arm.com> wrote:
>> Hi Thomas,
>>
>> On 19/11/18 17:56, Thomas Preudhomme wrote:
>>> Hi,
>>>
>>> Current code to handle -mslow-flash-data in machine description files
>>> suffers from a number of issues which this patch fixes:
>>>
>>> 1) The insn_and_split patterns in vfp.md that load a generic floating-point
>>> constant via a GPR first and then move it to a VFP register are guarded by
>>> !reload_completed, which is explicitly forbidden in the GCC internals
>>> documentation section 17.2 point 3;
>>>
>>> 2) A number of testcases in the testsuite ICE under -mslow-flash-data
>>> when targeting the hardfloat ABI [1];
>>>
>>> 3) Instructions performing loads from the literal pool are not disabled.
>>>
>>> These problems are addressed by 2 separate actions:
>>>
>>> 1) Making the splitters take a clobber and changing the expanders
>>> accordingly to generate a mov with clobber in cases where a literal
>>> pool would be used. The splitter can thus be enabled after reload since
>>> it does not call gen_reg_rtx anymore;
>>>
>>> 2) Adding new predicates and constraints to disable literal pool loads
>>> in existing instructions when -mslow-flash-data is in effect.
>>>
>> Please split these into two separate patches so we can more clearly see which changes address which problem.
>>
>>> The patch also reworks the splitter for DFmode slightly to generate an
>>> intermediate DI load instead of 2 intermediate SI loads, thus relying on
>>> the existing DI splitters instead of redoing their job. Finally, the
>>> patch adds the missing arm_fp_ok effective target requirement to some of
>>> the slow-flash-data testcases.
>>>
>>> [1]
>>> c-c++-common/Wunused-var-3.c
>>> gcc.c-torture/compile/pr72771.c
>>> gcc.c-torture/compile/vector-5.c
>>> gcc.c-torture/compile/vector-6.c
>>> gcc.c-torture/execute/20030914-1.c
>>> gcc.c-torture/execute/20050316-1.c
>>> gcc.c-torture/execute/pr59643.c
>>> gcc.dg/builtin-tgmath-1.c
>>> gcc.dg/debug/pr55730.c
>>> gcc.dg/graphite/interchange-7.c
>>> gcc.dg/pr56890-2.c
>>> gcc.dg/pr68474.c
>>> gcc.dg/pr80286.c
>>> gcc.dg/torture/pr35227.c
>>> gcc.dg/torture/pr65077.c
>>> gcc.dg/torture/pr86363.c
>>> g++.dg/torture/pr81112.C
>>> g++.dg/torture/pr82985.C
>>> g++.dg/warn/Wunused-var-7.C
>>> and a lot more in libstdc++ in special_functions/*_comp_ellint_* and
>>> special_functions/*_ellint_* directories.
>>>
>>> ChangeLog entries are as follows:
>>>
>>> *** gcc/ChangeLog ***
>>>
>>> 2018-11-14  Thomas Preud'homme <thomas.preudhomme@arm.com>
>>>
>>>          * config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
>>>          source is a constant that would be loaded by literal pool.
>>>          (movsf expander): Generate a no_literal_pool_sf_immediate insn if
>>>          -mslow-flash-data is present, targeting hardfloat ABI and source is a
>>>          float constant that cannot be loaded via vmov.
>>>          (movdf expander): Likewise but generate a no_literal_pool_df_immediate
>>>          insn.
>>>          (arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
>>>          float constant that would be loaded by literal pool.
>>>          (softfloat constant movsf splitter): Splitter for the above case.
>>>          (movdf_soft_insn): Split if -mslow-flash-data and source is a float
>>>          constant that would be loaded by literal pool.
>>>          (softfloat constant movdf splitter): Splitter for the above case.
>>>          * config/arm/constraints.md (Pz): Document existing constraint.
>>>          (Ha): Define constraint.
>>>          (Tu): Likewise.
>>>          * config/arm/predicates.md (hard_sf_operand): New predicate.
>>>          (hard_df_operand): Likewise.
>>>          * config/arm/thumb2.md (thumb2_movsi_insn): Split if
>>>          -mslow-flash-data and constant would be loaded by literal pool.
>>>          * config/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable constant
>>>          load in VFP register.
>>>          (movdi_vfp): Likewise.
>>>          (thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to
>>>          prevent match for a constant load if -mslow-flash-data and constant
>>>          cannot be loaded via vmov.  Adapt constraint accordingly by
>>>          using Ha instead of E for generic floating-point constant load.
>>>          (thumb2_movdf_vfp): Likewise using hard_df_operand predicate instead.
>>>          (no_literal_pool_df_immediate): Add a clobber to use as the
>>>          intermediate general purpose register and also enable it after reload
>>>          but disable it if the constant is a valid FP constant.  Add constraints and
>>>          generate a DI intermediate load rather than 2 SI loads.
>>>          (no_literal_pool_sf_immediate): Add a clobber to use as the
>>>          intermediate general purpose register and also enable it after
>>>          reload.
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>>
>>> 2018-11-14  Thomas Preud'homme <thomas.preudhomme@arm.com>
>>>
>>>          * gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok
>>>          effective target.
>>>          * gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
>>>          * gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
>>>          * gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.
>>>
>>> Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to
>>> softfloat and hardfloat ABI which showed no regression and some
>>> FAIL->PASS for hardfloat ABI. Bootstrapped on Arm and Thumb-2 without any
>>> regression. Compiled SPEC2k6 without -mslow-flash-data and checked that
>>> code generation didn't change.
>>>
>>> Is this ok for stage3?
>>>
>>> Best regards,
>>>
>>> Thomas
>> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>> index a773518cefaf8451e77fead9e072ee8ef39f1eb8..a08298bbb9f93fc132aa64a206fad64dcda9ed65 100644
>> --- a/gcc/config/arm/arm.md
>> +++ b/gcc/config/arm/arm.md
>> @@ -5831,6 +5831,11 @@
>>        case 1:
>>        case 2:
>>          return \"#\";
>> +    case 3:
>> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
>> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
>> +       return \"#\";
>> +      /* Fall through.  */
>>        default:
>>          return output_move_double (operands, true, NULL);
>>        }
>> @@ -6939,6 +6944,20 @@
>>               operands[1] = force_reg (SFmode, operands[1]);
>>            }
>>        }
>> +
>> +  /* Cannot load it directly, generate a load with clobber so that it can be
>> +     loaded via GPR with MOV / MOVT.  */
>> +  if (arm_disable_literal_pool
>> +      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
>> +      && CONST_DOUBLE_P (operands[1])
>> +      && TARGET_HARD_FLOAT
>> +      && !vfp3_const_double_rtx (operands[1]))
>> +    {
>> +      rtx clobreg = gen_reg_rtx (SFmode);
>> +      emit_insn (gen_no_literal_pool_sf_immediate (operands[0], operands[1],
>> +                                                  clobreg));
>> +      DONE;
>> +    }
>>      "
>>    )
>>
>> @@ -6966,10 +6985,19 @@
>>       && TARGET_SOFT_FLOAT
>>       && (!MEM_P (operands[0])
>>           || register_operand (operands[1], SFmode))"
>> -  "@
>> -   mov%?\\t%0, %1
>> -   ldr%?\\t%0, %1\\t%@ float
>> -   str%?\\t%1, %0\\t%@ float"
>> +{
>> +  switch (which_alternative)
>> +    {
>> +    case 0: return \"mov%?\\t%0, %1\";
>> +    case 1:
>> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
>> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
>> +       return \"#\";
>> +      return \"ldr%?\\t%0, %1\\t%@ float\";
>> +    case 2: return \"str%?\\t%1, %0\\t%@ float\";
>> +    default: gcc_unreachable ();
>> +    }
>> +}
>>      [(set_attr "predicable" "yes")
>>       (set_attr "type" "mov_reg,load_4,store_4")
>>       (set_attr "arm_pool_range" "*,4096,*")
>> @@ -6978,6 +7006,21 @@
>>       (set_attr "thumb2_neg_pool_range" "*,0,*")]
>>    )
>>
>> +;; Splitter for the above.
>> +(define_split
>> +  [(set (match_operand:SF 0 "s_register_operand")
>> +       (match_operand:SF 1 "const_double_operand"))]
>> +  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
>> +  [(const_int 0)]
>> +{
>> +  long buf;
>> +  real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
>> +  rtx cst = gen_int_mode (buf, SImode);
>> +  emit_move_insn (simplify_gen_subreg (SImode, operands[0], SFmode, 0), cst);
>> +  DONE;
>> +}
>> +)
>> +
>>    (define_expand "movdf"
>>      [(set (match_operand:DF 0 "general_operand" "")
>>          (match_operand:DF 1 "general_operand" ""))]
>> @@ -6996,6 +7039,21 @@
>>              operands[1] = force_reg (DFmode, operands[1]);
>>            }
>>        }
>> +
>> +  /* Cannot load it directly, generate a load with clobber so that it can be
>> +     loaded via GPR with MOV / MOVT.  */
>> +  if (arm_disable_literal_pool
>> +      && (REG_P (operands[0]) || SUBREG_P (operands[0]))
>> +      && CONSTANT_P (operands[1])
>> +      && TARGET_HARD_FLOAT
>> +      && !arm_const_double_rtx (operands[1])
>> +      && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1])))
>> +    {
>> +      rtx clobreg = gen_reg_rtx (DFmode);
>> +      emit_insn (gen_no_literal_pool_df_immediate (operands[0], operands[1],
>> +                                                  clobreg));
>> +      DONE;
>> +    }
>>      "
>>    )
>>
>> @@ -7055,6 +7113,11 @@
>>        case 1:
>>        case 2:
>>          return \"#\";
>> +    case 3:
>> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
>> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
>> +       return \"#\";
>> +      /* Fall through.  */
>>        default:
>>          return output_move_double (operands, true, NULL);
>>        }
>> @@ -7066,6 +7129,24 @@
>>       (set_attr "arm_neg_pool_range" "*,*,*,1004,*")
>>       (set_attr "thumb2_neg_pool_range" "*,*,*,0,*")]
>>    )
>> +
>> +;; Splitter for the above.
>> +(define_split
>> +  [(set (match_operand:DF 0 "s_register_operand")
>> +       (match_operand:DF 1 "const_double_operand"))]
>> +  "arm_disable_literal_pool && TARGET_SOFT_FLOAT"
>> +  [(const_int 0)]
>> +{
>> +  long buf[2];
>> +  int order = BYTES_BIG_ENDIAN ? 1 : 0;
>> +  real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
>> +  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
>> +  ival |= (zext_hwi (buf[1 - order], 32) << 32);
>> +  rtx cst = gen_int_mode (ival, DImode);
>> +  emit_move_insn (simplify_gen_subreg (DImode, operands[0], DFmode, 0), cst);
>>
>> This is the part I'm most hesitant about, especially for big-endian.
>> Did you run any armeb tests that exercise this?
>> Would you not want to use gen_highpart_mode/gen_lowpart, which handle all the endianness-subreg subtleties for you?
>>
>>
>> Thanks,
>> Kyrill
>>
>>
>> +  DONE;
>> +}
>> +)
>>
>>
>>    ;; load- and store-multiple insns
>> diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
>> index 7576c6fc401fc5ce25245fa2b740db99169ce7ce..657e540816bdd82cddd23059dea2be19df7eb1bb 100644
>> --- a/gcc/config/arm/constraints.md
>> +++ b/gcc/config/arm/constraints.md
>> @@ -31,9 +31,10 @@
>>    ;; 'H' was previously used for FPA.
>>
>>    ;; The following multi-letter normal constraints have been used:
>> -;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
>> +;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp,
>> +;;                      Dz, Tu
>>    ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
>> -;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
>> +;; in Thumb-2 state: Ha, Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py, Pz
>>    ;; in all states: Pf
>>
>>    ;; The following memory constraints have been used:
>> @@ -234,6 +235,12 @@
>>     (and (match_code "const_double")
>>          (match_test "TARGET_32BIT && arm_const_double_rtx (op)")))
>>
>> +(define_constraint "Ha"
>> +  "@internal In ARM / Thumb-2 a float constant iff literal pools are allowed."
>> +  (and (match_code "const_double")
>> +       (match_test "satisfies_constraint_E (op)")
>> +       (match_test "!arm_disable_literal_pool")))
>> +
>>    (define_constraint "Dz"
>>     "@internal
>>      In ARM/Thumb-2 state a vector of constant zeros."
>> @@ -351,6 +358,12 @@
>>           (match_test "TARGET_32BIT
>>                      && vfp3_const_double_for_bits (op) > 0")))
>>
>> +(define_constraint "Tu"
>> +  "@internal In ARM / Thumb-2 an integer constant iff literal pools are
>> +   allowed."
>> +  (and (match_test "CONSTANT_P (op)")
>> +       (match_test "!arm_disable_literal_pool")))
>> +
>>    (define_register_constraint "Ts" "(arm_restrict_it) ? LO_REGS : GENERAL_REGS"
>>     "For arm_restrict_it the core registers @code{r0}-@code{r7}.  GENERAL_REGS otherwise.")
>>
>> diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
>> index 7e198f9bce441c55913615e4c601a760d7e62c20..f73264cc2a07cacec5e7c4e31ce12299a1fadd0b 100644
>> --- a/gcc/config/arm/predicates.md
>> +++ b/gcc/config/arm/predicates.md
>> @@ -456,6 +456,24 @@
>>           (and (match_code "reg,subreg,mem")
>>              (match_operand 0 "nonimmediate_soft_df_operand"))))
>>
>> +;; Predicate for thumb2_movsf_vfp.  Compared to general_operand, this
>> +;; forbids constant loaded via literal pool iff literal pools are disabled.
>> +(define_predicate "hard_sf_operand"
>> +  (and (match_operand 0 "general_operand")
>> +       (ior (not (match_code "const_double"))
>> +           (not (match_test "arm_disable_literal_pool"))
>> +           (match_test "satisfies_constraint_Dv (op)"))))
>> +
>> +;; Predicate for thumb2_movdf_vfp.  Compared to soft_df_operand used in
>> +;; movdf_soft_insn, this forbids constant loaded via literal pool iff
>> +;; literal pools are disabled.
>> +(define_predicate "hard_df_operand"
>> +  (and (match_operand 0 "soft_df_operand")
>> +       (ior (not (match_code "const_double"))
>> +           (not (match_test "arm_disable_literal_pool"))
>> +           (match_test "satisfies_constraint_Dy (op)")
>> +           (match_test "satisfies_constraint_G (op)"))))
>> +
>>    (define_special_predicate "load_multiple_operation"
>>      (match_code "parallel")
>>    {
>> diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
>> index c42670f8643c3286bc5abf537d4fd0483cba68ac..727ceb9b37957efbc7ab8809f57e8825deb6b1df 100644
>> --- a/gcc/config/arm/thumb2.md
>> +++ b/gcc/config/arm/thumb2.md
>> @@ -252,16 +252,26 @@
>>      "TARGET_THUMB2 && !TARGET_IWMMXT && !TARGET_HARD_FLOAT
>>       && (   register_operand (operands[0], SImode)
>>           || register_operand (operands[1], SImode))"
>> -  "@
>> -   mov%?\\t%0, %1
>> -   mov%?\\t%0, %1
>> -   mov%?\\t%0, %1
>> -   mvn%?\\t%0, #%B1
>> -   movw%?\\t%0, %1
>> -   ldr%?\\t%0, %1
>> -   ldr%?\\t%0, %1
>> -   str%?\\t%1, %0
>> -   str%?\\t%1, %0"
>> +{
>> +  switch (which_alternative)
>> +    {
>> +    case 0:
>> +    case 1:
>> +    case 2:
>> +      return \"mov%?\\t%0, %1\";
>> +    case 3: return \"mvn%?\\t%0, #%B1\";
>> +    case 4: return \"movw%?\\t%0, %1\";
>> +    case 5:
>> +    case 6:
>> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
>> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
>> +       return \"#\";
>> +      return \"ldr%?\\t%0, %1\";
>> +    case 7:
>> +    case 8: return \"str%?\\t%1, %0\";
>> +    default: gcc_unreachable ();
>> +    }
>> +}
>>      [(set_attr "type" "mov_reg,mov_imm,mov_imm,mvn_imm,mov_imm,load_4,load_4,store_4,store_4")
>>       (set_attr "length" "2,4,2,4,4,4,4,4,4")
>>       (set_attr "predicable" "yes")
>> diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
>> index 611ebe2d83698e3129df6c55a03e4f5f33c891e7..f3d4f30cb53d82e2ffd2c4fcaad2cc873d97c24b 100644
>> --- a/gcc/config/arm/vfp.md
>> +++ b/gcc/config/arm/vfp.md
>> @@ -259,7 +259,7 @@
>>    ;; arm_restrict_it.
>>    (define_insn "*thumb2_movsi_vfp"
>>      [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r, l,*hk,m, *m,*t, r,*t,*t,  *Uv")
>> -       (match_operand:SI 1 "general_operand"      "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*Uvi,*t"))]
>> +       (match_operand:SI 1 "general_operand"      "rk,I,Py,K,j,mi,*mi,l,*hk, r,*t,*t,*UvTu,*t"))]
>>      "TARGET_THUMB2 && TARGET_HARD_FLOAT
>>       && (   s_register_operand (operands[0], SImode)
>>           || s_register_operand (operands[1], SImode))"
>> @@ -276,6 +276,9 @@
>>          return \"movw%?\\t%0, %1\";
>>        case 5:
>>        case 6:
>> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
>> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
>> +       return \"#\";
>>          return \"ldr%?\\t%0, %1\";
>>        case 7:
>>        case 8:
>> @@ -305,7 +308,7 @@
>>
>>    (define_insn "*movdi_vfp"
>>      [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,q,q,m,w,!r,w,w, Uv")
>> -       (match_operand:DI 1 "di_operand"              "r,rDa,Db,Dc,mi,mi,q,r,w,w,Uvi,w"))]
>> +       (match_operand:DI 1 "di_operand"              "r,rDa,Db,Dc,mi,mi,q,r,w,w,UvTu,w"))]
>>      "TARGET_32BIT && TARGET_HARD_FLOAT
>>       && (   register_operand (operands[0], DImode)
>>           || register_operand (operands[1], DImode))
>> @@ -321,6 +324,10 @@
>>          return \"#\";
>>        case 4:
>>        case 5:
>> +      /* Cannot load it directly, split to load it via MOV / MOVT.  */
>> +      if (!MEM_P (operands[1]) && arm_disable_literal_pool)
>> +       return \"#\";
>> +      /* Fall through.  */
>>        case 6:
>>          return output_move_double (operands, true, NULL);
>>        case 7:
>> @@ -587,7 +594,7 @@
>>
>>    (define_insn "*thumb2_movsf_vfp"
>>      [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t  ,Uv,r ,m,t,r")
>> -       (match_operand:SF 1 "general_operand"      " ?r,t,Dv,UvE,t, mE,r,t,r"))]
>> +       (match_operand:SF 1 "hard_sf_operand"      " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
>>      "TARGET_THUMB2 && TARGET_HARD_FLOAT
>>       && (   s_register_operand (operands[0], SFmode)
>>           || s_register_operand (operands[1], SFmode))"
>> @@ -676,7 +683,7 @@
>>
>>    (define_insn "*thumb2_movdf_vfp"
>>      [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w  ,Uv,r ,m,w,r")
>> -       (match_operand:DF 1 "soft_df_operand"              " ?r,w,Dy,G,UvF,w, mF,r, w,r"))]
>> +       (match_operand:DF 1 "hard_df_operand"              " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
>>      "TARGET_THUMB2 && TARGET_HARD_FLOAT
>>       && (   register_operand (operands[0], DFmode)
>>           || register_operand (operands[1], DFmode))"
>> @@ -1983,39 +1990,50 @@
>>    ;; Support for xD (single precision only) variants.
>>    ;; fmrrs, fmsrr
>>
>> -;; Split an immediate DF move to two immediate SI moves.
>> +;; Load a DF immediate via GPR (where combinations of MOV and MOVT can be used)
>> +;; and then move it into a VFP register.
>>    (define_insn_and_split "no_literal_pool_df_immediate"
>> -  [(set (match_operand:DF 0 "s_register_operand" "")
>> -       (match_operand:DF 1 "const_double_operand" ""))]
>> -  "TARGET_THUMB2 && arm_disable_literal_pool
>> -  && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE
>> -       && vfp3_const_double_rtx (operands[1]))"
>> +  [(set (match_operand:DF 0 "s_register_operand" "=w")
>> +       (match_operand:DF 1 "const_double_operand" "F"))
>> +   (clobber (match_operand:DF 2 "s_register_operand" "=r"))]
>> +  "arm_disable_literal_pool
>> +   && TARGET_HARD_FLOAT
>> +   && !arm_const_double_rtx (operands[1])
>> +   && !(TARGET_VFP_DOUBLE && vfp3_const_double_rtx (operands[1]))"
>>      "#"
>> -  "&& !reload_completed"
>> -  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
>> -   (set (subreg:SI (match_dup 1) 4) (match_dup 3))
>> -   (set (match_dup 0) (match_dup 1))]
>> -  "
>> +  ""
>> +  [(const_int 0)]
>> +{
>>      long buf[2];
>> +  int order = BYTES_BIG_ENDIAN ? 1 : 0;
>>      real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode);
>> -  operands[2] = GEN_INT ((int) buf[0]);
>> -  operands[3] = GEN_INT ((int) buf[1]);
>> -  operands[1] = gen_reg_rtx (DFmode);
>> -  ")
>> +  unsigned HOST_WIDE_INT ival = zext_hwi (buf[order], 32);
>> +  ival |= (zext_hwi (buf[1 - order], 32) << 32);
>> +  rtx cst = gen_int_mode (ival, DImode);
>> +  emit_move_insn (simplify_gen_subreg (DImode, operands[2], DFmode, 0), cst);
>> +  emit_move_insn (operands[0], operands[2]);
>> +  DONE;
>> +}
>> +)
>>
>> -;; Split an immediate SF move to one immediate SI move.
>> +;; Load a SF immediate via GPR (where combinations of MOV and MOVT can be used)
>> +;; and then move it into a VFP register.
>>    (define_insn_and_split "no_literal_pool_sf_immediate"
>> -  [(set (match_operand:SF 0 "s_register_operand" "")
>> -       (match_operand:SF 1 "const_double_operand" ""))]
>> -  "TARGET_THUMB2 && arm_disable_literal_pool
>> -  && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))"
>> +  [(set (match_operand:SF 0 "s_register_operand" "=t")
>> +       (match_operand:SF 1 "const_double_operand" "E"))
>> +   (clobber (match_operand:SF 2 "s_register_operand" "=r"))]
>> +  "arm_disable_literal_pool
>> +   && TARGET_HARD_FLOAT
>> +   && !vfp3_const_double_rtx (operands[1])"
>>      "#"
>> -  "&& !reload_completed"
>> -  [(set (subreg:SI (match_dup 1) 0) (match_dup 2))
>> -   (set (match_dup 0) (match_dup 1))]
>> -  "
>> +  ""
>> +  [(const_int 0)]
>> +{
>>      long buf;
>>      real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode);
>> -  operands[2] = GEN_INT ((int) buf);
>> -  operands[1] = gen_reg_rtx (SFmode);
>> -  ")
>> +  rtx cst = gen_int_mode (buf, SImode);
>> +  emit_move_insn (simplify_gen_subreg (SImode, operands[2], SFmode, 0), cst);
>> +  emit_move_insn (operands[0], operands[2]);
>> +  DONE;
>> +}
>> +)
>> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
>> index 90bd44e27e5c53d34f2816f4d6320acbc1dc709b..231243759cfe486c390ca27f10bd06177f60bd43 100644
>> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
>> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c
>> @@ -1,6 +1,7 @@
>>    /* { dg-do compile } */
>>    /* { dg-require-effective-target arm_cortex_m } */
>>    /* { dg-require-effective-target arm_thumb2_ok } */
>> +/* { dg-require-effective-target arm_fp_ok } */
>>    /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>>    /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>>    /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
>> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
>> index 5d9cd9c4df28837b81b2de48c25d38cdf2c15999..27e72ec20863866acdc5e7fea632bc6880678dfd 100644
>> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
>> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c
>> @@ -1,6 +1,7 @@
>>    /* { dg-do compile } */
>>    /* { dg-require-effective-target arm_cortex_m } */
>>    /* { dg-require-effective-target arm_thumb2_ok } */
>> +/* { dg-require-effective-target arm_fp_ok } */
>>    /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>>    /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>>    /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
>> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
>> index 0eeddd5e6ec1f42a96fc6220277f9ecb7cad44f5..8dbe87a1e68d5eb2edfd8259948988fbe0658ced 100644
>> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
>> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c
>> @@ -1,6 +1,7 @@
>>    /* { dg-do compile } */
>>    /* { dg-require-effective-target arm_cortex_m } */
>>    /* { dg-require-effective-target arm_thumb2_ok } */
>> +/* { dg-require-effective-target arm_fp_ok } */
>>    /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>>    /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>>    /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
>> diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
>> index 7d52f3801b6d4b62b27833871ac830d6d077894d..b98eb7624e42b5a7f4a11c604c7d2826339bcfd5 100644
>> --- a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
>> +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c
>> @@ -1,6 +1,7 @@
>>    /* { dg-do compile } */
>>    /* { dg-require-effective-target arm_cortex_m } */
>>    /* { dg-require-effective-target arm_thumb2_ok } */
>> +/* { dg-require-effective-target arm_fp_ok } */
>>    /* { dg-skip-if "avoid conflicts with multilib options" { *-*-* } { "-mcpu=*" } { "-mcpu=cortex-m4" "-mcpu=cortex-m7" } } */
>>    /* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */
>>    /* { dg-skip-if "-mslow-flash-data and -mword-relocations incompatible" { *-*-* } { "-mword-relocations" } } */
>>
>>
>>


Thread overview: 6+ messages
2018-11-19 17:56 [PATCH, ARM] Improve robustness of -mslow-flash-data Thomas Preudhomme
2018-11-20 10:23 ` Christophe Lyon
2018-11-26 10:01 ` [PATCH, ARM, ping] " Thomas Preudhomme
2018-11-30 14:11 ` [PATCH, ARM] " Kyrill Tkachov
2018-12-11 16:09   ` Thomas Preudhomme
2018-12-14 16:14     ` Kyrill Tkachov
