public inbox for gcc-patches@gcc.gnu.org
* [PATCH 0/5] LoongArch: Better balance between relaxation and scheduling
@ 2023-10-19 14:02 Xi Ruoyao
  2023-10-19 14:02 ` [PATCH 1/5] LoongArch: Add enum-style -mexplicit-relocs= option Xi Ruoyao
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Xi Ruoyao @ 2023-10-19 14:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: chenglulu, i, xuchenghua, mengqinggang, Xi Ruoyao

To enable linker relaxation we currently emit assembler macros for
symbolic addresses everywhere, but this limits instruction scheduling,
and there are known situations where relaxation cannot improve the code.

1. When we are performing LTO during a final link and the linker plugin
is used, la.global won't be relaxed because it references an external
or preemptable symbol.
2. The linker currently does not relax la.tls.*.
3. For la.local + ld/st pairs, if the address is only used once,
emitting pcalau12i + ld/st is never worse than relying on linker
relaxation.

Add -mexplicit-relocs=auto to allow the compiler to use explicit relocs
for these cases, but assembler macros for other cases.  Use it as the
default if the assembler supports both explicit relocs and relaxation.

LTO-bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?

Xi Ruoyao (5):
  LoongArch: Add enum-style -mexplicit-relocs= option
  LoongArch: Use explicit relocs for GOT access when
    -mexplicit-relocs=auto and LTO during a final link with linker
    plugin
  LoongArch: Use explicit relocs for TLS access with
    -mexplicit-relocs=auto
  LoongArch: Use explicit relocs for addresses only used for one load or
    store with -mexplicit-relocs=auto and -mcmodel={normal,medium}
  LoongArch: Document -mexplicit-relocs={auto,none,always}

 .../loongarch/genopts/loongarch-strings       |   6 +
 gcc/config/loongarch/genopts/loongarch.opt.in |  21 ++-
 gcc/config/loongarch/loongarch-def.h          |   6 +
 gcc/config/loongarch/loongarch-protos.h       |   1 +
 gcc/config/loongarch/loongarch-str.h          |   5 +
 gcc/config/loongarch/loongarch.cc             |  75 ++++++++--
 gcc/config/loongarch/loongarch.h              |   3 +
 gcc/config/loongarch/loongarch.md             | 128 +++++++++++++++++-
 gcc/config/loongarch/loongarch.opt            |  21 ++-
 gcc/config/loongarch/predicates.md            |  15 +-
 gcc/doc/invoke.texi                           |  37 +++--
 .../loongarch/explicit-relocs-auto-lto.c      |  26 ++++
 ...-relocs-auto-single-load-store-no-anchor.c |   6 +
 .../explicit-relocs-auto-single-load-store.c  |  14 ++
 .../explicit-relocs-auto-tls-ld-gd.c          |   9 ++
 .../explicit-relocs-auto-tls-le-ie.c          |   6 +
 16 files changed, 343 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-lto.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store-no-anchor.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-le-ie.c

-- 
2.42.0



* [PATCH 1/5] LoongArch: Add enum-style -mexplicit-relocs= option
  2023-10-19 14:02 [PATCH 0/5] LoongArch: Better balance between relaxation and scheduling Xi Ruoyao
@ 2023-10-19 14:02 ` Xi Ruoyao
  2023-10-19 14:02 ` [PATCH 2/5] LoongArch: Use explicit relocs for GOT access when -mexplicit-relocs=auto and LTO during a final link with linker plugin Xi Ruoyao
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Xi Ruoyao @ 2023-10-19 14:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: chenglulu, i, xuchenghua, mengqinggang, Xi Ruoyao

To strike a better balance between scheduling and relaxation when -flto
is enabled, add a three-way -mexplicit-relocs={auto,none,always} option.
The old -mexplicit-relocs and -mno-explicit-relocs options are still
supported; they are mapped to -mexplicit-relocs=always and
-mexplicit-relocs=none respectively.

The default choice is determined by probing assembler capabilities at
build time.  If the assembler does not support explicit relocs at all,
the default will be none; if it supports explicit relocs but not
relaxation, the default will be always; if both explicit relocs and
relaxation are supported, the default will be auto.
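
As an illustration only, the option resolution described above can be
modelled as a standalone C function.  This is a sketch, not the code in
the patch: the enum names, the tri-state legacy flag and the function
resolve_explicit_relocs are hypothetical, the two bool parameters stand
in for the configure-time assembler probes, and the conflict case simply
returns false where the real compiler reports an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the option values; not the patch's macros.  */
enum explicit_relocs { ER_UNSET = -1, ER_AUTO, ER_NONE, ER_ALWAYS };

/* Tri-state for the old-style -m[no-]explicit-relocs flag.  */
enum legacy_flag { LEGACY_UNSET = -1, LEGACY_NO = 0, LEGACY_YES = 1 };

/* Compute the effective setting into *OUT.  Returns false if both the
   new-style and the old-style option were given (a simplification of
   the diagnostic path).  */
static bool
resolve_explicit_relocs (enum explicit_relocs new_style,
                         enum legacy_flag legacy,
                         bool as_explicit_relocs, bool as_relax,
                         enum explicit_relocs *out)
{
  assert (out != NULL);

  if (new_style != ER_UNSET && legacy != LEGACY_UNSET)
    return false;

  /* Old-style flag: -mexplicit-relocs -> always, -mno-... -> none.  */
  if (legacy != LEGACY_UNSET)
    new_style = (legacy == LEGACY_YES) ? ER_ALWAYS : ER_NONE;

  /* Nothing given: pick the default from the assembler's capabilities.  */
  if (new_style == ER_UNSET)
    new_style = as_explicit_relocs
                ? (as_relax ? ER_AUTO : ER_ALWAYS)
                : ER_NONE;

  *out = new_style;
  return true;
}
```

With no option given and an assembler that supports both explicit
relocs and relaxation, the sketch yields ER_AUTO; an explicit old-style
flag overrides the probed default.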

Currently auto behaves the same as none.  Following changes will make
auto smarter.

gcc/ChangeLog:

	* config/loongarch/genopts/loongarch-strings: Add strings for
	-mexplicit-relocs={auto,none,always}.
	* config/loongarch/genopts/loongarch.opt.in: Add options for
	-mexplicit-relocs={auto,none,always}.
	* config/loongarch/loongarch-str.h: Regenerate.
	* config/loongarch/loongarch.opt: Regenerate.
	* config/loongarch/loongarch-def.h
	(EXPLICIT_RELOCS_AUTO): Define.
	(EXPLICIT_RELOCS_NONE): Define.
	(EXPLICIT_RELOCS_ALWAYS): Define.
	(N_EXPLICIT_RELOCS_TYPES): Define.
	* config/loongarch/loongarch.cc
	(loongarch_option_override_internal): Error out if the old-style
	-m[no-]explicit-relocs option is used with
	-mexplicit-relocs={auto,none,always} together.  Map
	-mno-explicit-relocs to -mexplicit-relocs=none and
	-mexplicit-relocs to -mexplicit-relocs=always for backward
	compatibility.  Set a proper default for -mexplicit-relocs=
	based on configure-time probed assembler capability.  Update a
	diagnostic message to mention -mexplicit-relocs=always instead
	of the old-style -mexplicit-relocs.
	(loongarch_handle_model_attribute): Update a diagnostic message
	to mention -mexplicit-relocs=always instead of the old-style
	-mexplicit-relocs.
	* config/loongarch/loongarch.h (TARGET_EXPLICIT_RELOCS): Define.
---
 .../loongarch/genopts/loongarch-strings       |  6 +++++
 gcc/config/loongarch/genopts/loongarch.opt.in | 21 ++++++++++++++--
 gcc/config/loongarch/loongarch-def.h          |  6 +++++
 gcc/config/loongarch/loongarch-str.h          |  5 ++++
 gcc/config/loongarch/loongarch.cc             | 24 +++++++++++++++++--
 gcc/config/loongarch/loongarch.h              |  3 +++
 gcc/config/loongarch/loongarch.opt            | 21 ++++++++++++++--
 7 files changed, 80 insertions(+), 6 deletions(-)

diff --git a/gcc/config/loongarch/genopts/loongarch-strings b/gcc/config/loongarch/genopts/loongarch-strings
index adecaec3eda..8e412f7536e 100644
--- a/gcc/config/loongarch/genopts/loongarch-strings
+++ b/gcc/config/loongarch/genopts/loongarch-strings
@@ -63,3 +63,9 @@ STR_CMODEL_TS	      tiny-static
 STR_CMODEL_MEDIUM     medium
 STR_CMODEL_LARGE      large
 STR_CMODEL_EXTREME    extreme
+
+# -mexplicit-relocs
+OPTSTR_EXPLICIT_RELOCS		explicit-relocs
+STR_EXPLICIT_RELOCS_AUTO	auto
+STR_EXPLICIT_RELOCS_NONE	none
+STR_EXPLICIT_RELOCS_ALWAYS	always
diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in b/gcc/config/loongarch/genopts/loongarch.opt.in
index 4a2d7438f1b..e1fe0c7086e 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -170,10 +170,27 @@ mmax-inline-memcpy-size=
 Target Joined RejectNegative UInteger Var(loongarch_max_inline_memcpy_size) Init(1024)
 -mmax-inline-memcpy-size=SIZE	Set the max size of memcpy to inline, default is 1024.
 
-mexplicit-relocs
-Target Var(TARGET_EXPLICIT_RELOCS) Init(HAVE_AS_EXPLICIT_RELOCS & !HAVE_AS_MRELAX_OPTION)
+Enum
+Name(explicit_relocs) Type(int)
+The code model option names for -mexplicit-relocs:
+
+EnumValue
+Enum(explicit_relocs) String(@@STR_EXPLICIT_RELOCS_AUTO@@) Value(EXPLICIT_RELOCS_AUTO)
+
+EnumValue
+Enum(explicit_relocs) String(@@STR_EXPLICIT_RELOCS_NONE@@) Value(EXPLICIT_RELOCS_NONE)
+
+EnumValue
+Enum(explicit_relocs) String(@@STR_EXPLICIT_RELOCS_ALWAYS@@) Value(EXPLICIT_RELOCS_ALWAYS)
+
+mexplicit-relocs=
+Target RejectNegative Joined Enum(explicit_relocs) Var(la_opt_explicit_relocs) Init(M_OPT_UNSET)
 Use %reloc() assembly operators.
 
+mexplicit-relocs
+Target Var(la_opt_explicit_relocs_backward) Init(M_OPT_UNSET)
+Use %reloc() assembly operators (for backward compatibility).
+
 ; The code model option names for -mcmodel.
 Enum
 Name(cmodel) Type(int)
diff --git a/gcc/config/loongarch/loongarch-def.h b/gcc/config/loongarch/loongarch-def.h
index 769efcb70fb..6e2a6987910 100644
--- a/gcc/config/loongarch/loongarch-def.h
+++ b/gcc/config/loongarch/loongarch-def.h
@@ -99,6 +99,12 @@ extern const char* loongarch_cmodel_strings[];
 #define CMODEL_EXTREME	      5
 #define N_CMODEL_TYPES	      6
 
+/* enum explicit_relocs */
+#define EXPLICIT_RELOCS_AUTO	0
+#define EXPLICIT_RELOCS_NONE	1
+#define EXPLICIT_RELOCS_ALWAYS	2
+#define N_EXPLICIT_RELOCS_TYPES	3
+
 /* The common default value for variables whose assignments
    are triggered by command-line options.  */
 
diff --git a/gcc/config/loongarch/loongarch-str.h b/gcc/config/loongarch/loongarch-str.h
index a3e0510493b..072558c28f1 100644
--- a/gcc/config/loongarch/loongarch-str.h
+++ b/gcc/config/loongarch/loongarch-str.h
@@ -62,4 +62,9 @@ along with GCC; see the file COPYING3.  If not see
 #define STR_CMODEL_LARGE "large"
 #define STR_CMODEL_EXTREME "extreme"
 
+#define OPTSTR_EXPLICIT_RELOCS "explicit-relocs"
+#define STR_EXPLICIT_RELOCS_AUTO "auto"
+#define STR_EXPLICIT_RELOCS_NONE "none"
+#define STR_EXPLICIT_RELOCS_ALWAYS "always"
+
 #endif /* LOONGARCH_STR_H */
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 73f0c160e5f..5df8b12ed92 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -7387,6 +7387,25 @@ loongarch_option_override_internal (struct gcc_options *opts,
   loongarch_update_gcc_opt_status (&la_target, opts, opts_set);
   loongarch_cpu_option_override (&la_target, opts, opts_set);
 
+  if (la_opt_explicit_relocs != M_OPT_UNSET
+      && la_opt_explicit_relocs_backward != M_OPT_UNSET)
+    error ("do not use %qs (with %qs) and %qs (without %qs) together",
+	   "-mexplicit-relocs=", "=",
+	   la_opt_explicit_relocs_backward ? "-mexplicit-relocs"
+					   : "-mno-explicit-relocs", "=");
+
+  if (la_opt_explicit_relocs_backward != M_OPT_UNSET)
+    la_opt_explicit_relocs = (la_opt_explicit_relocs_backward
+			      ? EXPLICIT_RELOCS_ALWAYS
+			      : EXPLICIT_RELOCS_NONE);
+
+  if (la_opt_explicit_relocs == M_OPT_UNSET)
+    la_opt_explicit_relocs = (HAVE_AS_EXPLICIT_RELOCS
+			      ? (HAVE_AS_MRELAX_OPTION
+				 ? EXPLICIT_RELOCS_AUTO
+				 : EXPLICIT_RELOCS_ALWAYS)
+			      : EXPLICIT_RELOCS_NONE);
+
   if (TARGET_ABI_LP64)
     flag_pcc_struct_return = 0;
 
@@ -7417,7 +7436,7 @@ loongarch_option_override_internal (struct gcc_options *opts,
       case CMODEL_EXTREME:
 	if (!TARGET_EXPLICIT_RELOCS)
 	  error ("code model %qs needs %s",
-		 "extreme", "-mexplicit-relocs");
+		 "extreme", "-mexplicit-relocs=always");
 
 	if (opts->x_flag_plt)
 	  {
@@ -7721,7 +7740,8 @@ loongarch_handle_model_attribute (tree *node, tree name, tree arg, int,
       if (!TARGET_EXPLICIT_RELOCS)
 	{
 	  error_at (DECL_SOURCE_LOCATION (decl),
-		    "%qE attribute requires %s", name, "-mexplicit-relocs");
+		    "%qE attribute requires %s", name,
+		    "-mexplicit-relocs=always");
 	  *no_add_attrs = true;
 	  return NULL_TREE;
 	}
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index da3ec2add9a..184e8bea6a5 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -1231,3 +1231,6 @@ struct GTY (()) machine_function
   (TARGET_HARD_FLOAT_ABI ? (TARGET_DOUBLE_FLOAT_ABI ? 8 : 4) : 0)
 
 #define FUNCTION_VALUE_REGNO_P(N) ((N) == GP_RETURN || (N) == FP_RETURN)
+
+#define TARGET_EXPLICIT_RELOCS \
+  (la_opt_explicit_relocs == EXPLICIT_RELOCS_ALWAYS)
diff --git a/gcc/config/loongarch/loongarch.opt b/gcc/config/loongarch/loongarch.opt
index 6215abcac04..02946608327 100644
--- a/gcc/config/loongarch/loongarch.opt
+++ b/gcc/config/loongarch/loongarch.opt
@@ -177,10 +177,27 @@ mmax-inline-memcpy-size=
 Target Joined RejectNegative UInteger Var(loongarch_max_inline_memcpy_size) Init(1024)
 -mmax-inline-memcpy-size=SIZE	Set the max size of memcpy to inline, default is 1024.
 
-mexplicit-relocs
-Target Var(TARGET_EXPLICIT_RELOCS) Init(HAVE_AS_EXPLICIT_RELOCS & !HAVE_AS_MRELAX_OPTION)
+Enum
+Name(explicit_relocs) Type(int)
+The code model option names for -mexplicit-relocs:
+
+EnumValue
+Enum(explicit_relocs) String(auto) Value(EXPLICIT_RELOCS_AUTO)
+
+EnumValue
+Enum(explicit_relocs) String(none) Value(EXPLICIT_RELOCS_NONE)
+
+EnumValue
+Enum(explicit_relocs) String(always) Value(EXPLICIT_RELOCS_ALWAYS)
+
+mexplicit-relocs=
+Target RejectNegative Joined Enum(explicit_relocs) Var(la_opt_explicit_relocs) Init(M_OPT_UNSET)
 Use %reloc() assembly operators.
 
+mexplicit-relocs
+Target Var(la_opt_explicit_relocs_backward) Init(M_OPT_UNSET)
+Use %reloc() assembly operators (for backward compatibility).
+
 ; The code model option names for -mcmodel.
 Enum
 Name(cmodel) Type(int)
-- 
2.42.0



* [PATCH 2/5] LoongArch: Use explicit relocs for GOT access when -mexplicit-relocs=auto and LTO during a final link with linker plugin
  2023-10-19 14:02 [PATCH 0/5] LoongArch: Better balance between relaxation and scheduling Xi Ruoyao
  2023-10-19 14:02 ` [PATCH 1/5] LoongArch: Add enum-style -mexplicit-relocs= option Xi Ruoyao
@ 2023-10-19 14:02 ` Xi Ruoyao
  2023-10-21  7:32   ` chenglulu
  2023-10-19 14:02 ` [PATCH 3/5] LoongArch: Use explicit relocs for TLS access with -mexplicit-relocs=auto Xi Ruoyao
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 10+ messages in thread
From: Xi Ruoyao @ 2023-10-19 14:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: chenglulu, i, xuchenghua, mengqinggang, Xi Ruoyao

If we are performing LTO for a final link and the linker plugin is
enabled, then we are sure any GOT access may resolve to a symbol outside
the link unit (otherwise the linker plugin will tell us the symbol should
be resolved locally and we'll use PC-relative access instead).

Produce machine instructions with explicit relocs instead of la.global
for better scheduling.
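
The condition can be read as "LTO back end, final link, linker plugin in
use".  A minimal standalone model, assuming a hypothetical struct
lto_state whose fields stand in for in_lto_p, flag_incremental_link,
HAVE_LTO_PLUGIN == 2 and the set/value pair of -fuse-linker-plugin (none
of these names are from the patch):

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state the decision depends on.  */
struct lto_state
{
  bool in_lto;            /* Running as the LTO back end.  */
  bool incremental_link;  /* Incremental link, so not a final link.  */
  bool plugin_built;      /* Stand-in for HAVE_LTO_PLUGIN == 2.  */
  bool plugin_flag_set;   /* -f[no-]use-linker-plugin given explicitly.  */
  bool plugin_flag_value; /* Its value when given.  */
};

/* True if a GOT access should use explicit relocs under "auto": the
   plugin has already told us which symbols bind locally, so whatever
   still goes through the GOT cannot be relaxed.  */
static bool
got_needs_explicit_relocs (const struct lto_state *s)
{
  /* The plugin counts as in use unless the user turned it off.  */
  bool plugin_in_use = s->plugin_built
                       && (!s->plugin_flag_set || s->plugin_flag_value);

  return s->in_lto && !s->incremental_link && plugin_in_use;
}
```

Note how an unset plugin flag is treated the same as an explicit
-fuse-linker-plugin; only an explicit -fno-use-linker-plugin falls back
to la.global in this model.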

gcc/ChangeLog:

	* config/loongarch/loongarch-protos.h
	(loongarch_explicit_relocs_p): Declare new function.
	* config/loongarch/loongarch.cc (loongarch_explicit_relocs_p):
	Implement.
	(loongarch_symbol_insns): Call loongarch_explicit_relocs_p for
	SYMBOL_GOT_DISP, instead of using TARGET_EXPLICIT_RELOCS.
	(loongarch_split_symbol): Call loongarch_explicit_relocs_p to
	decide whether to return early, instead of using
	TARGET_EXPLICIT_RELOCS.
	(loongarch_output_move): Call loongarch_explicit_relocs_p
	instead of using TARGET_EXPLICIT_RELOCS.
	* config/loongarch/loongarch.md (*low<mode>): Remove
	TARGET_EXPLICIT_RELOCS from insn condition.
	(@ld_from_got<mode>): Likewise.
	* config/loongarch/predicates.md (move_operand): Call
	loongarch_explicit_relocs_p instead of using
	TARGET_EXPLICIT_RELOCS.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/explicit-relocs-auto-lto.c: New test.
---
 gcc/config/loongarch/loongarch-protos.h       |  1 +
 gcc/config/loongarch/loongarch.cc             | 34 +++++++++++++++----
 gcc/config/loongarch/loongarch.md             |  4 +--
 gcc/config/loongarch/predicates.md            |  8 ++---
 .../loongarch/explicit-relocs-auto-lto.c      | 26 ++++++++++++++
 5 files changed, 59 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-lto.c

diff --git a/gcc/config/loongarch/loongarch-protos.h b/gcc/config/loongarch/loongarch-protos.h
index 72ae9918b09..cb8fc36b086 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -220,4 +220,5 @@ extern rtx loongarch_gen_const_int_vector_shuffle (machine_mode, int);
 extern tree loongarch_build_builtin_va_list (void);
 
 extern rtx loongarch_build_signbit_mask (machine_mode, bool, bool);
+extern bool loongarch_explicit_relocs_p (enum loongarch_symbol_type);
 #endif /* ! GCC_LOONGARCH_PROTOS_H */
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 5df8b12ed92..c12d77ea144 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -1925,6 +1925,29 @@ loongarch_symbolic_constant_p (rtx x, enum loongarch_symbol_type *symbol_type)
   gcc_unreachable ();
 }
 
+/* If -mexplicit-relocs=auto, we use machine operations with reloc hints
+   for cases where the linker is unable to relax so we can schedule the
+   machine operations, otherwise use an assembler pseudo-op so the
+   assembler will generate R_LARCH_RELAX.  */
+
+bool
+loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
+{
+  if (la_opt_explicit_relocs != EXPLICIT_RELOCS_AUTO)
+    return la_opt_explicit_relocs == EXPLICIT_RELOCS_ALWAYS;
+
+  /* If we are performing LTO for a final link, and we have the linker
+     plugin so we know the resolution of the symbols, then all GOT
+     references are binding to external symbols or preemptable symbols.
+     So the linker cannot relax them.  */
+  return (in_lto_p
+	  && !flag_incremental_link
+	  && HAVE_LTO_PLUGIN == 2
+	  && (!global_options_set.x_flag_use_linker_plugin
+	      || global_options.x_flag_use_linker_plugin)
+	  && type == SYMBOL_GOT_DISP);
+}
+
 /* Returns the number of instructions necessary to reference a symbol.  */
 
 static int
@@ -1940,7 +1963,7 @@ loongarch_symbol_insns (enum loongarch_symbol_type type, machine_mode mode)
     case SYMBOL_GOT_DISP:
       /* The constant will have to be loaded from the GOT before it
 	 is used in an address.  */
-      if (!TARGET_EXPLICIT_RELOCS && mode != MAX_MACHINE_MODE)
+      if (!loongarch_explicit_relocs_p (type) && mode != MAX_MACHINE_MODE)
 	return 0;
 
       return 3;
@@ -3038,7 +3061,7 @@ loongarch_symbol_extreme_p (enum loongarch_symbol_type type)
    If so, and if LOW_OUT is nonnull, emit the high part and store the
    low part in *LOW_OUT.  Leave *LOW_OUT unchanged otherwise.
 
-   Return false if build with '-mno-explicit-relocs'.
+   Return false if build with '-mexplicit-relocs=none'.
 
    TEMP is as for loongarch_force_temporary and is used to load the high
    part into a register.
@@ -3052,12 +3075,9 @@ loongarch_split_symbol (rtx temp, rtx addr, machine_mode mode, rtx *low_out)
 {
   enum loongarch_symbol_type symbol_type;
 
-  /* If build with '-mno-explicit-relocs', don't split symbol.  */
-  if (!TARGET_EXPLICIT_RELOCS)
-    return false;
-
   if ((GET_CODE (addr) == HIGH && mode == MAX_MACHINE_MODE)
       || !loongarch_symbolic_constant_p (addr, &symbol_type)
+      || !loongarch_explicit_relocs_p (symbol_type)
       || loongarch_symbol_insns (symbol_type, mode) == 0
       || !loongarch_split_symbol_type (symbol_type))
     return false;
@@ -4797,7 +4817,7 @@ loongarch_output_move (rtx dest, rtx src)
 	}
     }
 
-  if (!TARGET_EXPLICIT_RELOCS
+  if (!loongarch_explicit_relocs_p (loongarch_classify_symbol (src))
       && dest_code == REG && symbolic_operand (src, VOIDmode))
     {
       if (loongarch_classify_symbol (src) == SYMBOL_PCREL)
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 365b4127e31..bec73f1bc91 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2247,7 +2247,7 @@ (define_insn "*low<mode>"
   [(set (match_operand:P 0 "register_operand" "=r")
  (lo_sum:P (match_operand:P 1 "register_operand" " r")
      (match_operand:P 2 "symbolic_operand" "")))]
-  "TARGET_EXPLICIT_RELOCS"
+  ""
   "addi.<d>\t%0,%1,%L2"
   [(set_attr "type" "arith")
    (set_attr "mode" "<MODE>")])
@@ -2275,7 +2275,7 @@ (define_insn "@ld_from_got<mode>"
 				(match_operand:P 1 "register_operand" "r")
 				(match_operand:P 2 "symbolic_operand")))]
 	UNSPEC_LOAD_FROM_GOT))]
-  "TARGET_EXPLICIT_RELOCS"
+  ""
   "ld.<d>\t%0,%1,%L2"
   [(set_attr "type" "move")]
 )
diff --git a/gcc/config/loongarch/predicates.md b/gcc/config/loongarch/predicates.md
index 499518b82ba..359878f5bcf 100644
--- a/gcc/config/loongarch/predicates.md
+++ b/gcc/config/loongarch/predicates.md
@@ -541,16 +541,14 @@ (define_predicate "move_operand"
     case SYMBOL_REF:
     case LABEL_REF:
       return (loongarch_symbolic_constant_p (op, &symbol_type)
-	      && (!TARGET_EXPLICIT_RELOCS
+	      && (!loongarch_explicit_relocs_p (symbol_type)
 		  || !loongarch_split_symbol_type (symbol_type)));
 
     case HIGH:
-      /* '-mno-explicit-relocs' don't generate high/low pairs.  */
-      if (!TARGET_EXPLICIT_RELOCS)
-	return false;
-
       op = XEXP (op, 0);
+
       return (loongarch_symbolic_constant_p (op, &symbol_type)
+	      && loongarch_explicit_relocs_p (symbol_type)
 	      && loongarch_split_symbol_type (symbol_type));
 
     default:
diff --git a/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-lto.c b/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-lto.c
new file mode 100644
index 00000000000..f53b5468924
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-lto.c
@@ -0,0 +1,26 @@
+/* { dg-do link } */
+/* { dg-require-effective-target lto } */
+/* { dg-require-linker-plugin "" } */
+/* { dg-options "-fpic -shared -O2 --save-temps -mexplicit-relocs=auto -flto -fuse-linker-plugin -flto-partition=one" } */
+
+int pcrel __attribute__ ((visibility ("hidden")));
+int got __attribute__ ((visibility ("default")));
+
+int
+*addr_pcrel (void)
+{
+  return &pcrel;
+}
+
+int
+*addr_got (void)
+{
+  return &got;
+}
+
+/* With linker plugin we should use la.local (it can be relaxed to pcaddi),
+   but not la.global (we are pretty sure the linker cannot relax la.global
+   got).  */
+/* { dg-final { scan-lto-assembler "la.local.*pcrel" } } */
+/* { dg-final { scan-lto-assembler "pcalau12i.*%got_pc_hi20\\\(got\\\)" } } */
+/* { dg-final { scan-lto-assembler "ld.*%got_pc_lo12\\\(got\\\)" } } */
-- 
2.42.0



* [PATCH 3/5] LoongArch: Use explicit relocs for TLS access with -mexplicit-relocs=auto
  2023-10-19 14:02 [PATCH 0/5] LoongArch: Better balance between relaxation and scheduling Xi Ruoyao
  2023-10-19 14:02 ` [PATCH 1/5] LoongArch: Add enum-style -mexplicit-relocs= option Xi Ruoyao
  2023-10-19 14:02 ` [PATCH 2/5] LoongArch: Use explicit relocs for GOT access when -mexplicit-relocs=auto and LTO during a final link with linker plugin Xi Ruoyao
@ 2023-10-19 14:02 ` Xi Ruoyao
  2023-10-19 14:02 ` [PATCH 4/5] LoongArch: Use explicit relocs for addresses only used for one load or store with -mexplicit-relocs=auto and -mcmodel={normal,medium} Xi Ruoyao
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Xi Ruoyao @ 2023-10-19 14:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: chenglulu, i, xuchenghua, mengqinggang, Xi Ruoyao

The linker does not know how to relax TLS access for LoongArch, so let's
emit machine instructions with explicit relocs for TLS.

gcc/ChangeLog:

	* config/loongarch/loongarch.cc (loongarch_explicit_relocs_p):
	Return true for TLS symbol types if -mexplicit-relocs=auto.
	(loongarch_call_tls_get_addr): Replace TARGET_EXPLICIT_RELOCS
	with la_opt_explicit_relocs != EXPLICIT_RELOCS_NONE.
	(loongarch_legitimize_tls_address): Likewise.
	* config/loongarch/loongarch.md (@tls_low<mode>): Remove
	TARGET_EXPLICIT_RELOCS from insn condition.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c: New
	test.
	* gcc.target/loongarch/explicit-relocs-auto-tls-le-ie.c: New
	test.
---
 gcc/config/loongarch/loongarch.cc             | 37 ++++++++++++-------
 gcc/config/loongarch/loongarch.md             |  2 +-
 .../explicit-relocs-auto-tls-ld-gd.c          |  9 +++++
 .../explicit-relocs-auto-tls-le-ie.c          |  6 +++
 4 files changed, 40 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-le-ie.c

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index c12d77ea144..c782f571abc 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -1936,16 +1936,27 @@ loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
   if (la_opt_explicit_relocs != EXPLICIT_RELOCS_AUTO)
     return la_opt_explicit_relocs == EXPLICIT_RELOCS_ALWAYS;
 
-  /* If we are performing LTO for a final link, and we have the linker
-     plugin so we know the resolution of the symbols, then all GOT
-     references are binding to external symbols or preemptable symbols.
-     So the linker cannot relax them.  */
-  return (in_lto_p
-	  && !flag_incremental_link
-	  && HAVE_LTO_PLUGIN == 2
-	  && (!global_options_set.x_flag_use_linker_plugin
-	      || global_options.x_flag_use_linker_plugin)
-	  && type == SYMBOL_GOT_DISP);
+  switch (type)
+    {
+      case SYMBOL_TLS_IE:
+      case SYMBOL_TLS_LE:
+      case SYMBOL_TLSGD:
+      case SYMBOL_TLSLDM:
+	/* The linker don't know how to relax TLS accesses.  */
+	return true;
+      case SYMBOL_GOT_DISP:
+	/* If we are performing LTO for a final link, and we have the
+	   linker plugin so we know the resolution of the symbols, then
+	   all GOT references are binding to external symbols or
+	   preemptable symbols.  So the linker cannot relax them.  */
+	return (in_lto_p
+		&& !flag_incremental_link
+		&& HAVE_LTO_PLUGIN == 2
+		&& (!global_options_set.x_flag_use_linker_plugin
+		    || global_options.x_flag_use_linker_plugin));
+      default:
+	return false;
+    }
 }
 
 /* Returns the number of instructions necessary to reference a symbol.  */
@@ -2753,7 +2764,7 @@ loongarch_call_tls_get_addr (rtx sym, enum loongarch_symbol_type type, rtx v0)
 
   start_sequence ();
 
-  if (TARGET_EXPLICIT_RELOCS)
+  if (la_opt_explicit_relocs != EXPLICIT_RELOCS_NONE)
     {
       /* Split tls symbol to high and low.  */
       rtx high = gen_rtx_HIGH (Pmode, copy_rtx (loc));
@@ -2918,7 +2929,7 @@ loongarch_legitimize_tls_address (rtx loc)
 	  tp = gen_rtx_REG (Pmode, THREAD_POINTER_REGNUM);
 	  tmp1 = gen_reg_rtx (Pmode);
 	  dest = gen_reg_rtx (Pmode);
-	  if (TARGET_EXPLICIT_RELOCS)
+	  if (la_opt_explicit_relocs != EXPLICIT_RELOCS_NONE)
 	    {
 	      tmp2 = loongarch_unspec_address (loc, SYMBOL_TLS_IE);
 	      tmp3 = gen_reg_rtx (Pmode);
@@ -2955,7 +2966,7 @@ loongarch_legitimize_tls_address (rtx loc)
 	  tmp1 = gen_reg_rtx (Pmode);
 	  dest = gen_reg_rtx (Pmode);
 
-	  if (TARGET_EXPLICIT_RELOCS)
+	  if (la_opt_explicit_relocs != EXPLICIT_RELOCS_NONE)
 	    {
 	      tmp2 = loongarch_unspec_address (loc, SYMBOL_TLS_LE);
 	      tmp3 = gen_reg_rtx (Pmode);
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index bec73f1bc91..695c8eb9a6f 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -2257,7 +2257,7 @@ (define_insn "@tls_low<mode>"
 	(unspec:P [(mem:P (lo_sum:P (match_operand:P 1 "register_operand" "r")
 				    (match_operand:P 2 "symbolic_operand" "")))]
 	UNSPEC_TLS_LOW))]
-  "TARGET_EXPLICIT_RELOCS"
+  ""
   "addi.<d>\t%0,%1,%L2"
   [(set_attr "type" "arith")
    (set_attr "mode" "<MODE>")])
diff --git a/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c b/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c
new file mode 100644
index 00000000000..957ff98df62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fPIC -mexplicit-relocs=auto" } */
+
+__thread int a __attribute__((visibility("hidden")));
+extern __thread int b __attribute__((visibility("default")));
+
+int test() { return a + b; }
+
+/* { dg-final { scan-assembler-not "la.tls" { target tls_native } } } */
diff --git a/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-le-ie.c b/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-le-ie.c
new file mode 100644
index 00000000000..78898cfc6ab
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-le-ie.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mexplicit-relocs=auto" } */
+
+#include "explicit-relocs-auto-tls-ld-gd.c"
+
+/* { dg-final { scan-assembler-not "la.tls" { target tls_native } } } */
-- 
2.42.0



* [PATCH 4/5] LoongArch: Use explicit relocs for addresses only used for one load or store with -mexplicit-relocs=auto and -mcmodel={normal,medium}
  2023-10-19 14:02 [PATCH 0/5] LoongArch: Better balance between relaxation and scheduling Xi Ruoyao
                   ` (2 preceding siblings ...)
  2023-10-19 14:02 ` [PATCH 3/5] LoongArch: Use explicit relocs for TLS access with -mexplicit-relocs=auto Xi Ruoyao
@ 2023-10-19 14:02 ` Xi Ruoyao
  2023-10-19 14:03 ` [PATCH 5/5] LoongArch: Document -mexplicit-relocs={auto,none,always} Xi Ruoyao
  2023-10-23  7:34 ` Pushed: [PATCH 0/5] LoongArch: Better balance between relaxation and scheduling Xi Ruoyao
  5 siblings, 0 replies; 10+ messages in thread
From: Xi Ruoyao @ 2023-10-19 14:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: chenglulu, i, xuchenghua, mengqinggang, Xi Ruoyao

In these cases, if we use explicit relocs, we end up with 2
instructions:

    pcalau12i    t0, %pc_hi20(x)
    ld.d         t0, t0, %pc_lo12(x)

If we use the la.local pseudo-op, in the best scenario (x is in +/- 2MiB
range) we still have 2 instructions:

    pcaddi       t0, %pcrel_20(x)
    ld.d         t0, t0, 0

If x is out of the range we'll have 3 instructions.  So for these cases
just emit machine instructions with explicit relocs.
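
The instruction counts above can be checked with a small standalone
model.  It assumes that pcaddi adds a sign-extended 20-bit immediate,
shifted left by 2, to the PC, which gives the +/- 2MiB range mentioned
above for 4-byte-aligned targets.  The function names are hypothetical
and the model ignores every other condition the linker may impose on
relaxation.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a target DELTA bytes away from the PC is reachable by pcaddi:
   a 20-bit signed immediate scaled by 4 covers [-2 MiB, +2 MiB).  */
static bool
pcaddi_reaches (int64_t delta)
{
  const int64_t limit = (int64_t) 1 << 21;	/* 2 MiB */
  return delta >= -limit && delta < limit && delta % 4 == 0;
}

/* Instructions for a single load through la.local: pcaddi + ld if the
   linker can relax, pcalau12i + addi + ld otherwise.  */
static int
insns_with_la_local (int64_t delta)
{
  return pcaddi_reaches (delta) ? 2 : 3;
}

/* Instructions for the same load with explicit relocs:
   pcalau12i + ld, regardless of the distance.  */
static int
insns_with_explicit_relocs (void)
{
  return 2;
}
```

In this model insns_with_explicit_relocs () never exceeds
insns_with_la_local (delta), which is the argument for emitting the
explicit form when the address has no other use.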

gcc/ChangeLog:

	* config/loongarch/predicates.md (symbolic_pcrel_operand): New
	predicate.
	* config/loongarch/loongarch.md (define_peephole2): Optimize
	la.local + ld/st to pcalau12i + ld/st if the address is only used
	once, with -mexplicit-relocs=auto and -mcmodel=normal or medium.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/explicit-relocs-auto-single-load-store.c:
	New test.
	* gcc.target/loongarch/explicit-relocs-auto-single-load-store-no-anchor.c:
	New test.
---
 gcc/config/loongarch/loongarch.md             | 122 ++++++++++++++++++
 gcc/config/loongarch/predicates.md            |   7 +
 ...-relocs-auto-single-load-store-no-anchor.c |   6 +
 .../explicit-relocs-auto-single-load-store.c  |  14 ++
 4 files changed, 149 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store-no-anchor.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store.c

diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 695c8eb9a6f..13473472171 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -65,6 +65,7 @@ (define_c_enum "unspec" [
 
   UNSPEC_LOAD_FROM_GOT
   UNSPEC_PCALAU12I
+  UNSPEC_PCALAU12I_GR
   UNSPEC_ORI_L_LO12
   UNSPEC_LUI_L_HI20
   UNSPEC_LUI_H_LO20
@@ -2297,6 +2298,16 @@ (define_insn "@pcalau12i<mode>"
   "pcalau12i\t%0,%%pc_hi20(%1)"
   [(set_attr "type" "move")])
 
+;; @pcalau12i may be used for sibcall so it has a strict constraint.  This
+;; allows any general register as the operand.
+(define_insn "@pcalau12i_gr<mode>"
+  [(set (match_operand:P 0 "register_operand" "=r")
+       (unspec:P [(match_operand:P 1 "symbolic_operand" "")]
+       UNSPEC_PCALAU12I_GR))]
+  ""
+  "pcalau12i\t%0,%%pc_hi20(%1)"
+  [(set_attr "type" "move")])
+
 (define_insn "@ori_l_lo12<mode>"
   [(set (match_operand:P 0 "register_operand" "=r")
 	(unspec:P [(match_operand:P 1 "register_operand" "r")
@@ -3748,6 +3759,117 @@ (define_insn "loongarch_crcc_w_<size>_w"
   [(set_attr "type" "unknown")
    (set_attr "mode" "<MODE>")])
 
+;; With normal or medium code models, if the only use of a pc-relative
+;; address is for loading or storing a value, then relying on linker
+;; relaxation is not better than emitting the machine instruction directly.
+;; Even if the la.local pseudo op can be relaxed, we get:
+;;
+;;     pcaddi     $t0, %pcrel_20(x)
+;;     ld.d       $t0, $t0, 0
+;;
+;; There are still two instructions, same as using the machine instructions
+;; and explicit relocs:
+;;
+;;     pcalau12i  $t0, %pc_hi20(x)
+;;     ld.d       $t0, $t0, %pc_lo12(x)
+;;
+;; And if the pseudo op cannot be relaxed, we'll get a worse result (with
+;; 3 instructions).
+(define_peephole2
+  [(set (match_operand:P 0 "register_operand")
+	(match_operand:P 1 "symbolic_pcrel_operand"))
+   (set (match_operand:GPR 2 "register_operand")
+	(mem:GPR (match_dup 0)))]
+  "la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
+   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM) \
+   && (peep2_reg_dead_p (2, operands[0]) \
+       || REGNO (operands[0]) == REGNO (operands[2]))"
+  [(set (match_dup 2) (mem:GPR (lo_sum:P (match_dup 0) (match_dup 1))))]
+  {
+    emit_insn (gen_pcalau12i_gr<P:mode> (operands[0], operands[1]));
+  })
+
+(define_peephole2
+  [(set (match_operand:P 0 "register_operand")
+	(match_operand:P 1 "symbolic_pcrel_operand"))
+   (set (match_operand:GPR 2 "register_operand")
+	(mem:GPR (plus (match_dup 0)
+		       (match_operand 3 "const_int_operand"))))]
+  "la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
+   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM) \
+   && (peep2_reg_dead_p (2, operands[0]) \
+       || REGNO (operands[0]) == REGNO (operands[2]))"
+  [(set (match_dup 2) (mem:GPR (lo_sum:P (match_dup 0) (match_dup 1))))]
+  {
+    operands[1] = plus_constant (Pmode, operands[1], INTVAL (operands[3]));
+    emit_insn (gen_pcalau12i_gr<P:mode> (operands[0], operands[1]));
+  })
+
+(define_peephole2
+  [(set (match_operand:P 0 "register_operand")
+	(match_operand:P 1 "symbolic_pcrel_operand"))
+   (set (match_operand:GPR 2 "register_operand")
+	(any_extend:GPR (mem:SUBDI (match_dup 0))))]
+  "la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
+   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM) \
+   && (peep2_reg_dead_p (2, operands[0]) \
+       || REGNO (operands[0]) == REGNO (operands[2]))"
+  [(set (match_dup 2)
+	(any_extend:GPR (mem:SUBDI (lo_sum:P (match_dup 0)
+					     (match_dup 1)))))]
+  {
+    emit_insn (gen_pcalau12i_gr<P:mode> (operands[0], operands[1]));
+  })
+
+(define_peephole2
+  [(set (match_operand:P 0 "register_operand")
+	(match_operand:P 1 "symbolic_pcrel_operand"))
+   (set (match_operand:GPR 2 "register_operand")
+	(any_extend:GPR
+	  (mem:SUBDI (plus (match_dup 0)
+			   (match_operand 3 "const_int_operand")))))]
+  "la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
+   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM) \
+   && (peep2_reg_dead_p (2, operands[0]) \
+       || REGNO (operands[0]) == REGNO (operands[2]))"
+  [(set (match_dup 2)
+	(any_extend:GPR (mem:SUBDI (lo_sum:P (match_dup 0)
+					     (match_dup 1)))))]
+  {
+    operands[1] = plus_constant (Pmode, operands[1], INTVAL (operands[3]));
+    emit_insn (gen_pcalau12i_gr<P:mode> (operands[0], operands[1]));
+  })
+
+(define_peephole2
+  [(set (match_operand:P 0 "register_operand")
+	(match_operand:P 1 "symbolic_pcrel_operand"))
+   (set (mem:QHWD (match_dup 0))
+	(match_operand:QHWD 2 "register_operand"))]
+  "la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
+   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM) \
+   && (peep2_reg_dead_p (2, operands[0])) \
+   && REGNO (operands[0]) != REGNO (operands[2])"
+  [(set (mem:QHWD (lo_sum:P (match_dup 0) (match_dup 1))) (match_dup 2))]
+  {
+    emit_insn (gen_pcalau12i_gr<P:mode> (operands[0], operands[1]));
+  })
+
+(define_peephole2
+  [(set (match_operand:P 0 "register_operand")
+	(match_operand:P 1 "symbolic_pcrel_operand"))
+   (set (mem:QHWD (plus (match_dup 0)
+			(match_operand 3 "const_int_operand")))
+	(match_operand:QHWD 2 "register_operand"))]
+  "la_opt_explicit_relocs == EXPLICIT_RELOCS_AUTO \
+   && (TARGET_CMODEL_NORMAL || TARGET_CMODEL_MEDIUM) \
+   && (peep2_reg_dead_p (2, operands[0])) \
+   && REGNO (operands[0]) != REGNO (operands[2])"
+  [(set (mem:QHWD (lo_sum:P (match_dup 0) (match_dup 1))) (match_dup 2))]
+  {
+    operands[1] = plus_constant (Pmode, operands[1], INTVAL (operands[3]));
+    emit_insn (gen_pcalau12i_gr<P:mode> (operands[0], operands[1]));
+  })
+
 ;; Synchronization instructions.
 
 (include "sync.md")
diff --git a/gcc/config/loongarch/predicates.md b/gcc/config/loongarch/predicates.md
index 359878f5bcf..946ed0d8212 100644
--- a/gcc/config/loongarch/predicates.md
+++ b/gcc/config/loongarch/predicates.md
@@ -563,6 +563,13 @@ (define_predicate "symbolic_operand"
   return loongarch_symbolic_constant_p (op, &type);
 })
 
+(define_predicate "symbolic_pcrel_operand"
+  (match_code "const,symbol_ref,label_ref")
+{
+  enum loongarch_symbol_type type;
+  return loongarch_symbolic_constant_p (op, &type) && type == SYMBOL_PCREL;
+})
+
 (define_predicate "equality_operator"
   (match_code "eq,ne"))
 
diff --git a/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store-no-anchor.c b/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store-no-anchor.c
new file mode 100644
index 00000000000..fb03403d756
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store-no-anchor.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=loongarch64 -mabi=lp64d -mexplicit-relocs=auto -fno-section-anchors" } */
+
+#include "explicit-relocs-auto-single-load-store.c"
+
+/* { dg-final { scan-assembler-not "la.local" } } */
diff --git a/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store.c b/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store.c
new file mode 100644
index 00000000000..0d53644cda7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=loongarch64 -mabi=lp64d -mexplicit-relocs=auto" } */
+
+long a;
+int b;
+unsigned int c;
+
+long load_a() { return a; }
+long load_b() { return b; }
+long load_c() { return c; }
+void store_a(long x) { a = x; }
+void store_b(int x) { b = x; }
+
+/* { dg-final { scan-assembler-not "la.local" } } */
-- 
2.42.0
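
For readers skimming the six peephole2 patterns above, their guard conditions can be sketched as plain C.  This is only an illustrative model, not code from the patch: the enum, struct, and function names below are hypothetical stand-ins for la_opt_explicit_relocs, TARGET_CMODEL_*, peep2_reg_dead_p and REGNO.  The comment on the store case is my reading of why the two registers must differ, so treat it as an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GCC option state; illustrative only.  */
enum explicit_relocs { RELOCS_NONE, RELOCS_ALWAYS, RELOCS_AUTO };
enum code_model { CMODEL_NORMAL, CMODEL_MEDIUM, CMODEL_EXTREME };

struct addr_use
{
  unsigned addr_regno;   /* register holding the symbol address */
  unsigned value_regno;  /* register loaded into, or stored from */
  bool addr_dead_after;  /* models peep2_reg_dead_p (2, operands[0]) */
};

/* Part of the condition shared by all six patterns.  */
static bool
common_guard (enum explicit_relocs style, enum code_model cmodel)
{
  return style == RELOCS_AUTO
	 && (cmodel == CMODEL_NORMAL || cmodel == CMODEL_MEDIUM);
}

/* Load patterns: the address register must either die after the load
   or be overwritten by the load itself.  */
bool
load_peephole_ok (enum explicit_relocs style, enum code_model cmodel,
		  const struct addr_use *u)
{
  assert (u != NULL);
  return common_guard (style, cmodel)
	 && (u->addr_dead_after || u->addr_regno == u->value_regno);
}

/* Store patterns: the address register must die, and it must not be the
   stored value (assumption: otherwise only the pcalau12i result, not the
   full address, would be stored).  */
bool
store_peephole_ok (enum explicit_relocs style, enum code_model cmodel,
		   const struct addr_use *u)
{
  assert (u != NULL);
  return common_guard (style, cmodel)
	 && u->addr_dead_after
	 && u->addr_regno != u->value_regno;
}
```

With this model, a load into the same register that held the address passes the guard even if that register is not otherwise dead, while a store never passes when the address register is also the value being stored.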



* [PATCH 5/5] LoongArch: Document -mexplicit-relocs={auto,none,always}
  2023-10-19 14:02 [PATCH 0/5] LoongArch: Better balance between relaxation and scheduling Xi Ruoyao
                   ` (3 preceding siblings ...)
  2023-10-19 14:02 ` [PATCH 4/5] LoongArch: Use explicit relocs for addresses only used for one load or store with -mexplicit-relocs=auto and -mcmodel={normal,medium} Xi Ruoyao
@ 2023-10-19 14:03 ` Xi Ruoyao
  2023-10-23  7:34 ` Pushed: [PATCH 0/5] LoongArch: Better balance between relaxation and scheduling Xi Ruoyao
  5 siblings, 0 replies; 10+ messages in thread
From: Xi Ruoyao @ 2023-10-19 14:03 UTC (permalink / raw)
  To: gcc-patches; +Cc: chenglulu, i, xuchenghua, mengqinggang, Xi Ruoyao

gcc/ChangeLog:

	* doc/invoke.texi (-mexplicit-relocs=style): Document.
	(-mexplicit-relocs): Document as an alias of
	-mexplicit-relocs=always.
	(-mno-explicit-relocs): Document as an alias of
	-mexplicit-relocs=none.
	(-mcmodel=extreme): Mention -mexplicit-relocs=always instead of
	-mexplicit-relocs.
---
 gcc/doc/invoke.texi | 37 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 16c45843123..f4633715e2b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1038,7 +1038,7 @@ Objective-C and Objective-C++ Dialects}.
 -mcond-move-float  -mno-cond-move-float
 -memcpy  -mno-memcpy -mstrict-align -mno-strict-align
 -mmax-inline-memcpy-size=@var{n}
--mexplicit-relocs -mno-explicit-relocs
+-mexplicit-relocs=@var{style} -mexplicit-relocs -mno-explicit-relocs
 -mdirect-extern-access -mno-direct-extern-access
 -mcmodel=@var{code-model}}
 
@@ -26194,26 +26194,39 @@ The text segment and data segment must be within 2GB addressing space.
 
 @item extreme
 This mode does not limit the size of the code segment and data segment.
-The @option{-mcmodel=extreme} option is incompatible with @option{-fplt} and
-@option{-mno-explicit-relocs}.
+The @option{-mcmodel=extreme} option is incompatible with @option{-fplt},
+and it requires @option{-mexplicit-relocs=always}.
 @end table
 The default code model is @code{normal}.
 
-@opindex mexplicit-relocs
-@opindex mno-explicit-relocs
-@item -mexplicit-relocs
-@itemx -mno-explicit-relocs
-Use or do not use assembler relocation operators when dealing with symbolic
+@item -mexplicit-relocs=@var{style}
+Set when to use assembler relocation operators when dealing with symbolic
 addresses.  The alternative is to use assembler macros instead, which may
-limit instruction scheduling but allow linker relaxation.  The default
+limit instruction scheduling but allow linker relaxation.
+With @option{-mexplicit-relocs=none} the assembler macros are always used,
+with @option{-mexplicit-relocs=always} the assembler relocation operators
+are always used, and with @option{-mexplicit-relocs=auto} the compiler
+uses the relocation operators where linker relaxation cannot
+improve the code quality, and macros elsewhere.  The default
 value for the option is determined during GCC build-time by detecting
 corresponding assembler support:
-@code{-mno-explicit-relocs} if the assembler supports relaxation or it
-does not support relocation operators at all,
-@code{-mexplicit-relocs} otherwise.  This option is mostly useful for
+@option{-mexplicit-relocs=none} if the assembler does not support
+relocation operators at all,
+@option{-mexplicit-relocs=always} if the assembler supports relocation
+operators but does not support relaxation,
+@option{-mexplicit-relocs=auto} if the assembler supports both relocation
+operators and relaxation.  This option is mostly useful for
 debugging, or interoperation with assemblers different from the build-time
 one.
 
+@opindex mexplicit-relocs
+@item -mexplicit-relocs
+An alias of @option{-mexplicit-relocs=always} for backward compatibility.
+
+@opindex mno-explicit-relocs
+@item -mno-explicit-relocs
+An alias of @option{-mexplicit-relocs=none} for backward compatibility.
+
 @opindex mdirect-extern-access
 @item -mdirect-extern-access
 @itemx -mno-direct-extern-access
-- 
2.42.0
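
The build-time default described in the documentation hunk above can be summarized as a small decision function.  This is a minimal sketch, assuming two booleans that stand for the results of the configure-time assembler checks; the type and function names are hypothetical and do not appear in GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names mirroring -mexplicit-relocs={none,always,auto}.  */
enum relocs_default { DEFAULT_NONE, DEFAULT_ALWAYS, DEFAULT_AUTO };

/* Pick the default style from what the build-time assembler supports,
   following the three cases listed in the documentation text.  */
enum relocs_default
default_explicit_relocs (bool as_has_reloc_operators, bool as_has_relaxation)
{
  if (!as_has_reloc_operators)
    return DEFAULT_NONE;     /* no relocation operators at all */
  if (!as_has_relaxation)
    return DEFAULT_ALWAYS;   /* operators, but no relaxation */
  return DEFAULT_AUTO;       /* both operators and relaxation */
}

/* Map a style back to the option argument spelled in the documentation.  */
const char *
relocs_default_name (enum relocs_default style)
{
  switch (style)
    {
    case DEFAULT_NONE:
      return "none";
    case DEFAULT_ALWAYS:
      return "always";
    case DEFAULT_AUTO:
      return "auto";
    }
  assert (!"unknown style");
  return NULL;
}
```

Note that relaxation support only matters once relocation operators are available: an assembler without the operators gets "none" either way.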



* Re: [PATCH 2/5] LoongArch: Use explicit relocs for GOT access when -mexplicit-relocs=auto and LTO during a final link with linker plugin
  2023-10-19 14:02 ` [PATCH 2/5] LoongArch: Use explicit relocs for GOT access when -mexplicit-relocs=auto and LTO during a final link with linker plugin Xi Ruoyao
@ 2023-10-21  7:32   ` chenglulu
  2023-10-21  8:42     ` Xi Ruoyao
  0 siblings, 1 reply; 10+ messages in thread
From: chenglulu @ 2023-10-21  7:32 UTC (permalink / raw)
  To: Xi Ruoyao, gcc-patches; +Cc: i, xuchenghua, mengqinggang

/* snip */

> +/* If -mexplicit-relocs=auto, we use machine operations with reloc hints
> +   for cases where the linker is unable to relax so we can schedule the
> +   machine operations, otherwise use an assembler pseudo-op so the
> +   assembler will generate R_LARCH_RELAX.  */
> +
> +bool
> +loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> +{
> +  if (la_opt_explicit_relocs != EXPLICIT_RELOCS_AUTO)
> +    return la_opt_explicit_relocs == EXPLICIT_RELOCS_ALWAYS;
> +
> +  /* If we are performing LTO for a final link, and we have the linker
> +     plugin so we know the resolution of the symbols, then all GOT
> +     references are binding to external symbols or preemptable symbols.
> +     So the linker cannot relax them.  */
> +  return (in_lto_p
> +	  && !flag_incremental_link

I don’t quite understand this condition "!flag_incremental_link". Can 
you explain it? Others LGTM.

Thanks.

> +	  && HAVE_LTO_PLUGIN == 2
> +	  && (!global_options_set.x_flag_use_linker_plugin
> +	      || global_options.x_flag_use_linker_plugin)
> +	  && type == SYMBOL_GOT_DISP);
> +}
> +



* Re: [PATCH 2/5] LoongArch: Use explicit relocs for GOT access when -mexplicit-relocs=auto and LTO during a final link with linker plugin
  2023-10-21  7:32   ` chenglulu
@ 2023-10-21  8:42     ` Xi Ruoyao
  2023-10-23  0:56       ` chenglulu
  0 siblings, 1 reply; 10+ messages in thread
From: Xi Ruoyao @ 2023-10-21  8:42 UTC (permalink / raw)
  To: chenglulu, gcc-patches; +Cc: i, xuchenghua, mengqinggang

On Sat, 2023-10-21 at 15:32 +0800, chenglulu wrote:
> > +  /* If we are performing LTO for a final link, and we have the linker
> > +     plugin so we know the resolution of the symbols, then all GOT
> > +     references are binding to external symbols or preemptable symbols.
> > +     So the linker cannot relax them.  */
> > +  return (in_lto_p
> > +	  && !flag_incremental_link
> 
> I don’t quite understand this condition "!flag_incremental_link". Can 
> you explain it? Others LGTM.
> 
> Thanks.

If we have two (or several) .o files containing LTO bytecode, GCC
supports "LTO incremental linking" with:

gcc a.o b.o -o ab.o -O2 -flto -flinker-output=nolto-rel

The resulting ab.o will include the data and code from a.o and b.o, but
it contains machine code instead of LTO bytecode.  Now if ab.o refers to
an external symbol c, the linker may or may not relax "la.global c" to
"la.local c" (it can if ab.o is linked together with another file c.o
which contains the definition of c).  As we cannot exclude the
possibility of a relaxation on la.global for incremental linking, just
emit la.global and let the linker do the correct thing.
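
As a rough illustration only (not part of the patch), the condition quoted above can be modeled in plain C with the GCC globals replaced by fields of a hypothetical struct.  The names below are made up for this sketch, and it covers only the fragment of loongarch_explicit_relocs_p quoted in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the option value and the symbol type.  */
enum relocs_style { ER_NONE, ER_ALWAYS, ER_AUTO };
enum symbol_kind { SYM_PCREL, SYM_GOT_DISP };

struct lto_state
{
  bool in_lto;            /* models in_lto_p */
  bool incremental_link;  /* models flag_incremental_link */
  int have_lto_plugin;    /* models HAVE_LTO_PLUGIN */
  bool use_plugin_set;    /* -f[no-]use-linker-plugin given explicitly */
  bool use_plugin;        /* value of that option when given */
};

/* Return true if explicit relocs should be used for a symbol of KIND.  */
bool
explicit_relocs_p (enum relocs_style style, const struct lto_state *s,
		   enum symbol_kind kind)
{
  assert (s != NULL);

  /* "none" and "always" decide on their own.  */
  if (style != ER_AUTO)
    return style == ER_ALWAYS;

  /* "auto": only a final LTO link with the linker plugin knows that
     every remaining GOT reference cannot be relaxed.  */
  return (s->in_lto
	  && !s->incremental_link
	  && s->have_lto_plugin == 2
	  && (!s->use_plugin_set || s->use_plugin)
	  && kind == SYM_GOT_DISP);
}
```

With incremental_link set, as in the -flinker-output=nolto-rel case above, the function returns false for a GOT symbol, so la.global is kept and the linker can still relax it later.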

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University


* Re: [PATCH 2/5] LoongArch: Use explicit relocs for GOT access when -mexplicit-relocs=auto and LTO during a final link with linker plugin
  2023-10-21  8:42     ` Xi Ruoyao
@ 2023-10-23  0:56       ` chenglulu
  0 siblings, 0 replies; 10+ messages in thread
From: chenglulu @ 2023-10-23  0:56 UTC (permalink / raw)
  To: Xi Ruoyao, gcc-patches; +Cc: i, xuchenghua, mengqinggang


On 2023/10/21 at 4:42 PM, Xi Ruoyao wrote:
> On Sat, 2023-10-21 at 15:32 +0800, chenglulu wrote:
>>> +  /* If we are performing LTO for a final link, and we have the linker
>>> +     plugin so we know the resolution of the symbols, then all GOT
>>> +     references are binding to external symbols or preemptable symbols.
>>> +     So the linker cannot relax them.  */
>>> +  return (in_lto_p
>>> +	  && !flag_incremental_link
>> I don’t quite understand this condition "!flag_incremental_link". Can
>> you explain it? Others LGTM.
>>
>> Thanks.
> If we have two (or several) .o files containing LTO bytecode, GCC
> supports "LTO incremental linking" with:
>
> gcc a.o b.o -o ab.o -O2 -flto -flinker-output=nolto-rel
>
> The resulting ab.o will include the data and code from a.o and b.o, but
> it contains machine code instead of LTO bytecode.  Now if ab.o refers to
> an external symbol c, the linker may or may not relax "la.global c" to
> "la.local c" (it can if ab.o is linked together with another file c.o
> which contains the definition of c).  As we cannot exclude the
> possibility of a relaxation on la.global for incremental linking, just
> emit la.global and let the linker do the correct thing.
>
I have no further questions, thank you!



* Pushed: [PATCH 0/5] LoongArch: Better balance between relaxation and scheduling
  2023-10-19 14:02 [PATCH 0/5] LoongArch: Better balance between relaxation and scheduling Xi Ruoyao
                   ` (4 preceding siblings ...)
  2023-10-19 14:03 ` [PATCH 5/5] LoongArch: Document -mexplicit-relocs={auto,none,always} Xi Ruoyao
@ 2023-10-23  7:34 ` Xi Ruoyao
  5 siblings, 0 replies; 10+ messages in thread
From: Xi Ruoyao @ 2023-10-23  7:34 UTC (permalink / raw)
  To: gcc-patches; +Cc: chenglulu, i, xuchenghua, mengqinggang

Pushed r14-{4848..4852}.

On Thu, 2023-10-19 at 22:02 +0800, Xi Ruoyao wrote:
> For relaxation we are now generating assembler macros for symbolic
> addresses everywhere, but this is limiting scheduling and there are
> known situations where the relaxation cannot improve the code.
> 
> 1. When we are performing LTO during a final link and the linker plugin
> is used, la.global won't be relaxed because they reference to an
> external or preemptable symbol.
> 2. The linker currently do not relax la.tls.*.
> 3. For la.local + ld/st pairs, if the address is only used once,
> emitting pcalau12i + ld/st is always not worse than relying on linker
> relaxation.
> 
> Add -mexplicit-relocs=auto to allow the compiler to use explicit relocs
> for these cases, but assembler macros for other cases.  Use it as the
> default if the assembler supports both explicit relocs and relaxation.
> 
> LTO-bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?
> 
> Xi Ruoyao (5):
>   LoongArch: Add enum-style -mexplicit-relocs= option
>   LoongArch: Use explicit relocs for GOT access when
>     -mexplicit-relocs=auto and LTO during a final link with linker
>     plugin
>   LoongArch: Use explicit relocs for TLS access with
>     -mexplicit-relocs=auto
>   LoongArch: Use explicit relocs for addresses only used for one load or
>     store with -mexplicit-relocs=auto and -mcmodel={normal,medium}
>   LoongArch: Document -mexplicit-relocs={auto,none,always}
> 
>  .../loongarch/genopts/loongarch-strings       |   6 +
>  gcc/config/loongarch/genopts/loongarch.opt.in |  21 ++-
>  gcc/config/loongarch/loongarch-def.h          |   6 +
>  gcc/config/loongarch/loongarch-protos.h       |   1 +
>  gcc/config/loongarch/loongarch-str.h          |   5 +
>  gcc/config/loongarch/loongarch.cc             |  75 ++++++++--
>  gcc/config/loongarch/loongarch.h              |   3 +
>  gcc/config/loongarch/loongarch.md             | 128 +++++++++++++++++-
>  gcc/config/loongarch/loongarch.opt            |  21 ++-
>  gcc/config/loongarch/predicates.md            |  15 +-
>  gcc/doc/invoke.texi                           |  37 +++--
>  .../loongarch/explicit-relocs-auto-lto.c      |  26 ++++
>  ...-relocs-auto-single-load-store-no-anchor.c |   6 +
>  .../explicit-relocs-auto-single-load-store.c  |  14 ++
>  .../explicit-relocs-auto-tls-ld-gd.c          |   9 ++
>  .../explicit-relocs-auto-tls-le-ie.c          |   6 +
>  16 files changed, 343 insertions(+), 36 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-lto.c
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store-no-anchor.c
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-single-load-store.c
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-ld-gd.c
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/explicit-relocs-auto-tls-le-ie.c
> 

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University


