From: Michael Meissner <meissner@linux.ibm.com>
To: Michael Meissner <meissner@linux.ibm.com>,
	gcc-patches@gcc.gnu.org,        segher@kernel.crashing.org,
	dje.gcc@gmail.com
Subject: [PATCH, V3, #7 of 10], Implement PCREL_OPT relocation optimization
Date: Mon, 26 Aug 2019 22:06:00 -0000
Message-ID: <20190826214341.GG11790@ibm-toto.the-meissners.org>
In-Reply-To: <20190826173320.GA7958@ibm-toto.the-meissners.org>

This patch is a slight rework of V1 patch #7 (V1 patch #6 is not going
to be re-submitted at this time).

This patch adds a new RTL pass that looks for a load of an external
pc-relative address whose only use is a single load or store in the same
basic block, and flags both instructions so that the PCREL_OPT
relocation can be created for them.

Here is the comment from the beginning of rs6000-pcrel.c that describes
the optimization.

/* This file implements an RTL pass that looks for pc-relative loads of the
   address of an external variable using the PCREL_GOT relocation and a single
   load/store that uses that GOT pointer.  If that is found we create the
   PCREL_OPT relocation to possibly convert:

	pld b,var@pcrel@got(0),1

	# possibly other instructions that do not use the base register 'b' or
        # the result register 'r'.

	lwz r,0(b)

   into:

	plwz r,var@pcrel(0),1

	# possibly other instructions that do not use the base register 'b' or
        # the result register 'r'.

	nop

   If the variable is not defined in the main program or the code using it is
   not in the main program, the linker puts the address in the .got section and
   the code becomes:

	.section .got
	.Lvar_got:	.dword var

	.section .text
	pld b,.Lvar_got@pcrel(0),1

	# possibly other instructions that do not use the base register 'b' or
        # the result register 'r'.

	lwz r,0(b)
	
   We only look for a single usage in the basic block where the GOT pointer is
   loaded.  Multiple uses or references in another basic block will force us to
   not use the PCREL_OPT relocation.  */
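
To make the single-use rule above concrete, here is a minimal stand-alone
sketch in C.  It is a toy model under stated assumptions, not the code in
rs6000-pcrel.c: the names toy_kind, toy_insn and mark_pcrel_opt are
hypothetical, and the base_dies flag merely stands in for the register
liveness information the real pass would consult.  A GOT load and a memory
access are paired (given the same non-zero marker) only if the base
register is not used in between, the access is the last use of that
register, and both are in the same basic block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NREGS 32

/* Hypothetical instruction kinds for the model; these are not GCC RTL codes.  */
enum toy_kind { GOT_LOAD, MEM_USE, OTHER_USE, UNRELATED, BLOCK_END };

struct toy_insn
{
  enum toy_kind kind;
  int base;        /* Base register number, or -1 if none.  */
  bool base_dies;  /* Base register is dead after this insn (assumed input).  */
  int marker;      /* 0 means no PCREL_OPT pairing.  */
};

/* Pair each GOT load with its single use and return the number of pairs.  */
static int
mark_pcrel_opt (struct toy_insn *insns, size_t n)
{
  int pending[NREGS];   /* Index of the live GOT load per register, or -1.  */
  bool used[NREGS];     /* Base register was used since the GOT load.  */
  int next_marker = 0;
  int count = 0;

  for (int r = 0; r < NREGS; r++)
    {
      pending[r] = -1;
      used[r] = false;
    }

  for (size_t i = 0; i < n; i++)
    {
      struct toy_insn *p = &insns[i];

      if (p->kind == BLOCK_END)
        {
          /* A GOT load never pairs with a use in another basic block.  */
          for (int r = 0; r < NREGS; r++)
            pending[r] = -1;
          continue;
        }
      if (p->kind == UNRELATED)
        continue;

      assert (p->base >= 0 && p->base < NREGS);

      if (p->kind == GOT_LOAD)
        {
          pending[p->base] = (int) i;
          used[p->base] = false;
        }
      else if (p->kind == OTHER_USE)
        used[p->base] = true;
      else
        {
          /* MEM_USE: pair only if this is the sole use of the GOT pointer.  */
          int g = pending[p->base];
          if (g >= 0 && !used[p->base] && p->base_dies)
            {
              int m = ++next_marker;
              insns[g].marker = m;
              p->marker = m;
              count++;
            }
          /* Forget the GOT load once it has been used.  */
          pending[p->base] = -1;
        }
    }

  return count;
}
```

In this model, a block consisting of a GOT load into register 9, an
unrelated instruction, and a final memory access through register 9 yields
one pair.  If register 9 is used by two memory accesses, no pair is created
and the GOT pointer is simply used as an ordinary base register.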

I have built a bootstrap compiler on a little endian power8 system, and
there were no regressions when I ran make check.  Assuming the previous
patches are checked in, can I check this into the trunk?

[gcc]
2019-08-26  Michael Meissner  <meissner@linux.ibm.com>

	* config.gcc (powerpc*-*-*): Add rs6000-pcrel.c.
	(rs6000*-*-*): Add rs6000-pcrel.c.
	* config/rs6000/pcrel.md: New file.
	* config/rs6000/predicates.md (one_reg_memory_operand): New
	predicate.
	(pcrel_ext_mem_operand): New predicate.
	* config/rs6000/rs6000-cpus.def (ADDRESSING_FUTURE_MASKS): Add
	-mpcrel-opt.
	(POWERPC_MASKS): Add -mpcrel-opt.
	* config/rs6000/rs6000-passes.def: Add pcrel optimization pass.
	* config/rs6000/rs6000-pcrel.c: New file.
	* config/rs6000/rs6000-protos.h (make_pass_pcrel_opt): New
	declaration.
	* config/rs6000/rs6000.c (rs6000_option_override_internal): Add
	-mpcrel-opt support.
	(pcrel_opt_label_num): New state static flag.
	(rs6000_final_prescan_insn): Add -mpcrel-opt support.
	(rs6000_asm_output_opcode): Add -mpcrel-opt support.
	(rs6000_opt_masks): Add -mpcrel-opt.
	* config/rs6000/rs6000.md: Include pcrel.md.
	(pcrel_opt RTL attribute): New RTL attribute.
	* config/rs6000/t-rs6000 (rs6000-pcrel.o): Add build rules.
	(MD_INCLUDES): Add pcrel.md.

[gcc/testsuite]
2019-08-26   Michael Meissner  <meissner@linux.ibm.com>

	* gcc.target/powerpc/pcrel-opt-di.c: New test for -mpcrel-opt.

Index: gcc/config/rs6000/pcrel.md
===================================================================
--- gcc/config/rs6000/pcrel.md	(revision 274877)
+++ gcc/config/rs6000/pcrel.md	(working copy)
@@ -0,0 +1,563 @@
+;; PC relative support.
+;; Copyright (C) 2019 Free Software Foundation, Inc.
+;; Contributed by Peter Bergner <bergner@linux.ibm.com> and
+;;		  Michael Meissner <meissner@linux.ibm.com>
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;;
+;; UNSPEC usage
+;;
+
+(define_c_enum "unspec"
+  [UNSPEC_PCREL_LD
+   UNSPEC_PCREL_ST
+  ])
+
+
+;; Optimize references to external variables to combine loading up the external
+;; address from the GOT and doing the load or store operation.
+;;
+;; A typical optimization looks like:
+;;
+;;		pld b,var@pcrel@got(0),1
+;;	100:
+;;		...
+;;		.reloc 100b-8,R_PPC64_PCREL_OPT,0
+;;		lwz r,0(b)
+;;
+;; If 'var' is an external variable defined in another module in the main
+;; program, and the code is being linked for the main program, then the
+;; linker can optimize this to:
+;;
+;;		plwz r,var(0),1
+;;	100:
+;;		...
+;;		nop
+;;
+;; If either the variable or the code being linked is defined in a shared
+;; library, then the linker puts the address in the GOT area, and the pld will
+;; load up the pointer, and then that pointer is used for the load or store.
+;; If there is more than one reference to the GOT pointer, the compiler will
+;; not do this optimization, and use the GOT pointer normally.
+;;
+;; Having the label after the pld instruction and using label-8 in the .reloc
+;; addresses the prefixed instruction properly.  If we put the label before the
+;; pld instruction, then the relocation might point to the NOP that is
+;; generated if the prefixed instruction is not aligned.
+;;
+;; We need to rewrite the normal GOT load operation before register allocation
+;; to include setting the eventual destination register for loads, or referring
+;; to the value being stored for store operations so that the proper register
+;; lifetime is set in case the optimization is done and the pld/lwz is
+;; converted to plwz/nop.
+
+(define_mode_iterator PO [QI HI SI DI SF DF
+			  V16QI V8HI V4SI V4SF V2DI V2DF V1TI KF
+			  (TF "FLOAT128_IEEE_P (TFmode)")])
+
+;; Vector types for pcrel optimization
+(define_mode_iterator POV [V16QI V8HI V4SI V4SF V2DI V2DF V1TI KF
+			   (TF "FLOAT128_IEEE_P (TFmode)")])
+
+;; Define the constraints for each mode for pcrel_opt.  The order of the
+;; constraints should have the most natural register class first.
+(define_mode_attr PO_constraint [(QI    "r,d,v")
+				 (HI    "r,d,v")
+				 (SI    "r,d,v")
+				 (DI    "r,d,v")
+				 (SF    "d,v,r")
+				 (DF    "d,v,r")
+				 (V16QI "wa,wn,wn")
+				 (V8HI  "wa,wn,wn")
+				 (V4SI  "wa,wn,wn")
+				 (V4SF  "wa,wn,wn")
+				 (V2DI  "wa,wn,wn")
+				 (V2DF  "wa,wn,wn")
+				 (V1TI  "wa,wn,wn")
+				 (KF    "wa,wn,wn")
+				 (TF    "wa,wn,wn")])
+
+;; Combiner pattern that combines the load of the GOT along with the load.  The
+;; first split pass before register allocation will split this into the load of
+;; the GOT that indicates the resultant value may be created if the PCREL_OPT
+;; relocation is done.
+;;
+;; The (set (match_dup 0)
+;;	    (unspec:<MODE> [(const_int 0)] UNSPEC_PCREL_LD))
+;;
+;; Is to signal to the register allocator that the destination register may be
+;; set by the GOT operation (if the linker does the optimization).
+;;
+;; We need to set the "cost" explicitly so that the instruction length is not
+;; used.  We return the same cost as a normal load (4 if we are not optimizing
+;; for speed, 8 if we are optimizing for speed).
+
+(define_insn_and_split "*mov<mode>_pcrel_opt_load"
+  [(set (match_operand:PO 0 "gpc_reg_operand")
+	(match_operand:PO 1 "pcrel_ext_mem_operand"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(parallel [(set (match_dup 2)
+		   (match_dup 3))
+	      (set (match_dup 0)
+		   (unspec:<MODE> [(const_int 0)] UNSPEC_PCREL_LD))
+	      (use (const_int 0))])
+   (parallel [(set (match_dup 0)
+		   (match_dup 4))
+	      (use (match_dup 0))
+	      (use (const_int 0))])]
+{
+  rtx mem = operands[1];
+  rtx got = gen_reg_rtx (DImode);
+
+  operands[2] = got;
+  operands[3] = XEXP (mem, 0);
+  operands[4] = change_address (mem, <MODE>mode, got);
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "16")
+   (set (attr "cost")
+	(if_then_else (match_test "optimize_function_for_speed_p (cfun)")
+		      (const_string "8")
+		      (const_string "4")))
+   (set_attr "prefixed" "yes")])
+
+;; Zero extend combiner patterns
+(define_insn_and_split "*mov<mode>_pcrel_opt_zero_extend"
+  [(set (match_operand:DI 0 "gpc_reg_operand")
+	(zero_extend:DI
+	 (match_operand:QHSI 1 "pcrel_ext_mem_operand")))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(parallel [(set (match_dup 2)
+		   (match_dup 3))
+	      (set (match_dup 0)
+		   (unspec:DI [(const_int 0)] UNSPEC_PCREL_LD))
+	      (use (const_int 0))])
+   (parallel [(set (match_dup 0)
+		   (zero_extend:DI
+		    (match_dup 4)))
+	      (use (match_dup 0))
+	      (use (const_int 0))])]
+{
+  rtx mem = operands[1];
+  rtx got = gen_reg_rtx (DImode);
+
+  operands[2] = got;
+  operands[3] = XEXP (mem, 0);
+  operands[4] = change_address (mem, <MODE>mode, got);
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "16")
+   (set (attr "cost")
+	(if_then_else (match_test "optimize_function_for_speed_p (cfun)")
+		      (const_string "8")
+		      (const_string "4")))
+   (set_attr "prefixed" "yes")])
+
+;; Sign extend combiner patterns
+(define_insn_and_split "*mov<mode>_pcrel_opt_sign_extend"
+  [(set (match_operand:DI 0 "gpc_reg_operand")
+	(sign_extend:DI
+	 (match_operand:HSI 1 "pcrel_ext_mem_operand")))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(parallel [(set (match_dup 2)
+		   (match_dup 3))
+	      (set (match_dup 0)
+		   (unspec:DI [(const_int 0)] UNSPEC_PCREL_LD))
+	      (use (const_int 0))])
+   (parallel [(set (match_dup 0)
+		   (sign_extend:DI
+		    (match_dup 4)))
+	      (use (match_dup 0))
+	      (use (const_int 0))])]
+{
+  rtx mem = operands[1];
+  rtx got = gen_reg_rtx (DImode);
+
+  operands[2] = got;
+  operands[3] = XEXP (mem, 0);
+  operands[4] = change_address (mem, <MODE>mode, got);
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "16")
+   (set (attr "cost")
+	(if_then_else (match_test "optimize_function_for_speed_p (cfun)")
+		      (const_string "8")
+		      (const_string "4")))
+   (set_attr "prefixed" "yes")])
+
+;; Float extend combiner pattern
+(define_insn_and_split "*movdf_pcrel_opt_float_extend"
+  [(set (match_operand:DF 0 "gpc_reg_operand")
+	(float_extend:DF
+	 (match_operand:SF 1 "pcrel_ext_mem_operand")))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(parallel [(set (match_dup 2)
+		   (match_dup 3))
+	      (set (match_dup 0)
+		   (unspec:DF [(const_int 0)] UNSPEC_PCREL_LD))
+	      (use (const_int 0))])
+   (parallel [(set (match_dup 0)
+		   (float_extend:DF
+		    (match_dup 4)))
+	      (use (match_dup 0))
+	      (use (const_int 0))])]
+{
+  rtx mem = operands[1];
+  rtx got = gen_reg_rtx (DImode);
+
+  operands[2] = got;
+  operands[3] = XEXP (mem, 0);
+  operands[4] = change_address (mem, SFmode, got);
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "16")
+   (set (attr "cost")
+	(if_then_else (match_test "optimize_function_for_speed_p (cfun)")
+		      (const_string "8")
+		      (const_string "4")))
+   (set_attr "prefixed" "yes")])
+
+;; Patterns to load up the GOT address that may be changed into the load of the
+;; actual variable.
+(define_insn "*mov<mode>_pcrel_opt_load_got"
+  [(set (match_operand:DI 0 "base_reg_operand" "=b,b,b")
+	(match_operand:DI 1 "pcrel_ext_address"))
+   (set (match_operand:PO 2 "gpc_reg_operand" "=<PO_constraint>")
+	(unspec:PO [(const_int 0)] UNSPEC_PCREL_LD))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+{
+  return (INTVAL (operands[3])) ? "ld %0,%a1\n.Lpcrel%3:" : "ld %0,%a1";
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "12")
+   (set_attr "pcrel_opt" "load_got")
+   (set (attr "cost")
+	(if_then_else (match_test "optimize_function_for_speed_p (cfun)")
+		      (const_string "8")
+		      (const_string "4")))
+   (set_attr "prefixed" "yes")])
+
+;; The secondary load insns that use the GOT pointer and may become a NOP.
+(define_insn "*mov<mode>_pcrel_opt_load_mem"
+  [(set (match_operand:QHI 0 "gpc_reg_operand" "+r,wa")
+	(match_operand:QHI 1 "one_reg_memory_operand" "Q,Q"))
+   (use (match_operand:QHI 2 "gpc_reg_operand" "0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   l<wd>z %0,%1
+   lxsi<wd>zx %x0,%y1"
+  [(set_attr "type" "load,fpload")
+   (set_attr "pcrel_opt" "load,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movsi_pcrel_opt_load_mem"
+  [(set (match_operand:SI 0 "gpc_reg_operand" "+r,d,v")
+	(match_operand:SI 1 "one_reg_memory_operand" "Q,Q,Q"))
+   (use (match_operand:SI 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lwz %0,%1
+   lfiwzx %0,%y1
+   lxsiwzx %x0,%y1"
+  [(set_attr "type" "load,fpload,fpload")
+   (set_attr "pcrel_opt" "load,no,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movdi_pcrel_opt_load_mem"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "+r,d,v")
+	(match_operand:DI 1 "one_reg_memory_operand" "Q,Q,Q"))
+   (use (match_operand:DI 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   ld %0,%1
+   lfd %0,%1
+   lxsd %0,%1"
+  [(set_attr "type" "load,fpload,fpload")
+   (set_attr "pcrel_opt" "load")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movsf_pcrel_opt_load_mem"
+  [(set (match_operand:SF 0 "gpc_reg_operand" "+d,v,r")
+	(match_operand:SF 1 "one_reg_memory_operand" "Q,Q,Q"))
+   (use (match_operand:SF 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lfs %0,%1
+   lxssp %0,%1
+   lwz %0,%1"
+  [(set_attr "type" "fpload,fpload,load")
+   (set_attr "pcrel_opt" "load")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movdf_pcrel_opt_load_mem"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "+d,v,r")
+	(match_operand:DF 1 "one_reg_memory_operand" "Q,Q,Q"))
+   (use (match_operand:DF 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lfd %0,%1
+   lxsd %0,%1
+   ld %0,%1"
+  [(set_attr "type" "fpload,fpload,load")
+   (set_attr "pcrel_opt" "load")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*mov<mode>_pcrel_opt_load_mem"
+  [(set (match_operand:POV 0 "gpc_reg_operand" "+wa")
+	(match_operand:POV 1 "one_reg_memory_operand" "Q"))
+   (use (match_operand:POV 2 "gpc_reg_operand" "0"))
+   (use (match_operand:DI 3 "const_int_operand" "n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "lxv %x0,%1"
+  [(set_attr "type" "vecload")
+   (set_attr "pcrel_opt" "load")
+   (set_attr "prefixed" "no")])
+
+;; Zero extend insns
+(define_insn "*mov<mode>_pcrel_opt_load_zero_extend2"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "+r,wa")
+	(zero_extend:DI
+	 (match_operand:QHI 1 "one_reg_memory_operand" "Q,Q")))
+   (use (match_operand:DI 2 "gpc_reg_operand" "0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   l<wd>z %0,%1
+   lxsi<wd>zx %x0,%y1"
+  [(set_attr "type" "load,fpload")
+   (set_attr "pcrel_opt" "load,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movsi_pcrel_opt_load_zero_extend2"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "+r,d,v")
+	(zero_extend:DI
+	 (match_operand:SI 1 "one_reg_memory_operand" "Q,Q,Q")))
+   (use (match_operand:DI 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lwz %0,%1
+   lfiwzx %0,%y1
+   lxsiwzx %x0,%y1"
+  [(set_attr "type" "load,fpload,fpload")
+   (set_attr "pcrel_opt" "load,no,no")
+   (set_attr "prefixed" "no")])
+
+;; Sign extend insns
+(define_insn "*movsi_pcrel_opt_load_sign_extend2"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "+r,d,v")
+	(sign_extend:DI
+	 (match_operand:SI 1 "one_reg_memory_operand" "Q,Q,Q")))
+   (use (match_operand:DI 2 "gpc_reg_operand" "0,0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lwa %0,%1
+   lfiwax %0,%y1
+   lxsiwax %x0,%y1"
+  [(set_attr "type" "load,fpload,fpload")
+   (set_attr "pcrel_opt" "load,no,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn_and_split "*movhi_pcrel_opt_load_sign_extend2"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "+r,v")
+	(sign_extend:DI
+	 (match_operand:HI 1 "one_reg_memory_operand" "Q,Q")))
+   (use (match_operand:DI 2 "gpc_reg_operand" "0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lha %0,%1
+   #"
+  "&& reload_completed && altivec_register_operand (operands[0], HImode)"
+  [(parallel [(set (match_dup 4)
+		   (match_dup 1))
+	      (use (match_dup 4))
+	      (use (const_int 0))])
+   (set (match_dup 0)
+	(sign_extend:DI
+	 (match_dup 4)))]
+{
+  operands[4] = gen_rtx_REG (HImode, REGNO (operands[0]));
+}
+  [(set_attr "type" "load,fpload")
+   (set_attr "pcrel_opt" "load,no")
+   (set_attr "length" "4,8")
+   (set_attr "prefixed" "no")])
+
+;; Floating point extend insn
+(define_insn "*movsf_pcrel_opt_load_float_extend2"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "+d,v")
+	(float_extend:DF
+	 (match_operand:SF 1 "one_reg_memory_operand" "Q,Q")))
+   (use (match_operand:DF 2 "gpc_reg_operand" "0,0"))
+   (use (match_operand:DI 3 "const_int_operand" "n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+   lfs %0,%1
+   lxssp %0,%1"
+  [(set_attr "type" "fpload")
+   (set_attr "pcrel_opt" "load")
+   (set_attr "prefixed" "no")])
+
+;; Store combiner insns that merge together loading up the address of the
+;; external variable and doing the store.  This is split in the first split
+;; pass before register allocation.
+;;
+;; We need to set the "cost" explicitly so that the instruction length is not
+;; used.  We return the same cost as a normal store (4).
+(define_insn_and_split "*mov<mode>_pcrel_opt_store"
+  [(set (match_operand:PO 0 "pcrel_ext_mem_operand")
+ 	(match_operand:PO 1 "gpc_reg_operand"))]
+   "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64
+    && can_create_pseudo_p ()"
+   "#"
+   "&& 1"
+   [(set (match_dup 2)
+	 (unspec:DI [(match_dup 1)
+		     (match_dup 3)
+		     (const_int 0)] UNSPEC_PCREL_ST))
+    (parallel [(set (match_dup 4)
+		    (match_dup 1))
+	       (use (const_int 0))])]
+{
+  rtx mem = operands[0];
+  rtx addr = XEXP (mem, 0);
+  rtx got = gen_reg_rtx (DImode);
+
+  operands[2] = got;
+  operands[3] = addr;
+  operands[4] = change_address (mem, <MODE>mode, got);
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "20")
+   (set_attr "pcrel_opt" "store_got")
+   (set_attr "cost" "4")
+   (set_attr "prefixed" "yes")])
+
+;; Load of the GOT address for a store operation that may be converted into a
+;; direct store.
+(define_insn "*mov<mode>_pcrel_opt_store_got"
+  [(set (match_operand:DI 0 "base_reg_operand" "=&b,&b,&b")
+	(unspec:DI [(match_operand:PO 1 "gpc_reg_operand" "<PO_constraint>")
+		    (match_operand:DI 2 "pcrel_ext_address")
+		    (match_operand:DI 3 "const_int_operand" "n,n,n")]
+		   UNSPEC_PCREL_ST))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+{
+  return (INTVAL (operands[3])) ? "ld %0,%a2\n.Lpcrel%3:" : "ld %0,%a2";
+}
+  [(set_attr "type" "load")
+   (set_attr "length" "12")
+   (set_attr "pcrel_opt" "store_got")
+   (set_attr "cost" "4")
+   (set_attr "prefixed" "yes")])
+
+;; Secondary store instruction that uses the GOT pointer, and may be optimized
+;; into a NOP instruction.
+(define_insn "*mov<mode>_pcrel_opt_store_mem"
+  [(set (match_operand:QHI 0 "one_reg_memory_operand" "=Q,Q")
+	(match_operand:QHI 1 "gpc_reg_operand" "r,wa"))
+   (use (match_operand:DI 2 "const_int_operand" "n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+  st<wd> %1,%0
+  stxsi<wd>x %x1,%y0"
+  [(set_attr "type" "store,fpstore")
+   (set_attr "pcrel_opt" "store,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movsi_pcrel_opt_store_mem"
+  [(set (match_operand:SI 0 "one_reg_memory_operand" "=Q,Q,Q")
+	(match_operand:SI 1 "gpc_reg_operand" "r,d,v"))
+   (use (match_operand:DI 2 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+  stw %1,%0
+  stfiwx %1,%y0
+  stxsiwx %1,%y0"
+  [(set_attr "type" "store,fpstore,fpstore")
+   (set_attr "pcrel_opt" "store,no,no")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movdi_pcrel_opt_store_mem"
+  [(set (match_operand:DI 0 "one_reg_memory_operand" "=Q,Q,Q")
+	(match_operand:DI 1 "gpc_reg_operand" "r,d,v"))
+   (use (match_operand:DI 2 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+  std %1,%0
+  stfd %1,%0
+  stxsd %1,%0"
+  [(set_attr "type" "store,fpstore,fpstore")
+   (set_attr "pcrel_opt" "store")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movsf_pcrel_opt_store_mem"
+  [(set (match_operand:SF 0 "one_reg_memory_operand" "=Q,Q,Q")
+	(match_operand:SF 1 "gpc_reg_operand" "d,v,r"))
+   (use (match_operand:DI 2 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+  stfs %1,%0
+  stxssp %1,%0
+  stw %1,%0"
+  [(set_attr "type" "fpstore,fpstore,store")
+   (set_attr "pcrel_opt" "store")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*movdf_pcrel_opt_store_mem"
+  [(set (match_operand:DF 0 "one_reg_memory_operand" "=Q,Q,Q")
+	(match_operand:DF 1 "gpc_reg_operand" "d,v,r"))
+   (use (match_operand:DI 2 "const_int_operand" "n,n,n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "@
+  stfd %1,%0
+  stxsd %1,%0
+  std %1,%0"
+  [(set_attr "type" "fpstore,fpstore,store")
+   (set_attr "pcrel_opt" "store")
+   (set_attr "prefixed" "no")])
+
+(define_insn "*mov<mode>_pcrel_opt_store_mem"
+  [(set (match_operand:POV 0 "one_reg_memory_operand" "=Q")
+	(match_operand:POV 1 "gpc_reg_operand" "wa"))
+   (use (match_operand:DI 2 "const_int_operand" "n"))]
+  "TARGET_PCREL && TARGET_PCREL_OPT && TARGET_POWERPC64"
+  "stxv %x1,%0"
+  [(set_attr "type" "vecstore")
+   (set_attr "pcrel_opt" "store")
+   (set_attr "prefixed" "no")])
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 274876)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -775,6 +775,13 @@ (define_predicate "indexed_or_indirect_o
   return indexed_or_indirect_address (op, mode);
 })
 
+;; Return 1 if the operand uses a single register for the address.
+(define_predicate "one_reg_memory_operand"
+  (match_code "mem")
+{
+  return REG_P (XEXP (op, 0));
+})
+
 ;; Like indexed_or_indirect_operand, but also allow a GPR register if direct
 ;; moves are supported.
 (define_predicate "reg_or_indexed_operand"
@@ -1695,6 +1702,15 @@ (define_predicate "pcrel_ext_address"
   return (SYMBOL_REF_P (op) && !SYMBOL_REF_LOCAL_P (op));
 })
 
+;; Return 1 if op is a memory operand to an external variable when we
+;; support pc-relative addressing and the PCREL_OPT relocation to
+;; optimize references to it.
+(define_predicate "pcrel_ext_mem_operand"
+  (match_code "mem")
+{
+  return pcrel_ext_address (XEXP (op, 0), Pmode);
+})
+
 ;; Return 1 if op is a memory operand that is not prefixed.
 (define_predicate "non_prefixed_mem_operand"
   (match_code "mem")
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 274875)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -86,6 +86,7 @@
    prefixed addressing, and we want to clear all of the addressing bits
    on targets that cannot support prefixed/pcrel addressing.  */
 #define ADDRESSING_FUTURE_MASKS	(OPTION_MASK_PCREL			\
+				 | OPTION_MASK_PCREL_OPT		\
 				 | OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-future.  */
@@ -144,6 +145,7 @@
 				 | OPTION_MASK_P9_MISC			\
 				 | OPTION_MASK_P9_VECTOR		\
 				 | OPTION_MASK_PCREL			\
+				 | OPTION_MASK_PCREL_OPT		\
 				 | OPTION_MASK_POPCNTB			\
 				 | OPTION_MASK_POPCNTD			\
 				 | OPTION_MASK_POWERPC64		\
Index: gcc/config/rs6000/rs6000-passes.def
===================================================================
--- gcc/config/rs6000/rs6000-passes.def	(revision 274864)
+++ gcc/config/rs6000/rs6000-passes.def	(working copy)
@@ -25,3 +25,12 @@ along with GCC; see the file COPYING3.
  */
 
   INSERT_PASS_BEFORE (pass_cse, 1, pass_analyze_swaps);
+
+/* The pcrel_opt pass must be the final pass before final.  This pass combines
+   references to external pc-relative variables with their use.  There must be
+   only one reference to the external pointer loaded in order to do the
+   optimization.  Otherwise we load up the address (either via PADDI if the
+   label is local or via a PLD from the .got section if it is defined in
+   another module) and use the value as a base pointer.  */
+
+  INSERT_PASS_BEFORE (pass_final, 1, pass_pcrel_opt);
Index: gcc/config/rs6000/rs6000-pcrel.c
===================================================================
--- gcc/config/rs6000/rs6000-pcrel.c	(revision 274877)
+++ gcc/config/rs6000/rs6000-pcrel.c	(working copy)
@@ -0,0 +1,463 @@
+/* Subroutines used to support the pc-relative linker optimization.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file implements an RTL pass that looks for pc-relative loads of the
+   address of an external variable using the PCREL_GOT relocation and a single
+   load/store that uses that GOT pointer.  If that is found we create the
+   PCREL_OPT relocation to possibly convert:
+
+	pld b,var@pcrel@got(0),1
+
+	# possibly other instructions that do not use the base register 'b' or
+        # the result register 'r'.
+
+	lwz r,0(b)
+
+   into:
+
+	plwz r,var@pcrel(0),1
+
+	# possibly other instructions that do not use the base register 'b' or
+        # the result register 'r'.
+
+	nop
+
+   If the variable is not defined in the main program or the code using it is
+   not in the main program, the linker puts the address in the .got section and
+   the code becomes:
+
+	.section .got
+	.Lvar_got:	.dword var
+
+	.section .text
+	pld b,.Lvar_got@pcrel(0),1
+
+	# possibly other instructions that do not use the base register 'b' or
+        # the result register 'r'.
+
+	lwz r,0(b)
+	
+   We only look for a single usage in the basic block where the GOT pointer is
+   loaded.  Multiple uses or references in another basic block will force us to
+   not use the PCREL_OPT relocation.
+
+   This file also contains the support function for prefixed memory to emit the
+   leading 'p' in front of prefixed instructions, and to create the necessary
+   relocations needed for PCREL_OPT.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "tree.h"
+#include "memmodel.h"
+#include "df.h"
+#include "tm_p.h"
+#include "ira.h"
+#include "print-tree.h"
+#include "varasm.h"
+#include "explow.h"
+#include "expr.h"
+#include "output.h"
+#include "tree-pass.h"
+#include "rtx-vector-builder.h"
+#include "print-rtl.h"
+#include "insn-attr.h"
+
+\f
+// Optimize pc-relative references
+const pass_data pass_data_pcrel =
+{
+  RTL_PASS,			// type
+  "pcrel",			// name
+  OPTGROUP_NONE,		// optinfo_flags
+  TV_NONE,			// tv_id
+  0,				// properties_required
+  0,				// properties_provided
+  0,				// properties_destroyed
+  0,				// todo_flags_start
+  TODO_df_finish,		// todo_flags_finish
+};
+
+// Pass data structures
+class pcrel : public rtl_opt_pass
+{
+private:
+  // Function to optimize pc relative loads/stores
+  unsigned int do_pcrel_opt (function *);
+
+  // A GOT pointer used for a load
+  void load_got (rtx_insn *);
+
+  // A load insn that uses the GOT pointer
+  void load_insn (rtx_insn *);
+
+  // A GOT pointer used for a store
+  void store_got (rtx_insn *);
+
+  // A store insn that uses the GOT pointer
+  void store_insn (rtx_insn *);
+
+  // Record the number of loads and stores optimized
+  unsigned long num_got_loads;
+  unsigned long num_got_stores;
+  unsigned long num_loads;
+  unsigned long num_stores;
+  unsigned long num_opt_loads;
+  unsigned long num_opt_stores;
+
+  // We record the GOT insn for each register that sets a GOT for a load or a
+  // store instruction.
+  rtx_insn *got_reg[32];
+
+public:
+  pcrel (gcc::context *ctxt)
+  : rtl_opt_pass (pass_data_pcrel, ctxt),
+    num_got_loads (0),
+    num_got_stores (0),
+    num_loads (0),
+    num_stores (0),
+    num_opt_loads (0),
+    num_opt_stores (0)
+  {}
+
+  ~pcrel (void)
+  {}
+
+  // opt_pass methods:
+  virtual bool gate (function *)
+  {
+    return TARGET_PCREL && TARGET_PCREL_OPT && optimize;
+  }
+
+  virtual unsigned int execute (function *fun)
+  {
+    return do_pcrel_opt (fun);
+  }
+
+  opt_pass *clone ()
+  {
+    return new pcrel (m_ctxt);
+  }
+};
+
+\f
+/* Return a marker to create the backward pointing label that links the load or
+   store to the insn that loads the address of an external label with
+   PCREL_GOT.  This allows us to create the necessary R_PPC64_PCREL_OPT
+   relocation to link the two instructions.  */
+
+static rtx
+pcrel_marker (void)
+{
+  static unsigned int label_number = 0;
+
+  label_number++;
+  return GEN_INT (label_number);
+}
+
+\f
+// Save the current PCREL_OPT load GOT insn address in the register # of the
+// GOT pointer that is loaded.
+//
+// The PCREL_OPT LOAD_GOT insn looks like:
+//
+//	(parallel [(set (base) (addr))
+//		   (set (reg)  (unspec [(const_int 0)] UNSPEC_PCREL_LD))
+//		   (use (marker))])
+//
+// The base register is the GOT address, and the marker is a numeric label that
+// is created in this pass if the only use of the GOT load pointer is for a
+// single load.
+
+void
+pcrel::load_got (rtx_insn *insn)
+{
+  rtx pattern = PATTERN (insn);
+  rtx set = XVECEXP (pattern, 0, 0);
+  int got = REGNO (SET_DEST (set));
+
+  gcc_assert (IN_RANGE (got, FIRST_GPR_REGNO+1, LAST_GPR_REGNO));
+  got_reg[got] = insn;
+  num_got_loads++;
+}
+
+// See if the use of this load of a GOT pointer is the only usage.  If so,
+// allocate a marker to create a label.
+//
+// The PCREL_OPT LOAD insn looks like:
+//
+//	(parallel [(set (reg) (mem))
+//		   (use (reg))
+//		   (use (marker))])
+//
+// Between the reg and the memory might be a SIGN_EXTEND, ZERO_EXTEND, or
+// FLOAT_EXTEND:
+//
+//	(parallel [(set (reg) (sign_extend (mem)))
+//		   (use (reg))
+//		   (use (marker))])
+
+void
+pcrel::load_insn (rtx_insn *insn)
+{
+  num_loads++;
+
+  /* If the optimizer has changed the load instruction, just use the GOT
+     pointer as an address.  */
+  rtx pattern = PATTERN (insn);
+  if (GET_CODE (pattern) != PARALLEL || XVECLEN (pattern, 0) != 3)
+    return;
+
+  rtx set = XVECEXP (pattern, 0, 0);
+  if (GET_CODE (set) != SET
+      || GET_CODE (XVECEXP (pattern, 0, 1)) != USE
+      || GET_CODE (XVECEXP (pattern, 0, 2)) != USE)
+    return;
+
+  rtx dest = SET_DEST (set);
+  rtx src = SET_SRC (set);
+
+  if (!rtx_equal_p (dest, XEXP (XVECEXP (pattern, 0, 1), 0)))
+    return;
+
+  if (GET_CODE (src) == SIGN_EXTEND || GET_CODE (src) == ZERO_EXTEND
+      || GET_CODE (src) == FLOAT_EXTEND)
+    src = XEXP (src, 0);
+
+  if (!MEM_P (src))
+    return;
+
+  rtx addr = XEXP (src, 0);
+  if (!REG_P (addr))
+    return;
+
+  int r = REGNO (addr);
+  if (!IN_RANGE (r, FIRST_GPR_REGNO+1, LAST_GPR_REGNO))
+    return;
+
+  rtx_insn *got_insn = got_reg[r];
+
+  // See if this is the only reference, and there is a set of the GOT pointer
+  // previously in the same basic block.  If this is the only reference,
+  // optimize it.
+  if (got_insn
+      && get_attr_pcrel_opt (got_insn) == PCREL_OPT_LOAD_GOT
+      && !reg_used_between_p (addr, got_insn, insn)
+      && (find_reg_note (insn, REG_DEAD, addr) || rtx_equal_p (dest, addr)))
+    {
+      rtx marker = pcrel_marker ();
+      rtx got_use = XVECEXP (PATTERN (got_insn), 0, 2);
+      rtx insn_use = XVECEXP (pattern, 0, 2);
+
+      gcc_checking_assert (rtx_equal_p (XEXP (got_use, 0), const0_rtx));
+      gcc_checking_assert (rtx_equal_p (XEXP (insn_use, 0), const0_rtx));
+
+      XEXP (got_use, 0) = marker;
+      XEXP (insn_use, 0) = marker;
+      num_opt_loads++;
+    }
+
+  // Forget the GOT now that we've used it.
+  got_reg[r] = (rtx_insn *)0;
+}
+
+// Save the current PCREL_OPT store GOT insn address in the register # of the
+// GOT pointer that is loaded.
+//
+// The PCREL_OPT STORE_GOT insn looks like:
+//
+//	(set (base)
+//	     (unspec:DI [(src)
+//			 (addr)
+//			 (marker)] UNSPEC_PCREL_ST))
+//
+// The base register is the GOT address, and the marker is a numeric label that
+// is created in this pass or 0 to indicate there are other uses of the GOT
+// pointer.
+
+void
+pcrel::store_got (rtx_insn *insn)
+{
+  rtx pattern = PATTERN (insn);
+  int got = REGNO (SET_DEST (pattern));
+
+  gcc_checking_assert (IN_RANGE (got, FIRST_GPR_REGNO+1, LAST_GPR_REGNO));
+  got_reg[got] = insn;
+  num_got_stores++;
+}
+
+// See if the use of this store using a GOT pointer is the only usage.  If so,
+// allocate a marker to create a label.
+//
+// The PCREL_OPT STORE insn looks like:
+//
+//	(parallel [(set (mem) (reg))
+//		   (use (marker))])
+
+void
+pcrel::store_insn (rtx_insn *insn)
+{
+  num_stores++;
+
+  /* If the optimizer has changed the store instruction, just use the GOT
+     pointer as an address.  */
+  rtx pattern = PATTERN (insn);
+  if (GET_CODE (pattern) != PARALLEL || XVECLEN (pattern, 0) != 2)
+    return;
+
+  rtx set = XVECEXP (pattern, 0, 0);
+  if (GET_CODE (set) != SET || GET_CODE (XVECEXP (pattern, 0, 1)) != USE)
+    return;
+
+  rtx dest = SET_DEST (set);
+
+  if (!MEM_P (dest))
+    return;
+
+  rtx addr = XEXP (dest, 0);
+  if (!REG_P (addr))
+    return;
+
+  int r = REGNO (addr);
+  if (!IN_RANGE (r, FIRST_GPR_REGNO+1, LAST_GPR_REGNO))
+    return;
+
+  rtx_insn *got_insn = got_reg[r];
+
+  // See if this is the only reference, and there is a GOT pointer previously.
+  // If this is the only reference, optimize it.
+  if (got_insn
+      && get_attr_pcrel_opt (got_insn) == PCREL_OPT_STORE_GOT
+      && !reg_used_between_p (addr, got_insn, insn)
+      && find_reg_note (insn, REG_DEAD, addr))
+    {
+      rtx marker = pcrel_marker ();
+      rtx got_src = SET_SRC (PATTERN (got_insn));
+      rtx insn_use = XVECEXP (pattern, 0, 1);
+
+      gcc_checking_assert (rtx_equal_p (XVECEXP (got_src, 0, 2), const0_rtx));
+      gcc_checking_assert (rtx_equal_p (XEXP (insn_use, 0), const0_rtx));
+
+      XVECEXP (got_src, 0, 2) = marker;
+      XEXP (insn_use, 0) = marker;
+      num_opt_stores++;
+    }
+
+  // Forget the GOT now
+  got_reg[r] = (rtx_insn *)0;
+}
+
+// Optimize pcrel external variable references
+
+unsigned int
+pcrel::do_pcrel_opt (function *fun)
+{
+  basic_block bb;
+  rtx_insn *insn, *curr_insn = 0;
+
+  // Dataflow analysis for use-def chains.
+  df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
+  df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
+  df_analyze ();
+  df_set_flags (DF_DEFER_INSN_RESCAN | DF_LR_RUN_DCE);
+
+  // Look at each basic block to see if there is a load of an external
+  // variable's GOT address, and a single load/store using that GOT address.
+  FOR_ALL_BB_FN (bb, fun)
+    {
+      bool clear_got_p = true;
+
+      FOR_BB_INSNS_SAFE (bb, insn, curr_insn)
+	{
+	  if (clear_got_p)
+	    {
+	      memset ((void *) &got_reg[0], 0, sizeof (got_reg));
+	      clear_got_p = false;
+	    }
+
+	  if (NONJUMP_INSN_P (insn))
+	    {
+	      rtx pattern = PATTERN (insn);
+	      if (GET_CODE (pattern) == SET || GET_CODE (pattern) == PARALLEL)
+		{
+		  switch (get_attr_pcrel_opt (insn))
+		    {
+		    case PCREL_OPT_NO:
+		      break;
+
+		    case PCREL_OPT_LOAD_GOT:
+		      load_got (insn);
+		      break;
+
+		    case PCREL_OPT_LOAD:
+		      load_insn (insn);
+		      break;
+
+		    case PCREL_OPT_STORE_GOT:
+		      store_got (insn);
+		      break;
+
+		    case PCREL_OPT_STORE:
+		      store_insn (insn);
+		      break;
+
+		    default:
+		      gcc_unreachable ();
+		    }
+		}
+	    }
+
+	  /* Don't let the GOT load be moved before a label, jump, or call and
+	     the dependent load/store after the label, jump, or call.  */
+	  else if (JUMP_P (insn) || CALL_P (insn) || LABEL_P (insn))
+	    clear_got_p = true;
+	}
+    }
+
+  // Rebuild ud chains.
+  df_remove_problem (df_chain);
+  df_process_deferred_rescans ();
+  df_set_flags (DF_RD_PRUNE_DEAD_DEFS | DF_LR_RUN_DCE);
+  df_chain_add_problem (DF_UD_CHAIN);
+  df_analyze ();
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "\npc-relative optimizations:\n");
+      fprintf (dump_file, "\tgot loads        = %lu\n", num_got_loads);
+      fprintf (dump_file, "\tpotential loads  = %lu\n", num_loads);
+      fprintf (dump_file, "\toptimized loads  = %lu\n", num_opt_loads);
+      fprintf (dump_file, "\tgot stores       = %lu\n", num_got_stores);
+      fprintf (dump_file, "\tpotential stores = %lu\n", num_stores);
+      fprintf (dump_file, "\toptimized stores = %lu\n\n", num_opt_stores);
+    }
+
+  return 0;
+}
+
+\f
+rtl_opt_pass *
+make_pass_pcrel_opt (gcc::context *ctxt)
+{
+  return new pcrel (ctxt);
+}
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 274874)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -266,6 +266,7 @@ extern bool rs6000_linux_float_exception
 namespace gcc { class context; }
 class rtl_opt_pass;
 
+extern rtl_opt_pass *make_pass_pcrel_opt (gcc::context *);
 extern rtl_opt_pass *make_pass_analyze_swaps (gcc::context *);
 extern bool rs6000_sum_of_two_registers_p (const_rtx expr);
 extern bool rs6000_quadword_masked_address_p (const_rtx exp);
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 274875)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -4415,7 +4415,7 @@ rs6000_option_override_internal (bool gl
 	  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
 	    error ("%qs requires %qs", "-mpcrel", "-mcmodel=medium");
 
-	  rs6000_isa_flags &= ~OPTION_MASK_PCREL;
+	  rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PCREL_OPT);
 	}
 
       /* Enable defaults if desired.  */
@@ -4429,7 +4429,11 @@ rs6000_option_override_internal (bool gl
 
 	  if (!explicit_pcrel && TARGET_PCREL_DEFAULT
 	      && TARGET_CMODEL == CMODEL_MEDIUM)
-	    rs6000_isa_flags |= OPTION_MASK_PCREL;
+	    {
+	      rs6000_isa_flags |= OPTION_MASK_PCREL;
+	      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL_OPT) == 0)
+		rs6000_isa_flags |= OPTION_MASK_PCREL_OPT;
+	    }
 	}
     }
 
@@ -4453,6 +4457,15 @@ rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_PCREL;
     }
 
+  /* Check -mfuture debug switches.  */
+  if (!TARGET_PCREL && TARGET_PCREL_OPT)
+    {
+      if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL_OPT) != 0)
+	error ("%qs requires %qs", "-mpcrel-opt", "-mpcrel");
+
+      rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT;
+    }
+
   if (TARGET_DEBUG_REG || TARGET_DEBUG_TARGET)
     rs6000_print_isa_options (stderr, 0, "after subtarget", rs6000_isa_flags);
 
@@ -14244,13 +14257,40 @@ prefixed_paddi_p (rtx_insn *insn)
    instruction is printed out.  */
 static bool next_insn_prefixed_p;
 
+/* Numeric label at the address of the GOT load instruction + 8; the
+   R_PPC64_PCREL_OPT relocation for the next instruction is linked to it.  */
+static unsigned int pcrel_opt_label_num;
+
 /* Define FINAL_PRESCAN_INSN if some processing needs to be done before
    outputting the assembler code.  On the PowerPC, we remember if the current
-   insn is a prefixed insn where we need to emit a 'p' before the insn.  */
+   insn is a prefixed insn where we need to emit a 'p' before the insn.
+
+   In addition, if the insn is part of a pc-relative reference to an external
+   label optimization, this is recorded also.  */
 void
-rs6000_final_prescan_insn (rtx_insn *insn, rtx [], int)
+rs6000_final_prescan_insn (rtx_insn *insn, rtx operands[], int noperands)
 {
   next_insn_prefixed_p = (get_attr_prefixed (insn) != PREFIXED_NO);
+
+  enum attr_pcrel_opt pcrel_attr = get_attr_pcrel_opt (insn);
+
+  /* For the load and store instructions that are tied to a GOT pointer, we
+     know that operand 3 contains a marker for loads and operand 2 contains
+     the marker for stores.  If it is non-zero, it is the numeric label where
+     we load the address + 8.  */
+  if (pcrel_attr == PCREL_OPT_LOAD)
+    {
+      gcc_assert (noperands > 3);
+      pcrel_opt_label_num = INTVAL (operands[3]);
+    }
+  else if (pcrel_attr == PCREL_OPT_STORE)
+    {
+      gcc_assert (noperands > 2);
+      pcrel_opt_label_num = INTVAL (operands[2]);
+    }
+  else
+    pcrel_opt_label_num = 0;
+
   return;
 }
 
@@ -14260,6 +14300,13 @@ rs6000_final_prescan_insn (rtx_insn *ins
 void
 rs6000_asm_output_opcode (FILE *stream)
 {
+  if (pcrel_opt_label_num)
+    {
+      fprintf (stream, ".reloc .Lpcrel%u-8,R_PPC64_PCREL_OPT,.-(.Lpcrel%u-8)\n\t",
+	       pcrel_opt_label_num, pcrel_opt_label_num);
+      pcrel_opt_label_num = 0;
+    }
+
   if (next_insn_prefixed_p)
     fputc ('p', stream);
 
@@ -23422,6 +23469,7 @@ static struct rs6000_opt_mask const rs60
   { "mulhw",			OPTION_MASK_MULHW,		false, true  },
   { "multiple",			OPTION_MASK_MULTIPLE,		false, true  },
   { "pcrel",			OPTION_MASK_PCREL,		false, true  },
+  { "pcrel-opt",		OPTION_MASK_PCREL_OPT,		false, true  },
   { "popcntb",			OPTION_MASK_POPCNTB,		false, true  },
   { "popcntd",			OPTION_MASK_POPCNTD,		false, true  },
   { "power8-fusion",		OPTION_MASK_P8_FUSION,		false, true  },
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 274874)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -258,6 +258,31 @@ (define_attr "var_shift" "no,yes"
 ;; Is copying of this instruction disallowed?
 (define_attr "cannot_copy" "no,yes" (const_string "no"))
 
+;; Whether this instruction is part of the two instruction sequence that
+;; supports PCREL_OPT optimizations, where the linker can change code of the
+;; form:
+;;
+;;		pld b,var@got@pcrel
+;;	100:
+;;		# possibly other instructions
+;;		.reloc 100b-8,R_PPC64_PCREL_OPT,0
+;;		lwz r,0(b)
+;;
+;; into the following if 'var' is in the main program:
+;;
+;;		plwz r,var@pcrel
+;;		# possibly other instructions
+;;		nop
+;;
+;; The states are:
+;;	no		-- insn is not involved with PCREL_OPT optimizations
+;;	load_got	-- insn loads up the GOT pointer for a load instruction
+;;	load		-- insn is an offsettable load that uses the GOT pointer
+;;	store_got	-- insn loads up the GOT pointer for a store instruction
+;;	store		-- insn is an offsettable store that uses the GOT pointer
+
+(define_attr "pcrel_opt" "no,load_got,load,store_got,store" (const_string "no"))
+
 ;; Whether an insn is a prefixed insn, and an initial 'p' should be printed
 ;; before the instruction.  A prefixed instruction has a prefix instruction
 ;; word that extends the immediate value of the instructions from 12-16 bits to
@@ -14726,6 +14751,7 @@ (define_insn "*cmpeqb_internal"
   [(set_attr "type" "logical")])
 \f
 
+(include "pcrel.md")
 (include "sync.md")
 (include "vector.md")
 (include "vsx.md")
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(revision 274864)
+++ gcc/config/rs6000/rs6000.opt	(working copy)
@@ -577,3 +577,7 @@ Generate (do not generate) prefixed memo
 mpcrel
 Target Report Mask(PCREL) Var(rs6000_isa_flags)
 Generate (do not generate) pc-relative memory addressing.
+
+mpcrel-opt
+Target Undocumented Mask(PCREL_OPT) Var(rs6000_isa_flags)
+Generate (do not generate) pc-relative memory optimizations for externals.
Index: gcc/config/rs6000/t-rs6000
===================================================================
--- gcc/config/rs6000/t-rs6000	(revision 274864)
+++ gcc/config/rs6000/t-rs6000	(working copy)
@@ -47,6 +47,10 @@ rs6000-call.o: $(srcdir)/config/rs6000/r
 	$(COMPILE) $<
 	$(POSTCOMPILE)
 
+rs6000-pcrel.o: $(srcdir)/config/rs6000/rs6000-pcrel.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
 $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \
   $(srcdir)/config/rs6000/rs6000-cpus.def
 	$(SHELL) $(srcdir)/config/rs6000/genopt.sh $(srcdir)/config/rs6000 > \
@@ -79,6 +83,7 @@ MD_INCLUDES = $(srcdir)/config/rs6000/rs
 	$(srcdir)/config/rs6000/predicates.md \
 	$(srcdir)/config/rs6000/constraints.md \
 	$(srcdir)/config/rs6000/darwin.md \
+	$(srcdir)/config/rs6000/pcrel.md \
 	$(srcdir)/config/rs6000/sync.md \
 	$(srcdir)/config/rs6000/vector.md \
 	$(srcdir)/config/rs6000/vsx.md \

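For reviewers who want to see the pairing rule from rs6000-pcrel.c outside of
RTL, here is a minimal standalone sketch. It is not code from the patch: the
Insn struct, the Kind enum, and link_pairs are hypothetical names, base_dies
stands in for the REG_DEAD note, and the scan over the intervening insns stands
in for reg_used_between_p. It only models one basic block.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for an RTL insn in one basic block.
enum class Kind { Other, Barrier, LoadGot, Load, StoreGot, Store };

struct Insn
{
  Kind kind;
  int base;         // GOT pointer register; for Other, a register it reads (-1 = none)
  int dest;         // result register of a Load, otherwise -1
  bool base_dies;   // models a REG_DEAD note for the base register
  unsigned marker;  // 0 = not linked to a partner insn
};

// Link each GOT address load to its single dependent load/store by giving
// both the same non-zero marker.  Returns the number of pairs linked.
unsigned
link_pairs (std::vector<Insn> &bb)
{
  std::array<int, 32> got_reg;   // index of the pending GOT insn per register
  got_reg.fill (-1);
  unsigned next_marker = 0, linked = 0;

  for (std::size_t i = 0; i < bb.size (); i++)
    {
      Insn &insn = bb[i];
      if (insn.kind == Kind::Barrier)   // label, jump, or call: forget everything
        {
          got_reg.fill (-1);
          continue;
        }
      if (insn.kind == Kind::Other || insn.base < 1 || insn.base > 31)
        continue;                       // r0 is never used as the GOT pointer
      if (insn.kind == Kind::LoadGot || insn.kind == Kind::StoreGot)
        {
          got_reg[insn.base] = static_cast<int> (i);
          continue;
        }

      // A dependent load or store: check the single-use conditions.
      int g = got_reg[insn.base];
      Kind want = insn.kind == Kind::Load ? Kind::LoadGot : Kind::StoreGot;
      bool ok = g >= 0 && bb[g].kind == want;
      for (std::size_t j = g + 1; ok && j < i; j++)
        if (bb[j].base == insn.base)    // base register used in between
          ok = false;
      if (insn.kind == Kind::Load)
        ok = ok && (insn.base_dies || insn.dest == insn.base);
      else
        ok = ok && insn.base_dies;

      if (ok)
        {
          unsigned marker = ++next_marker;
          bb[g].marker = marker;
          insn.marker = marker;
          linked++;
        }
      got_reg[insn.base] = -1;          // forget the GOT now that it was used
    }
  return linked;
}
```

A block holding a GOT load into r9, an unrelated insn, and a load through r9
where r9 dies gets one linked pair. Adding any insn that reads r9 between the
two leaves both markers at 0, which is the fallback the patch describes.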

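The directive that rs6000_asm_output_opcode prints in front of the dependent
load or store can be illustrated the same way. This is a sketch only:
pcrel_opt_reloc_text is a hypothetical helper that is not in the patch, and it
simply reuses the format string from the hunk above without the trailing
newline and tab.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not in the patch): build the .reloc directive text
// for the numeric marker that links a GOT load to its load/store.
std::string
pcrel_opt_reloc_text (unsigned int label_num)
{
  char buf[128];
  std::snprintf (buf, sizeof buf,
                 ".reloc .Lpcrel%u-8,R_PPC64_PCREL_OPT,.-(.Lpcrel%u-8)",
                 label_num, label_num);
  return std::string (buf);
}
```

For marker 3 this yields
".reloc .Lpcrel3-8,R_PPC64_PCREL_OPT,.-(.Lpcrel3-8)", so the relocation is
placed on the GOT load (8 bytes before the label) and its addend is the
distance from there to the dependent instruction.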
-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.ibm.com, phone: +1 (978) 899-4797
