* Re: [RFC Patch], PowerPC memory support pre-gcc9, patch #2
2018-03-14 23:01 [RFC Patch], PowerPC memory support pre-gcc9, patch #1 Michael Meissner
@ 2018-03-15 17:09 ` Michael Meissner
2018-03-20 13:32 ` Segher Boessenkool
2018-03-15 23:33 ` [RFC Patch], PowerPC memory support pre-gcc9, patch #3 Michael Meissner
` (3 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Michael Meissner @ 2018-03-15 17:09 UTC
To: Michael Meissner, GCC Patches, Segher Boessenkool,
David Edelsohn, Bill Schmidt
[-- Attachment #1: Type: text/plain, Size: 1313 bytes --]
This is patch #2 of my series for improving the PowerPC internal memory
support. It assumes patch #1 has been applied.
This patch moves the rs6000_output_move_128bit function, along with
rs6000_move_128bit_ok_p and rs6000_split_128bit_ok_p, from rs6000.c to a new
file, rs6000-output.c.
The third patch will create a rs6000_output_move_64bit function and change both
32-bit and 64-bit movdi to call it, instead of having all of the instructions
be literals. I will also likely add improvements to setting the reg_addr
address masks for offsettable addresses.
The fourth patch will likely move movdd and movdf to call
rs6000_output_move_64bit as well.
I tested this on a little endian power8 system and there were no regressions.
2018-03-14 Michael Meissner <meissner@linux.vnet.ibm.com>
* config.gcc (powerpc*-*-*): Add rs6000-output.o to extra_objs.
* config/rs6000/t-rs6000 (rs6000-output.o): Add build rule.
* config/rs6000/rs6000.c (rs6000_output_move_128bit): Move to
rs6000-output.c.
(rs6000_move_128bit_ok_p): Likewise.
(rs6000_split_128bit_ok_p): Likewise.
* config/rs6000/rs6000-output.c (rs6000_output_move_128bit): Move
here from rs6000.c.
(rs6000_move_128bit_ok_p): Likewise.
(rs6000_split_128bit_ok_p): Likewise.
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
[-- Attachment #2: ext-addr.patch02b --]
[-- Type: text/plain, Size: 12388 bytes --]
Index: gcc/config.gcc
===================================================================
--- gcc/config.gcc (revision 258531)
+++ gcc/config.gcc (working copy)
@@ -466,7 +466,7 @@ powerpc*-*-*spe*)
;;
powerpc*-*-*)
cpu_type=rs6000
- extra_objs="rs6000-string.o rs6000-p8swap.o"
+ extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-output.o"
extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h"
extra_headers="${extra_headers} xmmintrin.h mm_malloc.h emmintrin.h"
Index: gcc/config/rs6000/t-rs6000
===================================================================
--- gcc/config/rs6000/t-rs6000 (revision 258530)
+++ gcc/config/rs6000/t-rs6000 (working copy)
@@ -30,6 +30,10 @@ rs6000-string.o: $(srcdir)/config/rs6000
$(COMPILE) $<
$(POSTCOMPILE)
+rs6000-output.o: $(srcdir)/config/rs6000/rs6000-output.c
+ $(COMPILE) $<
+ $(POSTCOMPILE)
+
rs6000-p8swap.o: $(srcdir)/config/rs6000/rs6000-p8swap.c
$(COMPILE) $<
$(POSTCOMPILE)
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 258535)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -20921,205 +20921,6 @@ rs6000_debug_can_change_mode_class (mach
return ret;
}
\f
-/* Return a string to do a move operation of 128 bits of data. */
-
-const char *
-rs6000_output_move_128bit (rtx operands[])
-{
- rtx dest = operands[0];
- rtx src = operands[1];
- machine_mode mode = GET_MODE (dest);
- int dest_regno;
- int src_regno;
- bool dest_gpr_p, dest_fp_p, dest_vmx_p, dest_vsx_p;
- bool src_gpr_p, src_fp_p, src_vmx_p, src_vsx_p;
-
- if (REG_P (dest))
- {
- dest_regno = REGNO (dest);
- dest_gpr_p = INT_REGNO_P (dest_regno);
- dest_fp_p = FP_REGNO_P (dest_regno);
- dest_vmx_p = ALTIVEC_REGNO_P (dest_regno);
- dest_vsx_p = dest_fp_p | dest_vmx_p;
- }
- else
- {
- dest_regno = -1;
- dest_gpr_p = dest_fp_p = dest_vmx_p = dest_vsx_p = false;
- }
-
- if (REG_P (src))
- {
- src_regno = REGNO (src);
- src_gpr_p = INT_REGNO_P (src_regno);
- src_fp_p = FP_REGNO_P (src_regno);
- src_vmx_p = ALTIVEC_REGNO_P (src_regno);
- src_vsx_p = src_fp_p | src_vmx_p;
- }
- else
- {
- src_regno = -1;
- src_gpr_p = src_fp_p = src_vmx_p = src_vsx_p = false;
- }
-
- /* Register moves. */
- if (dest_regno >= 0 && src_regno >= 0)
- {
- if (dest_gpr_p)
- {
- if (src_gpr_p)
- return "#";
-
- if (TARGET_DIRECT_MOVE_128 && src_vsx_p)
- return (WORDS_BIG_ENDIAN
- ? "mfvsrd %0,%x1\n\tmfvsrld %L0,%x1"
- : "mfvsrd %L0,%x1\n\tmfvsrld %0,%x1");
-
- else if (TARGET_VSX && TARGET_DIRECT_MOVE && src_vsx_p)
- return "#";
- }
-
- else if (TARGET_VSX && dest_vsx_p)
- {
- if (src_vsx_p)
- return "xxlor %x0,%x1,%x1";
-
- else if (TARGET_DIRECT_MOVE_128 && src_gpr_p)
- return (WORDS_BIG_ENDIAN
- ? "mtvsrdd %x0,%1,%L1"
- : "mtvsrdd %x0,%L1,%1");
-
- else if (TARGET_DIRECT_MOVE && src_gpr_p)
- return "#";
- }
-
- else if (TARGET_ALTIVEC && dest_vmx_p && src_vmx_p)
- return "vor %0,%1,%1";
-
- else if (dest_fp_p && src_fp_p)
- return "#";
- }
-
- /* Loads. */
- else if (dest_regno >= 0 && MEM_P (src))
- {
- if (dest_gpr_p)
- {
- if (TARGET_QUAD_MEMORY && quad_load_store_p (dest, src))
- return "lq %0,%1";
- else
- return "#";
- }
-
- else if (TARGET_ALTIVEC && dest_vmx_p
- && altivec_indexed_or_indirect_operand (src, mode))
- return "lvx %0,%y1";
-
- else if (TARGET_VSX && dest_vsx_p)
- {
- if (mode_supports_dq_form (mode)
- && quad_address_p (XEXP (src, 0), mode, true))
- return "lxv %x0,%1";
-
- else if (TARGET_P9_VECTOR)
- return "lxvx %x0,%y1";
-
- else if (mode == V16QImode || mode == V8HImode || mode == V4SImode)
- return "lxvw4x %x0,%y1";
-
- else
- return "lxvd2x %x0,%y1";
- }
-
- else if (TARGET_ALTIVEC && dest_vmx_p)
- return "lvx %0,%y1";
-
- else if (dest_fp_p)
- return "#";
- }
-
- /* Stores. */
- else if (src_regno >= 0 && MEM_P (dest))
- {
- if (src_gpr_p)
- {
- if (TARGET_QUAD_MEMORY && quad_load_store_p (dest, src))
- return "stq %1,%0";
- else
- return "#";
- }
-
- else if (TARGET_ALTIVEC && src_vmx_p
- && altivec_indexed_or_indirect_operand (src, mode))
- return "stvx %1,%y0";
-
- else if (TARGET_VSX && src_vsx_p)
- {
- if (mode_supports_dq_form (mode)
- && quad_address_p (XEXP (dest, 0), mode, true))
- return "stxv %x1,%0";
-
- else if (TARGET_P9_VECTOR)
- return "stxvx %x1,%y0";
-
- else if (mode == V16QImode || mode == V8HImode || mode == V4SImode)
- return "stxvw4x %x1,%y0";
-
- else
- return "stxvd2x %x1,%y0";
- }
-
- else if (TARGET_ALTIVEC && src_vmx_p)
- return "stvx %1,%y0";
-
- else if (src_fp_p)
- return "#";
- }
-
- /* Constants. */
- else if (dest_regno >= 0
- && (GET_CODE (src) == CONST_INT
- || GET_CODE (src) == CONST_WIDE_INT
- || GET_CODE (src) == CONST_DOUBLE
- || GET_CODE (src) == CONST_VECTOR))
- {
- if (dest_gpr_p)
- return "#";
-
- else if ((dest_vmx_p && TARGET_ALTIVEC)
- || (dest_vsx_p && TARGET_VSX))
- return output_vec_const_move (operands);
- }
-
- fatal_insn ("Bad 128-bit move", gen_rtx_SET (dest, src));
-}
-
-/* Validate a 128-bit move. */
-bool
-rs6000_move_128bit_ok_p (rtx operands[])
-{
- machine_mode mode = GET_MODE (operands[0]);
- return (gpc_reg_operand (operands[0], mode)
- || gpc_reg_operand (operands[1], mode));
-}
-
-/* Return true if a 128-bit move needs to be split. */
-bool
-rs6000_split_128bit_ok_p (rtx operands[])
-{
- if (!reload_completed)
- return false;
-
- if (!gpr_or_gpr_p (operands[0], operands[1]))
- return false;
-
- if (quad_load_store_p (operands[0], operands[1]))
- return false;
-
- return true;
-}
-
-\f
/* Given a comparison operation, return the bit number in CCR to test. We
know this is a valid comparison.
Index: gcc/config/rs6000/rs6000-output.c
===================================================================
--- gcc/config/rs6000/rs6000-output.c (revision 0)
+++ gcc/config/rs6000/rs6000-output.c (revision 0)
@@ -0,0 +1,246 @@
+/* Subroutines used to emit code and split insns for PowerPC.
+ Copyright (C) 2018 Free Software Foundation, Inc.
+ Contributed by Richard Kenner (kenner@vlsi1.ultra.nyu.edu)
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published
+ by the Free Software Foundation; either version 3, or (at your
+ option) any later version.
+
+ GCC is distributed in the hope that it will be useful, but WITHOUT
+ ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+ or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
+ License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "tree.h"
+#include "memmodel.h"
+#include "tm_p.h"
+#include "expmed.h"
+#include "optabs.h"
+#include "regs.h"
+#include "ira.h"
+#include "recog.h"
+#include "insn-attr.h"
+#include "flags.h"
+#include "print-tree.h"
+#include "fold-const.h"
+#include "stringpool.h"
+#include "attribs.h"
+#include "varasm.h"
+#include "explow.h"
+#include "expr.h"
+#include "output.h"
+#include "target.h"
+#include "tm-constrs.h"
+
+\f
+/* Return a string to do a move operation of 128 bits of data. */
+
+const char *
+rs6000_output_move_128bit (rtx operands[])
+{
+ rtx dest = operands[0];
+ rtx src = operands[1];
+ machine_mode mode = GET_MODE (dest);
+ int dest_regno;
+ int src_regno;
+ bool dest_gpr_p, dest_fp_p, dest_vmx_p, dest_vsx_p;
+ bool src_gpr_p, src_fp_p, src_vmx_p, src_vsx_p;
+
+ if (REG_P (dest))
+ {
+ dest_regno = REGNO (dest);
+ dest_gpr_p = INT_REGNO_P (dest_regno);
+ dest_fp_p = FP_REGNO_P (dest_regno);
+ dest_vmx_p = ALTIVEC_REGNO_P (dest_regno);
+ dest_vsx_p = dest_fp_p | dest_vmx_p;
+ }
+ else
+ {
+ dest_regno = -1;
+ dest_gpr_p = dest_fp_p = dest_vmx_p = dest_vsx_p = false;
+ }
+
+ if (REG_P (src))
+ {
+ src_regno = REGNO (src);
+ src_gpr_p = INT_REGNO_P (src_regno);
+ src_fp_p = FP_REGNO_P (src_regno);
+ src_vmx_p = ALTIVEC_REGNO_P (src_regno);
+ src_vsx_p = src_fp_p | src_vmx_p;
+ }
+ else
+ {
+ src_regno = -1;
+ src_gpr_p = src_fp_p = src_vmx_p = src_vsx_p = false;
+ }
+
+ /* Register moves. */
+ if (dest_regno >= 0 && src_regno >= 0)
+ {
+ if (dest_gpr_p)
+ {
+ if (src_gpr_p)
+ return "#";
+
+ if (TARGET_DIRECT_MOVE_128 && src_vsx_p)
+ return (WORDS_BIG_ENDIAN
+ ? "mfvsrd %0,%x1\n\tmfvsrld %L0,%x1"
+ : "mfvsrd %L0,%x1\n\tmfvsrld %0,%x1");
+
+ else if (TARGET_VSX && TARGET_DIRECT_MOVE && src_vsx_p)
+ return "#";
+ }
+
+ else if (TARGET_VSX && dest_vsx_p)
+ {
+ if (src_vsx_p)
+ return "xxlor %x0,%x1,%x1";
+
+ else if (TARGET_DIRECT_MOVE_128 && src_gpr_p)
+ return (WORDS_BIG_ENDIAN
+ ? "mtvsrdd %x0,%1,%L1"
+ : "mtvsrdd %x0,%L1,%1");
+
+ else if (TARGET_DIRECT_MOVE && src_gpr_p)
+ return "#";
+ }
+
+ else if (TARGET_ALTIVEC && dest_vmx_p && src_vmx_p)
+ return "vor %0,%1,%1";
+
+ else if (dest_fp_p && src_fp_p)
+ return "#";
+ }
+
+ /* Loads. */
+ else if (dest_regno >= 0 && MEM_P (src))
+ {
+ if (dest_gpr_p)
+ {
+ if (TARGET_QUAD_MEMORY && quad_load_store_p (dest, src))
+ return "lq %0,%1";
+ else
+ return "#";
+ }
+
+ else if (TARGET_ALTIVEC && dest_vmx_p
+ && altivec_indexed_or_indirect_operand (src, mode))
+ return "lvx %0,%y1";
+
+ else if (TARGET_VSX && dest_vsx_p)
+ {
+ if (mode_supports_dq_form (mode)
+ && quad_address_p (XEXP (src, 0), mode, true))
+ return "lxv %x0,%1";
+
+ else if (TARGET_P9_VECTOR)
+ return "lxvx %x0,%y1";
+
+ else if (mode == V16QImode || mode == V8HImode || mode == V4SImode)
+ return "lxvw4x %x0,%y1";
+
+ else
+ return "lxvd2x %x0,%y1";
+ }
+
+ else if (TARGET_ALTIVEC && dest_vmx_p)
+ return "lvx %0,%y1";
+
+ else if (dest_fp_p)
+ return "#";
+ }
+
+ /* Stores. */
+ else if (src_regno >= 0 && MEM_P (dest))
+ {
+ if (src_gpr_p)
+ {
+ if (TARGET_QUAD_MEMORY && quad_load_store_p (dest, src))
+ return "stq %1,%0";
+ else
+ return "#";
+ }
+
+ else if (TARGET_ALTIVEC && src_vmx_p
+ && altivec_indexed_or_indirect_operand (dest, mode))
+ return "stvx %1,%y0";
+
+ else if (TARGET_VSX && src_vsx_p)
+ {
+ if (mode_supports_dq_form (mode)
+ && quad_address_p (XEXP (dest, 0), mode, true))
+ return "stxv %x1,%0";
+
+ else if (TARGET_P9_VECTOR)
+ return "stxvx %x1,%y0";
+
+ else if (mode == V16QImode || mode == V8HImode || mode == V4SImode)
+ return "stxvw4x %x1,%y0";
+
+ else
+ return "stxvd2x %x1,%y0";
+ }
+
+ else if (TARGET_ALTIVEC && src_vmx_p)
+ return "stvx %1,%y0";
+
+ else if (src_fp_p)
+ return "#";
+ }
+
+ /* Constants. */
+ else if (dest_regno >= 0
+ && (GET_CODE (src) == CONST_INT
+ || GET_CODE (src) == CONST_WIDE_INT
+ || GET_CODE (src) == CONST_DOUBLE
+ || GET_CODE (src) == CONST_VECTOR))
+ {
+ if (dest_gpr_p)
+ return "#";
+
+ else if ((dest_vmx_p && TARGET_ALTIVEC)
+ || (dest_vsx_p && TARGET_VSX))
+ return output_vec_const_move (operands);
+ }
+
+ fatal_insn ("Bad 128-bit move", gen_rtx_SET (dest, src));
+}
+
+/* Validate a 128-bit move. */
+bool
+rs6000_move_128bit_ok_p (rtx operands[])
+{
+ machine_mode mode = GET_MODE (operands[0]);
+ return (gpc_reg_operand (operands[0], mode)
+ || gpc_reg_operand (operands[1], mode));
+}
+
+/* Return true if a 128-bit move needs to be split. */
+bool
+rs6000_split_128bit_ok_p (rtx operands[])
+{
+ if (!reload_completed)
+ return false;
+
+ if (!gpr_or_gpr_p (operands[0], operands[1]))
+ return false;
+
+ if (quad_load_store_p (operands[0], operands[1]))
+ return false;
+
+ return true;
+}
* Re: [RFC Patch], PowerPC memory support pre-gcc9, patch #2
2018-03-15 17:09 ` [RFC Patch], PowerPC memory support pre-gcc9, patch #2 Michael Meissner
@ 2018-03-20 13:32 ` Segher Boessenkool
2018-03-20 20:27 ` Michael Meissner
0 siblings, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2018-03-20 13:32 UTC
To: Michael Meissner, GCC Patches, David Edelsohn, Bill Schmidt
On Thu, Mar 15, 2018 at 01:04:30PM -0400, Michael Meissner wrote:
> This is patch #2 of my series for improving the PowerPC internal memory
> support. It assumes patch #1 has been applied.
>
> This patch moves the rs6000_move_128bit function from rs6000.c to a new file,
> rs6000-output.c.
>
> The third patch will create a rs6000_move_64bit function and change both 32-bit
> and 64-bit movdi to call it, instead of having all of the instructions be
> literals. I will also likely add improvements to setting the reg_addr address
> masks for offsetable addresses.
>
> The fourth patch will like move movdd and movdf to call rs6000_move_64bit as
> well.
>
> I tested this on a little endian power8 system and there were no regressions.
>
> 2018-03-14 Michael Meissner <meissner@linux.vnet.ibm.com>
>
> * config.gcc (powerpc*-*-*): Add rs6000-output.o to extra_objs.
> * config/rs6000/t-rs6000 (rs6000-output.o): Add build rule.
> * config/rs6000/rs6000.c (rs6000_output_move_128bit): Move to
> rs6000-output.c.
I am not happy at all with this new file, and it won't even work as far
as I can see (for multi-alternative define_insn's; splitting the strings to
a different file than the constraints and attributes is asking for
trouble, better keep it all together).
Files should bundle together code that conceptually belongs together,
not some arbitrary split ("these routines return strings that are
eventually output from the compiler as instructions").
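To illustrate the concern about multi-alternative patterns: in a define_insn the Nth constraint, the Nth "type" attribute value, and the Nth output template all describe the same alternative, so they are matched purely by position. The following is only a rough sketch in plain C, not GCC code. The struct and function names are hypothetical, and the type and template strings are the first nine alternatives of movdi_internal64. It shows what "keeping it all together" looks like when each alternative is one record:

```c
#include <stddef.h>

/* One alternative of a multi-alternative insn pattern.  The field names are
   hypothetical; a real define_insn keeps the constraints, the "type"
   attribute, and the output templates as parallel lists in the .md file.  */
struct sk_alternative
{
  const char *type;      /* value of the "type" attribute */
  const char *output;    /* assembler output template */
};

/* First nine alternatives of movdi_internal64.  Because each row carries
   both strings, they cannot drift out of step with each other.  */
static const struct sk_alternative sk_movdi64[] = {
  { "store",    "std%U0%X0 %1,%0" },
  { "load",     "ld%U1%X1 %0,%1" },
  { "*",        "mr %0,%1" },
  { "*",        "li %0,%1" },
  { "*",        "lis %0,%v1" },
  { "*",        "#" },
  { "fpstore",  "stfd%U0%X0 %1,%0" },
  { "fpload",   "lfd%U1%X1 %0,%1" },
  { "fpsimple", "fmr %0,%1" },
};

#define SK_NUM_ALTS (sizeof sk_movdi64 / sizeof sk_movdi64[0])

/* Return the record for alternative WHICH, or NULL if it is out of range.  */
const struct sk_alternative *
sk_lookup_alternative (size_t which)
{
  return which < SK_NUM_ALTS ? &sk_movdi64[which] : NULL;
}
```

Once the templates move into a separate C function that re-derives the instruction from the operands, nothing ties its choice back to the alternative that the register allocator picked, which is the mismatch risk described above.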
Segher
* Re: [RFC Patch], PowerPC memory support pre-gcc9, patch #2
2018-03-20 13:32 ` Segher Boessenkool
@ 2018-03-20 20:27 ` Michael Meissner
0 siblings, 0 replies; 11+ messages in thread
From: Michael Meissner @ 2018-03-20 20:27 UTC
To: Segher Boessenkool
Cc: Michael Meissner, GCC Patches, David Edelsohn, Bill Schmidt
On Tue, Mar 20, 2018 at 08:30:57AM -0500, Segher Boessenkool wrote:
> On Thu, Mar 15, 2018 at 01:04:30PM -0400, Michael Meissner wrote:
> > This is patch #2 of my series for improving the PowerPC internal memory
> > support. It assumes patch #1 has been applied.
> >
> > This patch moves the rs6000_move_128bit function from rs6000.c to a new file,
> > rs6000-output.c.
> >
> > The third patch will create a rs6000_move_64bit function and change both 32-bit
> > and 64-bit movdi to call it, instead of having all of the instructions be
> > literals. I will also likely add improvements to setting the reg_addr address
> > masks for offsetable addresses.
> >
> > The fourth patch will like move movdd and movdf to call rs6000_move_64bit as
> > well.
> >
> > I tested this on a little endian power8 system and there were no regressions.
> >
> > 2018-03-14 Michael Meissner <meissner@linux.vnet.ibm.com>
> >
> > * config.gcc (powerpc*-*-*): Add rs6000-output.o to extra_objs.
> > * config/rs6000/t-rs6000 (rs6000-output.o): Add build rule.
> > * config/rs6000/rs6000.c (rs6000_output_move_128bit): Move to
> > rs6000-output.c.
>
> I am not happy at all with this new file, and it won't even work as far
> as I see (for multi-alternative define_insn's; splitting the strings to
> a different file than the constraints and attributes is asking for
> trouble, better keep it all together).
>
> Files should bundle together code that conceptually belongs together,
> not some arbitrary split ("these routine return strings that are
> eventually output from the compiler as instructions").
I was eventually planning to move the other functions that split insns and
output the strings there. But I can keep it in rs6000.c if desired. I was
just trying to keep the mechanical changes down, rather than move everything
all at once.
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
* Re: [RFC Patch], PowerPC memory support pre-gcc9, patch #3
2018-03-14 23:01 [RFC Patch], PowerPC memory support pre-gcc9, patch #1 Michael Meissner
2018-03-15 17:09 ` [RFC Patch], PowerPC memory support pre-gcc9, patch #2 Michael Meissner
@ 2018-03-15 23:33 ` Michael Meissner
2018-03-16 17:27 ` [RFC Patch], PowerPC memory support pre-gcc9, patch #4 Michael Meissner
` (2 subsequent siblings)
4 siblings, 0 replies; 11+ messages in thread
From: Michael Meissner @ 2018-03-15 23:33 UTC
To: Michael Meissner, GCC Patches, Segher Boessenkool,
David Edelsohn, Bill Schmidt
[-- Attachment #1: Type: text/plain, Size: 2127 bytes --]
This patch moves the instructions for movdi (both 32-bit and 64-bit) into a
separate rs6000_output_move_64bit function.
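As a rough illustration of the idea (not the actual GCC code), the sketch below picks an output template from the kinds of the two operands instead of relying on a positional list of literals. The enum, the function name, and the direct_move flag are hypothetical stand-ins for the rtx operands and target macros used in the real function. The template strings are the ones from the movdi patterns, and a 64-bit target is assumed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand kinds; GCC itself works on rtx operands and hard
   register numbers, not on an enum like this.  */
enum opnd_kind { OPND_GPR, OPND_FPR, OPND_VMX, OPND_MEM };

/* Return the output template for a 64-bit move on a 64-bit target, or NULL
   if this sketch does not cover the combination.  DIRECT_MOVE stands in for
   the target having GPR <-> VSX direct move instructions.  */
const char *
sketch_move_64bit (enum opnd_kind dest, enum opnd_kind src, bool direct_move)
{
  bool dest_vsx = (dest == OPND_FPR || dest == OPND_VMX);
  bool src_vsx = (src == OPND_FPR || src == OPND_VMX);

  /* Register to register moves.  */
  if (dest == OPND_GPR && src == OPND_GPR)
    return "mr %0,%1";
  if (dest == OPND_GPR && src_vsx && direct_move)
    return "mfvsrd %0,%x1";
  if (dest == OPND_FPR && src == OPND_FPR)
    return "fmr %0,%1";
  if (dest_vsx && src == OPND_GPR && direct_move)
    return "mtvsrd %x0,%1";

  /* Loads.  */
  if (dest == OPND_GPR && src == OPND_MEM)
    return "ld%U1%X1 %0,%1";
  if (dest == OPND_FPR && src == OPND_MEM)
    return "lfd%U1%X1 %0,%1";

  /* Stores.  */
  if (dest == OPND_MEM && src == OPND_GPR)
    return "std%U0%X0 %1,%0";
  if (dest == OPND_MEM && src == OPND_FPR)
    return "stfd%U0%X0 %1,%0";

  return NULL;
}
```

For example, sketch_move_64bit (OPND_GPR, OPND_MEM, false) selects the ld template, while a memory-to-memory request returns NULL. The real function ends in fatal_insn for unhandled combinations.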
As I start to check the addr_masks instead of writing a lot of
"if mode == MODE1 || mode == MODE2" tests, I realized that the multi-register
types (complex values, long double using IBM double double, etc.) did not have
the offset bits set correctly in reg_addr. I also prevented the Altivec
load/stores (the ones that give you the free AND with -16) from being generated
for multi-register values.
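The following is a simplified sketch of that mask computation, not the GCC implementation. The SK_* names and values are made up (modeled on the RELOAD_REG_* bits), a 64-bit target is assumed, and the SFmode special case is left out. The offset checker assumes the usual signed 16-bit displacement range, with DS-form offsets being multiples of 4 and DQ-form offsets being multiples of 16.

```c
#include <stdbool.h>

/* Hypothetical register classes, standing in for RELOAD_REG_GPR etc.  */
enum sk_rclass { SK_GPR, SK_FPR, SK_VMX };

/* Mask bits modeled on the RELOAD_REG_* names; the values are made up.  */
#define SK_VALID        0x01u
#define SK_MULTIPLE     0x02u   /* mode needs more than one register */
#define SK_OFFSET       0x04u   /* reg+offset addressing allowed */
#define SK_DS_OFFSET    0x08u   /* offset must be a multiple of 4 */
#define SK_QUAD_OFFSET  0x10u   /* offset must be a multiple of 16 */
#define SK_AND_M16      0x20u   /* Altivec (reg & -16) addressing */

/* Compute the address mask for a mode of MSIZE bytes in register class RC.
   MULTI_REG says whether the mode spans several registers in that class.  */
unsigned
sk_addr_mask (enum sk_rclass rc, unsigned msize, bool multi_reg, bool power9)
{
  unsigned mask = SK_VALID;
  if (multi_reg)
    mask |= SK_MULTIPLE;

  if (rc == SK_GPR)
    {
      /* LD/STD use DS-form addresses.  */
      mask |= SK_OFFSET;
      if (msize >= 8)
        mask |= SK_DS_OFFSET;
    }
  else if (msize >= 8)
    {
      if (rc == SK_FPR)
        {
          mask |= SK_OFFSET;
          if (msize == 16 && !multi_reg && power9)
            mask |= SK_QUAD_OFFSET;
        }
      else if (rc == SK_VMX && power9)
        {
          mask |= SK_OFFSET;
          if (msize == 16 && !multi_reg)
            mask |= SK_QUAD_OFFSET;
          else
            mask |= SK_DS_OFFSET;
        }
    }

  /* No Altivec AND -16 addressing for multi-register values.  */
  if (rc == SK_VMX && msize == 16
      && (mask & (SK_VALID | SK_MULTIPLE)) == SK_VALID)
    mask |= SK_AND_M16;

  return mask;
}

/* Return true if OFFSET is usable with an address that has mask MASK.  */
bool
sk_offset_ok_p (unsigned mask, long offset)
{
  if ((mask & SK_OFFSET) == 0)
    return false;
  if (offset < -32768 || offset > 32767)
    return false;
  if ((mask & SK_QUAD_OFFSET) != 0 && offset % 16 != 0)
    return false;
  if ((mask & SK_DS_OFFSET) != 0 && offset % 4 != 0)
    return false;
  return true;
}
```

With a table of such masks indexed by mode and register class, a caller can ask "is this offset legal here?" with a couple of bit tests rather than enumerating modes.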
I added a function (rs6000_valid_move_p) that replaces the old "is operands[0]
a register or is operands[1] a register" tests. Right now, it performs the same
tests, but I may need to add additional conditions in the future.
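A toy model of that check, for illustration only: the sk_operand type and its codes are hypothetical stand-ins for rtx and REG_P/SUBREG_P/MEM_P, and the logic follows the rule stated above (look through a subreg wrapper, then accept a register destination, or a memory destination with a register source).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand codes standing in for REG, SUBREG, MEM and the
   various constant rtx codes.  */
enum sk_code { SK_REG, SK_SUBREG, SK_MEM, SK_CONST };

struct sk_operand
{
  enum sk_code code;
  const struct sk_operand *inner;  /* wrapped operand for SK_SUBREG, else NULL */
};

/* Look through one subreg wrapper, if present.  */
static const struct sk_operand *
sk_strip_subreg (const struct sk_operand *op)
{
  return (op->code == SK_SUBREG && op->inner != NULL) ? op->inner : op;
}

/* Return true if moving SRC to DEST involves at least one register in a
   position the move patterns can handle.  */
bool
sk_valid_move_p (const struct sk_operand *dest, const struct sk_operand *src)
{
  dest = sk_strip_subreg (dest);
  src = sk_strip_subreg (src);

  /* Anything may be moved into a register.  */
  if (dest->code == SK_REG)
    return true;

  /* Only a register may be stored to memory.  */
  if (dest->code == SK_MEM && src->code == SK_REG)
    return true;

  return false;
}
```

Memory-to-memory and constant-to-memory moves are rejected, which is what the insn condition needs so that such moves are forced through a register first.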
I have done a full bootstrap and make check on a little endian power8 system
with no regressions.
The next patch will change the MOVDF and MOVDD patterns to use
rs6000_output_move_64bit as well.
2018-03-15 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000-protos.h (rs6000_output_move_64bit): Add
declaration.
(rs6000_valid_move_p): Likewise.
* config/rs6000/rs6000-output.c (addr_is_xform_p): New helper
function to return whether an address uses X-form (reg+reg).
(reg_is_spr_p): New helper function to determine if a register is
a SPR.
(rs6000_output_move_64bit): New function to return the proper
instruction to do a 64-bit move.
* config/rs6000/rs6000.c (rs6000_setup_reg_addr_masks): Rework
setting offset addresses to assume multi-register values have the
proper offset bits set. Do not enable Altivec & -16 on
multi-register moves.
(rs6000_valid_move_p): New function to validate moves.
(reg_offset_addressing_ok_p): Add check if the mode and register
class support offsettable instructions.
* config/rs6000/rs6000.md (movdi_internal32): Move instruction
literals to rs6000_output_move_64bit. Check move validity with
rs6000_valid_move_p.
(movdi_internal64): Likewise.
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
[-- Attachment #2: ext-addr.patch03b --]
[-- Type: text/plain, Size: 13048 bytes --]
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h (revision 258535)
+++ gcc/config/rs6000/rs6000-protos.h (working copy)
@@ -52,6 +52,7 @@ extern rtx rs6000_got_register (rtx);
extern rtx find_addr_reg (rtx);
extern rtx gen_easy_altivec_constant (rtx);
extern const char *output_vec_const_move (rtx *);
+extern const char *rs6000_output_move_64bit (rtx *);
extern const char *rs6000_output_move_128bit (rtx *);
extern bool rs6000_move_128bit_ok_p (rtx []);
extern bool rs6000_split_128bit_ok_p (rtx []);
@@ -89,6 +90,7 @@ extern bool rs6000_is_valid_2insn_and (r
extern void rs6000_emit_2insn_and (machine_mode, rtx *, bool, int);
extern int registers_ok_for_quad_peep (rtx, rtx);
extern int mems_ok_for_quad_peep (rtx, rtx);
+extern bool rs6000_valid_move_p (rtx, rtx);
extern bool gpr_or_gpr_p (rtx, rtx);
extern bool direct_move_p (rtx, rtx);
extern bool quad_address_p (rtx, machine_mode, bool);
Index: gcc/config/rs6000/rs6000-output.c
===================================================================
--- gcc/config/rs6000/rs6000-output.c (revision 258538)
+++ gcc/config/rs6000/rs6000-output.c (working copy)
@@ -47,6 +47,215 @@
#include "tm-constrs.h"
\f
+/* Return whether an address is an x-form (reg or reg+reg) address. This is
+ used when we know the instruction is not a traditional GPR or FPR
+ load/store, so check to make sure auto increment is not present in the
+ address. */
+inline static bool
+addr_is_xform_p (rtx addr)
+{
+ gcc_assert (GET_RTX_CLASS (GET_CODE (addr)) != RTX_AUTOINC);
+
+ if (REG_P (addr) || SUBREG_P (addr))
+ return true;
+
+ if (GET_CODE (addr) != PLUS)
+ return false;
+
+ rtx op1 = XEXP (addr, 1);
+ return REG_P (op1) || SUBREG_P (op1);
+}
+
+/* Return whether a register is a SPR. */
+inline static bool
+reg_is_spr_p (rtx reg)
+{
+ if (!REG_P (reg))
+ return false;
+
+ enum reg_class rclass = REGNO_REG_CLASS (REGNO (reg));
+ return reg_class_to_reg_type[(int)rclass] == SPR_REG_TYPE;
+}
+
+\f
+/* Return a string to do a move operation of 64 bits of data. */
+
+const char *
+rs6000_output_move_64bit (rtx operands[])
+{
+ rtx dest = operands[0];
+ rtx src = operands[1];
+ machine_mode mode = GET_MODE (dest);
+ int dest_regno;
+ int src_regno;
+ bool dest_gpr_p, dest_fp_p, dest_vmx_p, dest_vsx_p;
+ bool src_gpr_p, src_fp_p, src_vmx_p, src_vsx_p;
+
+ if (REG_P (dest) || SUBREG_P (dest))
+ {
+ dest_regno = regno_or_subregno (dest);
+ dest_gpr_p = INT_REGNO_P (dest_regno);
+ dest_fp_p = FP_REGNO_P (dest_regno);
+ dest_vmx_p = ALTIVEC_REGNO_P (dest_regno);
+ dest_vsx_p = dest_fp_p | dest_vmx_p;
+ }
+ else
+ {
+ dest_regno = -1;
+ dest_gpr_p = dest_fp_p = dest_vmx_p = dest_vsx_p = false;
+ }
+
+ if (REG_P (src) || SUBREG_P (src))
+ {
+ src_regno = regno_or_subregno (src);
+ src_gpr_p = INT_REGNO_P (src_regno);
+ src_fp_p = FP_REGNO_P (src_regno);
+ src_vmx_p = ALTIVEC_REGNO_P (src_regno);
+ src_vsx_p = src_fp_p | src_vmx_p;
+ }
+ else
+ {
+ src_regno = -1;
+ src_gpr_p = src_fp_p = src_vmx_p = src_vsx_p = false;
+ }
+
+ /* Register moves. */
+ if (dest_regno >= 0 && src_regno >= 0)
+ {
+ /* Moves to GPRs. */
+ if (dest_gpr_p)
+ {
+ if (!TARGET_POWERPC64)
+ return "#";
+
+ else if (src_gpr_p)
+ return "mr %0,%1";
+
+ else if (TARGET_DIRECT_MOVE && src_vsx_p)
+ return "mfvsrd %0,%x1";
+
+ else if (TARGET_MFPGPR && src_fp_p)
+ return "mftgpr %0,%1";
+
+ else if (reg_is_spr_p (src))
+ return "mf%1 %0";
+ }
+
+ /* Moves to vector/floating point registers. */
+ else if (dest_vsx_p)
+ {
+ if (dest_fp_p && src_fp_p)
+ return "fmr %0,%1";
+
+ else if (TARGET_VSX && src_vsx_p)
+ return "xxlor %x0,%x1,%x1";
+
+ else if (TARGET_POWERPC64 && src_gpr_p)
+ {
+ if (TARGET_DIRECT_MOVE)
+ return "mtvsrd %x0,%1";
+
+ else if (TARGET_MFPGPR && dest_fp_p)
+ return "mffgpr %0,%1";
+ }
+ }
+
+ /* Moves to SPRs. */
+ else if (reg_is_spr_p (dest))
+ return "mt%0 %1";
+ }
+
+ /* Loads. */
+ else if (dest_regno >= 0 && MEM_P (src))
+ {
+ if (dest_gpr_p)
+ return TARGET_POWERPC64 ? "ld%U1%X1 %0,%1" : "#";
+
+ else if (dest_fp_p)
+ return "lfd%U1%X1 %0,%1";
+
+ else if (dest_vmx_p)
+ {
+ if (TARGET_VSX && addr_is_xform_p (XEXP (src, 0)))
+ return "lxsdx %x0,%y1";
+
+ else if (TARGET_P9_VECTOR)
+ return "lxsd %0,%1";
+ }
+ }
+
+ /* Stores. */
+ else if (src_regno >= 0 && MEM_P (dest))
+ {
+ if (src_gpr_p)
+ return TARGET_POWERPC64 ? "std%U0%X0 %1,%0" : "#";
+
+ else if (src_fp_p)
+ return "stfd%U0%X0 %1,%0";
+
+ else if (src_vmx_p)
+ {
+ if (TARGET_VSX && addr_is_xform_p (XEXP (dest, 0)))
+ return "stxsdx %x1,%y0";
+
+ else if (TARGET_P9_VECTOR)
+ return "stxsd %1,%0";
+ }
+ }
+
+ /* Constants. */
+ else if (dest_regno >= 0 && CONSTANT_P (src))
+ {
+ if (dest_gpr_p)
+ {
+ if (satisfies_constraint_I (src))
+ return "li %0,%1";
+
+ if (satisfies_constraint_L (src))
+ return "lis %0,%v1";
+
+ return "#";
+ }
+
+ else if (TARGET_VSX && dest_vsx_p)
+ {
+ /* We prefer to generate XXSPLTIB/VSPLTISW over XXLXOR/XXLORC to
+ generate 0/-1, because the latter can potentially cause a stall if
+ the previous use of the register did a long operation followed by
+ a store. This would cause this insn to wait for the previous
+ operation to finish, even though it doesn't use any of the bits in
+ the previous value. */
+ if (src == CONST0_RTX (mode))
+ {
+ /* Note 0.0 is not all zeros in IBM decimal format. */
+ gcc_assert (mode != DDmode);
+
+ if (TARGET_P9_VECTOR)
+ return "xxspltib %x0,0";
+ else if (dest_vmx_p)
+ return "vspltisw %0,0";
+ else
+ return "xxlxor %x0,%x0,%x0";
+ }
+ else if (GET_MODE_CLASS (mode) == MODE_INT
+ && src == CONSTM1_RTX (mode))
+ {
+ if (TARGET_P9_VECTOR)
+ return "xxspltib %x0,255";
+ else if (dest_vmx_p)
+ return "vspltisw %0,-1";
+ else if (TARGET_P8_VECTOR)
+ return "xxlorc %x0,%x0,%x0";
+ /* XXX: We could generate xxlxor/xxlnor for power7 if
+ desired. */
+ }
+ }
+ }
+
+ fatal_insn ("Bad 64-bit move", gen_rtx_SET (dest, src));
+}
+
+\f
/* Return a string to do a move operation of 128 bits of data. */
const char *
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 258538)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -2957,49 +2957,52 @@ rs6000_setup_reg_addr_masks (void)
/* GPR and FPR registers can do REG+OFFSET addressing, except
possibly for SDmode. ISA 3.0 (i.e. power9) adds D-form addressing
- for 64-bit scalars and 32-bit SFmode to altivec registers. */
- if ((addr_mask != 0) && !indexed_only_p
- && msize <= 8
- && (rc == RELOAD_REG_GPR
- || ((msize == 8 || m2 == SFmode)
- && (rc == RELOAD_REG_FPR
- || (rc == RELOAD_REG_VMX && TARGET_P9_VECTOR)))))
- addr_mask |= RELOAD_REG_OFFSET;
-
- /* VSX registers can do REG+OFFSET addresssing if ISA 3.0
- instructions are enabled. The offset for 128-bit VSX registers is
- only 12-bits. While GPRs can handle the full offset range, VSX
- registers can only handle the restricted range. */
- else if ((addr_mask != 0) && !indexed_only_p
- && msize == 16 && TARGET_P9_VECTOR
- && (ALTIVEC_OR_VSX_VECTOR_MODE (m2)
- || (m2 == TImode && TARGET_VSX)))
- {
- addr_mask |= RELOAD_REG_OFFSET;
- if (rc == RELOAD_REG_FPR || rc == RELOAD_REG_VMX)
- addr_mask |= RELOAD_REG_QUAD_OFFSET;
- }
-
- /* LD and STD are DS-form instructions, which must have the bottom 2
- bits be 0. However, since DFmode is primarily used in the
- floating point/vector registers, don't restrict the offsets in ISA
- 2.xx. */
- if (rc == RELOAD_REG_GPR && msize == 8 && TARGET_POWERPC64
- && (addr_mask & RELOAD_REG_OFFSET) != 0
- && INTEGRAL_MODE_P (m2))
- addr_mask |= RELOAD_REG_DS_OFFSET;
-
- /* ISA 3.0 LXSD, LXSSP, STXSD, STXSSP altivec load/store instructions
- are DS-FORM. */
- else if (rc == RELOAD_REG_VMX && TARGET_P9_VECTOR
- && (addr_mask & RELOAD_REG_OFFSET) != 0
- && (msize == 8 || m2 == SFmode))
- addr_mask |= RELOAD_REG_DS_OFFSET;
+ for 64-bit scalars and 32-bit SFmode to altivec registers.
+
+ 64-bit GPR offset memory references and Altivec offset memory
+ references use DS-mode offsets where the bottom 2 bits are 0.
+
+ 128-bit vector offset memory references use DQ-mode offsets where
+ the bottom 4 bits are 0. */
+ if ((addr_mask != 0) && !indexed_only_p)
+ {
+ if (rc == RELOAD_REG_GPR)
+ {
+ /* LD/STD on 64-bit use DS-form addresses. */
+ addr_mask |= RELOAD_REG_OFFSET;
+ if (msize >= 8 && TARGET_POWERPC64)
+ addr_mask |= RELOAD_REG_DS_OFFSET;
+ }
+ else if (msize >= 8 || m == E_SFmode)
+ {
+ if (rc == RELOAD_REG_FPR)
+ {
+ /* LXV/STXV use DQ-form addresses. */
+ addr_mask |= RELOAD_REG_OFFSET;
+ if (msize == 16
+ && (addr_mask & RELOAD_REG_MULTIPLE) == 0
+ && TARGET_P9_VECTOR)
+ addr_mask |= RELOAD_REG_QUAD_OFFSET;
+ }
+ else if (rc == RELOAD_REG_VMX && TARGET_P9_VECTOR)
+ {
+ /* LXV/STXV use DQ-form addresses, LXSD/LXSSP/STXSD/STXSSP
+ use DS-form addresses. */
+ addr_mask |= RELOAD_REG_OFFSET;
+ if (msize == 16
+ && (addr_mask & RELOAD_REG_MULTIPLE) == 0)
+ addr_mask |= RELOAD_REG_QUAD_OFFSET;
+ else
+ addr_mask |= RELOAD_REG_DS_OFFSET;
+ }
+ }
+ }
/* VMX registers can do (REG & -16) and ((REG+REG) & -16)
addressing on 128-bit types. */
if (rc == RELOAD_REG_VMX && msize == 16
- && (addr_mask & RELOAD_REG_VALID) != 0)
+ && ((addr_mask & (RELOAD_REG_VALID
+ | RELOAD_REG_MULTIPLE)) == RELOAD_REG_VALID))
addr_mask |= RELOAD_REG_AND_M16;
reg_addr[m].addr_mask[rc] = addr_mask;
@@ -8007,6 +8010,26 @@ small_data_operand (rtx op ATTRIBUTE_UNU
#endif
}
+/* Return true if a move is valid. */
+
+bool
+rs6000_valid_move_p (rtx dest, rtx src)
+{
+ if (SUBREG_P (dest))
+ dest = SUBREG_REG (dest);
+
+ if (SUBREG_P (src))
+ src = SUBREG_REG (src);
+
+ if (REG_P (dest))
+ return true;
+
+ if (MEM_P (dest) && REG_P (src))
+ return true;
+
+ return false;
+}
+
/* Return true if either operand is a general purpose register. */
bool
@@ -8239,6 +8262,9 @@ mem_operand_ds_form (rtx op, machine_mod
static bool
reg_offset_addressing_ok_p (machine_mode mode)
{
+ if (!mode_supports_d_form (mode))
+ return false;
+
switch (mode)
{
case E_V16QImode:
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 258531)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -8485,29 +8485,8 @@ (define_insn "*movdi_internal32"
Oj, wM, OjwM, Oj, wM, wS,
wB"))]
- "! TARGET_POWERPC64
- && (gpc_reg_operand (operands[0], DImode)
- || gpc_reg_operand (operands[1], DImode))"
- "@
- #
- #
- #
- stfd%U0%X0 %1,%0
- lfd%U1%X1 %0,%1
- fmr %0,%1
- #
- stxsd %1,%0
- stxsdx %x1,%y0
- lxsd %0,%1
- lxsdx %x0,%y1
- xxlor %x0,%x1,%x1
- xxspltib %x0,0
- xxspltib %x0,255
- vspltisw %0,%1
- xxlxor %x0,%x0,%x0
- xxlorc %x0,%x0,%x0
- #
- #"
+ "! TARGET_POWERPC64 && rs6000_valid_move_p (operands[0], operands[1])"
+ "* return rs6000_output_move_64bit (operands);"
[(set_attr "type"
"store, load, *, fpstore, fpload, fpsimple,
*, fpstore, fpstore, fpload, fpload, veclogical,
@@ -8562,38 +8541,8 @@ (define_insn "*movdi_internal64"
wM, wS, wB, *h, r, 0,
wg, r, wj, r"))]
- "TARGET_POWERPC64
- && (gpc_reg_operand (operands[0], DImode)
- || gpc_reg_operand (operands[1], DImode))"
- "@
- std%U0%X0 %1,%0
- ld%U1%X1 %0,%1
- mr %0,%1
- li %0,%1
- lis %0,%v1
- #
- stfd%U0%X0 %1,%0
- lfd%U1%X1 %0,%1
- fmr %0,%1
- stxsd %1,%0
- stxsdx %x1,%y0
- lxsd %0,%1
- lxsdx %x0,%y1
- xxlor %x0,%x1,%x1
- xxspltib %x0,0
- xxspltib %x0,255
- #
- xxlxor %x0,%x0,%x0
- xxlorc %x0,%x0,%x0
- #
- #
- mf%1 %0
- mt%0 %1
- nop
- mftgpr %0,%1
- mffgpr %0,%1
- mfvsrd %0,%x1
- mtvsrd %x0,%1"
+ "TARGET_POWERPC64 && rs6000_valid_move_p (operands[0], operands[1])"
+ "* return rs6000_output_move_64bit (operands);"
[(set_attr "type"
"store, load, *, *, *, *,
fpstore, fpload, fpsimple, fpstore, fpstore, fpload,
* Re: [RFC Patch], PowerPC memory support pre-gcc9, patch #4
2018-03-14 23:01 [RFC Patch], PowerPC memory support pre-gcc9, patch #1 Michael Meissner
2018-03-15 17:09 ` [RFC Patch], PowerPC memory support pre-gcc9, patch #2 Michael Meissner
2018-03-15 23:33 ` [RFC Patch], PowerPC memory support pre-gcc9, patch #3 Michael Meissner
@ 2018-03-16 17:27 ` Michael Meissner
2018-03-20 16:21 ` Segher Boessenkool
2018-03-16 23:31 ` [RFC Patch], PowerPC memory support pre-gcc9, patch #1 Michael Meissner
2018-03-20 13:04 ` Segher Boessenkool
4 siblings, 1 reply; 11+ messages in thread
From: Michael Meissner @ 2018-03-16 17:27 UTC (permalink / raw)
To: Michael Meissner, GCC Patches, Segher Boessenkool,
David Edelsohn, Bill Schmidt
[-- Attachment #1: Type: text/plain, Size: 1388 bytes --]
Here is patch #4 that moves the MOVDF/MOVDD insns into calling C code. I added
documentation to the various MOVD{F,D} patterns similar to the documentation
I've done on the other patterns to make it simpler to track which two
constraints match which instruction and which instruction type is used.
The next patch may tackle an instruction discrepancy that I've noticed in
building Spec 2006. The tonto benchmark generates slightly different code with
these changes than without them. It doesn't affect the runtime of the
benchmark, but since these are infrastructure changes, the compiler should
generate the same code.
After that, I will tackle the 32-bit moves and then the 8/16-bit moves.
2018-03-16 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000-output.c (rs6000_output_move_64bit): Deal
with SPR<-SPR where the register is the same.
* config/rs6000/rs6000.md (mov<mode>_hardfloat32): Add comments
and spacing to allow easier understanding of which constraints are
used for which alternative. Use rs6000_valid_move_p to validate
the move. Use rs6000_output_move_64bit to print out the correct
instruction.
(mov<mode>_softfloat32): Likewise.
(mov<mode>_hardfloat64): Likewise.
(mov<mode>_softfloat64): Likewise.
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
[-- Attachment #2: ext-addr.patch04b --]
[-- Type: text/plain, Size: 7904 bytes --]
Index: gcc/config/rs6000/rs6000-output.c
===================================================================
--- gcc/config/rs6000/rs6000-output.c (revision 258576)
+++ gcc/config/rs6000/rs6000-output.c (working copy)
@@ -162,7 +162,13 @@ rs6000_output_move_64bit (rtx operands[]
/* Moves to SPRs. */
else if (reg_is_spr_p (dest))
- return "mt%0 %1";
+ {
+ if (src_gpr_p)
+ return "mt%0 %1";
+
+ else if (dest_regno == src_regno)
+ return "nop";
+ }
}
/* Loads. */
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 258576)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -7398,92 +7398,108 @@ (define_split
;; If we have FPR registers, rs6000_emit_move has moved all constants to memory,
;; except for 0.0 which can be created on VSX with an xor instruction.
+;; STFD LFD FMR LXSD STXSD
+;; LXSD STXSD XXLOR XXLXOR GPR<-0
+;; LWZ STW MR
+
(define_insn "*mov<mode>_hardfloat32"
- [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,<f64_p9>,wY,<f64_av>,Z,<f64_vsx>,<f64_vsx>,!r,Y,r,!r")
- (match_operand:FMOVE64 1 "input_operand" "d,m,d,wY,<f64_p9>,Z,<f64_av>,<f64_vsx>,<zero_fp>,<zero_fp>,r,Y,r"))]
+ [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
+ "=m, d, d, <f64_p9>, wY,
+ <f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
+ Y, r, !r")
+
+ (match_operand:FMOVE64 1 "input_operand"
+ "d, m, d, wY, <f64_p9>,
+ Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
+ r, Y, r"))]
+
"! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT
- && (gpc_reg_operand (operands[0], <MODE>mode)
- || gpc_reg_operand (operands[1], <MODE>mode))"
- "@
- stfd%U0%X0 %1,%0
- lfd%U1%X1 %0,%1
- fmr %0,%1
- lxsd %0,%1
- stxsd %1,%0
- lxsd%U1x %x0,%y1
- stxsd%U0x %x1,%y0
- xxlor %x0,%x1,%x1
- xxlxor %x0,%x0,%x0
- #
- #
- #
- #"
- [(set_attr "type" "fpstore,fpload,fpsimple,fpload,fpstore,fpload,fpstore,veclogical,veclogical,two,store,load,two")
+ && rs6000_valid_move_p (operands[0], operands[1])"
+ "* return rs6000_output_move_64bit (operands);"
+ [(set_attr "type"
+ "fpstore, fpload, fpsimple, fpload, fpstore,
+ fpload, fpstore, veclogical, veclogical, two,
+ store, load, two")
+
(set_attr "size" "64")
- (set_attr "length" "4,4,4,4,4,4,4,4,4,8,8,8,8")])
+ (set_attr "length"
+ "4, 4, 4, 4, 4,
+ 4, 4, 4, 4, 8,
+ 8, 8, 8")])
+
+;; STW LWZ MR G-const H-const F-const
(define_insn "*mov<mode>_softfloat32"
- [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=Y,r,r,r,r,r")
- (match_operand:FMOVE64 1 "input_operand" "r,Y,r,G,H,F"))]
- "! TARGET_POWERPC64
- && (TARGET_SINGLE_FLOAT || TARGET_SOFT_FLOAT)
- && (gpc_reg_operand (operands[0], <MODE>mode)
- || gpc_reg_operand (operands[1], <MODE>mode))"
- "#"
- [(set_attr "type" "store,load,two,*,*,*")
- (set_attr "length" "8,8,8,8,12,16")])
+ [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
+ "=Y, r, r, r, r, r")
+
+ (match_operand:FMOVE64 1 "input_operand"
+ "r, Y, r, G, H, F"))]
+
+ "! TARGET_POWERPC64 && (TARGET_SINGLE_FLOAT || TARGET_SOFT_FLOAT)
+ && rs6000_valid_move_p (operands[0], operands[1])"
+ "* return rs6000_output_move_64bit (operands);"
+ [(set_attr "type"
+ "store, load, two, *, *, *")
+
+ (set_attr "length"
+ "8, 8, 8, 8, 12, 16")])
; ld/std require word-aligned displacements -> 'Y' constraint.
; List Y->r and r->Y before r->r for reload.
+
+;; STFD LFD FMR LXSD STXSD
+;; LXSDX STXSDX XXLOR XXLXOR LI 0
+;; STD LD MR MT<SPR> MF<SPR>
+;; NOP MFTGPR MFFGPR MTVSRD MFVSRD
+
(define_insn "*mov<mode>_hardfloat64"
- [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,<f64_p9>,wY,<f64_av>,Z,<f64_vsx>,<f64_vsx>,!r,Y,r,!r,*c*l,!r,*h,r,wg,r,<f64_dm>")
- (match_operand:FMOVE64 1 "input_operand" "d,m,d,wY,<f64_p9>,Z,<f64_av>,<f64_vsx>,<zero_fp>,<zero_fp>,r,Y,r,r,h,0,wg,r,<f64_dm>,r"))]
+ [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
+ "=m, d, d, <f64_p9>, wY,
+ <f64_av>, Z, <f64_vsx>, <f64_vsx>, !r,
+ Y, r, !r, *c*l, !r,
+ *h, r, wg, r, <f64_dm>")
+
+ (match_operand:FMOVE64 1 "input_operand"
+ "d, m, d, wY, <f64_p9>,
+ Z, <f64_av>, <f64_vsx>, <zero_fp>, <zero_fp>,
+ r, Y, r, r, h,
+ 0, wg, r, <f64_dm>, r"))]
+
"TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT
- && (gpc_reg_operand (operands[0], <MODE>mode)
- || gpc_reg_operand (operands[1], <MODE>mode))"
- "@
- stfd%U0%X0 %1,%0
- lfd%U1%X1 %0,%1
- fmr %0,%1
- lxsd %0,%1
- stxsd %1,%0
- lxsd%U1x %x0,%y1
- stxsd%U0x %x1,%y0
- xxlor %x0,%x1,%x1
- xxlxor %x0,%x0,%x0
- li %0,0
- std%U0%X0 %1,%0
- ld%U1%X1 %0,%1
- mr %0,%1
- mt%0 %1
- mf%1 %0
- nop
- mftgpr %0,%1
- mffgpr %0,%1
- mfvsrd %0,%x1
- mtvsrd %x0,%1"
- [(set_attr "type" "fpstore,fpload,fpsimple,fpload,fpstore,fpload,fpstore,veclogical,veclogical,integer,store,load,*,mtjmpr,mfjmpr,*,mftgpr,mffgpr,mftgpr,mffgpr")
+ && rs6000_valid_move_p (operands[0], operands[1])"
+ "* return rs6000_output_move_64bit (operands);"
+ [(set_attr "type"
+ "fpstore, fpload, fpsimple, fpload, fpstore,
+ fpload, fpstore, veclogical, veclogical, integer,
+ store, load, *, mtjmpr, mfjmpr,
+ *, mftgpr, mffgpr, mftgpr, mffgpr")
+
(set_attr "size" "64")
(set_attr "length" "4")])
+;; STD LD MR MT<SPR> MF<SPR> G-const
+;; H-const F-const Special
+
(define_insn "*mov<mode>_softfloat64"
- [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=Y,r,r,cl,r,r,r,r,*h")
- (match_operand:FMOVE64 1 "input_operand" "r,Y,r,r,h,G,H,F,0"))]
+ [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
+ "=Y, r, r, cl, r, r,
+ r, r, *h")
+
+ (match_operand:FMOVE64 1 "input_operand"
+ "r, Y, r, r, h, G,
+ H, F, 0"))]
+
"TARGET_POWERPC64 && TARGET_SOFT_FLOAT
- && (gpc_reg_operand (operands[0], <MODE>mode)
- || gpc_reg_operand (operands[1], <MODE>mode))"
- "@
- std%U0%X0 %1,%0
- ld%U1%X1 %0,%1
- mr %0,%1
- mt%0 %1
- mf%1 %0
- #
- #
- #
- nop"
- [(set_attr "type" "store,load,*,mtjmpr,mfjmpr,*,*,*,*")
- (set_attr "length" "4,4,4,4,4,8,12,16,4")])
+ && rs6000_valid_move_p (operands[0], operands[1])"
+ "* return rs6000_output_move_64bit (operands);"
+ [(set_attr "type"
+ "store, load, *, mtjmpr, mfjmpr, *,
+ *, *, *")
+
+ (set_attr "length"
+ "4, 4, 4, 4, 4, 8,
+ 12, 16, 4")])
\f
(define_expand "mov<mode>"
[(set (match_operand:FMOVE128 0 "general_operand")
* Re: [RFC Patch], PowerPC memory support pre-gcc9, patch #4
2018-03-16 17:27 ` [RFC Patch], PowerPC memory support pre-gcc9, patch #4 Michael Meissner
@ 2018-03-20 16:21 ` Segher Boessenkool
0 siblings, 0 replies; 11+ messages in thread
From: Segher Boessenkool @ 2018-03-20 16:21 UTC (permalink / raw)
To: Michael Meissner, GCC Patches, David Edelsohn, Bill Schmidt
Hi Mike,
On Fri, Mar 16, 2018 at 12:50:45PM -0400, Michael Meissner wrote:
> --- gcc/config/rs6000/rs6000-output.c (revision 258576)
> +++ gcc/config/rs6000/rs6000-output.c (working copy)
> @@ -162,7 +162,13 @@ rs6000_output_move_64bit (rtx operands[]
>
> /* Moves to SPRs. */
> else if (reg_is_spr_p (dest))
> - return "mt%0 %1";
> + {
> + if (src_gpr_p)
> + return "mt%0 %1";
> +
> + else if (dest_regno == src_regno)
> + return "nop";
> + }
> }
Is this correct? Many SPRs are not simple registers, they do something
when you write to them. But I guess this is only for lr,ctr,vrsave (i.e.
regclass "h", "SPECIAL_REGS"). So maybe we want a better name?
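To make the question concrete, here is a minimal standalone sketch of the SPR branch in the hunk above. It is not GCC code: the reg_kind enum, the reg struct and output_move_to_spr are hypothetical stand-ins for the real rtx predicates, and the register numbers in the usage note are only illustrative. It assumes the intent is "GPR source gives mt, same SPR gives nop", and it returns NULL for any other source so the remaining case is visible to the caller rather than falling out of the if/else chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical register kinds; not GCC's real enum reg_class.  */
enum reg_kind { KIND_GPR, KIND_SPR, KIND_FPR };

/* Hypothetical stand-in for a hard register operand.  */
struct reg
{
  enum reg_kind kind;
  int regno;
};

/* Return the assembler template for a move into the SPR DEST, or NULL
   when no single instruction covers the move.  Returning NULL makes the
   uncovered case explicit instead of silently falling through.  */
const char *
output_move_to_spr (struct reg dest, struct reg src)
{
  assert (dest.kind == KIND_SPR);

  /* GPR -> SPR: a normal move-to-SPR instruction.  */
  if (src.kind == KIND_GPR)
    return "mt%0 %1";

  /* SPR -> same SPR: nothing to do.  */
  if (src.kind == KIND_SPR && src.regno == dest.regno)
    return "nop";

  /* Anything else (different SPR, FPR, ...) needs another strategy.  */
  return NULL;
}
```

With a hypothetical lr = { KIND_SPR, 65 } and ctr = { KIND_SPR, 66 }, output_move_to_spr (lr, ctr) returns NULL, which is the case the patch hunk leaves without an explicit return.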
> +;; STD LD MR MT<SPR> MF<SPR> G-const
> +;; H-const F-const Special
> +
> (define_insn "*mov<mode>_softfloat64"
> - [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=Y,r,r,cl,r,r,r,r,*h")
> - (match_operand:FMOVE64 1 "input_operand" "r,Y,r,r,h,G,H,F,0"))]
> + [(set (match_operand:FMOVE64 0 "nonimmediate_operand"
> + "=Y, r, r, cl, r, r,
> + r, r, *h")
> +
> + (match_operand:FMOVE64 1 "input_operand"
> + "r, Y, r, r, h, G,
> + H, F, 0"))]
> +
> "TARGET_POWERPC64 && TARGET_SOFT_FLOAT
> - && (gpc_reg_operand (operands[0], <MODE>mode)
> - || gpc_reg_operand (operands[1], <MODE>mode))"
> - "@
> - std%U0%X0 %1,%0
> - ld%U1%X1 %0,%1
> - mr %0,%1
> - mt%0 %1
> - mf%1 %0
> - #
> - #
> - #
> - nop"
> - [(set_attr "type" "store,load,*,mtjmpr,mfjmpr,*,*,*,*")
> - (set_attr "length" "4,4,4,4,4,8,12,16,4")])
> + && rs6000_valid_move_p (operands[0], operands[1])"
> + "* return rs6000_output_move_64bit (operands);"
> + [(set_attr "type"
> + "store, load, *, mtjmpr, mfjmpr, *,
> + *, *, *")
> +
> + (set_attr "length"
> + "4, 4, 4, 4, 4, 8,
> + 12, 16, 4")])
Let's take this one as example. The attributes depend on which alternative
is selected, but with your change the actual output insn does not. That is
no good.
Maybe you can reduce the number of alternatives? Make it just store,
load, and moves for example, and then select the attributes based on what
machine insns you actually output? The ones that are split are the
problematic ones in that case, the rest is easy to handle.
Segher
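One way to picture the concern above is a single table in which each alternative carries its type, its length and its template together, so they cannot drift apart. The sketch below is a standalone C model, not GCC code: struct move_alt, softfloat64_alts and lookup_move_alt are hypothetical names, and the rows are copied from the old *mov<mode>_softfloat64 pattern quoted above. In GCC itself the selected alternative is visible to C output code as which_alternative; here it is just an int parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical row: everything that must agree for one alternative.  */
struct move_alt
{
  const char *type;          /* "type" attribute value */
  int length;                /* "length" attribute, in bytes */
  const char *asm_template;  /* output template; "#" means split later */
};

/* Rows taken from the old *mov<mode>_softfloat64 alternatives, in order.  */
const struct move_alt softfloat64_alts[] = {
  { "store",  4,  "std%U0%X0 %1,%0" },
  { "load",   4,  "ld%U1%X1 %0,%1" },
  { "*",      4,  "mr %0,%1" },
  { "mtjmpr", 4,  "mt%0 %1" },
  { "mfjmpr", 4,  "mf%1 %0" },
  { "*",      8,  "#" },
  { "*",      12, "#" },
  { "*",      16, "#" },
  { "*",      4,  "nop" },
};

/* Return the row for alternative ALT, or NULL if ALT is out of range.
   Template and attributes come from the same row, so they stay in step.  */
const struct move_alt *
lookup_move_alt (int alt)
{
  size_t n = sizeof softfloat64_alts / sizeof softfloat64_alts[0];

  if (alt < 0 || (size_t) alt >= n)
    return NULL;
  return &softfloat64_alts[alt];
}
```

If the output function instead picks the instruction by inspecting the operands, nothing ties its choice to the row whose attributes the scheduler and length computation used, which is the mismatch described above.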
* Re: [RFC Patch], PowerPC memory support pre-gcc9, patch #1
2018-03-14 23:01 [RFC Patch], PowerPC memory support pre-gcc9, patch #1 Michael Meissner
` (2 preceding siblings ...)
2018-03-16 17:27 ` [RFC Patch], PowerPC memory support pre-gcc9, patch #4 Michael Meissner
@ 2018-03-16 23:31 ` Michael Meissner
2018-03-20 16:31 ` Segher Boessenkool
2018-03-20 13:04 ` Segher Boessenkool
4 siblings, 1 reply; 11+ messages in thread
From: Michael Meissner @ 2018-03-16 23:31 UTC (permalink / raw)
To: Michael Meissner, GCC Patches, Segher Boessenkool,
David Edelsohn, Bill Schmidt
[-- Attachment #1: Type: text/plain, Size: 747 bytes --]
In patch #4, I mentioned that the spec 2006 benchmark 'tonto' generated
different code with the patches applied. I tracked it down: the call I
inserted in rs6000_debug_reg_print to update the conditional register usage
seems to have set the Altivec registers VS0..VS19 to call_used instead of
call_saved. Since I no longer need to set the conditional register usage with
-mdebug=reg, it is simpler to just delete it.
2018-03-16 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/rs6000.c (rs6000_debug_reg_print): Eliminate call
to rs6000_conditional_register_usage.
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
[-- Attachment #2: ext-addr.patch05b --]
[-- Type: text/plain, Size: 846 bytes --]
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 258576)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -1314,7 +1314,6 @@ static bool rs6000_debug_can_change_mode
reg_class_t);
static bool rs6000_save_toc_in_prologue_p (void);
static rtx rs6000_internal_arg_pointer (void);
-static void rs6000_conditional_register_usage (void);
rtx (*rs6000_legitimize_reload_address_ptr) (rtx, machine_mode, int, int,
int, int *)
@@ -2144,10 +2143,6 @@ rs6000_debug_reg_print (int first_regno,
{
int r, m;
- /* Insure the conditional registers are up to date when printing the debug
- information. */
- rs6000_conditional_register_usage ();
-
for (r = first_regno; r <= last_regno; ++r)
{
const char *comma = "";
* Re: [RFC Patch], PowerPC memory support pre-gcc9, patch #1
2018-03-16 23:31 ` [RFC Patch], PowerPC memory support pre-gcc9, patch #1 Michael Meissner
@ 2018-03-20 16:31 ` Segher Boessenkool
0 siblings, 0 replies; 11+ messages in thread
From: Segher Boessenkool @ 2018-03-20 16:31 UTC (permalink / raw)
To: Michael Meissner, GCC Patches, David Edelsohn, Bill Schmidt
On Fri, Mar 16, 2018 at 07:01:18PM -0400, Michael Meissner wrote:
> In patch #4, I mentioned that the spec 2006 benchmark 'tonto' generated
> different code with the patches applied. I tracked it down: the call I
> inserted in rs6000_debug_reg_print to update the conditional register usage
> seems to have set the Altivec registers VS0..VS19 to call_used instead of
> call_saved. Since I no longer need to set the conditional register usage with
> -mdebug=reg, it is simpler to just delete it.
>
> 2018-03-16 Michael Meissner <meissner@linux.vnet.ibm.com>
>
> * config/rs6000/rs6000.c (rs6000_debug_reg_print): Eliminate call
> to rs6000_conditional_register_usage.
Yes, debug output should *never* change *any* state.
Could you fold this patch into the patch that created the problem please?
Segher
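The rule above can be enforced mechanically in a small model: pass the state to the debug printer through a pointer to const, so any attempt to update it fails to compile. This is a standalone sketch, not the rs6000 code; struct reg_state, NUM_REGS, debug_reg_print_model and debug_print_preserves_state are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NUM_REGS 4

/* Hypothetical register state: true means call_used, false call_saved.  */
struct reg_state
{
  bool call_used[NUM_REGS];
};

/* Print registers FIRST..LAST to OUT and return how many were printed.
   ST is a pointer to const, so this function cannot modify the state.  */
int
debug_reg_print_model (FILE *out, const struct reg_state *st,
                       int first, int last)
{
  int printed = 0;

  if (first < 0)
    first = 0;

  for (int r = first; r <= last && r < NUM_REGS; ++r)
    {
      fprintf (out, "reg %d: %s\n", r,
               st->call_used[r] ? "call_used" : "call_saved");
      ++printed;
    }
  return printed;
}

/* Return true if dumping ST leaves it byte-for-byte unchanged.  */
bool
debug_print_preserves_state (const struct reg_state *st)
{
  struct reg_state before = *st;

  debug_reg_print_model (stderr, st, 0, NUM_REGS - 1);
  return memcmp (&before, st, sizeof before) == 0;
}
```

A check like debug_print_preserves_state is the kind of test that would have caught a dump routine flipping registers from call_saved to call_used.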
* Re: [RFC Patch], PowerPC memory support pre-gcc9, patch #1
2018-03-14 23:01 [RFC Patch], PowerPC memory support pre-gcc9, patch #1 Michael Meissner
` (3 preceding siblings ...)
2018-03-16 23:31 ` [RFC Patch], PowerPC memory support pre-gcc9, patch #1 Michael Meissner
@ 2018-03-20 13:04 ` Segher Boessenkool
2018-03-20 20:35 ` Michael Meissner
4 siblings, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2018-03-20 13:04 UTC (permalink / raw)
To: Michael Meissner, GCC Patches, David Edelsohn, Bill Schmidt
Hi! Some comments...
On Wed, Mar 14, 2018 at 06:54:08PM -0400, Michael Meissner wrote:
> The first patch in the series moves most of the reg_addr structure from
> rs6000.c to rs6000-protos.h, so that in the next patch, we can start splitting
> some of the address code to other files.
Is that the correct header? It currently contains only function
prototypes, and the name indicates that is what it should be.
> 1) I was playing with making r12 be fixed with a new option (not in this
> set of patches), and I noticed it wasn't reflected in the -mdebug=reg
> debug dump, due to the debug dump being done before the conditional
> registers are set up. I made the debug dump set conditional registers.
Various ABIs use r12 for various things. It's also used for split stack.
Besides that it is available for programs to do with as they please.
> I likely will remove the undocumented toc-fusion altogether, and eventually
> rework the p8/p9 fusion support.
Did it ever give any performance improvement?
> 2018-03-14 Michael Meissner <meissner@linux.vnet.ibm.com>
> * config/rs6000/rs6000-protos.h (regno_or_subregno): Add
> declaration.
There is a generic reg_or_subregno, how does this differ? If we need
it please change the name so the difference is clear.
It is very hard to review these patches. Please do patches that only
move or rename things, not changing functionality, as separate patches
(usually before everything else).
Segher
* Re: [RFC Patch], PowerPC memory support pre-gcc9, patch #1
2018-03-20 13:04 ` Segher Boessenkool
@ 2018-03-20 20:35 ` Michael Meissner
0 siblings, 0 replies; 11+ messages in thread
From: Michael Meissner @ 2018-03-20 20:35 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Michael Meissner, GCC Patches, David Edelsohn, Bill Schmidt
On Tue, Mar 20, 2018 at 08:01:57AM -0500, Segher Boessenkool wrote:
> Hi! Some comments...
>
> On Wed, Mar 14, 2018 at 06:54:08PM -0400, Michael Meissner wrote:
> > The first patch in the series moves most of the reg_addr structure from
> > rs6000.c to rs6000-protos.h, so that in the next patch, we can start splitting
> > some of the address code to other files.
>
> Is that the correct header? It currently contains only function
> prototypes, and the name indicates that is what it should be.
>
> > 1) I was playing with making r12 be fixed with a new option (not in this
> > set of patches), and I noticed it wasn't reflected in the -mdebug=reg
> > debug dump, due to the debug dump being done before the conditional
> > registers are set up. I made the debug dump set conditional registers.
>
> Various ABIs use r12 for various things. It's also used for split stack.
> Besides that it is available for programs to do with as they please.
>
> > I likely will remove the undocumented toc-fusion altogether, and eventually
> > rework the p8/p9 fusion support.
>
> Did it ever give any performance improvement?
>
> > 2018-03-14 Michael Meissner <meissner@linux.vnet.ibm.com>
> > * config/rs6000/rs6000-protos.h (regno_or_subregno): Add
> > declaration.
>
> There is a generic reg_or_subregno, how does this differ? If we need
> it please change the name so the difference is clear.
>
> It is very hard to review these patches. Please do patches that only
> move or rename things, not changing functionality, as separate patches
> (usually before everything else).
Ok, but if you want me to shove everything back into rs6000.c, that simplifies
things. Some of the artifice is there to support the reg_addr stuff in
multiple locations.
--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797