public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Pr 68805, Fix PowerPC little endian -mvsx-timode
@ 2015-12-16 23:20 Michael Meissner
  2015-12-16 23:51 ` David Edelsohn
  0 siblings, 1 reply; 2+ messages in thread
From: Michael Meissner @ 2015-12-16 23:20 UTC (permalink / raw)
  To: gcc-patches, dje.gcc

[-- Attachment #1: Type: text/plain, Size: 2116 bytes --]

My first mail did not seem to be delivered, so I'm trying again.

This fixes a bug with the debug switch -mvsx-timode that we would eventually
like to enable by default on PowerPC little endian server systems.  The bug is
that the load with rotate or rotate with store instructions needed on power8
little endian systems used VEC_SELECT to swap the 64-bit words.  This patch
uses ROTATE for TImode, just like I did for KFmode.

Without this patch, 10 of the 30 spec 2006 benchmarks fail to compile on a
little endian PowerPC system with -mvsx-timode.  With the patch, all 30
benchmarks compile and do the spec verification.

In developing the patch, I noticed that the generic swap optimizations that are
done for vector types are not done for TImode, since we don't split the TImoves
until after register allocation when we discover a vector register was used
instead of a GPR register.  So, I added a peephole2 to catch the common case of
store followed by load, eliminating the pair of ROTATE insns.

I bootstrapped it on both a big endian power7 and a little endian power8 system
with no regressions.  Is it ok to install on the trunk?

At the current time, I don't see the need to back port it to GCC 5 (though the
backport is fairly simple), because it isn't on by default in GCC 5, and we
don't plan to eventually have -mvsx-timode and -mlra on by default in that
branch.

[gcc]
2015-12-15  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68805
	* config/rs6000/rs6000.c (rs6000_gen_le_vsx_permute): Use ROTATE
	instead of VEC_SELECT for TImode.

	* config/rs6000/vsx.md (VSX_LE): Move TImode from VSX_LE to
	VSX_LE_128, so that we use ROTATE to swap the 64-bit words instead
	of using VEC_SELECT.
	(VSX_LE_128): Likewise.
	(define_peephole2): Add peephole to eliminate double xxpermdi when
	copying TImode.

[gcc/testsuite]
2015-12-15  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/68805
	* gcc.target/powerpc/pr68805.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[-- Attachment #2: pr68805.patch01b --]
[-- Type: text/plain, Size: 3239 bytes --]

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 231624)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -8829,8 +8829,9 @@ rs6000_const_vec (machine_mode mode)
 rtx
 rs6000_gen_le_vsx_permute (rtx source, machine_mode mode)
 {
-  /* Use ROTATE instead of VEC_SELECT on IEEE 128-bit floating point.  */
-  if (FLOAT128_VECTOR_P (mode))
+  /* Use ROTATE instead of VEC_SELECT on IEEE 128-bit floating point, and
+     128-bit integers if they are allowed in VSX registers.  */
+  if (FLOAT128_VECTOR_P (mode) || mode == TImode)
     return gen_rtx_ROTATE (mode, source, GEN_INT (64));
   else
     {
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 231624)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -26,15 +26,13 @@ (define_mode_iterator VSX_D [V2DF V2DI])
 
 ;; Iterator for the 2 64-bit vector types + 128-bit types that are loaded with
 ;; lxvd2x to properly handle swapping words on little endian
-(define_mode_iterator VSX_LE [V2DF
-			      V2DI
-			      V1TI
-			      (TI	"VECTOR_MEM_VSX_P (TImode)")])
+(define_mode_iterator VSX_LE [V2DF V2DI V1TI])
 
 ;; Mode iterator to handle swapping words on little endian for the 128-bit
 ;; types that goes in a single vector register.
 (define_mode_iterator VSX_LE_128 [(KF   "FLOAT128_VECTOR_P (KFmode)")
-				  (TF   "FLOAT128_VECTOR_P (TFmode)")])
+				  (TF   "FLOAT128_VECTOR_P (TFmode)")
+				  (TI	"TARGET_VSX_TIMODE")])
 
 ;; Iterator for the 2 32-bit vector types
 (define_mode_iterator VSX_W [V4SF V4SI])
@@ -739,6 +737,21 @@ (define_split
                                        : operands[0];
 })
 
+;; Peephole to catch memory to memory transfers for TImode if TImode landed in
+;; VSX registers on a little endian system.  The vector types and IEEE 128-bit
+;; floating point are handled by the more generic swap elimination pass.
+(define_peephole2
+  [(set (match_operand:TI 0 "vsx_register_operand" "")
+	(rotate:TI (match_operand:TI 1 "vsx_register_operand" "")
+		   (const_int 64)))
+   (set (match_operand:TI 2 "vsx_register_operand" "")
+	(rotate:TI (match_dup 0)
+		   (const_int 64)))]
+  "!BYTES_BIG_ENDIAN && TARGET_VSX && TARGET_VSX_TIMODE
+   && (rtx_equal_p (operands[0], operands[2])
+       || peep2_reg_dead_p (2, operands[0]))"
+   [(set (match_dup 2) (match_dup 1))])
+
 ;; The post-reload split requires that we re-permute the source
 ;; register in case it is still live.
 (define_split
Index: gcc/testsuite/gcc.target/powerpc/pr68805.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr68805.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr68805.c	(revision 0)
@@ -0,0 +1,10 @@
+/* { dg-do compile { target powerpc64le-*-* } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-O2 -mvsx-timode -mcpu=power8" } */
+
+void foo (__int128 *p, __int128 *q) { *p = *q; }
+
+/* { dg-final { scan-assembler     "lxvd2x"   } } */
+/* { dg-final { scan-assembler     "stxvd2x"  } } */
+/* { dg-final { scan-assembler-not "xxpermdi" } } */
+

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH] Pr 68805, Fix PowerPC little endian -mvsx-timode
  2015-12-16 23:20 [PATCH] Pr 68805, Fix PowerPC little endian -mvsx-timode Michael Meissner
@ 2015-12-16 23:51 ` David Edelsohn
  0 siblings, 0 replies; 2+ messages in thread
From: David Edelsohn @ 2015-12-16 23:51 UTC (permalink / raw)
  To: Michael Meissner, GCC Patches

On Wed, Dec 16, 2015 at 6:20 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> My first mail did not seem to be delivered, so I'm trying again.
>
> This fixes a bug with the debug switch -mvsx-timode that we would eventually
> like to enable by default on PowerPC little endian server systems.  The bug is
> that the load with rotate or rotate with store instructions needed on power8
> little endian systems used VEC_SELECT to swap the 64-bit words.  This patch
> uses ROTATE for TImode, just like I did for KFmode.
>
> Without this patch, 10 of the 30 spec 2006 benchmarks fail to compile on a
> little endian PowerPC system with -mvsx-timode.  With the patch, all 30
> benchmarks compile and do the spec verification.
>
> In developing the patch, I noticed that the generic swap optimizations that are
> done for vector types are not done for TImode, since we don't split the TImoves
> until after register allocation when we discover a vector register was used
> instead of a GPR register.  So, I added a peephole2 to catch the common case of
> store followed by load, eliminating the pair of ROTATE insns.
>
> I bootstrapped it on both a big endian power7 and a little endian power8 system
> with no regressions.  Is it ok to install on the trunk?
>
> At the current time, I don't see the need to back port it to GCC 5 (though the
> backport is fairly simple), because it isn't on by default in GCC 5, and we
> don't plan to eventually have -mvsx-timode and -mlra on by default in that
> branch.
>
> [gcc]
> 2015-12-15  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         PR target/68805
>         * config/rs6000/rs6000.c (rs6000_gen_le_vsx_permute): Use ROTATE
>         instead of VEC_SELECT for TImode.
>
>         * config/rs6000/vsx.md (VSX_LE): Move TImode from VSX_LE to
>         VSX_LE_128, so that we use ROTATE to swap the 64-bit words instead
>         of using VEC_SELECT.
>         (VSX_LE_128): Likewise.
>         (define_peephole2): Add peephole to eliminate double xxpermdi when
>         copying TImode.
>
> [gcc/testsuite]
> 2015-12-15  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         PR target/68805
>         * gcc.target/powerpc/pr68805.c: New test.

Okay.

Thanks, David

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2015-12-16 23:51 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-16 23:20 [PATCH] Pr 68805, Fix PowerPC little endian -mvsx-timode Michael Meissner
2015-12-16 23:51 ` David Edelsohn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).