public inbox for gcc-patches@gcc.gnu.org
* Re: [PATCH] Power/GCC: Implement little-endian SPE operations
@ 2014-07-07 13:58 David Edelsohn
  2014-07-07 14:33 ` Maciej W. Rozycki
  0 siblings, 1 reply; 5+ messages in thread
From: David Edelsohn @ 2014-07-07 13:58 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: GCC Patches

2014-07-07  Maciej W. Rozycki  <macro@codesourcery.com>

gcc/
* config/rs6000/rs6000.c (output_vec_const_move): Handle
little-endian code generation.
* config/rs6000/spe.md (spe_evmergehi): Rename to...
(vec_perm00_v2si): ... this.  Handle little-endian code
generation.
(spe_evmergehilo): Rename to...
(vec_perm01_v2si): ... this.  Handle little-endian code
generation.
(spe_evmergelo): Rename to...
(vec_perm11_v2si): ... this.  Handle little-endian code
generation.
(spe_evmergelohi): Rename to...
(vec_perm10_v2si): ... this.  Handle little-endian code
generation.
(spe_evmergehi, spe_evmergehilo): New expanders.
(spe_evmergelo, spe_evmergelohi): Likewise.
(*frob_<SPE64:mode>_<DITI:mode>): Handle little-endian code
generation.
(*frob_tf_ti): Likewise.
(*frob_<mode>_di_2): Likewise.
(*frob_tf_di_8_2): Likewise.
(*frob_di_<mode>): Likewise.
(*frob_ti_tf): Likewise.
(*frob_<DITI:mode>_<SPE64:mode>_2): Likewise.
(*frob_ti_<mode>_8_2): Likewise.
(*frob_ti_tf_2): Likewise.
(mov_si<mode>_e500_subreg0): Rename to...
(mov_si<mode>_e500_subreg0_be): ... this.  Restrict to the big
endianness only.
(*mov_si<mode>_e500_subreg0_le): New instruction pattern.
(*mov_si<mode>_e500_subreg0_elf_low): Rename to...
(*mov_si<mode>_e500_subreg0_elf_low_be): ... this.  Restrict to
the big endianness only.
(*mov_si<mode>_e500_subreg0_elf_low_le): New instruction pattern.
(*mov_si<mode>_e500_subreg0_2): Rename to...
(*mov_si<mode>_e500_subreg0_2_be): ... this.  Restrict to the
big endianness only.
(*mov_si<mode>_e500_subreg0_2_le): New instruction pattern.
(*mov_si<mode>_e500_subreg4): Rename to...
(*mov_si<mode>_e500_subreg4_be): ... this.  Restrict to the big
endianness only.
(mov_si<mode>_e500_subreg4_le): New instruction pattern.
(*mov_si<mode>_e500_subreg4_elf_low): Rename to...
(*mov_si<mode>_e500_subreg4_elf_low_be): ... this.  Restrict to
the big endianness only.
(*mov_si<mode>_e500_subreg4_elf_low_le): New instruction/splitter
pattern.
(*mov_si<mode>_e500_subreg4_2): Rename to...
(*mov_si<mode>_e500_subreg4_2_be): ... this.  Restrict to the big
endianness only.
(*mov_si<mode>_e500_subreg4_2_le): New instruction pattern.
(*mov_sitf_e500_subreg8): Rename to...
(*mov_sitf_e500_subreg8_be): ... this.  Restrict to the big
endianness only.
(*mov_sitf_e500_subreg8_le): New instruction pattern.
(*mov_sitf_e500_subreg8_2): Rename to...
(*mov_sitf_e500_subreg8_2_be): ... this.  Restrict to the big
endianness only.
(*mov_sitf_e500_subreg8_2_le): New instruction pattern.
(*mov_sitf_e500_subreg12): Rename to...
(*mov_sitf_e500_subreg12_be): ... this.  Restrict to the big
endianness only.
(*mov_sitf_e500_subreg12_le): New instruction pattern.
(*mov_sitf_e500_subreg12_2): Rename to...
(*mov_sitf_e500_subreg12_2_be): ... this.  Restrict to the big
endianness only.
(*mov_sitf_e500_subreg12_2_le): New instruction pattern.

gcc/testsuite/
* gcc.target/powerpc/spe-evmerge.c: New file.

Okay.

Could you add a short comment explaining what the "0" and "1" labels
in vec_perm[01][01]_v2si mean?

Thanks, David


* Re: [PATCH] Power/GCC: Implement little-endian SPE operations
  2014-07-07 13:58 [PATCH] Power/GCC: Implement little-endian SPE operations David Edelsohn
@ 2014-07-07 14:33 ` Maciej W. Rozycki
  2014-07-07 15:05   ` David Edelsohn
  0 siblings, 1 reply; 5+ messages in thread
From: Maciej W. Rozycki @ 2014-07-07 14:33 UTC (permalink / raw)
  To: David Edelsohn; +Cc: GCC Patches

On Mon, 7 Jul 2014, David Edelsohn wrote:

> gcc/
> * config/rs6000/rs6000.c (output_vec_const_move): Handle
> little-endian code generation.
> * config/rs6000/spe.md (spe_evmergehi): Rename to...
> (vec_perm00_v2si): ... this.  Handle little-endian code
> generation.
> (spe_evmergehilo): Rename to...
> (vec_perm01_v2si): ... this.  Handle little-endian code
> generation.
> (spe_evmergelo): Rename to...
> (vec_perm11_v2si): ... this.  Handle little-endian code
> generation.
> (spe_evmergelohi): Rename to...
> (vec_perm10_v2si): ... this.  Handle little-endian code
> generation.
> (spe_evmergehi, spe_evmergehilo): New expanders.
> (spe_evmergelo, spe_evmergelohi): Likewise.
> (*frob_<SPE64:mode>_<DITI:mode>): Handle little-endian code
> generation.
> (*frob_tf_ti): Likewise.
> (*frob_<mode>_di_2): Likewise.
> (*frob_tf_di_8_2): Likewise.
> (*frob_di_<mode>): Likewise.
> (*frob_ti_tf): Likewise.
> (*frob_<DITI:mode>_<SPE64:mode>_2): Likewise.
> (*frob_ti_<mode>_8_2): Likewise.
> (*frob_ti_tf_2): Likewise.
> (mov_si<mode>_e500_subreg0): Rename to...
> (mov_si<mode>_e500_subreg0_be): ... this.  Restrict to the big
> endianness only.
> (*mov_si<mode>_e500_subreg0_le): New instruction pattern.
> (*mov_si<mode>_e500_subreg0_elf_low): Rename to...
> (*mov_si<mode>_e500_subreg0_elf_low_be): ... this.  Restrict to
> the big endianness only.
> (*mov_si<mode>_e500_subreg0_elf_low_le): New instruction pattern.
> (*mov_si<mode>_e500_subreg0_2): Rename to...
> (*mov_si<mode>_e500_subreg0_2_be): ... this.  Restrict to the
> big endianness only.
> (*mov_si<mode>_e500_subreg0_2_le): New instruction pattern.
> (*mov_si<mode>_e500_subreg4): Rename to...
> (*mov_si<mode>_e500_subreg4_be): ... this.  Restrict to the big
> endianness only.
> (mov_si<mode>_e500_subreg4_le): New instruction pattern.
> (*mov_si<mode>_e500_subreg4_elf_low): Rename to...
> (*mov_si<mode>_e500_subreg4_elf_low_be): ... this.  Restrict to
> the big endianness only.
> (*mov_si<mode>_e500_subreg4_elf_low_le): New instruction/splitter
> pattern.
> (*mov_si<mode>_e500_subreg4_2): Rename to...
> (*mov_si<mode>_e500_subreg4_2_be): ... this.  Restrict to the big
> endianness only.
> (*mov_si<mode>_e500_subreg4_2_le): New instruction pattern.
> (*mov_sitf_e500_subreg8): Rename to...
> (*mov_sitf_e500_subreg8_be): ... this.  Restrict to the big
> endianness only.
> (*mov_sitf_e500_subreg8_le): New instruction pattern.
> (*mov_sitf_e500_subreg8_2): Rename to...
> (*mov_sitf_e500_subreg8_2_be): ... this.  Restrict to the big
> endianness only.
> (*mov_sitf_e500_subreg8_2_le): New instruction pattern.
> (*mov_sitf_e500_subreg12): Rename to...
> (*mov_sitf_e500_subreg12_be): ... this.  Restrict to the big
> endianness only.
> (*mov_sitf_e500_subreg12_le): New instruction pattern.
> (*mov_sitf_e500_subreg12_2): Rename to...
> (*mov_sitf_e500_subreg12_2_be): ... this.  Restrict to the big
> endianness only.
> (*mov_sitf_e500_subreg12_2_le): New instruction pattern.
> 
> gcc/testsuite/
> * gcc.target/powerpc/spe-evmerge.c: New file.
> 
> Okay.
> 
> Could you add a short comment explaining what the "0" and "1" labels
> in vec_perm[01][01]_v2si mean?

 Like this?

  Maciej

gcc-ppc-spe-le-update.diff
Index: gcc-fsf-trunk-quilt/gcc/config/rs6000/spe.md
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/config/rs6000/spe.md	2014-07-07 15:27:54.258937029 +0100
+++ gcc-fsf-trunk-quilt/gcc/config/rs6000/spe.md	2014-07-07 15:26:43.748937008 +0100
@@ -438,6 +438,11 @@
   [(set_attr "type" "vecload")
    (set_attr  "length" "4")])
 
+;; Integer vector permutation instructions.  The pairs of digits in the
+;; names of these instructions indicate the indices, in the memory vector
+;; element ordering, of the vector elements permuted to the output vector
+;; from the first and the second input vector respectively.
+
 (define_insn "vec_perm00_v2si"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
 	(vec_select:V2SI
@@ -571,6 +576,8 @@
   DONE;
 })
 
+;; End of integer vector permutation instructions.
+
 (define_insn "spe_evnand"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
         (not:V2SI (and:V2SI (match_operand:V2SI 1 "gpc_reg_operand" "r")


* Re: [PATCH] Power/GCC: Implement little-endian SPE operations
  2014-07-07 14:33 ` Maciej W. Rozycki
@ 2014-07-07 15:05   ` David Edelsohn
  2014-07-07 15:49     ` Maciej W. Rozycki
  0 siblings, 1 reply; 5+ messages in thread
From: David Edelsohn @ 2014-07-07 15:05 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: GCC Patches

On Mon, Jul 7, 2014 at 10:33 AM, Maciej W. Rozycki
<macro@codesourcery.com> wrote:

>> Could you add a short comment explaining what the "0" and "1" labels
>> in vec_perm[01][01]_v2si mean?
>
>  Like this?
>
>   Maciej
>
> gcc-ppc-spe-le-update.diff
> Index: gcc-fsf-trunk-quilt/gcc/config/rs6000/spe.md
> ===================================================================
> --- gcc-fsf-trunk-quilt.orig/gcc/config/rs6000/spe.md   2014-07-07 15:27:54.258937029 +0100
> +++ gcc-fsf-trunk-quilt/gcc/config/rs6000/spe.md        2014-07-07 15:26:43.748937008 +0100
> @@ -438,6 +438,11 @@
>    [(set_attr "type" "vecload")
>     (set_attr  "length" "4")])
>
> +;; Integer vector permutation instructions.  The pairs of digits in the
> +;; names of these instructions indicate the indices, in the memory vector
> +;; element ordering, of the vector elements permuted to the output vector
> +;; from the first and the second input vector respectively.

Yes, that's good. It helps if someone reading the code doesn't need to
reverse engineer the numbering convention of the name.
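
As a small illustration of the convention described in that comment, here
is a minimal C sketch.  The `v2si` type and both function names are
hypothetical and exist only for this example; the mnemonic table simply
restates the big-endian instruction each renamed pattern emits in the patch.

```c
#include <stdint.h>

/* Hypothetical model of a V2SI value, indexed in memory element order.  */
typedef struct { uint32_t elt[2]; } v2si;

/* Model of vec_permXY_v2si: the result holds element X of the first
   input followed by element Y of the second input.  */
static v2si model_vec_perm (unsigned x, unsigned y, v2si a, v2si b)
{
  v2si r = { { a.elt[x], b.elt[y] } };
  return r;
}

/* Mnemonic emitted for vec_permXY_v2si on big-endian targets, as
   listed in the patch (00 = evmergehi, 01 = evmergehilo,
   10 = evmergelohi, 11 = evmergelo).  */
static const char *big_endian_mnemonic (unsigned x, unsigned y)
{
  static const char *const names[2][2] = {
    { "evmergehi",   "evmergehilo" },
    { "evmergelohi", "evmergelo"   },
  };
  return names[x][y];
}
```

So `model_vec_perm (0, 1, a, b)` models vec_perm01_v2si: element 0 of
the first vector and element 1 of the second.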

Thanks, David


* Re: [PATCH] Power/GCC: Implement little-endian SPE operations
  2014-07-07 15:05   ` David Edelsohn
@ 2014-07-07 15:49     ` Maciej W. Rozycki
  0 siblings, 0 replies; 5+ messages in thread
From: Maciej W. Rozycki @ 2014-07-07 15:49 UTC (permalink / raw)
  To: David Edelsohn; +Cc: GCC Patches

On Mon, 7 Jul 2014, David Edelsohn wrote:

> Yes, that's good. It helps if someone reading the code doesn't need to
> reverse engineer the numbering convention of the name.

 Sure, I haven't questioned your request.  Thanks for your review, applied 
now.

  Maciej


* [PATCH] Power/GCC: Implement little-endian SPE operations
@ 2014-07-07 11:41 Maciej W. Rozycki
  0 siblings, 0 replies; 5+ messages in thread
From: Maciej W. Rozycki @ 2014-07-07 11:41 UTC (permalink / raw)
  To: gcc-patches

Hi,

 This change implements little-endian code generation for Signal 
Processing Engine (SPE) operations.

 Where possible, changes are handled within the existing patterns, with 
suitable conditionals added to support the little-endian mode.

 In some cases operand constraints differ between the two endiannesses 
where a numerical entity is accessed in memory with a partial data 
transfer.  In these cases new patterns have been added and the existing 
patterns renamed to reflect the endianness each of them handles.
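
 To illustrate why the two endiannesses need different patterns here, the 
following is a minimal C sketch, not code from the patch.  The `spe_reg` 
type and all function names are hypothetical.  It assumes that a 32-bit 
move writes only the low word of the 64-bit register, which is what the 
`mr` alternatives in the subreg patterns rely on.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit SPE general-purpose register.  */
typedef struct { uint32_t hi, lo; } spe_reg;

/* Model of "mr rD,rS" as seen by the 64-bit register: only the low
   word is written (an assumption of this sketch).  */
static spe_reg model_mr (spe_reg d, uint32_t s)
{
  d.lo = s;
  return d;
}

/* Model of "evmergelo rD,rA,rB": the high word of the result comes
   from the low word of rA, the low word from the low word of rB.  */
static spe_reg model_evmergelo (spe_reg a, spe_reg b)
{
  spe_reg d = { a.lo, b.lo };
  return d;
}

/* Store VAL into the SImode subreg of D at byte OFFSET (0 or 4).
   Offset 0 names the high word on big-endian targets and the low
   word on little-endian ones, so the instruction needed differs.  */
static spe_reg set_si_subreg (spe_reg d, unsigned offset, uint32_t val,
                              bool words_big_endian)
{
  bool wants_high_word = words_big_endian ? offset == 0 : offset == 4;
  if (wants_high_word)
    {
      spe_reg src = { 0, val };	/* VAL sits in the low word of a GPR.  */
      return model_evmergelo (src, d);
    }
  return model_mr (d, val);
}
```

 In this model the offset-0 store takes the `evmergelo` path on big-endian 
targets and the plain move on little-endian ones, and the other way round 
for offset 4, which mirrors the split into `_be` and `_le` patterns.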

 Finally, the paired-integer vector permute intrinsics do not correspond to 
the same high-level operations in the two endiannesses and have therefore 
been reimplemented with new expander patterns.  The reason is that number 
pairs in vectors are placed in memory in the same order regardless of the 
endianness selected -- the first number occupies the lower-addressed unit 
and the second number takes the higher-addressed unit.  When transferred 
into a register with a doubleword vector load operation they appear in the 
register word-swapped between endiannesses.
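
 The word swap can be modelled with a short C sketch.  This is only an 
illustration under the description above: the `spe_reg` type and the 
`model_*` and `element` names are hypothetical, and `model_vec_perm00` 
mirrors the instruction choice made in the vec_perm00_v2si pattern.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit SPE general-purpose register.  */
typedef struct { uint32_t hi, lo; } spe_reg;

/* Model of a doubleword vector load of the pair MEM[0], MEM[1].
   Element 0 is at the lower address in both modes; the register word
   it lands in depends on the endianness.  */
static spe_reg model_evldd (const uint32_t mem[2], bool big_endian)
{
  spe_reg r;
  if (big_endian)
    {
      r.hi = mem[0];
      r.lo = mem[1];
    }
  else
    {
      r.hi = mem[1];
      r.lo = mem[0];
    }
  return r;
}

/* evmergehi rD,rA,rB: high words of rA and rB.  */
static spe_reg model_evmergehi (spe_reg a, spe_reg b)
{
  spe_reg d = { a.hi, b.hi };
  return d;
}

/* evmergelo rD,rA,rB: low words of rA and rB.  */
static spe_reg model_evmergelo (spe_reg a, spe_reg b)
{
  spe_reg d = { a.lo, b.lo };
  return d;
}

/* Read element I of R in memory element order.  */
static uint32_t element (spe_reg r, unsigned i, bool big_endian)
{
  bool high = big_endian ? i == 0 : i == 1;
  return high ? r.hi : r.lo;
}

/* Mirror of vec_perm00_v2si: the result is { a[0], b[0] } in memory
   order, using evmergehi on big-endian targets and evmergelo with
   swapped operands on little-endian ones.  */
static spe_reg model_vec_perm00 (spe_reg a, spe_reg b, bool big_endian)
{
  return big_endian ? model_evmergehi (a, b) : model_evmergelo (b, a);
}
```

 In both modes the model selects element 0 of each input, even though a 
different instruction and operand order are used to get there.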

 These intrinsics turned out not to be properly covered by the testsuite: 
a mistake made in the process of implementing the new expanders went 
through unnoticed, as only compilation-time checks are made and no 
run-time ones.  Therefore a new test case has been added that covers the 
intrinsics and that scores no failures with or without the changes made to 
GCC by this patch.

 The existing patterns that used to handle these intrinsics, and that can 
also be pulled in implicitly by the optimiser, have been renamed to reflect 
the individual vector permutation operations they implement and extended 
to handle the little endianness too.

 This change removes several hundred failures seen in powerpc-eabi 
GCC, G++, libstdc++ and also GDB testing for the:

-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe -mlittle

multilib and does not change results for the following powerpc-eabi 
multilibs:

-mcpu=603e
-mcpu=603e -msoft-float
-mcpu=8540 -mfloat-gprs=single -mspe=yes -mabi=spe
-mcpu=8540 -mfloat-gprs=single -mspe=yes -mabi=spe -msoft-float
-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe
-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe -msoft-float
-mcpu=7400 -maltivec -mabi=altivec

as well as the following powerpc-linux-gnu multilibs:

-mcpu=603e
-mcpu=603e -msoft-float
-mcpu=8540 -mfloat-gprs=single -mspe=yes -mabi=spe
-mcpu=8548 -mfloat-gprs=double -mspe=yes -mabi=spe
-mcpu=7400 -maltivec -mabi=altivec
-mcpu=e5500 -m64

 OK to apply?

2014-07-07  Maciej W. Rozycki  <macro@codesourcery.com>

	gcc/
	* config/rs6000/rs6000.c (output_vec_const_move): Handle
	little-endian code generation.
	* config/rs6000/spe.md (spe_evmergehi): Rename to...
	(vec_perm00_v2si): ... this.  Handle little-endian code 
	generation.
	(spe_evmergehilo): Rename to...
	(vec_perm01_v2si): ... this.  Handle little-endian code
	generation.
	(spe_evmergelo): Rename to...
	(vec_perm11_v2si): ... this.  Handle little-endian code
	generation.
	(spe_evmergelohi): Rename to...
	(vec_perm10_v2si): ... this.  Handle little-endian code
	generation.
	(spe_evmergehi, spe_evmergehilo): New expanders.
	(spe_evmergelo, spe_evmergelohi): Likewise.
	(*frob_<SPE64:mode>_<DITI:mode>): Handle little-endian code
	generation.
	(*frob_tf_ti): Likewise.
	(*frob_<mode>_di_2): Likewise.
	(*frob_tf_di_8_2): Likewise.
	(*frob_di_<mode>): Likewise.
	(*frob_ti_tf): Likewise.
	(*frob_<DITI:mode>_<SPE64:mode>_2): Likewise.
	(*frob_ti_<mode>_8_2): Likewise.
	(*frob_ti_tf_2): Likewise.
	(mov_si<mode>_e500_subreg0): Rename to...
	(mov_si<mode>_e500_subreg0_be): ... this.  Restrict to the big
	endianness only.
	(*mov_si<mode>_e500_subreg0_le): New instruction pattern.
	(*mov_si<mode>_e500_subreg0_elf_low): Rename to...
	(*mov_si<mode>_e500_subreg0_elf_low_be): ... this.  Restrict to 
	the big endianness only.
	(*mov_si<mode>_e500_subreg0_elf_low_le): New instruction pattern.
	(*mov_si<mode>_e500_subreg0_2): Rename to...
	(*mov_si<mode>_e500_subreg0_2_be): ... this.  Restrict to the
	big endianness only.
	(*mov_si<mode>_e500_subreg0_2_le): New instruction pattern.
	(*mov_si<mode>_e500_subreg4): Rename to...
	(*mov_si<mode>_e500_subreg4_be): ... this.  Restrict to the big
	endianness only.
	(mov_si<mode>_e500_subreg4_le): New instruction pattern.
	(*mov_si<mode>_e500_subreg4_elf_low): Rename to...
	(*mov_si<mode>_e500_subreg4_elf_low_be): ... this.  Restrict to
	the big endianness only.
	(*mov_si<mode>_e500_subreg4_elf_low_le): New instruction/splitter
	pattern.
	(*mov_si<mode>_e500_subreg4_2): Rename to...
	(*mov_si<mode>_e500_subreg4_2_be): ... this.  Restrict to the big
	endianness only.
	(*mov_si<mode>_e500_subreg4_2_le): New instruction pattern.
	(*mov_sitf_e500_subreg8): Rename to...
	(*mov_sitf_e500_subreg8_be): ... this.  Restrict to the big
	endianness only.
	(*mov_sitf_e500_subreg8_le): New instruction pattern.
	(*mov_sitf_e500_subreg8_2): Rename to...
	(*mov_sitf_e500_subreg8_2_be): ... this.  Restrict to the big
	endianness only.
	(*mov_sitf_e500_subreg8_2_le): New instruction pattern.
	(*mov_sitf_e500_subreg12): Rename to...
	(*mov_sitf_e500_subreg12_be): ... this.  Restrict to the big
	endianness only.
	(*mov_sitf_e500_subreg12_le): New instruction pattern.
	(*mov_sitf_e500_subreg12_2): Rename to...
	(*mov_sitf_e500_subreg12_2_be): ... this.  Restrict to the big
	endianness only.
	(*mov_sitf_e500_subreg12_2_le): New instruction pattern.

	gcc/testsuite/
	* gcc.target/powerpc/spe-evmerge.c: New file.

  Maciej

gcc-ppc-spe-le.diff
Index: gcc-fsf-trunk-quilt/gcc/config/rs6000/rs6000.c
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/config/rs6000/rs6000.c	2014-06-11 16:35:08.917560846 +0100
+++ gcc-fsf-trunk-quilt/gcc/config/rs6000/rs6000.c	2014-06-11 16:35:25.917851800 +0100
@@ -5299,8 +5299,10 @@ output_vec_const_move (rtx *operands)
   operands[2] = CONST_VECTOR_ELT (vec, 1);
   if (cst == cst2)
     return "li %0,%1\n\tevmergelo %0,%0,%0";
-  else
+  else if (WORDS_BIG_ENDIAN)
     return "li %0,%1\n\tevmergelo %0,%0,%0\n\tli %0,%2";
+  else
+    return "li %0,%2\n\tevmergelo %0,%0,%0\n\tli %0,%1";
 }
 
 /* Initialize TARGET of vector PAIRED to VALS.  */
Index: gcc-fsf-trunk-quilt/gcc/config/rs6000/spe.md
===================================================================
--- gcc-fsf-trunk-quilt.orig/gcc/config/rs6000/spe.md	2014-05-16 16:01:20.197526085 +0100
+++ gcc-fsf-trunk-quilt/gcc/config/rs6000/spe.md	2014-06-11 16:35:25.917851800 +0100
@@ -438,7 +438,7 @@
   [(set_attr "type" "vecload")
    (set_attr  "length" "4")])
 
-(define_insn "spe_evmergehi"
+(define_insn "vec_perm00_v2si"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
 	(vec_select:V2SI
 	  (vec_concat:V4SI
@@ -446,11 +446,16 @@
 	    (match_operand:V2SI 2 "gpc_reg_operand" "r"))
 	  (parallel [(const_int 0) (const_int 2)])))]
   "TARGET_SPE"
-  "evmergehi %0,%1,%2"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergehi %0,%1,%2";
+  else
+    return "evmergelo %0,%2,%1";
+}
   [(set_attr "type" "vecsimple")
    (set_attr  "length" "4")])
 
-(define_insn "spe_evmergehilo"
+(define_insn "vec_perm01_v2si"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
 	(vec_select:V2SI
 	  (vec_concat:V4SI
@@ -458,11 +463,16 @@
 	    (match_operand:V2SI 2 "gpc_reg_operand" "r"))
 	  (parallel [(const_int 0) (const_int 3)])))]
   "TARGET_SPE"
-  "evmergehilo %0,%1,%2"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergehilo %0,%1,%2";
+  else
+    return "evmergehilo %0,%2,%1";
+}
   [(set_attr "type" "vecsimple")
    (set_attr  "length" "4")])
 
-(define_insn "spe_evmergelo"
+(define_insn "vec_perm11_v2si"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
 	(vec_select:V2SI
 	  (vec_concat:V4SI
@@ -470,11 +480,16 @@
 	    (match_operand:V2SI 2 "gpc_reg_operand" "r"))
 	  (parallel [(const_int 1) (const_int 3)])))]
   "TARGET_SPE"
-  "evmergelo %0,%1,%2"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergelo %0,%1,%2";
+  else
+    return "evmergehi %0,%2,%1";
+}
   [(set_attr "type" "vecsimple")
    (set_attr  "length" "4")])
 
-(define_insn "spe_evmergelohi"
+(define_insn "vec_perm10_v2si"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
 	(vec_select:V2SI
 	  (vec_concat:V4SI
@@ -482,7 +497,12 @@
 	    (match_operand:V2SI 2 "gpc_reg_operand" "r"))
 	  (parallel [(const_int 1) (const_int 2)])))]
   "TARGET_SPE"
-  "evmergelohi %0,%1,%2"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergelohi %0,%1,%2";
+  else
+    return "evmergelohi %0,%2,%1";
+}
   [(set_attr "type" "vecsimple")
    (set_attr  "length" "4")])
 
@@ -499,6 +519,58 @@
     FAIL;
 })
 
+(define_expand "spe_evmergehi"
+  [(match_operand:V2SI 0 "register_operand" "")
+   (match_operand:V2SI 1 "register_operand" "")
+   (match_operand:V2SI 2 "register_operand" "")]
+  "TARGET_SPE"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vec_perm00_v2si (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_vec_perm11_v2si (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
+(define_expand "spe_evmergehilo"
+  [(match_operand:V2SI 0 "register_operand" "")
+   (match_operand:V2SI 1 "register_operand" "")
+   (match_operand:V2SI 2 "register_operand" "")]
+  "TARGET_SPE"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vec_perm01_v2si (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_vec_perm01_v2si (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
+(define_expand "spe_evmergelo"
+  [(match_operand:V2SI 0 "register_operand" "")
+   (match_operand:V2SI 1 "register_operand" "")
+   (match_operand:V2SI 2 "register_operand" "")]
+  "TARGET_SPE"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vec_perm11_v2si (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_vec_perm00_v2si (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
+(define_expand "spe_evmergelohi"
+  [(match_operand:V2SI 0 "register_operand" "")
+   (match_operand:V2SI 1 "register_operand" "")
+   (match_operand:V2SI 2 "register_operand" "")]
+  "TARGET_SPE"
+{
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen_vec_perm10_v2si (operands[0], operands[1], operands[2]));
+  else
+    emit_insn (gen_vec_perm10_v2si (operands[0], operands[2], operands[1]));
+  DONE;
+})
+
 (define_insn "spe_evnand"
   [(set (match_operand:V2SI 0 "gpc_reg_operand" "=r")
         (not:V2SI (and:V2SI (match_operand:V2SI 1 "gpc_reg_operand" "r")
@@ -2220,15 +2292,31 @@
         (subreg:SPE64 (match_operand:DITI 1 "input_operand" "r,m") 0))]
   "(TARGET_E500_DOUBLE && <SPE64:MODE>mode == DFmode)
    || (TARGET_SPE && <SPE64:MODE>mode != DFmode)"
-  "@
-   evmergelo %0,%1,%L1
-   evldd%X1 %0,%y1")
+{
+  switch (which_alternative)
+    {
+    default:
+      gcc_unreachable ();
+    case 0:
+      if (WORDS_BIG_ENDIAN)
+	return "evmergelo %0,%1,%L1";
+      else
+	return "evmergelo %0,%L1,%1";
+    case 1:
+      return "evldd%X1 %0,%y1";
+    }
+})
 
 (define_insn "*frob_tf_ti"
   [(set (match_operand:TF 0 "gpc_reg_operand" "=r")
         (subreg:TF (match_operand:TI 1 "gpc_reg_operand" "r") 0))]
   "TARGET_E500_DOUBLE"
-  "evmergelo %0,%1,%L1\;evmergelo %L0,%Y1,%Z1"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergelo %0,%1,%L1\;evmergelo %L0,%Y1,%Z1";
+  else
+    return "evmergelo %L0,%Z1,%Y1\;evmergelo %0,%L1,%1";
+}
   [(set_attr "length" "8")])
 
 (define_insn "*frob_<mode>_di_2"
@@ -2236,31 +2324,63 @@
         (match_operand:DI 1 "input_operand" "r,m"))]
   "(TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
    || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode)"
-  "@
-   evmergelo %0,%1,%L1
-   evldd%X1 %0,%y1")
+{
+  switch (which_alternative)
+    {
+    default:
+      gcc_unreachable ();
+    case 0:
+      if (WORDS_BIG_ENDIAN)
+	return "evmergelo %0,%1,%L1";
+      else
+	return "evmergelo %0,%L1,%1";
+    case 1:
+      return "evldd%X1 %0,%y1";
+    }
+})
 
 (define_insn "*frob_tf_di_8_2"
   [(set (subreg:DI (match_operand:TF 0 "nonimmediate_operand" "+&r,r") 8)
         (match_operand:DI 1 "input_operand" "r,m"))]
   "TARGET_E500_DOUBLE"
-  "@
-   evmergelo %L0,%1,%L1
-   evldd%X1 %L0,%y1")
+{
+  switch (which_alternative)
+    {
+    default:
+      gcc_unreachable ();
+    case 0:
+      if (WORDS_BIG_ENDIAN)
+	return "evmergelo %L0,%1,%L1";
+      else
+	return "evmergelo %L0,%L1,%1";
+    case 1:
+      return "evldd%X1 %L0,%y1";
+    }
+})
 
 (define_insn "*frob_di_<mode>"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=&r")
         (subreg:DI (match_operand:SPE64TF 1 "input_operand" "r") 0))]
   "(TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
    || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode)"
-  "evmergehi %0,%1,%1\;mr %L0,%1"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergehi %0,%1,%1\;mr %L0,%1";
+  else
+    return "evmergehi %L0,%1,%1\;mr %0,%1";
+}
   [(set_attr "length" "8")])
 
 (define_insn "*frob_ti_tf"
   [(set (match_operand:TI 0 "nonimmediate_operand" "=&r")
         (subreg:TI (match_operand:TF 1 "input_operand" "r") 0))]
   "TARGET_E500_DOUBLE"
-  "evmergehi %0,%1,%1\;mr %L0,%1\;evmergehi %Y0,%L1,%L1\;mr %Z0,%L1"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergehi %0,%1,%1\;mr %L0,%1\;evmergehi %Y0,%L1,%L1\;mr %Z0,%L1";
+  else
+    return "evmergehi %Z0,%L1,%L1\;mr %Y0,%L1\;evmergehi %L0,%1,%1\;mr %0,%1";
+}
   [(set_attr "length" "16")])
 
 (define_insn "*frob_<DITI:mode>_<SPE64:mode>_2"
@@ -2275,22 +2395,40 @@
     default: 
       gcc_unreachable ();
     case 0:
-      return \"evmergehi %0,%1,%1\;mr %L0,%1\";
+      if (WORDS_BIG_ENDIAN)
+	return \"evmergehi %0,%1,%1\;mr %L0,%1\";
+      else
+	return \"evmergehi %L0,%1,%1\;mr %0,%1\";
     case 1:
       /* If the address is not offsettable we need to load the whole
 	 doubleword into a 64-bit register and then copy the high word
 	 to form the correct output layout.  */
       if (!offsettable_nonstrict_memref_p (operands[1]))
-	return \"evldd%X1 %L0,%y1\;evmergehi %0,%L0,%L0\";
+	{
+	  if (WORDS_BIG_ENDIAN)
+	    return \"evldd%X1 %L0,%y1\;evmergehi %0,%L0,%L0\";
+	  else
+	    return \"evldd%X1 %0,%y1\;evmergehi %L0,%0,%0\";
+	}
       /* If the low-address word is used in the address, we must load
 	it last.  Otherwise, load it first.  Note that we cannot have
 	auto-increment in that case since the address register is
 	known to be dead.  */
       if (refers_to_regno_p (REGNO (operands[0]), REGNO (operands[0]) + 1,
 			     operands[1], 0))
-	return \"lwz %L0,%L1\;lwz %0,%1\";
+	{
+	  if (WORDS_BIG_ENDIAN)
+	    return \"lwz %L0,%L1\;lwz %0,%1\";
+	  else
+	    return \"lwz %0,%1\;lwz %L0,%L1\";
+	}
       else
-        return \"lwz%U1%X1 %0,%1\;lwz %L0,%L1\";
+	{
+	  if (WORDS_BIG_ENDIAN)
+	    return \"lwz%U1%X1 %0,%1\;lwz %L0,%L1\";
+	  else
+	    return \"lwz%U1%X1 %L0,%L1\;lwz %0,%1\";
+	}
     }
 }"
   [(set_attr "length" "8,8")])
@@ -2308,15 +2446,33 @@
     default: 
       gcc_unreachable ();
     case 0:
-      return \"evmergehi %Y0,%1,%1\;mr %Z0,%1\";
+      if (WORDS_BIG_ENDIAN)
+	return \"evmergehi %Y0,%1,%1\;mr %Z0,%1\";
+      else
+	return \"evmergehi %Z0,%1,%1\;mr %Y0,%1\";
     case 1:
       if (!offsettable_nonstrict_memref_p (operands[1]))
-	return \"evldd%X1 %Z0,%y1\;evmergehi %Y0,%Z0,%Z0\";
+	{
+	  if (WORDS_BIG_ENDIAN)
+	    return \"evldd%X1 %Z0,%y1\;evmergehi %Y0,%Z0,%Z0\";
+	  else
+	    return \"evldd%X1 %Y0,%y1\;evmergehi %Z0,%Y0,%Y0\";
+	}
       if (refers_to_regno_p (REGNO (operands[0]), REGNO (operands[0]) + 1,
 			     operands[1], 0))
-	return \"lwz %Z0,%L1\;lwz %Y0,%1\";
+	{
+	  if (WORDS_BIG_ENDIAN)
+	    return \"lwz %Z0,%L1\;lwz %Y0,%1\";
+	  else
+	    return \"lwz %Y0,%1\;lwz %Z0,%L1\";
+	}
       else
-        return \"lwz%U1%X1 %Y0,%1\;lwz %Z0,%L1\";
+	{
+	  if (WORDS_BIG_ENDIAN)
+	    return \"lwz%U1%X1 %Y0,%1\;lwz %Z0,%L1\";
+	  else
+	    return \"lwz%U1%X1 %Z0,%L1\;lwz %Y0,%1\";
+	}
     }
 }"
   [(set_attr "length" "8,8")])
@@ -2325,110 +2481,226 @@
   [(set (subreg:TF (match_operand:TI 0 "gpc_reg_operand" "=&r") 0)
 	(match_operand:TF 1 "input_operand" "r"))]
   "TARGET_E500_DOUBLE"
-  "evmergehi %0,%1,%1\;mr %L0,%1\;evmergehi %Y0,%L1,%L1\;mr %Z0,%L1"
+{
+  if (WORDS_BIG_ENDIAN)
+    return "evmergehi %0,%1,%1\;mr %L0,%1\;evmergehi %Y0,%L1,%L1\;mr %Z0,%L1";
+  else
+    return "evmergehi %Z0,%L1,%L1\;mr %Y0,%L1\;evmergehi %L0,%1,%1\;mr %0,%1";
+}
   [(set_attr "length" "16")])
 
-(define_insn "mov_si<mode>_e500_subreg0"
+(define_insn "mov_si<mode>_e500_subreg0_be"
   [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r,&r") 0)
 	(match_operand:SI 1 "input_operand" "r,m"))]
-  "(TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
-   || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode)"
+  "WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+       || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))"
   "@
    evmergelo %0,%1,%0
    evmergelohi %0,%0,%0\;lwz%U1%X1 %0,%1\;evmergelohi %0,%0,%0"
   [(set_attr "length" "4,12")])
 
-(define_insn_and_split "*mov_si<mode>_e500_subreg0_elf_low"
+(define_insn "*mov_si<mode>_e500_subreg0_le"
+  [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r,r") 0)
+	(match_operand:SI 1 "input_operand" "r,m"))]
+  "!WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+       || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))"
+  "@
+   mr %0,%1
+   lwz%U1%X1 %0,%1")
+
+(define_insn_and_split "*mov_si<mode>_e500_subreg0_elf_low_be"
   [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 0)
 	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r")
 		   (match_operand 2 "" "")))]
-  "((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
-    || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))
-   && TARGET_ELF && !TARGET_64BIT && can_create_pseudo_p ()"
+  "WORDS_BIG_ENDIAN
+   && (((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+	|| (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))
+       && TARGET_ELF && !TARGET_64BIT && can_create_pseudo_p ())"
   "#"
   "&& 1"
   [(pc)]
 {
   rtx tmp = gen_reg_rtx (SImode);
   emit_insn (gen_elf_low (tmp, operands[1], operands[2]));
-  emit_insn (gen_mov_si<mode>_e500_subreg0 (operands[0], tmp));
+  emit_insn (gen_mov_si<mode>_e500_subreg0_be (operands[0], tmp));
   DONE;
 }
   [(set_attr "length" "8")])
 
+(define_insn "*mov_si<mode>_e500_subreg0_elf_low_le"
+  [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 0)
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+		   (match_operand 2 "" "")))]
+  "!WORDS_BIG_ENDIAN
+   && (((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+	|| (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))
+       && TARGET_ELF && !TARGET_64BIT)"
+  "addic %0,%1,%K2")
+
 ;; ??? Could use evstwwe for memory stores in some cases, depending on
 ;; the offset.
-(define_insn "*mov_si<mode>_e500_subreg0_2"
+(define_insn "*mov_si<mode>_e500_subreg0_2_be"
   [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
 	(subreg:SI (match_operand:SPE64TF 1 "register_operand" "+r,&r") 0))]
-  "(TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
-   || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode)"
+  "WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+       || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))"
   "@
    evmergehi %0,%0,%1
    evmergelohi %1,%1,%1\;stw%U0%X0 %1,%0"
   [(set_attr "length" "4,8")])
 
-(define_insn "*mov_si<mode>_e500_subreg4"
+(define_insn "*mov_si<mode>_e500_subreg0_2_le"
+  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
+	(subreg:SI (match_operand:SPE64TF 1 "register_operand" "+r,r") 0))]
+  "!WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+       || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))"
+  "@
+   mr %0,%1
+   stw%U0%X0 %1,%0")
+
+(define_insn "*mov_si<mode>_e500_subreg4_be"
   [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r,r") 4)
 	(match_operand:SI 1 "input_operand" "r,m"))]
-  "(TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
-   || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode)"
+  "WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+       || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))"
   "@
    mr %0,%1
    lwz%U1%X1 %0,%1")
 
-(define_insn "*mov_si<mode>_e500_subreg4_elf_low"
+(define_insn "mov_si<mode>_e500_subreg4_le"
+  [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r,&r") 4)
+	(match_operand:SI 1 "input_operand" "r,m"))]
+  "!WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+       || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))"
+  "@
+   evmergelo %0,%1,%0
+   evmergelohi %0,%0,%0\;lwz%U1%X1 %0,%1\;evmergelohi %0,%0,%0"
+  [(set_attr "length" "4,12")])
+
+(define_insn "*mov_si<mode>_e500_subreg4_elf_low_be"
   [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 4)
 	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r")
 		   (match_operand 2 "" "")))]
-  "((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
-    || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))
-   && TARGET_ELF && !TARGET_64BIT"
+  "WORDS_BIG_ENDIAN
+   && (((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+	|| (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))
+       && TARGET_ELF && !TARGET_64BIT)"
   "addic %0,%1,%K2")
 
-(define_insn "*mov_si<mode>_e500_subreg4_2"
+(define_insn_and_split "*mov_si<mode>_e500_subreg4_elf_low_le"
+  [(set (subreg:SI (match_operand:SPE64TF 0 "register_operand" "+r") 4)
+	(lo_sum:SI (match_operand:SI 1 "gpc_reg_operand" "r")
+		   (match_operand 2 "" "")))]
+  "!WORDS_BIG_ENDIAN
+   && (((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+	|| (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))
+       && TARGET_ELF && !TARGET_64BIT && can_create_pseudo_p ())"
+  "#"
+  "&& 1"
+  [(pc)]
+{
+  rtx tmp = gen_reg_rtx (SImode);
+  emit_insn (gen_elf_low (tmp, operands[1], operands[2]));
+  emit_insn (gen_mov_si<mode>_e500_subreg4_le (operands[0], tmp));
+  DONE;
+}
+  [(set_attr "length" "8")])
+
+(define_insn "*mov_si<mode>_e500_subreg4_2_be"
   [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
 	(subreg:SI (match_operand:SPE64TF 1 "register_operand" "r,r") 4))]
-  "(TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
-   || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode)"
+  "WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+       || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))"
   "@
    mr %0,%1
    stw%U0%X0 %1,%0")
 
-(define_insn "*mov_sitf_e500_subreg8"
+(define_insn "*mov_si<mode>_e500_subreg4_2_le"
+  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
+	(subreg:SI (match_operand:SPE64TF 1 "register_operand" "+r,&r") 4))]
+  "!WORDS_BIG_ENDIAN
+   && ((TARGET_E500_DOUBLE && (<MODE>mode == DFmode || <MODE>mode == TFmode))
+       || (TARGET_SPE && <MODE>mode != DFmode && <MODE>mode != TFmode))"
+  "@
+   evmergehi %0,%0,%1
+   evmergelohi %1,%1,%1\;stw%U0%X0 %1,%0"
+  [(set_attr "length" "4,8")])
+
+(define_insn "*mov_sitf_e500_subreg8_be"
   [(set (subreg:SI (match_operand:TF 0 "register_operand" "+r,&r") 8)
 	(match_operand:SI 1 "input_operand" "r,m"))]
-  "TARGET_E500_DOUBLE"
+  "WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
   "@
    evmergelo %L0,%1,%L0
    evmergelohi %L0,%L0,%L0\;lwz%U1%X1 %L0,%1\;evmergelohi %L0,%L0,%L0"
   [(set_attr "length" "4,12")])
 
-(define_insn "*mov_sitf_e500_subreg8_2"
+(define_insn "*mov_sitf_e500_subreg8_le"
+  [(set (subreg:SI (match_operand:TF 0 "register_operand" "+r,r") 8)
+	(match_operand:SI 1 "input_operand" "r,m"))]
+  "!WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
+  "@
+   mr %L0,%1
+   lwz%U1%X1 %L0,%1")
+
+(define_insn "*mov_sitf_e500_subreg8_2_be"
   [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
 	(subreg:SI (match_operand:TF 1 "register_operand" "+r,&r") 8))]
-  "TARGET_E500_DOUBLE"
+  "WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
   "@
    evmergehi %0,%0,%L1
    evmergelohi %L1,%L1,%L1\;stw%U0%X0 %L1,%0"
   [(set_attr "length" "4,8")])
 
-(define_insn "*mov_sitf_e500_subreg12"
+(define_insn "*mov_sitf_e500_subreg8_2_le"
+  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
+	(subreg:SI (match_operand:TF 1 "register_operand" "r,r") 8))]
+  "!WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
+  "@
+   mr %0,%L1
+   stw%U0%X0 %L1,%0")
+
+(define_insn "*mov_sitf_e500_subreg12_be"
   [(set (subreg:SI (match_operand:TF 0 "register_operand" "+r,r") 12)
 	(match_operand:SI 1 "input_operand" "r,m"))]
-  "TARGET_E500_DOUBLE"
+  "WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
   "@
    mr %L0,%1
    lwz%U1%X1 %L0,%1")
 
-(define_insn "*mov_sitf_e500_subreg12_2"
+(define_insn "*mov_sitf_e500_subreg12_le"
+  [(set (subreg:SI (match_operand:TF 0 "register_operand" "+r,&r") 12)
+	(match_operand:SI 1 "input_operand" "r,m"))]
+  "!WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
+  "@
+   evmergelo %L0,%1,%L0
+   evmergelohi %L0,%L0,%L0\;lwz%U1%X1 %L0,%1\;evmergelohi %L0,%L0,%L0"
+  [(set_attr "length" "4,12")])
+
+(define_insn "*mov_sitf_e500_subreg12_2_be"
   [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
 	(subreg:SI (match_operand:TF 1 "register_operand" "r,r") 12))]
-  "TARGET_E500_DOUBLE"
+  "WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
   "@
    mr %0,%L1
    stw%U0%X0 %L1,%0")
 
+(define_insn "*mov_sitf_e500_subreg12_2_le"
+  [(set (match_operand:SI 0 "rs6000_nonimmediate_operand" "+r,m")
+	(subreg:SI (match_operand:TF 1 "register_operand" "+r,&r") 12))]
+  "!WORDS_BIG_ENDIAN && TARGET_E500_DOUBLE"
+  "@
+   evmergehi %0,%0,%L1
+   evmergelohi %L1,%L1,%L1\;stw%U0%X0 %L1,%0"
+  [(set_attr "length" "4,8")])
+
 ;; FIXME: Allow r=CONST0.
 (define_insn "*movdf_e500_double"
   [(set (match_operand:DF 0 "rs6000_nonimmediate_operand" "=r,r,m")
Index: gcc-fsf-trunk-quilt/gcc/testsuite/gcc.target/powerpc/spe-evmerge.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ gcc-fsf-trunk-quilt/gcc/testsuite/gcc.target/powerpc/spe-evmerge.c	2014-06-11 16:35:25.917851800 +0100
@@ -0,0 +1,71 @@
+/* Verify SPE vector permute builtins.  */
+/* { dg-do run { target { powerpc*-*-* && powerpc_spe } } } */
+/* Remove `-ansi' from options so that <spe.h> compiles.  */
+/* { dg-options "" } */
+
+#include <spe.h>
+#include <stdlib.h>
+
+#define vector __attribute__ ((vector_size (8)))
+
+#define WORDS_BIG_ENDIAN (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
+
+int
+main (void)
+{
+  vector int a = { 0x11111111, 0x22222222 };
+  vector int b = { 0x33333333, 0x44444444 };
+  vector int c;
+
+  /* c[hi] = a[hi], c[lo] = b[hi]  */
+  c = __ev_mergehi (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x11111111 : 0x44444444))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x33333333 : 0x22222222))
+    abort ();
+  /* c[hi] = a[lo], c[lo] = b[lo]  */
+  c = __ev_mergelo (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x22222222 : 0x33333333))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x44444444 : 0x11111111))
+    abort ();
+  /* c[hi] = a[lo], c[lo] = b[hi]  */
+  c = __ev_mergelohi (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x22222222 : 0x44444444))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x33333333 : 0x11111111))
+    abort ();
+  /* c[hi] = a[hi], c[lo] = b[lo]  */
+  c = __ev_mergehilo (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x11111111 : 0x33333333))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x44444444 : 0x22222222))
+    abort ();
+
+  /* c[hi] = a[hi], c[lo] = b[hi]  */
+  c = __builtin_spe_evmergehi (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x11111111 : 0x44444444))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x33333333 : 0x22222222))
+    abort ();
+  /* c[hi] = a[lo], c[lo] = b[lo]  */
+  c = __builtin_spe_evmergelo (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x22222222 : 0x33333333))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x44444444 : 0x11111111))
+    abort ();
+  /* c[hi] = a[lo], c[lo] = b[hi]  */
+  c = __builtin_spe_evmergelohi (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x22222222 : 0x44444444))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x33333333 : 0x11111111))
+    abort ();
+  /* c[hi] = a[hi], c[lo] = b[lo]  */
+  c = __builtin_spe_evmergehilo (a, b);
+  if (c[0] != (WORDS_BIG_ENDIAN ? 0x11111111 : 0x33333333))
+    abort ();
+  if (c[1] != (WORDS_BIG_ENDIAN ? 0x44444444 : 0x22222222))
+    abort ();
+
+  return 0;
+}
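
The lane mapping that the testcase above checks can be modelled in portable C. This is only an illustrative sketch, not part of the patch: `struct reg64`, `load_v2si`, `store_v2si` and `model_self_check` are hypothetical names, and the model assumes that vector element 0 occupies the high word of the 64-bit register on big-endian targets and the low word on little-endian ones. The four merge functions follow the `c[hi]`/`c[lo]` comments in the testcase.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 64-bit SPE register as two 32-bit words.  */
struct reg64 { uint32_t hi, lo; };

/* Assumption: element 0 is the high word on big-endian targets and
   the low word on little-endian targets.  */
static struct reg64
load_v2si (const uint32_t v[2], int big_endian)
{
  struct reg64 r;
  r.hi = big_endian ? v[0] : v[1];
  r.lo = big_endian ? v[1] : v[0];
  return r;
}

static void
store_v2si (uint32_t v[2], struct reg64 r, int big_endian)
{
  v[0] = big_endian ? r.hi : r.lo;
  v[1] = big_endian ? r.lo : r.hi;
}

/* Register-level semantics, as stated in the testcase comments.  */
static struct reg64
evmergehi (struct reg64 a, struct reg64 b)
{
  struct reg64 r = { a.hi, b.hi };	/* c[hi] = a[hi], c[lo] = b[hi]  */
  return r;
}

static struct reg64
evmergelo (struct reg64 a, struct reg64 b)
{
  struct reg64 r = { a.lo, b.lo };	/* c[hi] = a[lo], c[lo] = b[lo]  */
  return r;
}

static struct reg64
evmergelohi (struct reg64 a, struct reg64 b)
{
  struct reg64 r = { a.lo, b.hi };	/* c[hi] = a[lo], c[lo] = b[hi]  */
  return r;
}

static struct reg64
evmergehilo (struct reg64 a, struct reg64 b)
{
  struct reg64 r = { a.hi, b.lo };	/* c[hi] = a[hi], c[lo] = b[lo]  */
  return r;
}

typedef struct reg64 (*merge_fn) (struct reg64, struct reg64);

/* Return 0 if the model reproduces the values the testcase expects
   for both endiannesses, 1 otherwise.  */
int
model_self_check (void)
{
  static const uint32_t a[2] = { 0x11111111, 0x22222222 };
  static const uint32_t b[2] = { 0x33333333, 0x44444444 };
  static const merge_fn ops[4]
    = { evmergehi, evmergelo, evmergelohi, evmergehilo };
  /* expected[op][big_endian][element], copied from the testcase.  */
  static const uint32_t expected[4][2][2] = {
    { { 0x44444444, 0x22222222 }, { 0x11111111, 0x33333333 } },
    { { 0x33333333, 0x11111111 }, { 0x22222222, 0x44444444 } },
    { { 0x44444444, 0x11111111 }, { 0x22222222, 0x33333333 } },
    { { 0x33333333, 0x22222222 }, { 0x11111111, 0x44444444 } },
  };

  for (int op = 0; op < 4; op++)
    for (int be = 0; be < 2; be++)
      {
	uint32_t c[2];
	store_v2si (c, ops[op] (load_v2si (a, be), load_v2si (b, be)), be);
	if (c[0] != expected[op][be][0] || c[1] != expected[op][be][1])
	  return 1;
      }
  return 0;
}
```

Under that assumption the instructions themselves behave identically in both modes; only the element-to-word mapping changes, which is why each check in the testcase selects a different expected value depending on `WORDS_BIG_ENDIAN`.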

Thread overview: 5+ messages (end of thread: 2014-07-07 15:49 UTC)
2014-07-07 13:58 [PATCH] Power/GCC: Implement little-endian SPE operations David Edelsohn
2014-07-07 14:33 ` Maciej W. Rozycki
2014-07-07 15:05   ` David Edelsohn
2014-07-07 15:49     ` Maciej W. Rozycki
  -- strict thread matches above, loose matches on Subject: below --
2014-07-07 11:41 Maciej W. Rozycki
