public inbox for gcc-patches@gcc.gnu.org
* [PATCH, ARM] Improved core -> NEON extend
@ 2012-11-13 13:50 Ulrich Weigand
  2012-12-17 14:35 ` Richard Earnshaw
  0 siblings, 1 reply; 2+ messages in thread
From: Ulrich Weigand @ 2012-11-13 13:50 UTC (permalink / raw)
  To: gcc-patches; +Cc: ramrad01

Hello,

here's another of Andrew's patches to improve NEON usage.  This one was
originally posted here:
http://gcc.gnu.org/ml/gcc-patches/2012-02/msg01213.html

The idea is to improve SImode to DImode extends that also move from core
registers to NEON registers.  In this situation, the compiler used to
perform the extension in core registers first and then move the result
to NEON, wasting a core register in the process.

The patch changes this to move to NEON first and extend there.
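
As a rough illustration of why "move first, extend there" gives the right
answer for SImode, here is a plain C model of the two-instruction sequence
that the new testcases scan for (vdup.32 followed by a 64-bit vshr by 32).
The function names are hypothetical and exist only for this sketch; nothing
here is part of the patch, and the model ignores register allocation
entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "vdup.32 dN, rM": the 32-bit core value is
   copied into both 32-bit lanes of a 64-bit D register.  */
static uint64_t
model_vdup32 (uint32_t x)
{
  return ((uint64_t) x << 32) | x;
}

/* Model of vdup.32 followed by "vshr.u64 dN, dN, #32": the logical
   shift moves the upper lane down and fills the top half with zeros.  */
static uint64_t
model_zero_extend (uint32_t x)
{
  return model_vdup32 (x) >> 32;
}

/* Model of vdup.32 followed by "vshr.s64 dN, dN, #32": the arithmetic
   shift fills the top half with copies of the sign bit.  The shift is
   spelled out on unsigned values to avoid relying on how >> behaves
   for negative operands.  */
static int64_t
model_sign_extend (int32_t x)
{
  uint64_t d = model_vdup32 ((uint32_t) x);
  uint64_t shifted = d >> 32;
  if (d >> 63)
    shifted |= UINT64_C (0xffffffff00000000);
  /* Assumes the usual two's-complement conversion, which GCC provides.  */
  return (int64_t) shifted;
}

/* Small self-check comparing the model against ordinary C conversions.  */
static void
self_check (void)
{
  assert (model_zero_extend (0x80000001u) == UINT64_C (0x80000001));
  assert (model_sign_extend (-2) == (int64_t) -2);
  assert (model_sign_extend (INT32_MIN) == (int64_t) INT32_MIN);
}
```

The duplicate into both lanes is what makes a single shift sufficient: the
copy in the upper lane is the one that ends up in the low half of the result.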

[ This patch requires both the NEON shift and the lower-subreg patches,
both of which are now in mainline, so this patch is ready to merge
as well at this point.  ]

Tested on arm-linux-gnueabi.

OK for mainline?

Bye,
Ulrich


2012-11-13  Andrew Stubbs  <ams@codesourcery.com>
	    Ulrich Weigand  <ulrich.weigand@linaro.org>

	gcc/
	* config/arm/arm.md (zero_extend<mode>di2): Add extra alternatives
	for NEON registers.
	Add alternative for one-instruction extend-in-place.
	(extend<mode>di2): Likewise.
	Add constraints for Thumb-mode memory loads.
	Prevent extend splitters doing NEON alternatives.
	* config/arm/iterators.md (qhs_extenddi_cstr, qhs_zextenddi_cstr):
	Adjust constraints to add new alternatives.
	* config/arm/neon.md: Add splitters for zero- and sign-extend.

	gcc/testsuite/
	* gcc.target/arm/neon-extend-1.c: New file.
	* gcc.target/arm/neon-extend-2.c: New file.

=== modified file 'gcc/config/arm/arm.md'
--- gcc/config/arm/arm.md	2012-09-19 12:57:52 +0000
+++ gcc/config/arm/arm.md	2012-09-19 13:19:31 +0000
@@ -4567,33 +4567,36 @@
 ;; Zero and sign extension instructions.
 
 (define_insn "zero_extend<mode>di2"
-  [(set (match_operand:DI 0 "s_register_operand" "=r")
+  [(set (match_operand:DI 0 "s_register_operand" "=w,r,?r")
         (zero_extend:DI (match_operand:QHSI 1 "<qhs_zextenddi_op>"
 					    "<qhs_zextenddi_cstr>")))]
   "TARGET_32BIT <qhs_zextenddi_cond>"
   "#"
-  [(set_attr "length" "8")
+  [(set_attr "length" "8,4,8")
    (set_attr "ce_count" "2")
    (set_attr "predicable" "yes")]
 )
 
 (define_insn "extend<mode>di2"
-  [(set (match_operand:DI 0 "s_register_operand" "=r")
+  [(set (match_operand:DI 0 "s_register_operand" "=w,r,?r,?r")
         (sign_extend:DI (match_operand:QHSI 1 "<qhs_extenddi_op>"
 					    "<qhs_extenddi_cstr>")))]
   "TARGET_32BIT <qhs_sextenddi_cond>"
   "#"
-  [(set_attr "length" "8")
+  [(set_attr "length" "8,4,8,8")
    (set_attr "ce_count" "2")
    (set_attr "shift" "1")
-   (set_attr "predicable" "yes")]
+   (set_attr "predicable" "yes")
+   (set_attr "arch" "*,*,a,t")]
 )
 
 ;; Splits for all extensions to DImode
 (define_split
   [(set (match_operand:DI 0 "s_register_operand" "")
         (zero_extend:DI (match_operand 1 "nonimmediate_operand" "")))]
-  "TARGET_32BIT"
+  "TARGET_32BIT && (!TARGET_NEON
+		    || (reload_completed
+			&& !(IS_VFP_REGNUM (REGNO (operands[0])))))"
   [(set (match_dup 0) (match_dup 1))]
 {
   rtx lo_part = gen_lowpart (SImode, operands[0]);
@@ -4619,7 +4622,9 @@
 (define_split
   [(set (match_operand:DI 0 "s_register_operand" "")
         (sign_extend:DI (match_operand 1 "nonimmediate_operand" "")))]
-  "TARGET_32BIT"
+  "TARGET_32BIT && (!TARGET_NEON
+		    || (reload_completed
+			&& !(IS_VFP_REGNUM (REGNO (operands[0])))))"
   [(set (match_dup 0) (ashiftrt:SI (match_dup 1) (const_int 31)))]
 {
   rtx lo_part = gen_lowpart (SImode, operands[0]);

=== modified file 'gcc/config/arm/iterators.md'
--- gcc/config/arm/iterators.md	2012-09-19 12:57:52 +0000
+++ gcc/config/arm/iterators.md	2012-09-19 13:19:31 +0000
@@ -412,8 +412,8 @@
 (define_mode_attr qhs_extenddi_op [(SI "s_register_operand")
 				   (HI "nonimmediate_operand")
 				   (QI "arm_reg_or_extendqisi_mem_op")])
-(define_mode_attr qhs_extenddi_cstr [(SI "r") (HI "rm") (QI "rUq")])
-(define_mode_attr qhs_zextenddi_cstr [(SI "r") (HI "rm") (QI "rm")])
+(define_mode_attr qhs_extenddi_cstr [(SI "r,0,r,r") (HI "r,0,rm,rm") (QI "r,0,rUq,rm")])
+(define_mode_attr qhs_zextenddi_cstr [(SI "r,0,r") (HI "r,0,rm") (QI "r,0,rm")])
 
 ;; Mode attributes used for fixed-point support.
 (define_mode_attr qaddsub_suf [(V4UQQ "8") (V2UHQ "16") (UQQ "8") (UHQ "16")

=== modified file 'gcc/config/arm/neon.md'
--- gcc/config/arm/neon.md	2012-09-19 12:57:52 +0000
+++ gcc/config/arm/neon.md	2012-09-19 13:19:31 +0000
@@ -5878,3 +5878,65 @@
                                    (const_string "neon_fp_vadd_qqq_vabs_qq"))
                      (const_string "neon_int_5")))]
 )
+
+;; Copy from core-to-neon regs, then extend, not vice-versa
+
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+	(sign_extend:DI (match_operand:SI 1 "s_register_operand" "")))]
+  "TARGET_NEON && reload_completed && IS_VFP_REGNUM (REGNO (operands[0]))"
+  [(set (match_dup 2) (vec_duplicate:V2SI (match_dup 1)))
+   (set (match_dup 0) (ashiftrt:DI (match_dup 0) (const_int 32)))]
+  {
+    operands[2] = gen_rtx_REG (V2SImode, REGNO (operands[0]));
+  })
+
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+	(sign_extend:DI (match_operand:HI 1 "s_register_operand" "")))]
+  "TARGET_NEON && reload_completed && IS_VFP_REGNUM (REGNO (operands[0]))"
+  [(set (match_dup 2) (vec_duplicate:V4HI (match_dup 1)))
+   (set (match_dup 0) (ashiftrt:DI (match_dup 0) (const_int 48)))]
+  {
+    operands[2] = gen_rtx_REG (V4HImode, REGNO (operands[0]));
+  })
+
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+	(sign_extend:DI (match_operand:QI 1 "s_register_operand" "")))]
+  "TARGET_NEON && reload_completed && IS_VFP_REGNUM (REGNO (operands[0]))"
+  [(set (match_dup 2) (vec_duplicate:V8QI (match_dup 1)))
+   (set (match_dup 0) (ashiftrt:DI (match_dup 0) (const_int 56)))]
+  {
+    operands[2] = gen_rtx_REG (V8QImode, REGNO (operands[0]));
+  })
+
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+	(zero_extend:DI (match_operand:SI 1 "s_register_operand" "")))]
+  "TARGET_NEON && reload_completed && IS_VFP_REGNUM (REGNO (operands[0]))"
+  [(set (match_dup 2) (vec_duplicate:V2SI (match_dup 1)))
+   (set (match_dup 0) (lshiftrt:DI (match_dup 0) (const_int 32)))]
+  {
+    operands[2] = gen_rtx_REG (V2SImode, REGNO (operands[0]));
+  })
+
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+	(zero_extend:DI (match_operand:HI 1 "s_register_operand" "")))]
+  "TARGET_NEON && reload_completed && IS_VFP_REGNUM (REGNO (operands[0]))"
+  [(set (match_dup 2) (vec_duplicate:V4HI (match_dup 1)))
+   (set (match_dup 0) (lshiftrt:DI (match_dup 0) (const_int 48)))]
+  {
+    operands[2] = gen_rtx_REG (V4HImode, REGNO (operands[0]));
+  })
+
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+	(zero_extend:DI (match_operand:QI 1 "s_register_operand" "")))]
+  "TARGET_NEON && reload_completed && IS_VFP_REGNUM (REGNO (operands[0]))"
+  [(set (match_dup 2) (vec_duplicate:V8QI (match_dup 1)))
+   (set (match_dup 0) (lshiftrt:DI (match_dup 0) (const_int 56)))]
+  {
+    operands[2] = gen_rtx_REG (V8QImode, REGNO (operands[0]));
+  })

=== added file 'gcc/testsuite/gcc.target/arm/neon-extend-1.c'
--- gcc/testsuite/gcc.target/arm/neon-extend-1.c	1970-01-01 00:00:00 +0000
+++ gcc/testsuite/gcc.target/arm/neon-extend-1.c	2012-09-19 13:19:31 +0000
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_neon } */
+
+void
+f (unsigned int a)
+{
+  unsigned long long b = a;
+  asm volatile ("@ extended to %0" : : "w" (b));
+}
+
+/* { dg-final { scan-assembler "vdup.32" } } */
+/* { dg-final { scan-assembler "vshr.u64" } } */

=== added file 'gcc/testsuite/gcc.target/arm/neon-extend-2.c'
--- gcc/testsuite/gcc.target/arm/neon-extend-2.c	1970-01-01 00:00:00 +0000
+++ gcc/testsuite/gcc.target/arm/neon-extend-2.c	2012-09-19 13:19:31 +0000
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target arm_neon_hw } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_neon } */
+
+void
+f (int a)
+{
+  long long b = a;
+  asm volatile ("@ extended to %0" : : "w" (b));
+}
+
+/* { dg-final { scan-assembler "vdup.32" } } */
+/* { dg-final { scan-assembler "vshr.s64" } } */
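
The HImode and QImode splitters in neon.md follow the same pattern as the
SImode ones, with shift counts of 48 and 56, i.e. 64 minus the source width.
The following C sketch models that relationship for any of the three lane
widths.  The helper names are made up for illustration and the code is only
an arithmetic model, not a description of what the backend emits beyond the
RTL shown in the splitters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of vec_duplicate into a 64-bit register:
   replicate the low BITS bits of X into every lane.  BITS is assumed
   to be 8, 16 or 32, matching V8QI, V4HI and V2SI.  */
static uint64_t
model_vdup (uint64_t x, unsigned bits)
{
  uint64_t lane = x & ((UINT64_C (1) << bits) - 1);
  uint64_t d = 0;
  for (unsigned i = 0; i < 64; i += bits)
    d |= lane << i;
  return d;
}

/* Model of (lshiftrt:DI d (const_int 64 - BITS)): zero extension.  */
static uint64_t
model_zext (uint64_t x, unsigned bits)
{
  return model_vdup (x, bits) >> (64 - bits);
}

/* Model of (ashiftrt:DI d (const_int 64 - BITS)): sign extension.
   The sign fill is written out by hand on unsigned values.  */
static int64_t
model_sext (uint64_t x, unsigned bits)
{
  uint64_t d = model_vdup (x, bits);
  uint64_t shifted = d >> (64 - bits);
  if (d >> 63)
    shifted |= ~UINT64_C (0) << bits;
  /* Assumes the usual two's-complement conversion, which GCC provides.  */
  return (int64_t) shifted;
}

/* Self-check for the QImode (shift by 56) and HImode (shift by 48) cases.  */
static void
self_check_narrow (void)
{
  assert (model_zext (0xff, 8) == 0xff);
  assert (model_sext (0xff, 8) == -1);
  assert (model_zext (0x8000, 16) == 0x8000);
  assert (model_sext (0x8000, 16) == -32768);
}
```

Only the top lane of the duplicated value survives the shift, so the
contents of the other lanes do not affect the result.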

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  Ulrich.Weigand@de.ibm.com


* Re: [PATCH, ARM] Improved core -> NEON extend
  2012-11-13 13:50 [PATCH, ARM] Improved core -> NEON extend Ulrich Weigand
@ 2012-12-17 14:35 ` Richard Earnshaw
  0 siblings, 0 replies; 2+ messages in thread
From: Richard Earnshaw @ 2012-12-17 14:35 UTC (permalink / raw)
  To: Ulrich Weigand; +Cc: gcc-patches, Ramana Radhakrishnan

On 13/11/12 13:50, Ulrich Weigand wrote:
> Hello,
>
> here's another of Andrew's patches to improve NEON usage.  This one was
> originally posted here:
> http://gcc.gnu.org/ml/gcc-patches/2012-02/msg01213.html
>
> The idea is to improve SImode to DImode extends that also move from core
> registers to NEON registers.  In this situation, the compiler used to
> perform the extension in core registers first and then move the result
> to NEON, wasting a core register in the process.
>
> The patch changes this to move to NEON first and extend there.
>
> [ This patch requires both the NEON shift and the lower-subreg patches,
> both of which are now in mainline, so this patch is ready to merge
> as well at this point.  ]
>
> Tested on arm-linux-gnueabi.
>
> OK for mainline?
>
> Bye,
> Ulrich
>
>
> 2012-11-13  Andrew Stubbs  <ams@codesourcery.com>
> 	    Ulrich Weigand  <ulrich.weigand@linaro.org>
>
> 	gcc/
> 	* config/arm/arm.md (zero_extend<mode>di2): Add extra alternatives
> 	for NEON registers.
> 	Add alternative for one-instruction extend-in-place.
> 	(extend<mode>di2): Likewise.
> 	Add constraints for Thumb-mode memory loads.
> 	Prevent extend splitters doing NEON alternatives.
> 	* config/arm/iterators.md (qhs_extenddi_cstr, qhs_zextenddi_cstr):
> 	Adjust constraints to add new alternatives.
> 	* config/arm/neon.md: Add splitters for zero- and sign-extend.
>
> 	gcc/testsuite/
> 	* gcc.target/arm/neon-extend-1.c: New file.
> 	* gcc.target/arm/neon-extend-2.c: New file.
>

OK.

R.

