* [PATCH, rs6000] Improve TImode add/sub
@ 2014-04-09 0:07 Pat Haugen
2014-04-09 2:56 ` segher
0 siblings, 1 reply; 7+ messages in thread
From: Pat Haugen @ 2014-04-09 0:07 UTC (permalink / raw)
To: GCC Patches; +Cc: David Edelsohn
[-- Attachment #1: Type: text/plain, Size: 449 bytes --]
The following patch improves the code generated for TImode add/sub so
that we now generate a 2 insn sequence which makes use of the carry bit.
Bootstrap/regtest (on both BE/LE) with no new failures. Ok for trunk and
4.8?
-Pat
2014-04-08 Pat Haugen <pthaugen@us.ibm.com>
* config/rs6000/rs6000.md (addti3, subti3): New.
gcc/testsuite:
* gcc.target/powerpc/ti_math1.c: New.
* gcc.target/powerpc/ti_math2.c: New.
[-- Attachment #2: add-sub_ti.diff --]
[-- Type: text/x-patch, Size: 4535 bytes --]
Index: gcc/testsuite/gcc.target/powerpc/ti_math1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ti_math1.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ti_math1.c (revision 0)
@@ -0,0 +1,21 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "addc" 1 } } */
+/* { dg-final { scan-assembler-times "adde" 1 } } */
+/* { dg-final { scan-assembler-times "subfc" 1 } } */
+/* { dg-final { scan-assembler-times "subfe" 1 } } */
+/* { dg-final { scan-assembler-not "subf " } } */
+
+__int128
+add_128 (__int128 *ptr, __int128 val)
+{
+ return (*ptr + val);
+}
+
+__int128
+sub_128 (__int128 *ptr, __int128 val)
+{
+ return (*ptr - val);
+}
+
Index: gcc/testsuite/gcc.target/powerpc/ti_math2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ti_math2.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ti_math2.c (revision 0)
@@ -0,0 +1,74 @@
+/* { dg-do run { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O2 -fno-inline" } */
+
+union U {
+ __int128 i128;
+ struct {
+ long l1;
+ long l2;
+ } s;
+};
+
+union U u1,u2;
+
+__int128
+create_128 (long most_sig, long least_sig)
+{
+ union U u;
+
+#if __LITTLE_ENDIAN__
+ u.s.l1 = least_sig;
+ u.s.l2 = most_sig;
+#else
+ u.s.l1 = most_sig;
+ u.s.l2 = least_sig;
+#endif
+ return u.i128;
+}
+
+long most_sig (union U * u)
+{
+#if __LITTLE_ENDIAN__
+ return (*u).s.l2;
+#else
+ return (*u).s.l1;
+#endif
+}
+
+long least_sig (union U * u)
+{
+#if __LITTLE_ENDIAN__
+ return (*u).s.l1;
+#else
+ return (*u).s.l2;
+#endif
+}
+
+__int128
+add_128 (__int128 *ptr, __int128 val)
+{
+ return (*ptr + val);
+}
+
+__int128
+sub_128 (__int128 *ptr, __int128 val)
+{
+ return (*ptr - val);
+}
+
+int
+main (void)
+{
+ /* Do a simple add/sub to make sure carry is happening between the dwords
+ and that dwords are in correct endian order. */
+ u1.i128 = create_128 (1, -1);
+ u2.i128 = add_128 (&u1.i128, 1);
+ if ((most_sig (&u2) != 2) || (least_sig (&u2) != 0))
+ __builtin_abort ();
+ u2.i128 = sub_128 (&u2.i128, 1);
+ if ((most_sig (&u2) != 1) || (least_sig (&u2) != -1))
+ __builtin_abort ();
+ return 0;
+}
+
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 209226)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -6535,6 +6535,51 @@ (define_insn_and_split "*floatunsdisf2_m
[(set_attr "length" "8")
(set_attr "type" "fpload")])
\f
+;; Define the TImode operations that can be done in a small number
+;; of instructions. The & constraints are to prevent the register
+;; allocator from allocating registers that overlap with the inputs
+;; (for example, having an input in 7,8 and an output in 6,7). We
+;; also allow for the output being the same as one of the inputs.
+
+(define_insn "addti3"
+ [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r")
+ (plus:TI (match_operand:TI 1 "gpc_reg_operand" "%r,r,0,0")
+ (match_operand:TI 2 "reg_or_short_operand" "r,I,r,I")))]
+ "TARGET_POWERPC64"
+ "*
+{
+ if (WORDS_BIG_ENDIAN)
+ return (GET_CODE (operands[2])) != CONST_INT
+ ? \"addc %L0,%L1,%L2\;adde %0,%1,%2\"
+ : \"addic %L0,%L1,%2\;add%G2e %0,%1\";
+ else
+ return (GET_CODE (operands[2])) != CONST_INT
+ ? \"addc %0,%1,%2\;adde %L0,%L1,%L2\"
+ : \"addic %0,%1,%2\;add%G2e %L0,%L1\";
+}"
+ [(set_attr "type" "two")
+ (set_attr "length" "8")])
+
+(define_insn "subti3"
+ [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r,r")
+ (minus:TI (match_operand:TI 1 "reg_or_short_operand" "r,I,0,r,I")
+ (match_operand:TI 2 "gpc_reg_operand" "r,r,r,0,0")))]
+ "TARGET_POWERPC64"
+ "*
+{
+ if (WORDS_BIG_ENDIAN)
+ return (GET_CODE (operands[1]) != CONST_INT)
+ ? \"subfc %L0,%L2,%L1\;subfe %0,%2,%1\"
+ : \"subfic %L0,%L2,%1\;subf%G1e %0,%2\";
+ else
+ return (GET_CODE (operands[1]) != CONST_INT)
+ ? \"subfc %0,%2,%1\;subfe %L0,%L2,%L1\"
+ : \"subfic %0,%2,%1\;subf%G1e %L0,%L2\";
+}"
+ [(set_attr "type" "two")
+ (set_attr "length" "8")])
+
+
;; Define the DImode operations that can be done in a small number
;; of instructions. The & constraints are to prevent the register
;; allocator from allocating registers that overlap with the inputs
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub
2014-04-09 0:07 [PATCH, rs6000] Improve TImode add/sub Pat Haugen
@ 2014-04-09 2:56 ` segher
2014-04-16 21:06 ` Pat Haugen
0 siblings, 1 reply; 7+ messages in thread
From: segher @ 2014-04-09 2:56 UTC (permalink / raw)
To: Pat Haugen; +Cc: GCC Patches, David Edelsohn
Hello,
On Tue, Apr 08, 2014 at 07:06:58PM -0500, Pat Haugen wrote:
> The following patch improves the code generated for TImode add/sub so
> that we now generate a 2 insn sequence which makes use of the carry bit.
> +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
Please leave out the default arguments. Why does this need skipping
on Darwin?
> +;; Define the TImode operations that can be done in a small number
> +;; of instructions. The & constraints are to prevent the register
> +;; allocator from allocating registers that overlap with the inputs
> +;; (for example, having an input in 7,8 and an output in 6,7). We
> +;; also allow for the output being the same as one of the inputs.
> +
> +(define_insn "addti3"
> + [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r")
> + (plus:TI (match_operand:TI 1 "gpc_reg_operand" "%r,r,0,0")
> + (match_operand:TI 2 "reg_or_short_operand" "r,I,r,I")))]
> + "TARGET_POWERPC64"
That's not the correct condition: the carry bit is set based on the 32-bit
carry in 32-bit mode, so the condition has to be TARGET_64BIT.
The adddi3 pattern has !TARGET_POWERPC64 since a 64-bit addition can
be done without addc on a 64-bit machine, no matter what mode the CPU
is in.
> + "*
> +{
Might as well leave out this stuff on new code, just use the braces :-)
Segher
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub
2014-04-09 2:56 ` segher
@ 2014-04-16 21:06 ` Pat Haugen
2014-04-17 3:34 ` David Edelsohn
0 siblings, 1 reply; 7+ messages in thread
From: Pat Haugen @ 2014-04-16 21:06 UTC (permalink / raw)
To: segher; +Cc: GCC Patches, David Edelsohn
[-- Attachment #1: Type: text/plain, Size: 1387 bytes --]
On 04/08/2014 09:56 PM, segher@kernel.crashing.org wrote:
> +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
> +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
> Please leave out the default arguments. Why does this need skipping
> on Darwin?
>
>> +;; Define the TImode operations that can be done in a small number
>> +;; of instructions. The & constraints are to prevent the register
>> +;; allocator from allocating registers that overlap with the inputs
>> +;; (for example, having an input in 7,8 and an output in 6,7). We
>> +;; also allow for the output being the same as one of the inputs.
>> +
>> +(define_insn "addti3"
>> + [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r")
>> + (plus:TI (match_operand:TI 1 "gpc_reg_operand" "%r,r,0,0")
>> + (match_operand:TI 2 "reg_or_short_operand" "r,I,r,I")))]
>> + "TARGET_POWERPC64"
> That's not the correct condition: the carry bit is set based on the 32-bit
> carry in 32-bit mode, so the condition has to be TARGET_64BIT.
>
> The adddi3 pattern has !TARGET_POWERPC64 since a 64-bit addition can
> be done without addc on a 64-bit machine, no matter what mode the CPU
> is in.
>
>> + "*
>> +{
> Might as well leave out this stuff on new code, just use the braces :-)
>
>
Updated patch with above comments incorporated. Bootstrap/regtest on
BE/LE with no new regressions. Ok for trunk?
-Pat
[-- Attachment #2: add-sub_ti.diff --]
[-- Type: text/x-patch, Size: 4387 bytes --]
Index: gcc/testsuite/gcc.target/powerpc/ti_math1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ti_math1.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ti_math1.c (revision 0)
@@ -0,0 +1,20 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "addc" 1 } } */
+/* { dg-final { scan-assembler-times "adde" 1 } } */
+/* { dg-final { scan-assembler-times "subfc" 1 } } */
+/* { dg-final { scan-assembler-times "subfe" 1 } } */
+/* { dg-final { scan-assembler-not "subf " } } */
+
+__int128
+add_128 (__int128 *ptr, __int128 val)
+{
+ return (*ptr + val);
+}
+
+__int128
+sub_128 (__int128 *ptr, __int128 val)
+{
+ return (*ptr - val);
+}
+
Index: gcc/testsuite/gcc.target/powerpc/ti_math2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ti_math2.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ti_math2.c (revision 0)
@@ -0,0 +1,73 @@
+/* { dg-do run { target { powerpc*-*-* && lp64 } } } */
+/* { dg-options "-O2 -fno-inline" } */
+
+union U {
+ __int128 i128;
+ struct {
+ long l1;
+ long l2;
+ } s;
+};
+
+union U u1,u2;
+
+__int128
+create_128 (long most_sig, long least_sig)
+{
+ union U u;
+
+#if __LITTLE_ENDIAN__
+ u.s.l1 = least_sig;
+ u.s.l2 = most_sig;
+#else
+ u.s.l1 = most_sig;
+ u.s.l2 = least_sig;
+#endif
+ return u.i128;
+}
+
+long most_sig (union U * u)
+{
+#if __LITTLE_ENDIAN__
+ return (*u).s.l2;
+#else
+ return (*u).s.l1;
+#endif
+}
+
+long least_sig (union U * u)
+{
+#if __LITTLE_ENDIAN__
+ return (*u).s.l1;
+#else
+ return (*u).s.l2;
+#endif
+}
+
+__int128
+add_128 (__int128 *ptr, __int128 val)
+{
+ return (*ptr + val);
+}
+
+__int128
+sub_128 (__int128 *ptr, __int128 val)
+{
+ return (*ptr - val);
+}
+
+int
+main (void)
+{
+ /* Do a simple add/sub to make sure carry is happening between the dwords
+ and that dwords are in correct endian order. */
+ u1.i128 = create_128 (1, -1);
+ u2.i128 = add_128 (&u1.i128, 1);
+ if ((most_sig (&u2) != 2) || (least_sig (&u2) != 0))
+ __builtin_abort ();
+ u2.i128 = sub_128 (&u2.i128, 1);
+ if ((most_sig (&u2) != 1) || (least_sig (&u2) != -1))
+ __builtin_abort ();
+ return 0;
+}
+
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 209226)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -6535,6 +6535,49 @@ (define_insn_and_split "*floatunsdisf2_m
[(set_attr "length" "8")
(set_attr "type" "fpload")])
\f
+;; Define the TImode operations that can be done in a small number
+;; of instructions. The & constraints are to prevent the register
+;; allocator from allocating registers that overlap with the inputs
+;; (for example, having an input in 7,8 and an output in 6,7). We
+;; also allow for the output being the same as one of the inputs.
+
+(define_insn "addti3"
+ [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r")
+ (plus:TI (match_operand:TI 1 "gpc_reg_operand" "%r,r,0,0")
+ (match_operand:TI 2 "reg_or_short_operand" "r,I,r,I")))]
+ "TARGET_64BIT"
+{
+ if (WORDS_BIG_ENDIAN)
+ return (GET_CODE (operands[2])) != CONST_INT
+ ? \"addc %L0,%L1,%L2\;adde %0,%1,%2\"
+ : \"addic %L0,%L1,%2\;add%G2e %0,%1\";
+ else
+ return (GET_CODE (operands[2])) != CONST_INT
+ ? \"addc %0,%1,%2\;adde %L0,%L1,%L2\"
+ : \"addic %0,%1,%2\;add%G2e %L0,%L1\";
+}
+ [(set_attr "type" "two")
+ (set_attr "length" "8")])
+
+(define_insn "subti3"
+ [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r,r")
+ (minus:TI (match_operand:TI 1 "reg_or_short_operand" "r,I,0,r,I")
+ (match_operand:TI 2 "gpc_reg_operand" "r,r,r,0,0")))]
+ "TARGET_64BIT"
+{
+ if (WORDS_BIG_ENDIAN)
+ return (GET_CODE (operands[1]) != CONST_INT)
+ ? \"subfc %L0,%L2,%L1\;subfe %0,%2,%1\"
+ : \"subfic %L0,%L2,%1\;subf%G1e %0,%2\";
+ else
+ return (GET_CODE (operands[1]) != CONST_INT)
+ ? \"subfc %0,%2,%1\;subfe %L0,%L2,%L1\"
+ : \"subfic %0,%2,%1\;subf%G1e %L0,%L2\";
+}
+ [(set_attr "type" "two")
+ (set_attr "length" "8")])
+
+
;; Define the DImode operations that can be done in a small number
;; of instructions. The & constraints are to prevent the register
;; allocator from allocating registers that overlap with the inputs
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub
2014-04-16 21:06 ` Pat Haugen
@ 2014-04-17 3:34 ` David Edelsohn
2014-04-28 19:38 ` Pat Haugen
0 siblings, 1 reply; 7+ messages in thread
From: David Edelsohn @ 2014-04-17 3:34 UTC (permalink / raw)
To: Pat Haugen; +Cc: Segher Boessenkool, GCC Patches
On Wed, Apr 16, 2014 at 4:51 PM, Pat Haugen <pthaugen@linux.vnet.ibm.com> wrote:
> Updated patch with above comments incorporated. Bootstrap/regtest on BE/LE
> with no new regressions. Ok for trunk?
2014-04-08 Pat Haugen <pthaugen@us.ibm.com>
* config/rs6000/rs6000.md (addti3, subti3): New.
gcc/testsuite:
* gcc.target/powerpc/ti_math1.c: New.
* gcc.target/powerpc/ti_math2.c: New.
Okay.
Thanks, David
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub
2014-04-17 3:34 ` David Edelsohn
@ 2014-04-28 19:38 ` Pat Haugen
2014-04-28 19:58 ` David Edelsohn
2014-04-29 17:39 ` David Edelsohn
0 siblings, 2 replies; 7+ messages in thread
From: Pat Haugen @ 2014-04-28 19:38 UTC (permalink / raw)
To: David Edelsohn; +Cc: GCC Patches
On 04/16/2014 10:27 PM, David Edelsohn wrote:
>> >Updated patch with above comments incorporated. Bootstrap/regtest on BE/LE
>> >with no new regressions. Ok for trunk?
> 2014-04-08 Pat Haugen<pthaugen@us.ibm.com>
>
> * config/rs6000/rs6000.md (addti3, subti3): New.
>
> gcc/testsuite:
> * gcc.target/powerpc/ti_math1.c: New.
> * gcc.target/powerpc/ti_math2.c: New.
>
> Okay.
Is this also ok to backport to 4.8/4.9? Bootstrap/regtest on BE/LE with
no new regressions.
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub
2014-04-28 19:38 ` Pat Haugen
@ 2014-04-28 19:58 ` David Edelsohn
2014-04-29 17:39 ` David Edelsohn
1 sibling, 0 replies; 7+ messages in thread
From: David Edelsohn @ 2014-04-28 19:58 UTC (permalink / raw)
To: Pat Haugen; +Cc: GCC Patches
On Mon, Apr 28, 2014 at 3:33 PM, Pat Haugen <pthaugen@linux.vnet.ibm.com> wrote:
> On 04/16/2014 10:27 PM, David Edelsohn wrote:
>>>
>>> >Updated patch with above comments incorporated. Bootstrap/regtest on
>>> > BE/LE
>>> >with no new regressions. Ok for trunk?
>>
>> 2014-04-08 Pat Haugen<pthaugen@us.ibm.com>
>>
>> * config/rs6000/rs6000.md (addti3, subti3): New.
>>
>> gcc/testsuite:
>> * gcc.target/powerpc/ti_math1.c: New.
>> * gcc.target/powerpc/ti_math2.c: New.
>>
>> Okay.
>
> Is this also ok to backport to 4.8/4.9? Bootstrap/regtest on BE/LE with no
> new regressions.
The respective variants for 4.8 and 4.9/trunk are okay.
Thanks, David
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub
2014-04-28 19:38 ` Pat Haugen
2014-04-28 19:58 ` David Edelsohn
@ 2014-04-29 17:39 ` David Edelsohn
1 sibling, 0 replies; 7+ messages in thread
From: David Edelsohn @ 2014-04-29 17:39 UTC (permalink / raw)
To: Pat Haugen; +Cc: GCC Patches
On Mon, Apr 28, 2014 at 3:33 PM, Pat Haugen <pthaugen@linux.vnet.ibm.com> wrote:
> On 04/16/2014 10:27 PM, David Edelsohn wrote:
>>>
>>> >Updated patch with above comments incorporated. Bootstrap/regtest on
>>> > BE/LE
>>> >with no new regressions. Ok for trunk?
>>
>> 2014-04-08 Pat Haugen<pthaugen@us.ibm.com>
>>
>> * config/rs6000/rs6000.md (addti3, subti3): New.
>>
>> gcc/testsuite:
>> * gcc.target/powerpc/ti_math1.c: New.
>> * gcc.target/powerpc/ti_math2.c: New.
>>
>> Okay.
>
> Is this also ok to backport to 4.8/4.9? Bootstrap/regtest on BE/LE with no
> new regressions.
Pat,
Just to be clear. The revised version of the patch is okay on 4.8 and
4.9 branches.
Thanks, David
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-04-29 17:37 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-09 0:07 [PATCH, rs6000] Improve TImode add/sub Pat Haugen
2014-04-09 2:56 ` segher
2014-04-16 21:06 ` Pat Haugen
2014-04-17 3:34 ` David Edelsohn
2014-04-28 19:38 ` Pat Haugen
2014-04-28 19:58 ` David Edelsohn
2014-04-29 17:39 ` David Edelsohn
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).