* [PATCH, rs6000] Improve TImode add/sub @ 2014-04-09 0:07 Pat Haugen 2014-04-09 2:56 ` segher 0 siblings, 1 reply; 7+ messages in thread From: Pat Haugen @ 2014-04-09 0:07 UTC (permalink / raw) To: GCC Patches; +Cc: David Edelsohn [-- Attachment #1: Type: text/plain, Size: 449 bytes --] The following patch improves the code generated for TImode add/sub so that we now generate a 2 insn sequence which makes use of the carry bit. Bootstrap/regtest (on both BE/LE) with no new failures. Ok for trunk and 4.8? -Pat 2014-04-08 Pat Haugen <pthaugen@us.ibm.com> * config/rs6000/rs6000.md (addti3, subti3): New. gcc/testsuite: * gcc.target/powerpc/ti_math1.c: New. * gcc.target/powerpc/ti_math2.c: New. [-- Attachment #2: add-sub_ti.diff --] [-- Type: text/x-patch, Size: 4535 bytes --] Index: gcc/testsuite/gcc.target/powerpc/ti_math1.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/ti_math1.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/ti_math1.c (revision 0) @@ -0,0 +1,21 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler-times "addc" 1 } } */ +/* { dg-final { scan-assembler-times "adde" 1 } } */ +/* { dg-final { scan-assembler-times "subfc" 1 } } */ +/* { dg-final { scan-assembler-times "subfe" 1 } } */ +/* { dg-final { scan-assembler-not "subf " } } */ + +__int128 +add_128 (__int128 *ptr, __int128 val) +{ + return (*ptr + val); +} + +__int128 +sub_128 (__int128 *ptr, __int128 val) +{ + return (*ptr - val); +} + Index: gcc/testsuite/gcc.target/powerpc/ti_math2.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/ti_math2.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/ti_math2.c (revision 0) @@ -0,0 +1,74 @@ +/* { dg-do run { target { powerpc*-*-* && lp64 } } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ +/* { dg-options "-O2 -fno-inline" } */ + +union U { + __int128 i128; + struct { + long l1; + long l2; + } s; +}; + +union U u1,u2; + +__int128 +create_128 (long most_sig, long least_sig) +{ + union U u; + +#if __LITTLE_ENDIAN__ + u.s.l1 = least_sig; + u.s.l2 = most_sig; +#else + u.s.l1 = most_sig; + u.s.l2 = least_sig; +#endif + return u.i128; +} + +long most_sig (union U * u) +{ +#if __LITTLE_ENDIAN__ + return (*u).s.l2; +#else + return (*u).s.l1; +#endif +} + +long least_sig (union U * u) +{ +#if __LITTLE_ENDIAN__ + return (*u).s.l1; +#else + return (*u).s.l2; +#endif +} + +__int128 +add_128 (__int128 *ptr, __int128 val) +{ + return (*ptr + val); +} + +__int128 +sub_128 (__int128 *ptr, __int128 val) +{ + return (*ptr - val); +} + +int +main (void) +{ + /* Do a simple add/sub to make sure carry is happening between the dwords + and that dwords are in correct endian order. */ + u1.i128 = create_128 (1, -1); + u2.i128 = add_128 (&u1.i128, 1); + if ((most_sig (&u2) != 2) || (least_sig (&u2) != 0)) + __builtin_abort (); + u2.i128 = sub_128 (&u2.i128, 1); + if ((most_sig (&u2) != 1) || (least_sig (&u2) != -1)) + __builtin_abort (); + return 0; +} + Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 209226) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -6535,6 +6535,51 @@ (define_insn_and_split "*floatunsdisf2_m [(set_attr "length" "8") (set_attr "type" "fpload")]) \f +;; Define the TImode operations that can be done in a small number +;; of instructions. The & constraints are to prevent the register +;; allocator from allocating registers that overlap with the inputs +;; (for example, having an input in 7,8 and an output in 6,7). We +;; also allow for the output being the same as one of the inputs. + +(define_insn "addti3" + [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r") + (plus:TI (match_operand:TI 1 "gpc_reg_operand" "%r,r,0,0") + (match_operand:TI 2 "reg_or_short_operand" "r,I,r,I")))] + "TARGET_POWERPC64" + "* +{ + if (WORDS_BIG_ENDIAN) + return (GET_CODE (operands[2])) != CONST_INT + ? \"addc %L0,%L1,%L2\;adde %0,%1,%2\" + : \"addic %L0,%L1,%2\;add%G2e %0,%1\"; + else + return (GET_CODE (operands[2])) != CONST_INT + ? \"addc %0,%1,%2\;adde %L0,%L1,%L2\" + : \"addic %0,%1,%2\;add%G2e %L0,%L1\"; +}" + [(set_attr "type" "two") + (set_attr "length" "8")]) + +(define_insn "subti3" + [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r,r") + (minus:TI (match_operand:TI 1 "reg_or_short_operand" "r,I,0,r,I") + (match_operand:TI 2 "gpc_reg_operand" "r,r,r,0,0")))] + "TARGET_POWERPC64" + "* +{ + if (WORDS_BIG_ENDIAN) + return (GET_CODE (operands[1]) != CONST_INT) + ? \"subfc %L0,%L2,%L1\;subfe %0,%2,%1\" + : \"subfic %L0,%L2,%1\;subf%G1e %0,%2\"; + else + return (GET_CODE (operands[1]) != CONST_INT) + ? \"subfc %0,%2,%1\;subfe %L0,%L2,%L1\" + : \"subfic %0,%2,%1\;subf%G1e %L0,%L2\"; +}" + [(set_attr "type" "two") + (set_attr "length" "8")]) + + ;; Define the DImode operations that can be done in a small number ;; of instructions. The & constraints are to prevent the register ;; allocator from allocating registers that overlap with the inputs ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub 2014-04-09 0:07 [PATCH, rs6000] Improve TImode add/sub Pat Haugen @ 2014-04-09 2:56 ` segher 2014-04-16 21:06 ` Pat Haugen 0 siblings, 1 reply; 7+ messages in thread From: segher @ 2014-04-09 2:56 UTC (permalink / raw) To: Pat Haugen; +Cc: GCC Patches, David Edelsohn Hello, On Tue, Apr 08, 2014 at 07:06:58PM -0500, Pat Haugen wrote: > The following patch improves the code generated for TImode add/sub so > that we now generate a 2 insn sequence which makes use of the carry bit. > +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ > +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ Please leave out the default arguments. Why does this need skipping on Darwin? > +;; Define the TImode operations that can be done in a small number > +;; of instructions. The & constraints are to prevent the register > +;; allocator from allocating registers that overlap with the inputs > +;; (for example, having an input in 7,8 and an output in 6,7). We > +;; also allow for the output being the same as one of the inputs. > + > +(define_insn "addti3" > + [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r") > + (plus:TI (match_operand:TI 1 "gpc_reg_operand" "%r,r,0,0") > + (match_operand:TI 2 "reg_or_short_operand" "r,I,r,I")))] > + "TARGET_POWERPC64" That's not the correct condition: the carry bit is set based on the 32-bit carry in 32-bit mode, so the condition has to be TARGET_64BIT. The adddi3 pattern has !TARGET_POWERPC64 since a 64-bit addition can be done without addc on a 64-bit machine, no matter what mode the CPU is in. > + "* > +{ Might as well leave out this stuff on new code, just use the braces :-) Segher ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub 2014-04-09 2:56 ` segher @ 2014-04-16 21:06 ` Pat Haugen 2014-04-17 3:34 ` David Edelsohn 0 siblings, 1 reply; 7+ messages in thread From: Pat Haugen @ 2014-04-16 21:06 UTC (permalink / raw) To: segher; +Cc: GCC Patches, David Edelsohn [-- Attachment #1: Type: text/plain, Size: 1387 bytes --] On 04/08/2014 09:56 PM, segher@kernel.crashing.org wrote: > +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ > +/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */ > Please leave out the default arguments. Why does this need skipping > on Darwin? > >> +;; Define the TImode operations that can be done in a small number >> +;; of instructions. The & constraints are to prevent the register >> +;; allocator from allocating registers that overlap with the inputs >> +;; (for example, having an input in 7,8 and an output in 6,7). We >> +;; also allow for the output being the same as one of the inputs. >> + >> +(define_insn "addti3" >> + [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r") >> + (plus:TI (match_operand:TI 1 "gpc_reg_operand" "%r,r,0,0") >> + (match_operand:TI 2 "reg_or_short_operand" "r,I,r,I")))] >> + "TARGET_POWERPC64" > That's not the correct condition: the carry bit is set based on the 32-bit > carry in 32-bit mode, so the condition has to be TARGET_64BIT. > > The adddi3 pattern has !TARGET_POWERPC64 since a 64-bit addition can > be done without addc on a 64-bit machine, no matter what mode the CPU > is in. > >> + "* >> +{ > Might as well leave out this stuff on new code, just use the braces :-) > > Updated patch with above comments incorporated. Bootstrap/regtest on BE/LE with no new regressions. Ok for trunk? -Pat [-- Attachment #2: add-sub_ti.diff --] [-- Type: text/x-patch, Size: 4387 bytes --] Index: gcc/testsuite/gcc.target/powerpc/ti_math1.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/ti_math1.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/ti_math1.c (revision 0) @@ -0,0 +1,20 @@ +/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */ +/* { dg-options "-O2" } */ +/* { dg-final { scan-assembler-times "addc" 1 } } */ +/* { dg-final { scan-assembler-times "adde" 1 } } */ +/* { dg-final { scan-assembler-times "subfc" 1 } } */ +/* { dg-final { scan-assembler-times "subfe" 1 } } */ +/* { dg-final { scan-assembler-not "subf " } } */ + +__int128 +add_128 (__int128 *ptr, __int128 val) +{ + return (*ptr + val); +} + +__int128 +sub_128 (__int128 *ptr, __int128 val) +{ + return (*ptr - val); +} + Index: gcc/testsuite/gcc.target/powerpc/ti_math2.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/ti_math2.c (revision 0) +++ gcc/testsuite/gcc.target/powerpc/ti_math2.c (revision 0) @@ -0,0 +1,73 @@ +/* { dg-do run { target { powerpc*-*-* && lp64 } } } */ +/* { dg-options "-O2 -fno-inline" } */ + +union U { + __int128 i128; + struct { + long l1; + long l2; + } s; +}; + +union U u1,u2; + +__int128 +create_128 (long most_sig, long least_sig) +{ + union U u; + +#if __LITTLE_ENDIAN__ + u.s.l1 = least_sig; + u.s.l2 = most_sig; +#else + u.s.l1 = most_sig; + u.s.l2 = least_sig; +#endif + return u.i128; +} + +long most_sig (union U * u) +{ +#if __LITTLE_ENDIAN__ + return (*u).s.l2; +#else + return (*u).s.l1; +#endif +} + +long least_sig (union U * u) +{ +#if __LITTLE_ENDIAN__ + return (*u).s.l1; +#else + return (*u).s.l2; +#endif +} + +__int128 +add_128 (__int128 *ptr, __int128 val) +{ + return (*ptr + val); +} + +__int128 +sub_128 (__int128 *ptr, __int128 val) +{ + return (*ptr - val); +} + +int +main (void) +{ + /* Do a simple add/sub to make sure carry is happening between the dwords + and that dwords are in correct endian order. */ + u1.i128 = create_128 (1, -1); + u2.i128 = add_128 (&u1.i128, 1); + if ((most_sig (&u2) != 2) || (least_sig (&u2) != 0)) + __builtin_abort (); + u2.i128 = sub_128 (&u2.i128, 1); + if ((most_sig (&u2) != 1) || (least_sig (&u2) != -1)) + __builtin_abort (); + return 0; +} + Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 209226) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -6535,6 +6535,49 @@ (define_insn_and_split "*floatunsdisf2_m [(set_attr "length" "8") (set_attr "type" "fpload")]) \f +;; Define the TImode operations that can be done in a small number +;; of instructions. The & constraints are to prevent the register +;; allocator from allocating registers that overlap with the inputs +;; (for example, having an input in 7,8 and an output in 6,7). We +;; also allow for the output being the same as one of the inputs. + +(define_insn "addti3" + [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r") + (plus:TI (match_operand:TI 1 "gpc_reg_operand" "%r,r,0,0") + (match_operand:TI 2 "reg_or_short_operand" "r,I,r,I")))] + "TARGET_64BIT" +{ + if (WORDS_BIG_ENDIAN) + return (GET_CODE (operands[2])) != CONST_INT + ? \"addc %L0,%L1,%L2\;adde %0,%1,%2\" + : \"addic %L0,%L1,%2\;add%G2e %0,%1\"; + else + return (GET_CODE (operands[2])) != CONST_INT + ? \"addc %0,%1,%2\;adde %L0,%L1,%L2\" + : \"addic %0,%1,%2\;add%G2e %L0,%L1\"; +} + [(set_attr "type" "two") + (set_attr "length" "8")]) + +(define_insn "subti3" + [(set (match_operand:TI 0 "gpc_reg_operand" "=&r,&r,r,r,r") + (minus:TI (match_operand:TI 1 "reg_or_short_operand" "r,I,0,r,I") + (match_operand:TI 2 "gpc_reg_operand" "r,r,r,0,0")))] + "TARGET_64BIT" +{ + if (WORDS_BIG_ENDIAN) + return (GET_CODE (operands[1]) != CONST_INT) + ? \"subfc %L0,%L2,%L1\;subfe %0,%2,%1\" + : \"subfic %L0,%L2,%1\;subf%G1e %0,%2\"; + else + return (GET_CODE (operands[1]) != CONST_INT) + ? \"subfc %0,%2,%1\;subfe %L0,%L2,%L1\" + : \"subfic %0,%2,%1\;subf%G1e %L0,%L2\"; +} + [(set_attr "type" "two") + (set_attr "length" "8")]) + + ;; Define the DImode operations that can be done in a small number ;; of instructions. The & constraints are to prevent the register ;; allocator from allocating registers that overlap with the inputs ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub 2014-04-16 21:06 ` Pat Haugen @ 2014-04-17 3:34 ` David Edelsohn 2014-04-28 19:38 ` Pat Haugen 0 siblings, 1 reply; 7+ messages in thread From: David Edelsohn @ 2014-04-17 3:34 UTC (permalink / raw) To: Pat Haugen; +Cc: Segher Boessenkool, GCC Patches On Wed, Apr 16, 2014 at 4:51 PM, Pat Haugen <pthaugen@linux.vnet.ibm.com> wrote: > Updated patch with above comments incorporated. Bootstrap/regtest on BE/LE > with no new regressions. Ok for trunk? 2014-04-08 Pat Haugen <pthaugen@us.ibm.com> * config/rs6000/rs6000.md (addti3, subti3): New. gcc/testsuite: * gcc.target/powerpc/ti_math1.c: New. * gcc.target/powerpc/ti_math2.c: New. Okay. Thanks, David ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub 2014-04-17 3:34 ` David Edelsohn @ 2014-04-28 19:38 ` Pat Haugen 2014-04-28 19:58 ` David Edelsohn 2014-04-29 17:39 ` David Edelsohn 0 siblings, 2 replies; 7+ messages in thread From: Pat Haugen @ 2014-04-28 19:38 UTC (permalink / raw) To: David Edelsohn; +Cc: GCC Patches On 04/16/2014 10:27 PM, David Edelsohn wrote: >> >Updated patch with above comments incorporated. Bootstrap/regtest on BE/LE >> >with no new regressions. Ok for trunk? > 2014-04-08 Pat Haugen<pthaugen@us.ibm.com> > > * config/rs6000/rs6000.md (addti3, subti3): New. > > gcc/testsuite: > * gcc.target/powerpc/ti_math1.c: New. > * gcc.target/powerpc/ti_math2.c: New. > > Okay. Is this also ok to backport to 4.8/4.9? Bootstrap/regtest on BE/LE with no new regressions. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub 2014-04-28 19:38 ` Pat Haugen @ 2014-04-28 19:58 ` David Edelsohn 2014-04-29 17:39 ` David Edelsohn 1 sibling, 0 replies; 7+ messages in thread From: David Edelsohn @ 2014-04-28 19:58 UTC (permalink / raw) To: Pat Haugen; +Cc: GCC Patches On Mon, Apr 28, 2014 at 3:33 PM, Pat Haugen <pthaugen@linux.vnet.ibm.com> wrote: > On 04/16/2014 10:27 PM, David Edelsohn wrote: >>> >>> >Updated patch with above comments incorporated. Bootstrap/regtest on >>> > BE/LE >>> >with no new regressions. Ok for trunk? >> >> 2014-04-08 Pat Haugen<pthaugen@us.ibm.com> >> >> * config/rs6000/rs6000.md (addti3, subti3): New. >> >> gcc/testsuite: >> * gcc.target/powerpc/ti_math1.c: New. >> * gcc.target/powerpc/ti_math2.c: New. >> >> Okay. > > Is this also ok to backport to 4.8/4.9? Bootstrap/regtest on BE/LE with no > new regressions. The respective variants for 4.8 and 4.9/trunk are okay. Thanks, David ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH, rs6000] Improve TImode add/sub 2014-04-28 19:38 ` Pat Haugen 2014-04-28 19:58 ` David Edelsohn @ 2014-04-29 17:39 ` David Edelsohn 1 sibling, 0 replies; 7+ messages in thread From: David Edelsohn @ 2014-04-29 17:39 UTC (permalink / raw) To: Pat Haugen; +Cc: GCC Patches On Mon, Apr 28, 2014 at 3:33 PM, Pat Haugen <pthaugen@linux.vnet.ibm.com> wrote: > On 04/16/2014 10:27 PM, David Edelsohn wrote: >>> >>> >Updated patch with above comments incorporated. Bootstrap/regtest on >>> > BE/LE >>> >with no new regressions. Ok for trunk? >> >> 2014-04-08 Pat Haugen<pthaugen@us.ibm.com> >> >> * config/rs6000/rs6000.md (addti3, subti3): New. >> >> gcc/testsuite: >> * gcc.target/powerpc/ti_math1.c: New. >> * gcc.target/powerpc/ti_math2.c: New. >> >> Okay. > > Is this also ok to backport to 4.8/4.9? Bootstrap/regtest on BE/LE with no > new regressions. Pat, Just to be clear. The revised version of the patch is okay on 4.8 and 4.9 branches. Thanks, David ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2014-04-29 17:37 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-04-09 0:07 [PATCH, rs6000] Improve TImode add/sub Pat Haugen 2014-04-09 2:56 ` segher 2014-04-16 21:06 ` Pat Haugen 2014-04-17 3:34 ` David Edelsohn 2014-04-28 19:38 ` Pat Haugen 2014-04-28 19:58 ` David Edelsohn 2014-04-29 17:39 ` David Edelsohn
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).