Re: [ARM] Add support for overflow add, sub, and neg operations

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

From: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
To: Michael Collison <michael.collison@linaro.org>,
	 GCC Patches <gcc-patches@gcc.gnu.org>,
	Ramana Radhakrishnan <Ramana.Radhakrishnan@arm.com>
Subject: Re: [ARM] Add support for overflow add, sub, and neg operations
Date: Tue, 24 May 2016 09:11:00 -0000	[thread overview]
Message-ID: <57441A12.2070203@foss.arm.com> (raw)
In-Reply-To: <56F9BC1C.50603@linaro.org>

Hi Michael,

Sorry for the delay in reviewing. A few comments at the bottom.

On 29/03/16 00:19, Michael Collison wrote:
> An updated patch that resolves the issues with thumb2 support and adds test cases as requested. Looking to check this into GCC 7 stage1 when it opens.
>
> 2016-02-24  Michael Collison  <michael.collison@arm.com>
>
>     * config/arm/arm-modes.def: Add new condition code mode CC_V
>     to represent the overflow bit.
>     * config/arm/arm.c (maybe_get_arm_condition_code):
>     Add support for CC_Vmode.
>     * config/arm/arm.md (addv<mode>4, add<mode>3_compareV,
>     addsi3_compareV_upper): New patterns to support signed
>     builtin overflow add operations.
>     (uaddv<mode>4, add<mode>3_compareC, addsi3_compareV_upper):
>     New patterns to support unsigned builtin add overflow operations.
>     (subv<mode>4, sub<mode>3_compare1): New patterns to support signed
>     builtin overflow subtract operations,
>     (usubv<mode>4): New patterns to support unsigned builtin subtract
>     overflow operations.
>     (negvsi3, negvdi3, negdi2_compare, negsi2_carryin_compare): New patterns
>     to support builtin overflow negate operations.
>     * gcc.target/arm/builtin_saddl.c: New testcase.
>     * gcc.target/arm/builtin_saddll.c: New testcase.
>     * gcc.target/arm/builtin_uaddl.c: New testcase.
>     * gcc.target/arm/builtin_uaddll.c: New testcase.
>     * gcc.target/arm/builtin_ssubl.c: New testcase.
>     * gcc.target/arm/builtin_ssubll.c: New testcase.
>     * gcc.target/arm/builtin_usubl.c: New testcase.
>     * gcc.target/arm/builtin_usubll.c: New testcase.
>
> On 02/29/2016 04:13 AM, Kyrill Tkachov wrote:
>>
>> On 26/02/16 10:32, Michael Collison wrote:
>>>
>>>
>>> On 02/25/2016 02:51 AM, Kyrill Tkachov wrote:
>>>> Hi Michael,
>>>>
>>>> On 24/02/16 23:02, Michael Collison wrote:
>>>>> This patch adds support for builtin overflow of add, subtract and negate. This patch is targeted for gcc 7 stage 1. It was tested with no regressions in arm and thumb modes on the following targets:
>>>>>
>>>>> arm-non-linux-gnueabi
>>>>> arm-non-linux-gnuabihf
>>>>> armeb-none-linux-gnuabihf
>>>>> arm-non-eabi
>>>>>
>>>>
>>>> I'll have a deeper look once we're closer to GCC 7 development.
>>>> I've got a few comments in the meantime.
>>>>
>>>>> 2016-02-24 Michael Collison <michael.collison@arm.com>
>>>>>
>>>>>     * config/arm/arm-modes.def: Add new condition code mode CC_V
>>>>>     to represent the overflow bit.
>>>>>     * config/arm/arm.c (maybe_get_arm_condition_code):
>>>>>     Add support for CC_Vmode.
>>>>>     * config/arm/arm.md (addv<mode>4, add<mode>3_compareV,
>>>>>     addsi3_compareV_upper): New patterns to support signed
>>>>>     builtin overflow add operations.
>>>>>     (uaddv<mode>4, add<mode>3_compareC, addsi3_compareV_upper):
>>>>>     New patterns to support unsigned builtin add overflow operations.
>>>>>     (subv<mode>4, sub<mode>3_compare1): New patterns to support signed
>>>>>     builtin overflow subtract operations,
>>>>>     (usubv<mode>4): New patterns to support unsigned builtin subtract
>>>>>     overflow operations.
>>>>>     (negvsi3, negvdi3, negdi2_compre, negsi2_carryin_compare): New patterns
>>>>>     to support builtin overflow negate operations.
>>>>>
>>>>>
>>>>
>>>> Can you please summarise what sequences are generated for these operations, and how
>>>> they are better than the default fallback sequences.
>>>
>>> Sure for a simple test case such as:
>>>
>>> int
>>> fn3 (int x, int y, int *ovf)
>>> {
>>>   int res;
>>>   *ovf = __builtin_sadd_overflow (x, y, &res);
>>>   return res;
>>> }
>>>
>>> Current trunk at -O2 generates
>>>
>>> fn3:
>>>         @ args = 0, pretend = 0, frame = 0
>>>         @ frame_needed = 0, uses_anonymous_args = 0
>>>         @ link register save eliminated.
>>>         cmp     r1, #0
>>>         mov     r3, #0
>>>         add     r1, r0, r1
>>>         blt     .L4
>>>         cmp     r1, r0
>>>         blt     .L3
>>> .L2:
>>>         str     r3, [r2]
>>>         mov     r0, r1
>>>         bx      lr
>>> .L4:
>>>         cmp     r1, r0
>>>         ble     .L2
>>> .L3:
>>>         mov     r3, #1
>>>         b       .L2
>>>
>>> With the overflow patch this now generates:
>>>
>>>        adds    r0, r0, r1
>>>        movvs   r3, #1
>>>        movvc   r3, #0
>>>        str     r3, [r2]
>>>        bx      lr
>>>
>>
>> Thanks! That looks much better.
>>
>>>> Also, we'd need tests for each of these overflow operations, since these are pretty complex
>>>> patterns that are being added.
>>>
>>> The patterns are tested now most notably by tests in:
>>>
>>> c-c++-common/torture/builtin-arith-overflow*.c
>>>
>>> I had a few failures I resolved so the builtin overflow arithmetic functions are definitely being exercised.
>>
>> Great, that gives me more confidence on the correctness aspects but...
>>
>>>>
>>>> Also, you may want to consider splitting this into a patch series, each adding a single
>>>> overflow operation, together with its tests. That way it will be easier to keep track of
>>>> which pattern applies to which use case and they can go in independently of each other.
>>>
>>> Let me know if you still fell the same way given the existing test cases.
>>>
>>
>> ... I'd like us to still have scan-assembler tests. The torture tests exercise the correctness,
>> but we'd want tests to catch regressions where we stop generating the new patterns due to other
>> optimisation changes, which would lead to code quality regressions.
>> So I'd like us to have scan-assembler tests for these sequences to make sure we generate the right
>> instructions.
>>
>> Thanks,
>> Kyrill
>>
>>>>
>>>> +(define_expand "uaddv<mode>4"
>>>> +  [(match_operand:SIDI 0 "register_operand")
>>>> +   (match_operand:SIDI 1 "register_operand")
>>>> +   (match_operand:SIDI 2 "register_operand")
>>>> +   (match_operand 3 "")]
>>>> +  "TARGET_ARM"
>>>> +{
>>>> +  emit_insn (gen_add<mode>3_compareC (operands[0], operands[1], operands[2]));
>>>> +
>>>> +  rtx x;
>>>> +  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Cmode, CC_REGNUM), const0_rtx);
>>>> +  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
>>>> +                gen_rtx_LABEL_REF (VOIDmode, operands[3]),
>>>> +                pc_rtx);
>>>> +  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
>>>> +  DONE;
>>>> +})
>>>> +
>>>>
>>>> I notice this and many other patterns in this patch are guarded on TARGET_ARM. Is there any reason why they
>>>> should be restricted to arm state and not be TARGET_32BIT ?
>>> I thought about this as well. I will test will TARGET_32BIT and get back to you.
>>>>
>>>>
>>>> Thanks,
>>>> Kyrill

+(define_expand "addv<mode>4"
+  [(match_operand:SIDI 0 "register_operand")
+   (match_operand:SIDI 1 "register_operand")
+   (match_operand:SIDI 2 "register_operand")
+   (match_operand 3 "")]
+  "TARGET_32BIT"
+{
+  emit_insn (gen_add<mode>3_compareV (operands[0], operands[1], operands[2]));
+
+  rtx x;
+  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Vmode, CC_REGNUM), const0_rtx);
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			    gen_rtx_LABEL_REF (VOIDmode, operands[3]),
+			    pc_rtx);
+  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
+  DONE;
+})

This sequence (gen_rtx_NE, gen_rtx_IF_THEN_ELSE, emit_jump_insn) is used many times in this patch.
Please factor it out into a helper function in arm.c (and expose it through arm-protos.h) to eliminate
the code duplication. In the implementation of that function you can use emit_unlikely_jump instead of
emit_jump_insn for the branch so that the probability of the branch is set appropriately low since
overflow is expected to be the unlikely outcome in these operations.

  
+(define_insn_and_split "adddi3_compareV"
+  [(set (reg:CC_V CC_REGNUM)
+	(ne:CC_V
+	  (plus:TI
+	    (sign_extend:TI (match_operand:DI 1 "register_operand" "r"))
+	    (sign_extend:TI (match_operand:DI 2 "register_operand" "r")))
+	  (sign_extend:TI (plus:DI (match_dup 1) (match_dup 2)))))
+   (set (match_operand:DI 0 "register_operand" "=&r")
+	(plus:DI (match_dup 1) (match_dup 2)))]
+  "TARGET_32BIT"
+  "#"
+  "TARGET_32BIT && reload_completed"

For the split condition just write "&& reload_completed".
This means that the split condition is the match condition (TARGET_32BIT) with
"&& reload_completed" appended to it. I know there are a few places in arm.md that
don't follow that, but I hope to clean those up some day.
Same with the other splitters in this patch.

diff --git a/gcc/testsuite/gcc.target/arm/builtin_saddl.c b/gcc/testsuite/gcc.target/arm/builtin_saddl.c
new file mode 100644
index 0000000..e9dab3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/builtin_saddl.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" }  */
+/* { dg-require-effective-target arm32 } */
+extern void overflow_handler ();
+
+long overflow_add (long x, long y)
+{
+  long r;
+
+  int ovr = __builtin_saddl_overflow (x, y, &r);
+  if (ovr)
+    overflow_handler ();
+
+  return r;
+}
+
+/* {dg-final { scan-assembler "adds" }} */

IIRC DejaGNU can be picky with spaces between the braces and the dg-* directives
please write this as:

{ dg-final { scan-assembler "adds" } }

I think the patch looks good otherwise.
Please respin with the above addressed and I think it'll be good to go.

Thanks for sticking with it,
Kyrill

     prev parent reply	other threads:[~2016-05-24  9:10 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-24 23:02 Michael Collison
2016-02-25  9:51 ` Kyrill Tkachov
2016-02-26 10:32   ` Michael Collison
2016-02-29 11:13     ` Kyrill Tkachov
2016-02-29 11:25       ` Michael Collison
2016-03-29  0:38       ` Michael Collison
2016-05-24  9:11         ` Kyrill Tkachov [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57441A12.2070203@foss.arm.com \
    --to=kyrylo.tkachov@foss.arm.com \
    --cc=Ramana.Radhakrishnan@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=michael.collison@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).