An updated patch that resolves the issues with thumb2 support and adds
test cases as requested. Looking to check this into GCC 7 stage1 when it
opens.

2016-02-24  Michael Collison  <michael.collison@linaro.org>

	* config/arm/arm-modes.def: Add new condition code mode CC_V
	to represent the overflow bit.
	* config/arm/arm.c (maybe_get_arm_condition_code):
	Add support for CC_Vmode.
	* config/arm/arm.md (addv<mode>4, add<mode>3_compareV,
	addsi3_compareV_upper): New patterns to support signed
	builtin overflow add operations.
	(uaddv<mode>4, add<mode>3_compareC, addsi3_compareC_upper):
	New patterns to support unsigned builtin add overflow operations.
	(subv<mode>4, sub<mode>3_compare1): New patterns to support signed
	builtin overflow subtract operations.
	(usubv<mode>4): New patterns to support unsigned builtin subtract
	overflow operations.
	(negvsi3, negvdi3, negdi2_compare, negsi2_carryin_compare): New
	patterns to support builtin overflow negate operations.
	* gcc.target/arm/builtin_saddl.c: New testcase.
	* gcc.target/arm/builtin_saddll.c: New testcase.
	* gcc.target/arm/builtin_uaddl.c: New testcase.
	* gcc.target/arm/builtin_uaddll.c: New testcase.
	* gcc.target/arm/builtin_ssubl.c: New testcase.
	* gcc.target/arm/builtin_ssubll.c: New testcase.
	* gcc.target/arm/builtin_usubl.c: New testcase.
	* gcc.target/arm/builtin_usubll.c: New testcase.

On 02/29/2016 04:13 AM, Kyrill Tkachov wrote:
>
> On 26/02/16 10:32, Michael Collison wrote:
>>
>>
>> On 02/25/2016 02:51 AM, Kyrill Tkachov wrote:
>>> Hi Michael,
>>>
>>> On 24/02/16 23:02, Michael Collison wrote:
>>>> This patch adds support for builtin overflow of add, subtract and
>>>> negate. This patch is targeted for gcc 7 stage 1. It was tested
>>>> with no regressions in arm and thumb modes on the following targets:
>>>>
>>>> arm-none-linux-gnueabi
>>>> arm-none-linux-gnueabihf
>>>> armeb-none-linux-gnueabihf
>>>> arm-none-eabi
>>>>
>>>
>>> I'll have a deeper look once we're closer to GCC 7 development.
>>> I've got a few comments in the meantime.
>>>
>>>> 2016-02-24  Michael Collison  <michael.collison@linaro.org>
>>>>
>>>> 	* config/arm/arm-modes.def: Add new condition code mode CC_V
>>>> 	to represent the overflow bit.
>>>> 	* config/arm/arm.c (maybe_get_arm_condition_code):
>>>> 	Add support for CC_Vmode.
>>>> 	* config/arm/arm.md (addv<mode>4, add<mode>3_compareV,
>>>> 	addsi3_compareV_upper): New patterns to support signed
>>>> 	builtin overflow add operations.
>>>> 	(uaddv<mode>4, add<mode>3_compareC, addsi3_compareC_upper):
>>>> 	New patterns to support unsigned builtin add overflow operations.
>>>> 	(subv<mode>4, sub<mode>3_compare1): New patterns to support signed
>>>> 	builtin overflow subtract operations.
>>>> 	(usubv<mode>4): New patterns to support unsigned builtin subtract
>>>> 	overflow operations.
>>>> 	(negvsi3, negvdi3, negdi2_compare, negsi2_carryin_compare): New
>>>> 	patterns to support builtin overflow negate operations.
>>>>
>>>>
>>>
>>> Can you please summarise what sequences are generated for these
>>> operations, and how they are better than the default fallback
>>> sequences.
>>
>> Sure. For a simple test case such as:
>>
>> int
>> fn3 (int x, int y, int *ovf)
>> {
>>   int res;
>>   *ovf = __builtin_sadd_overflow (x, y, &res);
>>   return res;
>> }
>>
>> current trunk at -O2 generates:
>>
>> fn3:
>> 	@ args = 0, pretend = 0, frame = 0
>> 	@ frame_needed = 0, uses_anonymous_args = 0
>> 	@ link register save eliminated.
>> 	cmp	r1, #0
>> 	mov	r3, #0
>> 	add	r1, r0, r1
>> 	blt	.L4
>> 	cmp	r1, r0
>> 	blt	.L3
>> .L2:
>> 	str	r3, [r2]
>> 	mov	r0, r1
>> 	bx	lr
>> .L4:
>> 	cmp	r1, r0
>> 	ble	.L2
>> .L3:
>> 	mov	r3, #1
>> 	b	.L2
>>
>> With the overflow patch this now generates:
>>
>> 	adds	r0, r0, r1
>> 	movvs	r3, #1
>> 	movvc	r3, #0
>> 	str	r3, [r2]
>> 	bx	lr
>>
>
> Thanks! That looks much better.
>
>>> Also, we'd need tests for each of these overflow operations, since
>>> these are pretty complex patterns that are being added.
>>
>> The patterns are now exercised, most notably by the tests in:
>>
>> c-c++-common/torture/builtin-arith-overflow*.c
>>
>> I had a few failures I resolved, so the builtin overflow arithmetic
>> functions are definitely being exercised.
>
> Great, that gives me more confidence on the correctness aspects but...
>
>>>
>>> Also, you may want to consider splitting this into a patch series,
>>> each adding a single overflow operation, together with its tests.
>>> That way it will be easier to keep track of which pattern applies
>>> to which use case and they can go in independently of each other.
>>
>> Let me know if you still feel the same way given the existing test
>> cases.
>>
>
> ... I'd like us to still have scan-assembler tests. The torture tests
> exercise the correctness, but we'd want tests to catch regressions
> where we stop generating the new patterns due to other optimisation
> changes, which would lead to code quality regressions.
> So I'd like us to have scan-assembler tests for these sequences to
> make sure we generate the right instructions.
>
> Thanks,
> Kyrill
>
>>>
>>> +(define_expand "uaddv<mode>4"
>>> +  [(match_operand:SIDI 0 "register_operand")
>>> +   (match_operand:SIDI 1 "register_operand")
>>> +   (match_operand:SIDI 2 "register_operand")
>>> +   (match_operand 3 "")]
>>> +  "TARGET_ARM"
>>> +{
>>> +  emit_insn (gen_add<mode>3_compareC (operands[0], operands[1],
>>> +				      operands[2]));
>>> +
>>> +  rtx x;
>>> +  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Cmode, CC_REGNUM), const0_rtx);
>>> +  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
>>> +			    gen_rtx_LABEL_REF (VOIDmode, operands[3]),
>>> +			    pc_rtx);
>>> +  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
>>> +  DONE;
>>> +})
>>> +
>>>
>>> I notice this and many other patterns in this patch are guarded on
>>> TARGET_ARM. Is there any reason why they should be restricted to arm
>>> state and not be TARGET_32BIT?
>> I thought about this as well. I will test with TARGET_32BIT and get
>> back to you.
>>>
>>>
>>> Thanks,
>>> Kyrill
>>
>

--
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org
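
For reference, a minimal sketch of what one of the new scan-assembler
testcases listed in the ChangeLog (e.g. gcc.target/arm/builtin_saddl.c)
might look like; the dg directives, the use of __builtin_saddl_overflow,
and the scanned "adds" mnemonic are assumptions for illustration rather
than the contents of the posted patch:

/* { dg-do compile } */
/* { dg-options "-O2" } */
/* { dg-require-effective-target arm32 } */

extern void overflow_handler (void);

/* Signed long add: with the new addv<mode>4 expander the overflow check
   should come straight from the flags set by one flag-setting add,
   rather than from the compare-and-branch fallback sequence.  */
long
overflow_add (long x, long y)
{
  long r;

  int ovr = __builtin_saddl_overflow (x, y, &r);
  if (ovr)
    overflow_handler ();

  return r;
}

/* Expect the flag-setting add emitted by the new pattern.  */
/* { dg-final { scan-assembler "adds" } } */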