An updated patch that resolves the issues with thumb2 support and adds
test cases as requested. Looking to check this into GCC 7 stage1 when it
opens.

2016-02-24  Michael Collison  <michael.collison@linaro.org>

	* config/arm/arm-modes.def: Add new condition code mode CC_V
	to represent the overflow bit.
	* config/arm/arm.c (maybe_get_arm_condition_code):
	Add support for CC_Vmode.
	* config/arm/arm.md (addv<mode>4, add<mode>3_compareV,
	addsi3_compareV_upper): New patterns to support signed
	builtin overflow add operations.
	(uaddv<mode>4, add<mode>3_compareC, addsi3_compareC_upper):
	New patterns to support unsigned builtin add overflow operations.
	(subv<mode>4, sub<mode>3_compare1): New patterns to support signed
	builtin overflow subtract operations.
	(usubv<mode>4): New patterns to support unsigned builtin subtract
	overflow operations.
	(negvsi3, negvdi3, negdi2_compare, negsi2_carryin_compare): New
	patterns to support builtin overflow negate operations.
	* gcc.target/arm/builtin_saddl.c: New testcase.
	* gcc.target/arm/builtin_saddll.c: New testcase.
	* gcc.target/arm/builtin_uaddl.c: New testcase.
	* gcc.target/arm/builtin_uaddll.c: New testcase.
	* gcc.target/arm/builtin_ssubl.c: New testcase.
	* gcc.target/arm/builtin_ssubll.c: New testcase.
	* gcc.target/arm/builtin_usubl.c: New testcase.
	* gcc.target/arm/builtin_usubll.c: New testcase.

On 02/29/2016 04:13 AM, Kyrill Tkachov wrote:
>
> On 26/02/16 10:32, Michael Collison wrote:
>>
>>
>> On 02/25/2016 02:51 AM, Kyrill Tkachov wrote:
>>> Hi Michael,
>>>
>>> On 24/02/16 23:02, Michael Collison wrote:
>>>> This patch adds support for builtin overflow of add, subtract and
>>>> negate. This patch is targeted for gcc 7 stage 1. It was tested
>>>> with no regressions in arm and thumb modes on the following targets:
>>>>
>>>> arm-none-linux-gnueabi
>>>> arm-none-linux-gnueabihf
>>>> armeb-none-linux-gnueabihf
>>>> arm-none-eabi
>>>>
>>>
>>> I'll have a deeper look once we're closer to GCC 7 development.
>>> I've got a few comments in the meantime.
>>>
>>>> 2016-02-24  Michael Collison  <michael.collison@linaro.org>
>>>>
>>>> 	* config/arm/arm-modes.def: Add new condition code mode CC_V
>>>> 	to represent the overflow bit.
>>>> 	* config/arm/arm.c (maybe_get_arm_condition_code):
>>>> 	Add support for CC_Vmode.
>>>> 	* config/arm/arm.md (addv<mode>4, add<mode>3_compareV,
>>>> 	addsi3_compareV_upper): New patterns to support signed
>>>> 	builtin overflow add operations.
>>>> 	(uaddv<mode>4, add<mode>3_compareC, addsi3_compareC_upper):
>>>> 	New patterns to support unsigned builtin add overflow operations.
>>>> 	(subv<mode>4, sub<mode>3_compare1): New patterns to support signed
>>>> 	builtin overflow subtract operations.
>>>> 	(usubv<mode>4): New patterns to support unsigned builtin subtract
>>>> 	overflow operations.
>>>> 	(negvsi3, negvdi3, negdi2_compare, negsi2_carryin_compare): New
>>>> 	patterns to support builtin overflow negate operations.
>>>>
>>>>
>>>
>>> Can you please summarise what sequences are generated for these
>>> operations, and how they are better than the default fallback
>>> sequences.
>>
>> Sure. For a simple test case such as:
>>
>> int
>> fn3 (int x, int y, int *ovf)
>> {
>>   int res;
>>   *ovf = __builtin_sadd_overflow (x, y, &res);
>>   return res;
>> }
>>
>> current trunk at -O2 generates:
>>
>> fn3:
>> 	@ args = 0, pretend = 0, frame = 0
>> 	@ frame_needed = 0, uses_anonymous_args = 0
>> 	@ link register save eliminated.
>> 	cmp	r1, #0
>> 	mov	r3, #0
>> 	add	r1, r0, r1
>> 	blt	.L4
>> 	cmp	r1, r0
>> 	blt	.L3
>> .L2:
>> 	str	r3, [r2]
>> 	mov	r0, r1
>> 	bx	lr
>> .L4:
>> 	cmp	r1, r0
>> 	ble	.L2
>> .L3:
>> 	mov	r3, #1
>> 	b	.L2
>>
>> With the overflow patch this now generates:
>>
>> 	adds	r0, r0, r1
>> 	movvs	r3, #1
>> 	movvc	r3, #0
>> 	str	r3, [r2]
>> 	bx	lr
>>
>
> Thanks! That looks much better.
>
>>> Also, we'd need tests for each of these overflow operations, since
>>> these are pretty complex patterns that are being added.
>>
>> The patterns are now exercised, most notably by the tests in:
>>
>> c-c++-common/torture/builtin-arith-overflow*.c
>>
>> I had a few failures I resolved, so the builtin overflow arithmetic
>> functions are definitely being exercised.
>
> Great, that gives me more confidence on the correctness aspects but...
>
>>>
>>> Also, you may want to consider splitting this into a patch series,
>>> each adding a single overflow operation, together with its tests.
>>> That way it will be easier to keep track of which pattern applies
>>> to which use case and they can go in independently of each other.
>>
>> Let me know if you still feel the same way given the existing test
>> cases.
>>
>
> ... I'd like us to still have scan-assembler tests. The torture tests
> exercise the correctness, but we'd want tests to catch regressions
> where we stop generating the new patterns due to other optimisation
> changes, which would lead to code quality regressions.
> So I'd like us to have scan-assembler tests for these sequences to
> make sure we generate the right instructions.
>
> Thanks,
> Kyrill
>
>>>
>>> +(define_expand "uaddv<mode>4"
>>> +  [(match_operand:SIDI 0 "register_operand")
>>> +   (match_operand:SIDI 1 "register_operand")
>>> +   (match_operand:SIDI 2 "register_operand")
>>> +   (match_operand 3 "")]
>>> +  "TARGET_ARM"
>>> +{
>>> +  emit_insn (gen_add<mode>3_compareC (operands[0], operands[1],
>>> +				      operands[2]));
>>> +
>>> +  rtx x;
>>> +  x = gen_rtx_NE (VOIDmode, gen_rtx_REG (CC_Cmode, CC_REGNUM), const0_rtx);
>>> +  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
>>> +			    gen_rtx_LABEL_REF (VOIDmode, operands[3]),
>>> +			    pc_rtx);
>>> +  emit_jump_insn (gen_rtx_SET (pc_rtx, x));
>>> +  DONE;
>>> +})
>>> +
>>>
>>> I notice this and many other patterns in this patch are guarded on
>>> TARGET_ARM. Is there any reason why they should be restricted to arm
>>> state and not be TARGET_32BIT?
>> I thought about this as well. I will test with TARGET_32BIT and get
>> back to you.
>>>
>>>
>>> Thanks,
>>> Kyrill
>>
>

--
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org
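
For reference, a minimal sketch of what one of the new scan-assembler
testcases listed in the ChangeLog (e.g. gcc.target/arm/builtin_saddl.c)
might look like; the dg directives, the use of __builtin_saddl_overflow,
and the scanned "adds" mnemonic are assumptions for illustration rather
than the contents of the posted patch:

/* { dg-do compile } */
/* { dg-options "-O2" } */
/* { dg-require-effective-target arm32 } */

extern void overflow_handler (void);

/* Signed long add: with the new addv<mode>4 expander the overflow check
   should come straight from the flags set by one flag-setting add,
   rather than from the compare-and-branch fallback sequence.  */
long
overflow_add (long x, long y)
{
  long r;

  int ovr = __builtin_saddl_overflow (x, y, &r);
  if (ovr)
    overflow_handler ();

  return r;
}

/* Expect the flag-setting add emitted by the new pattern.  */
/* { dg-final { scan-assembler "adds" } } */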