public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Richard Earnshaw (lists)" <Richard.Earnshaw@arm.com>
To: Michael Collison <Michael.Collison@arm.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub operations
Date: Thu, 06 Jul 2017 08:22:00 -0000	[thread overview]
Message-ID: <f47cecda-78a3-0f62-fbb5-3c46c24edf7c@arm.com> (raw)
In-Reply-To: <HE1PR0802MB237777606A5886AC7E85743895D50@HE1PR0802MB2377.eurprd08.prod.outlook.com>

On 06/07/17 08:29, Michael Collison wrote:
> Richard,
> 
> Can you explain "Use of ne is wrong here.  The condition register should
> be set to the result of a compare rtl construct.  The same applies
> elsewhere within this patch.  NE is then used on the result of the
> comparison.  The mode of the compare then indicates what might or might
> not be valid in the way the comparison is finally constructed."?
> 
> Why is "ne" wrong? I don't doubt you are correct, but I see nothing in
> the internals manual that forbids it. I want to understand what issues
> this exposes.
> 

Because the idiomatic form on a machine with a flags register is

CCreg:mode = COMPARE:mode (A, B)

which is then used with

<cond-op> (CCreg:mode, 0)

where cond-op is NE, EQ, GE, ... as appropriate.


> As you indicate I used this idiom in the arm port when I added the
> overflow operations there as well. Additionally other targets seem to
> use the comparison operators this way (i386 for the umulv).

Some targets really have boolean predicate operations that set results
explicitly in GP registers as the truth of A < B, etc.  On those
machines using

 pred-reg = cond-op (A, B)

makes sense, but not on ARM or AArch64.

R.

> 
> Regards,
> 
> Michael Collison
> 
> -----Original Message-----
> From: Richard Earnshaw (lists) [mailto:Richard.Earnshaw@arm.com]
> Sent: Wednesday, July 5, 2017 2:38 AM
> To: Michael Collison <Michael.Collison@arm.com>; Christophe Lyon
> <christophe.lyon@linaro.org>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>
> Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub
> operations
> 
> On 19/05/17 22:11, Michael Collison wrote:
>> Christophe,
>> 
>> I had a type in the two test cases: "addcs" should have been "adcs". I caught this previously but submitted the previous patch incorrectly. Updated patch attached.
>> 
>> Okay for trunk?
>> 
> 
> Apologies for the delay responding, I've been procrastinating over this
> one.   In part it's due to the size of the patch with very little
> top-level description of what's the motivation and overall approach to
> the problem.
> 
> It would really help review if this could be split into multiple patches
> with a description of what each stage achieves.
> 
> Anyway, there are a couple of obvious formatting issues to deal with
> first, before we get into the details of the patch.
> 
>> -----Original Message-----
>> From: Christophe Lyon [mailto:christophe.lyon@linaro.org]
>> Sent: Friday, May 19, 2017 3:59 AM
>> To: Michael Collison <Michael.Collison@arm.com>
>> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>
>> Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub 
>> operations
>> 
>> Hi Michael,
>> 
>> 
>> On 19 May 2017 at 07:12, Michael Collison <Michael.Collison@arm.com> wrote:
>>> Hi,
>>>
>>> This patch improves code generations for builtin arithmetic overflow operations for the aarch64 backend. As an example for a simple test case such as:
>>>
>>> Sure for a simple test case such as:
>>>
>>> int
>>> f (int x, int y, int *ovf)
>>> {
>>>   int res;
>>>   *ovf = __builtin_sadd_overflow (x, y, &res);
>>>   return res;
>>> }
>>>
>>> Current trunk at -O2 generates
>>>
>>> f:
>>>         mov     w3, w0
>>>         mov     w4, 0
>>>         add     w0, w0, w1
>>>         tbnz    w1, #31, .L4
>>>         cmp     w0, w3
>>>         blt     .L3
>>> .L2:
>>>         str     w4, [x2]
>>>         ret
>>>         .p2align 3
>>> .L4:
>>>         cmp     w0, w3
>>>         ble     .L2
>>> .L3:
>>>         mov     w4, 1
>>>         b       .L2
>>>
>>>
>>> With the patch this now generates:
>>>
>>> f:
>>>         adds    w0, w0, w1
>>>         cset    w1, vs
>>>         str     w1, [x2]
>>>         ret
>>>
>>>
>>> Original patch from Richard Henderson:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html
>>>
>>>
>>> Okay for trunk?
>>>
>>> 2017-05-17  Michael Collison  <michael.collison@arm.com>
>>>             Richard Henderson <rth@redhat.com>
>>>
>>>         * config/aarch64/aarch64-modes.def (CC_V): New.
>>>         * config/aarch64/aarch64-protos.h
>>>         (aarch64_add_128bit_scratch_regs): Declare
>>>         (aarch64_add_128bit_scratch_regs): Declare.
>>>         (aarch64_expand_subvti): Declare.
>>>         (aarch64_gen_unlikely_cbranch): Declare
>>>         * config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
>>>         for signed overflow using CC_Vmode.
>>>         (aarch64_get_condition_code_1): Handle CC_Vmode.
>>>         (aarch64_gen_unlikely_cbranch): New function.
>>>         (aarch64_add_128bit_scratch_regs): New function.
>>>         (aarch64_subv_128bit_scratch_regs): New function.
>>>         (aarch64_expand_subvti): New function.
>>>         * config/aarch64/aarch64.md (addv<GPI>4, uaddv<GPI>4): New.
>>>         (addti3): Create simpler code if low part is already known to be 0.
>>>         (addvti4, uaddvti4): New.
>>>         (*add<GPI>3_compareC_cconly_imm): New.
>>>         (*add<GPI>3_compareC_cconly): New.
>>>         (*add<GPI>3_compareC_imm): New.
>>>         (*add<GPI>3_compareC): Rename from add<GPI>3_compare1; do not
>>>         handle constants within this pattern.
>>>         (*add<GPI>3_compareV_cconly_imm): New.
>>>         (*add<GPI>3_compareV_cconly): New.
>>>         (*add<GPI>3_compareV_imm): New.
>>>         (add<GPI>3_compareV): New.
>>>         (add<GPI>3_carryinC, add<GPI>3_carryinV): New.
>>>         (*add<GPI>3_carryinC_zero, *add<GPI>3_carryinV_zero): New.
>>>         (*add<GPI>3_carryinC, *add<GPI>3_carryinV): New.
>>>         (subv<GPI>4, usubv<GPI>4): New.
>>>         (subti): Handle op1 zero.
>>>         (subvti4, usub4ti4): New.
>>>         (*sub<GPI>3_compare1_imm): New.
>>>         (sub<GPI>3_carryinCV): New.
>>>         (*sub<GPI>3_carryinCV_z1_z2, *sub<GPI>3_carryinCV_z1): New.
>>>         (*sub<GPI>3_carryinCV_z2, *sub<GPI>3_carryinCV): New.
>>>         * testsuite/gcc.target/arm/builtin_sadd_128.c: New testcase.
>>>         * testsuite/gcc.target/arm/builtin_saddl.c: New testcase.
>>>         * testsuite/gcc.target/arm/builtin_saddll.c: New testcase.
>>>         * testsuite/gcc.target/arm/builtin_uadd_128.c: New testcase.
>>>         * testsuite/gcc.target/arm/builtin_uaddl.c: New testcase.
>>>         * testsuite/gcc.target/arm/builtin_uaddll.c: New testcase.
>>>         * testsuite/gcc.target/arm/builtin_ssub_128.c: New testcase.
>>>         * testsuite/gcc.target/arm/builtin_ssubl.c: New testcase.
>>>         * testsuite/gcc.target/arm/builtin_ssubll.c: New testcase.
>>>         * testsuite/gcc.target/arm/builtin_usub_128.c: New testcase.
>>>         * testsuite/gcc.target/arm/builtin_usubl.c: New testcase.
>>>         * testsuite/gcc.target/arm/builtin_usubll.c: New testcase.
>> 
>> I've tried your patch, and 2 of the new tests FAIL:
>>     gcc.target/aarch64/builtin_sadd_128.c scan-assembler addcs
>>     gcc.target/aarch64/builtin_uadd_128.c scan-assembler addcs
>> 
>> Am I missing something?
>> 
>> Thanks,
>> 
>> Christophe
>> 
>> 
>> pr6308v2.patch
>> 
>> 
>> diff --git a/gcc/config/aarch64/aarch64-modes.def 
>> b/gcc/config/aarch64/aarch64-modes.def
>> index 45f7a44..244e490 100644
>> --- a/gcc/config/aarch64/aarch64-modes.def
>> +++ b/gcc/config/aarch64/aarch64-modes.def
>> @@ -24,6 +24,7 @@ CC_MODE (CC_SWP);
>>  CC_MODE (CC_NZ);    /* Only N and Z bits of condition flags are valid.  */
>>  CC_MODE (CC_Z);     /* Only Z bit of condition flags is valid.  */
>>  CC_MODE (CC_C);     /* Only C bit of condition flags is valid.  */
>> +CC_MODE (CC_V);     /* Only V bit of condition flags is valid.  */
>>  
>>  /* Half-precision floating point for __fp16.  */  FLOAT_MODE (HF, 2, 
>> 0); diff --git a/gcc/config/aarch64/aarch64-protos.h 
>> b/gcc/config/aarch64/aarch64-protos.h
>> index f55d4ba..f38b2b8 100644
>> --- a/gcc/config/aarch64/aarch64-protos.h
>> +++ b/gcc/config/aarch64/aarch64-protos.h
>> @@ -388,6 +388,18 @@ void aarch64_relayout_simd_types (void);  void 
>> aarch64_reset_previous_fndecl (void);  bool 
>> aarch64_return_address_signing_enabled (void);  void 
>> aarch64_save_restore_target_globals (tree);
>> +void aarch64_add_128bit_scratch_regs (rtx op1, rtx op2, rtx *low_dest,
>> +                                   rtx *low_in1, rtx *low_in2,
>> +                                   rtx *high_dest, rtx *high_in1,
>> +                                   rtx *high_in2);
>> +void aarch64_subv_128bit_scratch_regs (rtx op1, rtx op2, rtx *low_dest,
>> +                                    rtx *low_in1, rtx *low_in2,
>> +                                    rtx *high_dest, rtx *high_in1,
>> +                                    rtx *high_in2);
>> +void aarch64_expand_subvti (rtx op0, rtx low_dest, rtx low_in1,
>> +                         rtx low_in2, rtx high_dest, rtx high_in1,
>> +                         rtx high_in2);
>> +
> 
> It's a little bit inconsistent, but the general style in
> aarch64-protos.h is not to include parameter names in prototypes, just
> their types.
> 
>>  
>>  /* Initialize builtins for SIMD intrinsics.  */  void 
>> init_aarch64_simd_builtins (void); @@ -412,6 +424,8 @@ bool 
>> aarch64_float_const_representable_p (rtx);
>>  
>>  #if defined (RTX_CODE)
>>  
>> +void aarch64_gen_unlikely_cbranch (enum rtx_code, machine_mode cc_mode,
>> +                                rtx label_ref);
>>  bool aarch64_legitimate_address_p (machine_mode, rtx, RTX_CODE, 
>> bool);  machine_mode aarch64_select_cc_mode (RTX_CODE, rtx, rtx);  rtx 
>> aarch64_gen_compare_reg (RTX_CODE, rtx, rtx); diff --git 
>> a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 
>> f343d92..71a651c 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -4716,6 +4716,13 @@ aarch64_select_cc_mode (RTX_CODE code, rtx x, rtx y)
>>        && GET_CODE (y) == ZERO_EXTEND)
>>      return CC_Cmode;
>>  
>> +  /* A test for signed overflow.  */
>> +  if ((GET_MODE (x) == DImode || GET_MODE (x) == TImode)
>> +      && code == NE
>> +      && GET_CODE (x) == PLUS
>> +      && GET_CODE (y) == SIGN_EXTEND)
>> +    return CC_Vmode;
>> +
>>    /* For everything else, return CCmode.  */
>>    return CCmode;
>>  }
>> @@ -4822,6 +4829,15 @@ aarch64_get_condition_code_1 (enum machine_mode mode, enum rtx_code comp_code)
>>        }
>>        break;
>>  
>> +    case CC_Vmode:
>> +      switch (comp_code)
>> +     {
>> +     case NE: return AARCH64_VS;
>> +     case EQ: return AARCH64_VC;
>> +     default: return -1;
>> +     }
>> +      break;
>> +
>>      default:
>>        return -1;
>>      }
>> @@ -13630,6 +13646,88 @@ aarch64_split_dimode_const_store (rtx dst, rtx src)
>>    return true;
>>  }
>>  
>> +/* Generate RTL for a conditional branch with rtx comparison CODE in
>> +   mode CC_MODE.  The destination of the unlikely conditional branch
>> +   is LABEL_REF.  */
>> +
>> +void
>> +aarch64_gen_unlikely_cbranch (enum rtx_code code, machine_mode cc_mode,
>> +                           rtx label_ref)
>> +{
>> +  rtx x;
>> +  x = gen_rtx_fmt_ee (code, VOIDmode,
>> +                   gen_rtx_REG (cc_mode, CC_REGNUM),
>> +                   const0_rtx);
>> +
>> +  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
>> +                         gen_rtx_LABEL_REF (VOIDmode, label_ref),
>> +                         pc_rtx);
>> +  aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x)); }
>> +
>> +void aarch64_add_128bit_scratch_regs (rtx op1, rtx op2, rtx 
>> +*low_dest,
> 
> Function names must start in column 1, with the return type on the
> preceding line.  All functions should have a top-level comment
> describing what they do (their contract with the caller).
> 
>> +                                   rtx *low_in1, rtx *low_in2,
>> +                                   rtx *high_dest, rtx *high_in1,
>> +                                   rtx *high_in2)
>> +{
>> +  *low_dest = gen_reg_rtx (DImode);
>> +  *low_in1 = gen_lowpart (DImode, op1);
>> +  *low_in2 = simplify_gen_subreg (DImode, op2, TImode,
>> +                               subreg_lowpart_offset (DImode, TImode));
>> +  *high_dest = gen_reg_rtx (DImode);
>> +  *high_in1 = gen_highpart (DImode, op1);
>> +  *high_in2 = simplify_gen_subreg (DImode, op2, TImode,
>> +                                subreg_highpart_offset (DImode, TImode)); }
>> +
>> +void aarch64_subv_128bit_scratch_regs (rtx op1, rtx op2, rtx 
>> +*low_dest,
> 
> Same here.
> 
>> +                                    rtx *low_in1, rtx *low_in2,
>> +                                    rtx *high_dest, rtx *high_in1,
>> +                                    rtx *high_in2)
>> +{
>> +  *low_dest = gen_reg_rtx (DImode);
>> +  *low_in1 = simplify_gen_subreg (DImode, op1, TImode,
>> +                               subreg_lowpart_offset (DImode, TImode));
>> +  *low_in2 = simplify_gen_subreg (DImode, op2, TImode,
>> +                               subreg_lowpart_offset (DImode, TImode));
>> +  *high_dest = gen_reg_rtx (DImode);
>> +  *high_in1 = simplify_gen_subreg (DImode, op1, TImode,
>> +                                subreg_highpart_offset (DImode, TImode));
>> +  *high_in2 = simplify_gen_subreg (DImode, op2, TImode,
>> +                                subreg_highpart_offset (DImode, TImode));
>> +
>> +}
>> +
>> +void aarch64_expand_subvti (rtx op0, rtx low_dest, rtx low_in1,
> And here.
> 
>> +                         rtx low_in2, rtx high_dest, rtx high_in1,
>> +                         rtx high_in2)
>> +{
>> +  if (low_in2 == const0_rtx)
>> +    {
>> +      low_dest = low_in1;
>> +      emit_insn (gen_subdi3_compare1 (high_dest, high_in1,
>> +                                   force_reg (DImode, high_in2)));
>> +    }
>> +  else
>> +    {
>> +      if (CONST_INT_P (low_in2))
>> +     {
>> +       low_in2 = force_reg (DImode, GEN_INT (-UINTVAL (low_in2)));
>> +       high_in2 = force_reg (DImode, high_in2);
>> +       emit_insn (gen_adddi3_compareC (low_dest, low_in1, low_in2));
>> +     }
>> +      else
>> +     emit_insn (gen_subdi3_compare1 (low_dest, low_in1, low_in2));
>> +      emit_insn (gen_subdi3_carryinCV (high_dest,
>> +                                    force_reg (DImode, high_in1),
>> +                                    high_in2));
>> +    }
>> +
>> +  emit_move_insn (gen_lowpart (DImode, op0), low_dest);  
>> + emit_move_insn (gen_highpart (DImode, op0), high_dest);
>> +
>> +}
>> +
>>  /* Implement the TARGET_ASAN_SHADOW_OFFSET hook.  */
>>  
>>  static unsigned HOST_WIDE_INT
>> diff --git a/gcc/config/aarch64/aarch64.md 
>> b/gcc/config/aarch64/aarch64.md index a693a3b..3976ecb 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -1711,25 +1711,123 @@
>>    }
>>  )
>>  
>> +(define_expand "addv<mode>4"
>> +  [(match_operand:GPI 0 "register_operand")
>> +   (match_operand:GPI 1 "register_operand")
>> +   (match_operand:GPI 2 "register_operand")
>> +   (match_operand 3 "")]
>> +  ""
>> +{
>> +  emit_insn (gen_add<mode>3_compareV (operands[0], operands[1], 
>> +operands[2]));
>> +  aarch64_gen_unlikely_cbranch (NE, CC_Vmode, operands[3]);
>> +
>> +  DONE;
>> +})
>> +
>> +(define_expand "uaddv<mode>4"
>> +  [(match_operand:GPI 0 "register_operand")
>> +   (match_operand:GPI 1 "register_operand")
>> +   (match_operand:GPI 2 "register_operand")
>> +   (match_operand 3 "")]
> 
> With no rtl in the expand to describe this pattern, it really should
> have a top-level comment explaining the arguments (reference to the
> manual is probably OK in this case).
> 
>> +  ""
>> +{
>> +  emit_insn (gen_add<mode>3_compareC (operands[0], operands[1], 
>> +operands[2]));
>> +  aarch64_gen_unlikely_cbranch (NE, CC_Cmode, operands[3]);
>> +
>> +  DONE;
>> +})
>> +
>> +
>>  (define_expand "addti3"
>>    [(set (match_operand:TI 0 "register_operand" "")
>>        (plus:TI (match_operand:TI 1 "register_operand" "")
>> -              (match_operand:TI 2 "register_operand" "")))]
>> +              (match_operand:TI 2 "aarch64_reg_or_imm" "")))]
>>    ""
>>  {
>> -  rtx low = gen_reg_rtx (DImode);
>> -  emit_insn (gen_adddi3_compareC (low, gen_lowpart (DImode, operands[1]),
>> -                               gen_lowpart (DImode, operands[2])));
>> +  rtx l0,l1,l2,h0,h1,h2;
>>  
>> -  rtx high = gen_reg_rtx (DImode);
>> -  emit_insn (gen_adddi3_carryin (high, gen_highpart (DImode, operands[1]),
>> -                              gen_highpart (DImode, operands[2])));
>> +  aarch64_add_128bit_scratch_regs (operands[1], operands[2],
>> +                                &l0, &l1, &l2, &h0, &h1, &h2);
>> +
>> +  if (l2 == const0_rtx)
>> +    {
>> +      l0 = l1;
>> +      if (!aarch64_pluslong_operand (h2, DImode))
>> +     h2 = force_reg (DImode, h2);
>> +      emit_insn (gen_adddi3 (h0, h1, h2));
>> +    }
>> +  else
>> +    {
>> +      emit_insn (gen_adddi3_compareC (l0, l1, force_reg (DImode, l2)));
>> +      emit_insn (gen_adddi3_carryin (h0, h1, force_reg (DImode, h2)));
>> +    }
>> +
>> +  emit_move_insn (gen_lowpart (DImode, operands[0]), l0);  
>> + emit_move_insn (gen_highpart (DImode, operands[0]), h0);
>>  
>> -  emit_move_insn (gen_lowpart (DImode, operands[0]), low);
>> -  emit_move_insn (gen_highpart (DImode, operands[0]), high);
>>    DONE;
>>  })
>>  
>> +(define_expand "addvti4"
>> +  [(match_operand:TI 0 "register_operand" "")
>> +   (match_operand:TI 1 "register_operand" "")
>> +   (match_operand:TI 2 "aarch64_reg_or_imm" "")
>> +   (match_operand 3 "")]
> 
> Same here.
> 
>> +  ""
>> +{
>> +  rtx l0,l1,l2,h0,h1,h2;
>> +
>> +  aarch64_add_128bit_scratch_regs (operands[1], operands[2],
>> +                                &l0, &l1, &l2, &h0, &h1, &h2);
>> +
>> +  if (l2 == const0_rtx)
>> +    {
>> +      l0 = l1;
>> +      emit_insn (gen_adddi3_compareV (h0, h1, force_reg (DImode, h2)));
>> +    }
>> +  else
>> +    {
>> +      emit_insn (gen_adddi3_compareC (l0, l1, force_reg (DImode, l2)));
>> +      emit_insn (gen_adddi3_carryinV (h0, h1, force_reg (DImode, h2)));
>> +    }
>> +
>> +  emit_move_insn (gen_lowpart (DImode, operands[0]), l0);  
>> + emit_move_insn (gen_highpart (DImode, operands[0]), h0);
>> +
>> +  aarch64_gen_unlikely_cbranch (NE, CC_Vmode, operands[3]);
>> +  DONE;
>> +})
>> +
>> +(define_expand "uaddvti4"
>> +  [(match_operand:TI 0 "register_operand" "")
>> +   (match_operand:TI 1 "register_operand" "")
>> +   (match_operand:TI 2 "aarch64_reg_or_imm" "")
>> +   (match_operand 3 "")]
>> +  ""
>> +{
>> +  rtx l0,l1,l2,h0,h1,h2;
>> +
>> +  aarch64_add_128bit_scratch_regs (operands[1], operands[2],
>> +                                &l0, &l1, &l2, &h0, &h1, &h2);
>> +
>> +  if (l2 == const0_rtx)
>> +    {
>> +      l0 = l1;
>> +      emit_insn (gen_adddi3_compareC (h0, h1, force_reg (DImode, h2)));
>> +    }
>> +  else
>> +    {
>> +      emit_insn (gen_adddi3_compareC (l0, l1, force_reg (DImode, l2)));
>> +      emit_insn (gen_adddi3_carryinC (h0, h1, force_reg (DImode, h2)));
>> +    }
>> +
>> +  emit_move_insn (gen_lowpart (DImode, operands[0]), l0);  
>> + emit_move_insn (gen_highpart (DImode, operands[0]), h0);
>> +
>> +  aarch64_gen_unlikely_cbranch (NE, CC_Cmode, operands[3]);  DONE;
>> + })
>> +
>>  (define_insn "add<mode>3_compare0"
>>    [(set (reg:CC_NZ CC_REGNUM)
>>        (compare:CC_NZ
>> @@ -1828,10 +1926,70 @@
>>    [(set_attr "type" "alus_sreg")]
>>  )
>>  
>> +;; Note that since we're sign-extending, match the immediate in GPI 
>> +;; rather than in DWI.  Since CONST_INT is modeless, this works fine.
>> +(define_insn "*add<mode>3_compareV_cconly_imm"
>> +  [(set (reg:CC_V CC_REGNUM)
>> +     (ne:CC_V
>> +       (plus:<DWI>
>> +         (sign_extend:<DWI> (match_operand:GPI 0 "register_operand" "r,r"))
>> +         (match_operand:GPI 1 "aarch64_plus_immediate" "I,J"))
>> +       (sign_extend:<DWI> (plus:GPI (match_dup 0) (match_dup 1)))))]
>> +  ""
>> +  "@
>> +  cmn\\t%<w>0, %<w>1
>> +  cmp\\t%<w>0, #%n1"
>> +  [(set_attr "type" "alus_imm")]
>> +)
>> +
>> +(define_insn "*add<mode>3_compareV_cconly"
>> +  [(set (reg:CC_V CC_REGNUM)
>> +     (ne:CC_V
> 
> Use of ne is wrong here.  The condition register should be set to the
> result of a compare rtl construct.  The same applies elsewhere within
> this patch.  NE is then used on the result of the comparison.  The mode
> of the compare then indicates what might or might not be valid in the
> way the comparison is finally constructed.
> 
> Note that this issue may go back to the earlier patches that this is
> based on, but those are equally incorrect and wil need fixing as well at
> some point.  We shouldn't prepetuate the issue.
> 
>> +       (plus:<DWI>
>> +         (sign_extend:<DWI> (match_operand:GPI 0 "register_operand" "r"))
>> +         (sign_extend:<DWI> (match_operand:GPI 1 "register_operand" "r")))
>> +       (sign_extend:<DWI> (plus:GPI (match_dup 0) (match_dup 1)))))]
>> +  ""
>> +  "cmn\\t%<w>0, %<w>1"
>> +  [(set_attr "type" "alus_sreg")]
>> +)
>> +
>> +(define_insn "*add<mode>3_compareV_imm"
>> +  [(set (reg:CC_V CC_REGNUM)
>> +     (ne:CC_V
>> +       (plus:<DWI>
>> +         (sign_extend:<DWI>
>> +           (match_operand:GPI 1 "register_operand" "r,r"))
>> +         (match_operand:GPI 2 "aarch64_plus_immediate" "I,J"))
>> +       (sign_extend:<DWI>
>> +         (plus:GPI (match_dup 1) (match_dup 2)))))
>> +   (set (match_operand:GPI 0 "register_operand" "=r,r")
>> +     (plus:GPI (match_dup 1) (match_dup 2)))]
>> +   ""
>> +   "@
>> +   adds\\t%<w>0, %<w>1, %<w>2
>> +   subs\\t%<w>0, %<w>1, #%n2"
>> +  [(set_attr "type" "alus_imm,alus_imm")]
>> +)
>> +
>> +(define_insn "add<mode>3_compareV"
>> +  [(set (reg:CC_V CC_REGNUM)
>> +     (ne:CC_V
>> +       (plus:<DWI>
>> +         (sign_extend:<DWI> (match_operand:GPI 1 "register_operand" "r"))
>> +         (sign_extend:<DWI> (match_operand:GPI 2 "register_operand" "r")))
>> +       (sign_extend:<DWI> (plus:GPI (match_dup 1) (match_dup 2)))))
>> +   (set (match_operand:GPI 0 "register_operand" "=r")
>> +     (plus:GPI (match_dup 1) (match_dup 2)))]
>> +  ""
>> +  "adds\\t%<w>0, %<w>1, %<w>2"
>> +  [(set_attr "type" "alus_sreg")]
>> +)
>> +
>>  (define_insn "*adds_shift_imm_<mode>"
>>    [(set (reg:CC_NZ CC_REGNUM)
>>        (compare:CC_NZ
>> -      (plus:GPI (ASHIFT:GPI 
>> +      (plus:GPI (ASHIFT:GPI
>>                    (match_operand:GPI 1 "register_operand" "r")
>>                    (match_operand:QI 2 "aarch64_shift_imm_<mode>" "n"))
>>                   (match_operand:GPI 3 "register_operand" "r")) @@ -2187,6 
>> +2345,138 @@
>>    [(set_attr "type" "adc_reg")]
>>  )
>>  
>> +(define_expand "add<mode>3_carryinC"
>> +  [(parallel
>> +     [(set (match_dup 3)
>> +        (ne:CC_C
>> +          (plus:<DWI>
>> +            (plus:<DWI>
>> +              (match_dup 4)
>> +              (zero_extend:<DWI>
>> +                (match_operand:GPI 1 "register_operand" "r")))
>> +            (zero_extend:<DWI>
>> +              (match_operand:GPI 2 "register_operand" "r")))
>> +        (zero_extend:<DWI>
>> +          (plus:GPI
>> +            (plus:GPI (match_dup 5) (match_dup 1))
>> +            (match_dup 2)))))
>> +      (set (match_operand:GPI 0 "register_operand")
>> +        (plus:GPI
>> +          (plus:GPI (match_dup 5) (match_dup 1))
>> +          (match_dup 2)))])]
>> +   ""
>> +{
>> +  operands[3] = gen_rtx_REG (CC_Cmode, CC_REGNUM);
>> +  operands[4] = gen_rtx_NE (<DWI>mode, operands[3], const0_rtx);
>> +  operands[5] = gen_rtx_NE (<MODE>mode, operands[3], const0_rtx);
>> +})
>> +
>> +(define_insn "*add<mode>3_carryinC_zero"
>> +  [(set (reg:CC_C CC_REGNUM)
>> +     (ne:CC_C
>> +       (plus:<DWI>
>> +         (match_operand:<DWI> 2 "aarch64_carry_operation" "")
>> +         (zero_extend:<DWI> (match_operand:GPI 1 "register_operand" "r")))
>> +       (zero_extend:<DWI>
>> +         (plus:GPI
>> +           (match_operand:GPI 3 "aarch64_carry_operation" "")
>> +           (match_dup 1)))))
>> +   (set (match_operand:GPI 0 "register_operand")
>> +     (plus:GPI (match_dup 3) (match_dup 1)))]
>> +   ""
>> +   "adcs\\t%<w>0, %<w>1, <w>zr"
>> +  [(set_attr "type" "adc_reg")]
>> +)
>> +
>> +(define_insn "*add<mode>3_carryinC"
>> +  [(set (reg:CC_C CC_REGNUM)
>> +     (ne:CC_C
>> +       (plus:<DWI>
>> +         (plus:<DWI>
>> +           (match_operand:<DWI> 3 "aarch64_carry_operation" "")
>> +           (zero_extend:<DWI> (match_operand:GPI 1 "register_operand" "r")))
>> +         (zero_extend:<DWI> (match_operand:GPI 2 "register_operand" "r")))
>> +       (zero_extend:<DWI>
>> +         (plus:GPI
>> +           (plus:GPI
>> +             (match_operand:GPI 4 "aarch64_carry_operation" "")
>> +             (match_dup 1))
>> +           (match_dup 2)))))
>> +   (set (match_operand:GPI 0 "register_operand")
>> +     (plus:GPI
>> +       (plus:GPI (match_dup 4) (match_dup 1))
>> +       (match_dup 2)))]
>> +   ""
>> +   "adcs\\t%<w>0, %<w>1, %<w>2"
>> +  [(set_attr "type" "adc_reg")]
>> +)
>> +
>> +(define_expand "add<mode>3_carryinV"
>> +  [(parallel
>> +     [(set (reg:CC_V CC_REGNUM)
>> +        (ne:CC_V
>> +          (plus:<DWI>
>> +            (plus:<DWI>
>> +              (match_dup 3)
>> +              (sign_extend:<DWI>
>> +                (match_operand:GPI 1 "register_operand" "r")))
>> +            (sign_extend:<DWI>
>> +              (match_operand:GPI 2 "register_operand" "r")))
>> +        (sign_extend:<DWI>
>> +          (plus:GPI
>> +            (plus:GPI (match_dup 4) (match_dup 1))
>> +            (match_dup 2)))))
>> +      (set (match_operand:GPI 0 "register_operand")
>> +        (plus:GPI
>> +          (plus:GPI (match_dup 4) (match_dup 1))
>> +          (match_dup 2)))])]
>> +   ""
>> +{
>> +  rtx cc = gen_rtx_REG (CC_Cmode, CC_REGNUM);
>> +  operands[3] = gen_rtx_NE (<DWI>mode, cc, const0_rtx);
>> +  operands[4] = gen_rtx_NE (<MODE>mode, cc, const0_rtx);
>> +})
>> +
>> +(define_insn "*add<mode>3_carryinV_zero"
>> +  [(set (reg:CC_V CC_REGNUM)
>> +     (ne:CC_V
>> +       (plus:<DWI>
>> +         (match_operand:<DWI> 2 "aarch64_carry_operation" "")
>> +         (sign_extend:<DWI> (match_operand:GPI 1 "register_operand" "r")))
>> +       (sign_extend:<DWI>
>> +         (plus:GPI
>> +           (match_operand:GPI 3 "aarch64_carry_operation" "")
>> +           (match_dup 1)))))
>> +   (set (match_operand:GPI 0 "register_operand")
>> +     (plus:GPI (match_dup 3) (match_dup 1)))]
>> +   ""
>> +   "adcs\\t%<w>0, %<w>1, <w>zr"
>> +  [(set_attr "type" "adc_reg")]
>> +)
>> +
>> +(define_insn "*add<mode>3_carryinV"
>> +  [(set (reg:CC_V CC_REGNUM)
>> +     (ne:CC_V
>> +       (plus:<DWI>
>> +         (plus:<DWI>
>> +           (match_operand:<DWI> 3 "aarch64_carry_operation" "")
>> +           (sign_extend:<DWI> (match_operand:GPI 1 "register_operand" "r")))
>> +         (sign_extend:<DWI> (match_operand:GPI 2 "register_operand" "r")))
>> +       (sign_extend:<DWI>
>> +         (plus:GPI
>> +           (plus:GPI
>> +             (match_operand:GPI 4 "aarch64_carry_operation" "")
>> +             (match_dup 1))
>> +           (match_dup 2)))))
>> +   (set (match_operand:GPI 0 "register_operand")
>> +     (plus:GPI
>> +       (plus:GPI (match_dup 4) (match_dup 1))
>> +       (match_dup 2)))]
>> +   ""
>> +   "adcs\\t%<w>0, %<w>1, %<w>2"
>> +  [(set_attr "type" "adc_reg")]
>> +)
>> +
>>  (define_insn "*add_uxt<mode>_shift2"
>>    [(set (match_operand:GPI 0 "register_operand" "=rk")
>>        (plus:GPI (and:GPI
>> @@ -2283,22 +2573,86 @@
>>     (set_attr "simd" "*,yes")]
>>  )
>>  
>> +(define_expand "subv<mode>4"
>> +  [(match_operand:GPI 0 "register_operand")
>> +   (match_operand:GPI 1 "aarch64_reg_or_zero")
>> +   (match_operand:GPI 2 "aarch64_reg_or_zero")
>> +   (match_operand 3 "")]
>> +  ""
>> +{
>> +  emit_insn (gen_sub<mode>3_compare1 (operands[0], operands[1], 
>> +operands[2]));
>> +  aarch64_gen_unlikely_cbranch (NE, CC_Vmode, operands[3]);
>> +
>> +  DONE;
>> +})
>> +
>> +(define_expand "usubv<mode>4"
>> +  [(match_operand:GPI 0 "register_operand")
>> +   (match_operand:GPI 1 "aarch64_reg_or_zero")
>> +   (match_operand:GPI 2 "aarch64_reg_or_zero")
>> +   (match_operand 3 "")]
>> +  ""
>> +{
>> +  emit_insn (gen_sub<mode>3_compare1 (operands[0], operands[1], 
>> +operands[2]));
>> +  aarch64_gen_unlikely_cbranch (LTU, CCmode, operands[3]);
>> +
>> +  DONE;
>> +})
>> +
>>  (define_expand "subti3"
>>    [(set (match_operand:TI 0 "register_operand" "")
>> -     (minus:TI (match_operand:TI 1 "register_operand" "")
>> +     (minus:TI (match_operand:TI 1 "aarch64_reg_or_zero" "")
>>                  (match_operand:TI 2 "register_operand" "")))]
>>    ""
>>  {
>> -  rtx low = gen_reg_rtx (DImode);
>> -  emit_insn (gen_subdi3_compare1 (low, gen_lowpart (DImode, operands[1]),
>> -                               gen_lowpart (DImode, operands[2])));
>> +  rtx l0 = gen_reg_rtx (DImode);
>> +  rtx l1 = simplify_gen_subreg (DImode, operands[1], TImode,
>> +                             subreg_lowpart_offset (DImode, TImode));
>> +  rtx l2 = gen_lowpart (DImode, operands[2]);
>> +  rtx h0 = gen_reg_rtx (DImode);
>> +  rtx h1 = simplify_gen_subreg (DImode, operands[1], TImode,
>> +                             subreg_highpart_offset (DImode, TImode));
>> +  rtx h2 = gen_highpart (DImode, operands[2]);
>>  
>> -  rtx high = gen_reg_rtx (DImode);
>> -  emit_insn (gen_subdi3_carryin (high, gen_highpart (DImode, operands[1]),
>> -                              gen_highpart (DImode, operands[2])));
>> +  emit_insn (gen_subdi3_compare1 (l0, l1, l2));  emit_insn 
>> + (gen_subdi3_carryin (h0, h1, h2));
>>  
>> -  emit_move_insn (gen_lowpart (DImode, operands[0]), low);
>> -  emit_move_insn (gen_highpart (DImode, operands[0]), high);
>> +  emit_move_insn (gen_lowpart (DImode, operands[0]), l0);
>> +  emit_move_insn (gen_highpart (DImode, operands[0]), h0);
>> +  DONE;
>> +})
>> +
>> +(define_expand "subvti4"
>> +  [(match_operand:TI 0 "register_operand")
>> +   (match_operand:TI 1 "aarch64_reg_or_zero")
>> +   (match_operand:TI 2 "aarch64_reg_or_imm")
>> +   (match_operand 3 "")]
>> +  ""
>> +{
>> +  rtx l0,l1,l2,h0,h1,h2;
>> +
>> +  aarch64_subv_128bit_scratch_regs (operands[1], operands[2],
>> +                                 &l0, &l1, &l2, &h0, &h1, &h2);
>> +  aarch64_expand_subvti (operands[0], l0, l1, l2, h0, h1, h2);
>> +
>> +  aarch64_gen_unlikely_cbranch (NE, CC_Vmode, operands[3]);
>> +  DONE;
>> +})
>> +
>> +(define_expand "usubvti4"
>> +  [(match_operand:TI 0 "register_operand")
>> +   (match_operand:TI 1 "aarch64_reg_or_zero")
>> +   (match_operand:TI 2 "aarch64_reg_or_imm")
>> +   (match_operand 3 "")]
>> +  ""
>> +{
>> +  rtx l0,l1,l2,h0,h1,h2;
>> +
>> +  aarch64_subv_128bit_scratch_regs (operands[1], operands[2],
>> +                                 &l0, &l1, &l2, &h0, &h1, &h2);
>> +  aarch64_expand_subvti (operands[0], l0, l1, l2, h0, h1, h2);
>> +
>> +  aarch64_gen_unlikely_cbranch (LTU, CCmode, operands[3]);
>>    DONE;
>>  })
>>  
>> @@ -2327,6 +2681,22 @@
>>    [(set_attr "type" "alus_sreg")]
>>  )
>>  
>> +(define_insn "*sub<mode>3_compare1_imm"
>> +  [(set (reg:CC CC_REGNUM)
>> +     (compare:CC
>> +       (match_operand:GPI 1 "aarch64_reg_or_zero" "rZ,rZ")
>> +       (match_operand:GPI 2 "aarch64_plus_immediate" "I,J")))
>> +   (set (match_operand:GPI 0 "register_operand" "=r,r")
>> +     (plus:GPI
>> +       (match_dup 1)
>> +       (match_operand:GPI 3 "aarch64_plus_immediate" "J,I")))]
>> +  "UINTVAL (operands[2]) == -UINTVAL (operands[3])"
>> +  "@
>> +  subs\\t%<w>0, %<w>1, %<w>2
>> +  adds\\t%<w>0, %<w>1, %<w>3"
>> +  [(set_attr "type" "alus_imm")]
>> +)
>> +
>>  (define_insn "sub<mode>3_compare1"
>>    [(set (reg:CC CC_REGNUM)
>>        (compare:CC
>> @@ -2554,6 +2924,85 @@
>>    [(set_attr "type" "adc_reg")]
>>  )
>>  
>> +(define_expand "sub<mode>3_carryinCV"
>> +  [(parallel
>> +     [(set (reg:CC CC_REGNUM)
>> +        (compare:CC
>> +          (sign_extend:<DWI>
>> +            (match_operand:GPI 1 "aarch64_reg_or_zero" "rZ"))
>> +          (plus:<DWI>
>> +            (sign_extend:<DWI>
>> +              (match_operand:GPI 2 "register_operand" "r"))
>> +            (ltu:<DWI> (reg:CC CC_REGNUM) (const_int 0)))))
>> +      (set (match_operand:GPI 0 "register_operand" "=r")
>> +        (minus:GPI
>> +          (minus:GPI (match_dup 1) (match_dup 2))
>> +          (ltu:GPI (reg:CC CC_REGNUM) (const_int 0))))])]
>> +   ""
>> +)
>> +
>> +(define_insn "*sub<mode>3_carryinCV_z1_z2"
>> +  [(set (reg:CC CC_REGNUM)
>> +     (compare:CC
>> +       (const_int 0)
>> +       (match_operand:<DWI> 2 "aarch64_borrow_operation" "")))
>> +   (set (match_operand:GPI 0 "register_operand" "=r")
>> +     (neg:GPI (match_operand:GPI 1 "aarch64_borrow_operation" "")))]
>> +   ""
>> +   "sbcs\\t%<w>0, <w>zr, <w>zr"
>> +  [(set_attr "type" "adc_reg")]
>> +)
>> +
>> +(define_insn "*sub<mode>3_carryinCV_z1"
>> +  [(set (reg:CC CC_REGNUM)
>> +     (compare:CC
>> +       (const_int 0)
>> +       (plus:<DWI>
>> +         (sign_extend:<DWI>
>> +           (match_operand:GPI 1 "register_operand" "r"))
>> +         (match_operand:<DWI> 2 "aarch64_borrow_operation" ""))))
>> +   (set (match_operand:GPI 0 "register_operand" "=r")
>> +     (minus:GPI
>> +       (neg:GPI (match_dup 1))
>> +       (match_operand:GPI 3 "aarch64_borrow_operation" "")))]
>> +   ""
>> +   "sbcs\\t%<w>0, <w>zr, %<w>1"
>> +  [(set_attr "type" "adc_reg")]
>> +)
>> +
>> +(define_insn "*sub<mode>3_carryinCV_z2"
>> +  [(set (reg:CC CC_REGNUM)
>> +     (compare:CC
>> +       (sign_extend:<DWI>
>> +         (match_operand:GPI 1 "register_operand" "r"))
>> +       (match_operand:<DWI> 2 "aarch64_borrow_operation" "")))
>> +   (set (match_operand:GPI 0 "register_operand" "=r")
>> +     (minus:GPI
>> +       (match_dup 1)
>> +       (match_operand:GPI 3 "aarch64_borrow_operation" "")))]
>> +   ""
>> +   "sbcs\\t%<w>0, %<w>1, <w>zr"
>> +  [(set_attr "type" "adc_reg")]
>> +)
>> +
>> +(define_insn "*sub<mode>3_carryinCV"
>> +  [(set (reg:CC CC_REGNUM)
>> +     (compare:CC
>> +       (sign_extend:<DWI>
>> +         (match_operand:GPI 1 "register_operand" "r"))
>> +       (plus:<DWI>
>> +         (sign_extend:<DWI>
>> +           (match_operand:GPI 2 "register_operand" "r"))
>> +         (match_operand:<DWI> 3 "aarch64_borrow_operation" ""))))
>> +   (set (match_operand:GPI 0 "register_operand" "=r")
>> +     (minus:GPI
>> +       (minus:GPI (match_dup 1) (match_dup 2))
>> +       (match_operand:GPI 4 "aarch64_borrow_operation" "")))]
>> +   ""
>> +   "sbcs\\t%<w>0, %<w>1, %<w>2"
>> +  [(set_attr "type" "adc_reg")]
>> +)
>> +
>>  (define_insn "*sub_uxt<mode>_shift2"
>>    [(set (match_operand:GPI 0 "register_operand" "=rk")
>>        (minus:GPI (match_operand:GPI 4 "register_operand" "rk") diff --git 
>> a/gcc/testsuite/gcc.target/aarch64/builtin_sadd_128.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_sadd_128.c
>> new file mode 100644
>> index 0000000..0b31500
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_sadd_128.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +__int128 overflow_add (__int128 x, __int128 y) {
>> +  __int128 r;
>> +
>> +  int ovr = __builtin_add_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> +/* { dg-final { scan-assembler "adcs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_saddl.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_saddl.c
>> new file mode 100644
>> index 0000000..9768a98
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_saddl.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +long overflow_add (long x, long y)
>> +{
>> +  long r;
>> +
>> +  int ovr = __builtin_saddl_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_saddll.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_saddll.c
>> new file mode 100644
>> index 0000000..126a526
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_saddll.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +long long overflow_add (long long x, long long y) {
>> +  long long r;
>> +
>> +  int ovr = __builtin_saddll_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> +
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_ssub_128.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_ssub_128.c
>> new file mode 100644
>> index 0000000..c1261e3
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_ssub_128.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +__int128 overflow_sub (__int128 x, __int128 y) {
>> +  __int128 r;
>> +
>> +  int ovr = __builtin_sub_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> +/* { dg-final { scan-assembler "sbcs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_ssubl.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_ssubl.c
>> new file mode 100644
>> index 0000000..1040464
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_ssubl.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +long overflow_sub (long x, long y)
>> +{
>> +  long r;
>> +
>> +  int ovr = __builtin_ssubl_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_ssubll.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_ssubll.c
>> new file mode 100644
>> index 0000000..a03df88
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_ssubll.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +long long overflow_sub (long long x, long long y) {
>> +  long long r;
>> +
>> +  int ovr = __builtin_ssubll_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> +
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_uadd_128.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_uadd_128.c
>> new file mode 100644
>> index 0000000..c573c2a
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_uadd_128.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned __int128 overflow_add (unsigned __int128 x, unsigned 
>> +__int128 y) {
>> +  unsigned __int128 r;
>> +
>> +  int ovr = __builtin_add_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> +/* { dg-final { scan-assembler "adcs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_uaddl.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_uaddl.c
>> new file mode 100644
>> index 0000000..e325591
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_uaddl.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned long overflow_add (unsigned long x, unsigned long y) {
>> +  unsigned long r;
>> +
>> +  int ovr = __builtin_uaddl_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_uaddll.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_uaddll.c
>> new file mode 100644
>> index 0000000..5f42886
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_uaddll.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned long long overflow_add (unsigned long long x, unsigned long 
>> +long y) {
>> +  unsigned long long r;
>> +
>> +  int ovr = __builtin_uaddll_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> +
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_usub_128.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_usub_128.c
>> new file mode 100644
>> index 0000000..a84f4a4
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_usub_128.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned __int128 overflow_sub (unsigned __int128 x, unsigned 
>> +__int128 y) {
>> +  unsigned __int128 r;
>> +
>> +  int ovr = __builtin_sub_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> +/* { dg-final { scan-assembler "sbcs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_usubl.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_usubl.c
>> new file mode 100644
>> index 0000000..ed033da
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_usubl.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned long overflow_sub (unsigned long x, unsigned long y) {
>> +  unsigned long r;
>> +
>> +  int ovr = __builtin_usubl_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_usubll.c 
>> b/gcc/testsuite/gcc.target/aarch64/builtin_usubll.c
>> new file mode 100644
>> index 0000000..a742f0c
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_usubll.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" }  */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned long long overflow_sub (unsigned long long x, unsigned long 
>> +long y) {
>> +  unsigned long long r;
>> +
>> +  int ovr = __builtin_usubll_overflow (x, y, &r);  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> +
>> 
> 

  reply	other threads:[~2017-07-06  8:22 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-05-19  6:27 Michael Collison
2017-05-19 11:00 ` Christophe Lyon
2017-05-19 21:42   ` Michael Collison
2017-07-05  9:38     ` Richard Earnshaw (lists)
2017-07-06  7:29       ` Michael Collison
2017-07-06  8:22         ` Richard Earnshaw (lists) [this message]
2017-08-01  6:33       ` Michael Collison
  -- strict thread matches above, loose matches on Subject: below --
2016-11-30 23:06 Michael Collison

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f47cecda-78a3-0f62-fbb5-3c46c24edf7c@arm.com \
    --to=richard.earnshaw@arm.com \
    --cc=Michael.Collison@arm.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).