Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub operations To: Michael Collison Cc: "gcc-patches@gcc.gnu.org" From: "Richard Earnshaw (lists)" Date: Thu, 06 Jul 2017 08:22:00 -0000 X-SW-Source: 2017-07/txt/msg00307.txt.bz2 On 06/07/17 08:29, Michael Collison wrote: > Richard, > > Can you explain "Use of ne is wrong here. The condition register should > be set to the result of a compare rtl construct. The same applies > elsewhere within this patch. NE is then used on the result of the > comparison. 
The mode of the compare then indicates what might or might > not be valid in the way the comparison is finally constructed."? > > Why is "ne" wrong? I don't doubt you are correct, but I see nothing in > the internals manual that forbids it. I want to understand what issues > this exposes. > Because the idiomatic form on a machine with a flags register is CCreg:mode = COMPARE:mode (A, B) which is then used with <cond-op> (CCreg:mode, 0) where cond-op is NE, EQ, GE, ... as appropriate. > As you indicate I used this idiom in the arm port when I added the > overflow operations there as well. Additionally other targets seem to > use the comparison operators this way (i386 for the umulv). Some targets really have boolean predicate operations that set results explicitly in GP registers as the truth of A < B, etc. On those machines using pred-reg = cond-op (A, B) makes sense, but not on ARM or AArch64. R. > > Regards, > > Michael Collison > > -----Original Message----- > From: Richard Earnshaw (lists) [mailto:Richard.Earnshaw@arm.com] > Sent: Wednesday, July 5, 2017 2:38 AM > To: Michael Collison ; Christophe Lyon > > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub > operations > > On 19/05/17 22:11, Michael Collison wrote: >> Christophe, >> >> I had a typo in the two test cases: "addcs" should have been "adcs". I caught this previously but submitted the previous patch incorrectly. Updated patch attached. >> >> Okay for trunk? >> > > Apologies for the delay responding, I've been procrastinating over this > one. In part it's due to the size of the patch with very little > top-level description of what's the motivation and overall approach to > the problem. > > It would really help review if this could be split into multiple patches > with a description of what each stage achieves. > > Anyway, there are a couple of obvious formatting issues to deal with > first, before we get into the details of the patch. 
> >> -----Original Message----- >> From: Christophe Lyon [mailto:christophe.lyon@linaro.org] >> Sent: Friday, May 19, 2017 3:59 AM >> To: Michael Collison >> Cc: gcc-patches@gcc.gnu.org; nd >> Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub >> operations >> >> Hi Michael, >> >> >> On 19 May 2017 at 07:12, Michael Collison wrote: >>> Hi, >>> >>> This patch improves code generation for builtin arithmetic overflow operations for the aarch64 backend. As an example, for a simple test case such as: >>> >>> int >>> f (int x, int y, int *ovf) >>> { >>> int res; >>> *ovf = __builtin_sadd_overflow (x, y, &res); >>> return res; >>> } >>> >>> Current trunk at -O2 generates >>> >>> f: >>> mov w3, w0 >>> mov w4, 0 >>> add w0, w0, w1 >>> tbnz w1, #31, .L4 >>> cmp w0, w3 >>> blt .L3 >>> .L2: >>> str w4, [x2] >>> ret >>> .p2align 3 >>> .L4: >>> cmp w0, w3 >>> ble .L2 >>> .L3: >>> mov w4, 1 >>> b .L2 >>> >>> >>> With the patch this now generates: >>> >>> f: >>> adds w0, w0, w1 >>> cset w1, vs >>> str w1, [x2] >>> ret >>> >>> >>> Original patch from Richard Henderson: >>> >>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html >>> >>> >>> Okay for trunk? >>> >>> 2017-05-17 Michael Collison >>> Richard Henderson >>> >>> * config/aarch64/aarch64-modes.def (CC_V): New. >>> * config/aarch64/aarch64-protos.h >>> (aarch64_add_128bit_scratch_regs): Declare. >>> (aarch64_subv_128bit_scratch_regs): Declare. >>> (aarch64_expand_subvti): Declare. >>> (aarch64_gen_unlikely_cbranch): Declare. >>> * config/aarch64/aarch64.c (aarch64_select_cc_mode): Test >>> for signed overflow using CC_Vmode. >>> (aarch64_get_condition_code_1): Handle CC_Vmode. >>> (aarch64_gen_unlikely_cbranch): New function. >>> (aarch64_add_128bit_scratch_regs): New function. >>> (aarch64_subv_128bit_scratch_regs): New function. >>> (aarch64_expand_subvti): New function. >>> * config/aarch64/aarch64.md (addv4, uaddv4): New. 
>>> (addti3): Create simpler code if low part is already known to be 0. >>> (addvti4, uaddvti4): New. >>> (*add3_compareC_cconly_imm): New. >>> (*add3_compareC_cconly): New. >>> (*add3_compareC_imm): New. >>> (*add3_compareC): Rename from add3_compare1; do not >>> handle constants within this pattern. >>> (*add3_compareV_cconly_imm): New. >>> (*add3_compareV_cconly): New. >>> (*add3_compareV_imm): New. >>> (add3_compareV): New. >>> (add3_carryinC, add3_carryinV): New. >>> (*add3_carryinC_zero, *add3_carryinV_zero): New. >>> (*add3_carryinC, *add3_carryinV): New. >>> (subv4, usubv4): New. >>> (subti3): Handle op1 zero. >>> (subvti4, usubvti4): New. >>> (*sub3_compare1_imm): New. >>> (sub3_carryinCV): New. >>> (*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New. >>> (*sub3_carryinCV_z2, *sub3_carryinCV): New. >>> * testsuite/gcc.target/aarch64/builtin_sadd_128.c: New testcase. >>> * testsuite/gcc.target/aarch64/builtin_saddl.c: New testcase. >>> * testsuite/gcc.target/aarch64/builtin_saddll.c: New testcase. >>> * testsuite/gcc.target/aarch64/builtin_uadd_128.c: New testcase. >>> * testsuite/gcc.target/aarch64/builtin_uaddl.c: New testcase. >>> * testsuite/gcc.target/aarch64/builtin_uaddll.c: New testcase. >>> * testsuite/gcc.target/aarch64/builtin_ssub_128.c: New testcase. >>> * testsuite/gcc.target/aarch64/builtin_ssubl.c: New testcase. >>> * testsuite/gcc.target/aarch64/builtin_ssubll.c: New testcase. >>> * testsuite/gcc.target/aarch64/builtin_usub_128.c: New testcase. >>> * testsuite/gcc.target/aarch64/builtin_usubl.c: New testcase. >>> * testsuite/gcc.target/aarch64/builtin_usubll.c: New testcase. >> >> I've tried your patch, and 2 of the new tests FAIL: >> gcc.target/aarch64/builtin_sadd_128.c scan-assembler addcs >> gcc.target/aarch64/builtin_uadd_128.c scan-assembler addcs >> >> Am I missing something? 
>> >> Thanks, >> >> Christophe >> >> >> pr6308v2.patch >> >> >> diff --git a/gcc/config/aarch64/aarch64-modes.def >> b/gcc/config/aarch64/aarch64-modes.def >> index 45f7a44..244e490 100644 >> --- a/gcc/config/aarch64/aarch64-modes.def >> +++ b/gcc/config/aarch64/aarch64-modes.def >> @@ -24,6 +24,7 @@ CC_MODE (CC_SWP); >> CC_MODE (CC_NZ); /* Only N and Z bits of condition flags are valid. */ >> CC_MODE (CC_Z); /* Only Z bit of condition flags is valid. */ >> CC_MODE (CC_C); /* Only C bit of condition flags is valid. */ >> +CC_MODE (CC_V); /* Only V bit of condition flags is valid. */ >> >> /* Half-precision floating point for __fp16. */ FLOAT_MODE (HF, 2, >> 0); diff --git a/gcc/config/aarch64/aarch64-protos.h >> b/gcc/config/aarch64/aarch64-protos.h >> index f55d4ba..f38b2b8 100644 >> --- a/gcc/config/aarch64/aarch64-protos.h >> +++ b/gcc/config/aarch64/aarch64-protos.h >> @@ -388,6 +388,18 @@ void aarch64_relayout_simd_types (void); void >> aarch64_reset_previous_fndecl (void); bool >> aarch64_return_address_signing_enabled (void); void >> aarch64_save_restore_target_globals (tree); >> +void aarch64_add_128bit_scratch_regs (rtx op1, rtx op2, rtx *low_dest, >> + rtx *low_in1, rtx *low_in2, >> + rtx *high_dest, rtx *high_in1, >> + rtx *high_in2); >> +void aarch64_subv_128bit_scratch_regs (rtx op1, rtx op2, rtx *low_dest, >> + rtx *low_in1, rtx *low_in2, >> + rtx *high_dest, rtx *high_in1, >> + rtx *high_in2); >> +void aarch64_expand_subvti (rtx op0, rtx low_dest, rtx low_in1, >> + rtx low_in2, rtx high_dest, rtx high_in1, >> + rtx high_in2); >> + > > It's a little bit inconsistent, but the general style in > aarch64-protos.h is not to include parameter names in prototypes, just > their types. > >> >> /* Initialize builtins for SIMD intrinsics. 
*/ void >> init_aarch64_simd_builtins (void); @@ -412,6 +424,8 @@ bool >> aarch64_float_const_representable_p (rtx); >> >> #if defined (RTX_CODE) >> >> +void aarch64_gen_unlikely_cbranch (enum rtx_code, machine_mode cc_mode, >> + rtx label_ref); >> bool aarch64_legitimate_address_p (machine_mode, rtx, RTX_CODE, >> bool); machine_mode aarch64_select_cc_mode (RTX_CODE, rtx, rtx); rtx >> aarch64_gen_compare_reg (RTX_CODE, rtx, rtx); diff --git >> a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index >> f343d92..71a651c 100644 >> --- a/gcc/config/aarch64/aarch64.c >> +++ b/gcc/config/aarch64/aarch64.c >> @@ -4716,6 +4716,13 @@ aarch64_select_cc_mode (RTX_CODE code, rtx x, rtx y) >> && GET_CODE (y) == ZERO_EXTEND) >> return CC_Cmode; >> >> + /* A test for signed overflow. */ >> + if ((GET_MODE (x) == DImode || GET_MODE (x) == TImode) >> + && code == NE >> + && GET_CODE (x) == PLUS >> + && GET_CODE (y) == SIGN_EXTEND) >> + return CC_Vmode; >> + >> /* For everything else, return CCmode. */ >> return CCmode; >> } >> @@ -4822,6 +4829,15 @@ aarch64_get_condition_code_1 (enum machine_mode mode, enum rtx_code comp_code) >> } >> break; >> >> + case CC_Vmode: >> + switch (comp_code) >> + { >> + case NE: return AARCH64_VS; >> + case EQ: return AARCH64_VC; >> + default: return -1; >> + } >> + break; >> + >> default: >> return -1; >> } >> @@ -13630,6 +13646,88 @@ aarch64_split_dimode_const_store (rtx dst, rtx src) >> return true; >> } >> >> +/* Generate RTL for a conditional branch with rtx comparison CODE in >> + mode CC_MODE. The destination of the unlikely conditional branch >> + is LABEL_REF. 
*/ >> + >> +void >> +aarch64_gen_unlikely_cbranch (enum rtx_code code, machine_mode cc_mode, >> + rtx label_ref) >> +{ >> + rtx x; >> + x = gen_rtx_fmt_ee (code, VOIDmode, >> + gen_rtx_REG (cc_mode, CC_REGNUM), >> + const0_rtx); >> + >> + x = gen_rtx_IF_THEN_ELSE (VOIDmode, x, >> + gen_rtx_LABEL_REF (VOIDmode, label_ref), >> + pc_rtx); >> + aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x)); } >> + >> +void aarch64_add_128bit_scratch_regs (rtx op1, rtx op2, rtx >> +*low_dest, > > Function names must start in column 1, with the return type on the > preceding line. All functions should have a top-level comment > describing what they do (their contract with the caller). > >> + rtx *low_in1, rtx *low_in2, >> + rtx *high_dest, rtx *high_in1, >> + rtx *high_in2) >> +{ >> + *low_dest = gen_reg_rtx (DImode); >> + *low_in1 = gen_lowpart (DImode, op1); >> + *low_in2 = simplify_gen_subreg (DImode, op2, TImode, >> + subreg_lowpart_offset (DImode, TImode)); >> + *high_dest = gen_reg_rtx (DImode); >> + *high_in1 = gen_highpart (DImode, op1); >> + *high_in2 = simplify_gen_subreg (DImode, op2, TImode, >> + subreg_highpart_offset (DImode, TImode)); } >> + >> +void aarch64_subv_128bit_scratch_regs (rtx op1, rtx op2, rtx >> +*low_dest, > > Same here. > >> + rtx *low_in1, rtx *low_in2, >> + rtx *high_dest, rtx *high_in1, >> + rtx *high_in2) >> +{ >> + *low_dest = gen_reg_rtx (DImode); >> + *low_in1 = simplify_gen_subreg (DImode, op1, TImode, >> + subreg_lowpart_offset (DImode, TImode)); >> + *low_in2 = simplify_gen_subreg (DImode, op2, TImode, >> + subreg_lowpart_offset (DImode, TImode)); >> + *high_dest = gen_reg_rtx (DImode); >> + *high_in1 = simplify_gen_subreg (DImode, op1, TImode, >> + subreg_highpart_offset (DImode, TImode)); >> + *high_in2 = simplify_gen_subreg (DImode, op2, TImode, >> + subreg_highpart_offset (DImode, TImode)); >> + >> +} >> + >> +void aarch64_expand_subvti (rtx op0, rtx low_dest, rtx low_in1, > And here. 
> >> + rtx low_in2, rtx high_dest, rtx high_in1, >> + rtx high_in2) >> +{ >> + if (low_in2 == const0_rtx) >> + { >> + low_dest = low_in1; >> + emit_insn (gen_subdi3_compare1 (high_dest, high_in1, >> + force_reg (DImode, high_in2))); >> + } >> + else >> + { >> + if (CONST_INT_P (low_in2)) >> + { >> + low_in2 = force_reg (DImode, GEN_INT (-UINTVAL (low_in2))); >> + high_in2 = force_reg (DImode, high_in2); >> + emit_insn (gen_adddi3_compareC (low_dest, low_in1, low_in2)); >> + } >> + else >> + emit_insn (gen_subdi3_compare1 (low_dest, low_in1, low_in2)); >> + emit_insn (gen_subdi3_carryinCV (high_dest, >> + force_reg (DImode, high_in1), >> + high_in2)); >> + } >> + >> + emit_move_insn (gen_lowpart (DImode, op0), low_dest); >> + emit_move_insn (gen_highpart (DImode, op0), high_dest); >> + >> +} >> + >> /* Implement the TARGET_ASAN_SHADOW_OFFSET hook. */ >> >> static unsigned HOST_WIDE_INT >> diff --git a/gcc/config/aarch64/aarch64.md >> b/gcc/config/aarch64/aarch64.md index a693a3b..3976ecb 100644 >> --- a/gcc/config/aarch64/aarch64.md >> +++ b/gcc/config/aarch64/aarch64.md >> @@ -1711,25 +1711,123 @@ >> } >> ) >> >> +(define_expand "addv4" >> + [(match_operand:GPI 0 "register_operand") >> + (match_operand:GPI 1 "register_operand") >> + (match_operand:GPI 2 "register_operand") >> + (match_operand 3 "")] >> + "" >> +{ >> + emit_insn (gen_add3_compareV (operands[0], operands[1], >> +operands[2])); >> + aarch64_gen_unlikely_cbranch (NE, CC_Vmode, operands[3]); >> + >> + DONE; >> +}) >> + >> +(define_expand "uaddv4" >> + [(match_operand:GPI 0 "register_operand") >> + (match_operand:GPI 1 "register_operand") >> + (match_operand:GPI 2 "register_operand") >> + (match_operand 3 "")] > > With no rtl in the expand to describe this pattern, it really should > have a top-level comment explaining the arguments (reference to the > manual is probably OK in this case). 
> >> + "" >> +{ >> + emit_insn (gen_add3_compareC (operands[0], operands[1], >> +operands[2])); >> + aarch64_gen_unlikely_cbranch (NE, CC_Cmode, operands[3]); >> + >> + DONE; >> +}) >> + >> + >> (define_expand "addti3" >> [(set (match_operand:TI 0 "register_operand" "") >> (plus:TI (match_operand:TI 1 "register_operand" "") >> - (match_operand:TI 2 "register_operand" "")))] >> + (match_operand:TI 2 "aarch64_reg_or_imm" "")))] >> "" >> { >> - rtx low = gen_reg_rtx (DImode); >> - emit_insn (gen_adddi3_compareC (low, gen_lowpart (DImode, operands[1]), >> - gen_lowpart (DImode, operands[2]))); >> + rtx l0,l1,l2,h0,h1,h2; >> >> - rtx high = gen_reg_rtx (DImode); >> - emit_insn (gen_adddi3_carryin (high, gen_highpart (DImode, operands[1]), >> - gen_highpart (DImode, operands[2]))); >> + aarch64_add_128bit_scratch_regs (operands[1], operands[2], >> + &l0, &l1, &l2, &h0, &h1, &h2); >> + >> + if (l2 == const0_rtx) >> + { >> + l0 = l1; >> + if (!aarch64_pluslong_operand (h2, DImode)) >> + h2 = force_reg (DImode, h2); >> + emit_insn (gen_adddi3 (h0, h1, h2)); >> + } >> + else >> + { >> + emit_insn (gen_adddi3_compareC (l0, l1, force_reg (DImode, l2))); >> + emit_insn (gen_adddi3_carryin (h0, h1, force_reg (DImode, h2))); >> + } >> + >> + emit_move_insn (gen_lowpart (DImode, operands[0]), l0); >> + emit_move_insn (gen_highpart (DImode, operands[0]), h0); >> >> - emit_move_insn (gen_lowpart (DImode, operands[0]), low); >> - emit_move_insn (gen_highpart (DImode, operands[0]), high); >> DONE; >> }) >> >> +(define_expand "addvti4" >> + [(match_operand:TI 0 "register_operand" "") >> + (match_operand:TI 1 "register_operand" "") >> + (match_operand:TI 2 "aarch64_reg_or_imm" "") >> + (match_operand 3 "")] > > Same here. 
> >> + "" >> +{ >> + rtx l0,l1,l2,h0,h1,h2; >> + >> + aarch64_add_128bit_scratch_regs (operands[1], operands[2], >> + &l0, &l1, &l2, &h0, &h1, &h2); >> + >> + if (l2 == const0_rtx) >> + { >> + l0 = l1; >> + emit_insn (gen_adddi3_compareV (h0, h1, force_reg (DImode, h2))); >> + } >> + else >> + { >> + emit_insn (gen_adddi3_compareC (l0, l1, force_reg (DImode, l2))); >> + emit_insn (gen_adddi3_carryinV (h0, h1, force_reg (DImode, h2))); >> + } >> + >> + emit_move_insn (gen_lowpart (DImode, operands[0]), l0); >> + emit_move_insn (gen_highpart (DImode, operands[0]), h0); >> + >> + aarch64_gen_unlikely_cbranch (NE, CC_Vmode, operands[3]); >> + DONE; >> +}) >> + >> +(define_expand "uaddvti4" >> + [(match_operand:TI 0 "register_operand" "") >> + (match_operand:TI 1 "register_operand" "") >> + (match_operand:TI 2 "aarch64_reg_or_imm" "") >> + (match_operand 3 "")] >> + "" >> +{ >> + rtx l0,l1,l2,h0,h1,h2; >> + >> + aarch64_add_128bit_scratch_regs (operands[1], operands[2], >> + &l0, &l1, &l2, &h0, &h1, &h2); >> + >> + if (l2 == const0_rtx) >> + { >> + l0 = l1; >> + emit_insn (gen_adddi3_compareC (h0, h1, force_reg (DImode, h2))); >> + } >> + else >> + { >> + emit_insn (gen_adddi3_compareC (l0, l1, force_reg (DImode, l2))); >> + emit_insn (gen_adddi3_carryinC (h0, h1, force_reg (DImode, h2))); >> + } >> + >> + emit_move_insn (gen_lowpart (DImode, operands[0]), l0); >> + emit_move_insn (gen_highpart (DImode, operands[0]), h0); >> + >> + aarch64_gen_unlikely_cbranch (NE, CC_Cmode, operands[3]); DONE; >> + }) >> + >> (define_insn "add3_compare0" >> [(set (reg:CC_NZ CC_REGNUM) >> (compare:CC_NZ >> @@ -1828,10 +1926,70 @@ >> [(set_attr "type" "alus_sreg")] >> ) >> >> +;; Note that since we're sign-extending, match the immediate in GPI >> +;; rather than in DWI. Since CONST_INT is modeless, this works fine. 
>> +(define_insn "*add3_compareV_cconly_imm" >> + [(set (reg:CC_V CC_REGNUM) >> + (ne:CC_V >> + (plus: >> + (sign_extend: (match_operand:GPI 0 "register_operand" "r,r")) >> + (match_operand:GPI 1 "aarch64_plus_immediate" "I,J")) >> + (sign_extend: (plus:GPI (match_dup 0) (match_dup 1)))))] >> + "" >> + "@ >> + cmn\\t%0, %1 >> + cmp\\t%0, #%n1" >> + [(set_attr "type" "alus_imm")] >> +) >> + >> +(define_insn "*add3_compareV_cconly" >> + [(set (reg:CC_V CC_REGNUM) >> + (ne:CC_V > > Use of ne is wrong here. The condition register should be set to the > result of a compare rtl construct. The same applies elsewhere within > this patch. NE is then used on the result of the comparison. The mode > of the compare then indicates what might or might not be valid in the > way the comparison is finally constructed. > > Note that this issue may go back to the earlier patches that this is > based on, but those are equally incorrect and will need fixing as well at > some point. We shouldn't perpetuate the issue. 
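For readers less familiar with the flags-register idiom this comment refers to, the following is a minimal C sketch of it, not GCC code and not part of the patch. The `struct nzcv` type and the names `adds32`, `cond_vs`, `cond_cs` and `nzcv_self_check` are hypothetical, introduced only for illustration. The flag-setting step, which plays the role of the compare, produces a flags value; the condition is a separate test applied to that value afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the AArch64 NZCV condition flags.  */
struct nzcv
{
  bool n, z, c, v;
};

/* Model of a 32-bit ADDS: return the wrapped sum and record all four
   flags in *F.  This stands in for the flag-setting (compare) step.  */
uint32_t
adds32 (uint32_t a, uint32_t b, struct nzcv *f)
{
  uint32_t r = a + b;

  f->n = (r >> 31) != 0;	/* Result is negative when read as signed.  */
  f->z = r == 0;		/* Result is zero.  */
  f->c = r < a;			/* Unsigned carry out of bit 31.  */
  /* Signed overflow: both inputs differ in sign from the result.  */
  f->v = (((a ^ r) & (b ^ r)) >> 31) != 0;
  return r;
}

/* The conditions are tests on the flags, applied after the fact.  */
bool
cond_vs (const struct nzcv *f)
{
  return f->v;
}

bool
cond_cs (const struct nzcv *f)
{
  return f->c;
}

/* Small usage example for the model above.  */
void
nzcv_self_check (void)
{
  struct nzcv f;

  /* INT32_MAX + 1 overflows as a signed add but produces no carry.  */
  assert (adds32 (UINT32_C (0x7fffffff), 1, &f) == UINT32_C (0x80000000));
  assert (cond_vs (&f) && !cond_cs (&f));

  /* UINT32_MAX + 1 carries out but is not a signed overflow (-1 + 1).  */
  assert (adds32 (UINT32_C (0xffffffff), 1, &f) == 0);
  assert (cond_cs (&f) && !cond_vs (&f));
}
```

In this model `cond_vs` loosely corresponds to testing NE on a CC_V value, which the patch maps to AARCH64_VS in aarch64_get_condition_code_1, and `cond_cs` is the analogous test on the C flag. A narrower CC mode would amount to a promise that only some of the four fields are meaningful.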
> >> + (plus: >> + (sign_extend: (match_operand:GPI 0 "register_operand" "r")) >> + (sign_extend: (match_operand:GPI 1 "register_operand" "r"))) >> + (sign_extend: (plus:GPI (match_dup 0) (match_dup 1)))))] >> + "" >> + "cmn\\t%0, %1" >> + [(set_attr "type" "alus_sreg")] >> +) >> + >> +(define_insn "*add3_compareV_imm" >> + [(set (reg:CC_V CC_REGNUM) >> + (ne:CC_V >> + (plus: >> + (sign_extend: >> + (match_operand:GPI 1 "register_operand" "r,r")) >> + (match_operand:GPI 2 "aarch64_plus_immediate" "I,J")) >> + (sign_extend: >> + (plus:GPI (match_dup 1) (match_dup 2))))) >> + (set (match_operand:GPI 0 "register_operand" "=r,r") >> + (plus:GPI (match_dup 1) (match_dup 2)))] >> + "" >> + "@ >> + adds\\t%0, %1, %2 >> + subs\\t%0, %1, #%n2" >> + [(set_attr "type" "alus_imm,alus_imm")] >> +) >> + >> +(define_insn "add3_compareV" >> + [(set (reg:CC_V CC_REGNUM) >> + (ne:CC_V >> + (plus: >> + (sign_extend: (match_operand:GPI 1 "register_operand" "r")) >> + (sign_extend: (match_operand:GPI 2 "register_operand" "r"))) >> + (sign_extend: (plus:GPI (match_dup 1) (match_dup 2))))) >> + (set (match_operand:GPI 0 "register_operand" "=r") >> + (plus:GPI (match_dup 1) (match_dup 2)))] >> + "" >> + "adds\\t%0, %1, %2" >> + [(set_attr "type" "alus_sreg")] >> +) >> + >> (define_insn "*adds_shift_imm_" >> [(set (reg:CC_NZ CC_REGNUM) >> (compare:CC_NZ >> - (plus:GPI (ASHIFT:GPI >> + (plus:GPI (ASHIFT:GPI >> (match_operand:GPI 1 "register_operand" "r") >> (match_operand:QI 2 "aarch64_shift_imm_" "n")) >> (match_operand:GPI 3 "register_operand" "r")) @@ -2187,6 >> +2345,138 @@ >> [(set_attr "type" "adc_reg")] >> ) >> >> +(define_expand "add3_carryinC" >> + [(parallel >> + [(set (match_dup 3) >> + (ne:CC_C >> + (plus: >> + (plus: >> + (match_dup 4) >> + (zero_extend: >> + (match_operand:GPI 1 "register_operand" "r"))) >> + (zero_extend: >> + (match_operand:GPI 2 "register_operand" "r"))) >> + (zero_extend: >> + (plus:GPI >> + (plus:GPI (match_dup 5) (match_dup 1)) >> + (match_dup 2))))) >> 
+ (set (match_operand:GPI 0 "register_operand") >> + (plus:GPI >> + (plus:GPI (match_dup 5) (match_dup 1)) >> + (match_dup 2)))])] >> + "" >> +{ >> + operands[3] = gen_rtx_REG (CC_Cmode, CC_REGNUM); >> + operands[4] = gen_rtx_NE (mode, operands[3], const0_rtx); >> + operands[5] = gen_rtx_NE (mode, operands[3], const0_rtx); >> +}) >> + >> +(define_insn "*add3_carryinC_zero" >> + [(set (reg:CC_C CC_REGNUM) >> + (ne:CC_C >> + (plus: >> + (match_operand: 2 "aarch64_carry_operation" "") >> + (zero_extend: (match_operand:GPI 1 "register_operand" "r"))) >> + (zero_extend: >> + (plus:GPI >> + (match_operand:GPI 3 "aarch64_carry_operation" "") >> + (match_dup 1))))) >> + (set (match_operand:GPI 0 "register_operand") >> + (plus:GPI (match_dup 3) (match_dup 1)))] >> + "" >> + "adcs\\t%0, %1, zr" >> + [(set_attr "type" "adc_reg")] >> +) >> + >> +(define_insn "*add3_carryinC" >> + [(set (reg:CC_C CC_REGNUM) >> + (ne:CC_C >> + (plus: >> + (plus: >> + (match_operand: 3 "aarch64_carry_operation" "") >> + (zero_extend: (match_operand:GPI 1 "register_operand" "r"))) >> + (zero_extend: (match_operand:GPI 2 "register_operand" "r"))) >> + (zero_extend: >> + (plus:GPI >> + (plus:GPI >> + (match_operand:GPI 4 "aarch64_carry_operation" "") >> + (match_dup 1)) >> + (match_dup 2))))) >> + (set (match_operand:GPI 0 "register_operand") >> + (plus:GPI >> + (plus:GPI (match_dup 4) (match_dup 1)) >> + (match_dup 2)))] >> + "" >> + "adcs\\t%0, %1, %2" >> + [(set_attr "type" "adc_reg")] >> +) >> + >> +(define_expand "add3_carryinV" >> + [(parallel >> + [(set (reg:CC_V CC_REGNUM) >> + (ne:CC_V >> + (plus: >> + (plus: >> + (match_dup 3) >> + (sign_extend: >> + (match_operand:GPI 1 "register_operand" "r"))) >> + (sign_extend: >> + (match_operand:GPI 2 "register_operand" "r"))) >> + (sign_extend: >> + (plus:GPI >> + (plus:GPI (match_dup 4) (match_dup 1)) >> + (match_dup 2))))) >> + (set (match_operand:GPI 0 "register_operand") >> + (plus:GPI >> + (plus:GPI (match_dup 4) (match_dup 1)) >> + (match_dup 
2)))])] >> + "" >> +{ >> + rtx cc = gen_rtx_REG (CC_Cmode, CC_REGNUM); >> + operands[3] = gen_rtx_NE (mode, cc, const0_rtx); >> + operands[4] = gen_rtx_NE (mode, cc, const0_rtx); >> +}) >> + >> +(define_insn "*add3_carryinV_zero" >> + [(set (reg:CC_V CC_REGNUM) >> + (ne:CC_V >> + (plus: >> + (match_operand: 2 "aarch64_carry_operation" "") >> + (sign_extend: (match_operand:GPI 1 "register_operand" "r"))) >> + (sign_extend: >> + (plus:GPI >> + (match_operand:GPI 3 "aarch64_carry_operation" "") >> + (match_dup 1))))) >> + (set (match_operand:GPI 0 "register_operand") >> + (plus:GPI (match_dup 3) (match_dup 1)))] >> + "" >> + "adcs\\t%0, %1, zr" >> + [(set_attr "type" "adc_reg")] >> +) >> + >> +(define_insn "*add3_carryinV" >> + [(set (reg:CC_V CC_REGNUM) >> + (ne:CC_V >> + (plus: >> + (plus: >> + (match_operand: 3 "aarch64_carry_operation" "") >> + (sign_extend: (match_operand:GPI 1 "register_operand" "r"))) >> + (sign_extend: (match_operand:GPI 2 "register_operand" "r"))) >> + (sign_extend: >> + (plus:GPI >> + (plus:GPI >> + (match_operand:GPI 4 "aarch64_carry_operation" "") >> + (match_dup 1)) >> + (match_dup 2))))) >> + (set (match_operand:GPI 0 "register_operand") >> + (plus:GPI >> + (plus:GPI (match_dup 4) (match_dup 1)) >> + (match_dup 2)))] >> + "" >> + "adcs\\t%0, %1, %2" >> + [(set_attr "type" "adc_reg")] >> +) >> + >> (define_insn "*add_uxt_shift2" >> [(set (match_operand:GPI 0 "register_operand" "=rk") >> (plus:GPI (and:GPI >> @@ -2283,22 +2573,86 @@ >> (set_attr "simd" "*,yes")] >> ) >> >> +(define_expand "subv4" >> + [(match_operand:GPI 0 "register_operand") >> + (match_operand:GPI 1 "aarch64_reg_or_zero") >> + (match_operand:GPI 2 "aarch64_reg_or_zero") >> + (match_operand 3 "")] >> + "" >> +{ >> + emit_insn (gen_sub3_compare1 (operands[0], operands[1], >> +operands[2])); >> + aarch64_gen_unlikely_cbranch (NE, CC_Vmode, operands[3]); >> + >> + DONE; >> +}) >> + >> +(define_expand "usubv4" >> + [(match_operand:GPI 0 "register_operand") >> + 
(match_operand:GPI 1 "aarch64_reg_or_zero") >> + (match_operand:GPI 2 "aarch64_reg_or_zero") >> + (match_operand 3 "")] >> + "" >> +{ >> + emit_insn (gen_sub3_compare1 (operands[0], operands[1], >> +operands[2])); >> + aarch64_gen_unlikely_cbranch (LTU, CCmode, operands[3]); >> + >> + DONE; >> +}) >> + >> (define_expand "subti3" >> [(set (match_operand:TI 0 "register_operand" "") >> - (minus:TI (match_operand:TI 1 "register_operand" "") >> + (minus:TI (match_operand:TI 1 "aarch64_reg_or_zero" "") >> (match_operand:TI 2 "register_operand" "")))] >> "" >> { >> - rtx low = gen_reg_rtx (DImode); >> - emit_insn (gen_subdi3_compare1 (low, gen_lowpart (DImode, operands[1]), >> - gen_lowpart (DImode, operands[2]))); >> + rtx l0 = gen_reg_rtx (DImode); >> + rtx l1 = simplify_gen_subreg (DImode, operands[1], TImode, >> + subreg_lowpart_offset (DImode, TImode)); >> + rtx l2 = gen_lowpart (DImode, operands[2]); >> + rtx h0 = gen_reg_rtx (DImode); >> + rtx h1 = simplify_gen_subreg (DImode, operands[1], TImode, >> + subreg_highpart_offset (DImode, TImode)); >> + rtx h2 = gen_highpart (DImode, operands[2]); >> >> - rtx high = gen_reg_rtx (DImode); >> - emit_insn (gen_subdi3_carryin (high, gen_highpart (DImode, operands[1]), >> - gen_highpart (DImode, operands[2]))); >> + emit_insn (gen_subdi3_compare1 (l0, l1, l2)); emit_insn >> + (gen_subdi3_carryin (h0, h1, h2)); >> >> - emit_move_insn (gen_lowpart (DImode, operands[0]), low); >> - emit_move_insn (gen_highpart (DImode, operands[0]), high); >> + emit_move_insn (gen_lowpart (DImode, operands[0]), l0); >> + emit_move_insn (gen_highpart (DImode, operands[0]), h0); >> + DONE; >> +}) >> + >> +(define_expand "subvti4" >> + [(match_operand:TI 0 "register_operand") >> + (match_operand:TI 1 "aarch64_reg_or_zero") >> + (match_operand:TI 2 "aarch64_reg_or_imm") >> + (match_operand 3 "")] >> + "" >> +{ >> + rtx l0,l1,l2,h0,h1,h2; >> + >> + aarch64_subv_128bit_scratch_regs (operands[1], operands[2], >> + &l0, &l1, &l2, &h0, &h1, &h2); >> + 
aarch64_expand_subvti (operands[0], l0, l1, l2, h0, h1, h2); >> + >> + aarch64_gen_unlikely_cbranch (NE, CC_Vmode, operands[3]); >> + DONE; >> +}) >> + >> +(define_expand "usubvti4" >> + [(match_operand:TI 0 "register_operand") >> + (match_operand:TI 1 "aarch64_reg_or_zero") >> + (match_operand:TI 2 "aarch64_reg_or_imm") >> + (match_operand 3 "")] >> + "" >> +{ >> + rtx l0,l1,l2,h0,h1,h2; >> + >> + aarch64_subv_128bit_scratch_regs (operands[1], operands[2], >> + &l0, &l1, &l2, &h0, &h1, &h2); >> + aarch64_expand_subvti (operands[0], l0, l1, l2, h0, h1, h2); >> + >> + aarch64_gen_unlikely_cbranch (LTU, CCmode, operands[3]); >> DONE; >> }) >> >> @@ -2327,6 +2681,22 @@ >> [(set_attr "type" "alus_sreg")] >> ) >> >> +(define_insn "*sub3_compare1_imm" >> + [(set (reg:CC CC_REGNUM) >> + (compare:CC >> + (match_operand:GPI 1 "aarch64_reg_or_zero" "rZ,rZ") >> + (match_operand:GPI 2 "aarch64_plus_immediate" "I,J"))) >> + (set (match_operand:GPI 0 "register_operand" "=r,r") >> + (plus:GPI >> + (match_dup 1) >> + (match_operand:GPI 3 "aarch64_plus_immediate" "J,I")))] >> + "UINTVAL (operands[2]) == -UINTVAL (operands[3])" >> + "@ >> + subs\\t%0, %1, %2 >> + adds\\t%0, %1, %3" >> + [(set_attr "type" "alus_imm")] >> +) >> + >> (define_insn "sub3_compare1" >> [(set (reg:CC CC_REGNUM) >> (compare:CC >> @@ -2554,6 +2924,85 @@ >> [(set_attr "type" "adc_reg")] >> ) >> >> +(define_expand "sub3_carryinCV" >> + [(parallel >> + [(set (reg:CC CC_REGNUM) >> + (compare:CC >> + (sign_extend: >> + (match_operand:GPI 1 "aarch64_reg_or_zero" "rZ")) >> + (plus: >> + (sign_extend: >> + (match_operand:GPI 2 "register_operand" "r")) >> + (ltu: (reg:CC CC_REGNUM) (const_int 0))))) >> + (set (match_operand:GPI 0 "register_operand" "=r") >> + (minus:GPI >> + (minus:GPI (match_dup 1) (match_dup 2)) >> + (ltu:GPI (reg:CC CC_REGNUM) (const_int 0))))])] >> + "" >> +) >> + >> +(define_insn "*sub3_carryinCV_z1_z2" >> + [(set (reg:CC CC_REGNUM) >> + (compare:CC >> + (const_int 0) >> + (match_operand: 2 
"aarch64_borrow_operation" ""))) >> + (set (match_operand:GPI 0 "register_operand" "=r") >> + (neg:GPI (match_operand:GPI 1 "aarch64_borrow_operation" "")))] >> + "" >> + "sbcs\\t%0, zr, zr" >> + [(set_attr "type" "adc_reg")] >> +) >> + >> +(define_insn "*sub3_carryinCV_z1" >> + [(set (reg:CC CC_REGNUM) >> + (compare:CC >> + (const_int 0) >> + (plus: >> + (sign_extend: >> + (match_operand:GPI 1 "register_operand" "r")) >> + (match_operand: 2 "aarch64_borrow_operation" "")))) >> + (set (match_operand:GPI 0 "register_operand" "=r") >> + (minus:GPI >> + (neg:GPI (match_dup 1)) >> + (match_operand:GPI 3 "aarch64_borrow_operation" "")))] >> + "" >> + "sbcs\\t%0, zr, %1" >> + [(set_attr "type" "adc_reg")] >> +) >> + >> +(define_insn "*sub3_carryinCV_z2" >> + [(set (reg:CC CC_REGNUM) >> + (compare:CC >> + (sign_extend: >> + (match_operand:GPI 1 "register_operand" "r")) >> + (match_operand: 2 "aarch64_borrow_operation" ""))) >> + (set (match_operand:GPI 0 "register_operand" "=r") >> + (minus:GPI >> + (match_dup 1) >> + (match_operand:GPI 3 "aarch64_borrow_operation" "")))] >> + "" >> + "sbcs\\t%0, %1, zr" >> + [(set_attr "type" "adc_reg")] >> +) >> + >> +(define_insn "*sub3_carryinCV" >> + [(set (reg:CC CC_REGNUM) >> + (compare:CC >> + (sign_extend: >> + (match_operand:GPI 1 "register_operand" "r")) >> + (plus: >> + (sign_extend: >> + (match_operand:GPI 2 "register_operand" "r")) >> + (match_operand: 3 "aarch64_borrow_operation" "")))) >> + (set (match_operand:GPI 0 "register_operand" "=r") >> + (minus:GPI >> + (minus:GPI (match_dup 1) (match_dup 2)) >> + (match_operand:GPI 4 "aarch64_borrow_operation" "")))] >> + "" >> + "sbcs\\t%0, %1, %2" >> + [(set_attr "type" "adc_reg")] >> +) >> + >> (define_insn "*sub_uxt_shift2" >> [(set (match_operand:GPI 0 "register_operand" "=rk") >> (minus:GPI (match_operand:GPI 4 "register_operand" "rk") diff --git >> a/gcc/testsuite/gcc.target/aarch64/builtin_sadd_128.c >> b/gcc/testsuite/gcc.target/aarch64/builtin_sadd_128.c >> new file mode 
100644
>> index 0000000..0b31500
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_sadd_128.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +__int128 overflow_add (__int128 x, __int128 y)
>> +{
>> +  __int128 r;
>> +
>> +  int ovr = __builtin_add_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> +/* { dg-final { scan-assembler "adcs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_saddl.c b/gcc/testsuite/gcc.target/aarch64/builtin_saddl.c
>> new file mode 100644
>> index 0000000..9768a98
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_saddl.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +long overflow_add (long x, long y)
>> +{
>> +  long r;
>> +
>> +  int ovr = __builtin_saddl_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_saddll.c b/gcc/testsuite/gcc.target/aarch64/builtin_saddll.c
>> new file mode 100644
>> index 0000000..126a526
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_saddll.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +long long overflow_add (long long x, long long y)
>> +{
>> +  long long r;
>> +
>> +  int ovr = __builtin_saddll_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> +
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_ssub_128.c b/gcc/testsuite/gcc.target/aarch64/builtin_ssub_128.c
>> new file mode 100644
>> index 0000000..c1261e3
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_ssub_128.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +__int128 overflow_sub (__int128 x, __int128 y)
>> +{
>> +  __int128 r;
>> +
>> +  int ovr = __builtin_sub_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> +/* { dg-final { scan-assembler "sbcs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_ssubl.c b/gcc/testsuite/gcc.target/aarch64/builtin_ssubl.c
>> new file mode 100644
>> index 0000000..1040464
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_ssubl.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +long overflow_sub (long x, long y)
>> +{
>> +  long r;
>> +
>> +  int ovr = __builtin_ssubl_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_ssubll.c b/gcc/testsuite/gcc.target/aarch64/builtin_ssubll.c
>> new file mode 100644
>> index 0000000..a03df88
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_ssubll.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +long long overflow_sub (long long x, long long y)
>> +{
>> +  long long r;
>> +
>> +  int ovr = __builtin_ssubll_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> +
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_uadd_128.c b/gcc/testsuite/gcc.target/aarch64/builtin_uadd_128.c
>> new file mode 100644
>> index 0000000..c573c2a
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_uadd_128.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned __int128 overflow_add (unsigned __int128 x, unsigned
>> +__int128 y) {
>> +  unsigned __int128 r;
>> +
>> +  int ovr = __builtin_add_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> +/* { dg-final { scan-assembler "adcs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_uaddl.c b/gcc/testsuite/gcc.target/aarch64/builtin_uaddl.c
>> new file mode 100644
>> index 0000000..e325591
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_uaddl.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned long overflow_add (unsigned long x, unsigned long y)
>> +{
>> +  unsigned long r;
>> +
>> +  int ovr = __builtin_uaddl_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_uaddll.c b/gcc/testsuite/gcc.target/aarch64/builtin_uaddll.c
>> new file mode 100644
>> index 0000000..5f42886
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_uaddll.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned long long overflow_add (unsigned long long x, unsigned long
>> +long y) {
>> +  unsigned long long r;
>> +
>> +  int ovr = __builtin_uaddll_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "adds" } } */
>> +
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_usub_128.c b/gcc/testsuite/gcc.target/aarch64/builtin_usub_128.c
>> new file mode 100644
>> index 0000000..a84f4a4
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_usub_128.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned __int128 overflow_sub (unsigned __int128 x, unsigned
>> +__int128 y) {
>> +  unsigned __int128 r;
>> +
>> +  int ovr = __builtin_sub_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> +/* { dg-final { scan-assembler "sbcs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_usubl.c b/gcc/testsuite/gcc.target/aarch64/builtin_usubl.c
>> new file mode 100644
>> index 0000000..ed033da
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_usubl.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned long overflow_sub (unsigned long x, unsigned long y)
>> +{
>> +  unsigned long r;
>> +
>> +  int ovr = __builtin_usubl_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> diff --git a/gcc/testsuite/gcc.target/aarch64/builtin_usubll.c b/gcc/testsuite/gcc.target/aarch64/builtin_usubll.c
>> new file mode 100644
>> index 0000000..a742f0c
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/builtin_usubll.c
>> @@ -0,0 +1,18 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +extern void overflow_handler ();
>> +
>> +unsigned long long overflow_sub (unsigned long long x, unsigned long
>> +long y) {
>> +  unsigned long long r;
>> +
>> +  int ovr = __builtin_usubll_overflow (x, y, &r);
>> +  if (ovr)
>> +    overflow_handler ();
>> +
>> +  return r;
>> +}
>> +
>> +/* { dg-final { scan-assembler "subs" } } */
>> +
>>
>