From mboxrd@z Thu Jan 1 00:00:00 1970
From: Richard Henderson
To: gcc-patches@gcc.gnu.org
Cc: richard.sandiford@arm.com, segher@kernel.crashing.org, richard.earnshaw@arm.com, Wilco.Dijkstra@arm.com, marcus.shawcroft@arm.com, kyrylo.tkachov@arm.com
Subject: [PATCH v2 05/11] aarch64: Use UNSPEC_SBCS for subtract-with-borrow + output flags
Date: Thu, 2 Apr 2020 11:53:47 -0700
Message-Id: <20200402185353.11047-6-richard.henderson@linaro.org>
In-Reply-To: <20200402185353.11047-1-richard.henderson@linaro.org>
References: <20200402185353.11047-1-richard.henderson@linaro.org>

The rtl description of signed/unsigned overflow from subtract
was fine, as far as it goes -- we have CC_Cmode and CC_Vmode
that indicate that only those particular bits are valid.

However, it's not clear how to extend that description to
handle signed comparison, where N == V (GE) and N != V (LT) are
the only valid bits.

Using an UNSPEC means that we can unify all 3 usages without
fear that combine will try to infer anything from the rtl.
It also means we need far fewer variants when various inputs
have constants propagated in, and the rtl folds.

Accept -1 for the second input by using ADCS.
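
The last point follows from how the two instructions are defined:
SBCS computes Rn + NOT(Rm) + C, and NOT(-1) is 0, so a second input
of -1 should give the same result and flags as ADCS with the zero
register.  A minimal C sketch of that arithmetic is below.  It is only
an illustrative model: the names add_with_carry, adcs, sbcs, struct
nzcv and struct alu_result are made up here and are not part of GCC or
of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only: these names are hypothetical, not GCC code.  */
struct nzcv { bool n, z, c, v; };
struct alu_result { uint64_t value; struct nzcv flags; };

/* 64-bit add with carry-in, the shared core of ADCS and SBCS.  */
struct alu_result add_with_carry (uint64_t x, uint64_t y, bool carry_in)
{
  uint64_t partial = x + y;
  uint64_t sum = partial + (carry_in ? 1 : 0);
  struct alu_result r;
  r.value = sum;
  r.flags.n = (sum >> 63) != 0;
  r.flags.z = (sum == 0);
  /* Unsigned carry out of bit 63, from either addition step.  */
  r.flags.c = (partial < x) || (sum < partial);
  /* Signed overflow: the inputs agree in sign, the result disagrees.  */
  r.flags.v = ((~(x ^ y) & (x ^ sum)) >> 63) != 0;
  return r;
}

/* ADCS: Rn + Rm + C.  */
struct alu_result adcs (uint64_t rn, uint64_t rm, bool c)
{
  return add_with_carry (rn, rm, c);
}

/* SBCS: Rn + NOT(Rm) + C, i.e. Rn - Rm - (1 - C).  */
struct alu_result sbcs (uint64_t rn, uint64_t rm, bool c)
{
  return add_with_carry (rn, ~rm, c);
}

bool same_result (struct alu_result a, struct alu_result b)
{
  return a.value == b.value
	 && a.flags.n == b.flags.n && a.flags.z == b.flags.z
	 && a.flags.c == b.flags.c && a.flags.v == b.flags.v;
}

/* For one input, check that SBCS with -1 matches ADCS with zero.  */
bool sbcs_minus1_matches_adcs_zero (uint64_t rn, bool c)
{
  return same_result (sbcs (rn, UINT64_MAX, c), adcs (rn, 0, c));
}

/* A few spot checks, including the signed and unsigned extremes.  */
void check_examples (void)
{
  const uint64_t inputs[] = { 0, 1, UINT64_MAX, (uint64_t) INT64_MAX,
			      (uint64_t) 1 << 63 };
  for (unsigned i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
    {
      assert (sbcs_minus1_matches_adcs_zero (inputs[i], false));
      assert (sbcs_minus1_matches_adcs_zero (inputs[i], true));
    }
}
```

Calling check_examples () exercises both carry-in values.  In this
model C set means "no borrow", matching the AArch64 convention.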

	* config/aarch64/aarch64.md (UNSPEC_SBCS): New.
	(cmp3_carryin): New expander.
	(sub3_carryin_cmp): New expander.
	(*cmp3_carryin): New pattern.
	(*cmp3_carryin_0): New pattern.
	(*sub3_carryin_cmp): New pattern.
	(*sub3_carryin_cmp_0): New pattern.
	(subvti4, usubvti4, negvti3): Use subdi3_carryin_cmp.
	(negvdi_carryinV): Remove.
	(usub3_carryinC): Remove.
	(*usub3_carryinC): Remove.
	(*usub3_carryinC_z1): Remove.
	(*usub3_carryinC_z2): Remove.
	(sub3_carryinV): Remove.
	(*sub3_carryinV): Remove.
	(*sub3_carryinV_z2): Remove.
	* config/aarch64/predicates.md (aarch64_reg_zero_minus1): New.
---
 gcc/config/aarch64/aarch64.md    | 217 +++++++++++++------------------
 gcc/config/aarch64/predicates.md |   7 +
 2 files changed, 94 insertions(+), 130 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 532c114a42e..564dea390be 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -281,6 +281,7 @@
     UNSPEC_GEN_TAG_RND ; Generate a random 4-bit MTE tag.
     UNSPEC_TAG_SPACE ; Translate address to MTE tag address space.
     UNSPEC_LD1RO
+    UNSPEC_SBCS
 ])
 
 (define_c_enum "unspecv" [
@@ -2942,7 +2943,7 @@
   aarch64_expand_addsubti (operands[0], operands[1], operands[2],
                            CODE_FOR_subvdi_insn,
                            CODE_FOR_subdi3_compare1,
-                           CODE_FOR_subdi3_carryinV);
+                           CODE_FOR_subdi3_carryin_cmp);
   aarch64_gen_unlikely_cbranch (NE, CC_Vmode, operands[3]);
   DONE;
 })
@@ -2957,7 +2958,7 @@
   aarch64_expand_addsubti (operands[0], operands[1], operands[2],
                            CODE_FOR_subdi3_compare1,
                            CODE_FOR_subdi3_compare1,
-                           CODE_FOR_usubdi3_carryinC);
+                           CODE_FOR_subdi3_carryin_cmp);
   aarch64_gen_unlikely_cbranch (LTU, CCmode, operands[3]);
   DONE;
 })
@@ -2968,12 +2969,14 @@
    (label_ref (match_operand 2 "" ""))]
   ""
   {
-    emit_insn (gen_negdi_carryout (gen_lowpart (DImode, operands[0]),
-                                   gen_lowpart (DImode, operands[1])));
-    emit_insn (gen_negvdi_carryinV (gen_highpart (DImode, operands[0]),
-                                    gen_highpart (DImode, operands[1])));
-    aarch64_gen_unlikely_cbranch (NE, CC_Vmode, operands[2]);
+    rtx op0l = gen_lowpart (DImode, operands[0]);
+    rtx op1l = gen_lowpart (DImode, operands[1]);
+    rtx op0h = gen_highpart (DImode, operands[0]);
+    rtx op1h = gen_highpart (DImode, operands[1]);
+    emit_insn (gen_negdi_carryout (op0l, op1l));
+    emit_insn (gen_subdi3_carryin_cmp (op0h, const0_rtx, op1h));
+    aarch64_gen_unlikely_cbranch (NE, CC_Vmode, operands[2]);
     DONE;
   }
 )
@@ -2989,23 +2992,6 @@
   [(set_attr "type" "alus_sreg")]
 )
 
-(define_insn "negvdi_carryinV"
-  [(set (reg:CC_V CC_REGNUM)
-        (compare:CC_V
-          (neg:TI (plus:TI
-                    (ltu:TI (reg:CC CC_REGNUM) (const_int 0))
-                    (sign_extend:TI (match_operand:DI 1 "register_operand" "r"))))
-          (sign_extend:TI
-            (neg:DI (plus:DI (ltu:DI (reg:CC CC_REGNUM) (const_int 0))
-                             (match_dup 1))))))
-   (set (match_operand:DI 0 "register_operand" "=r")
-        (neg:DI (plus:DI (ltu:DI (reg:CC CC_REGNUM) (const_int 0))
-                         (match_dup 1))))]
-  ""
-  "ngcs\\t%0, %1"
-  [(set_attr "type" "alus_sreg")]
-)
-
 (define_insn "*sub3_compare0"
   [(set (reg:CC_NZ CC_REGNUM)
         (compare:CC_NZ (minus:GPI (match_operand:GPI 1 "register_operand" "rk")
@@ -3370,134 +3356,105 @@
   [(set_attr "type" "adc_reg")]
 )
 
-(define_expand "usub3_carryinC"
+(define_expand "sub3_carryin_cmp"
   [(parallel
-     [(set (reg:CC CC_REGNUM)
-           (compare:CC
-             (zero_extend:
-               (match_operand:GPI 1 "aarch64_reg_or_zero"))
-             (plus:
-               (zero_extend:
-                 (match_operand:GPI 2 "register_operand"))
-               (ltu: (reg:CC CC_REGNUM) (const_int 0)))))
-      (set (match_operand:GPI 0 "register_operand")
-           (minus:GPI
-             (minus:GPI (match_dup 1) (match_dup 2))
-             (ltu:GPI (reg:CC CC_REGNUM) (const_int 0))))])]
+    [(set (match_dup 3)
+          (unspec:CC
+            [(match_operand:GPI 1 "aarch64_reg_or_zero")
+             (match_operand:GPI 2 "aarch64_reg_zero_minus1")
+             (match_dup 4)]
+            UNSPEC_SBCS))
+     (set (match_operand:GPI 0 "register_operand" "=r")
+          (unspec:GPI
+            [(match_dup 1) (match_dup 2) (match_dup 4)]
+            UNSPEC_SBCS))])]
    ""
+  {
+    operands[3] = gen_rtx_REG (CCmode, CC_REGNUM);
+    operands[4] = gen_rtx_LTU (mode, operands[3], const0_rtx);
+  }
 )
 
-(define_insn "*usub3_carryinC_z1"
+(define_insn "*sub3_carryin_cmp"
   [(set (reg:CC CC_REGNUM)
-        (compare:CC
-          (const_int 0)
-          (plus:
-            (zero_extend:
-              (match_operand:GPI 1 "register_operand" "r"))
-            (match_operand: 2 "aarch64_borrow_operation" ""))))
-   (set (match_operand:GPI 0 "register_operand" "=r")
-        (minus:GPI
-          (neg:GPI (match_dup 1))
-          (match_operand:GPI 3 "aarch64_borrow_operation" "")))]
+        (unspec:CC
+          [(match_operand:GPI 1 "aarch64_reg_or_zero" "rZ,rZ")
+           (match_operand:GPI 2 "aarch64_reg_zero_minus1" "rZ,UsM")
+           (match_operand:GPI 3 "aarch64_borrow_operation" "")]
+          UNSPEC_SBCS))
+   (set (match_operand:GPI 0 "register_operand" "=r,r")
+        (unspec:GPI
+          [(match_dup 1) (match_dup 2) (match_dup 3)]
+          UNSPEC_SBCS))]
    ""
-  "sbcs\\t%0, zr, %1"
+  "@
+   sbcs\\t%0, %1, %2
+   adcs\\t%0, %1, zr"
   [(set_attr "type" "adc_reg")]
 )
 
-(define_insn "*usub3_carryinC_z2"
+(define_expand "cmp3_carryin"
   [(set (reg:CC CC_REGNUM)
-        (compare:CC
-          (zero_extend:
-            (match_operand:GPI 1 "register_operand" "r"))
-          (match_operand: 2 "aarch64_borrow_operation" "")))
-   (set (match_operand:GPI 0 "register_operand" "=r")
-        (minus:GPI
-          (match_dup 1)
-          (match_operand:GPI 3 "aarch64_borrow_operation" "")))]
+        (unspec:CC
+          [(match_operand:GPI 0 "aarch64_reg_or_zero")
+           (match_operand:GPI 1 "aarch64_reg_zero_minus1")
+           (ltu:GPI (reg:CC CC_REGNUM) (const_int 0))]
+          UNSPEC_SBCS))]
    ""
-  "sbcs\\t%0, %1, zr"
-  [(set_attr "type" "adc_reg")]
 )
 
-(define_insn "*usub3_carryinC"
+(define_insn "*cmp3_carryin"
   [(set (reg:CC CC_REGNUM)
-        (compare:CC
-          (zero_extend:
-            (match_operand:GPI 1 "register_operand" "r"))
-          (plus:
-            (zero_extend:
-              (match_operand:GPI 2 "register_operand" "r"))
-            (match_operand: 3 "aarch64_borrow_operation" ""))))
-   (set (match_operand:GPI 0 "register_operand" "=r")
-        (minus:GPI
-          (minus:GPI (match_dup 1) (match_dup 2))
-          (match_operand:GPI 4 "aarch64_borrow_operation" "")))]
+        (unspec:CC
+          [(match_operand:GPI 0 "aarch64_reg_or_zero" "rZ,rZ")
+           (match_operand:GPI 1 "aarch64_reg_zero_minus1" "rZ,UsM")
+           (match_operand:GPI 2 "aarch64_borrow_operation" "")]
+          UNSPEC_SBCS))]
    ""
-  "sbcs\\t%0, %1, %2"
+  "@
+   sbcs\\tzr, %0, %1
+   adcs\\tzr, %0, zr"
   [(set_attr "type" "adc_reg")]
 )
 
-(define_expand "sub3_carryinV"
-  [(parallel
-     [(set (reg:CC_V CC_REGNUM)
-           (compare:CC_V
-             (minus:
-               (sign_extend:
-                 (match_operand:GPI 1 "aarch64_reg_or_zero"))
-               (plus:
-                 (sign_extend:
-                   (match_operand:GPI 2 "register_operand"))
-                 (ltu: (reg:CC CC_REGNUM) (const_int 0))))
-             (sign_extend:
-               (minus:GPI (match_dup 1)
-                          (plus:GPI (ltu:GPI (reg:CC CC_REGNUM) (const_int 0))
-                                    (match_dup 2))))))
-      (set (match_operand:GPI 0 "register_operand")
-           (minus:GPI
-             (minus:GPI (match_dup 1) (match_dup 2))
-             (ltu:GPI (reg:CC CC_REGNUM) (const_int 0))))])]
-   ""
+;; If combine can show that the borrow is 0, fold SBCS to SUBS.
+(define_insn_and_split "*sub3_carryin_cmp_0"
+  [(set (reg:CC CC_REGNUM)
+        (unspec:CC
+          [(match_operand:GPI 1 "aarch64_reg_or_zero" "rk,rkZ")
+           (match_operand:GPI 2 "aarch64_plus_immediate" "rIJ,r")
+           (const_int 0)]
+          UNSPEC_SBCS))
+   (set (match_operand:GPI 0 "register_operand")
+        (unspec:GPI
+          [(match_dup 1) (match_dup 2) (const_int 0)]
+          UNSPEC_SBCS))]
+  ""
+  "#"
+  ""
+  [(scratch)]
+  {
+    emit_insn (gen_sub3_compare1 (operands[0], operands[1],
+                                  operands[2]));
+    DONE;
+  }
 )
 
-(define_insn "*sub3_carryinV_z2"
-  [(set (reg:CC_V CC_REGNUM)
-        (compare:CC_V
-          (minus:
-            (sign_extend: (match_operand:GPI 1 "register_operand" "r"))
-            (match_operand: 2 "aarch64_borrow_operation" ""))
-          (sign_extend:
-            (minus:GPI (match_dup 1)
-                       (match_operand:GPI 3 "aarch64_borrow_operation" "")))))
-   (set (match_operand:GPI 0 "register_operand" "=r")
-        (minus:GPI
-          (match_dup 1) (match_dup 3)))]
+(define_insn_and_split "*cmp3_carryin_0"
+  [(set (reg:CC CC_REGNUM)
+        (unspec:CC
+          [(match_operand:GPI 0 "aarch64_reg_or_zero" "rk,rZ")
+           (match_operand:GPI 1 "aarch64_plus_operand" "rIJ,r")
+           (const_int 0)]
+          UNSPEC_SBCS))]
    ""
-  "sbcs\\t%0, %1, zr"
-  [(set_attr "type" "adc_reg")]
-)
-
-(define_insn "*sub3_carryinV"
-  [(set (reg:CC_V CC_REGNUM)
-        (compare:CC_V
-          (minus:
-            (sign_extend:
-              (match_operand:GPI 1 "register_operand" "r"))
-            (plus:
-              (sign_extend:
-                (match_operand:GPI 2 "register_operand" "r"))
-              (match_operand: 3 "aarch64_borrow_operation" "")))
-          (sign_extend:
-            (minus:GPI
-              (match_dup 1)
-              (plus:GPI (match_operand:GPI 4 "aarch64_borrow_operation" "")
-                        (match_dup 2))))))
-   (set (match_operand:GPI 0 "register_operand" "=r")
-        (minus:GPI
-          (minus:GPI (match_dup 1) (match_dup 2))
-          (match_dup 4)))]
+  "#"
    ""
-  "sbcs\\t%0, %1, %2"
-  [(set_attr "type" "adc_reg")]
+  [(scratch)]
+  {
+    emit_insn (gen_cmp (operands[0], operands[1]));
+    DONE;
+  }
 )
 
 (define_insn "*sub_uxt_shift2"
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 215fcec5955..5f44ef7d672 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -68,6 +68,13 @@
   (ior (match_operand 0 "register_operand")
        (match_test "op == CONST0_RTX (GET_MODE (op))"))))
 
+(define_predicate "aarch64_reg_zero_minus1"
+  (and (match_code "reg,subreg,const_int")
+       (ior (match_operand 0 "register_operand")
+            (ior (match_test "op == CONST0_RTX (GET_MODE (op))")
+                 (match_test "op == CONSTM1_RTX (GET_MODE (op))")))))
+
+
 (define_predicate "aarch64_reg_or_fp_zero"
   (ior (match_operand 0 "register_operand")
        (and (match_code "const_double")
-- 
2.20.1