From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 121257 invoked by alias); 20 May 2016 13:23:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 121239 invoked by uid 89); 20 May 2016 13:23:02 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.1 required=5.0 tests=BAYES_00,KAM_LAZY_DOMAIN_SECURITY,KAM_LOTSOFHASH,RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=fan, letter, Best, Fast X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 20 May 2016 13:22:52 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 758F634; Fri, 20 May 2016 06:23:10 -0700 (PDT) Received: from [10.2.206.43] (e100706-lin.cambridge.arm.com [10.2.206.43]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 2429B3F218; Fri, 20 May 2016 06:22:49 -0700 (PDT) Message-ID: <573F0FA8.2020609@foss.arm.com> Date: Fri, 20 May 2016 13:23:00 -0000 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: Thomas Preudhomme , Richard Earnshaw , Ramana Radhakrishnan CC: gcc-patches@gcc.gnu.org Subject: Re: [PATCH, ARM 6/7, ping1] Add support for CB(N)Z and (U|S)DIV to ARMv8-M Baseline References: <004601d138a3$6e832c10$4b898430$@foss.arm.com> <2666765.UAfDCUe3Qi@e108577-lin> In-Reply-To: <2666765.UAfDCUe3Qi@e108577-lin> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit X-SW-Source: 2016-05/txt/msg01647.txt.bz2 Hi Thomas, On 17/05/16 11:14, Thomas Preudhomme wrote: > Ping? > > *** gcc/ChangeLog *** > > 2015-11-13 Thomas Preud'homme > > * config/arm/arm.c (arm_print_operand_punct_valid_p): Make %? valid > for Thumb-1. > * config/arm/arm.h (TARGET_HAVE_CBZ): Define. > (TARGET_IDIV): Set for all Thumb targets provided they have hardware > divide feature. > * config/arm/thumb1.md (thumb1_cbz): New insn. > > > diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h > index > f42e996e5a7ce979fe406b8261d50fb2ba005f6b..347b5b0a5cc0bc1e3b5020c8124d968e76ce48a4 > 100644 > --- a/gcc/config/arm/arm.h > +++ b/gcc/config/arm/arm.h > @@ -271,9 +271,12 @@ extern void (*arm_lang_output_object_attributes_hook) > (void); > /* Nonzero if this chip provides the movw and movt instructions. */ > #define TARGET_HAVE_MOVT (arm_arch_thumb2 || arm_arch8) > > +/* Nonzero if this chip provides the cb{n}z instruction. */ > +#define TARGET_HAVE_CBZ (arm_arch_thumb2 || arm_arch8) > + > /* Nonzero if integer division instructions supported. */ > #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \ > - || (TARGET_THUMB2 && arm_arch_thumb_hwdiv)) > + || (TARGET_THUMB && arm_arch_thumb_hwdiv)) > > /* Nonzero if disallow volatile memory access in IT block. */ > #define TARGET_NO_VOLATILE_CE (arm_arch_no_volatile_ce) > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index > 13b4b71ac8f9c1da8ef1945f7ff6985ca59f6832..445972ce0e3fd27d4411840ff69e9edbb23994fc > 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -22684,7 +22684,7 @@ arm_print_operand_punct_valid_p (unsigned char code) > { > return (code == '@' || code == '|' || code == '.' > || code == '(' || code == ')' || code == '#' > - || (TARGET_32BIT && (code == '?')) > + || code == '?' > || (TARGET_THUMB2 && (code == '!')) > || (TARGET_THUMB && (code == '_'))); > } Hmm, I'm not a fan of this change. arm_print_operand_punct_valid_p is an implementation of a target hook that is used to validate user-provided inline asm as well and is therefore the right place to reject such invalid constructs. This is just working around the fact that the output template for the [u]divsi3 patterns has a '%?' in it that is illegal in Thumb1 and will not be used for ARMv8-M Baseline anyway. I'd prefer it if you add a second alternative to those patterns and emit the sdiv/udiv mnemonic without the '%?' and enable that for the v8mb arch attribute (and mark the existing alternative as requiring the "32" arch attribute). > diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md > index > 4572456b8bc98503061846cad94bc642943db3a2..1b01ef6ce731fe3ff37c3d8c048fb9d5e7829b35 > 100644 > --- a/gcc/config/arm/thumb1.md > +++ b/gcc/config/arm/thumb1.md > @@ -973,6 +973,92 @@ > DONE; > }) > > +;; A pattern for the cb(n)z instruction added in ARMv8-M baseline profile, > +;; adapted from cbranchsi4_insn. Modifying cbranchsi4_insn instead leads to > +;; code generation difference for ARMv6-M because the minimum length of the > +;; instruction becomes 2 even for it due to a limitation in genattrtab's > +;; handling of pc in the length condition. > +(define_insn "thumb1_cbz" > + [(set (pc) (if_then_else > + (match_operator 0 "equality_operator" > + [(match_operand:SI 1 "s_register_operand" "l") > + (const_int 0)]) > + (label_ref (match_operand 2 "" "")) > + (pc)))] > + "TARGET_THUMB1 && TARGET_HAVE_MOVT" > +{ s/TARGET_HAVE_MOVT/TARGET_HAVE_CBZ/ > + if (get_attr_length (insn) == 2) > + { > + if (GET_CODE (operands[0]) == EQ) > + return "cbz\t%1, %l2"; > + else > + return "cbnz\t%1, %l2"; > + } > + else > + { > + rtx t = cfun->machine->thumb1_cc_insn; > + if (t != NULL_RTX) > + { > + if (!rtx_equal_p (cfun->machine->thumb1_cc_op0, operands[1]) > + || !rtx_equal_p (cfun->machine->thumb1_cc_op1, operands[2])) > + t = NULL_RTX; > + if (cfun->machine->thumb1_cc_mode == CC_NOOVmode) > + { > + if (!noov_comparison_operator (operands[0], VOIDmode)) > + t = NULL_RTX; > + } > + else if (cfun->machine->thumb1_cc_mode != CCmode) > + t = NULL_RTX; > + } > + if (t == NULL_RTX) > + { > + output_asm_insn ("cmp\t%1, #0", operands); > + cfun->machine->thumb1_cc_insn = insn; > + cfun->machine->thumb1_cc_op0 = operands[1]; > + cfun->machine->thumb1_cc_op1 = operands[2]; > + cfun->machine->thumb1_cc_mode = CCmode; > + } > + else > + /* Ensure we emit the right type of condition code on the jump. */ > + XEXP (operands[0], 0) = gen_rtx_REG (cfun->machine->thumb1_cc_mode, > + CC_REGNUM); > + > + switch (get_attr_length (insn)) > + { > + case 4: return "b%d0\t%l2"; > + case 6: return "b%D0\t.LCB%=;b\t%l2\t%@long jump\n.LCB%=:"; > + case 8: return "b%D0\t.LCB%=;bl\t%l2\t%@far jump\n.LCB%=:"; > + default: gcc_unreachable (); > + } > + } > +} > + [(set (attr "far_jump") > + (if_then_else > + (eq_attr "length" "8") > + (const_string "yes") > + (const_string "no"))) > + (set (attr "length") > + (if_then_else > + (and (ge (minus (match_dup 2) (pc)) (const_int 2)) > + (le (minus (match_dup 2) (pc)) (const_int 128)) > + (not (match_test "which_alternative"))) This pattern only has one alternative so "which_alternative" will always be 0, so the (not (match_test "which_alternative")) test inside the 'and' is redundant and can be removed. Thanks, Kyrill > + (const_int 2) > + (if_then_else > + (and (ge (minus (match_dup 2) (pc)) (const_int -250)) > + (le (minus (match_dup 2) (pc)) (const_int 256))) > + (const_int 4) > + (if_then_else > + (and (ge (minus (match_dup 2) (pc)) (const_int -2040)) > + (le (minus (match_dup 2) (pc)) (const_int 2048))) > + (const_int 6) > + (const_int 8))))) > + (set (attr "type") > + (if_then_else > + (eq_attr "length" "2") > + (const_string "branch") > + (const_string "multiple")))] > +) > + > (define_insn "cbranchsi4_insn" > [(set (pc) (if_then_else > (match_operator 0 "arm_comparison_operator" > > > Best regards, > > Thomas > > On Thursday 17 December 2015 16:17:52 Thomas Preud'homme wrote: >> Hi, >> >> This patch is part of a patch series to add support for ARMv8-M[1] to GCC. >> This specific patch makes the compiler start generating code with the new >> CB(N)Z and (U|S)DIV instructions for ARMv8-M Baseline. >> >> Sharing of instruction patterns for div insn template with ARM or Thumb-2 >> was done by allowing %? punctuation character for Thumb-1. This is safe to >> do since the compiler would fault in arm_print_condition if a condition >> code is not handled by a branch in Thumb1. >> >> Unfortunately, cbz cannot be shared with cbranchsi4 because it would lead to >> worse code for Thumb-1. Indeed, choosing cb(n)z over the other alternatives >> for cbranchsi4 depends on the distance between target and pc which lead >> insn-attrtab to evaluate the minimum length of this pattern to be 2 as it >> cannot computer the distance statically. It would be possible to determine >> that this alternative is not available for non ARMv8-M Thumb-1 target >> statically but genattrtab is not currently capable to do it, so this is for >> a later patch. >> >> >> [1] For a quick overview of ARMv8-M please refer to the initial cover >> letter. >> >> ChangeLog entry is as follows: >> >> >> *** gcc/ChangeLog *** >> >> 2015-11-13 Thomas Preud'homme >> >> * config/arm/arm.c (arm_print_operand_punct_valid_p): Make %? valid >> for Thumb-1. >> * config/arm/arm.h (TARGET_HAVE_CBZ): Define. >> (TARGET_IDIV): Set for all Thumb targets provided they have hardware >> divide feature. >> * config/arm/thumb1.md (thumb1_cbz): New insn. >> >> >> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h >> index 015df50..247f144 100644 >> --- a/gcc/config/arm/arm.h >> +++ b/gcc/config/arm/arm.h >> @@ -263,9 +263,12 @@ extern void >> (*arm_lang_output_object_attributes_hook)(void); /* Nonzero if this chip >> provides the movw and movt instructions. */ #define >> TARGET_HAVE_MOVT (arm_arch_thumb2 || arm_arch8) >> >> +/* Nonzero if this chip provides the cb{n}z instruction. */ >> +#define TARGET_HAVE_CBZ (arm_arch_thumb2 || arm_arch8) >> + >> /* Nonzero if integer division instructions supported. */ >> #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \ >> - || (TARGET_THUMB2 && arm_arch_thumb_hwdiv)) >> + || (TARGET_THUMB && arm_arch_thumb_hwdiv)) >> >> /* Nonzero if disallow volatile memory access in IT block. */ >> #define TARGET_NO_VOLATILE_CE (arm_arch_no_volatile_ce) >> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c >> index d832309..5ef3a1d 100644 >> --- a/gcc/config/arm/arm.c >> +++ b/gcc/config/arm/arm.c >> @@ -22568,7 +22568,7 @@ arm_print_operand_punct_valid_p (unsigned char code) >> { >> return (code == '@' || code == '|' || code == '.' >> >> || code == '(' || code == ')' || code == '#' >> >> - || (TARGET_32BIT && (code == '?')) >> + || code == '?' >> >> || (TARGET_THUMB2 && (code == '!')) >> || (TARGET_THUMB && (code == '_'))); >> >> } >> diff --git a/gcc/config/arm/thumb1.md b/gcc/config/arm/thumb1.md >> index 7e3bcb4..074b267 100644 >> --- a/gcc/config/arm/thumb1.md >> +++ b/gcc/config/arm/thumb1.md >> @@ -973,6 +973,92 @@ >> DONE; >> }) >> >> +;; A pattern for the cb(n)z instruction added in ARMv8-M baseline profile, >> +;; adapted from cbranchsi4_insn. Modifying cbranchsi4_insn instead leads >> to +;; code generation difference for ARMv6-M because the minimum length of >> the +;; instruction becomes 2 even for it due to a limitation in >> genattrtab's +;; handling of pc in the length condition. >> +(define_insn "thumb1_cbz" >> + [(set (pc) (if_then_else >> + (match_operator 0 "equality_operator" >> + [(match_operand:SI 1 "s_register_operand" "l") >> + (const_int 0)]) >> + (label_ref (match_operand 2 "" "")) >> + (pc)))] >> + "TARGET_THUMB1 && TARGET_HAVE_MOVT" >> +{ >> + if (get_attr_length (insn) == 2) >> + { >> + if (GET_CODE (operands[0]) == EQ) >> + return "cbz\t%1, %l2"; >> + else >> + return "cbnz\t%1, %l2"; >> + } >> + else >> + { >> + rtx t = cfun->machine->thumb1_cc_insn; >> + if (t != NULL_RTX) >> + { >> + if (!rtx_equal_p (cfun->machine->thumb1_cc_op0, operands[1]) >> + || !rtx_equal_p (cfun->machine->thumb1_cc_op1, operands[2])) >> + t = NULL_RTX; >> + if (cfun->machine->thumb1_cc_mode == CC_NOOVmode) >> + { >> + if (!noov_comparison_operator (operands[0], VOIDmode)) >> + t = NULL_RTX; >> + } >> + else if (cfun->machine->thumb1_cc_mode != CCmode) >> + t = NULL_RTX; >> + } >> + if (t == NULL_RTX) >> + { >> + output_asm_insn ("cmp\t%1, #0", operands); >> + cfun->machine->thumb1_cc_insn = insn; >> + cfun->machine->thumb1_cc_op0 = operands[1]; >> + cfun->machine->thumb1_cc_op1 = operands[2]; >> + cfun->machine->thumb1_cc_mode = CCmode; >> + } >> + else >> + /* Ensure we emit the right type of condition code on the jump. */ >> + XEXP (operands[0], 0) = gen_rtx_REG (cfun->machine->thumb1_cc_mode, >> + CC_REGNUM); >> + >> + switch (get_attr_length (insn)) >> + { >> + case 4: return "b%d0\t%l2"; >> + case 6: return "b%D0\t.LCB%=;b\t%l2\t%@long jump\n.LCB%=:"; >> + case 8: return "b%D0\t.LCB%=;bl\t%l2\t%@far jump\n.LCB%=:"; >> + default: gcc_unreachable (); >> + } >> + } >> +} >> + [(set (attr "far_jump") >> + (if_then_else >> + (eq_attr "length" "8") >> + (const_string "yes") >> + (const_string "no"))) >> + (set (attr "length") >> + (if_then_else >> + (and (ge (minus (match_dup 2) (pc)) (const_int 2)) >> + (le (minus (match_dup 2) (pc)) (const_int 128)) >> + (not (match_test "which_alternative"))) >> + (const_int 2) >> + (if_then_else >> + (and (ge (minus (match_dup 2) (pc)) (const_int -250)) >> + (le (minus (match_dup 2) (pc)) (const_int 256))) >> + (const_int 4) >> + (if_then_else >> + (and (ge (minus (match_dup 2) (pc)) (const_int -2040)) >> + (le (minus (match_dup 2) (pc)) (const_int 2048))) >> + (const_int 6) >> + (const_int 8))))) >> + (set (attr "type") >> + (if_then_else >> + (eq_attr "length" "2") >> + (const_string "branch") >> + (const_string "multiple")))] >> +) >> + >> (define_insn "cbranchsi4_insn" >> [(set (pc) (if_then_else >> (match_operator 0 "arm_comparison_operator" >> >> >> Testing: >> >> * Toolchain was built successfully with and without the ARMv8-M support >> patches with the following multilib list: >> armv6-m,armv7-m,armv7e-m,cortex-m7. The code generation for crtbegin.o, >> crtend.o, crti.o, crtn.o, libgcc.a, libgcov.a, libc.a, libg.a, >> libgloss-linux.a, libm.a, libnosys.a, librdimon.a, librdpmon.a, libstdc++.a >> and libsupc++.a is unchanged for all these targets. >> >> * GCC also showed no testsuite regression when targeting ARMv8-M Baseline >> compared to ARMv6-M on ARM Fast Models and when targeting ARMv6-M and >> ARMv7-M (compared to without the patch) * GCC was bootstrapped successfully >> targeting Thumb-1 and targeting Thumb-2 >> >> Is this ok for stage3? >> >> Best regards, >> >> Thomas