From: "Richard Earnshaw (lists)"
To: Richard Biener, Andrea Corallo
Cc: nd, GCC Patches
Subject: Re: [PATCH 1/2] Add new RTX instruction class FILLER_INSN
Date: Wed, 22 Jul 2020 14:16:40 +0100

On 22/07/2020 13:24, Richard Biener via Gcc-patches wrote:
> On Wed, Jul 22, 2020 at 12:03 PM Andrea Corallo wrote:
>>
>> Hi all,
>>
>> I'd like to submit the following two patches implementing a new AArch64
>> specific back-end pass that helps optimize branch-dense code, which can
>> be a bottleneck for performance on some Arm cores. This is achieved by
>> padding out the branch-dense sections of the instruction stream with
>> nops.
>>
>> The original patch was already posted some time ago:
>>
>> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg200721.html
>>
>> This follow-up splits it into two patches as suggested, rebases on
>> master and implements the suggestions of the first code review.
>>
>> This first patch implements the addition of a new RTX instruction class
>> FILLER_INSN, which has been whitelisted to allow placement of NOPs
>> outside of a basic block. This is to allow padding after unconditional
>> branches. This is favorable so that any performance gained from
>> diluting branches is not paid straight back via excessive execution of
>> nops.
>>
>> It was deemed that a new RTX class was less invasive than modifying
>> behavior with regard to standard UNSPEC nops.
>>
>> 1/2 is a requirement for 2/2. Please see the cover letter of the latter
>> for more details on the pass itself.
>
> I wonder if such effect of instructions on the pipeline can be modeled
> in the DFA and thus whether the scheduler could issue (always ready)
> NOPs?
>
> I also wonder whether such optimization is better suited for the assembler
> which should know instruction lengths and alignment in a more precise
> way and also would know whether extra nops make immediates too large
> for pc relative things like short branches or section anchor accesses
> (or whatever else)?

No, the assembler should never spontaneously insert instructions. That
breaks the branch range calculations that the compiler relies upon.

R.

>
> Richard.
>
>> Regards
>>
>> Andrea
>>
>> gcc/ChangeLog
>>
>> 2020-07-17  Andrea Corallo
>>             Carey Williams
>>
>> * cfgbuild.c (inside_basic_block_p): Handle FILLER_INSN.
>> * cfgrtl.c (rtl_verify_bb_layout): Whitelist FILLER_INSN outside
>> basic blocks.
>> * coretypes.h: New rtx class.
>> * emit-rtl.c (emit_filler_after): New function.
>> * rtl.def (FILLER_INSN): New rtl define.
>> * rtl.h (rtx_filler_insn): Define new structure.
>> (FILLER_INSN_P): New macro.
>> (is_a_helper ::test): New test helper for
>> rtx_filler_insn.
>> (emit_filler_after): New extern.
>> * target-insns.def: Add target insn definition.
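
To make the branch-range point in the reply above concrete, here is a minimal sketch. It is not GCC code: the function names branch_offset and tbz_in_range are hypothetical, and it only models forward branches. The facts it relies on are that AArch64 instructions are 4 bytes wide and that TBZ/TBNZ encode a signed 14-bit word offset, i.e. a reach of -32768 to +32764 bytes. The compiler picks the short branch form based on the distance it computed; instructions added afterwards change that distance without the compiler knowing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* AArch64 instructions have a fixed width of 4 bytes.  */
enum { INSN_BYTES = 4 };

/* TBZ/TBNZ carry a signed 14-bit word offset, so the target must lie
   within [-32768, +32764] bytes of the branch instruction.  */
enum { TBZ_MIN_OFFSET = -32768, TBZ_MAX_OFFSET = 32764 };

/* Hypothetical helper: byte distance from a branch at instruction index
   BRANCH to a label at instruction index TARGET, with EXTRA_NOPS
   instructions inserted between them after the compiler has already
   chosen the branch form.  Forward branches only in this sketch.  */
static int64_t
branch_offset (size_t branch, size_t target, size_t extra_nops)
{
  assert (target >= branch);
  return (int64_t) (target - branch + extra_nops) * INSN_BYTES;
}

/* Hypothetical helper: true if OFFSET is encodable in a TBZ/TBNZ.  */
static bool
tbz_in_range (int64_t offset)
{
  return offset >= TBZ_MIN_OFFSET && offset <= TBZ_MAX_OFFSET;
}
```

With the branch at index 0 and its label 8190 instructions ahead, the offset is 32760 bytes and a TBZ still encodes. Two NOPs inserted in between by a later tool push the offset to 32768 bytes, one instruction past the limit, so the form the compiler selected is no longer valid.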