From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) by sourceware.org (Postfix) with ESMTP id CFC543858D35 for ; Fri, 21 Aug 2020 02:09:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org CFC543858D35 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=kernel.crashing.org Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=segher@kernel.crashing.org Received: from gate.crashing.org (localhost.localdomain [127.0.0.1]) by gate.crashing.org (8.14.1/8.14.1) with ESMTP id 07L29f0d009675; Thu, 20 Aug 2020 21:09:41 -0500 Received: (from segher@localhost) by gate.crashing.org (8.14.1/8.14.1/Submit) id 07L29eIL009674; Thu, 20 Aug 2020 21:09:40 -0500 X-Authentication-Warning: gate.crashing.org: segher set sender to segher@kernel.crashing.org using -f Date: Thu, 20 Aug 2020 21:09:40 -0500 From: Segher Boessenkool To: Michael Meissner , gcc-patches@gcc.gnu.org, David Edelsohn , Bill Schmidt , Peter Bergner , Alan Modra Subject: Re: [PATCH 1/3] Power10: Add PCREL_OPT load support Message-ID: <20200821020940.GI28786@gate.crashing.org> References: <20200818063141.GA5783@ibm-toto.the-meissners.org> <20200818063401.GA11071@ibm-toto.the-meissners.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200818063401.GA11071@ibm-toto.the-meissners.org> User-Agent: Mutt/1.4.2.3i X-Spam-Status: No, score=-8.3 required=5.0 tests=BAYES_00, JMQ_SPF_NEUTRAL, KAM_DMARC_STATUS, TXREP, T_SPF_HELO_PERMERROR, T_SPF_PERMERROR autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Aug 2020 02:09:43 -0000 Hi! On Tue, Aug 18, 2020 at 02:34:01AM -0400, Michael Meissner wrote: > +// Maximum number of insns to scan between the load address and the load that Please don't mix /* and // comments. Just stick to /* comments, like all the rest of our backend? > +const int MAX_PCREL_OPT_INSNS = 10; "static const", and not tab please (just a space). > + // LWA is a DS format instruction, but LWZ is a D format instruction. We use > + // DImode for the mode to force checking whether the bottom 2 bits are 0. How so? Can't you just check if the resulting insn is accepted? That would be so much more robust (and can have a correct description more easily, too!) > + // However FPR and vector registers uses the LFIWAX instruction which is > + // indexed only. (vectors use lxsiwax) > + if (GET_CODE (mem) == SIGN_EXTEND && GET_MODE (XEXP (mem, 0)) == SImode) You're checking here whether the address of the MEM is SImode. > + { > + if (!INT_REGNO_P (reg_regno)) That should use base_reg_operand instead? > + // The optimization will only work on non-prefixed offsettable loads. > + rtx addr = XEXP (mem_inner, 0); > + enum insn_form iform = address_to_insn_form (addr, mem_mode, non_prefixed); > + if (iform != INSN_FORM_BASE_REG > + && iform != INSN_FORM_D > + && iform != INSN_FORM_DS > + && iform != INSN_FORM_DQ) > + return false; Sounds like something for a utility function? Do we use this elsewhere? > + ++pcrel_opt_next_num; Normal style is postfix increment. Empty lines around this are nice, it indicates this is the point where we decided to do the thing. Or add a comment even, maybe? > + unsigned int addr_regno = reg_or_subregno (addr_reg); > + rtx label_num = GEN_INT (pcrel_opt_next_num); > + rtx reg_di = gen_rtx_REG (DImode, reg_regno); > + > + PATTERN (addr_insn) > + = ((addr_regno != reg_regno) > + ? gen_pcrel_opt_ld_addr (addr_reg, addr_symbol, label_num, reg_di) > + : gen_pcrel_opt_ld_addr_same_reg (addr_reg, addr_symbol, label_num)); Please use if() instead of multi-line expressions. The outer parens are unnecessary, too. > + // Revalidate the insn, backing out of the optimization if the insn is not > + // supported. So the count will be incorrect then? > + INSN_CODE (addr_insn) = recog (PATTERN (addr_insn), addr_insn, 0); > + if (INSN_CODE (addr_insn) < 0) > + { > + PATTERN (addr_insn) = addr_set; > + INSN_CODE (addr_insn) = recog (PATTERN (addr_insn), addr_insn, 0); > + return false; > + } Can you use the normal undo mechanisms, instead? apply_change_group etc. > +static rtx_insn * > +next_active_insn_in_basic_block (rtx_insn *insn) > +{ > + insn = NEXT_INSN (insn); > + > + while (insn != NULL_RTX) > + { > + /* If the basic block ends or there is a jump of some kind, exit the > + loop. */ > + if (CALL_P (insn) > + || JUMP_P (insn) > + || JUMP_TABLE_DATA_P (insn) > + || LABEL_P (insn) > + || BARRIER_P (insn)) > + return NULL; > + > + /* If this is a real insn, return it. */ > + if (!insn->deleted () > + && NONJUMP_INSN_P (insn) > + && GET_CODE (PATTERN (insn)) != USE > + && GET_CODE (PATTERN (insn)) != CLOBBER) > + return insn; > + > + /* Loop for USE, CLOBBER, DEBUG_INSN, NOTEs. */ > + insn = NEXT_INSN (insn); > + } > + > + return NULL; > +} There are things to do this. Please don't write code manually parsing RTL unless you have to. > +// Validate that a load is actually a single instruction that can be optimized > +// with the PCREL_OPT optimization. > + > +static bool > +is_single_instruction (rtx_insn *insn, rtx reg) Of course it is a single instruction! A single RTL instruction... Clarify as "single machine instruction"? > +{ > + if (!REG_P (reg) && !SUBREG_P (reg)) > + return false; You need to check if is a subreg of reg, then. There are subregs of other things. Not that you should see those here, but still. > + // _Decimal128 and IBM extended double are always multiple instructions. > + machine_mode mode = GET_MODE (reg); > + if (mode == TFmode && !TARGET_IEEEQUAD) > + return false; > + > + if (mode == TDmode || mode == IFmode) > + return false; Don't we have a helper for this? The ibm128 part at least. Maybe we should just have an attribute on the insns that *can* get this optimisation? > + return (base_reg_operand (XEXP (addr, 0), Pmode) > + && satisfies_constraint_I (XEXP (addr, 1))); short_cint_operand. But maybe just rs6000_legitimate_offset_address_p? > /* Flags that need to be turned off if -mno-power10. */ > #define OTHER_POWER10_MASKS (OPTION_MASK_MMA \ > | OPTION_MASK_PCREL \ > + | OPTION_MASK_PCREL_OPT \ > | OPTION_MASK_PREFIXED) This isn't a CPU flag. Instead, it is just an optimisation option. > @@ -8515,7 +8522,10 @@ rs6000_delegitimize_address (rtx orig_x) > { > rtx x, y, offset; > > - if (GET_CODE (orig_x) == UNSPEC && XINT (orig_x, 1) == UNSPEC_FUSION_GPR) > + if (GET_CODE (orig_x) == UNSPEC > + && (XINT (orig_x, 1) == UNSPEC_FUSION_GPR > + || XINT (orig_x, 1) == UNSPEC_PCREL_OPT_LD_ADDR > + || XINT (orig_x, 1) == UNSPEC_PCREL_OPT_LD_ADDR_SAME_REG)) > orig_x = XVECEXP (orig_x, 0, 0); > > orig_x = delegitimize_mem_from_attrs (orig_x); Why this? It needs an explanation (in the code). > @@ -13197,6 +13207,19 @@ print_operand (FILE *file, rtx x, int code) > fprintf (file, "%d", 128 >> (REGNO (x) - CR0_REGNO)); > return; > > + case 'r': > + /* X is a label number for the PCREL_OPT optimization. Emit the .reloc > + to enable this optimization, unless the value is 0. */ > + gcc_assert (CONST_INT_P (x)); > + if (UINTVAL (x) != 0) > + { > + unsigned int label_num = UINTVAL (x); > + fprintf (file, > + ".reloc .Lpcrel%u-8,R_PPC64_PCREL_OPT,.-(.Lpcrel%u-8)\n\t", > + label_num, label_num); > + } > + return; Don't eat output modifiers please. We have only so few left, and we cannot recycle any more without pain. I don't see why we cannot just do this in the normal output (C) code of the one or few insns that want this? > ;; The ISA we implement. > -(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10" > +(define_attr "isa" "any,p5,p6,p7,p7v,p8v,p9,p9v,p9kf,p9tf,p10,pcrel_opt" > (const_string "any")) No. Please read the heading. Segher