From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lj1-x236.google.com (mail-lj1-x236.google.com [IPv6:2a00:1450:4864:20::236]) by sourceware.org (Postfix) with ESMTPS id 8F4303858D38 for ; Thu, 16 Mar 2023 07:57:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8F4303858D38 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-lj1-x236.google.com with SMTP id a32so735505ljq.1 for ; Thu, 16 Mar 2023 00:57:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; t=1678953453; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=4Jt6DSCXcCzQWRrx4Gwt4UxYMKgwQygPjVviZ6BYOPg=; b=TYBz+6KLWmXbdldzIt+P1aGkUmPE7SizgW/T4bdMBnczaxjQrmwreIk9UUVC/9I9Mb t5sH/+yj9S7kWChdWM2TqmupiEaWlOEblyZaJ4W7irztj1PmPmJa2e25TuxCxRrNvq9O Pn3zPz2miSxPbZYko7bKegAGiwxqUP3l83x/Qo0s79XiAnhiFa0jpqSGFGYvwnNUMumi vHRxPe8WUz6nJmQS+EvCFZSpz26Ch3tJIIZiVuccdqo91QlNiW6siHa4AG+THQrWKaml LoAAhij4eqXdT50WZGVTNk+L0TOrGa9GgXrFN2fgfkI0XIzJZxFChaL6PghN8dQS1X3z /qCA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678953453; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4Jt6DSCXcCzQWRrx4Gwt4UxYMKgwQygPjVviZ6BYOPg=; b=qIk1lO7zdhgpwygGXRuJPC1WvBw819JL1rNYHim03hfDM/j4/OJxNV7K5LDMhikhSD ahjrelGf2h+xoRHpT8XJ8C5UNTJE5NbdYkPEtp+9fuRw7xAJd71NwLeDlpZJSlCPPThY elYLQfQcZbMDWoU1C53fh0VjHCYHi5x3J8PUH0f2+yYC1sQnFPw53EmPBbYlh359JnwV ysVAjGye22Y+FVIdD1E/5mqeYIN3GftvT8Ww5PI0gMywMIdwt+8Q8io8JpxuOR9giU7K 32FnLbrJRf8M6He8yTdB+jVfku+n1XgCfiwLCqJyuaM4K/fPBSBBLtHbtgRO44J2FwtO Ohhg== X-Gm-Message-State: AO0yUKVLF1LdDJUiRsHO62n8LKWMTVvZV4r99CePg39nZX8DYqxRNeqR UXqYY1nDUB3vI9mIMUmSruiX3ePIUpzMFMz8zBE= X-Google-Smtp-Source: AK7set/JOcfPs34cjhQaGZW/7ob4cprjKef474QxD4GjzRNPVYxN2/I11kEOCtYcoOnBmgK8O5XmMyDkE3ogYieRmxg= X-Received: by 2002:a2e:b5dc:0:b0:295:9626:a20f with SMTP id g28-20020a2eb5dc000000b002959626a20fmr1808367ljn.1.1678953452705; Thu, 16 Mar 2023 00:57:32 -0700 (PDT) MIME-Version: 1.0 References: <22e83da3-a81f-dd61-c04b-a39b459a965f@linux.ibm.com> In-Reply-To: <22e83da3-a81f-dd61-c04b-a39b459a965f@linux.ibm.com> From: Richard Biener Date: Thu, 16 Mar 2023 08:57:08 +0100 Message-ID: Subject: Re: [PATCH-1, rs6000] Put constant into pseudo at expand when it needs two insns [PR86106] To: HAO CHEN GUI Cc: gcc-patches , Segher Boessenkool , David , "Kewen.Lin" , Peter Bergner , "Vladimir N. Makarov" Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-7.3 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Thu, Mar 16, 2023 at 6:35=E2=80=AFAM HAO CHEN GUI via Gcc-patches wrote: > > Hi, > Currently, rs6000 directly expands to 2 insns if an integer constant is= the > second operand and it needs two insns. For example, addi/addis and ori/or= is. > It may not benefit when the constant is used for more than 2 times in an > extended basic block, just like the case in PR shows. > > One possible solution is to force the constant in pseudo at expand and = let > propagation pass and combine pass decide if the pseudo should be replaced > with the constant or not by comparing the rtx/insn cost. > > It generates a constant move if the constant is forced to a pseudo. The= re > is one constant move if it's used only once. The combine pass can combine > the constant move and add/ior/xor insn and eliminate the move as the insn > cost reduces. There are multiple moves if the constant is used for severa= l > times. In an extended basic block, these constant moves are merged to one= by > propagation pass. The combine pass can't replace the pseudo with the cons= tant > as it is no cost saving. > > In an extreme case, the constant is used twice in an extended basic blo= ck. > The cost(latency) is unchanged between putting constant in pseudo and > generating 2 insns. The dependence of instructions reduces but one more > register is used. In other case, it should be always optimal to put const= ant > in a pseudo. > > This patch changes the expander of integer add and force constant to a > pseudo when it needs 2 insn. Also a combine and split pattern is defined. So this is one way around the lack of CSE/PRE of constant operands. I'd argue that a better spot for this _might_ be LRA (split the constant out if there's a free register available), postreload-[g]cse (CSE the constants) a= nd then maybe cprop_hardreg to combine back single-use constants? I'm not sure if careful constraints massaging like adding magic letters to alternatives with constants to pessimize them for LRA, making them more expensive than spilling the constant to a register but avoid secondary reloads with spilling a register to the stack to make room for the constant, is possible - but in theory a special constraint modifier for this purpose could be invented. Richard. > Bootstrapped and tested on powerpc64-linux BE and LE with no regression= s. > > Thanks > Gui Haochen > > ChangeLog > 2023-03-14 Haochen Gui > > gcc/ > * config/rs6000/predicates.md (add_2insn_cint_operand): New predi= cate > which returns true when op is a 32-bit but not a 16-bit signed > integer constant. > * config/rs6000/rs6000.md (add3): Put the second operand in= to > register when it's a constant and need 2 add insns. > (*add_2insn): New insn_and_split for 2-insn add. > > > patch.diff > diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicat= es.md > index a1764018545..09e59a48cd3 100644 > --- a/gcc/config/rs6000/predicates.md > +++ b/gcc/config/rs6000/predicates.md > @@ -282,6 +282,13 @@ (define_predicate "s32bit_cint_operand" > (and (match_code "const_int") > (match_test "(0x80000000 + UINTVAL (op)) >> 32 =3D=3D 0"))) > > +;; Return 1 if op is a 32-bit but not 16-bit constant signed integer > +(define_predicate "add_2insn_cint_operand" > + (and (match_code "const_int") > + (and (match_operand 0 "s32bit_cint_operand") > + (and (not (match_operand 0 "short_cint_operand")) > + (not (match_operand 0 "upper16_cint_operand")))))) > + > ;; Return 1 if op is a constant 32-bit unsigned > (define_predicate "c32bit_cint_operand" > (and (match_code "const_int") > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index 6011f5bf76a..dba41e3df90 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -1796,12 +1796,44 @@ (define_expand "add3" > /* The ordering here is important for the prolog expander. > When space is allocated from the stack, adding 'low' first may > produce a temporary deallocation (which would be bad). */ > - emit_insn (gen_add3 (tmp, operands[1], GEN_INT (rest))); > - emit_insn (gen_add3 (operands[0], tmp, GEN_INT (low))); > - DONE; > + if (!can_create_pseudo_p ()) > + { > + emit_insn (gen_add3 (tmp, operands[1], GEN_INT (rest))); > + emit_insn (gen_add3 (operands[0], tmp, GEN_INT (low))); > + DONE; > + } > + > + operands[2] =3D force_reg (mode, operands[2]); > } > }) > > +/* The ordering here is important for the prolog expander. > + When space is allocated from the stack, adding 'low' first may > + produce a temporary deallocation (which would be bad). */ > + > +(define_insn_and_split "*add_2insn" > + [(set (match_operand:GPR 0 "gpc_reg_operand" "=3Db") > + (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%b") > + (match_operand:GPR 2 "add_2insn_cint_operand" "n")))] > + "!TARGET_PREFIXED" > + "#" > + "&& 1" > + [(set (match_dup 0) > + (plus:GPR (match_dup 1) > + (match_dup 3))) > + (set (match_dup 0) > + (plus:GPR (match_dup 0) > + (match_dup 4)))] > +{ > + HOST_WIDE_INT val =3D INTVAL (operands[2]); > + HOST_WIDE_INT low =3D sext_hwi (val, 16); > + HOST_WIDE_INT rest =3D trunc_int_for_mode (val - low, mode); > + > + operands[3] =3D GEN_INT (rest); > + operands[4] =3D GEN_INT (low); > +} > + [(set_attr "length" "8")]) > + > (define_insn "*add3" > [(set (match_operand:GPR 0 "gpc_reg_operand" "=3Dr,r,r,r") > (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")