From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf1-x433.google.com (mail-pf1-x433.google.com [IPv6:2607:f8b0:4864:20::433]) by sourceware.org (Postfix) with ESMTPS id 24E003854551 for ; Fri, 18 Nov 2022 21:13:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 24E003854551 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-pf1-x433.google.com with SMTP id v28so6009498pfi.12 for ; Fri, 18 Nov 2022 13:13:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=mvy7BRVKqHOuUx+Vnrk7npc9FrTXb8+2fijGRMzqBhc=; b=bkXa6DMr2oiGu7Q4GZvtxPMIcNrtWoGh4gtaa9PyQIgQQLY3caYGd45DbL93YF2B5n RvPqo9E+zfGTi3p4HOqXzbc9bjuhnLDdVB2Hivh6yEgpyUtETY+f6Lkn9IELMXbioVDx LyhZWSlLsi3xlnUtzFJMfPhZNVfoXOMk7PRBsTu1Y11NbxAex5RxDKhPaeWYrIubOKUF 32nB1BtjjlrOYnpTnUyr+HUSFRmnOQ2HjJ8G86Zhlq2h0A8CPWGNkJgbtujQL0cr7oyP MQW67BEaj2JhWdCJkASru/jbtszIIc9Gb8VKMkET9EhBycXMKaRgZdb3domk7dHlYfJE IcaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=mvy7BRVKqHOuUx+Vnrk7npc9FrTXb8+2fijGRMzqBhc=; b=FDETnSTyis0olr7H8528P6vmP5Ehfw4DpnEVEb4LUUivHnDV4XAcek8w9U/QoZd9Le BLbFPt1EA0XswwF3FEUu4kkbfXC89z9Gm2kinTz+CCLG2XsnIzwv2D2Ioqp4nu2JnGde NfAWxTW/yKhFE3MjX38YHAUDigGpkx5pFZiicw5r5qJZWkmbFAuzOoULO58fMfP82UpM SkGXsA/MpMqNtdm7iAtsLWRbL5xzSydKo0P2sIPigKvQ4Lqk5Fuqlk/wSeEp1D5+5U9y sbIUhP+3WTBocqCGqO7q5XXyudIcFHEPZ8NM3kF8QXTHIkLERF/c5KqhWnru5cbLB0Bf E6yw== X-Gm-Message-State: ANoB5pnd7aMyH5aiTHTIzDkBZM+bDkSBSZ7sEDiDqVu92wBRZSJySX1G laynIB+LCDr4A2JfG3upOIg= X-Google-Smtp-Source: AA0mqf4BVY67+29iK+Ux08MVRlSQYZPBBgrBjv09LzYMyQA6E9Lxd8CdNrD68dDso4EAnDGBbURhPw== X-Received: by 2002:a63:fa0b:0:b0:46b:2493:14ad with SMTP id y11-20020a63fa0b000000b0046b249314admr8303853pgh.274.1668806016857; Fri, 18 Nov 2022 13:13:36 -0800 (PST) Received: from ?IPV6:2601:681:8600:13d0::f0a? ([2601:681:8600:13d0::f0a]) by smtp.gmail.com with ESMTPSA id c30-20020a631c1e000000b004308422060csm2131764pgc.69.2022.11.18.13.13.35 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 18 Nov 2022 13:13:36 -0800 (PST) Message-ID: <84ccfc10-df8c-5c77-913e-2819c7a14ee4@gmail.com> Date: Fri, 18 Nov 2022 14:13:34 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.3.1 Subject: Re: [PATCH] RISC-V: Optimise adding a (larger than simm12) constant Content-Language: en-US To: Philipp Tomsich , gcc-patches@gcc.gnu.org Cc: Christoph Muellner , Palmer Dabbelt , Vineet Gupta , Kito Cheng , Jeff Law References: <20221109230718.3240479-1-philipp.tomsich@vrull.eu> From: Jeff Law In-Reply-To: <20221109230718.3240479-1-philipp.tomsich@vrull.eu> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-8.5 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 11/9/22 16:07, Philipp Tomsich wrote: > Handling the register-const_int addition has very quickly escalated to > creating a full sign-extended 32bit constant and performing a > register-register for RISC-V in GCC so far, resulting in sequences like > (for the case of "a + 2048"): > li a5,4096 > addi a5,a5,-2048 > add a0,a0,a5 > > By adding an expansion for add3, we can emit optimised RTL that > matches the capabilities of RISC-V better by adding support for the > following, previously unoptimised cases: > - addi + addi > addi a0,a0,2047 > addi a0,a0,1 > - li + sh[123]add (if Zba is enabled) > li a5,960 > sh3add a0,a5,a0 > > With this commit, we also fix up riscv_adjust_libcall_cfi_prologue() > and riscv_adjust_libcall_cfi_epilogue() to not use gen_add3_insn, as > the expander will otherwise wrap the resulting set-expression in an > insn (causing an ICE at dwarf2-time) when invoked with -msave-restore. > > This closes the gap to LLVM, which has already been emitting these > optimised sequences. > > Note that this benefits is perlbench (in SPEC CPU 2017), which needs > to add the constant 3840. > > gcc/ChangeLog: > > * config/riscv/bitmanip.md (*shNadd): Rename. > (riscv_shNadd): Expose as gen_riscv_shNadd{di/si}. > * config/riscv/predicates.md (const_arith_shifted123_operand): > New predicate (for constants that are a simm12, shifted by > 1, 2 or 3). > (const_arith_2simm12_operand): New predicate (that can be > expressed by adding 2 simm12 together). > (addi_operand): New predicate (an immedaite operand suitable > for the new add3 expansion). > * config/riscv/riscv.cc (riscv_adjust_libcall_cfi_prologue): > Don't use gen_add3_insn, where a RTX instead of an INSN is > required (otherwise this will break as soon as we have a > define_expand for add3). > (riscv_adjust_libcall_cfi_epilogue): Same. > * config/riscv/riscv.md (addsi3): Rename. > (riscv_addsi3): New name for addsi3. > (adddi3): Rename. > (riscv_adddi3): New name for adddi3. > (add3): New expander that handles the basic and fancy > (such as li+sh[123]add, addi+addi, ...) cases for adding > register-register and register-const_int. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/addi.c: New test. > * gcc.target/riscv/zba-shNadd-06.c: New test. > > Signed-off-by: Philipp Tomsich > --- > > gcc/config/riscv/bitmanip.md | 2 +- > gcc/config/riscv/predicates.md | 28 +++++++++ > gcc/config/riscv/riscv.cc | 10 ++-- > gcc/config/riscv/riscv.md | 58 ++++++++++++++++++- > gcc/testsuite/gcc.target/riscv/addi.c | 39 +++++++++++++ > .../gcc.target/riscv/zba-shNadd-06.c | 11 ++++ > 6 files changed, 141 insertions(+), 7 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/addi.c > create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shNadd-06.c > > > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > index 171a0cdced6..289ff7470c6 100644 > --- a/gcc/config/riscv/riscv.md > +++ b/gcc/config/riscv/riscv.md > @@ -464,6 +464,60 @@ > [(set_attr "type" "arith") > (set_attr "mode" "DI")]) > > +(define_expand "add3" > + [(set (match_operand:GPR 0 "register_operand" "=r,r") > + (plus:GPR (match_operand:GPR 1 "register_operand" " r,r") > + (match_operand:GPR 2 "addi_operand" " r,I")))] > + "" > +{ > + if (arith_operand (operands[2], mode)) > + emit_insn (gen_riscv_add3 (operands[0], operands[1], operands[2])); > + else if (const_arith_2simm12_operand (operands[2], mode)) > + { > + /* Split into two immediates that add up to the desired value: > + * e.g., break up "a + 2445" into: > + * addi a0,a0,2047 > + * addi a0,a0,398 > + */ Nit.  GNU comment style please. > + > + HOST_WIDE_INT val = INTVAL (operands[2]); > + HOST_WIDE_INT saturated = HOST_WIDE_INT_M1U << (IMM_BITS - 1); > + > + if (val >= 0) > + saturated = ~saturated; > + > + val -= saturated; > + > + rtx tmp = gen_reg_rtx (mode); Can't add3 be generated by LRA?  If so, don't you have to guard against going into this path as we shouldn't be creating new pseudos at that point (I know LRA can create some internally, but I don't think it handles new ones showing up due to target expanders). Similarly for the shifted_123 case immediately following. If we do indeed have an issue here, I'm not sure how best to resolve.  If the output operand does not overlap with the inputs, then we're golden and can just re-use it to form the constant.  If not,  then it's a bit tougher.  I'm not keen to add a test of no_new_pseudos to the operand predicate, but I don't see a better option yet. jeff