From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lj1-x22b.google.com (mail-lj1-x22b.google.com [IPv6:2a00:1450:4864:20::22b]) by sourceware.org (Postfix) with ESMTPS id 34FC2385802F for ; Fri, 18 Nov 2022 21:26:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 34FC2385802F Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-lj1-x22b.google.com with SMTP id d20so8290081ljc.12 for ; Fri, 18 Nov 2022 13:26:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=gerRbRK53KxYzON4g2+Yr0bvUriSvJx2xfKDMwQ5ySc=; b=RjFPTiVJhAk5c6Mi9RDBv3L/PKv2kh/8R5+re0Y1VCM2TI40QoFR2VDFpYy43q5Akk QI/QxxQ5ilhhSXiTWztARc9PoAxBbN6SjF//pk2RjXPid1QmEERygsanBxBvfwEUpgpD yXhCidxaY7tsphftg3Et2RGeJNngM9tuyOrip5CnXzMLTJIGU6CAdh1dZKCW/3qw7UzH QoWynD+UCoJrtFoCgY/OcpoYLfG2m5h8N7PjBxabCDdRpgLRBjr7vEXDW+vyR2bUkc7G Fw7/GJqyNuid8G93Kw9AOJ79pEtEloKZn+UKhor3FgtyRWhmMg38mX+xnxAgpBqVW5q2 uq4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=gerRbRK53KxYzON4g2+Yr0bvUriSvJx2xfKDMwQ5ySc=; b=yAjSwF+F47eXymHu1NURNLkvswixyad3vGMpBSfGlej/vylu0/8k6odBfdQmRoG2a6 iCKLYhgpOcOonnKZ2XkertobJn3DTeXS6m/iNVJxLUQpPSP6EhtVWcM++FjN6xWkkMn+ 5gSA4qFH/VGKDwutx39aHJBwJfMkKDMPx89cpkSudMTvkvv65Uy+t1epJ1gqJHypOfOs lwzvz5aVUgjlKhCFRrqYRPX2t2Al1sY8WJbdGS9JNrhWjZVsA2vIkjT2nfUc14iJI69q QqY8YhJzyF8NrlrhSXQ4/Ni5el3O01M/jhMv87LQXsTyQmFOlrvHmSRkX5R3bDlsyI9x wPkQ== X-Gm-Message-State: ANoB5plY2Rf9useCZhmgyv/3S5TgQ97iIz+yB7wxm7OSWT2YxaOB66zu CfBkjasujB4UICXm1M5nUB3XyG7XBILxcmGMBlgV2g== X-Google-Smtp-Source: AA0mqf7iaEDadgd3Sofv9vfPlB+yEC0LNOeCbUxUqDJG8vxuBiJd+FX9lPiDBZ9gzvIDZNY3wEg6WmLaTQ4u+9MKcyM= X-Received: by 2002:a2e:a544:0:b0:278:f5b8:82c8 with SMTP id e4-20020a2ea544000000b00278f5b882c8mr2931159ljn.228.1668806786586; Fri, 18 Nov 2022 13:26:26 -0800 (PST) MIME-Version: 1.0 References: <20221109230718.3240479-1-philipp.tomsich@vrull.eu> <84ccfc10-df8c-5c77-913e-2819c7a14ee4@gmail.com> In-Reply-To: <84ccfc10-df8c-5c77-913e-2819c7a14ee4@gmail.com> From: Philipp Tomsich Date: Fri, 18 Nov 2022 22:26:15 +0100 Message-ID: Subject: Re: [PATCH] RISC-V: Optimise adding a (larger than simm12) constant To: Jeff Law Cc: gcc-patches@gcc.gnu.org, Christoph Muellner , Palmer Dabbelt , Vineet Gupta , Kito Cheng , Jeff Law Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-8.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,GIT_PATCH_0,JMQ_SPF_NEUTRAL,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Fri, 18 Nov 2022 at 22:13, Jeff Law wrote: > > > On 11/9/22 16:07, Philipp Tomsich wrote: > > Handling the register-const_int addition has very quickly escalated to > > creating a full sign-extended 32bit constant and performing a > > register-register for RISC-V in GCC so far, resulting in sequences like > > (for the case of "a + 2048"): > > li a5,4096 > > addi a5,a5,-2048 > > add a0,a0,a5 > > > > By adding an expansion for add3, we can emit optimised RTL that > > matches the capabilities of RISC-V better by adding support for the > > following, previously unoptimised cases: > > - addi + addi > > addi a0,a0,2047 > > addi a0,a0,1 > > - li + sh[123]add (if Zba is enabled) > > li a5,960 > > sh3add a0,a5,a0 > > > > With this commit, we also fix up riscv_adjust_libcall_cfi_prologue() > > and riscv_adjust_libcall_cfi_epilogue() to not use gen_add3_insn, as > > the expander will otherwise wrap the resulting set-expression in an > > insn (causing an ICE at dwarf2-time) when invoked with -msave-restore. > > > > This closes the gap to LLVM, which has already been emitting these > > optimised sequences. > > > > Note that this benefits is perlbench (in SPEC CPU 2017), which needs > > to add the constant 3840. > > > > gcc/ChangeLog: > > > > * config/riscv/bitmanip.md (*shNadd): Rename. > > (riscv_shNadd): Expose as gen_riscv_shNadd{di/si}. > > * config/riscv/predicates.md (const_arith_shifted123_operand): > > New predicate (for constants that are a simm12, shifted by > > 1, 2 or 3). > > (const_arith_2simm12_operand): New predicate (that can be > > expressed by adding 2 simm12 together). > > (addi_operand): New predicate (an immedaite operand suitable > > for the new add3 expansion). > > * config/riscv/riscv.cc (riscv_adjust_libcall_cfi_prologue): > > Don't use gen_add3_insn, where a RTX instead of an INSN is > > required (otherwise this will break as soon as we have a > > define_expand for add3). > > (riscv_adjust_libcall_cfi_epilogue): Same. > > * config/riscv/riscv.md (addsi3): Rename. > > (riscv_addsi3): New name for addsi3. > > (adddi3): Rename. > > (riscv_adddi3): New name for adddi3. > > (add3): New expander that handles the basic and fancy > > (such as li+sh[123]add, addi+addi, ...) cases for adding > > register-register and register-const_int. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/riscv/addi.c: New test. > > * gcc.target/riscv/zba-shNadd-06.c: New test. > > > > Signed-off-by: Philipp Tomsich > > --- > > > > gcc/config/riscv/bitmanip.md | 2 +- > > gcc/config/riscv/predicates.md | 28 +++++++++ > > gcc/config/riscv/riscv.cc | 10 ++-- > > gcc/config/riscv/riscv.md | 58 ++++++++++++++++++- > > gcc/testsuite/gcc.target/riscv/addi.c | 39 +++++++++++++ > > .../gcc.target/riscv/zba-shNadd-06.c | 11 ++++ > > 6 files changed, 141 insertions(+), 7 deletions(-) > > create mode 100644 gcc/testsuite/gcc.target/riscv/addi.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shNadd-06.c > > > > > > > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > > index 171a0cdced6..289ff7470c6 100644 > > --- a/gcc/config/riscv/riscv.md > > +++ b/gcc/config/riscv/riscv.md > > @@ -464,6 +464,60 @@ > > [(set_attr "type" "arith") > > (set_attr "mode" "DI")]) > > > > +(define_expand "add3" > > + [(set (match_operand:GPR 0 "register_operand" "=r,r") > > + (plus:GPR (match_operand:GPR 1 "register_operand" " r,r") > > + (match_operand:GPR 2 "addi_operand" " r,I")))] > > + "" > > +{ > > + if (arith_operand (operands[2], mode)) > > + emit_insn (gen_riscv_add3 (operands[0], operands[1], operands[2])); > > + else if (const_arith_2simm12_operand (operands[2], mode)) > > + { > > + /* Split into two immediates that add up to the desired value: > > + * e.g., break up "a + 2445" into: > > + * addi a0,a0,2047 > > + * addi a0,a0,398 > > + */ > > Nit. GNU comment style please. > > > > + > > + HOST_WIDE_INT val = INTVAL (operands[2]); > > + HOST_WIDE_INT saturated = HOST_WIDE_INT_M1U << (IMM_BITS - 1); > > + > > + if (val >= 0) > > + saturated = ~saturated; > > + > > + val -= saturated; > > + > > + rtx tmp = gen_reg_rtx (mode); > > Can't add3 be generated by LRA? If so, don't you have to guard > against going into this path as we shouldn't be creating new pseudos at > that point (I know LRA can create some internally, but I don't think it > handles new ones showing up due to target expanders). > > > Similarly for the shifted_123 case immediately following. > > > If we do indeed have an issue here, I'm not sure how best to resolve. > If the output operand does not overlap with the inputs, then we're > golden and can just re-use it to form the constant. If not, then it's > a bit tougher. I'm not keen to add a test of no_new_pseudos to the > operand predicate, but I don't see a better option yet. >From a cursory glance, LRA does not try to go through gen_add3_insn, but rather forms PLUS rtx. This matches my expectation, as (if we can't create a pseudo) the original add3 definition would run into the same problem (as an immediate that doesn't satisfy the SMALL_OPERAND test will need to be placed in a register). Philipp.