From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pj1-x1035.google.com (mail-pj1-x1035.google.com [IPv6:2607:f8b0:4864:20::1035]) by sourceware.org (Postfix) with ESMTPS id A529F3858291 for ; Wed, 23 Nov 2022 05:07:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A529F3858291 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-pj1-x1035.google.com with SMTP id y14-20020a17090a2b4e00b002189a1b84d4so862288pjc.2 for ; Tue, 22 Nov 2022 21:07:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=5B3gTYjTMiZ27LwLlo6agSuoe8j0EuIOPDjMzO+MeoQ=; b=CDMazrJa4P2hAEAgU3aQNP3CooAsvuOHYL2PLkocn8RK6GDInXD9caK+IjPDfzR5+R ezEqWNuqg6lHPnc7l24tQa/rg7QUsdQW+cTNP+05l9RWkJbcK1sehv6nEOsO8v4kjZaC ZOYc6DeeirO5EGO3mpFoFD6PwpBawAtIbjOMiFWUQdPhQDxXFixQ5Xq2a2iyxiYDAFOC mpnmzIINB27iMaN+Dl7vJX+4qzosQbTcX38qQ/q8i0sTzXAk+hYkW67ecyGWKl4CDImq RjW+VC1/BnfI5TZ08wOgW0OlATIrVoUNplVdOblu9SBTwBV1iX1eDsB6Ohgwl9pljtFb EgjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:in-reply-to:from:references:cc:to :content-language:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=5B3gTYjTMiZ27LwLlo6agSuoe8j0EuIOPDjMzO+MeoQ=; b=0kPvBRo44+qUS7svBnda2TR2qQ+O20VS5VV+W37Q6xtkawgkutGLaCc6H80/XUhuis 5fZm4XwcsUyWev8/N1VLOF6gxilrxjuDOPmYQGC9RrgW4gSyT04gLgVb/YV6GXNTMqrO Acfv/av9nhmcs9tJVjN+ogvAWieENlwXltFbL7NgdEEn1z4PLfu9WsSXgv04Urvm0Sao pwIxSHyQMIPQVoTregILyRTXc15BTl6Tv/HtQo/+4j1vU9zOUYGboaD8G15sK/SlgNUo wXiqX3MvfsN8Pjggz4mYRDeLW5gGKqWqWxs2R1EAeENoYBxEwVavKW6rDNDCN/kI2t4f Ofig== X-Gm-Message-State: ANoB5pnx9FkvcwJFJPEEdXCyZaTXf97VJ85gdLGM8GdQ0taasXSTU7D1 dy2OT9tlu3fXnoHUfteh28o= X-Google-Smtp-Source: AA0mqf4mxse+wkLa14h1SXNvFB9Bw2C4bWbKOdQC5xiLqq98OMSESU9FeZdC/b/YEaQaAGC7v/iJZw== X-Received: by 2002:a17:90a:2b8c:b0:212:f4f1:96ee with SMTP id u12-20020a17090a2b8c00b00212f4f196eemr36013307pjd.72.1669180053345; Tue, 22 Nov 2022 21:07:33 -0800 (PST) Received: from ?IPV6:2601:681:8600:13d0::99f? ([2601:681:8600:13d0::99f]) by smtp.gmail.com with ESMTPSA id 85-20020a621958000000b0056b4c5dde61sm11928525pfz.98.2022.11.22.21.07.32 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 22 Nov 2022 21:07:32 -0800 (PST) Message-ID: Date: Tue, 22 Nov 2022 22:07:31 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.3.1 Subject: Re: [PATCH v3] RISC-V modified add3 for large stack frame optimization [PR105733] Content-Language: en-US To: Kevin Lee , gcc-patches@gcc.gnu.org Cc: gnu-toolchain@rivosinc.com References: <20221104005331.775049-1-kevinl@rivosinc.com> From: Jeff Law In-Reply-To: <20221104005331.775049-1-kevinl@rivosinc.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-7.3 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On 11/3/22 18:53, Kevin Lee wrote: > This is the identical patch with https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604814.html, but with the correct plaintext format. > >> The loop still seems a bit odd which may point to further improvements >> that could be made to this patch. Consider this fragment of the loop: > Thank you for the review Jeff! I am currently looking into this issue > in a different patch. I'll come back with some improvement. > > gcc/ChangeLog: > Jim Wilson > Michael Collison > Kevin Lee > > * config/riscv/predicates.md (const_lui_operand): New Predicate. > (add_operand): Ditto. > (reg_or_const_int_operand): Ditto. > * config/riscv/riscv-protos.h (riscv_eliminable_reg): New > function. > * config/riscv/riscv-selftests.cc (calculate_x_in_sequence): > Consider Parallel insns. > * config/riscv/riscv.cc (riscv_eliminable_reg): New function. > (riscv_adjust_libcall_cfi_prologue): Use gen_rtx_SET and > gen_rtx_fmt_ee instead of gen_add3_insn. > (riscv_adjust_libcall_cfi_epilogue): Ditto. > * config/riscv/riscv.md (addsi3): Remove. > (add3): New instruction for large stack frame > optimization. > (add3_internal): Ditto. > (adddi3): Remove. > (add3_internal2): New instruction for insns generated in > the prologue and epilogue pass. > --- > > diff --git a/gcc/config/riscv/riscv-selftests.cc b/gcc/config/riscv/riscv-selftests.cc > index 636874ebc0f..50457db708e 100644 > --- a/gcc/config/riscv/riscv-selftests.cc > +++ b/gcc/config/riscv/riscv-selftests.cc > @@ -116,6 +116,9 @@ calculate_x_in_sequence (rtx reg) > rtx pat = PATTERN (insn); > rtx dest = SET_DEST (pat); > > + if (GET_CODE (pat) == PARALLEL) > + dest = SET_DEST (XVECEXP (pat, 0, 0)); So this assumes you've got a parallel where the first vector is a SET.  That may well be true, but it's probably safer to verify with something like     gcc_assert (GET_CODE (XVECEXP (pat, 0, 0)) == SET) That way we're not surprised in the future if we have more patterns that use PARALLEL, perhaps with the first not being a simple set. > +{ > + return REG_P (x) && (REGNO (x) == FRAME_POINTER_REGNUM > + || REGNO (x) == ARG_POINTER_REGNUM > + || (REGNO (x) >= FIRST_VIRTUAL_REGISTER > + && REGNO (x) <= LAST_VIRTUAL_REGISTER)); Instead I'd write it as   return (REG_P (x)           && (REGNO (x) == FRAME_POINTER_REGNUM               || REGNO (x) == ARG_POINTER_REGNUM               || (REGNO (x) >= FIRST_VIRUTAL_REGISTER                   && REGNO (x) <= LAST_VIRTUAL_REGISTER))); That's just the style most GCC folks are used to reading. > @@ -4887,8 +4897,9 @@ riscv_adjust_libcall_cfi_prologue () > } > > /* Debug info for adjust sp. */ > - adjust_sp_rtx = gen_add3_insn (stack_pointer_rtx, > - stack_pointer_rtx, GEN_INT (-saved_size)); > + adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx, > + gen_rtx_fmt_ee (PLUS, GET_MODE (stack_pointer_rtx), > + stack_pointer_rtx, GEN_INT (saved_size))); > dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx, > dwarf); > return dwarf; > @@ -4990,8 +5001,9 @@ riscv_adjust_libcall_cfi_epilogue () > int saved_size = cfun->machine->frame.save_libcall_adjustment; > > /* Debug info for adjust sp. */ > - adjust_sp_rtx = gen_add3_insn (stack_pointer_rtx, > - stack_pointer_rtx, GEN_INT (saved_size)); > + adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx, > + gen_rtx_fmt_ee (PLUS, GET_MODE (stack_pointer_rtx), > + stack_pointer_rtx, GEN_INT (saved_size))); > dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx, > dwarf); I think this duplicates a change from Phillip & his team. This as to fix ICEs in the dwarf2 CFI generator, right?  Please double check as remove if it does duplicate a change from Philipp & his team. > + > +(define_insn_and_split "add3_internal" > + [(set (match_operand:GPR 0 "register_operand" "=r,r,&r,!&r") > + (plus:GPR (match_operand:GPR 1 "register_operand" " %r,r,r,0") > + (match_operand:GPR 2 "add_operand" " r,I,L,L"))) > + (clobber (match_scratch:GPR 3 "=X,X,X,&r"))] > + "" > +{ > + if ((which_alternative == 2) || (which_alternative == 3)) > + return "#"; > + > + if (TARGET_64BIT && mode == SImode) > + return "add%i2w\t%0,%1,%2"; > + else > + return "add%i2\t%0,%1,%2"; > +} > + "&& reload_completed && const_lui_operand (operands[2], mode)" > + [(const_int 0)] > +{ > + if (REGNO (operands[0]) != REGNO (operands[1])) > + { > + emit_insn (gen_mov (operands[0], operands[2])); > + emit_insn (gen_add3_internal (operands[0], operands[0], operands[1])); > + } > + else > + { > + emit_insn (gen_mov (operands[3], operands[2])); > + emit_insn (gen_add3_internal (operands[0], operands[0], operands[3])); > + } > + DONE; > +} > [(set_attr "type" "arith") > - (set_attr "mode" "SI")]) > + (set_attr "mode" "")]) So I think it was Kito that was concerned about potential performance impacts, particularly with the clobber expression.      Probably the best thing I can think of to minimize the potential for codegen changes would be to leave your existing add3 pattern alone. Instead define an addptr pattern.  That one is special in that it's only used by LRA.  So the potential for impacting other code is minimized. Rather than checking REGNO (operands[0]) != REGNO (operands[1]) use reg_overlap_mentioned_p instead.  The only notable difference for this code is that reg_overlap_mentioned_p will work with modes that cover multiple hard regs as well as subregs. Jeff