From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pf1-x42f.google.com (mail-pf1-x42f.google.com [IPv6:2607:f8b0:4864:20::42f]) by sourceware.org (Postfix) with UTF8SMTPS id 208A7385B180 for ; Mon, 28 Nov 2022 19:18:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 208A7385B180 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=dabbelt.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=dabbelt.com Received: by mail-pf1-x42f.google.com with SMTP id l7so8986667pfl.7 for ; Mon, 28 Nov 2022 11:18:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=dabbelt-com.20210112.gappssmtp.com; s=20210112; h=message-id:to:from:cc:in-reply-to:subject:date:from:to:cc:subject :date:message-id:reply-to; bh=bC7sz2rAsxxR/1wk2LZlmFoWIr/hbLR1Zh81EQOXwvw=; b=CjFmPBFs4E9LB6i2LyEnTa4H7M6wCgyIMUGOdK838X1T9RX3FAQclWoXhTrlSASjlS F8kjkXhrMjCW8gaCpOOUYy/Ykz78p/C3ywUw9bVUnNABx/FtAZWV8PpNaHu0LC3qn6Zp DYkspzKppgvblZvjyK25XLUjjnDa6XufJfpVdMiDT2SbuQvTdJY3gpJUF9Rg74yXfUh7 hKEDUmNtGFxgDYEb08FOaDZW2bBSI0vJmbeBOmLa6C6BkMFCiKwEvLWY9XX7xJWGPxVT eIM9FJhzvGs7ZjG84DPB2apU1clxWV2fHny9nV3Fin27GztvEgCdzHu7KO5qEjf87icN 1g7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=message-id:to:from:cc:in-reply-to:subject:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=bC7sz2rAsxxR/1wk2LZlmFoWIr/hbLR1Zh81EQOXwvw=; b=tOHSu4blCV9nP1tgt58DsMtzky2w7lRkAJipm1RumYQ+KiUXb/bsaQPZrNemTUpwaF 83+xmd2N9uMk8MrzONXQvl+R9OhuUYjd0FJOMDcZk+AZc0ijhHoqISgZ43+0w4Ol78KP CNtyAGyWKEN3g7ivAKq6uXPrTSxcffZoq7bBh6FCQPouaM7gGZYJovFsy0WJk7f+qGvJ yZFy1xqPXcANNye+JE+sEgcn+rW7RPxqb5oJ8mBW7NBAi+RXXAqaF4VlmaA5oYys4fFQ 7nny+isIKnvPXI2XTbUGh+5/8qoV0jUdUz8SUEIjsrXSXNklNE1L0j8f9JQrGPndadc/ w8Bw== X-Gm-Message-State: ANoB5plcplN2FQTjL2XP835gTJARbeGg/SXdQXv0n2pRqkCHIzbU8BMB DvEnxnsc/FbKQd/3tsqChileGldY7HngSQ== X-Google-Smtp-Source: AA0mqf7c7iIGe0pzN5vNBxU4bRZRqquFdWyevln/EJY4cyey6ZplC9h7juGqedLzygsfLt/ZZhs5Pg== X-Received: by 2002:a63:e617:0:b0:41a:4bd4:f466 with SMTP id g23-20020a63e617000000b0041a4bd4f466mr36852805pgh.499.1669663104747; Mon, 28 Nov 2022 11:18:24 -0800 (PST) Received: from localhost ([50.221.140.188]) by smtp.gmail.com with ESMTPSA id f14-20020a170902684e00b001892af9472esm9117133pln.261.2022.11.28.11.18.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Nov 2022 11:18:24 -0800 (PST) Date: Mon, 28 Nov 2022 11:18:24 -0800 (PST) X-Google-Original-Date: Mon, 28 Nov 2022 11:18:15 PST (-0800) Subject: Re: 回复:[PING] [PATCH RESEND] riscv: improve the cost model for loading a 64bit constant in rv32. In-Reply-To: CC: sinan.lin@linux.alibaba.com, gcc-patches@gcc.gnu.org From: Palmer Dabbelt To: gcc-patches@gcc.gnu.org Message-ID: X-Spam-Status: No, score=-3.4 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,KAM_SHORT,PP_MIME_FAKE_ASCII_TEXT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Mon, 28 Nov 2022 11:15:01 PST (-0800), gcc-patches@gcc.gnu.org wrote: > > > On 11/24/22 00:43, Sinan wrote: >>> The motivation of this patch is to correct the wrong estimation of >>>> the number of instructions needed for loading a 64bit constant in >>>> rv32 in the current cost model(riscv_interger_cost). According to >>>> the current implementation, if a constant requires more than 3 >>>> instructions(riscv_const_insn and riscv_legitimate_constant_p), >>>> then the constant will be put into constant pool when expanding >>>> gimple to rtl(legitimate_constant_p hook and emit_move_insn). >>>> So the inaccurate cost model leads to the suboptimal codegen >>>> in rv32 and the wrong estimation part could be corrected through >>>> this fix. >>>> >>>> e.g. the current codegen for loading 0x839290001 in rv32 >>>> >>>>    lui     a5,%hi(.LC0) >>>>    lw      a0,%lo(.LC0)(a5) >>>>    lw      a1,%lo(.LC0+4)(a5) >>>> .LC0: >>>>    .word   958988289 >>>>    .word   8 >>>> >>>> output after this patch >>>> >>>>    li a0,958988288 >>>>    addi a0,a0,1 >>>>    li a1,8 >>>> >>>> gcc/ChangeLog: >>>> >>>>          * config/riscv/riscv.cc (riscv_build_integer): Handle the case of loading 64bit constant in rv32. >>>> >>>> gcc/testsuite/ChangeLog: >>>> >>>>          * gcc.target/riscv/rv32-load-64bit-constant.c: New test. >>>> >>>> Signed-off-by: Lin Sinan  >>>> --- >>>>   gcc/config/riscv/riscv.cc                     | 23 +++++++++++ >>>>   .../riscv/rv32-load-64bit-constant.c          | 38 +++++++++++++++++++ >>>>   2 files changed, 61 insertions(+) >>>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c >>>> >>>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc >>>> index 32f9ef9ade9..9dffabdc5e3 100644 >>>> --- a/gcc/config/riscv/riscv.cc >>>> +++ b/gcc/config/riscv/riscv.cc >>>> @@ -618,6 +618,29 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value, >>>>    } >>>>       } >>>> >>>> +  if ((value > INT32_MAX || value < INT32_MIN) && !TARGET_64BIT) >>> >>> Nit.   It's common practice to have the TARGET test first in a series of >>> tests.  It may also be advisable to break this into two lines. >>> Something like this: >>> >>> >>>   if ((!TARGET_64BIT) >>>       || value > INT32_MAX || value < INT32_MIN) >>> >>> >>> That's the style most GCC folks are more accustomed to reading. >> >> Thanks for the tips and I will change it then. >> >>>> +    { >>>> +      unsigned HOST_WIDE_INT loval = sext_hwi (value, 32); >>>> +      unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32); >>>> +      struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS], >>>> +       hicode[RISCV_MAX_INTEGER_OPS]; >>>> +      int hi_cost, lo_cost; >>>> + >>>> +      hi_cost = riscv_build_integer_1 (hicode, hival, mode); >>>> +      if (hi_cost < cost) >>>> + { >>>> +   lo_cost = riscv_build_integer_1 (alt_codes, loval, mode); >>>> +   if (lo_cost + hi_cost < cost) >>> >>> Just so I'm sure.  "cost" here refers strictly to other synthesized >>> forms? If so, then ISTM that we'd want to generate the new style when >>> lo_cost + hi_cost < cost OR when lo_cost + hi_cost is less than loading >>> the constant from memory -- which is almost certainly more than "3" >>> since the sequence from memory will be at least 3 instructions, two of >>> which will hit memory. >>> >>> >>> Jeff >>> >> >> Yes, almost right. The basic idea of this patch is to improve the cost >> calculation for loading 64bit constant in rv32, instead of adding a new >> way to load constant. >> >> gcc now loads 0x739290001LL in rv32gc with three instructions, >>         li      a0,958988288 >>         addi    a0,a0,1 >>         li      a1,7 >> However, when it loads 0x839290001LL, the output assembly becomes >>         lui     a5,%hi(.LC0) >>         lw      a0,%lo(.LC0)(a5) >>         lw      a1,%lo(.LC0+4)(a5) >>     .LC0: >>         .word   958988289 >>         .word   8 >> The cost calculation is inaccurate in such cases, since loading these >> two constants should have no difference in rv32 (just change `li a1,7` >> to `li a1,8` to load the hi part). This patch will take these cases >> into consideration. >> > I think I see better what's going on. This really isn't about the > constant pool costing. It's about another way to break down the > constant into components. > > riscv_build_integer_1, for the cases we're looking at breaks down the > constant so that high + low will give the final result. It costs the > high and low parts separately, then sums their cost + 1 for the addition > step. > > Your patch adds another method that is specific to rv32 and takes > advantage of register pairs. You break the constant down into 32bit > high and low chunks, where each chunk will go into a different 32 bit > register. You just then need to sum the cost of loading each chunk. > > For the constants in question, your new method will result in a smaller > cost than the current method. That's really the point of > riscv_build_integer -- find the sequence and cost of creation. We later > use that information to determine if we should use that sequence or a > constant pool. > > Palmer raised an issue on the tests with a request to not include the > arch/abi specification. But I think you addressed that in a later > comment. Specifically for rv64 we end up with another instruction, > which would cause some constants to be considered cheaper as constant > pool entries rather than inline sequences. > > Palmer is right in this seems like it ought to be generic, particularly > breaking things down on word boundaries. But I don't think adding that > infrastructure should hold this patch up. Reality is not much is > happening with 32bit (or smaller) architectures and little is happening > with 128bit integer types. So there's not much motivation to fix this > stuff more generically right now. Seems reasonable to me, we can always promote it to something generic later if some other port wants something similar.