From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=RqVT=34=dabbelt.com=palmer@sourceware.org>
Received: from mail-pf1-x42f.google.com (mail-pf1-x42f.google.com [IPv6:2607:f8b0:4864:20::42f])
	by sourceware.org (Postfix) with UTF8SMTPS id 208A7385B180
	for <gcc-patches@gcc.gnu.org>; Mon, 28 Nov 2022 19:18:26 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 208A7385B180
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=dabbelt.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=dabbelt.com
Received: by mail-pf1-x42f.google.com with SMTP id l7so8986667pfl.7
        for <gcc-patches@gcc.gnu.org>; Mon, 28 Nov 2022 11:18:26 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=dabbelt-com.20210112.gappssmtp.com; s=20210112;
        h=message-id:to:from:cc:in-reply-to:subject:date:from:to:cc:subject
         :date:message-id:reply-to;
        bh=bC7sz2rAsxxR/1wk2LZlmFoWIr/hbLR1Zh81EQOXwvw=;
        b=CjFmPBFs4E9LB6i2LyEnTa4H7M6wCgyIMUGOdK838X1T9RX3FAQclWoXhTrlSASjlS
         F8kjkXhrMjCW8gaCpOOUYy/Ykz78p/C3ywUw9bVUnNABx/FtAZWV8PpNaHu0LC3qn6Zp
         DYkspzKppgvblZvjyK25XLUjjnDa6XufJfpVdMiDT2SbuQvTdJY3gpJUF9Rg74yXfUh7
         hKEDUmNtGFxgDYEb08FOaDZW2bBSI0vJmbeBOmLa6C6BkMFCiKwEvLWY9XX7xJWGPxVT
         eIM9FJhzvGs7ZjG84DPB2apU1clxWV2fHny9nV3Fin27GztvEgCdzHu7KO5qEjf87icN
         1g7w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=message-id:to:from:cc:in-reply-to:subject:date:x-gm-message-state
         :from:to:cc:subject:date:message-id:reply-to;
        bh=bC7sz2rAsxxR/1wk2LZlmFoWIr/hbLR1Zh81EQOXwvw=;
        b=tOHSu4blCV9nP1tgt58DsMtzky2w7lRkAJipm1RumYQ+KiUXb/bsaQPZrNemTUpwaF
         83+xmd2N9uMk8MrzONXQvl+R9OhuUYjd0FJOMDcZk+AZc0ijhHoqISgZ43+0w4Ol78KP
         CNtyAGyWKEN3g7ivAKq6uXPrTSxcffZoq7bBh6FCQPouaM7gGZYJovFsy0WJk7f+qGvJ
         yZFy1xqPXcANNye+JE+sEgcn+rW7RPxqb5oJ8mBW7NBAi+RXXAqaF4VlmaA5oYys4fFQ
         7nny+isIKnvPXI2XTbUGh+5/8qoV0jUdUz8SUEIjsrXSXNklNE1L0j8f9JQrGPndadc/
         w8Bw==
X-Gm-Message-State: ANoB5plcplN2FQTjL2XP835gTJARbeGg/SXdQXv0n2pRqkCHIzbU8BMB
	DvEnxnsc/FbKQd/3tsqChileGldY7HngSQ==
X-Google-Smtp-Source: AA0mqf7c7iIGe0pzN5vNBxU4bRZRqquFdWyevln/EJY4cyey6ZplC9h7juGqedLzygsfLt/ZZhs5Pg==
X-Received: by 2002:a63:e617:0:b0:41a:4bd4:f466 with SMTP id g23-20020a63e617000000b0041a4bd4f466mr36852805pgh.499.1669663104747;
        Mon, 28 Nov 2022 11:18:24 -0800 (PST)
Received: from localhost ([50.221.140.188])
        by smtp.gmail.com with ESMTPSA id f14-20020a170902684e00b001892af9472esm9117133pln.261.2022.11.28.11.18.24
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Mon, 28 Nov 2022 11:18:24 -0800 (PST)
Date: Mon, 28 Nov 2022 11:18:24 -0800 (PST)
X-Google-Original-Date: Mon, 28 Nov 2022 11:18:15 PST (-0800)
Subject:     Re: 回复：[PING] [PATCH RESEND] riscv: improve the cost model for loading a 64bit constant in rv32.
In-Reply-To: <f5423f3a-7e34-3c2d-3b90-7552d9e25f54@gmail.com>
CC: sinan.lin@linux.alibaba.com, gcc-patches@gcc.gnu.org
From: Palmer Dabbelt <palmer@dabbelt.com>
To: gcc-patches@gcc.gnu.org
Message-ID: <mhng-b1640937-e124-4916-939f-5a9fc2214f5d@palmer-ri-x1c9a>
X-Spam-Status: No, score=-3.4 required=5.0 tests=BAYES_00,BODY_8BITS,DKIM_SIGNED,DKIM_VALID,KAM_SHORT,PP_MIME_FAKE_ASCII_TEXT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Mon, 28 Nov 2022 11:15:01 PST (-0800), gcc-patches@gcc.gnu.org wrote:
>
>
> On 11/24/22 00:43, Sinan wrote:
>>> The motivation of this patch is to correct the wrong estimation of
>>>> the number of instructions needed for loading a 64bit constant in
>>>> rv32 in the current cost model(riscv_interger_cost). According to
>>>> the current implementation, if a constant requires more than 3
>>>> instructions(riscv_const_insn and riscv_legitimate_constant_p),
>>>> then the constant will be put into constant pool when expanding
>>>> gimple to rtl(legitimate_constant_p hook and emit_move_insn).
>>>> So the inaccurate cost model leads to the suboptimal codegen
>>>> in rv32 and the wrong estimation part could be corrected through
>>>> this fix.
>>>>
>>>> e.g. the current codegen for loading 0x839290001 in rv32
>>>>
>>>>    lui     a5,%hi(.LC0)
>>>>    lw      a0,%lo(.LC0)(a5)
>>>>    lw      a1,%lo(.LC0+4)(a5)
>>>> .LC0:
>>>>    .word   958988289
>>>>    .word   8
>>>>
>>>> output after this patch
>>>>
>>>>    li a0,958988288
>>>>    addi a0,a0,1
>>>>    li a1,8
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>          * config/riscv/riscv.cc (riscv_build_integer): Handle the case of loading 64bit constant in rv32.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>          * gcc.target/riscv/rv32-load-64bit-constant.c: New test.
>>>>
>>>> Signed-off-by: Lin Sinan <sinan.lin@linux.alibaba.com>
>>>> ---
>>>>   gcc/config/riscv/riscv.cc                     | 23 +++++++++++
>>>>   .../riscv/rv32-load-64bit-constant.c          | 38 +++++++++++++++++++
>>>>   2 files changed, 61 insertions(+)
>>>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c
>>>>
>>>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>>>> index 32f9ef9ade9..9dffabdc5e3 100644
>>>> --- a/gcc/config/riscv/riscv.cc
>>>> +++ b/gcc/config/riscv/riscv.cc
>>>> @@ -618,6 +618,29 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value,
>>>>    }
>>>>       }
>>>>
>>>> +  if ((value > INT32_MAX || value < INT32_MIN) && !TARGET_64BIT)
>>>
>>> Nit.   It's common practice to have the TARGET test first in a series of
>>> tests.  It may also be advisable to break this into two lines.
>>> Something like this:
>>>
>>>
>>>   if ((!TARGET_64BIT)
>>>       || value > INT32_MAX || value < INT32_MIN)
>>>
>>>
>>> That's the style most GCC folks are more accustomed to reading.
>>
>> Thanks for the tips and I will change it then.
>>
>>>> +    {
>>>> +      unsigned HOST_WIDE_INT loval = sext_hwi (value, 32);
>>>> +      unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32);
>>>> +      struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS],
>>>> +       hicode[RISCV_MAX_INTEGER_OPS];
>>>> +      int hi_cost, lo_cost;
>>>> +
>>>> +      hi_cost = riscv_build_integer_1 (hicode, hival, mode);
>>>> +      if (hi_cost < cost)
>>>> + {
>>>> +   lo_cost = riscv_build_integer_1 (alt_codes, loval, mode);
>>>> +   if (lo_cost + hi_cost < cost)
>>>
>>> Just so I'm sure.  "cost" here refers strictly to other synthesized
>>> forms? If so, then ISTM that we'd want to generate the new style when
>>> lo_cost + hi_cost < cost OR when lo_cost + hi_cost is less than loading
>>> the constant from memory -- which is almost certainly more than "3"
>>> since the sequence from memory will be at least 3 instructions, two of
>>> which will hit memory.
>>>
>>>
>>> Jeff
>>>
>>
>> Yes, almost right. The basic idea of this patch is to improve the cost
>> calculation for loading 64bit constant in rv32, instead of adding a new
>> way to load constant.
>>
>> gcc now loads 0x739290001LL in rv32gc with three instructions,
>>          li      a0,958988288
>>          addi    a0,a0,1
>>          li      a1,7
>> However, when it loads 0x839290001LL, the output assembly becomes
>>          lui     a5,%hi(.LC0)
>>          lw      a0,%lo(.LC0)(a5)
>>          lw      a1,%lo(.LC0+4)(a5)
>>      .LC0:
>>          .word   958988289
>>          .word   8
>> The cost calculation is inaccurate in such cases, since loading these
>> two constants should have no difference in rv32 (just change `li a1,7`
>> to `li a1,8` to load the hi part). This patch will take these cases
>> into consideration.
>>
> I think I see better what's going on.  This really isn't about the
> constant pool costing.  It's about another way to break down the
> constant into components.
>
> riscv_build_integer_1, for the cases we're looking at breaks down the
> constant so that high + low will give the final result.  It costs the
> high and low parts separately, then sums their cost + 1 for the addition
> step.
>
> Your patch adds another method that is specific to rv32 and takes
> advantage of register pairs.   You break the constant down into 32bit
> high and low chunks, where each chunk will go into a different 32 bit
> register.  You just then need to sum the cost of loading each chunk.
>
> For the constants in question, your new method will result in a smaller
> cost than the current method.   That's really the point of
> riscv_build_integer -- find the sequence and cost of creation.  We later
> use that information to determine if we should use that sequence or a
> constant pool.
>
> Palmer raised an issue on the tests with a request to not include the
> arch/abi specification.  But I think you addressed that in a later
> comment.  Specifically for rv64 we end up with another instruction,
> which would cause some constants to be considered cheaper as constant
> pool entries rather than inline sequences.
>
> Palmer is right in this seems like it ought to be generic, particularly
> breaking things down on word boundaries.  But I don't think adding that
> infrastructure should hold this patch up.  Reality is not much is
> happening with 32bit (or smaller) architectures and little is happening
> with 128bit integer types.  So there's not much motivation to fix this
> stuff more generically right now.

Seems reasonable to me, we can always promote it to something generic 
later if some other port wants something similar.