On Mon, 28 Nov 2022 11:15:01 PST (-0800), gcc-patches@gcc.gnu.org wrote:
>
>
> On 11/24/22 00:43, Sinan wrote:
>>> The motivation of this patch is to correct the wrong estimation of
>>>> the number of instructions needed for loading a 64bit constant in
>>>> rv32 in the current cost model(riscv_interger_cost). According to
>>>> the current implementation, if a constant requires more than 3
>>>> instructions(riscv_const_insn and riscv_legitimate_constant_p),
>>>> then the constant will be put into constant pool when expanding
>>>> gimple to rtl(legitimate_constant_p hook and emit_move_insn).
>>>> So the inaccurate cost model leads to the suboptimal codegen
>>>> in rv32 and the wrong estimation part could be corrected through
>>>> this fix.
>>>>
>>>> e.g. the current codegen for loading 0x839290001 in rv32
>>>>
>>>>    lui     a5,%hi(.LC0)
>>>>    lw      a0,%lo(.LC0)(a5)
>>>>    lw      a1,%lo(.LC0+4)(a5)
>>>> .LC0:
>>>>    .word   958988289
>>>>    .word   8
>>>>
>>>> output after this patch
>>>>
>>>>    li a0,958988288
>>>>    addi a0,a0,1
>>>>    li a1,8
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>          * config/riscv/riscv.cc (riscv_build_integer): Handle the case of loading 64bit constant in rv32.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>>          * gcc.target/riscv/rv32-load-64bit-constant.c: New test.
>>>>
>>>> Signed-off-by: Lin Sinan <sinan.lin@linux.alibaba.com>
>>>> ---
>>>>   gcc/config/riscv/riscv.cc                     | 23 +++++++++++
>>>>   .../riscv/rv32-load-64bit-constant.c          | 38 +++++++++++++++++++
>>>>   2 files changed, 61 insertions(+)
>>>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c
>>>>
>>>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>>>> index 32f9ef9ade9..9dffabdc5e3 100644
>>>> --- a/gcc/config/riscv/riscv.cc
>>>> +++ b/gcc/config/riscv/riscv.cc
>>>> @@ -618,6 +618,29 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value,
>>>>    }
>>>>       }
>>>>
>>>> +  if ((value > INT32_MAX || value < INT32_MIN) && !TARGET_64BIT)
>>>
>>> Nit.   It's common practice to have the TARGET test first in a series of
>>> tests.  It may also be advisable to break this into two lines.
>>> Something like this:
>>>
>>>
>>>   if ((!TARGET_64BIT)
>>>       || value > INT32_MAX || value < INT32_MIN)
>>>
>>>
>>> That's the style most GCC folks are more accustomed to reading.
>>
>> Thanks for the tips and I will change it then.
>>
>>>> +    {
>>>> +      unsigned HOST_WIDE_INT loval = sext_hwi (value, 32);
>>>> +      unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32);
>>>> +      struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS],
>>>> +       hicode[RISCV_MAX_INTEGER_OPS];
>>>> +      int hi_cost, lo_cost;
>>>> +
>>>> +      hi_cost = riscv_build_integer_1 (hicode, hival, mode);
>>>> +      if (hi_cost < cost)
>>>> + {
>>>> +   lo_cost = riscv_build_integer_1 (alt_codes, loval, mode);
>>>> +   if (lo_cost + hi_cost < cost)
>>>
>>> Just so I'm sure.  "cost" here refers strictly to other synthesized
>>> forms? If so, then ISTM that we'd want to generate the new style when
>>> lo_cost + hi_cost < cost OR when lo_cost + hi_cost is less than loading
>>> the constant from memory -- which is almost certainly more than "3"
>>> since the sequence from memory will be at least 3 instructions, two of
>>> which will hit memory.
>>>
>>>
>>> Jeff
>>>
>>
>> Yes, almost right. The basic idea of this patch is to improve the cost
>> calculation for loading 64bit constant in rv32, instead of adding a new
>> way to load constant.
>>
>> gcc now loads 0x739290001LL in rv32gc with three instructions,
>>          li      a0,958988288
>>          addi    a0,a0,1
>>          li      a1,7
>> However, when it loads 0x839290001LL, the output assembly becomes
>>          lui     a5,%hi(.LC0)
>>          lw      a0,%lo(.LC0)(a5)
>>          lw      a1,%lo(.LC0+4)(a5)
>>      .LC0:
>>          .word   958988289
>>          .word   8
>> The cost calculation is inaccurate in such cases, since loading these
>> two constants should have no difference in rv32 (just change `li a1,7`
>> to `li a1,8` to load the hi part). This patch will take these cases
>> into consideration.
>>
> I think I see better what's going on.  This really isn't about the
> constant pool costing.  It's about another way to break down the
> constant into components.
>
> riscv_build_integer_1, for the cases we're looking at breaks down the
> constant so that high + low will give the final result.  It costs the
> high and low parts separately, then sums their cost + 1 for the addition
> step.
>
> Your patch adds another method that is specific to rv32 and takes
> advantage of register pairs.   You break the constant down into 32bit
> high and low chunks, where each chunk will go into a different 32 bit
> register.  You just then need to sum the cost of loading each chunk.
>
> For the constants in question, your new method will result in a smaller
> cost than the current method.   That's really the point of
> riscv_build_integer -- find the sequence and cost of creation.  We later
> use that information to determine if we should use that sequence or a
> constant pool.
>
> Palmer raised an issue on the tests with a request to not include the
> arch/abi specification.  But I think you addressed that in a later
> comment.  Specifically for rv64 we end up with another instruction,
> which would cause some constants to be considered cheaper as constant
> pool entries rather than inline sequences.
>
> Palmer is right in this seems like it ought to be generic, particularly
> breaking things down on word boundaries.  But I don't think adding that
> infrastructure should hold this patch up.  Reality is not much is
> happening with 32bit (or smaller) architectures and little is happening
> with 128bit integer types.  So there's not much motivation to fix this
> stuff more generically right now.

Seems reasonable to me, we can always promote it to something generic 
later if some other port wants something similar.