On Mon, 28 Nov 2022 11:15:01 PST (-0800), gcc-patches@gcc.gnu.org wrote: > > > On 11/24/22 00:43, Sinan wrote: >>> The motivation of this patch is to correct the wrong estimation of >>>> the number of instructions needed for loading a 64bit constant in >>>> rv32 in the current cost model(riscv_interger_cost). According to >>>> the current implementation, if a constant requires more than 3 >>>> instructions(riscv_const_insn and riscv_legitimate_constant_p), >>>> then the constant will be put into constant pool when expanding >>>> gimple to rtl(legitimate_constant_p hook and emit_move_insn). >>>> So the inaccurate cost model leads to the suboptimal codegen >>>> in rv32 and the wrong estimation part could be corrected through >>>> this fix. >>>> >>>> e.g. the current codegen for loading 0x839290001 in rv32 >>>> >>>>    lui     a5,%hi(.LC0) >>>>    lw      a0,%lo(.LC0)(a5) >>>>    lw      a1,%lo(.LC0+4)(a5) >>>> .LC0: >>>>    .word   958988289 >>>>    .word   8 >>>> >>>> output after this patch >>>> >>>>    li a0,958988288 >>>>    addi a0,a0,1 >>>>    li a1,8 >>>> >>>> gcc/ChangeLog: >>>> >>>>          * config/riscv/riscv.cc (riscv_build_integer): Handle the case of loading 64bit constant in rv32. >>>> >>>> gcc/testsuite/ChangeLog: >>>> >>>>          * gcc.target/riscv/rv32-load-64bit-constant.c: New test. >>>> >>>> Signed-off-by: Lin Sinan  >>>> --- >>>>   gcc/config/riscv/riscv.cc                     | 23 +++++++++++ >>>>   .../riscv/rv32-load-64bit-constant.c          | 38 +++++++++++++++++++ >>>>   2 files changed, 61 insertions(+) >>>>   create mode 100644 gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c >>>> >>>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc >>>> index 32f9ef9ade9..9dffabdc5e3 100644 >>>> --- a/gcc/config/riscv/riscv.cc >>>> +++ b/gcc/config/riscv/riscv.cc >>>> @@ -618,6 +618,29 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value, >>>>    } >>>>       } >>>> >>>> +  if ((value > INT32_MAX || value < INT32_MIN) && !TARGET_64BIT) >>> >>> Nit.   It's common practice to have the TARGET test first in a series of >>> tests.  It may also be advisable to break this into two lines. >>> Something like this: >>> >>> >>>   if ((!TARGET_64BIT) >>>       || value > INT32_MAX || value < INT32_MIN) >>> >>> >>> That's the style most GCC folks are more accustomed to reading. >> >> Thanks for the tips and I will change it then. >> >>>> +    { >>>> +      unsigned HOST_WIDE_INT loval = sext_hwi (value, 32); >>>> +      unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32); >>>> +      struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS], >>>> +       hicode[RISCV_MAX_INTEGER_OPS]; >>>> +      int hi_cost, lo_cost; >>>> + >>>> +      hi_cost = riscv_build_integer_1 (hicode, hival, mode); >>>> +      if (hi_cost < cost) >>>> + { >>>> +   lo_cost = riscv_build_integer_1 (alt_codes, loval, mode); >>>> +   if (lo_cost + hi_cost < cost) >>> >>> Just so I'm sure.  "cost" here refers strictly to other synthesized >>> forms? If so, then ISTM that we'd want to generate the new style when >>> lo_cost + hi_cost < cost OR when lo_cost + hi_cost is less than loading >>> the constant from memory -- which is almost certainly more than "3" >>> since the sequence from memory will be at least 3 instructions, two of >>> which will hit memory. >>> >>> >>> Jeff >>> >> >> Yes, almost right. The basic idea of this patch is to improve the cost >> calculation for loading 64bit constant in rv32, instead of adding a new >> way to load constant. >> >> gcc now loads 0x739290001LL in rv32gc with three instructions, >>         li      a0,958988288 >>         addi    a0,a0,1 >>         li      a1,7 >> However, when it loads 0x839290001LL, the output assembly becomes >>         lui     a5,%hi(.LC0) >>         lw      a0,%lo(.LC0)(a5) >>         lw      a1,%lo(.LC0+4)(a5) >>     .LC0: >>         .word   958988289 >>         .word   8 >> The cost calculation is inaccurate in such cases, since loading these >> two constants should have no difference in rv32 (just change `li a1,7` >> to `li a1,8` to load the hi part). This patch will take these cases >> into consideration. >> > I think I see better what's going on. This really isn't about the > constant pool costing. It's about another way to break down the > constant into components. > > riscv_build_integer_1, for the cases we're looking at breaks down the > constant so that high + low will give the final result. It costs the > high and low parts separately, then sums their cost + 1 for the addition > step. > > Your patch adds another method that is specific to rv32 and takes > advantage of register pairs. You break the constant down into 32bit > high and low chunks, where each chunk will go into a different 32 bit > register. You just then need to sum the cost of loading each chunk. > > For the constants in question, your new method will result in a smaller > cost than the current method. That's really the point of > riscv_build_integer -- find the sequence and cost of creation. We later > use that information to determine if we should use that sequence or a > constant pool. > > Palmer raised an issue on the tests with a request to not include the > arch/abi specification. But I think you addressed that in a later > comment. Specifically for rv64 we end up with another instruction, > which would cause some constants to be considered cheaper as constant > pool entries rather than inline sequences. > > Palmer is right in this seems like it ought to be generic, particularly > breaking things down on word boundaries. But I don't think adding that > infrastructure should hold this patch up. Reality is not much is > happening with 32bit (or smaller) architectures and little is happening > with 128bit integer types. So there's not much motivation to fix this > stuff more generically right now. Seems reasonable to me, we can always promote it to something generic later if some other port wants something similar.