public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed
* ideas on PR/109279
@ 2023-03-31  9:11 Vineet Gupta
  2023-03-31 13:34 ` Jeff Law
  0 siblings, 1 reply; 3+ messages in thread
From: Vineet Gupta @ 2023-03-31  9:11 UTC (permalink / raw)
  To: Jeff Law, Kito Cheng, Palmer Dabbelt; +Cc: gcc

Hi Jeff, Kito,

I need some ideas to proceed with PR/109279: pertaining to longer term 
direction and short term fix.

First the executive summary:

long long f(void)
{
   return 0x0101010101010101ull;
}

Up until gcc 12 this used to generate const pool type access.

     lui    a5,%hi(.LANCHOR0)
     ld    a0,%lo(.LANCHOR0)(a5)
     ret
.LC0:
     .dword    0x101010101010101

After commit 2e886eef7f2b ("RISC-V: Produce better code with complex 
constants [PR95632] [PR106602] ") it gets synthesized to following

li    a0,0x01010000
     addi    a0,0x0101
     slli    a0,a0,16
     addi    a0,a0,0x0101
     slli    a0,a0,16
     addi    a0,a0,0x0101
     ret

Granted const pool could or not be preferred by  specific uarch, will 
the long term approach be to have a cost model for the const pool vs. 
synthesizing.

The second aspect is to improve the horror above. Per chat on IRC, 
pinskia suggested we relax the in_splitter constraint in 
riscv_move_integer, as the combine issue holding it back is now fixed - 
after commit 61bee6aed26eb30.

That beings it down to some what reasonable

     li        a5,0x01010000
     addi   a5,a5,0x0101
     mv     a0,a5
     slli      a5,a5,32
     add    a0,a5,a0
     ret

I can spin a minimal patch, will that be acceptable for gcc 13.1 if it 
is testsuite clean !

Thx,
-Vineet

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: ideas on PR/109279
  2023-03-31  9:11 ideas on PR/109279 Vineet Gupta
@ 2023-03-31 13:34 ` Jeff Law
  2023-03-31 14:13   ` Kito Cheng
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Law @ 2023-03-31 13:34 UTC (permalink / raw)
  To: Vineet Gupta, Kito Cheng, Palmer Dabbelt; +Cc: gcc



On 3/31/23 03:11, Vineet Gupta wrote:
> Hi Jeff, Kito,
> 
> I need some ideas to proceed with PR/109279: pertaining to longer term 
> direction and short term fix.
> 
> First the executive summary:
> 
> long long f(void)
> {
>    return 0x0101010101010101ull;
> }
> 
> Up until gcc 12 this used to generate const pool type access.
> 
>      lui    a5,%hi(.LANCHOR0)
>      ld    a0,%lo(.LANCHOR0)(a5)
>      ret
> .LC0:
>      .dword    0x101010101010101
> 
> After commit 2e886eef7f2b ("RISC-V: Produce better code with complex 
> constants [PR95632] [PR106602] ") it gets synthesized to following
> 
> li    a0,0x01010000
>      addi    a0,0x0101
>      slli    a0,a0,16
>      addi    a0,a0,0x0101
>      slli    a0,a0,16
>      addi    a0,a0,0x0101
>      ret
> 
> Granted const pool could or not be preferred by  specific uarch, will 
> the long term approach be to have a cost model for the const pool vs. 
> synthesizing.
> 
> The second aspect is to improve the horror above. Per chat on IRC, 
> pinskia suggested we relax the in_splitter constraint in 
> riscv_move_integer, as the combine issue holding it back is now fixed - 
> after commit 61bee6aed26eb30.
> 
> That beings it down to some what reasonable
> 
>      li        a5,0x01010000
>      addi   a5,a5,0x0101
>      mv     a0,a5
>      slli      a5,a5,32
>      add    a0,a5,a0
>      ret
> 
> I can spin a minimal patch, will that be acceptable for gcc 13.1 if it 
> is testsuite clean 
It would seem to be a gcc-14 thing to me.

It seems like we probably should adjust the basic constant synthesis 
code to handle this class of cases so that the initial RTL is good 
rather than waiting on combine to fix it up.  It looks like we need the 
destination register as well as a temporary and a 5 instruction sequence.

I'm aware of uarch plans that would handle this kind of sequence 
entirely in the front-end and pass off a single uop to the execution 
units.  We'd planned to dig into constant synthesis in support of that 
effort.  So I'm happy to help shepherd this improvement once gcc-14 
development opens.

jeff

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: ideas on PR/109279
  2023-03-31 13:34 ` Jeff Law
@ 2023-03-31 14:13   ` Kito Cheng
  0 siblings, 0 replies; 3+ messages in thread
From: Kito Cheng @ 2023-03-31 14:13 UTC (permalink / raw)
  To: Jeff Law; +Cc: Palmer Dabbelt, Vineet Gupta, gcc

[-- Attachment #1: Type: text/plain, Size: 2383 bytes --]

I would prefer defer to GCC 14 too

Jeff Law <jeffreyalaw@gmail.com>於 2023年3月31日 週五,21:34寫道:

>
>
> On 3/31/23 03:11, Vineet Gupta wrote:
> > Hi Jeff, Kito,
> >
> > I need some ideas to proceed with PR/109279: pertaining to longer term
> > direction and short term fix.
> >
> > First the executive summary:
> >
> > long long f(void)
> > {
> >    return 0x0101010101010101ull;
> > }
> >
> > Up until gcc 12 this used to generate const pool type access.
> >
> >      lui    a5,%hi(.LANCHOR0)
> >      ld    a0,%lo(.LANCHOR0)(a5)
> >      ret
> > .LC0:
> >      .dword    0x101010101010101
> >
> > After commit 2e886eef7f2b ("RISC-V: Produce better code with complex
> > constants [PR95632] [PR106602] ") it gets synthesized to following
> >
> > li    a0,0x01010000
> >      addi    a0,0x0101
> >      slli    a0,a0,16
> >      addi    a0,a0,0x0101
> >      slli    a0,a0,16
> >      addi    a0,a0,0x0101
> >      ret
> >
> > Granted const pool could or not be preferred by  specific uarch, will
> > the long term approach be to have a cost model for the const pool vs.
> > synthesizing.
> >
> > The second aspect is to improve the horror above. Per chat on IRC,
> > pinskia suggested we relax the in_splitter constraint in
> > riscv_move_integer, as the combine issue holding it back is now fixed -
> > after commit 61bee6aed26eb30.
> >
> > That beings it down to some what reasonable
> >
> >      li        a5,0x01010000
> >      addi   a5,a5,0x0101
> >      mv     a0,a5
> >      slli      a5,a5,32
> >      add    a0,a5,a0
> >      ret
> >
> > I can spin a minimal patch, will that be acceptable for gcc 13.1 if it
> > is testsuite clean
> It would seem to be a gcc-14 thing to me.
>
> It seems like we probably should adjust the basic constant synthesis
> code to handle this class of cases so that the initial RTL is good
> rather than waiting on combine to fix it up.  It looks like we need the
> destination register as well as a temporary and a 5 instruction sequence.
>
> I'm aware of uarch plans that would handle this kind of sequence
> entirely in the front-end and pass off a single uop to the execution
> units.  We'd planned to dig into constant synthesis in support of that
> effort.  So I'm happy to help shepherd this improvement once gcc-14
> development opens.
>
> jeff
>

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-03-31 14:14 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-31  9:11 ideas on PR/109279 Vineet Gupta
2023-03-31 13:34 ` Jeff Law
2023-03-31 14:13   ` Kito Cheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).