public inbox for gcc@gcc.gnu.org
* POWER __builtin_add_overflow/__builtin_mul_overflow with u64
@ 2023-02-14  3:48 Simon Richter
  2023-02-14  8:23 ` Jakub Jelinek
  0 siblings, 1 reply; 4+ messages in thread
From: Simon Richter @ 2023-02-14  3:48 UTC
  To: gcc


Hi,

I'm looking at the generated code for these builtins on POWER:

         add 4,3,4
         subfc 3,3,4
         subfe 3,3,3
         std 4,0(5)
         rldicl 3,3,0,63
         blr

and

         mulld 10,3,4
         mulhdu 3,3,4
         addic 9,3,-1
         std 10,0(5)
         subfe 3,9,3
         blr

The POWER architecture has variants of these instructions with builtin 
overflow checks (addo/mulldo), but these aren't listed in the .md files, 
and the builtins don't generate them either.

Is this intentional (I've found a few comments that mulldo is microcoded 
on CellBE and should be avoided there)?
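For reference, a minimal sketch of the kind of source that leads to sequences like the two above. The wrapper names are hypothetical (only the builtins are from this thread), and the exact output will depend on GCC version and flags:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical wrappers; the names are assumptions for illustration. */

/* Stores the wrapped 64-bit sum in *result; returns true on unsigned overflow. */
bool add_u64_overflows(uint64_t a, uint64_t b, uint64_t *result)
{
    return __builtin_add_overflow(a, b, result);
}

/* Stores the low 64 bits of the product in *result; returns true on overflow. */
bool mul_u64_overflows(uint64_t a, uint64_t b, uint64_t *result)
{
    return __builtin_mul_overflow(a, b, result);
}
```

With the 64-bit ELF calling convention, a and b arrive in r3 and r4, the pointer in r5, and the flag is returned in r3, which matches the register use in the listings.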

    Simon


* Re: POWER __builtin_add_overflow/__builtin_mul_overflow with u64
  2023-02-14  3:48 POWER __builtin_add_overflow/__builtin_mul_overflow with u64 Simon Richter
@ 2023-02-14  8:23 ` Jakub Jelinek
  2023-02-14  9:26   ` Eric Botcazou
  2023-02-15  9:43   ` Segher Boessenkool
  0 siblings, 2 replies; 4+ messages in thread
From: Jakub Jelinek @ 2023-02-14  8:23 UTC
  To: Simon Richter, Segher Boessenkool, David Edelsohn; +Cc: gcc

Hi!

CCing Segher and David on this.
rs6000 indeed doesn't implement {,u}{add,sub,mul}v4_optab for
any mode and thus leaves it to the generic code.

On Tue, Feb 14, 2023 at 04:48:42AM +0100, Simon Richter wrote:
> I'm looking at the generated code for these builtins on POWER:
> 
>         add 4,3,4
>         subfc 3,3,4
>         subfe 3,3,3
>         std 4,0(5)
>         rldicl 3,3,0,63
>         blr
> 
> and
> 
>         mulld 10,3,4
>         mulhdu 3,3,4
>         addic 9,3,-1
>         std 10,0(5)
>         subfe 3,9,3
>         blr
> 
> The POWER architecture has variants of these instructions with builtin
> overflow checks (addo/mulldo), but these aren't listed in the .md files, and
> the builtins don't generate them either.
> 
> Is this intentional (I've found a few comments that mulldo is microcoded on
> CellBE and should be avoided there)?

	Jakub


* Re: POWER __builtin_add_overflow/__builtin_mul_overflow with u64
  2023-02-14  8:23 ` Jakub Jelinek
@ 2023-02-14  9:26   ` Eric Botcazou
  2023-02-15  9:43   ` Segher Boessenkool
  1 sibling, 0 replies; 4+ messages in thread
From: Eric Botcazou @ 2023-02-14  9:26 UTC
  To: Jakub Jelinek; +Cc: Simon Richter, Segher Boessenkool, David Edelsohn, gcc

> rs6000 indeed doesn't implement {,u}{add,sub,mul}v4_optab for
> any mode and thus leaves it to the generic code.

https://gcc.gnu.org/pipermail/gcc-patches/2016-October/460209.html

-- 
Eric Botcazou



* Re: POWER __builtin_add_overflow/__builtin_mul_overflow with u64
  2023-02-14  8:23 ` Jakub Jelinek
  2023-02-14  9:26   ` Eric Botcazou
@ 2023-02-15  9:43   ` Segher Boessenkool
  1 sibling, 0 replies; 4+ messages in thread
From: Segher Boessenkool @ 2023-02-15  9:43 UTC
  To: Jakub Jelinek; +Cc: Simon Richter, David Edelsohn, gcc

Hi!

On Tue, Feb 14, 2023 at 09:23:55AM +0100, Jakub Jelinek wrote:
> CCing Segher and David on this.
> rs6000 indeed doesn't implement {,u}{add,sub,mul}v4_optab for
> any mode and thus leaves it to the generic code.

Yes.  Can we do better than the generic code, for those?

> On Tue, Feb 14, 2023 at 04:48:42AM +0100, Simon Richter wrote:
> > I'm looking at the generated code for these builtins on POWER:
> > 
> >         add 4,3,4
> >         subfc 3,3,4
> >         subfe 3,3,3
> >         std 4,0(5)
> >         rldicl 3,3,0,63
> >         blr
> > 
> > and
> > 
> >         mulld 10,3,4
> >         mulhdu 3,3,4
> >         addic 9,3,-1
> >         std 10,0(5)
> >         subfe 3,9,3
> >         blr


(The _overflow builtins, which obviously generate different code).

This is pretty much as good as we can do, at least with older ISAs.
With ISA 3.0 (p9) we have isel and with ISA 3.1 (p10) we have setbc*,
allowing us to improve on this slightly.
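
To illustrate what the two quoted sequences compute, here is a rough C equivalent. It is a sketch, not the compiler's actual expansion: the add sequence amounts to checking whether the wrapped sum is smaller than an operand, and the multiply sequence to checking whether the high 64 bits of the product (the mulhdu result) are nonzero. All function names are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned add overflows exactly when the wrapped sum is below an operand. */
bool add_overflow_by_compare(uint64_t a, uint64_t b, uint64_t *result)
{
    uint64_t sum = a + b;          /* wraps modulo 2^64 */
    *result = sum;
    return sum < a;
}

/* High 64 bits of the 128-bit product, built from 32-bit halves
   (what mulhdu yields in a single instruction). */
uint64_t mulhi_u64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = a & 0xffffffffu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xffffffffu, b_hi = b >> 32;
    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;
    /* Sum of the pieces that land in bits 32..63; its top half carries up. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xffffffffu) + (p2 & 0xffffffffu);
    return p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

/* Unsigned multiply overflows exactly when the high half is nonzero. */
bool mul_overflow_by_high_half(uint64_t a, uint64_t b, uint64_t *result)
{
    *result = a * b;               /* low 64 bits, as mulld produces */
    return mulhi_u64(a, b) != 0;
}
```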

> > The POWER architecture has variants of these instructions with builtin
> > overflow checks (addo/mulldo), but these aren't listed in the .md files, and
> > the builtins don't generate them either.
> > 
> > Is this intentional (I've found a few comments that mulldo is microcoded on
> > CellBE and should be avoided there)?

As Eric points out, we cannot easily use OV since ISA 2.00 (p4, 2001).
ISA 3.0 allows us to use addex to get at the OV bit (but inconveniently,
the insn is really only meant to allow multiple regs in long carry chains
for multi-precision arithmetic and the like), but nothing else does.
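
As a rough illustration of the multi-precision use case mentioned for addex, here is a plain C sketch of multi-limb addition with a carry chain. The name mp_add and the little-endian limb layout are assumptions for illustration, and nothing here implies that GCC emits addex for this loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Adds two n-limb numbers (least significant limb first) into sum.
   Returns the carry out of the most significant limb (0 or 1). */
uint64_t mp_add(uint64_t *sum, const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = a[i] + carry;     /* may wrap when a[i] is all ones */
        uint64_t c1 = t < carry;       /* carry out of the first add */
        uint64_t s = t + b[i];
        uint64_t c2 = s < t;           /* carry out of the second add */
        sum[i] = s;
        carry = c1 | c2;               /* at most one of the two can be set */
    }
    return carry;
}
```

Each limb needs the carry from the previous one, so a second, independent carry bit lets two such chains be interleaved without fighting over CA.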

Before ISA 2.00 we could use OV easily using the mcrxr instruction, or
save up testing it for as many insns as we want using the SO bit, which
mcrxr also reads, and conveniently also clears.  But instructions like
that (reading from three separate resources and writing to one as well,
one that is not renamed even) are not suitable for heavily out-of-order
implementations.

Since ISA 3.0 we can read OV using mcrxrx.  This can allow slightly
faster sequences for some specialised cases.  That insn also allows us
to move CA into a GPR in just one insn as well, hrm :-)

I'll add a work item to investigate what we can do here.  Improvements
will be only marginal, maybe an insn or a cycle or two can be saved
here or there, but it is not likely worth it to have o variants of most
instructions (which are very inconvenient to deal with in the compiler,
they are much nicer for human writers).


Segher
