Libgcc divide vectorization question

* Libgcc divide vectorization question
@ 2023-03-21 16:59 Andrew Stubbs
  2023-03-22 10:09 ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Stubbs @ 2023-03-21 16:59 UTC
  To: GCC Development

Hi all,

I want to be able to vectorize divide operators (softfp and integer), 
but amdgcn only has hardware instructions suitable for -ffast-math.

We have recently implemented vector versions of all the libm functions, 
but the libgcc functions aren't builtins and therefore don't use those 
hooks.

What's the best way to achieve this? Add a new __builtin_div (and 
__builtin_mod) that tree-vectorize can find, perhaps? Or something else?

Thanks

Andrew
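
To make the question concrete, here is a minimal C sketch of the kind of
loop being discussed. On a target without a hardware integer divide, GCC
lowers the scalar '/' to a libgcc call such as __divsi3. The helper
sketch_divv4si3 is a hypothetical name used only for illustration (it is
not an existing libgcc entry point); it stands in for a vector libgcc
routine that computes the same result one lane at a time. The v4si type
relies on the GCC/Clang vector extension.

```c
#include <stddef.h>
#include <string.h>

/* Four 32-bit signed lanes, using the GCC/Clang vector extension. */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical vector counterpart of libgcc's scalar __divsi3.
   The name and signature are assumptions made for this sketch.
   Each lane uses the ordinary truncating C division. */
v4si
sketch_divv4si3 (v4si a, v4si b)
{
  v4si r = { 0, 0, 0, 0 };
  for (int i = 0; i < 4; i++)
    r[i] = a[i] / b[i];
  return r;
}

/* Plain scalar loop containing a divide. */
void
div_scalar (int *out, const int *a, const int *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] / b[i];
}

/* Hand-written picture of a vectorized loop: four lanes per
   iteration through the helper, then a scalar tail. */
void
div_by_four_lanes (int *out, const int *a, const int *b, size_t n)
{
  size_t i = 0;
  for (; i + 4 <= n; i += 4)
    {
      v4si va, vb, vr;
      memcpy (&va, a + i, sizeof va);   /* unaligned-safe load */
      memcpy (&vb, b + i, sizeof vb);
      vr = sketch_divv4si3 (va, vb);
      memcpy (out + i, &vr, sizeof vr); /* unaligned-safe store */
    }
  for (; i < n; i++)
    out[i] = a[i] / b[i];
}
```

Both loops produce identical results; the open question in this thread is
how the vectorizer could be told that something like the helper exists.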


* Re: Libgcc divide vectorization question
  2023-03-21 16:59 Libgcc divide vectorization question Andrew Stubbs
@ 2023-03-22 10:09 ` Richard Biener
  2023-03-22 11:02   ` Andrew Stubbs
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2023-03-22 10:09 UTC
  To: Andrew Stubbs; +Cc: GCC Development

On Tue, Mar 21, 2023 at 6:00 PM Andrew Stubbs <ams@codesourcery.com> wrote:
>
> Hi all,
>
> I want to be able to vectorize divide operators (softfp and integer),
> but amdgcn only has hardware instructions suitable for -ffast-math.
>
> We have recently implemented vector versions of all the libm functions,
> but the libgcc functions aren't builtins and therefore don't use those
> hooks.
>
> What's the best way to achieve this? Add a new __builtin_div (and
> __builtin_mod) that tree-vectorize can find, perhaps? Or something else?

What do you want to do?  Vectorize the out-of-line libgcc copy?  Or
emit inline vectorized code for int/softfp operations?  In the latter
case just emit the code from the pattern expanders?

> Thanks
>
> Andrew


* Re: Libgcc divide vectorization question
  2023-03-22 10:09 ` Richard Biener
@ 2023-03-22 11:02   ` Andrew Stubbs
  2023-03-22 13:56     ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Stubbs @ 2023-03-22 11:02 UTC
  To: Richard Biener; +Cc: GCC Development

On 22/03/2023 10:09, Richard Biener wrote:
> On Tue, Mar 21, 2023 at 6:00 PM Andrew Stubbs <ams@codesourcery.com> wrote:
>>
>> Hi all,
>>
>> I want to be able to vectorize divide operators (softfp and integer),
>> but amdgcn only has hardware instructions suitable for -ffast-math.
>>
>> We have recently implemented vector versions of all the libm functions,
>> but the libgcc functions aren't builtins and therefore don't use those
>> hooks.
>>
>> What's the best way to achieve this? Add a new __builtin_div (and
>> __builtin_mod) that tree-vectorize can find, perhaps? Or something else?
> 
> What do you want to do?  Vectorize the out-of-line libgcc copy?  Or
> emit inline vectorized code for int/softfp operations?  In the latter
> case just emit the code from the pattern expanders?

I'd like to investigate having vectorized versions of the libgcc 
instruction functions, like we do for libm.

The inline code expansion is certainly an option, but I think there's 
quite a lot of code in those routines. I know how to do that option at 
least (except, maybe not the errno handling without making assumptions 
about the C runtime).

Basically, the -ffast-math instructions will always be the fastest way, 
but the goal is that the default optimization shouldn't just disable 
vectorization entirely for any loop that has a divide in it.

Andrew


* Re: Libgcc divide vectorization question
  2023-03-22 11:02   ` Andrew Stubbs
@ 2023-03-22 13:56     ` Richard Biener
  2023-03-22 15:57       ` Andrew Stubbs
  0 siblings, 1 reply; 6+ messages in thread
From: Richard Biener @ 2023-03-22 13:56 UTC
  To: Andrew Stubbs; +Cc: GCC Development

On Wed, Mar 22, 2023 at 12:02 PM Andrew Stubbs <ams@codesourcery.com> wrote:
>
> On 22/03/2023 10:09, Richard Biener wrote:
> > On Tue, Mar 21, 2023 at 6:00 PM Andrew Stubbs <ams@codesourcery.com> wrote:
> >>
> >> Hi all,
> >>
> >> I want to be able to vectorize divide operators (softfp and integer),
> >> but amdgcn only has hardware instructions suitable for -ffast-math.
> >>
> >> We have recently implemented vector versions of all the libm functions,
> >> but the libgcc functions aren't builtins and therefore don't use those
> >> hooks.
> >>
> >> What's the best way to achieve this? Add a new __builtin_div (and
> >> __builtin_mod) that tree-vectorize can find, perhaps? Or something else?
> >
> > What do you want to do?  Vectorize the out-of-line libgcc copy?  Or
> > emit inline vectorized code for int/softfp operations?  In the latter
> > case just emit the code from the pattern expanders?
>
> I'd like to investigate having vectorized versions of the libgcc
> instruction functions, like we do for libm.
>
> The inline code expansion is certainly an option, but I think there's
> quite a lot of code in those routines. I know how to do that option at
> least (except, maybe not the errno handling without making assumptions
> about the C runtime).
>
> Basically, the -ffast-math instructions will always be the fastest way,
> but the goal is that the default optimization shouldn't just disable
> vectorization entirely for any loop that has a divide in it.

We try to express division as multiplication, but yes, I think there's
currently no way to tell the vectorizer that vectorized division is
available as libcall (nor for any other arithmetic operator that is not
a call in the first place).

> Andrew
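
As an illustration of "division as multiplication", here is a minimal C
sketch of the constant-divisor case. That this is the case meant above is
an assumption; the function name is made up for the example. Unsigned
32-bit division by 3 becomes a widening multiply and a shift, with no
divide instruction or libcall. This trick does not apply when the divisor
is only known at run time, which is where a vector libcall would come in.

```c
#include <stdint.h>

/* Unsigned 32-bit division by the constant 3 without a divide.
   0xAAAAAAAB is ceil(2^33 / 3), so (x * 0xAAAAAAAB) >> 33 equals
   x / 3 for every 32-bit x. */
uint32_t
udiv3_by_multiply (uint32_t x)
{
  return (uint32_t) (((uint64_t) x * UINT64_C (0xAAAAAAAB)) >> 33);
}
```

Because the result is exact for all inputs, a rewrite of this shape does
not need -ffast-math.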


* Re: Libgcc divide vectorization question
  2023-03-22 13:56     ` Richard Biener
@ 2023-03-22 15:57       ` Andrew Stubbs
  2023-03-23  7:24         ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Andrew Stubbs @ 2023-03-22 15:57 UTC
  To: Richard Biener; +Cc: GCC Development

On 22/03/2023 13:56, Richard Biener wrote:
>> Basically, the -ffast-math instructions will always be the fastest way,
>> but the goal is that the default optimization shouldn't just disable
>> vectorization entirely for any loop that has a divide in it.
> 
> We try to express division as multiplication, but yes, I think there's
> currently no way to tell the vectorizer that vectorized division is
> available as libcall (nor for any other arithmetic operator that is not
> a call in the first place).

I have considered creating a new builtin code, similar to the libm 
functions, that would be enabled by a backend hook, or maybe just if 
TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION doesn't return NULL. The 
vectorizer would then use that, somehow. To treat it just like any other 
builtin it would have to be set before the vectorizer pass encounters 
it, which is probably not ideal for all the other passes that want to 
handle divide operators. Alternatively, the vectorizable_operation 
function could detect and introduce the builtin where appropriate.

Would this be acceptable, or am I wasting my time planning something 
that would get rejected?

Thanks

Andrew


* Re: Libgcc divide vectorization question
  2023-03-22 15:57       ` Andrew Stubbs
@ 2023-03-23  7:24         ` Richard Biener
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Biener @ 2023-03-23  7:24 UTC
  To: Andrew Stubbs, Richard Sandiford, Jeff Law; +Cc: GCC Development

On Wed, Mar 22, 2023 at 4:57 PM Andrew Stubbs <ams@codesourcery.com> wrote:
>
> On 22/03/2023 13:56, Richard Biener wrote:
> >> Basically, the -ffast-math instructions will always be the fastest way,
> >> but the goal is that the default optimization shouldn't just disable
> >> vectorization entirely for any loop that has a divide in it.
> >
> > We try to express division as multiplication, but yes, I think there's
> > currently no way to tell the vectorizer that vectorized division is
> > available as libcall (nor for any other arithmetic operator that is not
> > a call in the first place).
>
> I have considered creating a new builtin code, similar to the libm
> functions, that would be enabled by a backend hook, or maybe just if
> TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION doesn't return NULL. The
> vectorizer would then use that, somehow. To treat it just like any other
> builtin it would have to be set before the vectorizer pass encounters
> it, which is probably not ideal for all the other passes that want to
> handle divide operators. Alternatively, the vectorizable_operation
> function could detect and introduce the builtin where appropriate.
>
> Would this be acceptable, or am I wasting my time planning something
> that would get rejected?

So why not make it possible for the target to specify there's a libcall
for a specific optab so the vectorizer would simply use vectorized
{TRUNC_DIV,RDIV}_EXPR but the RTL expander would emit a
libcall (in libgcc ways, thus divv2df3 or so)?  It feels wrong to add
some secondary machinery here (like for example having
.RDIV internal function calls instead of a / operator).

I think that for standard unops and binops that would be the default
behavior already, the only piece missing is the vectorizer looking
for a CODE_FOR_* optab handler and there's currently no way
to say "yes I have a libcall fallback" or "no, no libcall fallback available"
or for a target to specify those (maybe add a (define_libcall ...) alongside
(define_expand ...)?)

A short-circuit would be to use a new target hook to specify libcall
availability, iff the libcall emission works.  There's the remaining
question of whether the libcall emission code works well enough for
vector types, in case the ABI for libcalls doesn't match the ABI
for regular calls.

Richard.

>
> Thanks
>
> Andrew
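
The three-way answer suggested above ("instruction", "libcall fallback",
"nothing available") can be pictured with a small toy model in C. This is
only a sketch of the idea, not GCC's optab machinery: every type, table
entry and function name below is hypothetical. The one exception is the
string "divv2df3", which is the example libcall name from the message;
"divv4si3" is invented by analogy.

```c
#include <stddef.h>

/* Hypothetical stand-ins for tree codes and vector modes. */
enum sketch_op { SKETCH_PLUS, SKETCH_RDIV, SKETCH_TRUNC_DIV };
enum sketch_mode { SKETCH_V2DF, SKETCH_V4SI };

/* How an (operation, mode) pair can be implemented. */
enum sketch_impl { IMPL_NONE, IMPL_INSN, IMPL_LIBCALL };

struct sketch_entry
{
  enum sketch_op op;
  enum sketch_mode mode;
  enum sketch_impl impl;
  const char *libcall_name;	/* NULL unless impl is IMPL_LIBCALL */
};

/* What a target might declare: additions are instructions, divides
   fall back to vector libcalls.  Pairs not listed are unsupported. */
static const struct sketch_entry sketch_table[] = {
  { SKETCH_PLUS, SKETCH_V2DF, IMPL_INSN, NULL },
  { SKETCH_PLUS, SKETCH_V4SI, IMPL_INSN, NULL },
  { SKETCH_RDIV, SKETCH_V2DF, IMPL_LIBCALL, "divv2df3" },
  { SKETCH_TRUNC_DIV, SKETCH_V4SI, IMPL_LIBCALL, "divv4si3" },
};

/* Look up how OP in MODE would be expanded.  When a libcall is used,
   its name is stored through NAME (if NAME is not NULL). */
enum sketch_impl
sketch_query (enum sketch_op op, enum sketch_mode mode, const char **name)
{
  size_t n = sizeof sketch_table / sizeof sketch_table[0];
  for (size_t i = 0; i < n; i++)
    if (sketch_table[i].op == op && sketch_table[i].mode == mode)
      {
	if (name)
	  *name = sketch_table[i].libcall_name;
	return sketch_table[i].impl;
      }
  if (name)
    *name = NULL;
  return IMPL_NONE;
}

/* The vectorizer-side question: keep the vector operation if either
   an instruction or a libcall fallback exists. */
int
sketch_can_vectorize (enum sketch_op op, enum sketch_mode mode)
{
  return sketch_query (op, mode, NULL) != IMPL_NONE;
}
```

In this model the vectorizer only asks sketch_can_vectorize; choosing
between an instruction and a libcall is left to the later expansion step,
which matches the split of responsibilities proposed in the message.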

