[Bug target/81904] FMA and addsub instructions

From: rguenth at gcc dot gnu.org @ 2023-07-21 12:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81904

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
*** Bug 84361 has been marked as a duplicate of this bug. ***

From: crazylht at gmail dot com @ 2023-07-31  5:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81904

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #2)
> __m128d h(__m128d x, __m128d y, __m128d z){
>     __m128d tem = _mm_mul_pd (x,y);
>     __m128d tem2 = tem + z;
>     __m128d tem3 = tem - z;
>     return __builtin_shuffle (tem2, tem3, (__m128i) {0, 3});
> }
> 
> doesn't quite work (the combiner pattern for fmaddsub is missing).  Tried
> {0, 2} as well.
> 
> h:
> .LFB5021:
>         .cfi_startproc
>         vmovapd %xmm0, %xmm3
>         vfmsub132pd     %xmm1, %xmm2, %xmm0
>         vfmadd132pd     %xmm1, %xmm2, %xmm3
>         vshufpd $2, %xmm0, %xmm3, %xmm0

  tem2_6 = .FMA (x_2(D), y_3(D), z_5(D));
  # DEBUG tem2 => tem2_6
  # DEBUG BEGIN_STMT
  tem3_7 = .FMS (x_2(D), y_3(D), z_5(D));
  # DEBUG tem3 => NULL
  # DEBUG BEGIN_STMT
  _8 = VEC_PERM_EXPR <tem2_6, tem3_7, { 0, 3 }>;

Can this be handled in match.pd? Rewriting the fmaddsub pattern into a
vec_merge of fma and fms with <addsub_cst> looks too complex.

The same applies to VEC_ADDSUB + MUL -> VEC_FMADDSUB.
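
For reference, a minimal scalar sketch of what the GIMPLE above computes per
lane. The type and helper names (v2df_model, perm2, shuffle_fma_fms,
even_add_odd_sub, check_same) are made up for illustration, and the products
are not fused here, so this models lane selection only, not the single
rounding of a real FMA. With selector { 0, 3 }, lane 0 comes from the .FMA
result and lane 1 from the .FMS result.

```c
#include <assert.h>

/* Hypothetical scalar model of a 2-lane double vector. */
typedef struct { double lane[2]; } v2df_model;

/* Model of VEC_PERM_EXPR <a, b, { s0, s1 }>:
   indices 0-1 select from a, indices 2-3 select from b. */
static v2df_model perm2(v2df_model a, v2df_model b, int s0, int s1)
{
    double all[4] = { a.lane[0], a.lane[1], b.lane[0], b.lane[1] };
    v2df_model r = { { all[s0], all[s1] } };
    return r;
}

/* tem2 = x*y + z, tem3 = x*y - z, then permute with { 0, 3 }. */
static v2df_model shuffle_fma_fms(v2df_model x, v2df_model y, v2df_model z)
{
    v2df_model tem2, tem3;
    for (int i = 0; i < 2; i++) {
        tem2.lane[i] = x.lane[i] * y.lane[i] + z.lane[i];
        tem3.lane[i] = x.lane[i] * y.lane[i] - z.lane[i];
    }
    return perm2(tem2, tem3, 0, 3);
}

/* Direct per-lane form: the even lane adds, the odd lane subtracts. */
static v2df_model even_add_odd_sub(v2df_model x, v2df_model y, v2df_model z)
{
    v2df_model r = { { x.lane[0] * y.lane[0] + z.lane[0],
                       x.lane[1] * y.lane[1] - z.lane[1] } };
    return r;
}

/* Self-check: both formulations agree on the given inputs. */
static void check_same(v2df_model x, v2df_model y, v2df_model z)
{
    v2df_model a = shuffle_fma_fms(x, y, z);
    v2df_model b = even_add_odd_sub(x, y, z);
    assert(a.lane[0] == b.lane[0] && a.lane[1] == b.lane[1]);
}
```

Calling check_same with x = {2, 3}, y = {4, 5}, z = {1, 2} passes, and both
forms give {9, 13}.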

From: crazylht at gmail dot com @ 2023-07-31  5:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81904

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #1)
> Hmm, I think the issue is we see
> 
> f (__m128d x, __m128d y, __m128d z)
> {
>   vector(2) double _4;
>   vector(2) double _6;
> 
>   <bb 2> [100.00%]:
>   _4 = x_2(D) * y_3(D);
>   _6 = __builtin_ia32_addsubpd (_4, z_5(D)); [tail call]
Can we fold the builtin into .VEC_ADDSUB and then optimize MUL + .VEC_ADDSUB ->
.VEC_FMADDSUB in match.pd?
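
To make the proposed fold concrete, here is a rough scalar sketch. It assumes
the ADDSUBPD lane convention (lane 0 subtracts, lane 1 adds). The function
names addsub2, mul_then_addsub2, fmaddsub2 and check_fold are hypothetical,
and plain C arithmetic is used, so the fused form's single rounding is not
modelled.

```c
#include <assert.h>

/* Scalar model of ADDSUBPD on two doubles: lane 0 subtracts, lane 1 adds. */
static void addsub2(const double a[2], const double b[2], double out[2])
{
    out[0] = a[0] - b[0];
    out[1] = a[1] + b[1];
}

/* The unfused sequence from the dump: _4 = x * y; _6 = addsub (_4, z). */
static void mul_then_addsub2(const double x[2], const double y[2],
                             const double z[2], double out[2])
{
    double prod[2] = { x[0] * y[0], x[1] * y[1] };
    addsub2(prod, z, out);
}

/* The combined form the fold would produce: multiply, then subtract z in
   lane 0 and add z in lane 1 (a real fmaddsub rounds only once per lane). */
static void fmaddsub2(const double x[2], const double y[2],
                      const double z[2], double out[2])
{
    out[0] = x[0] * y[0] - z[0];
    out[1] = x[1] * y[1] + z[1];
}

/* Self-check: for inputs whose products are exact, both forms agree. */
static void check_fold(const double x[2], const double y[2], const double z[2])
{
    double a[2], b[2];
    mul_then_addsub2(x, y, z, a);
    fmaddsub2(x, y, z, b);
    assert(a[0] == b[0] && a[1] == b[1]);
}
```

For inputs where the product is not exactly representable, the two forms may
differ in the last bit once real fused instructions are used.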

From: rguenth at gcc dot gnu.org @ 2023-07-31  7:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81904

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #5)
> (In reply to Richard Biener from comment #1)
> > Hmm, I think the issue is we see
> > 
> > f (__m128d x, __m128d y, __m128d z)
> > {
> >   vector(2) double _4;
> >   vector(2) double _6;
> > 
> >   <bb 2> [100.00%]:
> >   _4 = x_2(D) * y_3(D);
> >   _6 = __builtin_ia32_addsubpd (_4, z_5(D)); [tail call]
> We can fold the builtin into .VEC_ADDSUB, and optimize MUL + VEC_ADDSUB ->
> VEC_FMADDSUB in match.pd?

I think MUL + .VEC_ADDSUB can be handled in the FMA pass.  For my example
above, we get the following early on (before FMA recognition):

  _4 = x_2(D) * y_3(D);
  tem2_7 = _4 + z_6(D);
  tem3_8 = _4 - z_6(D);
  _9 = VEC_PERM_EXPR <tem2_7, tem3_8, { 0, 3 }>;

We could recognize that as .VEC_ADDSUB.  I think we want to avoid doing
this too early.  I am not sure whether doing it within the FMA pass itself
will work, since we key FMAs on the mult but would need to key the addsub
on the VEC_PERM (we walk stmts from BB start to end).  Looking at the code,
it seems changing the walking order should work.

Note that matching

  tem2_7 = _4 + z_6(D);
  tem3_8 = _4 - z_6(D);
  _9 = VEC_PERM_EXPR <tem2_7, tem3_8, { 0, 3 }>;

to .VEC_ADDSUB possibly loses exceptions (the vectorizer now directly
creates .VEC_ADDSUB when possible).
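
A small sketch of why exceptions can get lost, under the assumption that an
overflow to infinity from finite inputs stands in for a raised IEEE overflow
exception. The add/sub/permute form evaluates both operations in every lane
and discards half of the results, while the alternating form performs only
one operation per lane. All function names here are hypothetical.

```c
#include <assert.h>
#include <float.h>

/* Unfused form: add and sub run in every lane, then { 0, 3 } selects.
   Returns 1 if a lane that is computed but discarded overflowed
   (inputs are assumed to be finite). */
static int discarded_lane_overflows(const double a[2], const double b[2],
                                    double out[2])
{
    double sum[2]  = { a[0] + b[0], a[1] + b[1] };
    double diff[2] = { a[0] - b[0], a[1] - b[1] };
    out[0] = sum[0];   /* selector index 0: lane 0 of the sum */
    out[1] = diff[1];  /* selector index 3: lane 1 of the difference */
    /* Results thrown away by the permute. */
    double dropped[2] = { diff[0], sum[1] };
    return dropped[0] > DBL_MAX || dropped[0] < -DBL_MAX
        || dropped[1] > DBL_MAX || dropped[1] < -DBL_MAX;
}

/* Alternating form: exactly one operation per lane, nothing discarded. */
static void alternating_only(const double a[2], const double b[2],
                             double out[2])
{
    out[0] = a[0] + b[0];
    out[1] = a[1] - b[1];
}

/* Both forms return the same values; the result says whether the unfused
   form additionally overflowed in a discarded lane. */
static int same_values_extra_overflow(const double a[2], const double b[2])
{
    double p[2], q[2];
    int extra = discarded_lane_overflows(a, b, p);
    alternating_only(a, b, q);
    assert(p[0] == q[0] && p[1] == q[1]);
    return extra;
}
```

With a = {1.0, DBL_MAX} and b = {2.0, DBL_MAX} both forms return {3.0, 0.0},
but only the unfused form computes DBL_MAX + DBL_MAX in the discarded lane.
That overflow would not be signalled after the rewrite.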

From: crazylht at gmail dot com @ 2023-07-31  7:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81904

--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---

> 
> to .VEC_ADDSUB possibly loses exceptions (the vectorizer now directly
> creates .VEC_ADDSUB when possible).
Let's put it under -fno-trapping-math.

From: cvs-commit at gcc dot gnu.org @ 2023-08-02  6:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81904

--- Comment #8 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:f0b7a61d83534fc8f7aa593b1f0f0357a371a800

commit r14-2919-gf0b7a61d83534fc8f7aa593b1f0f0357a371a800
Author: liuhongt <hongtao.liu@intel.com>
Date:   Mon Jul 31 16:03:45 2023 +0800

    Support vec_fmaddsub/vec_fmsubadd for vector HFmode.

    AVX512FP16 supports vfmaddsubXXXph and vfmsubaddXXXph.
    Also remove scalar mode from fmaddsub/fmsubadd pattern since there's
    no scalar instruction for that.

    gcc/ChangeLog:

            PR target/81904
            * config/i386/sse.md (vec_fmaddsub<mode>4): Extend to vector
            HFmode, use mode iterator VFH instead.
            (vec_fmsubadd<mode>4): Ditto.
            (<sd_mask_codefor>fma_fmaddsub_<mode><sd_maskz_name><round_name>):
            Remove scalar mode from iterator, use VFH_AVX512VL instead.
            (<sd_mask_codefor>fma_fmsubadd_<mode><sd_maskz_name><round_name>):
            Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr81904.c: New test.
