public inbox for gcc@gcc.gnu.org
* [AARCH64] fnma<mode>4: scalar vs vector and placement of neg.
@ 2017-06-28  2:52 Andrew Pinski
  2017-06-29  7:53 ` Richard Sandiford
From: Andrew Pinski @ 2017-06-28  2:52 UTC (permalink / raw)
  To: GCC Mailing List

Hi,
  I was looking into why we don't produce fmls with a scalar register
as the last argument, and I found a difference in how fnma<mode>4 is
described in RTL that I think is causing the missed optimization.
Look at the scalar version:


(define_insn "fnma<mode>4"
  [(set (match_operand:GPF_F16 0 "register_operand" "=w")
        (fma:GPF_F16
          (neg:GPF_F16 (match_operand:GPF_F16 1 "register_operand" "w"))
          (match_operand:GPF_F16 2 "register_operand" "w")
          (match_operand:GPF_F16 3 "register_operand" "w")))]
  "TARGET_FLOAT"
  "fmsub\\t%<s>0, %<s>1, %<s>2, %<s>3"
  [(set_attr "type" "fmac<stype>")]
)

vs the vector version:
(define_insn "fnma<mode>4"
  [(set (match_operand:VHSDF 0 "register_operand" "=w")
        (fma:VHSDF
          (match_operand:VHSDF 1 "register_operand" "w")
          (neg:VHSDF
            (match_operand:VHSDF 2 "register_operand" "w"))
          (match_operand:VHSDF 3 "register_operand" "0")))]
  "TARGET_SIMD"
  "fmls\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
  [(set_attr "type" "neon_fp_mla_<stype><q>")]
)

Notice how the neg is in a different location in each of them.  What is
the reason for that?

Thanks,
Andrew


* Re: [AARCH64] fnma<mode>4: scalar vs vector and placement of neg.
  2017-06-28  2:52 [AARCH64] fnma<mode>4: scalar vs vector and placement of neg Andrew Pinski
@ 2017-06-29  7:53 ` Richard Sandiford
From: Richard Sandiford @ 2017-06-29  7:53 UTC (permalink / raw)
  To: Andrew Pinski; +Cc: GCC Mailing List

Andrew Pinski <pinskia@gmail.com> writes:
> Hi,
>   I was looking into why we don't produce fmls with a scalar register
> as the last argument, and I found a difference in how fnma<mode>4 is
> described in RTL that I think is causing the missed optimization.
> Look at the scalar version:
>
> (define_insn "fnma<mode>4"
>   [(set (match_operand:GPF_F16 0 "register_operand" "=w")
>         (fma:GPF_F16
>           (neg:GPF_F16 (match_operand:GPF_F16 1 "register_operand" "w"))
>           (match_operand:GPF_F16 2 "register_operand" "w")
>           (match_operand:GPF_F16 3 "register_operand" "w")))]
>   "TARGET_FLOAT"
>   "fmsub\\t%<s>0, %<s>1, %<s>2, %<s>3"
>   [(set_attr "type" "fmac<stype>")]
> )
>
> vs the vector version:
> (define_insn "fnma<mode>4"
>   [(set (match_operand:VHSDF 0 "register_operand" "=w")
>         (fma:VHSDF
>           (match_operand:VHSDF 1 "register_operand" "w")
>           (neg:VHSDF
>             (match_operand:VHSDF 2 "register_operand" "w"))
>           (match_operand:VHSDF 3 "register_operand" "0")))]
>   "TARGET_SIMD"
>   "fmls\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
>   [(set_attr "type" "neon_fp_mla_<stype><q>")]
> )
>
> Notice how the neg is in a different location in each of them.  What is
> the reason for that?

Yeah, that looks weird.  We should be treating the first two operands of
FMA as commutative, which with the normal canonicalization rules would
make the scalar version right and the vector version the one that should
change.

Does that give the output you wanted?  Or does it need to be the other
way around?

Thanks,
Richard
