* [AARCH64] fnma<mode>4: scalar vs vector and placement of neg.
From: Andrew Pinski @ 2017-06-28 2:52 UTC (permalink / raw)
To: GCC Mailing List
Hi,
I was looking into why we don't produce fmls with a scalar register
as the last argument, but I found a difference in how fnma<mode>4 is
described in RTL, which I think is causing the missed optimization.
Look at the scalar version:
(define_insn "fnma<mode>4"
  [(set (match_operand:GPF_F16 0 "register_operand" "=w")
        (fma:GPF_F16
          (neg:GPF_F16 (match_operand:GPF_F16 1 "register_operand" "w"))
          (match_operand:GPF_F16 2 "register_operand" "w")
          (match_operand:GPF_F16 3 "register_operand" "w")))]
  "TARGET_FLOAT"
  "fmsub\\t%<s>0, %<s>1, %<s>2, %<s>3"
  [(set_attr "type" "fmac<stype>")]
)
vs the vector version:
(define_insn "fnma<mode>4"
  [(set (match_operand:VHSDF 0 "register_operand" "=w")
        (fma:VHSDF
          (match_operand:VHSDF 1 "register_operand" "w")
          (neg:VHSDF
            (match_operand:VHSDF 2 "register_operand" "w"))
          (match_operand:VHSDF 3 "register_operand" "0")))]
  "TARGET_SIMD"
  "fmls\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
  [(set_attr "type" "neon_fp_mla_<stype><q>")]
)
Notice how the neg is in a different location in each of them. What is
the reason for that?
Thanks,
Andrew
* Re: [AARCH64] fnma<mode>4: scalar vs vector and placement of neg.
From: Richard Sandiford @ 2017-06-29 7:53 UTC (permalink / raw)
To: Andrew Pinski; +Cc: GCC Mailing List
Andrew Pinski <pinskia@gmail.com> writes:
> Hi,
> I was looking into why we don't produce fmls with a scalar register
> as the last argument, but I found a difference in how fnma<mode>4 is
> described in RTL, which I think is causing the missed optimization.
> Look at the scalar version:
>
> (define_insn "fnma<mode>4"
>   [(set (match_operand:GPF_F16 0 "register_operand" "=w")
>         (fma:GPF_F16
>           (neg:GPF_F16 (match_operand:GPF_F16 1 "register_operand" "w"))
>           (match_operand:GPF_F16 2 "register_operand" "w")
>           (match_operand:GPF_F16 3 "register_operand" "w")))]
>   "TARGET_FLOAT"
>   "fmsub\\t%<s>0, %<s>1, %<s>2, %<s>3"
>   [(set_attr "type" "fmac<stype>")]
> )
>
> vs the vector version:
> (define_insn "fnma<mode>4"
>   [(set (match_operand:VHSDF 0 "register_operand" "=w")
>         (fma:VHSDF
>           (match_operand:VHSDF 1 "register_operand" "w")
>           (neg:VHSDF
>             (match_operand:VHSDF 2 "register_operand" "w"))
>           (match_operand:VHSDF 3 "register_operand" "0")))]
>   "TARGET_SIMD"
>   "fmls\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
>   [(set_attr "type" "neon_fp_mla_<stype><q>")]
> )
>
> Notice how the neg is in a different location in each of them. What is
> the reason for that?
Yeah, that looks weird. We should be treating the first two operands of
FMA as commutative, which with the normal canonicalization rules would
make the scalar version right and the vector version the one that should
change.
Does that give the output you wanted? Or does it need to be the other
way around?
Thanks,
Richard