* [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]
From: Pengxuan Zheng @ 2024-06-04 1:09 UTC (permalink / raw)
To: gcc-patches; +Cc: Pengxuan Zheng
This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is implemented
using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2 (V4SI->V4HI).
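For reference, the per-lane effect of this two-step expansion can be modeled
in scalar C. This is only a sketch under stated assumptions: the helper name
fix_trunc_v4sf_v4hi_model is hypothetical, every input is assumed to lie within
the int32_t range (outside it fcvtzs saturates, whereas the C cast is
undefined), and the narrowing step is assumed to keep the low 16 bits of each
lane, as xtn does.

```c
#include <stdint.h>

/* Hypothetical scalar model of the expansion, not GCC code.
   Step 1 mirrors fix_truncv4sfv4si2: truncate each float toward zero.
   Step 2 mirrors truncv4siv4hi2: keep the low 16 bits of each lane.
   Assumes every input is within the int32_t range.  */
static void
fix_trunc_v4sf_v4hi_model (int16_t out[4], const float in[4])
{
  for (int i = 0; i < 4; i++)
    {
      int32_t tmp = (int32_t) in[i];            /* V4SF -> V4SI */
      uint16_t low = (uint16_t) (uint32_t) tmp; /* low 16 bits  */
      /* Reinterpret the low half as a signed 16-bit value without relying
         on implementation-defined narrowing.  */
      out[i] = low >= 0x8000 ? (int16_t) ((int32_t) low - 0x10000)
                             : (int16_t) low;
    }
}
```

Inputs that fit in a short come out unchanged apart from the truncation toward
zero; an input such as 40000.0f fits in the intermediate int but wraps when
narrowed.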
PR target/113882
gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/fix_trunc2.c: New test.
Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
---
gcc/config/aarch64/aarch64-simd.md | 13 +++++++++++++
gcc/testsuite/gcc.target/aarch64/fix_trunc2.c | 14 ++++++++++++++
2 files changed, 27 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/aarch64/fix_trunc2.c
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 868f4486218..096f7b56a27 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3032,6 +3032,19 @@ (define_expand "<fix_trunc_optab><VHSDF:mode><fcvt_target>2"
"TARGET_SIMD"
{})
+
+(define_expand "fix_truncv4sfv4hi2"
+ [(match_operand:V4HI 0 "register_operand")
+ (match_operand:V4SF 1 "register_operand")]
+ "TARGET_SIMD"
+ {
+ rtx tmp = gen_reg_rtx (V4SImode);
+ emit_insn (gen_fix_truncv4sfv4si2 (tmp, operands[1]));
+ emit_insn (gen_truncv4siv4hi2 (operands[0], tmp));
+ DONE;
+ }
+)
+
(define_expand "ftrunc<VHSDF:mode>2"
[(set (match_operand:VHSDF 0 "register_operand")
(unspec:VHSDF [(match_operand:VHSDF 1 "register_operand")]
diff --git a/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c b/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c
new file mode 100644
index 00000000000..57cc00913a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+f (short *__restrict a, float *__restrict b)
+{
+ a[0] = b[0];
+ a[1] = b[1];
+ a[2] = b[2];
+ a[3] = b[3];
+}
+
+/* { dg-final { scan-assembler-times {fcvtzs\tv[0-9]+.4s, v[0-9]+.4s} 1 } } */
+/* { dg-final { scan-assembler-times {xtn\tv[0-9]+.4h, v[0-9]+.4s} 1 } } */
--
2.17.1
* Re: [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]
From: Richard Sandiford @ 2024-06-06 12:29 UTC (permalink / raw)
To: Pengxuan Zheng; +Cc: gcc-patches, rguenther
Pengxuan Zheng <quic_pzheng@quicinc.com> writes:
> This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is implemented
> using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2 (V4SI->V4HI).
>
> PR target/113882
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.
Could we handle this by extending the target-independent code instead?
Richard mentioned in comment 1 that the current set of intermediate
conversions is hard-coded, but it didn't sound like he was implying that
the set shouldn't change.
Thanks,
Richard
* Re: [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]
From: Richard Biener @ 2024-06-06 13:22 UTC (permalink / raw)
To: Richard Sandiford; +Cc: Pengxuan Zheng, gcc-patches
On Thu, 6 Jun 2024, Richard Sandiford wrote:
> Pengxuan Zheng <quic_pzheng@quicinc.com> writes:
> > This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is implemented
> > using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2 (V4SI->V4HI).
> >
> > PR target/113882
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.
>
> Could we handle this by extending the target-independent code instead?
> Richard mentioned in comment 1 that the current set of intermediate
> conversions is hard-coded, but it didn't sound like he was implying that
> the set shouldn't change.
Yes, much like the non-SLP path uses supportable_narrowing_operation with any
number of intermediate conversions, the SLP case should do something
similar.
Richard.
--
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
* RE: [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]
From: Pengxuan Zheng (QUIC) @ 2024-06-18 0:05 UTC (permalink / raw)
To: Richard Sandiford, Pengxuan Zheng (QUIC); +Cc: gcc-patches, rguenther
> Pengxuan Zheng <quic_pzheng@quicinc.com> writes:
> > This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is
> > implemented using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2
> > (V4SI->V4HI).
> >
> > PR target/113882
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.
>
> Could we handle this by extending the target-independent code instead?
> Richard mentioned in comment 1 that the current set of intermediate
> conversions is hard-coded, but it didn't sound like he was implying that the
> set shouldn't change.
Yes, Richard. I checked the target-independent code. In fact, SLP already
handles this type of intermediate conversion. However, the logic is guarded by
"!flag_trapping_math". Therefore, if we pass -fno-trapping-math, SLP actually
generates the right vectorized code. Also, it looks like the check for
"!flag_trapping_math" was added intentionally in r14-2085-g77a50c772771f6 to fix
some PRs. So, I'm not sure what we should do here. Thoughts?
if (GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode)
&& (code == FLOAT_EXPR ||
(code == FIX_TRUNC_EXPR && !flag_trapping_math)))
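To reproduce that observation, a variant of the testcase from the patch could
be compiled with the extra option. This is a sketch only: the choice of options
is an assumption, and no scan-assembler directives are included because the
exact instruction sequence produced by the target-independent path is not
stated in this thread.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -fno-trapping-math" } */

/* Same function as in fix_trunc2.c; only the options above differ.  */
void
f (short *__restrict a, float *__restrict b)
{
  a[0] = b[0];
  a[1] = b[1];
  a[2] = b[2];
  a[3] = b[3];
}
```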
Thanks,
Pengxuan
* RE: [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]
From: Richard Biener @ 2024-06-18 5:55 UTC (permalink / raw)
To: Pengxuan Zheng (QUIC); +Cc: Richard Sandiford, gcc-patches
On Tue, 18 Jun 2024, Pengxuan Zheng (QUIC) wrote:
> > Pengxuan Zheng <quic_pzheng@quicinc.com> writes:
> > > This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is
> > > implemented using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2
> > > (V4SI->V4HI).
> > >
> > > PR target/113882
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.
> >
> > Could we handle this by extending the target-independent code instead?
> > Richard mentioned in comment 1 that the current set of intermediate
> > conversions is hard-coded, but it didn't sound like he was implying that the
> > set shouldn't change.
>
> Yes, Richard. I checked the target-independent code. In fact, SLP already
> handles this type of intermediate conversion. However, the logic is guarded by
> "!flag_trapping_math". Therefore, if we pass -fno-trapping-math, SLP actually
> generates the right vectorized code. Also, it looks like the check for
> "!flag_trapping_math" was added intentionally in r14-2085-g77a50c772771f6 to fix
> some PRs. So, I'm not sure what we should do here. Thoughts?
>
> if (GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode)
> && (code == FLOAT_EXPR ||
> (code == FIX_TRUNC_EXPR && !flag_trapping_math)))
That is because of a missing FE_INVALID(?) when, say, float -> signed char
doesn't fit but float -> int does, and the remaining conversions are done
as int -> {short,char}.
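The affected inputs can be characterized with a small scalar sketch. It rests
on the assumption, flagged as uncertain above, that a direct float -> short
conversion is expected to raise FE_INVALID for out-of-range inputs; the helper
names trunc_fits and invalid_lost_by_two_step are hypothetical, and a 32-bit
int is assumed.

```c
#include <limits.h>
#include <stdbool.h>

/* True if truncating X toward zero gives a value in [LO, HI].
   NaN compares false and therefore never fits.  */
static bool
trunc_fits (float x, double lo, double hi)
{
  double d = x;
  return d > lo - 1.0 && d < hi + 1.0;
}

/* Hypothetical helper: true when X is out of range for short but in range
   for int.  In that case the float -> int step has nothing to report and
   the int -> short step is a plain integer truncation, so the two-step
   sequence has no point at which the invalid conversion is signalled.  */
static bool
invalid_lost_by_two_step (float x)
{
  return trunc_fits (x, INT_MIN, INT_MAX)
         && !trunc_fits (x, SHRT_MIN, SHRT_MAX);
}
```

For example, 40000.0f falls in this gap, while 100.0f (fits both) and 3.0e9f
(fits neither, so the first step already sees an out-of-range value) do not.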
There have been multiple rounds of discussion about whether flag_trapping_math
should be off by default.
Richard.