* [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]
From: Pengxuan Zheng @ 2024-06-04 1:09 UTC
To: gcc-patches; +Cc: Pengxuan Zheng

This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is implemented
using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2 (V4SI->V4HI).

	PR target/113882

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/fix_trunc2.c: New test.

Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
---
 gcc/config/aarch64/aarch64-simd.md            | 13 +++++++++++++
 gcc/testsuite/gcc.target/aarch64/fix_trunc2.c | 14 ++++++++++++++
 2 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/fix_trunc2.c

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 868f4486218..096f7b56a27 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3032,6 +3032,19 @@ (define_expand "<fix_trunc_optab><VHSDF:mode><fcvt_target>2"
   "TARGET_SIMD"
   {})
 
+
+(define_expand "fix_truncv4sfv4hi2"
+  [(match_operand:V4HI 0 "register_operand")
+   (match_operand:V4SF 1 "register_operand")]
+  "TARGET_SIMD"
+  {
+    rtx tmp = gen_reg_rtx (V4SImode);
+    emit_insn (gen_fix_truncv4sfv4si2 (tmp, operands[1]));
+    emit_insn (gen_truncv4siv4hi2 (operands[0], tmp));
+    DONE;
+  }
+)
+
 (define_expand "ftrunc<VHSDF:mode>2"
   [(set (match_operand:VHSDF 0 "register_operand")
         (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand")]
diff --git a/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c b/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c
new file mode 100644
index 00000000000..57cc00913a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/fix_trunc2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+f (short *__restrict a, float *__restrict b)
+{
+  a[0] = b[0];
+  a[1] = b[1];
+  a[2] = b[2];
+  a[3] = b[3];
+}
+
+/* { dg-final { scan-assembler-times {fcvtzs\tv[0-9]+.4s, v[0-9]+.4s} 1 } } */
+/* { dg-final { scan-assembler-times {xtn\tv[0-9]+.4h, v[0-9]+.4s} 1 } } */
-- 
2.17.1
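
For readers unfamiliar with the two standard pattern names, the following is a minimal scalar sketch of what the two-step expansion computes. It is not part of the patch and not GCC code: the function name model_fix_truncv4sfv4hi is hypothetical, and the model is only meaningful for inputs whose truncated value fits in a 32-bit integer, since an out-of-range float-to-integer cast is undefined in C.

```c
#include <stdint.h>

/* Hypothetical scalar model of the expansion; not GCC code.
   Step 1 mirrors fix_truncv4sfv4si2: float -> 32-bit integer,
   truncating toward zero.
   Step 2 mirrors truncv4siv4hi2: 32-bit integer -> 16-bit integer.
   The int32 -> int16 cast is implementation-defined in C when the
   value does not fit in 16 bits.  */
void
model_fix_truncv4sfv4hi (int16_t out[4], const float in[4])
{
  int32_t tmp[4];

  /* V4SF -> V4SI.  */
  for (int i = 0; i < 4; i++)
    tmp[i] = (int32_t) in[i];

  /* V4SI -> V4HI.  */
  for (int i = 0; i < 4; i++)
    out[i] = (int16_t) tmp[i];
}
```

For in-range inputs this gives the same results as the per-element assignments in the test function f, which is the sequence the test expects to see emitted as one fcvtzs followed by one xtn.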

* Re: [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]
From: Richard Sandiford @ 2024-06-06 12:29 UTC
To: Pengxuan Zheng; +Cc: gcc-patches, rguenther

Pengxuan Zheng <quic_pzheng@quicinc.com> writes:
> This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is implemented
> using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2 (V4SI->V4HI).
>
> 	PR target/113882
>
> gcc/ChangeLog:
>
> 	* config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.

Could we handle this by extending the target-independent code instead?
Richard mentioned in comment 1 that the current set of intermediate
conversions is hard-coded, but it didn't sound like he was implying that
the set shouldn't change.

Thanks,
Richard

> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/aarch64/fix_trunc2.c: New test.

* Re: [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]
From: Richard Biener @ 2024-06-06 13:22 UTC
To: Richard Sandiford; +Cc: Pengxuan Zheng, gcc-patches

On Thu, 6 Jun 2024, Richard Sandiford wrote:

> Pengxuan Zheng <quic_pzheng@quicinc.com> writes:
> > This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is implemented
> > using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2 (V4SI->V4HI).
> >
> > 	PR target/113882
> >
> > gcc/ChangeLog:
> >
> > 	* config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.
>
> Could we handle this by extending the target-independent code instead?
> Richard mentioned in comment 1 that the current set of intermediate
> conversions is hard-coded, but it didn't sound like he was implying that
> the set shouldn't change.

Yes, much like non-SLP uses supportable_narrowing_operation with any
number of intermediate conversions the SLP case should do something
similar.

Richard.

> Thanks,
> Richard
>
> > gcc/testsuite/ChangeLog:
> >
> > 	* gcc.target/aarch64/fix_trunc2.c: New test.

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

* RE: [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]
From: Pengxuan Zheng (QUIC) @ 2024-06-18 0:05 UTC
To: Richard Sandiford, Pengxuan Zheng (QUIC); +Cc: gcc-patches, rguenther

> Pengxuan Zheng <quic_pzheng@quicinc.com> writes:
> > This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is
> > implemented using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2
> > (V4SI->V4HI).
> >
> > 	PR target/113882
> >
> > gcc/ChangeLog:
> >
> > 	* config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.
>
> Could we handle this by extending the target-independent code instead?
> Richard mentioned in comment 1 that the current set of intermediate
> conversions is hard-coded, but it didn't sound like he was implying that the
> set shouldn't change.

Yes, Richard. I checked the target-independent code. In fact, SLP already
handles this type of intermediate conversions. However, the logic is guarded by
"!flag_trapping_math". Therefore, if we pass -fno-trapping-math, SLP actually
generates the right vectorized code. Also, looks like the check for
"!flag_trapping_math" was added intentionally in r14-2085-g77a50c772771f6 to fix
some PRs. So, I'm not sure what we should do here. Thoughts?

  if (GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode)
      && (code == FLOAT_EXPR ||
          (code == FIX_TRUNC_EXPR && !flag_trapping_math)))

Thanks,
Pengxuan

>
> Thanks,
> Richard
>
> > gcc/testsuite/ChangeLog:
> >
> > 	* gcc.target/aarch64/fix_trunc2.c: New test.
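
To make the effect of the quoted guard easier to see, here is a small standalone model of it in C. This is a sketch under stated assumptions, not GCC source: the enum, the function name intermediate_conversion_allowed and its parameters are hypothetical stand-ins for the tree codes, GET_MODE_SIZE results and flag_trapping_math in the quoted condition.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two tree codes named in the quoted
   condition; these are not GCC's definitions.  */
enum conv_code { CONV_FLOAT_EXPR, CONV_FIX_TRUNC_EXPR };

/* Model of the quoted guard.  LHS_SIZE and RHS_SIZE stand in for the
   two GET_MODE_SIZE values, and TRAPPING_MATH for flag_trapping_math
   (true unless -fno-trapping-math is given).  Returns true when the
   path that uses intermediate conversions may be taken.  */
bool
intermediate_conversion_allowed (unsigned lhs_size, unsigned rhs_size,
                                 enum conv_code code, bool trapping_math)
{
  return lhs_size != rhs_size
         && (code == CONV_FLOAT_EXPR
             || (code == CONV_FIX_TRUNC_EXPR && !trapping_math));
}
```

With sizes 2 and 4 (for example a short result from a float source) and CONV_FIX_TRUNC_EXPR, the model returns false when trapping_math is true and true when it is false, which matches the observation that the test case is only vectorized with -fno-trapping-math.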

* RE: [PATCH] aarch64: Add fix_truncv4sfv4hi2 pattern [PR113882]
From: Richard Biener @ 2024-06-18 5:55 UTC
To: Pengxuan Zheng (QUIC); +Cc: Richard Sandiford, gcc-patches

On Tue, 18 Jun 2024, Pengxuan Zheng (QUIC) wrote:

> > Pengxuan Zheng <quic_pzheng@quicinc.com> writes:
> > > This patch adds the fix_truncv4sfv4hi2 (V4SF->V4HI) pattern which is
> > > implemented using fix_truncv4sfv4si2 (V4SF->V4SI) and then truncv4siv4hi2
> > > (V4SI->V4HI).
> > >
> > > 	PR target/113882
> > >
> > > gcc/ChangeLog:
> > >
> > > 	* config/aarch64/aarch64-simd.md (fix_truncv4sfv4hi2): New pattern.
> >
> > Could we handle this by extending the target-independent code instead?
> > Richard mentioned in comment 1 that the current set of intermediate
> > conversions is hard-coded, but it didn't sound like he was implying that the
> > set shouldn't change.
>
> Yes, Richard. I checked the target-independent code. In fact, SLP already
> handles this type of intermediate conversions. However, the logic is guarded by
> "!flag_trapping_math". Therefore, if we pass -fno-trapping-math, SLP actually
> generates the right vectorized code. Also, looks like the check for
> "!flag_trapping_math" was added intentionally in r14-2085-g77a50c772771f6 to fix
> some PRs. So, I'm not sure what we should do here. Thoughts?
>
>   if (GET_MODE_SIZE (lhs_mode) != GET_MODE_SIZE (rhs_mode)
>       && (code == FLOAT_EXPR ||
>           (code == FIX_TRUNC_EXPR && !flag_trapping_math)))

That is because of missing FE_INVALID(?) when say float -> signed char
doesn't fit but float -> int does and the remaining converts are done
as int -> {short,char}.

There has been multiple rounds of discussion whether flag_trapping_math
should be off by default.

Richard.

> Thanks,
> Pengxuan
> >
> > Thanks,
> > Richard
> >
> > > gcc/testsuite/ChangeLog:
> > >
> > > 	* gcc.target/aarch64/fix_trunc2.c: New test.

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
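
The case described in the reply above (a value that does not fit the narrow type but does fit int) can be illustrated with a small range check. This is only a sketch following the reply's tentative FE_INVALID explanation: the helper names truncation_fits and two_step_hides_range_error are hypothetical, and the code does not read the floating-point status flags, since whether a given compiler and target actually raise FE_INVALID for a direct narrow conversion is not something the thread establishes.

```c
#include <limits.h>
#include <stdbool.h>

/* True if X, truncated toward zero, is representable in [LO, HI].
   NaN compares false on both sides and therefore never fits.
   Hypothetical helper for illustration only.  */
bool
truncation_fits (double x, double lo, double hi)
{
  return x > lo - 1.0 && x < hi + 1.0;
}

/* True if a direct float -> signed char conversion would be out of
   range while the float -> int first step of a two-step conversion
   would be in range.  In that situation the only floating-point
   operation in the two-step sequence sees an in-range value, and the
   following int -> signed char narrowing is a pure integer operation,
   so nothing in the sequence can signal the range error.  */
bool
two_step_hides_range_error (float x)
{
  return truncation_fits (x, INT_MIN, INT_MAX)
         && !truncation_fits (x, SCHAR_MIN, SCHAR_MAX);
}
```

For example, 1000.0f fits int but not signed char, so the function returns true; 100.0f fits both and 3e10f fits neither, so it returns false for those.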