* [PATCH] [x86] Add some preference for floating point rtl ifcvt when sse4.1 is not available
From: liuhongt @ 2024-06-03 3:09 UTC
To: gcc-patches; +Cc: ubizjak
Without TARGET_SSE4_1, movdfcc/movsfcc expand to 3 instructions (pand,
pandn and por), so the converted sequence can fail the cost comparison.
Increasing the branch cost could hurt performance for other modes, so
add a preference specifically for floating-point ifcvt.
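
For illustration only, the three-instruction select described above can be written by hand with SSE2 intrinsics. This is a minimal sketch, not the code GCC generates: the function name select_lt is hypothetical, and the intrinsics map to the double-precision forms (cmpltsd, andpd, andnpd, orpd) rather than the integer pand/pandn/por named in the message.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: branchless "x < limit ? a : b" for doubles.
   Builds an all-ones/all-zeros mask from the compare, then blends
   the two arms with and / andnot / or.  */
static double select_lt(double x, double limit, double a, double b)
{
    /* Low lane is all ones when x < limit, all zeros otherwise. */
    __m128d mask = _mm_cmplt_sd(_mm_set_sd(x), _mm_set_sd(limit));

    __m128d taken     = _mm_and_pd(mask, _mm_set_sd(a));     /* mask & a  */
    __m128d not_taken = _mm_andnot_pd(mask, _mm_set_sd(b));  /* ~mask & b */

    return _mm_cvtsd_f64(_mm_or_pd(taken, not_taken));       /* combine   */
}
```

When one arm is zero, as in the testcase pattern *d=(*d<.5)?.7:0, the andnot and or steps are unnecessary and a single and with the mask is enough.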
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?
gcc/ChangeLog:
* config/i386/i386.cc (ix86_noce_conversion_profitable_p): Add
some preference for floating point ifcvt when SSE4.1 is not
available.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr115299.c: New test.
* gcc.target/i386/pr86722.c: Adjust testcase.
---
gcc/config/i386/i386.cc | 17 +++++++++++++++++
gcc/testsuite/gcc.target/i386/pr115299.c | 10 ++++++++++
gcc/testsuite/gcc.target/i386/pr86722.c | 2 +-
3 files changed, 28 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr115299.c
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 1a0206ab573..271da127a89 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -24879,6 +24879,23 @@ ix86_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info)
return false;
}
}
+
+ /* W/o TARGET_SSE4_1, it takes 3 instructions (pand, pandn and por)
+ for movdfcc/movsfcc, and could possibly fail cost comparison.
+ Increase branch cost will hurt performance for other modes, so
+ specially add some preference for floating point ifcvt. */
+ if (!TARGET_SSE4_1 && if_info->x
+ && GET_MODE_CLASS (GET_MODE (if_info->x)) == MODE_FLOAT
+ && if_info->speed_p)
+ {
+ unsigned cost = seq_cost (seq, true);
+
+ if (cost <= if_info->original_cost)
+ return true;
+
+ return cost <= (if_info->max_seq_cost + COSTS_N_INSNS (2));
+ }
+
return default_noce_conversion_profitable_p (seq, if_info);
}
diff --git a/gcc/testsuite/gcc.target/i386/pr115299.c b/gcc/testsuite/gcc.target/i386/pr115299.c
new file mode 100644
index 00000000000..53c5899136a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr115299.c
@@ -0,0 +1,10 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mno-sse4.1 -msse2" } */
+
+void f(double*d,double*e){
+ for(;d<e;++d)
+ *d=(*d<.5)?.7:0;
+}
+
+/* { dg-final { scan-assembler {(?n)(?:cmpnltsd|cmpltsd)} } } */
+/* { dg-final { scan-assembler {(?n)(?:andnpd|andpd)} } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr86722.c b/gcc/testsuite/gcc.target/i386/pr86722.c
index 4de2ca1a6c0..e266a1e56c2 100644
--- a/gcc/testsuite/gcc.target/i386/pr86722.c
+++ b/gcc/testsuite/gcc.target/i386/pr86722.c
@@ -6,5 +6,5 @@ void f(double*d,double*e){
*d=(*d<.5)?.7:0;
}
-/* { dg-final { scan-assembler-not "andnpd" } } */
+/* { dg-final { scan-assembler-times {(?n)(?:andnpd|andpd)} 1 } } */
/* { dg-final { scan-assembler-not "orpd" } } */
--
2.31.1
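
The profitability check added to ix86_noce_conversion_profitable_p can be modelled outside GCC as a small standalone function. This is a sketch under stated assumptions: the struct, the macro and the function name are hypothetical stand-ins, the macro assumes GCC's COSTS_N_INSNS (N) is N * 4, and the guarding conditions (!TARGET_SSE4_1, MODE_FLOAT, speed_p) are left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's COSTS_N_INSNS, assumed to be N * 4. */
#define SKETCH_COSTS_N_INSNS(n) ((n) * 4)

/* Hypothetical subset of struct noce_if_info: only the two cost
   fields the patch reads.  */
struct sketch_if_info {
    unsigned original_cost;  /* cost of the branchy original code   */
    unsigned max_seq_cost;   /* normal budget for the converted seq */
};

/* Accept the conversion if it is no more expensive than the original,
   or if it fits the normal budget plus a slack of two instructions.  */
static bool sketch_fp_ifcvt_profitable(unsigned seq_cost,
                                       const struct sketch_if_info *info)
{
    if (seq_cost <= info->original_cost)
        return true;

    return seq_cost <= info->max_seq_cost + SKETCH_COSTS_N_INSNS(2);
}
```

With original_cost = 8 and max_seq_cost = 12, a sequence costing 20 is still accepted (12 + 8), while one costing 21 is rejected.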
* Re: [PATCH] [x86] Add some preference for floating point rtl ifcvt when sse4.1 is not available
From: Uros Bizjak @ 2024-06-03 7:14 UTC
To: liuhongt; +Cc: gcc-patches
On Mon, Jun 3, 2024 at 5:11 AM liuhongt <hongtao.liu@intel.com> wrote:
>
> W/o TARGET_SSE4_1, it takes 3 instructions (pand, pandn and por) for
> movdfcc/movsfcc, and could possibly fail cost comparison. Increase
> branch cost could hurt performance for other modes, so specially add
> some preference for floating point ifcvt.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/i386.cc (ix86_noce_conversion_profitable_p): Add
> some preference for floating point ifcvt when SSE4.1 is not
> available.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr115299.c: New test.
> * gcc.target/i386/pr86722.c: Adjust testcase.
LGTM.
Thanks,
Uros.