public inbox for gcc-patches@gcc.gnu.org
* [x86 PATCH] Fix FAIL of gcc.target/i386/pr78794.c on ia32.
@ 2023-06-27 18:40 Roger Sayle
  2023-06-27 20:02 ` Uros Bizjak
From: Roger Sayle @ 2023-06-27 18:40 UTC (permalink / raw)
  To: gcc-patches; +Cc: 'Uros Bizjak'

[-- Attachment #1: Type: text/plain, Size: 1344 bytes --]


This patch fixes the FAIL of gcc.target/i386/pr78794.c on ia32, which
is caused by minor STV rtx_cost differences with -march=silvermont.
It turns out that generic tuning results in pandn, but the lack of
accurate parameterization for COMPARE in compute_convert_gain combined
with small differences in scalar<->SSE costs on silvermont results in
this DImode chain not being converted.

The solution is to provide more accurate costs/gains for converting
(DImode and SImode) comparisons.
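
As a rough illustration of the gains the patch assigns, the following standalone sketch reproduces the arithmetic from the new COMPARE case outside of GCC. The enum and function names are hypothetical and exist only for this example. It assumes that m is the number of scalar instructions needed per operation (2 for DImode on ia32, otherwise 1), which the patch hunk itself does not show. The only GCC fact relied on is that COSTS_N_INSNS (N) scales N by 4.

```cpp
#include <cassert>

// Same scaling as GCC's COSTS_N_INSNS macro: one instruction = 4 cost units.
constexpr int costs_n_insns(int n) { return n * 4; }

// Hypothetical names for the comparison shapes the patch distinguishes.
enum class CompareShape {
  NonZeroCompare,  // cmp vs. pxor;pshufd;ptest
  PlainTest,       // test vs. pshufd;ptest
  AndTest,         // and;test vs. pshufd;ptest
  AndNotTest       // andn;test (BMI) or not;and;test vs. pandn;pshufd;ptest
};

// Gain of converting one scalar comparison to PTEST.
// m is ASSUMED to be the scalar instruction count per operation.
constexpr int compare_gain(CompareShape shape, int m, bool target_bmi) {
  switch (shape) {
    case CompareShape::NonZeroCompare: return costs_n_insns(m - 3);
    case CompareShape::PlainTest:      return costs_n_insns(m - 2);
    case CompareShape::AndTest:        return costs_n_insns(2 * m - 2);
    case CompareShape::AndNotTest:
      return costs_n_insns(target_bmi ? 2 * m - 3 : 3 * m - 3);
  }
  return 0;  // not reached for valid enumerators
}

// Worked example: a DImode and-not comparison on ia32 (m = 2).
inline void check_andnot_dimode_ia32() {
  assert(compare_gain(CompareShape::AndNotTest, 2, false) == 12);  // no BMI
  assert(compare_gain(CompareShape::AndNotTest, 2, true) == 4);    // with BMI
}
```

Under these assumptions the and-not comparison is a net win for DImode on ia32, whereas with m = 1 every shape except and;test without NOT yields a zero or negative gain.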

I'd been holding off on doing this as I'd thought it would be possible
to turn pandn;ptestz into ptestc (for an even bigger scalar-to-vector
win) but I've recently realized that these optimizations (as I've
implemented them) occur in the wrong order (stv2 occurs after
combine), so it isn't easy for STV to convert CCZmode into CCCmode.
Doh!  Perhaps something can be done in peephole2...
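
To make the pandn;ptestz versus ptestc remark concrete, here is a small software model of the flag semantics. It is only a sketch: the Vec128 and PtestFlags types and the function names are invented for this example, and it models the architectural definition of PANDN and PTEST rather than anything in GCC. PTEST a, b sets ZF when (a AND b) is zero and CF when ((NOT a) AND b) is zero, so testing the PANDN result against itself yields in ZF the same bit that a single PTEST of the original operands yields in CF.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value modelled as two 64-bit halves.
struct Vec128 { std::uint64_t lo, hi; };

// The two flags PTEST defines.
struct PtestFlags { bool zf, cf; };

// PANDN dest, src computes (NOT dest) AND src.
constexpr Vec128 pandn(Vec128 a, Vec128 b) {
  return {~a.lo & b.lo, ~a.hi & b.hi};
}

// PTEST a, b: ZF = ((a AND b) == 0), CF = (((NOT a) AND b) == 0).
constexpr PtestFlags ptest(Vec128 a, Vec128 b) {
  return {((a.lo & b.lo) | (a.hi & b.hi)) == 0,
          ((~a.lo & b.lo) | (~a.hi & b.hi)) == 0};
}

// pandn;ptestz: result lives in ZF (CCZmode in GCC terms).
constexpr bool andnot_is_zero_via_zf(Vec128 a, Vec128 b) {
  return ptest(pandn(a, b), pandn(a, b)).zf;
}

// ptestc: same answer, but it lives in CF (CCCmode in GCC terms).
constexpr bool andnot_is_zero_via_cf(Vec128 a, Vec128 b) {
  return ptest(a, b).cf;
}

inline void check_equivalence() {
  constexpr Vec128 a{0xFF, 0x0F}, b{0x0F, 0x0F};
  assert(andnot_is_zero_via_zf(a, b) == andnot_is_zero_via_cf(a, b));
}
```

The two forms agree on the value but not on which flag carries it, which is why the consumer of the comparison would have to change mode as well.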


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-06-27  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        PR target/78794
        * config/i386/i386-features.cc (compute_convert_gain): Provide
        more accurate gains for conversion of scalar comparisons to
        PTEST.


Thanks for your patience.
Roger
--


[-- Attachment #2: patchvc.txt --]
[-- Type: text/plain, Size: 1078 bytes --]

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 4a3b07a..53bec08 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -631,7 +631,31 @@ general_scalar_chain::compute_convert_gain ()
 	    break;
 
 	  case COMPARE:
-	    /* Assume comparison cost is the same.  */
+	    if (XEXP (src, 1) != const0_rtx)
+	      {
+		/* cmp vs. pxor;pshufd;ptest.  */
+		igain += COSTS_N_INSNS (m - 3);
+	      }
+	    else if (GET_CODE (XEXP (src, 0)) != AND)
+	      {
+		/* test vs. pshufd;ptest.  */
+		igain += COSTS_N_INSNS (m - 2);
+	      }
+	    else if (GET_CODE (XEXP (XEXP (src, 0), 0)) != NOT)
+	      {
+		/* and;test vs. pshufd;ptest.  */
+		igain += COSTS_N_INSNS (2 * m - 2);
+	      }
+	    else if (TARGET_BMI)
+	      {
+		/* andn;test vs. pandn;pshufd;ptest.  */
+		igain += COSTS_N_INSNS (2 * m - 3);
+	      }
+	    else
+	      {
+		/* not;and;test vs. pandn;pshufd;ptest.  */
+		igain += COSTS_N_INSNS (3 * m - 3);
+	      }
 	    break;
 
 	  case CONST_INT:

* Re: [x86 PATCH] Fix FAIL of gcc.target/i386/pr78794.c on ia32.
  2023-06-27 18:40 [x86 PATCH] Fix FAIL of gcc.target/i386/pr78794.c on ia32 Roger Sayle
@ 2023-06-27 20:02 ` Uros Bizjak
From: Uros Bizjak @ 2023-06-27 20:02 UTC (permalink / raw)
  To: Roger Sayle; +Cc: gcc-patches

On Tue, Jun 27, 2023 at 8:40 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch fixes the FAIL of gcc.target/i386/pr78794.c on ia32, which
> is caused by minor STV rtx_cost differences with -march=silvermont.
> It turns out that generic tuning results in pandn, but the lack of
> accurate parameterization for COMPARE in compute_convert_gain combined
> with small differences in scalar<->SSE costs on silvermont results in
> this DImode chain not being converted.
>
> The solution is to provide more accurate costs/gains for converting
> (DImode and SImode) comparisons.
>
> I'd been holding off on doing this as I'd thought it would be possible
> to turn pandn;ptestz into ptestc (for an even bigger scalar-to-vector
> win) but I've recently realized that these optimizations (as I've
> implemented them) occur in the wrong order (stv2 occurs after
> combine), so it isn't easy for STV to convert CCZmode into CCCmode.
> Doh!  Perhaps something can be done in peephole2...
>
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2023-06-27  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         PR target/78794
>         * config/i386/i386-features.cc (compute_convert_gain): Provide
>         more accurate gains for conversion of scalar comparisons to
>         PTEST.

LGTM.

Thanks,
Uros.

>
> Thanks for your patience.
> Roger
> --
>
