From: Uros Bizjak <ubizjak@gmail.com>
To: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: [COMMITTED] i386: Adjust emulated integer vector mode multiplication costs
Date: Wed, 17 May 2023 20:29:19 +0200
Message-ID: <CAFULd4Y37a95TGsZtHtCZtNXiecqGJ-22AvB+=AxK_Tfh+AaAg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 625 bytes --]

The costs that ix86_multiplication_cost returns for emulated integer
vector modes are wrong and do not reflect the generated instruction
sequences.  Rewrite the handling of the different integer vector
modes and different target ABIs to return real instruction counts,
so that the costs of the various emulated modes are calculated more
accurately.
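
To illustrate the counting scheme, here is a minimal standalone sketch
of the decision table the patch introduces.  It is not the patched
function: VecMode, Target, Costs and mult_cost are hypothetical
stand-ins for machine_mode, the TARGET_* macros, struct
processor_costs and ix86_multiplication_cost, and the scaling that
ix86_vec_cost applies is deliberately left out.  With both unit costs
set to 1, the result is the instruction count named in the comments of
the patch.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC internals; not the real declarations.
enum VecMode { V16QI, V32QI, V64QI, V4SI, V2DI, V4DI, V8DI };

struct Target {
  bool sse4_1 = false, xop = false;
  bool avx512bw = false, avx512dq = false, avx512vl = false;
};

struct Costs { int mulss; int sse_op; };

// Cost of one vector multiply as (multiplies * mulss + other ops * sse_op).
// The ix86_vec_cost scaling of the real function is omitted here.
int mult_cost(VecMode mode, const Target &t, const Costs &c) {
  switch (mode) {
  case V16QI:
    if (t.avx512bw && t.avx512vl) return c.mulss + c.sse_op * 3;
    if (t.xop) return c.mulss * 2 + c.sse_op * 5;
    // fall through
  case V32QI:
    if (t.avx512bw && mode == V32QI) return c.mulss + c.sse_op * 3;
    return c.mulss * 2 + c.sse_op * 7;
  case V64QI:
    return c.mulss * 2 + c.sse_op * 9;
  case V4SI:
    if (t.sse4_1) return c.mulss;            // native pmulld
    return c.mulss * 2 + c.sse_op * 5;       // emulated
  case V2DI:
  case V4DI:
    if (t.avx512dq && t.avx512vl) return c.mulss;  // native vpmullq
    if (t.xop && mode == V2DI) return c.mulss * 2 + c.sse_op * 4;
    // fall through
  case V8DI:
    if (t.avx512dq && mode == V8DI) return c.mulss;  // native vpmullq
    return c.mulss * 3 + c.sse_op * 5;
  }
  return c.mulss;  // any other mode: treated as a native multiply
}

// Usage: with unit costs the result is an instruction count.
static void self_check() {
  Target sse2_only;            // no optional extensions enabled
  Costs unit{1, 1};
  assert(mult_cost(V4SI, sse2_only, unit) == 7);
  assert(mult_cost(V64QI, sse2_only, unit) == 11);
}
```

In this sketch a V2DImode multiply on a target with AVX512DQ but
without AVX512VL falls through to the generic 8-instruction emulation,
which mirrors the fallthrough structure of the switch in the patch.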

gcc/ChangeLog:

    * config/i386/i386.cc (ix86_multiplication_cost): Correct
    calculation of integer vector mode costs to reflect generated
    instruction sequences of different integer vector modes and
    different target ABIs.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.

[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 3014 bytes --]

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 498fac468b5..9ab24242b59 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -20463,36 +20463,52 @@ ix86_multiplication_cost (const struct processor_costs *cost,
     return  ix86_vec_cost (mode,
 			   inner_mode == DFmode ? cost->mulsd : cost->mulss);
   else if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
-    {
-      /* vpmullq is used in this case. No emulation is needed.  */
-      if (TARGET_AVX512DQ)
-	return ix86_vec_cost (mode, cost->mulss);
+    switch (mode)
+      {
+      case V16QImode:
+	/* V*QImode is emulated with 4-11 insns.  */
+	if (TARGET_AVX512BW && TARGET_AVX512VL)
+	  return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 3);
+	else if (TARGET_XOP)
+	  return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 5);
+	/* FALLTHRU */
+      case V32QImode:
+	if (TARGET_AVX512BW && mode == V32QImode)
+	  return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 3);
+	else
+	  return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 7);
 
-      /* V*QImode is emulated with 7-13 insns.  */
-      if (mode == V16QImode || mode == V32QImode)
-	{
-	  int extra = 11;
-	  if (TARGET_XOP && mode == V16QImode)
-	    extra = 5;
-	  else if (TARGET_SSSE3)
-	    extra = 6;
-	  return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * extra);
-	}
-      /* V*DImode is emulated with 5-8 insns.  */
-      else if (mode == V2DImode || mode == V4DImode)
-	{
-	  if (TARGET_XOP && mode == V2DImode)
-	    return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 3);
-	  else
-	    return ix86_vec_cost (mode, cost->mulss * 3 + cost->sse_op * 5);
-	}
-      /* Without sse4.1, we don't have PMULLD; it's emulated with 7
-	 insns, including two PMULUDQ.  */
-      else if (mode == V4SImode && !(TARGET_SSE4_1 || TARGET_AVX))
-	return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 5);
-      else
+      case V64QImode:
+	return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 9);
+
+      case V4SImode:
+	/* pmulld is used in this case. No emulation is needed.  */
+	if (TARGET_SSE4_1)
+	  goto do_native;
+	/* V4SImode is emulated with 7 insns.  */
+	else
+	  return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 5);
+
+      case V2DImode:
+      case V4DImode:
+	/* vpmullq is used in this case. No emulation is needed.  */
+	if (TARGET_AVX512DQ && TARGET_AVX512VL)
+	  goto do_native;
+	/* V*DImode is emulated with 6-8 insns.  */
+	else if (TARGET_XOP && mode == V2DImode)
+	  return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 4);
+	/* FALLTHRU */
+      case V8DImode:
+	/* vpmullq is used in this case. No emulation is needed.  */
+	if (TARGET_AVX512DQ && mode == V8DImode)
+	  goto do_native;
+	else
+	  return ix86_vec_cost (mode, cost->mulss * 3 + cost->sse_op * 5);
+
+      default:
+      do_native:
 	return ix86_vec_cost (mode, cost->mulss);
-    }
+      }
   else
     return (cost->mult_init[MODE_INDEX (mode)] + cost->mult_bit * 7);
 }
