[Bug middle-end/103641] [11/12 regression] Severe compile time regression in SLP vectorize step


From: "roger at nextmovesoftware dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug middle-end/103641] [11/12 regression] Severe compile time regression in SLP vectorize step
Date: Mon, 24 Jan 2022 16:49:36 +0000	[thread overview]
Message-ID: <bug-103641-4-sPmsUSdblj@http.gcc.gnu.org/bugzilla/> (raw)
In-Reply-To: <bug-103641-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103641

--- Comment #22 from Roger Sayle <roger at nextmovesoftware dot com> ---
I completely agree with Richard that the decision to vectorize or not to
vectorize should be made elsewhere, taking the whole function/loop into account.
It's quite reasonable to synthesize a slow vector multiply if there's an
overall benefit from SLP.  What I think is required is that the "baseline" cost
should be the cost of moving from the vector to a scalar mode, performing the
multiplication(s) as a scalar and moving the result back again, i.e. we're
assuming that we're always going to multiply the value in a vector register;
we're just choosing the cheapest implementation for it.  For the xxhash.i
testcase, I'm seeing DI mode multiplications with COSTS_N_INSNS(30) [i.e. a
mult_cost of 120].  Even with slow inter-unit moves, it must be possible to do
this faster on AArch64?  In fact, we'll probably vectorize more in SLP if we
have the option to shuffle data back to the scalar multiplier if required.
Perhaps even a define_insn_and_split of mulv2di3 could fool the middle-end into
thinking we can do this "natively" via an optab.
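
To make the "baseline" idea concrete, here is a minimal sketch in C of how
such a cap might be computed.  Only COSTS_N_INSNS (which scales an instruction
count by 4, matching the 30 -> 120 figure above) follows GCC's convention; the
struct, the function names and every cost number are hypothetical and not taken
from any real backend.

```c
/* GCC's convention: one "instruction" costs 4 units, so
   COSTS_N_INSNS (30) == 120 as quoted above.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical per-target costs for a vector of NLANES elements.  */
struct mul_costs {
  int vec_to_scalar;  /* move one lane from a vector to a scalar register */
  int scalar_mul;     /* one scalar multiplication */
  int scalar_to_vec;  /* move one lane back into the vector register */
  int nlanes;         /* e.g. 2 for V2DI */
};

/* Baseline: extract each lane, multiply it as a scalar, insert it back.  */
static int baseline_cost(const struct mul_costs *c)
{
  return c->nlanes * (c->vec_to_scalar + c->scalar_mul + c->scalar_to_vec);
}

/* The vector multiply never costs more than the scalar round trip:
   pick the cheaper of the synthesized sequence and the baseline.  */
static int vector_mul_cost(const struct mul_costs *c, int synth_cost)
{
  int base = baseline_cost(c);
  return synth_cost < base ? synth_cost : base;
}
```

With made-up costs of 2, 4 and 2 instructions for the three steps on a
two-lane vector, the baseline is 64 units, so a synthesized sequence costing
COSTS_N_INSNS (30) would lose to the scalar round trip.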

Note that multipliers used in cryptographic hash functions are sometimes
(chosen to be) pathological to synth_mult.  Like the design of DES's S-boxes,
these are coefficients designed to be slow to implement in software [and faster
in custom hardware]: 64-bit values with around 32 (random) bits set.
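
As a rough illustration of why such constants are expensive, the sketch below
counts the operations a naive shift-and-add expansion would need, one shifted
term per set bit.  This is not how synth_mult works (it also searches
subtractions and factorizations); the helper names are hypothetical, and the
constant used in the note after the code is just an illustrative 64-bit value
of this kind, not one taken from the testcase.

```c
#include <stdint.h>

/* Count set bits by repeatedly clearing the lowest one.  */
static int popcount64(uint64_t x)
{
  int n = 0;
  while (x) {
    x &= x - 1;
    n++;
  }
  return n;
}

/* Operations for the naive expansion x*c = sum of (x << k) over set bits k:
   one shift per set bit other than bit 0, plus one add to join each
   additional term.  */
static int naive_shift_add_ops(uint64_t c)
{
  int bits = popcount64(c);
  if (bits == 0)
    return 0;
  int shifts = bits - (int)(c & 1);
  int adds = bits - 1;
  return shifts + adds;
}
```

For the example value 0x9E3779B185EBCA87, which has 36 bits set, the naive
expansion needs 70 operations, whereas a power of two such as 8 needs a single
shift.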

I/we can try to speed up the recursion in synth_mult, and/or increase the size
of the hash-table cache [which will help hppa64 and other targets with slow
multipliers] but that's perhaps just working around the deeper issue with this
PR.


