public inbox for gcc-patches@gcc.gnu.org
* [RFC] Partial vectors for s390
From: Robin Dapp @ 2021-10-20  8:34 UTC (permalink / raw)
  To: GCC Patches

Hi,

I have been playing around with making Kewen's partial vector changes
work on s390.

We have a vll instruction that takes the index of the highest byte to
load rather than a byte count.  The rather unfortunate consequence is
that a length of zero cannot be specified.  The partial vector
framework, however, relies heavily on being able to turn a len_load
into a NOP by passing a length of zero.
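
To make the mismatch concrete, here is a minimal C model of the two
semantics.  It is only a sketch: the function names and the 16-byte
vector size constant are illustrative, and the second function assumes
that any index at or above the last byte means "load the whole vector".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 16 };

/* Model of what the middle-end expects from len_load: copy exactly LEN
   bytes, so LEN == 0 touches nothing.  Returns the bytes copied.  */
static inline size_t
len_load_model (uint8_t dst[VEC_BYTES], const uint8_t *src, size_t len)
{
  assert (dst != NULL && src != NULL);
  if (len > VEC_BYTES)
    len = VEC_BYTES;
  memcpy (dst, src, len);
  return len;
}

/* Model of a vll-like load: the operand is the index of the highest
   byte to load, so at least one byte is always copied.  Assumption:
   an index >= VEC_BYTES - 1 loads the full vector.  */
static inline size_t
highest_index_load_model (uint8_t dst[VEC_BYTES], const uint8_t *src,
                          uint32_t highest)
{
  assert (dst != NULL && src != NULL);
  size_t n = highest >= (uint32_t) (VEC_BYTES - 1)
             ? (size_t) VEC_BYTES : (size_t) highest + 1;
  memcpy (dst, src, n);
  return n;
}
```

With the first model a length of 0 is a no-op; with the second the
smallest possible operand, 0, still loads one byte.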

After confirming that an additional zero-check before each vll is
definitely too slow across SPEC, and after some discussion with Kewen,
we figured the easiest way forward is to exclude loops with multiple
VFs (despite giving up vectorization opportunities).  These are prone
to len_loads with a length of zero, while the regular induction
variable check prevents them in single-VF loops.
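
As a rough illustration of why the multi-vector case is the
problematic one, consider a simplified model (not the vectorizer's
actual code; the helper name and parameters are made up) in which each
loop iteration covers several vectors and every vector gets its own
length derived from the number of remaining elements:

```c
#include <assert.h>
#include <stddef.h>

/* Length for vector number VEC_IDX (0-based) of one loop iteration,
   given REMAINING elements still to process and ELEMS_PER_VEC
   elements per full vector.  Hypothetical helper for illustration.  */
static inline size_t
partial_len (size_t remaining, size_t elems_per_vec, size_t vec_idx)
{
  assert (elems_per_vec > 0);
  size_t start = vec_idx * elems_per_vec;
  if (remaining <= start)
    return 0;                 /* This vector has nothing left to load.  */
  size_t left = remaining - start;
  return left < elems_per_vec ? left : elems_per_vec;
}
```

With one vector per iteration the loop only runs while remaining > 0,
so partial_len (remaining, n, 0) is at least 1.  With two vectors per
iteration and, say, 3 remaining elements of 4 per vector, the first
vector gets a length of 3 and the second a length of 0.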

So, as a quick hack, I went with

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 75f24e7c4f6..f79222daeb6 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -1170,6 +1170,9 @@ vect_verify_loop_lens (loop_vec_info loop_vinfo)
    if (LOOP_VINFO_LENS (loop_vinfo).is_empty ())
      return false;

+  if (LOOP_VINFO_LENS (loop_vinfo).length () > 1)
+    return false;
+

which could be made a hook, eventually.  FWIW this is sufficient to make
bootstrap, regtest and compiling the SPEC suites succeed.  I'm unsure
whether we are now guaranteed not to emit a len_load with a length of
zero, though.  On top of that, I subtract 1 from the passed length in
the expander, which is presumably also not ideal.
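
A small sketch of why the subtraction only works as long as zero never
shows up.  The helper names are hypothetical, and the "index of 15 or
more loads everything" behaviour is an assumption of this model:

```c
#include <assert.h>
#include <stdint.h>

enum { VEC_BYTES = 16 };

/* Convert a byte count into a highest-byte index.  Unsigned
   arithmetic: a count of 0 wraps around to UINT32_MAX.  */
static inline uint32_t
len_to_highest_index (uint32_t len)
{
  return len - 1u;
}

/* Same conversion, but insisting that the caller never passes 0.  */
static inline uint32_t
len_to_highest_index_checked (uint32_t len)
{
  assert (len > 0);
  return len - 1u;
}

/* Bytes a highest-index load would touch, assuming any index at or
   above the last byte of the vector means a full load.  */
static inline uint32_t
bytes_loaded (uint32_t highest)
{
  return highest >= (uint32_t) (VEC_BYTES - 1)
         ? (uint32_t) VEC_BYTES : highest + 1u;
}
```

In this model a length of 0 does not become a NOP: it wraps to a huge
index and would load the full 16 bytes.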

There are some regressions that I haven't fully analyzed yet.  Whether
and when to actually enable this feature could be a backend decision,
though, with the necessary middle-end checks in place.

Any ideas on how to properly check for the zero condition and exclude 
the cases that cause it? Kewen suggested enriching the len_load optabs 
with a separate parameter.
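
For discussion, the check from the quick hack, made conditional on a
target property, might look roughly like the following standalone
sketch.  The struct, its field and the function name are invented for
illustration and do not correspond to any existing hook or optab
operand:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of what the target's len_load can do.  */
struct partial_vec_target
{
  bool len_load_accepts_zero;
};

/* Mirror of the two early exits in the hack: no length groups means
   nothing to control, and more than one group is rejected when the
   target cannot express a zero length.  */
static inline bool
can_use_len_controls (size_t n_len_groups,
                      const struct partial_vec_target *target)
{
  assert (target != NULL);
  if (n_len_groups == 0)
    return false;
  if (n_len_groups > 1 && !target->len_load_accepts_zero)
    return false;
  return true;
}
```

Targets whose len_load handles a zero length would keep today's
behaviour, while a target like s390 would only accept the single-group
case.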

Regards
  Robin
