public inbox for gcc-patches@gcc.gnu.org
From: Uros Bizjak <ubizjak@gmail.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [RFC PATCH] i386: Enable auto-vectorization for 32bit modes (+ testcases)
Date: Wed, 26 May 2021 10:43:52 +0200
Message-ID: <CAFULd4a++zGQ9vBkrgie5zuqBKMoPqxMZh1yVh+UzgKbPrLLow@mail.gmail.com>
In-Reply-To: <CAFiYyc0SQwPovG_Jh0FnKNFSDFhcJ9gR2YrqBi+YoaENg0h0gA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4008 bytes --]

On Tue, May 25, 2021 at 4:29 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Fri, May 21, 2021 at 5:00 PM Uros Bizjak via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Here it is, the patch that enables auto-vectorization for 32bit modes.
> >
> > Sent as RFC, because the patch fails some vectorizer scans, as it
> > obviously enables more vectorization to happen:
> >
> > Running target unix
> > FAIL: gcc.dg/vect/pr71264.c -flto -ffat-lto-objects  scan-tree-dump
> > vect "vectorized 1 loops in function"
> > FAIL: gcc.dg/vect/pr71264.c scan-tree-dump vect "vectorized 1 loops in function"
> > FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vectorized 1 loops" 1
> > FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorized 1 loops" 1
> > FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorizing
> > stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vectorized 3 loops" 1
> > FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vectorizing stmts using SLP" 3
> > FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorized 3 loops" 1
> > FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorizing stmts
> > using SLP" 3
> >
> >
> > Running target unix/-m32
> > FAIL: gcc.dg/vect/no-vfa-vect-101.c scan-tree-dump-times vect "can't
> > determine dependence" 1
> > FAIL: gcc.dg/vect/no-vfa-vect-102.c scan-tree-dump-times vect
> > "possible dependence between data-refs" 1
> > FAIL: gcc.dg/vect/no-vfa-vect-102a.c scan-tree-dump-times vect
> > "possible dependence between data-refs" 1
> > FAIL: gcc.dg/vect/no-vfa-vect-37.c scan-tree-dump-times vect "can't
> > determine dependence" 2
> > FAIL: gcc.dg/vect/pr71264.c -flto -ffat-lto-objects  scan-tree-dump
> > vect "vectorized 1 loops in function"
> > FAIL: gcc.dg/vect/pr71264.c scan-tree-dump vect "vectorized 1 loops in function"
> > FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vectorized 1 loops" 1
> > FAIL: gcc.dg/vect/slp-28.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorized 1 loops" 1
> > FAIL: gcc.dg/vect/slp-28.c scan-tree-dump-times vect "vectorizing
> > stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vectorized 3 loops" 1
> > FAIL: gcc.dg/vect/slp-3.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "vectorizing stmts using SLP" 3
> > FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorized 3 loops" 1
> > FAIL: gcc.dg/vect/slp-3.c scan-tree-dump-times vect "vectorizing stmts
> > using SLP" 3
> > FAIL: gcc.dg/vect/vect-104.c -flto -ffat-lto-objects
> > scan-tree-dump-times vect "possible dependence between data-refs" 1
> > FAIL: gcc.dg/vect/vect-104.c scan-tree-dump-times vect "possible
> > dependence between data-refs" 1
>
> Yeah, it's a bit iffy to adjust expectations.  If there's a way to
> disable vectorization
> for 32bit modes on x86 that might be a way to "fix" them, otherwise we're
> lacking a way to query for available vector modes/sizes in the dejagnu vect
> targets.  There's available_vector_sizes but its implementation is hardly
> complete nor is size the only important thing (FP vs. INT).  At least
> one could add a vect32 predicate similar to the existing vect64 one.

I took the approach you proposed above. With a 32-bit vector size added
to available_vector_sizes, only two testcases fail. The attached patch
fixes all vect scan failures. The one remaining failure, in
vect_epilogues.c, is caused by a missing uavg<mode>3_ceil pattern for
V4QI epilogue vectorization; I plan to add the insn in a follow-up
patch.
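
For illustration only, a minimal sketch of the kind of byte loop a
ceiling-average pattern is meant to match, assuming the usual definition
(a + b + 1) >> 1 computed without overflow. The name avg4_ceil is
hypothetical and the code is not taken from vect_epilogues.c; whether
GCC actually vectorizes it with V4QImode depends on the target flags and
on the insn being available.

```c
#include <stdint.h>

/* Hypothetical illustration, not from the testsuite: rounding-up
   average of four unsigned bytes.  The operands are promoted to int,
   so the sum cannot overflow before the shift.  */
void
avg4_ceil (uint8_t *restrict dst, const uint8_t *restrict a,
           const uint8_t *restrict b)
{
  for (int i = 0; i < 4; i++)
    dst[i] = (uint8_t) ((a[i] + b[i] + 1) >> 1);
}
```

The four byte lanes fit exactly into one 32-bit vector, which is why a
V4QI form of the pattern would matter for a short epilogue like this.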

The patch also xfails pr71264.c, the case of missing re-vectorization
of 32bit vectors.
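
As a rough sketch of what re-vectorization refers to here (an
assumption about the general shape of such code, not a copy of
pr71264.c): the loop body already operates on 4-byte generic vectors,
and the vectorizer would have to combine several of those into one
wider vector operation. The names v4qi and xor_mask16 are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names; GCC generic vector extension, 4 x uint8_t.  */
typedef uint8_t v4qi __attribute__ ((vector_size (4)));

/* XOR a 4-byte mask into 16 bytes, one 32-bit vector at a time.
   Once V4QImode is a native vector mode, each iteration is already a
   vector operation, so vectorizing the loop means re-vectorizing it
   into a wider mode.  */
void
xor_mask16 (uint8_t *ptr, const uint8_t *mask)
{
  v4qi mv;
  memcpy (&mv, mask, sizeof mv);
  for (size_t i = 0; i < 16; i += 4)
    {
      v4qi tmp;
      memcpy (&tmp, ptr + i, sizeof tmp);
      tmp ^= mv;
      memcpy (ptr + i, &tmp, sizeof tmp);
    }
}
```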

WDYT?

Uros.

[-- Attachment #2: p.diff.txt --]
[-- Type: text/plain, Size: 4474 bytes --]

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 28e6113a609..04649b42122 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -22190,12 +22190,15 @@ ix86_autovectorize_vector_modes (vector_modes *modes, bool all)
       modes->safe_push (V16QImode);
       modes->safe_push (V32QImode);
     }
-  else if (TARGET_MMX_WITH_SSE)
+  else if (TARGET_SSE2)
     modes->safe_push (V16QImode);
 
   if (TARGET_MMX_WITH_SSE)
     modes->safe_push (V8QImode);
 
+  if (TARGET_SSE2)
+    modes->safe_push (V4QImode);
+
   return 0;
 }
 
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index cf3098749c0..16c6a3b8e99 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1740,6 +1740,12 @@ circumstances.
 @item vect_variable_length
 Target has variable-length vectors.
 
+@item vect64
+Target supports vectors of 64 bits.
+
+@item vect32
+Target supports vectors of 32 bits.
+
 @item vect_widen_sum_hi_to_si
 Target supports a vector widening summation of @code{short} operands
 into @code{int} results, or can promote (unpack) from @code{short}
diff --git a/gcc/testsuite/gcc.dg/vect/pr71264.c b/gcc/testsuite/gcc.dg/vect/pr71264.c
index dc849bf2797..1381e0ed132 100644
--- a/gcc/testsuite/gcc.dg/vect/pr71264.c
+++ b/gcc/testsuite/gcc.dg/vect/pr71264.c
@@ -19,5 +19,4 @@ void test(uint8_t *ptr, uint8_t *mask)
     }
 }
 
-/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" { xfail s390*-*-* sparc*-*-* } } } */
-
+/* { dg-final { scan-tree-dump "vectorized 1 loops in function" "vect" { xfail { { s390*-*-* sparc*-*-* } || vect32 } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-28.c b/gcc/testsuite/gcc.dg/vect/slp-28.c
index 7778bad4465..0bb5f0eb0e4 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-28.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-28.c
@@ -88,6 +88,7 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { ! vect32 } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 2 loops" 1 "vect" { target vect32 } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect32 } } } } */
   
diff --git a/gcc/testsuite/gcc.dg/vect/slp-3.c b/gcc/testsuite/gcc.dg/vect/slp-3.c
index 46ab584419a..80ded1840ad 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-3.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-3.c
@@ -141,8 +141,8 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! vect_partial_vectors } } } } */
-/* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" { target vect_partial_vectors } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" { target { ! vect_partial_vectors } } } }*/
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { target vect_partial_vectors } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! { vect_partial_vectors || vect32 } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect" { target { vect_partial_vectors || vect32 } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" { target { ! { vect_partial_vectors || vect32 } } } } }*/
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { target { vect_partial_vectors || vect32 } } } } */
   
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 849f1bbeda5..7f78c5593ac 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7626,6 +7626,7 @@ proc available_vector_sizes { } {
 	if { ![is-effective-target ia32] } {
 	    lappend result 64
 	}
+	lappend result 32
     } elseif { [istarget sparc*-*-*] } {
 	lappend result 64
     } elseif { [istarget amdgcn*-*-*] } {
@@ -7655,6 +7656,12 @@ proc check_effective_target_vect64 { } {
     return [expr { [lsearch -exact [available_vector_sizes] 64] >= 0 }]
 }
 
+# Return 1 if the target supports vectors of 32 bits.
+
+proc check_effective_target_vect32 { } {
+    return [expr { [lsearch -exact [available_vector_sizes] 32] >= 0 }]
+}
+
 # Return 1 if the target supports vector copysignf calls.
 
 proc check_effective_target_vect_call_copysignf { } {
