public inbox for gcc@gcc.gnu.org
* [RFC] Merge strategy for all-SLP vectorizer
@ 2024-05-17 10:36 Richard Biener
  2024-05-17 12:08 ` Richard Sandiford
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2024-05-17 10:36 UTC
  To: gcc; +Cc: richard.sandiford, tamar.christina


Hi,

I'd like to discuss how to move the vectorizer to all-SLP during this
stage1.  There is a personal branch with my ongoing work
(users/rguenth/vect-force-slp), but branches haven't proved to work
well for collaboration.  The branch isn't ready to be merged in full,
but I have been picking improvements from it onto trunk, some during
the last stage1 and some remaining bits in the past weeks.  I have
refrained from merging code paths that cannot be exercised on trunk.

There are two important sets of changes on the branch, both critical
to get more testing on non-x86 targets.

 1. enable single-lane SLP discovery
 2. avoid splitting store groups (9315bfc661432c3 and 4336060fe2db8ec
    if you fetch the branch)

The first point is also the most annoying for the testsuite, since
doing SLP instead of interleaving changes what we dump, and thus
tests start to fail in random ways when you switch between the two
modes.  On the branch, single-lane SLP discovery is gated by
--param vect-single-lane-slp.

The branch has numerous changes that enable single-lane SLP for code
paths that do not implement SLP yet and where I did not bother to try
supporting multi-lane SLP at this point.  It also adds more SLP
discovery entry points.
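
To make the terminology concrete, here is a minimal C sketch of the
two kinds of stores involved.  It is not taken from the branch and the
function names are made up for illustration.  It assumes the usual
vectorizer meaning of the terms: a non-grouped store is a single store
per iteration, which gives a one-lane SLP graph, while adjacent stores
in one iteration form a store group with one lane per store.

```c
#include <stddef.h>

/* Non-grouped store: one scalar store per iteration, so an SLP
   graph rooted at it has a single lane.  (Hypothetical example.)  */
void
scale3 (int *restrict out, const int *restrict in, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = in[i] * 3;
}

/* Grouped store: the two adjacent stores in each iteration form one
   store group of size two, i.e. a two-lane SLP candidate.
   (Hypothetical example.)  */
void
add_pairs (int *restrict out, const int *restrict in, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    {
      out[2 * i] = in[2 * i] + 1;
      out[2 * i + 1] = in[2 * i + 1] + 2;
    }
}
```

Whether a given target actually vectorizes either loop, and through
which code path, depends on the target and the options used.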

I'm not sure how best to merge these pieces so that others can help
out more easily.  One possibility is to merge
--param vect-single-lane-slp defaulted to off and pick dependent
changes even when they cause testsuite regressions with
vect-single-lane-slp=1.  Alternatively, adjust the testsuite by
adding --param vect-single-lane-slp=0 and default to 1
(or keep the default).  Or require a clean testsuite with
--param vect-single-lane-slp defaulted to 1 but keep the --param
for debugging (and allow FAILs with 0).

For fun I merged just single-lane discovery of non-grouped stores
and have that enabled by default.  On x86_64 this results in the
set of FAILs below.

Any suggestions?

Thanks,
Richard.

FAIL: gcc.dg/vect/O3-pr39675-2.c scan-tree-dump-times vect "vectorizing 
stmts using SLP" 1
XPASS: gcc.dg/vect/no-scevccp-outer-12.c scan-tree-dump-times vect "OUTER 
LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-section-anchors-vect-31.c scan-tree-dump-times vect 
"Alignment of access forced using peeling" 2
FAIL: gcc.dg/vect/no-section-anchors-vect-31.c scan-tree-dump-times vect 
"Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/no-section-anchors-vect-64.c scan-tree-dump-times vect 
"Alignment of access forced using peeling" 2
FAIL: gcc.dg/vect/no-section-anchors-vect-64.c scan-tree-dump-times vect 
"Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/no-section-anchors-vect-66.c scan-tree-dump-times vect 
"Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/no-section-anchors-vect-66.c scan-tree-dump-times vect 
"Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/no-section-anchors-vect-68.c scan-tree-dump-times vect 
"Alignment of access forced using peeling" 2
FAIL: gcc.dg/vect/no-section-anchors-vect-68.c scan-tree-dump-times vect 
"Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/slp-12a.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-12a.c scan-tree-dump-times vect "vectorizing stmts 
using SLP" 1
FAIL: gcc.dg/vect/slp-19a.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-19a.c scan-tree-dump-times vect "vectorizing stmts 
using SLP" 1
FAIL: gcc.dg/vect/slp-19b.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-19b.c scan-tree-dump-times vect "vectorizing stmts 
using SLP" 1
FAIL: gcc.dg/vect/slp-19c.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/slp-19c.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-19c.c scan-tree-dump-times vect "vectorized 1 loops" 
1
FAIL: gcc.dg/vect/slp-19c.c scan-tree-dump-times vect "vectorizing stmts 
using SLP" 1
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1115.c -flto -ffat-lto-objects  
scan-tree-dump vect "vectorized 1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1115.c scan-tree-dump vect "vectorized 
1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s114.c -flto -ffat-lto-objects  
scan-tree-dump vect "vectorized 1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s114.c scan-tree-dump vect "vectorized 1 
loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1232.c -flto -ffat-lto-objects  
scan-tree-dump vect "vectorized 1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1232.c scan-tree-dump vect "vectorized 
1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s257.c -flto -ffat-lto-objects  
scan-tree-dump vect "vectorized 1 loops"
XPASS: gcc.dg/vect/tsvc/vect-tsvc-s257.c scan-tree-dump vect "vectorized 1 
loops"
FAIL: gcc.dg/vect/vect-26.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-26.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-26.c scan-tree-dump-times vect "Alignment of access 
forced using peeling" 1
FAIL: gcc.dg/vect/vect-26.c scan-tree-dump-times vect "Vectorizing an 
unaligned access" 0
FAIL: gcc.dg/vect/vect-54.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-54.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-54.c scan-tree-dump-times vect "Alignment of access 
forced using peeling" 1
FAIL: gcc.dg/vect/vect-54.c scan-tree-dump-times vect "Vectorizing an 
unaligned access" 0
FAIL: gcc.dg/vect/vect-56.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-56.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Vectorizing an unaligned access" 1
FAIL: gcc.dg/vect/vect-56.c scan-tree-dump-times vect "Alignment of access 
forced using peeling" 1
FAIL: gcc.dg/vect/vect-56.c scan-tree-dump-times vect "Vectorizing an 
unaligned access" 1
FAIL: gcc.dg/vect/vect-58.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-58.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times vect "Alignment of access 
forced using peeling" 1
FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times vect "Vectorizing an 
unaligned access" 0
FAIL: gcc.dg/vect/vect-60.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-60.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Vectorizing an unaligned access" 1
FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times vect "Alignment of access 
forced using peeling" 1
FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times vect "Vectorizing an 
unaligned access" 1
FAIL: gcc.dg/vect/vect-89-big-array.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-89-big-array.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-89-big-array.c scan-tree-dump-times vect "Alignment 
of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-89-big-array.c scan-tree-dump-times vect 
"Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-89.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-89.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Vectorizing an unaligned access" 0
FAIL: gcc.dg/vect/vect-89.c scan-tree-dump-times vect "Alignment of access 
forced using peeling" 1
FAIL: gcc.dg/vect/vect-89.c scan-tree-dump-times vect "Vectorizing an 
unaligned access" 0
FAIL: gcc.dg/vect/vect-92.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Alignment of access forced using peeling" 3
FAIL: gcc.dg/vect/vect-92.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "Vectorizing an unaligned access" 0 
FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times vect "Alignment of access 
forced using peeling" 3
FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times vect "Vectorizing an 
unaligned access" 0 
FAIL: gcc.dg/vect/vect-early-break_25.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-early-break_25.c scan-tree-dump-times vect 
"Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-multitypes-1.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "Alignment of access forced using peeling" 2
FAIL: gcc.dg/vect/vect-multitypes-1.c scan-tree-dump-times vect "Alignment 
of access forced using peeling" 2
XPASS: gcc.dg/vect/vect-outer-4e.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
XPASS: gcc.dg/vect/vect-outer-4e.c scan-tree-dump-times vect "OUTER LOOP 
VECTORIZED" 1 
FAIL: gcc.dg/vect/vect-peel-1.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-peel-1.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "Vectorizing an unaligned access" 1
FAIL: gcc.dg/vect/vect-peel-1.c scan-tree-dump-times vect "Alignment of 
access forced using peeling" 1
FAIL: gcc.dg/vect/vect-peel-1.c scan-tree-dump-times vect "Vectorizing an 
unaligned access" 1
FAIL: gcc.dg/vect/vect-peel-2.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "Alignment of access forced using peeling" 1
FAIL: gcc.dg/vect/vect-peel-2.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "Vectorizing an unaligned access" 1
FAIL: gcc.dg/vect/vect-peel-2.c scan-tree-dump-times vect "Alignment of 
access forced using peeling" 1
FAIL: gcc.dg/vect/vect-peel-2.c scan-tree-dump-times vect "Vectorizing an 
unaligned access" 1
FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times 
vfmadd132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times 
vfmsub132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times 
vfnmadd132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times 
vfnmsub132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times 
vfmadd132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times 
vfmsub132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times 
vfnmadd132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times 
vfnmsub132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times 
vfmadd132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times 
vfmsub132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times 
vfnmadd132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times 
vfnmsub132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
FAIL: gcc.target/i386/pr101950-2.c scan-assembler-times \\txor[ql]\\t 2
FAIL: gcc.target/i386/pr88531-2b.c scan-assembler-times vmulps 1
FAIL: gcc.target/i386/pr88531-2c.c scan-assembler-times vmulps 1
FAIL: gcc.target/i386/vectorize1.c scan-tree-dump vect "vect_cst"
FAIL: gfortran.dg/temporary_3.f90   -O2  execution test
FAIL: gfortran.dg/vect/fast-math-mgrid-resid.f   -O   scan-tree-dump pcom 
"Executing predictive commoning without unrolling"
FAIL: gfortran.dg/vect/vect-8.f90   -O   scan-tree-dump-times vect 
"vectorized 2[234] loops" 1


* Re: [RFC] Merge strategy for all-SLP vectorizer
  2024-05-17 10:36 [RFC] Merge strategy for all-SLP vectorizer Richard Biener
@ 2024-05-17 12:08 ` Richard Sandiford
  2024-05-17 12:53   ` Richard Biener
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Sandiford @ 2024-05-17 12:08 UTC
  To: Richard Biener via Gcc; +Cc: Richard Biener, tamar.christina

Richard Biener via Gcc <gcc@gcc.gnu.org> writes:
> Hi,
>
> I'd like to discuss how to move the vectorizer to all-SLP during this
> stage1.  There is a personal branch with my ongoing work
> (users/rguenth/vect-force-slp), but branches haven't proved to work
> well for collaboration.

Speaking for myself, the problem hasn't been so much the branch as
lack of time.  I've been pretty swamped the last eight months or so
(except for the time that I took off, which admittedly was quite a
bit!), and so I never even got around to properly reading and replying
to your message after the Cauldron.  It's been on the "this is important,
I should make time to read and understand it properly" list all this time.
Sorry about that. :(

I'm hoping to have time to work/help out on SLP stuff soon.

> The branch isn't ready to be merged in full, but I have been picking
> improvements from it onto trunk, some during the last stage1 and some
> remaining bits in the past weeks.  I have refrained from merging code
> paths that cannot be exercised on trunk.
>
> There are two important sets of changes on the branch, both critical
> to get more testing on non-x86 targets.
>
>  1. enable single-lane SLP discovery
>  2. avoid splitting store groups (9315bfc661432c3 and 4336060fe2db8ec
>     if you fetch the branch)
>
> The first point is also the most annoying for the testsuite, since
> doing SLP instead of interleaving changes what we dump, and thus
> tests start to fail in random ways when you switch between the two
> modes.  On the branch, single-lane SLP discovery is gated by
> --param vect-single-lane-slp.
>
> The branch has numerous changes that enable single-lane SLP for code
> paths that do not implement SLP yet and where I did not bother to try
> supporting multi-lane SLP at this point.  It also adds more SLP
> discovery entry points.
>
> I'm not sure how best to merge these pieces so that others can help
> out more easily.  One possibility is to merge
> --param vect-single-lane-slp defaulted to off and pick dependent
> changes even when they cause testsuite regressions with
> vect-single-lane-slp=1.  Alternatively, adjust the testsuite by
> adding --param vect-single-lane-slp=0 and default to 1
> (or keep the default).

FWIW, this one sounds good to me (the default to 1 version).
I.e. mechanically add --param vect-single-lane-slp=0 to any tests
that fail with the new default.  That means that the tests that need
fixing are easily greppable for anyone who wants to help.  Sometimes
it'll just be a test update.  Sometimes it will be new vectoriser code.
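
As an illustration of that mechanical step, a pinned test might look
like the sketch below.  The loop and the scanned string are
hypothetical; dg-additional-options and scan-tree-dump-times are the
existing testsuite directives, and --param vect-single-lane-slp is the
param from the branch described in the quoted text, so the extra
option is only accepted by a compiler that has it.

```c
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
/* Pinned to the old behaviour for now; drop the next line once the
   scan below passes with single-lane SLP enabled.  */
/* { dg-additional-options "--param vect-single-lane-slp=0" } */

#define N 64

int a[N], b[N];

/* Hypothetical test body: a simple loop with a non-grouped store.  */
void
foo (void)
{
  for (int i = 0; i < N; ++i)
    a[i] = b[i] + 1;
}

/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
```

Tests still waiting for a fix could then be listed with something
like grep -rl 'vect-single-lane-slp=0' gcc/testsuite.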

Thanks,
Richard

> Or require a clean testsuite with
> --param vect-single-lane-slp defaulted to 1 but keep the --param
> for debugging (and allow FAILs with 0).
>
> For fun I merged just single-lane discovery of non-grouped stores
> and have that enabled by default.  On x86_64 this results in the
> set of FAILs below.
>
> Any suggestions?
>
> Thanks,
> Richard.


* Re: [RFC] Merge strategy for all-SLP vectorizer
  2024-05-17 12:08 ` Richard Sandiford
@ 2024-05-17 12:53   ` Richard Biener
  2024-05-21  6:55     ` Tamar Christina
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Biener @ 2024-05-17 12:53 UTC
  To: Richard Sandiford; +Cc: Richard Biener via Gcc, tamar.christina

On Fri, 17 May 2024, Richard Sandiford wrote:

> Richard Biener via Gcc <gcc@gcc.gnu.org> writes:
> > Hi,
> >
> > I'd like to discuss how to move the vectorizer to all-SLP during this
> > stage1.  There is a personal branch with my ongoing work
> > (users/rguenth/vect-force-slp), but branches haven't proved to work
> > well for collaboration.
> 
> Speaking for myself, the problem hasn't been so much the branch as
> lack of time.  I've been pretty swamped the last eight months or so
> (except for the time that I took off, which admittedly was quite a
> bit!), and so I never even got around to properly reading and replying
> to your message after the Cauldron.  It's been on the "this is important,
> I should make time to read and understand it properly" list all this time.
> Sorry about that. :(
> 
> I'm hoping to have time to work/help out on SLP stuff soon.
> 
> > The branch isn't ready to be merged in full, but I have been picking
> > improvements from it onto trunk, some during the last stage1 and some
> > remaining bits in the past weeks.  I have refrained from merging code
> > paths that cannot be exercised on trunk.
> >
> > There are two important sets of changes on the branch, both critical
> > to get more testing on non-x86 targets.
> >
> >  1. enable single-lane SLP discovery
> >  2. avoid splitting store groups (9315bfc661432c3 and 4336060fe2db8ec
> >     if you fetch the branch)
> >
> > The first point is also the most annoying for the testsuite, since
> > doing SLP instead of interleaving changes what we dump, and thus
> > tests start to fail in random ways when you switch between the two
> > modes.  On the branch, single-lane SLP discovery is gated by
> > --param vect-single-lane-slp.
> >
> > The branch has numerous changes that enable single-lane SLP for code
> > paths that do not implement SLP yet and where I did not bother to try
> > supporting multi-lane SLP at this point.  It also adds more SLP
> > discovery entry points.
> >
> > I'm not sure how best to merge these pieces so that others can help
> > out more easily.  One possibility is to merge
> > --param vect-single-lane-slp defaulted to off and pick dependent
> > changes even when they cause testsuite regressions with
> > vect-single-lane-slp=1.  Alternatively, adjust the testsuite by
> > adding --param vect-single-lane-slp=0 and default to 1
> > (or keep the default).
> 
> FWIW, this one sounds good to me (the default to 1 version).
> I.e. mechanically add --param vect-single-lane-slp=0 to any tests
> that fail with the new default.  That means that the tests that need
> fixing are easily greppable for anyone who wants to help.  Sometimes
> it'll just be a test update.  Sometimes it will be new vectoriser code.

OK.  Meanwhile I figured that the most important part is 2. from
above, since it enables the single-lane case in a grouped access (also
covering single-element interleaving).  This will cover all
problematic cases with respect to vectorizing loads and stores.  It
also has less testsuite fallout, mainly because we have a lot less
coverage for grouped stores without SLP.
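
For reference, a minimal sketch of what single-element interleaving
looks like at the source level.  The function is a made-up
illustration, not a testcase from the branch, and it assumes the usual
meaning of the term: a strided access where only one element of each
group is used.

```c
#include <stddef.h>

/* Single-element interleaving: the load walks IN in steps of two
   elements, but only the first element of each pair is used, so the
   access group of size two contributes a single lane.
   (Hypothetical example.)  */
void
take_even (int *restrict out, const int *restrict in, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = in[2 * i];
}
```

Here IN must provide at least 2 * n - 1 elements.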

So I'll try to produce a mergeable patch for part 2 and post it
for review next week.

Thanks,
Richard.

> Thanks,
> Richard
> 
> > Or require a clean testsuite with
> > --param vect-single-lane-slp defaulted to 1 but keep the --param
> > for debugging (and allow FAILs with 0).
> >
> > For fun I merged just single-lane discovery of non-grouped stores
> > and have that enabled by default.  On x86_64 this results in the
> > set of FAILs below.
> >
> > Any suggestions?
> >
> > Thanks,
> > Richard.
> >
> > FAIL: gcc.dg/vect/O3-pr39675-2.c scan-tree-dump-times vect "vectorizing 
> > stmts using SLP" 1
> > XPASS: gcc.dg/vect/no-scevccp-outer-12.c scan-tree-dump-times vect "OUTER 
> > LOOP VECTORIZED." 1
> > FAIL: gcc.dg/vect/no-section-anchors-vect-31.c scan-tree-dump-times vect 
> > "Alignment of access forced using peeling" 2
> > FAIL: gcc.dg/vect/no-section-anchors-vect-31.c scan-tree-dump-times vect 
> > "Vectorizing an unaligned access" 0
> > FAIL: gcc.dg/vect/no-section-anchors-vect-64.c scan-tree-dump-times vect 
> > "Alignment of access forced using peeling" 2
> > FAIL: gcc.dg/vect/no-section-anchors-vect-64.c scan-tree-dump-times vect 
> > "Vectorizing an unaligned access" 0
> > FAIL: gcc.dg/vect/no-section-anchors-vect-66.c scan-tree-dump-times vect 
> > "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/no-section-anchors-vect-66.c scan-tree-dump-times vect 
> > "Vectorizing an unaligned access" 0
> > FAIL: gcc.dg/vect/no-section-anchors-vect-68.c scan-tree-dump-times vect 
> > "Alignment of access forced using peeling" 2
> > FAIL: gcc.dg/vect/no-section-anchors-vect-68.c scan-tree-dump-times vect 
> > "Vectorizing an unaligned access" 0
> > FAIL: gcc.dg/vect/slp-12a.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-12a.c scan-tree-dump-times vect "vectorizing stmts 
> > using SLP" 1
> > FAIL: gcc.dg/vect/slp-19a.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-19a.c scan-tree-dump-times vect "vectorizing stmts 
> > using SLP" 1
> > FAIL: gcc.dg/vect/slp-19b.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-19b.c scan-tree-dump-times vect "vectorizing stmts 
> > using SLP" 1
> > FAIL: gcc.dg/vect/slp-19c.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "vectorized 1 loops" 1
> > FAIL: gcc.dg/vect/slp-19c.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "vectorizing stmts using SLP" 1
> > FAIL: gcc.dg/vect/slp-19c.c scan-tree-dump-times vect "vectorized 1 loops" 
> > 1
> > FAIL: gcc.dg/vect/slp-19c.c scan-tree-dump-times vect "vectorizing stmts 
> > using SLP" 1
> > XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1115.c -flto -ffat-lto-objects  
> > scan-tree-dump vect "vectorized 1 loops"
> > XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1115.c scan-tree-dump vect "vectorized 
> > 1 loops"
> > XPASS: gcc.dg/vect/tsvc/vect-tsvc-s114.c -flto -ffat-lto-objects  
> > scan-tree-dump vect "vectorized 1 loops"
> > XPASS: gcc.dg/vect/tsvc/vect-tsvc-s114.c scan-tree-dump vect "vectorized 1 
> > loops"
> > XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1232.c -flto -ffat-lto-objects  
> > scan-tree-dump vect "vectorized 1 loops"
> > XPASS: gcc.dg/vect/tsvc/vect-tsvc-s1232.c scan-tree-dump vect "vectorized 
> > 1 loops"
> > XPASS: gcc.dg/vect/tsvc/vect-tsvc-s257.c -flto -ffat-lto-objects  
> > scan-tree-dump vect "vectorized 1 loops"
> > XPASS: gcc.dg/vect/tsvc/vect-tsvc-s257.c scan-tree-dump vect "vectorized 1 
> > loops"
> > FAIL: gcc.dg/vect/vect-26.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-26.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Vectorizing an unaligned access" 0
> > FAIL: gcc.dg/vect/vect-26.c scan-tree-dump-times vect "Alignment of access 
> > forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-26.c scan-tree-dump-times vect "Vectorizing an 
> > unaligned access" 0
> > FAIL: gcc.dg/vect/vect-54.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-54.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Vectorizing an unaligned access" 0
> > FAIL: gcc.dg/vect/vect-54.c scan-tree-dump-times vect "Alignment of access 
> > forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-54.c scan-tree-dump-times vect "Vectorizing an 
> > unaligned access" 0
> > FAIL: gcc.dg/vect/vect-56.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-56.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Vectorizing an unaligned access" 1
> > FAIL: gcc.dg/vect/vect-56.c scan-tree-dump-times vect "Alignment of access 
> > forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-56.c scan-tree-dump-times vect "Vectorizing an 
> > unaligned access" 1
> > FAIL: gcc.dg/vect/vect-58.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-58.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Vectorizing an unaligned access" 0
> > FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times vect "Alignment of access 
> > forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-58.c scan-tree-dump-times vect "Vectorizing an 
> > unaligned access" 0
> > FAIL: gcc.dg/vect/vect-60.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-60.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Vectorizing an unaligned access" 1
> > FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times vect "Alignment of access 
> > forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-60.c scan-tree-dump-times vect "Vectorizing an 
> > unaligned access" 1
> > FAIL: gcc.dg/vect/vect-89-big-array.c -flto -ffat-lto-objects  
> > scan-tree-dump-times vect "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-89-big-array.c -flto -ffat-lto-objects  
> > scan-tree-dump-times vect "Vectorizing an unaligned access" 0
> > FAIL: gcc.dg/vect/vect-89-big-array.c scan-tree-dump-times vect "Alignment 
> > of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-89-big-array.c scan-tree-dump-times vect 
> > "Vectorizing an unaligned access" 0
> > FAIL: gcc.dg/vect/vect-89.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-89.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Vectorizing an unaligned access" 0
> > FAIL: gcc.dg/vect/vect-89.c scan-tree-dump-times vect "Alignment of access 
> > forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-89.c scan-tree-dump-times vect "Vectorizing an 
> > unaligned access" 0
> > FAIL: gcc.dg/vect/vect-92.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Alignment of access forced using peeling" 3
> > FAIL: gcc.dg/vect/vect-92.c -flto -ffat-lto-objects  scan-tree-dump-times 
> > vect "Vectorizing an unaligned access" 0 
> > FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times vect "Alignment of access 
> > forced using peeling" 3
> > FAIL: gcc.dg/vect/vect-92.c scan-tree-dump-times vect "Vectorizing an 
> > unaligned access" 0 
> > FAIL: gcc.dg/vect/vect-early-break_25.c -flto -ffat-lto-objects  
> > scan-tree-dump-times vect "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-early-break_25.c scan-tree-dump-times vect 
> > "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-multitypes-1.c -flto -ffat-lto-objects  
> > scan-tree-dump-times vect "Alignment of access forced using peeling" 2
> > FAIL: gcc.dg/vect/vect-multitypes-1.c scan-tree-dump-times vect "Alignment 
> > of access forced using peeling" 2
> > XPASS: gcc.dg/vect/vect-outer-4e.c -flto -ffat-lto-objects  
> > scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
> > XPASS: gcc.dg/vect/vect-outer-4e.c scan-tree-dump-times vect "OUTER LOOP 
> > VECTORIZED" 1 
> > FAIL: gcc.dg/vect/vect-peel-1.c -flto -ffat-lto-objects  
> > scan-tree-dump-times vect "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-peel-1.c -flto -ffat-lto-objects  
> > scan-tree-dump-times vect "Vectorizing an unaligned access" 1
> > FAIL: gcc.dg/vect/vect-peel-1.c scan-tree-dump-times vect "Alignment of 
> > access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-peel-1.c scan-tree-dump-times vect "Vectorizing an 
> > unaligned access" 1
> > FAIL: gcc.dg/vect/vect-peel-2.c -flto -ffat-lto-objects  
> > scan-tree-dump-times vect "Alignment of access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-peel-2.c -flto -ffat-lto-objects  
> > scan-tree-dump-times vect "Vectorizing an unaligned access" 1
> > FAIL: gcc.dg/vect/vect-peel-2.c scan-tree-dump-times vect "Alignment of 
> > access forced using peeling" 1
> > FAIL: gcc.dg/vect/vect-peel-2.c scan-tree-dump-times vect "Vectorizing an 
> > unaligned access" 1
> > FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times 
> > vfmadd132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times 
> > vfmsub132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times 
> > vfnmadd132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/cond_op_fma__Float16-1.c scan-assembler-times 
> > vfnmsub132ph[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times 
> > vfmadd132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times 
> > vfmsub132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times 
> > vfnmadd132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/cond_op_fma_double-1.c scan-assembler-times 
> > vfnmsub132pd[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times 
> > vfmadd132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times 
> > vfmsub132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times 
> > vfnmadd132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/cond_op_fma_float-1.c scan-assembler-times 
> > vfnmsub132ps[ \\\\t]+[^{\\n]*%ymm[0-9]+{%k[1-7]}(?:\\n|[ \\\\t]+#) 1
> > FAIL: gcc.target/i386/pr101950-2.c scan-assembler-times \\txor[ql]\\t 2
> > FAIL: gcc.target/i386/pr88531-2b.c scan-assembler-times vmulps 1
> > FAIL: gcc.target/i386/pr88531-2c.c scan-assembler-times vmulps 1
> > FAIL: gcc.target/i386/vectorize1.c scan-tree-dump vect "vect_cst"
> > FAIL: gfortran.dg/temporary_3.f90   -O2  execution test
> > FAIL: gfortran.dg/vect/fast-math-mgrid-resid.f   -O   scan-tree-dump pcom 
> > "Executing predictive commoning without unrolling"
> > FAIL: gfortran.dg/vect/vect-8.f90   -O   scan-tree-dump-times vect 
> > "vectorized 2[234] loops" 1
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


* RE: [RFC] Merge strathegy for all-SLP vectorizer
  2024-05-17 12:53   ` Richard Biener
@ 2024-05-21  6:55     ` Tamar Christina
  0 siblings, 0 replies; 4+ messages in thread
From: Tamar Christina @ 2024-05-21  6:55 UTC (permalink / raw)
  To: Richard Biener, Richard Sandiford; +Cc: Richard Biener via Gcc



> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Friday, May 17, 2024 1:54 PM
> To: Richard Sandiford <Richard.Sandiford@arm.com>
> Cc: Richard Biener via Gcc <gcc@gcc.gnu.org>; Tamar Christina
> <Tamar.Christina@arm.com>
> Subject: Re: [RFC] Merge strathegy for all-SLP vectorizer
> 
> On Fri, 17 May 2024, Richard Sandiford wrote:
> 
> > Richard Biener via Gcc <gcc@gcc.gnu.org> writes:
> > > Hi,
> > >
> > > I'd like to discuss how to go forward with getting the vectorizer to
> > > all-SLP for this stage1.  While there is a personal branch with my
> > > ongoing work (users/rguenth/vect-force-slp) branches haven't proved
> > > themselves working well for collaboration.
> >

Yeah, a branch is hard to keep rebasing and to build on top of.

> > Speaking for myself, the problem hasn't been so much the branch as
> > lack of time.  I've been pretty swamped the last eight months or so
> > (except for the time that I took off, which admittedly was quite a
> > bit!), and so I never even got around to properly reading and replying
> > to your message after the Cauldron.  It's been on the "this is important,
> > I should make time to read and understand it properly" list all this time.
> > Sorry about that. :(
> >
> > I'm hoping to have time to work/help out on SLP stuff soon.
> >
> > > The branch isn't ready to be merged in full but I have been picking
> > > improvements to trunk last stage1 and some remaining bits in the past
> > > weeks.  I have refrained from merging code paths that cannot be
> > > exercised on trunk.
> > >
> > > There are two important sets of changes on the branch, both critical
> > > to get more testing on non-x86 targets.
> > >
> > >  1. enable single-lane SLP discovery
> > >  2. avoid splitting store groups (9315bfc661432c3 and 4336060fe2db8ec
> > >     if you fetch the branch)
> > >

For no# is there a param or is it just the default?  I can run these through
regression today.

> > > The first point is also most annoying on the testsuite since doing
> > > SLP instead of interleaving changes what we dump and thus tests
> > > start to fail in random ways when you switch between both modes.
> > > On the branch single-lane SLP discovery is gated with
> > > --param vect-single-lane-slp.
> > >
> > > The branch has numerous changes to enable single-lane SLP for some
> > > code paths that have SLP not implemented and where I did not bother
> > > to try supporting multi-lane SLP at this point.  It also adds more
> > > SLP discovery entry points.
> > >
> > > I'm not sure how to try merging these pieces to allow others to
> > > more easily help out.  One possibility is to merge
> > > --param vect-single-lane-slp defaulted off and pick dependent
> > > changes even when they cause testsuite regressions with
> > > vect-single-lane-slp=1.  Alternatively adjust the testsuite by
> > > adding --param vect-single-lane-slp=0 and default to 1
> > > (or keep the default).

I guess which one is better depends on whether the parameter goes
away this release? If so I think we should just leave them broken for
now and fix them up when it's the default?

> >
> > FWIW, this one sounds good to me (the default to 1 version).
> > I.e. mechanically add --param vect-single-lane-slp=0 to any tests
> > that fail with the new default.  That means that the tests that need
> > fixing are easily greppable for anyone who wants to help.  Sometimes
> > it'll just be a test update.  Sometimes it will be new vectoriser code.
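
A rough sketch of the mechanical step described above, in Python.  It
pulls the failing test files out of a DejaGnu-style result list (such as
the quoted FAIL list in this thread) and prepends a dg-additional-options
directive carrying --param vect-single-lane-slp=0.  The helper names
failing_tests and add_param are made up for illustration, and placing the
directive on the first line of the test is an assumption; an actual patch
may well put it elsewhere.

```python
import re

# Start of a DejaGnu result line, e.g.
# "FAIL: gcc.dg/vect/slp-12a.c scan-tree-dump-times vect ..."
RESULT_RE = re.compile(r"^(FAIL|XPASS): (\S+)")

PARAM = "--param vect-single-lane-slp=0"


def failing_tests(summary_lines):
    """Return the sorted, de-duplicated test files with at least one FAIL."""
    tests = set()
    for line in summary_lines:
        # Drop mail quote markers ("> > ") so quoted lists can be fed in as-is.
        line = re.sub(r"^(?:>\s?)+", "", line)
        match = RESULT_RE.match(line)
        # XPASS entries are skipped: they need their xfail reviewed instead.
        if match and match.group(1) == "FAIL":
            tests.add(match.group(2))
    return sorted(tests)


def add_param(source, path):
    """Prepend the directive to a test's source text unless already present."""
    if PARAM in source:
        return source
    if path.endswith((".f", ".f90")):
        # Fortran tests carry dg directives in '!' comments.
        directive = '! { dg-additional-options "%s" }\n' % PARAM
    else:
        directive = '/* { dg-additional-options "%s" } */\n' % PARAM
    return directive + source
```

Because the directive text is fixed, the touched tests stay greppable via
the param name, which is the property asked for above.  Execution-test
FAILs would still want a look by hand rather than this treatment.
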
> 
> OK.  Meanwhile I figured the most important part is 2. from above
> since that enables the single-lane in a grouped access (also covering
> single element interleaving).  This will cover all problematical cases
> with respect to vectorizing loads and stores.  It also has less
> testsuite fallout, mainly because we have a lot less coverage for
> grouped stores without SLP.
> 
> So I'll try to produce a mergeable patch for part 2 and post that
> for review next week.

Sounds good!

Thanks for getting the ball rolling on this.
It would indeed be useful to have it in trunk.  Off by default for now
sounds good, because then I can also work on trunk on the SLP support
for early break.

Cheers,
Tamar

> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Richard
> >
> > > Or require a clean testsuite with
> > > --param vect-single-lane-slp defaulted to 1 but keep the --param
> > > for debugging (and allow FAILs with 0).


Thread overview: 4+ messages
2024-05-17 10:36 [RFC] Merge strathegy for all-SLP vectorizer Richard Biener
2024-05-17 12:08 ` Richard Sandiford
2024-05-17 12:53   ` Richard Biener
2024-05-21  6:55     ` Tamar Christina
