public inbox for gcc-bugs@sourceware.org
* [Bug target/109955] New: Should be possible to remove vcond{,u,eq} expanders
From: rguenth at gcc dot gnu.org @ 2023-05-24 12:18 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955
Bug ID: 109955
Summary: Should be possible to remove vcond{,u,eq} expanders
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---
It should be possible to remove all vcond, vcondu and vcondeq expanders and
have the functionality be implemented via the vec_cmp and vcond_mask expanders.
But with them removed, a bootstrap and regtest reveals the following FAILs:
=== g++ tests ===
Running target unix
FAIL: g++.target/i386/avx-pr54700-1.C scan-assembler-not vpcmpgt[bdq]
FAIL: g++.target/i386/avx-pr54700-1.C scan-assembler-times vblendvpd 4
FAIL: g++.target/i386/avx-pr54700-1.C scan-assembler-times vblendvps 4
FAIL: g++.target/i386/avx-pr54700-1.C scan-assembler-times vpblendvb 2
FAIL: g++.target/i386/avx2-pr54700-1.C scan-assembler-not vpcmpgt[bdq]
FAIL: g++.target/i386/avx2-pr54700-1.C scan-assembler-times vblendvpd 4
FAIL: g++.target/i386/avx2-pr54700-1.C scan-assembler-times vblendvps 4
FAIL: g++.target/i386/avx2-pr54700-1.C scan-assembler-times vpblendvb 2
FAIL: g++.target/i386/avx512fp16-vcondmn-minmax.C -std=gnu++14 scan-assembler-times vmaxph 3
FAIL: g++.target/i386/avx512fp16-vcondmn-minmax.C -std=gnu++14 scan-assembler-times vminph 3
FAIL: g++.target/i386/pr100738-1.C -std=gnu++14 scan-assembler-not vpcmpeqd[ \\\\t]
FAIL: g++.target/i386/pr100738-1.C -std=gnu++14 scan-assembler-not vpxor[ \\\\t]
FAIL: g++.target/i386/pr100738-1.C -std=gnu++14 scan-assembler-times vblendvps[ \\\\t] 2
FAIL: g++.target/i386/sse4_1-pr54700-1.C scan-assembler-not pcmpgt[bdq]
FAIL: g++.target/i386/sse4_1-pr54700-1.C scan-assembler-times blendvpd 4
FAIL: g++.target/i386/sse4_1-pr54700-1.C scan-assembler-times blendvps 4
FAIL: g++.target/i386/sse4_1-pr54700-1.C scan-assembler-times pblendvb 2
=== gcc tests ===
Running target unix
FAIL: gcc.dg/vect/pr109011-3.c -flto -ffat-lto-objects scan-tree-dump-times optimized " = .POPCOUNT \\\\(vect" 3
FAIL: gcc.dg/vect/pr109011-3.c scan-tree-dump-times optimized " = .POPCOUNT \\\\(vect" 3
FAIL: gcc.dg/vect/pr109011-5.c -flto -ffat-lto-objects scan-tree-dump-times optimized " = .POPCOUNT \\\\(vect" 3
FAIL: gcc.dg/vect/pr109011-5.c scan-tree-dump-times optimized " = .POPCOUNT \\\\(vect" 3
FAIL: gcc.target/i386/avx2-pr99908.c scan-assembler-not \\tvpcmpeq
FAIL: gcc.target/i386/avx512bw-pr96891-1.c scan-assembler-not %k[0-7]
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-not %k[0-9]
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsb[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsd[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsq[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsw[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminub[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminud[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminuq[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminuw[\\t ] 2
FAIL: gcc.target/i386/pr109011-b1.c scan-assembler-times vpopcntb[ \\t]+ 4
FAIL: gcc.target/i386/pr109011-w1.c scan-assembler-times vpopcntw[ \\t]+ 4
FAIL: gcc.target/i386/sse4_1-pr99908.c scan-assembler-not \\tpcmpeq
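
The equivalence this report relies on (a fused compare-and-select versus a separate compare followed by a mask-driven select) can be illustrated at the source level. The following is only a sketch using the GNU C vector extension; the function names are hypothetical and it does not show how the expanders themselves are written in the machine description.

```c
#include <assert.h>

/* Four 32-bit lanes, using the GNU C vector extension. */
typedef int v4si __attribute__((vector_size(16)));

/* Fused form: compare and select in one step, written lane by lane.
   This mirrors the semantics of a vcond-style operation. */
static v4si select_fused(v4si a, v4si b, v4si c, v4si d)
{
    v4si r = {0, 0, 0, 0};
    for (int i = 0; i < 4; i++)
        r[i] = a[i] < b[i] ? c[i] : d[i];
    return r;
}

/* Split form: a vector compare yields an all-ones or all-zeros mask per
   lane (the vec_cmp-like step), then the mask picks between the two
   inputs (the vcond_mask-like step). */
static v4si select_split(v4si a, v4si b, v4si c, v4si d)
{
    v4si mask = a < b;
    return (c & mask) | (d & ~mask);
}

/* Returns 1 when both forms agree on a small sample input. */
int selects_agree(void)
{
    v4si a = {1, 5, -3, 7};
    v4si b = {2, 5, -4, 9};
    v4si c = {10, 20, 30, 40};
    v4si d = {-1, -2, -3, -4};
    v4si x = select_fused(a, b, c, d);
    v4si y = select_split(a, b, c, d);
    for (int i = 0; i < 4; i++)
        if (x[i] != y[i])
            return 0;
    assert(x[0] == 10 && x[1] == -2);
    return 1;
}
```

Both functions compute the same values; the FAILs above concern the instruction sequences generated for such code, not its result.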
* [Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
From: rguenth at gcc dot gnu.org @ 2023-05-24 12:20 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955
--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 55149
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55149&action=edit
patch I tested
This is the patch I tested. I have not yet investigated any of the FAILs.
Causes might be missing/differing vec_cmp or vcond_mask patterns or different
behavior of the vectorizer or RTL expander.
* [Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
From: rguenth at gcc dot gnu.org @ 2023-05-24 12:42 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955
--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
One thing I see is
-(insn 11 10 15 2 (set (subreg:V16QI (reg:V2DI 83 [ <retval> ]) 0)
- (unspec:V16QI [
- (reg:V16QI 92)
- (reg:V16QI 91)
- (lt:V16QI (reg:V16QI 90)
- (const_vector:V16QI [
- (const_int 0 [0]) repeated x16
- ]))
- ] UNSPEC_BLENDV))
"/space/rguenther/src/gcc/gcc/testsuite/gcc.target/i386/sse4_1-pr99908.c":22:10
discrim 1 7431 {*sse4_1_pblendvb_lt}
(nil)))))
vs
+(insn 8 5 9 2 (set (reg:V16QI 89)
+ (const_vector:V16QI [
+ (const_int -1 [0xffffffffffffffff]) repeated x16
+ ]))
"/spc/abuild/rguenther/obj-gcc-g/gcc/include/smmintrin.h":181:20 1838
{movv16qi_internal}
+ (nil))
+(insn 9 8 11 2 (set (reg:V16QI 90)
+ (gt:V16QI (reg:V16QI 92)
+ (reg:V16QI 89)))
"/spc/abuild/rguenther/obj-gcc-g/gcc/include/smmintrin.h":181:20 6749
{*sse2_gtv16qi3}
(expr_list:REG_DEAD (reg:V16QI 92)
+ (expr_list:REG_DEAD (reg:V16QI 89)
+ (nil))))
+(note 11 9 12 2 NOTE_INSN_DELETED)
+(insn 12 11 16 2 (set (subreg:V16QI (reg:V2DI 84 [ <retval> ]) 0)
+ (unspec:V16QI [
+ (reg:V16QI 93)
+ (reg:V16QI 94)
+ (reg:V16QI 90)
+ ] UNSPEC_BLENDV))
"/space/rguenther/src/gcc/gcc/testsuite/gcc.target/i386/sse4_1-pr99908.c":22:10
discrim 1 7429 {sse4_1_pblendvb}
+ (expr_list:REG_DEAD (reg:V16QI 93)
+ (expr_list:REG_DEAD (reg:V16QI 90)
+ (expr_list:REG_DEAD (reg:V16QI 94)
(nil)))))
after the combiner, which seems to be caused by a missing simplification of
(insn 8 5 9 2 (set (reg:V16QI 89)
(const_vector:V16QI [
(const_int -1 [0xffffffffffffffff]) repeated x16
]))
(insn 9 8 11 2 (set (reg:V16QI 90)
(gt:V16QI (reg:V16QI 92)
(reg:V16QI 89)))
to
(lt:V16QI (reg:V16QI 90)
(const_vector:V16QI [
(const_int 0 [0]) repeated x16
]))
Trying 8 -> 9:
8: r89:V16QI=const_vector
9: r90:V16QI=r92:V16QI>r89:V16QI
REG_DEAD r92:V16QI
REG_DEAD r89:V16QI
Failed to match this instruction:
(set (reg:V16QI 90)
(gt:V16QI (reg:V16QI 92)
(const_vector:V16QI [
(const_int -1 [0xffffffffffffffff]) repeated x16
])))
Trying 8, 9 -> 12:
8: r89:V16QI=const_vector
9: r90:V16QI=r92:V16QI>r89:V16QI
REG_DEAD r92:V16QI
REG_DEAD r89:V16QI
12: r84:V2DI#0=unspec[r93:V16QI,r94:V16QI,r90:V16QI] 47
REG_DEAD r93:V16QI
REG_DEAD r90:V16QI
REG_DEAD r94:V16QI
Failed to match this instruction:
(set (subreg:V16QI (reg:V2DI 84 [ <retval> ]) 0)
(unspec:V16QI [
(reg:V16QI 93)
(reg:V16QI 94)
(gt:V16QI (reg:V16QI 92)
(const_vector:V16QI [
(const_int -1 [0xffffffffffffffff]) repeated x16
]))
] UNSPEC_BLENDV))
I am not sure whether the lt is a standalone thing. Maybe we just need a
define_insn_and_split for _gt as well. All of those seem to be somewhat
tuned to the exact way RTL expansion works when the vcond patterns are there.
Getting rid of vcond* (but not vcond_mask) would allow quite some
simplification in middle-end code and the vectorizer.
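
Why the two RTL forms above can describe the same blend is easy to check in scalar code. The sketch below models a single byte lane under the assumption that the blend picks its second input when the top bit of the mask byte is set; `blend_lane` and `cmp_gt` are hypothetical helpers, not GCC or intrinsic names. A signed `x > -1` is the complement of `x < 0`, and `x < 0` is just the sign bit of `x`, so the compare can be dropped if the blend arms are swapped.

```c
#include <assert.h>
#include <stdint.h>

/* Model of one byte lane of a variable blend: take `b` when the top bit
   of `mask` is set, otherwise take `a` (assumed pblendvb-like behaviour). */
static uint8_t blend_lane(uint8_t a, uint8_t b, uint8_t mask)
{
    return (mask & 0x80) ? b : a;
}

/* Model of one lane of a signed byte compare: all-ones when x > y. */
static uint8_t cmp_gt(int8_t x, int8_t y)
{
    return x > y ? 0xFF : 0x00;
}

/* Counts the inputs for which the two forms disagree. */
int count_blend_mismatches(void)
{
    const uint8_t a = 0x11, b = 0x22;
    int mismatches = 0;
    for (int v = -128; v <= 127; v++) {
        int8_t x = (int8_t)v;
        /* Split form: explicit compare against -1, then blend. */
        uint8_t via_gt = blend_lane(a, b, cmp_gt(x, -1));
        /* Combined form: blend keyed on the sign bit of x itself
           (that is, on x < 0), with the two arms swapped. */
        uint8_t via_lt = blend_lane(b, a, (uint8_t)x);
        if (via_gt != via_lt)
            mismatches++;
    }
    return mismatches;
}

/* Self-check over all 256 byte values. */
void check_blend_equivalence(void)
{
    assert(count_blend_mismatches() == 0);
}
```

This only demonstrates the arithmetic identity; whether it is best exploited by an extra pattern for gt or by a generic simplification is the open question in this comment.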
* [Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
From: rguenth at gcc dot gnu.org @ 2023-05-25 9:44 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
FAIL: gcc.dg/vect/pr109011-3.c -flto -ffat-lto-objects scan-tree-dump-times optimized " = .POPCOUNT \\\\(vect" 3
shows that when pattern recognition detects
t.c:5:21: note: vec_recog_ctz_ffs_pattern: detected: _6 = __builtin_ffs (_4);
t.c:5:21: note: created pattern stmt: patt_7 = _4 != 0 ? patt_8 : 0;
t.c:5:21: note: ctz_ffs pattern recognized: patt_7 = _4 != 0 ? patt_8 : 0;
t.c:5:21: note: extra pattern stmt: patt_18 = -_4;
t.c:5:21: note: extra pattern stmt: patt_15 = _4 | patt_18;
t.c:5:21: note: extra pattern stmt: patt_14 = .POPCOUNT (patt_15);
t.c:5:21: note: extra pattern stmt: patt_8 = 33 - patt_14;
we fail:
t.c:5:21: note: ==> examining pattern statement: patt_7 = _4 != 0 ? patt_8 : 0;
t.c:5:21: note: vect_is_simple_use: operand *_3, type of def: internal
t.c:5:21: note: vect_is_simple_use: vectype vector(16) int
t.c:5:21: note: vect_is_simple_use: operand 33 - patt_14, type of def: internal
t.c:5:21: note: vect_is_simple_use: vectype vector(16) int
t.c:5:21: note: vect_is_simple_use: operand 0, type of def: constant
t.c:2:1: missed: not vectorized: relevant stmt not supported: patt_7 = _4 != 0 ? patt_8 : 0;
t.c:5:21: missed: bad operation or unsupported loop bound.
Note that there is "unconverted" code in pattern recog that creates those "bad"
GIMPLE COND_EXPRs with GENERIC compares. I've chickened out of "fixing" those
precisely because of a similar issue: we're not perfect in supporting
vcond vs vcond_mask 1:1. Having only one of them would simplify things.
I have a patch for that.
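
For reference, the pattern statements in the dump above compute ffs as `x != 0 ? 33 - popcount (x | -x) : 0`. A minimal scalar sketch of that identity follows, assuming a 32-bit int and the GCC builtins `__builtin_ffs` and `__builtin_popcount`; the function names are hypothetical.

```c
#include <assert.h>

/* ffs computed the way the recognized pattern does it:
   x | -x sets the lowest set bit of x and every bit above it, so its
   popcount is 32 minus the zero-based index of that bit. */
int ffs_via_popcount(unsigned int x)
{
    if (x == 0)
        return 0;                       /* the "_4 != 0 ? ... : 0" arm */
    unsigned int smeared = x | (0u - x);
    return 33 - __builtin_popcount(smeared);
}

/* Returns 1 when the identity matches __builtin_ffs on a few samples. */
int ffs_identity_holds(void)
{
    static const unsigned int samples[] = {
        0u, 1u, 2u, 3u, 8u, 0x12340000u, 0x80000000u, 0xFFFFFFFFu
    };
    for (unsigned int i = 0; i < sizeof samples / sizeof samples[0]; i++)
        if (ffs_via_popcount(samples[i]) != __builtin_ffs((int)samples[i]))
            return 0;
    assert(ffs_via_popcount(8u) == 4);
    return 1;
}
```

The final select on `x != 0` is the COND_EXPR with an embedded comparison that the vectorizer rejected here.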
* [Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
From: cvs-commit at gcc dot gnu.org @ 2023-05-25 11:01 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955
--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:
https://gcc.gnu.org/g:f97572c2aeddc71b01686993b978895e55890ab6
commit r14-1238-gf97572c2aeddc71b01686993b978895e55890ab6
Author: Richard Biener <rguenther@suse.de>
Date: Thu May 25 12:55:11 2023 +0200
target/109955 - handle pattern generated COND_EXPR without vcond
The following properly handles pattern matching generated COND_EXPRs
which can still have embedded compares in vectorizable_condition
which will always code generate the masked vector variant. We
were requiring vcond with embedded comparisons instead of also
allowing (as code generated) split compare and VEC_COND_EXPR.
This fixes some of the fallout when removing vcond{,u,eq} expanders
from the x86 backend.
PR target/109955
* tree-vect-stmts.cc (vectorizable_condition): For
embedded comparisons also handle the case when the target
only provides vec_cmp and vcond_mask.
* [Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
From: rguenth at gcc dot gnu.org @ 2023-05-25 11:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955
--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
The remaining FAILs are, I think, all caused by define_insn_and_split patterns
no longer working. I wonder if these combinations can be handled in generic
code somehow.