public inbox for gcc-bugs@sourceware.org
* [Bug target/109955] New: Should be possible to remove vcond{,u,eq} expanders
From: rguenth at gcc dot gnu.org @ 2023-05-24 12:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955

            Bug ID: 109955
           Summary: Should be possible to remove vcond{,u,eq} expanders
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

It should be possible to remove all vcond, vcondu and vcondeq expanders and
have their functionality implemented via the vec_cmp and vcond_mask expanders.
But when removing them, a bootstrap & regtest reveals the following FAILs:

                === g++ tests ===


Running target unix
FAIL: g++.target/i386/avx-pr54700-1.C   scan-assembler-not vpcmpgt[bdq]
FAIL: g++.target/i386/avx-pr54700-1.C   scan-assembler-times vblendvpd 4
FAIL: g++.target/i386/avx-pr54700-1.C   scan-assembler-times vblendvps 4
FAIL: g++.target/i386/avx-pr54700-1.C   scan-assembler-times vpblendvb 2
FAIL: g++.target/i386/avx2-pr54700-1.C   scan-assembler-not vpcmpgt[bdq]
FAIL: g++.target/i386/avx2-pr54700-1.C   scan-assembler-times vblendvpd 4
FAIL: g++.target/i386/avx2-pr54700-1.C   scan-assembler-times vblendvps 4
FAIL: g++.target/i386/avx2-pr54700-1.C   scan-assembler-times vpblendvb 2
FAIL: g++.target/i386/avx512fp16-vcondmn-minmax.C  -std=gnu++14  scan-assembler-times vmaxph 3
FAIL: g++.target/i386/avx512fp16-vcondmn-minmax.C  -std=gnu++14  scan-assembler-times vminph 3
FAIL: g++.target/i386/pr100738-1.C  -std=gnu++14  scan-assembler-not vpcmpeqd[ \\\\t]
FAIL: g++.target/i386/pr100738-1.C  -std=gnu++14  scan-assembler-not vpxor[ \\\\t]
FAIL: g++.target/i386/pr100738-1.C  -std=gnu++14  scan-assembler-times vblendvps[ \\\\t] 2
FAIL: g++.target/i386/sse4_1-pr54700-1.C   scan-assembler-not pcmpgt[bdq]
FAIL: g++.target/i386/sse4_1-pr54700-1.C   scan-assembler-times blendvpd 4
FAIL: g++.target/i386/sse4_1-pr54700-1.C   scan-assembler-times blendvps 4
FAIL: g++.target/i386/sse4_1-pr54700-1.C   scan-assembler-times pblendvb 2

                === gcc tests ===


Running target unix
FAIL: gcc.dg/vect/pr109011-3.c -flto -ffat-lto-objects  scan-tree-dump-times optimized " = .POPCOUNT \\\\(vect" 3
FAIL: gcc.dg/vect/pr109011-3.c scan-tree-dump-times optimized " = .POPCOUNT \\\\(vect" 3
FAIL: gcc.dg/vect/pr109011-5.c -flto -ffat-lto-objects  scan-tree-dump-times optimized " = .POPCOUNT \\\\(vect" 3
FAIL: gcc.dg/vect/pr109011-5.c scan-tree-dump-times optimized " = .POPCOUNT \\\\(vect" 3
FAIL: gcc.target/i386/avx2-pr99908.c scan-assembler-not \\tvpcmpeq
FAIL: gcc.target/i386/avx512bw-pr96891-1.c scan-assembler-not %k[0-7]
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-not %k[0-9]
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsb[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsd[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsq[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsw[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminub[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminud[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminuq[\\t ] 2
FAIL: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminuw[\\t ] 2
FAIL: gcc.target/i386/pr109011-b1.c scan-assembler-times vpopcntb[ \\t]+ 4
FAIL: gcc.target/i386/pr109011-w1.c scan-assembler-times vpopcntw[ \\t]+ 4
FAIL: gcc.target/i386/sse4_1-pr99908.c scan-assembler-not \\tpcmpeq
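
For readers less familiar with these expanders, here is a minimal scalar
sketch of the equivalence the report relies on: a fused compare-and-select
(what a vcond expander provides) gives the same result as a separate compare
producing a lane mask (vec_cmp) followed by a mask-driven select (vcond_mask).
The function names, the 4-lane width and the test data are illustrative
assumptions, not GCC code.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4  /* hypothetical lane count, chosen only for illustration */

/* Fused form: compare and select in one step, modelling a vcond expander. */
static void vcond_lt(int32_t *r, const int32_t *a, const int32_t *b,
                     const int32_t *x, const int32_t *y)
{
    for (int i = 0; i < LANES; i++)
        r[i] = a[i] < b[i] ? x[i] : y[i];
}

/* Split form, step 1: the compare yields an all-ones or all-zeros lane. */
static void vec_cmp_lt(int32_t *mask, const int32_t *a, const int32_t *b)
{
    for (int i = 0; i < LANES; i++)
        mask[i] = a[i] < b[i] ? -1 : 0;
}

/* Split form, step 2: pick x where the mask is set, y elsewhere. */
static void vcond_mask(int32_t *r, const int32_t *x, const int32_t *y,
                       const int32_t *mask)
{
    for (int i = 0; i < LANES; i++)
        r[i] = (x[i] & mask[i]) | (y[i] & ~mask[i]);
}

/* Returns the number of lanes where the fused and split forms disagree. */
static int count_mismatches(void)
{
    const int32_t a[LANES] = { 1, 5, -3, 7 };
    const int32_t b[LANES] = { 2, 5, -4, 9 };
    const int32_t x[LANES] = { 10, 20, 30, 40 };
    const int32_t y[LANES] = { -10, -20, -30, -40 };
    int32_t fused[LANES], mask[LANES], split[LANES];
    int bad = 0;

    vcond_lt(fused, a, b, x, y);
    vec_cmp_lt(mask, a, b);
    vcond_mask(split, x, y, mask);
    for (int i = 0; i < LANES; i++) {
        assert(mask[i] == 0 || mask[i] == -1);  /* masks are all-or-nothing */
        bad += fused[i] != split[i];
    }
    return bad;
}
```

Calling count_mismatches() from a small main() is enough to try it out.  The
sketch only shows that the two forms agree semantically; the FAILs above are
about the instructions that end up being generated, which it does not model.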

* [Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
From: rguenth at gcc dot gnu.org @ 2023-05-24 12:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 55149
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55149&action=edit
patch I tested

This is the patch I tested.  I have not yet investigated any of the FAILs.

Causes might be missing/differing vec_cmp or vcond_mask patterns or different
behavior of the vectorizer or RTL expander.

* [Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
From: rguenth at gcc dot gnu.org @ 2023-05-24 12:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
One thing I see is

-(insn 11 10 15 2 (set (subreg:V16QI (reg:V2DI 83 [ <retval> ]) 0)
-        (unspec:V16QI [
-                (reg:V16QI 92)
-                (reg:V16QI 91)
-                (lt:V16QI (reg:V16QI 90)
-                    (const_vector:V16QI [
-                            (const_int 0 [0]) repeated x16
-                        ]))
-            ] UNSPEC_BLENDV))
"/space/rguenther/src/gcc/gcc/testsuite/gcc.target/i386/sse4_1-pr99908.c":22:10
discrim 1 7431 {*sse4_1_pblendvb_lt}
                 (nil)))))

vs

+(insn 8 5 9 2 (set (reg:V16QI 89)
+        (const_vector:V16QI [
+                (const_int -1 [0xffffffffffffffff]) repeated x16
+            ]))
"/spc/abuild/rguenther/obj-gcc-g/gcc/include/smmintrin.h":181:20 1838
{movv16qi_internal}
+     (nil))
+(insn 9 8 11 2 (set (reg:V16QI 90)
+        (gt:V16QI (reg:V16QI 92)
+            (reg:V16QI 89)))
"/spc/abuild/rguenther/obj-gcc-g/gcc/include/smmintrin.h":181:20 6749
{*sse2_gtv16qi3}
      (expr_list:REG_DEAD (reg:V16QI 92)
+        (expr_list:REG_DEAD (reg:V16QI 89)
+            (nil))))
+(note 11 9 12 2 NOTE_INSN_DELETED)
+(insn 12 11 16 2 (set (subreg:V16QI (reg:V2DI 84 [ <retval> ]) 0)
+        (unspec:V16QI [
+                (reg:V16QI 93)
+                (reg:V16QI 94)
+                (reg:V16QI 90)
+            ] UNSPEC_BLENDV))
"/space/rguenther/src/gcc/gcc/testsuite/gcc.target/i386/sse4_1-pr99908.c":22:10
discrim 1 7429 {sse4_1_pblendvb}
+     (expr_list:REG_DEAD (reg:V16QI 93)
+        (expr_list:REG_DEAD (reg:V16QI 90)
+            (expr_list:REG_DEAD (reg:V16QI 94)
                 (nil)))))

after the combiner, which seems to be a missing simplification of

(insn 8 5 9 2 (set (reg:V16QI 89)
        (const_vector:V16QI [
                (const_int -1 [0xffffffffffffffff]) repeated x16
            ]))
(insn 9 8 11 2 (set (reg:V16QI 90)
               (gt:V16QI (reg:V16QI 92)
                (reg:V16QI 89)))

to

(lt:V16QI (reg:V16QI 90)
                    (const_vector:V16QI [
                            (const_int 0 [0]) repeated x16
                        ])

Trying 8 -> 9:
    8: r89:V16QI=const_vector
    9: r90:V16QI=r92:V16QI>r89:V16QI
      REG_DEAD r92:V16QI
      REG_DEAD r89:V16QI
Failed to match this instruction:
(set (reg:V16QI 90)
    (gt:V16QI (reg:V16QI 92)
        (const_vector:V16QI [
                (const_int -1 [0xffffffffffffffff]) repeated x16
            ])))

Trying 8, 9 -> 12:
    8: r89:V16QI=const_vector
    9: r90:V16QI=r92:V16QI>r89:V16QI
      REG_DEAD r92:V16QI
      REG_DEAD r89:V16QI
   12: r84:V2DI#0=unspec[r93:V16QI,r94:V16QI,r90:V16QI] 47
      REG_DEAD r93:V16QI
      REG_DEAD r90:V16QI
      REG_DEAD r94:V16QI
Failed to match this instruction:
(set (subreg:V16QI (reg:V2DI 84 [ <retval> ]) 0)
    (unspec:V16QI [
            (reg:V16QI 93)
            (reg:V16QI 94) 
            (gt:V16QI (reg:V16QI 92)
                (const_vector:V16QI [
                        (const_int -1 [0xffffffffffffffff]) repeated x16
                    ]))
        ] UNSPEC_BLENDV))

I am not sure if the lt is a standalone thing.  Maybe we just need a
define_insn_and_split for _gt as well.  All those seem to be somewhat
tuned to the exact way RTL expansion works when the vcond patterns are there.

Getting rid of vcond* (but not vcond_mask) would allow quite some
simplification in middle-end code and the vectorizer.
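
To make the relationship between the lt and gt forms concrete, here is a
small scalar sketch.  It assumes that the byte blend looks only at the sign
bit of each mask byte and takes its second operand where that bit is set;
blendv, cmp_lt and cmp_gt are hypothetical helpers, not GCC or intrinsic
names.  Under that assumption, a mask of (x < 0) can be replaced by x itself,
and a mask of (x > -1) is its inverse, so the same blend falls out with the
two data operands swapped.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-byte blend: take b where the mask's sign bit is set, else a. */
static int8_t blendv(int8_t a, int8_t b, int8_t mask)
{
    return mask < 0 ? b : a;
}

/* Lane-wise compares producing all-ones (-1) or all-zeros (0) masks. */
static int8_t cmp_lt(int8_t x, int8_t y) { return x < y ? -1 : 0; }
static int8_t cmp_gt(int8_t x, int8_t y) { return x > y ? -1 : 0; }

/* Exhaustively checks all 256 byte values; returns the number of failures. */
static int count_failures(void)
{
    const int8_t a = 11, b = 22;  /* arbitrary, distinct data values */
    int bad = 0;

    for (int v = -128; v <= 127; v++) {
        int8_t x = (int8_t)v;
        /* x < 0 has the same sign bit as x, so the compare can be dropped. */
        bad += blendv(a, b, cmp_lt(x, 0)) != blendv(a, b, x);
        /* x > -1 is the inverse of x < 0: same blend with the arms swapped. */
        bad += blendv(a, b, cmp_gt(x, -1)) != blendv(b, a, x);
    }
    return bad;
}

/* Convenience wrapper for trying the sketch from a main(). */
static void self_test(void)
{
    assert(count_failures() == 0);
}
```

This only illustrates why a _gt variant of the existing _lt pattern looks
plausible; whether it is the right fix in the backend is left open above.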

* [Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
From: rguenth at gcc dot gnu.org @ 2023-05-25  9:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
FAIL: gcc.dg/vect/pr109011-3.c -flto -ffat-lto-objects  scan-tree-dump-times optimized " = .POPCOUNT \\\\(vect" 3

shows that when pattern recognition detects

t.c:5:21: note:   vec_recog_ctz_ffs_pattern: detected: _6 = __builtin_ffs (_4);
t.c:5:21: note:   created pattern stmt: patt_7 = _4 != 0 ? patt_8 : 0;
t.c:5:21: note:   ctz_ffs pattern recognized: patt_7 = _4 != 0 ? patt_8 : 0;
t.c:5:21: note:   extra pattern stmt: patt_18 = -_4;
t.c:5:21: note:   extra pattern stmt: patt_15 = _4 | patt_18;
t.c:5:21: note:   extra pattern stmt: patt_14 = .POPCOUNT (patt_15);
t.c:5:21: note:   extra pattern stmt: patt_8 = 33 - patt_14;

we fail:

t.c:5:21: note:   ==> examining pattern statement: patt_7 = _4 != 0 ? patt_8 : 0;
t.c:5:21: note:   vect_is_simple_use: operand *_3, type of def: internal
t.c:5:21: note:   vect_is_simple_use: vectype vector(16) int
t.c:5:21: note:   vect_is_simple_use: operand 33 - patt_14, type of def: internal
t.c:5:21: note:   vect_is_simple_use: vectype vector(16) int
t.c:5:21: note:   vect_is_simple_use: operand 0, type of def: constant
t.c:2:1: missed:   not vectorized: relevant stmt not supported: patt_7 = _4 != 0 ? patt_8 : 0;
t.c:5:21: missed:  bad operation or unsupported loop bound.

Note there's "unconverted" code in pattern recognition that creates those "bad"
GIMPLE COND_EXPRs with GENERIC compares.  I've chickened out of "fixing" those
precisely because of a similar issue: we're not perfect in supporting
vcond vs. vcond_mask 1:1.  Having only one of them would simplify things.

I have a patch for that.
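
As an aside, the pattern statements in the dump above can be read as the
following scalar computation.  This is a sketch that assumes a 32-bit
operand (which is what the constant 33 suggests) and uses the GCC/Clang
builtin __builtin_popcount; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* ffs computed the way the pattern statements do for a 32-bit operand. */
static int ffs_via_popcount(uint32_t x)
{
    uint32_t neg = 0u - x;                /* patt_18 = -_4 (wraps, unsigned) */
    uint32_t smear = x | neg;             /* patt_15: lowest set bit and up */
    int pop = __builtin_popcount(smear);  /* patt_14 = .POPCOUNT (patt_15) */
    int val = 33 - pop;                   /* patt_8 = 33 - patt_14 */
    return x != 0 ? val : 0;              /* patt_7 = _4 != 0 ? patt_8 : 0 */
}

/* Checks the sketch for every position of the lowest set bit. */
static int count_mismatches(void)
{
    int bad = ffs_via_popcount(0) != 0;

    for (int k = 0; k < 32; k++) {
        uint32_t low = UINT32_C(1) << k;  /* only bit k set */
        uint32_t high = ~(low - 1);       /* bit k and all bits above it */
        bad += ffs_via_popcount(low) != k + 1;
        bad += ffs_via_popcount(high) != k + 1;
    }
    return bad;
}

/* Convenience wrapper for trying the sketch from a main(). */
static void self_test(void)
{
    assert(count_mismatches() == 0);
}
```

The final select on _4 != 0 is the COND_EXPR with an embedded compare that
the vectorizer rejects in the dump; the remaining statements are plain
arithmetic.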

* [Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
From: cvs-commit at gcc dot gnu.org @ 2023-05-25 11:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955

--- Comment #4 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Richard Biener <rguenth@gcc.gnu.org>:

https://gcc.gnu.org/g:f97572c2aeddc71b01686993b978895e55890ab6

commit r14-1238-gf97572c2aeddc71b01686993b978895e55890ab6
Author: Richard Biener <rguenther@suse.de>
Date:   Thu May 25 12:55:11 2023 +0200

    target/109955 - handle pattern generated COND_EXPR without vcond

    The following properly handles pattern matching generated COND_EXPRs
    which can still have embedded compares in vectorizable_condition
    which will always code generate the masked vector variant.  We
    were requiring vcond with embedded comparisons instead of also
    allowing (as code generated) split compare and VEC_COND_EXPR.

    This fixes some of the fallout when removing vcond{,u,eq} expanders
    from the x86 backend.

            PR target/109955
            * tree-vect-stmts.cc (vectorizable_condition): For
            embedded comparisons also handle the case when the target
            only provides vec_cmp and vcond_mask.

* [Bug target/109955] Should be possible to remove vcond{,u,eq} expanders
From: rguenth at gcc dot gnu.org @ 2023-05-25 11:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109955

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
I think the remaining FAILs are all because the define_insn_and_split
patterns no longer work.  I wonder if these combinations can be handled in
generic code somehow.
