[Bug target/115517] New: Fix regression after dropping uses of vcond{,u,eq}

public inbox for gcc-bugs@sourceware.org

* [Bug target/115517] New: Fix regression after dropping uses of vcond{,u,eq}_optab
@ 2024-06-17  8:05 liuhongt at gcc dot gnu.org
  2024-06-18  6:20 ` [Bug target/115517] Fix x86 regressions " rguenth at gcc dot gnu.org
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: liuhongt at gcc dot gnu.org @ 2024-06-17  8:05 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

            Bug ID: 115517
           Summary: Fix regression after dropping uses of
                    vcond{,u,eq}_optab
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: liuhongt at gcc dot gnu.org
        Depends on: 114189
  Target Milestone: ---
            Target: x86_64-*-* i?86-*-*

> I'd appreciate testing, I do not expect fallout for x86 or arm/aarch64.
> > I know riscv doesn't implement any of the legacy optabs.  But less
> > maintained vector targets might need adjustments.
> >
> In GCC 14, I tried to remove these expanders in the x86 backend, and it
> regressed some testcases, mainly because of the optimizations we did
> in ix86_expand_{int,fp}_vcond.
> I've started testing your patch; it's possible that we still need to
> move the ix86_expand_{int,fp}_vcond optimizations to the
> middle-end (isel or match.pd) or add extra patterns to handle them in
> the RTL pass_combine.
These are the new failures I got:

g++: g++.target/i386/avx-pr54700-1.C   scan-assembler-not vpcmpgt[bdq]

g++: g++.target/i386/avx-pr54700-1.C   scan-assembler-times vblendvpd 4

g++: g++.target/i386/avx-pr54700-1.C   scan-assembler-times vblendvps 4

g++: g++.target/i386/avx-pr54700-1.C   scan-assembler-times vpblendvb 2

g++: g++.target/i386/avx2-pr54700-1.C   scan-assembler-not vpcmpgt[bdq]

g++: g++.target/i386/avx2-pr54700-1.C   scan-assembler-times vblendvpd 4

g++: g++.target/i386/avx2-pr54700-1.C   scan-assembler-times vblendvps 4

g++: g++.target/i386/avx2-pr54700-1.C   scan-assembler-times vpblendvb 2

g++: g++.target/i386/avx512fp16-vcondmn-minmax.C  -std=gnu++14  scan-assembler-times vmaxph 3

g++: g++.target/i386/avx512fp16-vcondmn-minmax.C  -std=gnu++14  scan-assembler-times vminph 3

g++: g++.target/i386/avx512fp16-vcondmn-minmax.C  -std=gnu++17  scan-assembler-times vmaxph 3

g++: g++.target/i386/avx512fp16-vcondmn-minmax.C  -std=gnu++17  scan-assembler-times vminph 3

g++: g++.target/i386/avx512fp16-vcondmn-minmax.C  -std=gnu++20  scan-assembler-times vmaxph 3

g++: g++.target/i386/avx512fp16-vcondmn-minmax.C  -std=gnu++20  scan-assembler-times vminph 3

g++: g++.target/i386/avx512fp16-vcondmn-minmax.C  -std=gnu++98  scan-assembler-times vmaxph 3

g++: g++.target/i386/avx512fp16-vcondmn-minmax.C  -std=gnu++98  scan-assembler-times vminph 3

g++: g++.target/i386/pr100637-1b.C  -std=gnu++14  scan-assembler-times pcmpeqb 2

g++: g++.target/i386/pr100637-1b.C  -std=gnu++17  scan-assembler-times pcmpeqb 2

g++: g++.target/i386/pr100637-1b.C  -std=gnu++20  scan-assembler-times pcmpeqb 2

g++: g++.target/i386/pr100637-1b.C  -std=gnu++98  scan-assembler-times pcmpeqb 2

g++: g++.target/i386/pr100637-1w.C  -std=gnu++14  scan-assembler-times pcmpeqw 2

g++: g++.target/i386/pr100637-1w.C  -std=gnu++17  scan-assembler-times pcmpeqw 2

g++: g++.target/i386/pr100637-1w.C  -std=gnu++20  scan-assembler-times pcmpeqw 2

g++: g++.target/i386/pr100637-1w.C  -std=gnu++98  scan-assembler-times pcmpeqw 2

g++: g++.target/i386/pr100738-1.C  -std=gnu++14  scan-assembler-not vpcmpeqd[ \\t]

g++: g++.target/i386/pr100738-1.C  -std=gnu++14  scan-assembler-not vpxor[ \\t]

g++: g++.target/i386/pr100738-1.C  -std=gnu++14  scan-assembler-times vblendvps[ \\t] 2

g++: g++.target/i386/pr100738-1.C  -std=gnu++17  scan-assembler-not vpcmpeqd[ \\t]

g++: g++.target/i386/pr100738-1.C  -std=gnu++17  scan-assembler-not vpxor[ \\t]

g++: g++.target/i386/pr100738-1.C  -std=gnu++17  scan-assembler-times vblendvps[ \\t] 2

g++: g++.target/i386/pr100738-1.C  -std=gnu++20  scan-assembler-not vpcmpeqd[ \\t]

g++: g++.target/i386/pr100738-1.C  -std=gnu++20  scan-assembler-not vpxor[ \\t]

g++: g++.target/i386/pr100738-1.C  -std=gnu++20  scan-assembler-times vblendvps[ \\t] 2

g++: g++.target/i386/pr100738-1.C  -std=gnu++98  scan-assembler-not vpcmpeqd[ \\t]

g++: g++.target/i386/pr100738-1.C  -std=gnu++98  scan-assembler-not vpxor[ \\t]

g++: g++.target/i386/pr100738-1.C  -std=gnu++98  scan-assembler-times vblendvps[ \\t] 2

g++: g++.target/i386/pr103861-1.C  -std=gnu++14  scan-assembler-times pcmpeqb 2

g++: g++.target/i386/pr103861-1.C  -std=gnu++17  scan-assembler-times pcmpeqb 2

g++: g++.target/i386/pr103861-1.C  -std=gnu++20  scan-assembler-times pcmpeqb 2

g++: g++.target/i386/pr103861-1.C  -std=gnu++98  scan-assembler-times pcmpeqb 2

g++: g++.target/i386/pr61747.C  -std=gnu++14  scan-assembler-times max 4

g++: g++.target/i386/pr61747.C  -std=gnu++14  scan-assembler-times min 4

g++: g++.target/i386/pr61747.C  -std=gnu++17  scan-assembler-times max 4

g++: g++.target/i386/pr61747.C  -std=gnu++17  scan-assembler-times min 4

g++: g++.target/i386/pr61747.C  -std=gnu++20  scan-assembler-times max 4

g++: g++.target/i386/pr61747.C  -std=gnu++20  scan-assembler-times min 4

g++: g++.target/i386/sse4_1-pr54700-1.C   scan-assembler-not pcmpgt[bdq]

g++: g++.target/i386/sse4_1-pr54700-1.C   scan-assembler-times blendvpd 4

g++: g++.target/i386/sse4_1-pr54700-1.C   scan-assembler-times blendvps 4

g++: g++.target/i386/sse4_1-pr54700-1.C   scan-assembler-times pblendvb 2

gcc: gcc.target/i386/avx2-pr99908.c scan-assembler-not \tvpcmpeq

gcc: gcc.target/i386/avx512bw-pr96891-1.c scan-assembler-not %k[0-7]

gcc: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-not %k[0-9]

gcc: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsb[\t ] 2

gcc: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsd[\t ] 2

gcc: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsq[\t ] 2

gcc: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminsw[\t ] 2

gcc: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminub[\t ] 2

gcc: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminud[\t ] 2

gcc: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminuq[\t ] 2

gcc: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-times vpminuw[\t ] 2

gcc: gcc.target/i386/blendv-3.c scan-assembler-not vpcmp

gcc: gcc.target/i386/pr88540.c scan-assembler minpd

gcc: gcc.target/i386/sse4_1-pr99908.c scan-assembler-not \tpcmpeq

unix/-m32: g++: g++.target/i386/avx-pr54700-1.C   scan-assembler-not
vpcmpgt[bdq]

unix/-m32: g++: g++.target/i386/avx-pr54700-1.C   scan-assembler-times
vblendvpd 4

unix/-m32: g++: g++.target/i386/avx-pr54700-1.C   scan-assembler-times
vblendvps 4

unix/-m32: g++: g++.target/i386/avx-pr54700-1.C   scan-assembler-times
vpblendvb 2

unix/-m32: g++: g++.target/i386/avx2-pr54700-1.C   scan-assembler-not
vpcmpgt[bdq]

unix/-m32: g++: g++.target/i386/avx2-pr54700-1.C
scan-assembler-times vblendvpd 4

unix/-m32: g++: g++.target/i386/avx2-pr54700-1.C
scan-assembler-times vblendvps 4

unix/-m32: g++: g++.target/i386/avx2-pr54700-1.C
scan-assembler-times vpblendvb 2

unix/-m32: g++: g++.target/i386/avx512fp16-vcondmn-minmax.C
-std=gnu++14  scan-assembler-times vmaxph 3

unix/-m32: g++: g++.target/i386/avx512fp16-vcondmn-minmax.C
-std=gnu++14  scan-assembler-times vminph 3

unix/-m32: g++: g++.target/i386/avx512fp16-vcondmn-minmax.C
-std=gnu++17  scan-assembler-times vmaxph 3

unix/-m32: g++: g++.target/i386/avx512fp16-vcondmn-minmax.C
-std=gnu++17  scan-assembler-times vminph 3

unix/-m32: g++: g++.target/i386/avx512fp16-vcondmn-minmax.C
-std=gnu++20  scan-assembler-times vmaxph 3

unix/-m32: g++: g++.target/i386/avx512fp16-vcondmn-minmax.C
-std=gnu++20  scan-assembler-times vminph 3

unix/-m32: g++: g++.target/i386/avx512fp16-vcondmn-minmax.C
-std=gnu++98  scan-assembler-times vmaxph 3

unix/-m32: g++: g++.target/i386/avx512fp16-vcondmn-minmax.C
-std=gnu++98  scan-assembler-times vminph 3

unix/-m32: g++: g++.target/i386/pr100637-1b.C  -std=gnu++14
scan-assembler-times pcmpeqb 2

unix/-m32: g++: g++.target/i386/pr100637-1b.C  -std=gnu++17
scan-assembler-times pcmpeqb 2

unix/-m32: g++: g++.target/i386/pr100637-1b.C  -std=gnu++20
scan-assembler-times pcmpeqb 2

unix/-m32: g++: g++.target/i386/pr100637-1b.C  -std=gnu++98
scan-assembler-times pcmpeqb 2

unix/-m32: g++: g++.target/i386/pr100637-1w.C  -std=gnu++14
scan-assembler-times pcmpeqw 2

unix/-m32: g++: g++.target/i386/pr100637-1w.C  -std=gnu++17
scan-assembler-times pcmpeqw 2

unix/-m32: g++: g++.target/i386/pr100637-1w.C  -std=gnu++20
scan-assembler-times pcmpeqw 2

unix/-m32: g++: g++.target/i386/pr100637-1w.C  -std=gnu++98
scan-assembler-times pcmpeqw 2

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++14
scan-assembler-not vpcmpeqd[ \\t]

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++14
scan-assembler-not vpxor[ \\t]

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++14
scan-assembler-times vblendvps[ \\t] 2

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++17
scan-assembler-not vpcmpeqd[ \\t]

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++17
scan-assembler-not vpxor[ \\t]

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++17
scan-assembler-times vblendvps[ \\t] 2

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++20
scan-assembler-not vpcmpeqd[ \\t]

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++20
scan-assembler-not vpxor[ \\t]

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++20
scan-assembler-times vblendvps[ \\t] 2

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++98
scan-assembler-not vpcmpeqd[ \\t]

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++98
scan-assembler-not vpxor[ \\t]

unix/-m32: g++: g++.target/i386/pr100738-1.C  -std=gnu++98
scan-assembler-times vblendvps[ \\t] 2

unix/-m32: g++: g++.target/i386/pr103861-1.C  -std=gnu++14
scan-assembler-times pcmpeqb 2

unix/-m32: g++: g++.target/i386/pr103861-1.C  -std=gnu++17
scan-assembler-times pcmpeqb 2

unix/-m32: g++: g++.target/i386/pr103861-1.C  -std=gnu++20
scan-assembler-times pcmpeqb 2

unix/-m32: g++: g++.target/i386/pr103861-1.C  -std=gnu++98
scan-assembler-times pcmpeqb 2

unix/-m32: g++: g++.target/i386/pr61747.C  -std=gnu++14
scan-assembler-times max 4

unix/-m32: g++: g++.target/i386/pr61747.C  -std=gnu++14
scan-assembler-times min 4

unix/-m32: g++: g++.target/i386/pr61747.C  -std=gnu++17
scan-assembler-times max 4

unix/-m32: g++: g++.target/i386/pr61747.C  -std=gnu++17
scan-assembler-times min 4

unix/-m32: g++: g++.target/i386/pr61747.C  -std=gnu++20
scan-assembler-times max 4

unix/-m32: g++: g++.target/i386/pr61747.C  -std=gnu++20
scan-assembler-times min 4

unix/-m32: g++: g++.target/i386/sse4_1-pr54700-1.C
scan-assembler-not pcmpgt[bdq]

unix/-m32: g++: g++.target/i386/sse4_1-pr54700-1.C
scan-assembler-times blendvpd 4

unix/-m32: g++: g++.target/i386/sse4_1-pr54700-1.C
scan-assembler-times blendvps 4

unix/-m32: g++: g++.target/i386/sse4_1-pr54700-1.C
scan-assembler-times pblendvb 2

unix/-m32: gcc: gcc.target/i386/avx2-pr99908.c scan-assembler-not \tvpcmpeq

unix/-m32: gcc: gcc.target/i386/avx512bw-pr96891-1.c scan-assembler-not %k[0-7]

unix/-m32: gcc: gcc.target/i386/avx512vl-pr88547-1.c scan-assembler-not %k[0-9]

unix/-m32: gcc: gcc.target/i386/avx512vl-pr88547-1.c
scan-assembler-times vpminsb[\t ] 2

unix/-m32: gcc: gcc.target/i386/avx512vl-pr88547-1.c
scan-assembler-times vpminsd[\t ] 2

unix/-m32: gcc: gcc.target/i386/avx512vl-pr88547-1.c
scan-assembler-times vpminsq[\t ] 2

unix/-m32: gcc: gcc.target/i386/avx512vl-pr88547-1.c
scan-assembler-times vpminsw[\t ] 2

unix/-m32: gcc: gcc.target/i386/avx512vl-pr88547-1.c
scan-assembler-times vpminub[\t ] 2

unix/-m32: gcc: gcc.target/i386/avx512vl-pr88547-1.c
scan-assembler-times vpminud[\t ] 2

unix/-m32: gcc: gcc.target/i386/avx512vl-pr88547-1.c
scan-assembler-times vpminuq[\t ] 2

unix/-m32: gcc: gcc.target/i386/avx512vl-pr88547-1.c
scan-assembler-times vpminuw[\t ] 2

unix/-m32: gcc: gcc.target/i386/blendv-3.c scan-assembler-not vpcmp

unix/-m32: gcc: gcc.target/i386/pr88540.c scan-assembler minpd

unix/-m32: gcc: gcc.target/i386/sse4_1-pr99908.c scan-assembler-not \tpcmpeq


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114189
[Bug 114189] Target implements obsolete vcond{,u,eq} expanders

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: rguenth at gcc dot gnu.org @ 2024-06-18  6:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Btw, I had opened PR115490 with my results for this already.  Some mitigation
should be from optimizing ISEL expansion to vcond_mask and I'd start with
looking at some of the fallout from that side (note that might require
the backend reject not natively implemented vec_cmp via its operand 1
predicate)


* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: liuhongt at gcc dot gnu.org @ 2024-06-18  8:39 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #2 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
> Btw, I had opened PR115490 with my results for this already.  Some mitigation
> should be from optimizing ISEL expansion to vcond_mask and I'd start with
> looking at some of the fallout from that side (note that might require
> the backend reject not natively implemented vec_cmp via its operand 1
> predicate)

w/o AVX512, vector integer comparison only supports EQ/GT; other comparison
rtx_codes are transformed to those (e.g. GTU is emulated with us_minus + eq +
negating the vector mask).
If we restrict the predicate of operand 1, would the middle-end reject
vectorization (or lower it to a scalar version)?
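
For illustration, the GTU emulation mentioned above can be modelled one lane
at a time in plain C. This is only a sketch that assumes 8-bit unsigned lanes;
the function names (us_minus_u8, eq_mask_u8, gtu_mask_u8) are invented for
this example and are not GCC or intrinsic names.

```c
#include <stdint.h>

/* One 8-bit lane of an unsigned saturating subtract (RTL us_minus):
   the result clamps at 0 instead of wrapping around. */
uint8_t us_minus_u8(uint8_t a, uint8_t b)
{
    int d = (int)a - (int)b;
    return (uint8_t)(d < 0 ? 0 : d);
}

/* One lane of a vector EQ: all-ones when equal, zero otherwise. */
uint8_t eq_mask_u8(uint8_t a, uint8_t b)
{
    return a == b ? 0xFF : 0x00;
}

/* GTU (a, b) modelled as NOT (EQ (us_minus (a, b), 0)).
   The saturated difference is 0 exactly when a <= b. */
uint8_t gtu_mask_u8(uint8_t a, uint8_t b)
{
    uint8_t le_mask = eq_mask_u8(us_minus_u8(a, b), 0);
    return (uint8_t)~le_mask;
}

/* Compare the emulation with a direct unsigned '>' for every input pair.
   Returns 1 when all 65536 pairs agree, 0 otherwise. */
int gtu_emulation_matches(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++) {
            uint8_t want = a > b ? 0xFF : 0x00;
            if (gtu_mask_u8((uint8_t)a, (uint8_t)b) != want)
                return 0;
        }
    return 1;
}
```

The final mask inversion is the step that a blend could absorb instead by
swapping its two data operands.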


* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: rguenther at suse dot de @ 2024-06-18 10:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #3 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
> 
> --- Comment #2 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> (In reply to Richard Biener from comment #1)
> > Btw, I had opened PR115490 with my results for this already.  Some mitigation
> > should be from optimizing ISEL expansion to vcond_mask and I'd start with
> > looking at some of the fallout from that side (note that might require
> > the backend reject not natively implemented vec_cmp via its operand 1
> > predicate)
> 
> w/o AVX512, vector integer comparison only supports EQ/GT; other comparison
> rtx_codes are transformed to those (e.g. GTU is emulated with us_minus + eq +
> negating the vector mask).
> If we restrict the predicate of operand 1, would the middle-end reject
> vectorization (or lower it to a scalar version)?

Richard suggests that we implement the "obvious" transforms like
inversion in the middle-end but if for example unsigned compares
are not supported the us_minus + eq + negative trick isn't on
that list.

The main reason to restrict vec_cmp would be to avoid
a <= b ? c : d going with an unsupported vec_cmp but instead
do a > b ? d : c - the alternative is trying to fix this
on the RTL side via combine.  I understand the non-native
compares are already expanded to supported form and we
don't use a split after combine to make combinations to
a supported form easier?

I don't have a good feeling for which approach is going to be more
maintainable here.  But for example, even for the unsigned compare
"lowering", the middle-end would have range info while RTL does
not (to some extent it's available at RTL expansion time).
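
As a small illustration of the "a <= b ? c : d" versus "a > b ? d : c"
rewrite discussed above, here is a per-lane model in C. It assumes signed
32-bit lanes and a target where only GT is native; the helper names are
hypothetical and exist only for this sketch.

```c
#include <stdint.h>

/* One lane of a signed vector GT: all-ones when a > b, zero otherwise.
   This is the only comparison the sketch treats as natively available. */
uint32_t gt_mask_s32(int32_t a, int32_t b)
{
    return a > b ? 0xFFFFFFFFu : 0u;
}

/* One lane of a mask-driven select: mask ? x : y.
   The mask must be all-ones or all-zeros. */
int32_t select_s32(uint32_t mask, int32_t x, int32_t y)
{
    return (int32_t)((mask & (uint32_t)x) | (~mask & (uint32_t)y));
}

/* a <= b ? c : d, computed as a > b ? d : c:
   the comparison is inverted and the two arms are swapped,
   so no LE comparison and no mask negation is needed. */
int32_t le_select_via_gt(int32_t a, int32_t b, int32_t c, int32_t d)
{
    return select_s32(gt_mask_s32(a, b), d, c);
}
```

Doing this swap before expansion means the unsupported LE comparison never
has to be emitted in the first place.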


* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: liuhongt at gcc dot gnu.org @ 2024-06-18 11:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #4 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to rguenther@suse.de from comment #3)
> On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
> > 
> > --- Comment #2 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> > (In reply to Richard Biener from comment #1)
> > > Btw, I had opened PR115490 with my results for this already.  Some mitigation
> > > should be from optimizing ISEL expansion to vcond_mask and I'd start with
> > > looking at some of the fallout from that side (note that might require
> > > the backend reject not natively implemented vec_cmp via its operand 1
> > > predicate)
> > 
> > w/o AVX512, vector integer comparison only supports EQ/GT; other comparison
> > rtx_codes are transformed to those (e.g. GTU is emulated with us_minus + eq +
> > negating the vector mask).
> > If we restrict the predicate of operand 1, would the middle-end reject
> > vectorization (or lower it to a scalar version)?
> 
> Richard suggests that we implement the "obvious" transforms like
> inversion in the middle-end but if for example unsigned compares
> are not supported the us_minus + eq + negative trick isn't on
> that list.
> 
> The main reason to restrict vec_cmp would be to avoid
> a <= b ? c : d going with an unsupported vec_cmp but instead
> do a > b ? d : c - the alternative is trying to fix this
> on the RTL side via combine.  I understand the non-native

Yes, I have a patch which can fix most regressions via pattern matching in
combine.
Still, there is a situation that is difficult to deal with, mainly the
optimization w/o sse4.1. Because pblendvb/blendvps/blendvpd only exist under
sse4.1, w/o sse4.1 it takes 3 instructions (pand, pandn, por) to simulate the
vcond_mask, and combine matches at most 4 instructions, which makes it
currently impossible to use combine to recover those optimizations in
vcond{,u,eq}, i.e. min/max.
With sse4.1 and above, there is basically no regression anymore.
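
To make the three-instruction sequence concrete, here is a rough C model of
how a vcond_mask can be simulated with and / and-not / or. It assumes byte
lanes whose mask values are all-ones or all-zeros; blend_and_andn_or and
blend_example_lane are names made up for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-wise model of out = mask ? a : b without a blend instruction.
   Each mask byte must be 0xFF (take a) or 0x00 (take b). */
void blend_and_andn_or(uint8_t *out, const uint8_t *mask,
                       const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t t = mask[i] & a[i];              /* pand:  keep a where mask is set    */
        uint8_t f = (uint8_t)(~mask[i] & b[i]);  /* pandn: keep b where mask is clear  */
        out[i] = t | f;                          /* por:   merge the two halves        */
    }
}

/* Blend two fixed 4-byte vectors and return lane i of the result. */
uint8_t blend_example_lane(size_t i)
{
    const uint8_t mask[4] = {0xFF, 0x00, 0xFF, 0x00};
    const uint8_t a[4] = {1, 2, 3, 4};
    const uint8_t b[4] = {9, 8, 7, 6};
    uint8_t out[4];

    blend_and_andn_or(out, mask, a, b, 4);
    return out[i];   /* lanes are 1, 8, 3, 6 */
}
```

With the blend already taking three of the instructions, very little room is
left within the combine window for the comparison and anything else a min/max
pattern would need to see.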


the regression testcases w/o sse4.1

FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++14  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++17  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++20  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++98  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++14  scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++17  scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++20  scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++98  scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr103861-1.C  -std=gnu++14  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr103861-1.C  -std=gnu++17  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr103861-1.C  -std=gnu++20  scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr103861-1.C  -std=gnu++98  scan-assembler-times pcmpeqb 2
FAIL: gcc.target/i386/pr88540.c scan-assembler minpd


* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: rguenther at suse dot de @ 2024-06-18 11:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #5 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
> 
> --- Comment #4 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> (In reply to rguenther@suse.de from comment #3)
> > On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote:
> > 
> > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
> > > 
> > > --- Comment #2 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> > > (In reply to Richard Biener from comment #1)
> > > > Btw, I had opened PR115490 with my results for this already.  Some mitigation
> > > > should be from optimizing ISEL expansion to vcond_mask and I'd start with
> > > > looking at some of the fallout from that side (note that might require
> > > > the backend reject not natively implemented vec_cmp via its operand 1
> > > > predicate)
> > > 
> > > w/o AVX512, vector integer comparison only supports EQ/GT; other comparison
> > > rtx_codes are transformed to those (e.g. GTU is emulated with us_minus + eq +
> > > negating the vector mask).
> > > If we restrict the predicate of operand 1, would the middle-end reject
> > > vectorization (or lower it to a scalar version)?
> > 
> > Richard suggests that we implement the "obvious" transforms like
> > inversion in the middle-end but if for example unsigned compares
> > are not supported the us_minus + eq + negative trick isn't on
> > that list.
> > 
> > The main reason to restrict vec_cmp would be to avoid
> > a <= b ? c : d going with an unsupported vec_cmp but instead
> > do a > b ? d : c - the alternative is trying to fix this
> > on the RTL side via combine.  I understand the non-native
> 
> Yes, I have a patch which can fix most regressions via pattern matching in
> combine.
> Still, there is a situation that is difficult to deal with, mainly the
> optimization w/o sse4.1. Because pblendvb/blendvps/blendvpd only exist under
> sse4.1, w/o sse4.1 it takes 3 instructions (pand, pandn, por) to simulate the
> vcond_mask, and combine matches at most 4 instructions, which makes it
> currently impossible to use combine to recover those optimizations in
> vcond{,u,eq}, i.e. min/max.
> With sse4.1 and above, there is basically no regression anymore.

Maybe it's possible to use a define_insn_and_split for blends w/o SSE 4.1?
That would allow combine to match the high-level blend operation and
we'd only lower it afterwards?  The question is what we lose in
combinations of/into the lowered pand/pandn/por, of course.

Maybe it's possible to catch the higher-level optimization (min/max)
on the GIMPLE level instead?


* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: liuhongt at gcc dot gnu.org @ 2024-06-18 11:29 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #6 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to rguenther@suse.de from comment #5)
> On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote:
> 
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
> > 
> > --- Comment #4 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> > (In reply to rguenther@suse.de from comment #3)
> > > On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote:
> > > 
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517
> > > > 
> > > > --- Comment #2 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
> > > > (In reply to Richard Biener from comment #1)
> > > > > Btw, I had opened PR115490 with my results for this already.  Some mitigation
> > > > > should be from optimizing ISEL expansion to vcond_mask and I'd start with
> > > > > looking at some of the fallout from that side (note that might require
> > > > > the backend reject not natively implemented vec_cmp via its operand 1
> > > > > predicate)
> > > > 
> > > > w/o AVX512, vector integer comparison only supports EQ/GT; other comparison
> > > > rtx_codes are transformed to those (e.g. GTU is emulated with us_minus + eq +
> > > > negating the vector mask).
> > > > If we restrict the predicate of operand 1, would the middle-end reject
> > > > vectorization (or lower it to a scalar version)?
> > > 
> > > Richard suggests that we implement the "obvious" transforms like
> > > inversion in the middle-end but if for example unsigned compares
> > > are not supported the us_minus + eq + negative trick isn't on
> > > that list.
> > > 
> > > The main reason to restrict vec_cmp would be to avoid
> > > a <= b ? c : d going with an unsupported vec_cmp but instead
> > > do a > b ? d : c - the alternative is trying to fix this
> > > on the RTL side via combine.  I understand the non-native
> > 
> > Yes, I have a patch which can fix most regressions via pattern matching in
> > combine.
> > Still, there is a situation that is difficult to deal with, mainly the
> > optimization w/o sse4.1. Because pblendvb/blendvps/blendvpd only exist under
> > sse4.1, w/o sse4.1 it takes 3 instructions (pand, pandn, por) to simulate the
> > vcond_mask, and combine matches at most 4 instructions, which makes it
> > currently impossible to use combine to recover those optimizations in
> > vcond{,u,eq}, i.e. min/max.
> > With sse4.1 and above, there is basically no regression anymore.
> 
> Maybe it's possible to use a define_insn_and_split for blends w/o SSE 4.1?
> That would allow combine to match the high-level blend operation and
> we'd only lower it afterwards?  The question is what we lose in
> combinations of/into the lowered pand/pandn/por, of course.
I'd rather live with those regressions since they only exist below sse4.1.
> 
> Maybe it's possible to catch the higher-level optimization (min/max)
> on the GIMPLE level instead?
For the integral part, I believe the optimization is already there at the
GIMPLE level.
For the floating-point part, x86 {max,min}{ps,pd} is not IEEE-conformant; it's
an exact match of cond_expr a < b ? a : b (w/ consideration of -0.0 and NaN).
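
A scalar C model may help show why that conditional form matters. The sketch
below assumes single-precision lanes; x86_style_min and float_bits are
illustrative names only, not GCC or libm functions.

```c
#include <math.h>    /* NAN */
#include <stdint.h>
#include <string.h>

/* One lane of the operation described above: a < b ? a : b.
   Whenever the comparison is false (including when either input is NaN,
   or when both inputs are zeros of either sign) the second operand wins. */
float x86_style_min(float a, float b)
{
    return a < b ? a : b;
}

/* Return the raw bit pattern of a float so +0.0 and -0.0 can be told apart. */
uint32_t float_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Behaviour of the model:
     x86_style_min(NAN, 1.0f)   is 1.0f   (second operand)
     x86_style_min(1.0f, NAN)   is NaN    (second operand)
     x86_style_min(0.0f, -0.0f) is -0.0f  (second operand)
     x86_style_min(-0.0f, 0.0f) is +0.0f  (second operand)  */
```

The result therefore depends on operand order, unlike C's fmin, which returns
the non-NaN argument when exactly one argument is NaN. That is why a generic
min/max at the GIMPLE level cannot simply stand in for the floating-point
case.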


* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: cvs-commit at gcc dot gnu.org @ 2024-07-01  5:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #7 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:2e2dfa0095c3326a0a5fc2ff175918b42eeb044f

commit r15-1736-g2e2dfa0095c3326a0a5fc2ff175918b42eeb044f
Author: liuhongt <hongtao.liu@intel.com>
Date:   Mon Jun 17 17:16:46 2024 +0800

    Add more splitters to match (unspec [op1 op2 (gt op3 constm1_operand)] UNSPEC_BLENDV)

    These define_insn_and_split are needed after vcond{,u,eq} is obsolete.

    gcc/ChangeLog:

            PR target/115517
            * config/i386/sse.md
            (*<sse4_1>_blendv<ssemodesuffix><avxsizesuffix>_gt): New
            define_insn_and_split.
            (*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_gtint):
            Ditto.
            (*<sse4_1>_blendv<ssefltmodesuffix><avxsizesuffix>_not_gtint):
            Ditto.
            (*<sse4_1_avx2>_pblendvb_gt): Ditto.
            (*<sse4_1_avx2>_pblendvb_gt_subreg_not): Ditto.
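
    Editorial illustration (not part of the commit): the sketch below shows,
    per lane, why a blend whose mask is "op3 > -1" can be expressed as a blend
    on op3 itself with the two data operands swapped. It assumes the variable
    blend looks only at the sign bit of each mask lane; whether the new
    splitters rewrite the pattern exactly this way is an assumption here, and
    all function names are hypothetical.

```c
#include <stdint.h>

/* Lane model of a variable blend that inspects only the mask's sign bit:
   sign bit set picks y, sign bit clear picks x. */
int32_t blendv_lane(int32_t x, int32_t y, int32_t mask)
{
    return mask < 0 ? y : x;
}

/* Lane model of a signed vector compare "m > -1":
   all-ones (-1) when m is non-negative, zero otherwise. */
int32_t gt_minus1_mask(int32_t m)
{
    return m > -1 ? -1 : 0;
}

/* Returns 1 when blending on (m > -1) gives the same lane as blending
   directly on m with the data operands swapped. */
int blend_gt_equals_swapped(int32_t x, int32_t y, int32_t m)
{
    int32_t with_compare = blendv_lane(x, y, gt_minus1_mask(m));
    int32_t swapped      = blendv_lane(y, x, m);
    return with_compare == swapped;
}
```

    Under that assumption the explicit comparison becomes redundant, which is
    what the scan-assembler-not vpcmpgt checks in the failing tests look for.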


* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: cvs-commit at gcc dot gnu.org @ 2024-07-01  5:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #8 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:b06a108f0fbffe12493b527224f6e4131a72beac

commit r15-1737-gb06a108f0fbffe12493b527224f6e4131a72beac
Author: liuhongt <hongtao.liu@intel.com>
Date:   Tue Jun 18 14:03:42 2024 +0800

    Lower AVX512 kmask comparison back to AVX2 comparison when op_{true,false} is vector -1/0.

    gcc/ChangeLog
            PR target/115517
            * config/i386/sse.md
            (*<avx512>_cvtmask2<ssemodesuffix><mode>_not): New pre_reload
            splitter.
            (*<avx512>_cvtmask2<ssemodesuffix><mode>_not): Ditto.
            (*avx2_pcmp<mode>3_6): Ditto.
            (*avx2_pcmp<mode>3_7): Ditto.
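
    Editorial illustration (not part of the commit): selecting between
    all-ones and zero under a per-lane mask bit produces the same vector that
    an AVX2-style compare writes directly, so the detour through a mask
    register is unnecessary in that case. The sketch models four 32-bit lanes
    and an EQ comparison; the function names are invented for this example.

```c
#include <stdint.h>

/* AVX512-style compare: one result bit per lane, packed into a small mask. */
uint8_t cmp_eq_kmask(const int32_t a[4], const int32_t b[4])
{
    uint8_t k = 0;
    for (int i = 0; i < 4; i++)
        if (a[i] == b[i])
            k |= (uint8_t)(1u << i);
    return k;
}

/* Mask-driven select with op_true = -1 and op_false = 0. */
void select_m1_0(int32_t out[4], uint8_t k)
{
    for (int i = 0; i < 4; i++)
        out[i] = ((k >> i) & 1) ? -1 : 0;
}

/* AVX2-style compare: writes all-ones or zero into each lane directly. */
void cmp_eq_vector(int32_t out[4], const int32_t a[4], const int32_t b[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (a[i] == b[i]) ? -1 : 0;
}

/* Returns 1 when both routes give the same vector for a fixed example. */
int kmask_route_matches_vector_route(void)
{
    const int32_t a[4] = {1, 2, 3, 4};
    const int32_t b[4] = {1, 0, 3, 0};
    int32_t via_k[4], direct[4];

    select_m1_0(via_k, cmp_eq_kmask(a, b));
    cmp_eq_vector(direct, a, b);
    for (int i = 0; i < 4; i++)
        if (via_k[i] != direct[i])
            return 0;
    return 1;
}
```

    The failing tests that scan for the absence of %k registers check for
    this kind of simplification.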


* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: cvs-commit at gcc dot gnu.org @ 2024-07-01  5:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #9 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:09737d9605521df9232d9990006c44955064f44e

commit r15-1738-g09737d9605521df9232d9990006c44955064f44e
Author: liuhongt <hongtao.liu@intel.com>
Date:   Tue Jun 18 15:52:02 2024 +0800

    Match IEEE min/max with UNSPEC_IEEE_{MIN,MAX}.

    These versions of the min/max patterns implement exactly the operations
       min = (op1 < op2 ? op1 : op2)
       max = (!(op1 < op2) ? op1 : op2)

    gcc/ChangeLog:
            PR target/115517
            * config/i386/sse.md (*minmax<mode>3_1): New pre_reload
            define_insn_and_split.
            (*minmax<mode>3_2): Ditto.
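
A minimal C sketch of the two formulas quoted in the commit message, to
make the operand-order dependence visible for NaN and signed zero. The
function names are hypothetical, and the code assumes a build without
-ffast-math so that comparisons with NaN evaluate to false.

```c
#include <math.h>

/* min = (op1 < op2 ? op1 : op2): any false compare picks op2. */
static double ieee_min(double op1, double op2)
{
    return op1 < op2 ? op1 : op2;
}

/* max = (!(op1 < op2) ? op1 : op2): any false compare picks op1. */
static double ieee_max(double op1, double op2)
{
    return !(op1 < op2) ? op1 : op2;
}
```

With these definitions ieee_min(NAN, 1.0) is 1.0 while ieee_min(1.0, NAN)
is NaN, so the operations are not commutative and cannot be treated as a
plain smin/smax.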

* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: cvs-commit at gcc dot gnu.org @ 2024-07-01  5:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #10 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:3cb204046c0db899750aee9480af4f1953a40ac3

commit r15-1739-g3cb204046c0db899750aee9480af4f1953a40ac3
Author: liuhongt <hongtao.liu@intel.com>
Date:   Wed Jun 19 13:12:00 2024 +0800

    Add more splitters for movmsk with AVX512 comparison.

    gcc/ChangeLog:

            PR target/115517
            * config/i386/sse.md
            (*<sse>_movmsk<ssemodesuffix><avxsizesuffix>_lt_avx512): New
            define_insn_and_split.
            (*<sse>_movmsk<ssemodesuffix><avxsizesuffix>_<u>ext_lt_avx512):
            Ditto.
            (*<sse2_avx2>_pmovmskb_lt_avx512): Ditto.
            (*<sse2_avx2>_pmovmskb_zext_lt_avx512): Ditto.
            (*sse2_pmovmskb_ext_lt_avx512): Ditto.
            (*pmovsk_kmask_v16qi_avx512): Ditto.
            (*pmovsk_mask_v32qi_avx512): Ditto.
            (*pmovsk_mask_cmp_<mode>_avx512): Ditto.
            (*pmovsk_ptest_<mode>_avx512): Ditto.

* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: cvs-commit at gcc dot gnu.org @ 2024-07-01  5:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #11 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:e94e6ee495d95f29355bbc017214228a5e367638

commit r15-1740-ge94e6ee495d95f29355bbc017214228a5e367638
Author: liuhongt <hongtao.liu@intel.com>
Date:   Wed Jun 19 16:05:58 2024 +0800

    Adjust testcases that regressed after obsoleting vcond{,u,eq}.

    > Richard suggests that we implement the "obvious" transforms like
    > inversion in the middle-end but if for example unsigned compares
    > are not supported the us_minus + eq + negative trick isn't on
    > that list.
    >
    > The main reason to restrict vec_cmp would be to avoid
    > a <= b ? c : d going with an unsupported vec_cmp but instead
    > do a > b ? d : c - the alternative is trying to fix this
    > on the RTL side via combine.  I understand the non-native
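
The two transforms named in the quoted text can be modelled per lane in C,
as a rough sketch. us_minus_u8 is a hypothetical scalar stand-in for an
unsigned saturating subtract, and reading "negative" as inverting the
equality mask is an assumption about what the trick refers to.

```c
#include <stdint.h>

/* Scalar model of unsigned saturating subtraction (us_minus). */
static uint8_t us_minus_u8(uint8_t a, uint8_t b)
{
    return a > b ? (uint8_t)(a - b) : 0;
}

/* a >u b built from us_minus, eq-with-zero, and a final inversion. */
static uint8_t gtu_via_us_minus(uint8_t a, uint8_t b)
{
    uint8_t eq = (us_minus_u8(a, b) == 0) ? 0xFF : 0x00; /* a <=u b */
    return (uint8_t)~eq;                                 /* a >u b  */
}

/* a <= b ? c : d, written directly. */
static int select_le(uint8_t a, uint8_t b, int c, int d)
{
    return a <= b ? c : d;
}

/* Same select after inverting the compare and swapping the arms. */
static int select_gt_swapped(uint8_t a, uint8_t b, int c, int d)
{
    return a > b ? d : c;
}

/* Exhaustively checks both equivalences over all 8-bit pairs. */
static int check_all_u8(void)
{
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = 0; b < 256; b++) {
            uint8_t x = (uint8_t)a, y = (uint8_t)b;
            uint8_t expect = x > y ? 0xFF : 0x00;
            if (gtu_via_us_minus(x, y) != expect)
                return 0;
            if (select_le(x, y, 1, 2) != select_gt_swapped(x, y, 1, 2))
                return 0;
        }
    return 1;
}
```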

    Yes, I have a patch which can fix most regressions via pattern match
    in combine.
    One situation remains difficult to handle: the optimizations without
    SSE4.1. pblendvb/blendvps/blendvpd only exist with SSE4.1, so without
    it, it takes 3 instructions (pand, pandn, por) to simulate vcond_mask.
    Since combine matches at most 4 instructions, it is currently impossible
    to use combine to recover the optimizations previously done in
    vcond{,u,eq}, i.e. min/max.
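
For illustration, the pand/pandn/por emulation of vcond_mask can be
sketched per lane in C. The function names and the use of a single
uint32_t lane are assumptions made for the example; the mask is assumed to
be all-ones or all-zeros, as a vector compare would produce.

```c
#include <stdint.h>

/* Blend without SSE4.1: three logic operations per select. */
static uint32_t blend_and_andn_or(uint32_t mask, uint32_t op_true,
                                  uint32_t op_false)
{
    uint32_t t = mask & op_true;    /* pand  */
    uint32_t f = ~mask & op_false;  /* pandn */
    return t | f;                   /* por   */
}

/* Unsigned min written as compare + blend: four operations in total. */
static uint32_t min_via_blend(uint32_t a, uint32_t b)
{
    uint32_t mask = (a < b) ? UINT32_MAX : 0; /* compare yields lane mask */
    return blend_and_andn_or(mask, a, b);
}
```

The compare plus the three logic operations already adds up to four
operations in this model, which is the instruction count the message gives
as combine's matching limit.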

    In the case of sse 4.1 and above, there is basically no regression anymore.

    The testcases that regress without SSE4.1:

    FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++14  scan-assembler-times pcmpeqb 2
    FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++17  scan-assembler-times pcmpeqb 2
    FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++20  scan-assembler-times pcmpeqb 2
    FAIL: g++.target/i386/pr100637-1b.C  -std=gnu++98  scan-assembler-times pcmpeqb 2
    FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++14  scan-assembler-times pcmpeqw 2
    FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++17  scan-assembler-times pcmpeqw 2
    FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++20  scan-assembler-times pcmpeqw 2
    FAIL: g++.target/i386/pr100637-1w.C  -std=gnu++98  scan-assembler-times pcmpeqw 2
    FAIL: g++.target/i386/pr103861-1.C  -std=gnu++14  scan-assembler-times pcmpeqb 2
    FAIL: g++.target/i386/pr103861-1.C  -std=gnu++17  scan-assembler-times pcmpeqb 2
    FAIL: g++.target/i386/pr103861-1.C  -std=gnu++20  scan-assembler-times pcmpeqb 2
    FAIL: g++.target/i386/pr103861-1.C  -std=gnu++98  scan-assembler-times pcmpeqb 2
    FAIL: gcc.target/i386/pr88540.c scan-assembler minpd

    gcc/testsuite/ChangeLog:

            PR target/115517
            * g++.target/i386/pr100637-1b.C: Add xfail and -mno-sse4.1.
            * g++.target/i386/pr100637-1w.C: Ditto.
            * g++.target/i386/pr103861-1.C: Ditto.
            * gcc.target/i386/pr88540.c: Ditto.
            * gcc.target/i386/pr103941-2.c: Add -mno-avx512f.
            * g++.target/i386/sse4_1-pr100637-1b.C: New test.
            * g++.target/i386/sse4_1-pr100637-1w.C: New test.
            * g++.target/i386/sse4_1-pr103861-1.C: New test.
            * gcc.target/i386/sse4_1-pr88540.c: New test.

* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: cvs-commit at gcc dot gnu.org @ 2024-07-01  5:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #12 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:2ccdd0f22312a14ac64bf944fdc4f8e7532eb0eb

commit r15-1741-g2ccdd0f22312a14ac64bf944fdc4f8e7532eb0eb
Author: liuhongt <hongtao.liu@intel.com>
Date:   Thu Jun 20 12:41:13 2024 +0800

    Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

    Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31
    and x < 0 ? 1 : 0 into (unsigned) x >> 31.

    Add define_insn_and_split for the optimization done in
    ix86_expand_int_vcond.
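
A small scalar C sketch of the two identities for 32-bit elements. The
function names are made up for the example. Right-shifting a negative
signed value is implementation-defined in ISO C; the sketch assumes an
arithmetic shift, which is what GCC documents.

```c
#include <stdint.h>

/* x < 0 ? -1 : 0, written as a select. */
static int32_t lt0_all_ones_select(int32_t x)
{
    return x < 0 ? -1 : 0;
}

/* Same value via (signed) x >> 31; assumes an arithmetic shift. */
static int32_t lt0_all_ones_shift(int32_t x)
{
    return x >> 31;
}

/* x < 0 ? 1 : 0, written as a select. */
static int32_t lt0_one_select(int32_t x)
{
    return x < 0 ? 1 : 0;
}

/* Same value via (unsigned) x >> 31: a logical shift of the sign bit. */
static int32_t lt0_one_shift(int32_t x)
{
    return (int32_t)((uint32_t)x >> 31);
}
```

Both shift forms only move the sign bit, which is why a compare followed
by a select can collapse into a single shift.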

    gcc/ChangeLog:

            PR target/115517
            * config/i386/sse.md ("*ashr<mode>3_1"): New
            define_insn_and_split.
            (*avx512_ashr<mode>3_1): Ditto.
            (*avx2_lshr<mode>3_1): Ditto.
            (*avx2_lshr<mode>3_2): Ditto and add 2 combine splitter after
            it.
            * config/i386/mmx.md (mmxscalarsize): New mode attribute.
            (*mmw_ashr<mode>3_1): New define_insn_and_split.
            (mmx_<insn><mode>3): Add a combine splitter after it.
            (*mmx_ashrv2hi3_1): New define_insn_and_split, also add a
            combine splitter after it.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr111023-2.c: Adjust testcase.
            * gcc.target/i386/vect-div-1.c: Ditto.

* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: cvs-commit at gcc dot gnu.org @ 2024-07-01  5:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #13 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:55f80c690c5fa59836646565a9dee2a3f68374a0

commit r15-1742-g55f80c690c5fa59836646565a9dee2a3f68374a0
Author: liuhongt <hongtao.liu@intel.com>
Date:   Mon Jun 24 09:19:01 2024 +0800

    Remove vcond{,u,eq}<mode> expanders since they will be obsolete.

    gcc/ChangeLog:

            PR target/115517
            * config/i386/mmx.md (vcond<mode>v2sf): Removed.
            (vcond<MMXMODE124:mode><MMXMODEI:mode>): Ditto.
            (vcond<mode><mode>): Ditto.
            (vcondu<MMXMODE124:mode><MMXMODEI:mode>): Ditto.
            (vcondu<mode><mode>): Ditto.
            * config/i386/sse.md (vcond<V_512:mode><VF_512:mode>): Ditto.
            (vcond<V_256:mode><VF_256:mode>): Ditto.
            (vcond<V_128:mode><VF_128:mode>): Ditto.
            (vcond<VI2HFBF_AVX512VL:mode><VHF_AVX512VL:mode>): Ditto.
            (vcond<V_512:mode><VI_AVX512BW:mode>): Ditto.
            (vcond<V_256:mode><VI_256:mode>): Ditto.
            (vcond<V_128:mode><VI124_128:mode>): Ditto.
            (vcond<VI8F_128:mode>v2di): Ditto.
            (vcondu<V_512:mode><VI_AVX512BW:mode>): Ditto.
            (vcondu<V_256:mode><VI_256:mode>): Ditto.
            (vcondu<V_128:mode><VI124_128:mode>): Ditto.
            (vcondu<VI8F_128:mode>v2di): Ditto.
            (vcondeq<VI8F_128:mode>v2di): Ditto.

* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: liuhongt at gcc dot gnu.org @ 2024-07-01  5:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #14 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
Regressions with SSE4.1 and above are fixed in GCC 15; SSE2 regressions are
tracked in PR115683.

* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: liuhongt at gcc dot gnu.org @ 2024-07-01  5:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

Hongtao Liu <liuhongt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #15 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
Fixed.

* [Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab
From: rguenth at gcc dot gnu.org @ 2024-07-01  6:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517

--- Comment #16 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #14)
> regressions above SSE4.1 are fixed in GCC15, SSE2 regressions are tracked in
> PR115683

Thanks a lot for the effort!

Thread overview: 17+ messages
2024-06-17  8:05 [Bug target/115517] New: Fix regression after dropping uses of vcond{,u,eq}_optab liuhongt at gcc dot gnu.org
2024-06-18  6:20 ` [Bug target/115517] Fix x86 regressions " rguenth at gcc dot gnu.org
2024-06-18  8:39 ` liuhongt at gcc dot gnu.org
2024-06-18 10:49 ` rguenther at suse dot de
2024-06-18 11:08 ` liuhongt at gcc dot gnu.org
2024-06-18 11:17 ` rguenther at suse dot de
2024-06-18 11:29 ` liuhongt at gcc dot gnu.org
2024-07-01  5:20 ` cvs-commit at gcc dot gnu.org
2024-07-01  5:20 ` cvs-commit at gcc dot gnu.org
2024-07-01  5:20 ` cvs-commit at gcc dot gnu.org
2024-07-01  5:21 ` cvs-commit at gcc dot gnu.org
2024-07-01  5:21 ` cvs-commit at gcc dot gnu.org
2024-07-01  5:21 ` cvs-commit at gcc dot gnu.org
2024-07-01  5:21 ` cvs-commit at gcc dot gnu.org
2024-07-01  5:26 ` liuhongt at gcc dot gnu.org
2024-07-01  5:27 ` liuhongt at gcc dot gnu.org
2024-07-01  6:45 ` rguenth at gcc dot gnu.org
