[Bug target/100711] New: Miss optimization for pandn

public inbox for gcc-bugs@sourceware.org

* [Bug target/100711] New: Miss optimization for pandn
@ 2021-05-21  2:22 crazylht at gmail dot com
  2021-05-21  6:42 ` [Bug target/100711] " rguenth at gcc dot gnu.org
                   ` (14 more replies)
  0 siblings, 15 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-05-21  2:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

            Bug ID: 100711
           Summary: Miss optimization for pandn
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
                CC: hjl.tools at gmail dot com
  Target Milestone: ---
            Target: x86_64-*-* i?86-*-*

cat test.c

typedef int v4si __attribute__((vector_size (16)));
v4si
foo (int a, v4si b)
{
    return (__extension__ (v4si) {~a, ~a, ~a, ~a}) & b;
}

This generates

        notl    %edi
        vmovdqa %xmm0, %xmm1
        vpbroadcastd    %edi, %xmm0
        vpand   %xmm1, %xmm0, %xmm0
        ret

It would be better as

        vmovdqa %xmm0, %xmm1
        vpbroadcastd    %edi, %xmm0
        vpandn   %xmm1, %xmm0, %xmm0
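
For reference, the identity this request relies on can be checked in plain C. The following is a minimal sketch using GCC's generic vector extensions; `foo_andn`, `same_v4si` and `check_foo` are hypothetical helper names introduced here for illustration and are not part of the report or of GCC.

```c
#include <assert.h>

typedef int v4si __attribute__((vector_size (16)));

/* The function from the report: invert the scalar, broadcast, then AND.  */
v4si
foo (int a, v4si b)
{
  return (__extension__ (v4si) {~a, ~a, ~a, ~a}) & b;
}

/* Hypothetical rewrite: broadcast first, then AND-NOT.  This is the
   shape that a single pandn/vpandn can implement.  */
v4si
foo_andn (int a, v4si b)
{
  v4si dup = {a, a, a, a};
  return ~dup & b;
}

/* Return 1 if all four lanes are equal, 0 otherwise.  */
int
same_v4si (v4si x, v4si y)
{
  for (int i = 0; i < 4; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}

/* Spot-check the identity on a few scalar inputs.  */
void
check_foo (void)
{
  v4si b = {0x0f0f0f0f, -1, 0, 0x12345678};
  assert (same_v4si (foo (0, b), foo_andn (0, b)));
  assert (same_v4si (foo (-1, b), foo_andn (-1, b)));
  assert (same_v4si (foo (0x00ff00ff, b), foo_andn (0x00ff00ff, b)));
}
```

Calling `check_foo ()` from a test driver exercises both forms; the sketch only shows that the two functions compute the same value, not which instructions a given compiler version emits.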

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
@ 2021-05-21  6:42 ` rguenth at gcc dot gnu.org
  2021-05-21 10:36 ` segher at gcc dot gnu.org
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: rguenth at gcc dot gnu.org @ 2021-05-21  6:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |segher at gcc dot gnu.org

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
    7: r82:SI=~r89:SI
      REG_DEAD r89:SI
    8: r88:V4SI=vec_duplicate(r82:SI)
      REG_DEAD r82:SI
    9: r87:V4SI=r88:V4SI&r90:V4SI
      REG_DEAD r90:V4SI
      REG_DEAD r88:V4SI

I suppose we're confused about the vec_duplicate.  Would generally swapping
the duplicate and the bit_not be profitable?  Eventually it's a simplification
combine could try - I believe it has some cases where it tries variants of the
original instructions when combining.  Adding a combine helper pattern
looks like putting too much burden on the backend IMHO.

We don't have a generic nand optab so handling this in ISEL on gimple
isn't straight-forward.

But combine and/or forwprop could do this.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
  2021-05-21  6:42 ` [Bug target/100711] " rguenth at gcc dot gnu.org
@ 2021-05-21 10:36 ` segher at gcc dot gnu.org
  2021-05-25  5:40 ` crazylht at gmail dot com
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: segher at gcc dot gnu.org @ 2021-05-21 10:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

Segher Boessenkool <segher at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2021-05-21

--- Comment #2 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #1)
> I suppose we're confused about the vec_duplicate.  Would generally swapping
> the duplicate and the bit_not be profitable?  Eventually it's a
> simplification combine could try - I believe it has some cases where it
> tries variants of the original instructions when combining.  Adding a
> combine helper pattern looks like putting too much burden on the backend
> IMHO.
> 
> We don't have a generic nand optab so handling this in ISEL on gimple
> isn't straight-forward.
> 
> But combine and/or forwprop could do this.

Combine never tries anything.  Combine makes *one* result; if that does not
work, it does not do the combination.  (This is not completely true, but in
essence that is how it works, and it has to, so as not to have exponential
complexity.)

It would be good to define a canonical form for anything vec_duplicate.  It
probably is a good idea to pull the vec_duplicate as far outside as possible?

Canonical forms hugely reduce the amount of work needed.  Compare to how "andc"
is represented (canonically with the inverted input first), or how "nand" is
(we write that as an "orcc", an "or" with both inputs inverted, in canonical
RTL).  Because only one form is allowed, we only have to check for that one
form everywhere.

Confirmed.
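
The Boolean identities behind these two canonical forms can be sketched in plain C. This is only an illustration of the logic on 32-bit scalars, not of GCC's RTL data structures; the function names `andc`, `nand_direct`, `nand_as_or` and `check_canonical` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* "andc": AND with one input complemented, inverted input written first.  */
uint32_t
andc (uint32_t a, uint32_t b)
{
  return ~a & b;
}

/* "nand" written directly as the complement of an AND...  */
uint32_t
nand_direct (uint32_t a, uint32_t b)
{
  return ~(a & b);
}

/* ...and as an OR with both inputs inverted (De Morgan's law).  */
uint32_t
nand_as_or (uint32_t a, uint32_t b)
{
  return ~a | ~b;
}

/* Both spellings of "nand" agree, so a matcher only needs to know one.  */
void
check_canonical (void)
{
  const uint32_t samples[] = {0u, 1u, 0x0f0f0f0fu, 0xdeadbeefu, 0xffffffffu};
  const int n = (int) (sizeof samples / sizeof samples[0]);

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      assert (nand_direct (samples[i], samples[j])
              == nand_as_or (samples[i], samples[j]));
}
```

Because the two "nand" spellings always agree, picking one of them as the only permitted form costs nothing in expressiveness, which is the point being made about canonical forms.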

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
  2021-05-21  6:42 ` [Bug target/100711] " rguenth at gcc dot gnu.org
  2021-05-21 10:36 ` segher at gcc dot gnu.org
@ 2021-05-25  5:40 ` crazylht at gmail dot com
  2021-05-25  6:52 ` crazylht at gmail dot com
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-05-25  5:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---


(In reply to Segher Boessenkool from comment #2)
> (In reply to Richard Biener from comment #1)
> > I suppose we're confused about the vec_duplicate.  Would generally swapping
> > the duplicate and the bit_not be profitable?  Eventually it's a
> > simplification combine could try - I believe it has some cases where it
> > tries variants of the original instructions when combining.  Adding a
> > combine helper pattern looks like putting too much burden on the backend
> > IMHO.
> > 
> > We don't have a generic nand optab so handling this in ISEL on gimple
> > isn't straight-forward.
> > 
> > But combine and/or forwprop could do this.
> 
> Combine never tries anything.  Combine makes *one* result; if that does not
> work, it does not do the combination.  (This is not completely true, but in
> essence that is how it works, and it has to, so as not to have exponential
> complexity.)
> 
> It would be good to define a canonical form for anything vec_duplicate.  It
> probably is a good idea to pull the vec_duplicate as far outside as possible?
> 
> Canonical forms hugely reduce the amount of work needed.  Compare to how
> "andc" is represented (canonically with the inverted input first), or how
> "nand" is (we write that as an "orcc", an "or" with both inputs inverted, in
> canonical RTL).  Because only one form is allowed, we only have to check for
> that one form everywhere.
> 
> Confirmed.

Even with canonical RTL, I think a combine splitter is also needed here;
canonical RTL only helps combine/forwprop match more possibilities, but it
won't split patterns by itself.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (2 preceding siblings ...)
  2021-05-25  5:40 ` crazylht at gmail dot com
@ 2021-05-25  6:52 ` crazylht at gmail dot com
  2021-05-25 10:09 ` segher at gcc dot gnu.org
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-05-25  6:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #3)
> 
> (In reply to Segher Boessenkool from comment #2)
> > (In reply to Richard Biener from comment #1)
> > > I suppose we're confused about the vec_duplicate.  Would generally 
> Even with canonical RTL, I think a combine splitter is also needed here;
> canonical RTL only helps combine/forwprop match more possibilities, but it
> won't split patterns by itself.

I was wrong: I thought combine only supported n->1 combining, but pass_combine
actually also supports 3->2 combining, which means a define_split is not
needed here.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (3 preceding siblings ...)
  2021-05-25  6:52 ` crazylht at gmail dot com
@ 2021-05-25 10:09 ` segher at gcc dot gnu.org
  2021-11-30  8:37 ` cvs-commit at gcc dot gnu.org
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: segher at gcc dot gnu.org @ 2021-05-25 10:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #5 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to Hongtao.liu from comment #4)
> > Even with canonical RTL, I think a combine splitter is also needed here;
> > canonical RTL only helps combine/forwprop match more possibilities, but it
> > won't split patterns by itself.
> 
> I was wrong: I thought combine only supported n->1 combining, but
> pass_combine actually also supports 3->2 combining, which means a
> define_split is not needed here.

<anything>->1 and <anything>->2, yes.  But note that combine can often split
RTL without having an explicit define_split; and also note the opposite:
combine does not always pick the best spot to split, so "manual" help (a
define_split) can be needed for good results.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (4 preceding siblings ...)
  2021-05-25 10:09 ` segher at gcc dot gnu.org
@ 2021-11-30  8:37 ` cvs-commit at gcc dot gnu.org
  2021-11-30  8:46 ` crazylht at gmail dot com
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-11-30  8:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:c39d77f252e895306ef88c1efb3eff04e4232554

commit r12-5595-gc39d77f252e895306ef88c1efb3eff04e4232554
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Tue Nov 30 08:35:39 2021 +0000

    x86_64: PR target/100711: Splitters for pandn

    This patch addresses PR target/100711 by introducing define_split
    patterns so that not/broadcast/pand may be simplified (by combine)
    to broadcast/pandn.  This introduces two splitters, one for optimizing
    pandn on TARGET_SSE for V4SI and V2DI, and another for vpandn on
    TARGET_AVX2 for V16QI, V8HI, V32QI, V16HI and V8SI.  Each splitter
    has its own new testcase.

    I've also confirmed that not/broadcast/pandn is already getting
    simplified to broadcast/pand by the middle-end optimizers.

    2021-11-30  Roger Sayle  <roger@nextmovesoftware.com>
                Uroš Bizjak  <ubizjak@gmail.com>

    gcc/ChangeLog
            PR target/100711
            * config/i386/sse.md (define_split): New splitters to simplify
            not;vec_duplicate;and as vec_duplicate;andn.

    gcc/testsuite/ChangeLog
            PR target/100711
            * gcc.target/i386/pr100711-1.c: New test case.
            * gcc.target/i386/pr100711-2.c: New test case.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (5 preceding siblings ...)
  2021-11-30  8:37 ` cvs-commit at gcc dot gnu.org
@ 2021-11-30  8:46 ` crazylht at gmail dot com
  2023-05-24  8:42 ` jbeulich at suse dot com
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2021-11-30  8:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC12.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (6 preceding siblings ...)
  2021-11-30  8:46 ` crazylht at gmail dot com
@ 2023-05-24  8:42 ` jbeulich at suse dot com
  2023-05-25  6:52 ` crazylht at gmail dot com
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jbeulich at suse dot com @ 2023-05-24  8:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

jbeulich at suse dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jbeulich at suse dot com

--- Comment #8 from jbeulich at suse dot com ---
Since the commit doesn't really explain it (maybe it's obvious to others, but
it isn't to me), may I ask why two splitters were introduced, yet then still
not covering all possible modes? VI48_128 only covers two of the four possible
SSE2 modes, while VI124_AVX2 leaves out all DI-element-size ones as well as all
512-bit ones. Shouldn't both be folded, using VI_AVX2 as the mode iterator?

As an aside, it is also interesting that the 1st splitter uses TARGET_SSE
without the corresponding testcase limiting itself to just SSE. When building
that testcase with SSE2 turned off, foo() uses shufps and andnps as expected,
but the splitter doesn't appear to come into play at all for bar(), when really
it is only the broadcast that needs synthesizing, while andnps can be used
regardless of mode.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (7 preceding siblings ...)
  2023-05-24  8:42 ` jbeulich at suse dot com
@ 2023-05-25  6:52 ` crazylht at gmail dot com
  2023-05-25  8:36 ` jbeulich at suse dot com
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2023-05-25  6:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to jbeulich from comment #8)
> Since the commit doesn't really explain it (maybe it's obvious to others,
> but it isn't to me), may I ask why two splitters were introduced, yet then
> still not covering all possible modes? VI48_128 only covers two of the four
> possible SSE2 modes, while VI124_AVX2 leaves out all DI-element-size ones as
> well as all 512-bit ones. Shouldn't both be folded, using VI_AVX2 as the
> mode iterator?
We don't have a single instruction for V8HI/V16QImode broadcast without AVX2;
that's why the first splitter only has VI48_128.
And yes, for the second splitter, I think we should use VI_AVX2 to cover all
modes.
> 
> As an aside, it is also interesting that the 1st splitter uses TARGET_SSE
> without the corresponding testcase limiting itself to just SSE. When
> building that testcase with SSE2 turned off, foo() uses shufps and andnps as
> expected, but the splitter doesn't appear to come into play at all for
> bar(), when really it is only the broadcast that needs synthesizing, while
> andnps can be used regardless of mode.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (8 preceding siblings ...)
  2023-05-25  6:52 ` crazylht at gmail dot com
@ 2023-05-25  8:36 ` jbeulich at suse dot com
  2023-05-25  8:50 ` crazylht at gmail dot com
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: jbeulich at suse dot com @ 2023-05-25  8:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #10 from jbeulich at suse dot com ---
(In reply to Hongtao.liu from comment #9)
> We don't have a single instruction for V8HI/V16QImode broadcast without AVX2;
> that's why the first splitter only has VI48_128.

Does this matter? The splitters are about subsuming the "not". How the
"vec_duplicate" is carried out isn't really relevant here, is it?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (9 preceding siblings ...)
  2023-05-25  8:36 ` jbeulich at suse dot com
@ 2023-05-25  8:50 ` crazylht at gmail dot com
  2023-05-27  9:28 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: crazylht at gmail dot com @ 2023-05-25  8:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #11 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to jbeulich from comment #10)
> (In reply to Hongtao.liu from comment #9)
> > We don't have a single instruction for V8HI/V16QImode broadcast without
> > AVX2; that's why the first splitter only has VI48_128.
> 
> Does this matter? The splitters are about subsuming the "not". How the
> "vec_duplicate" is carried out isn't really relevant here, is it?

It's split into 2 patterns, but there's no V8HI/V16QImode define_insn for the
first pattern without AVX2, so there would be an ICE for an unrecognisable insn.

  [(set (match_dup 3)
        (vec_duplicate:VI48_128 (match_dup 1)))
   (set (match_dup 0)
        (and:VI48_128 (not:VI48_128 (match_dup 3))
                      (match_dup 2)))]
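
As a rough C analogue of the two resulting patterns, here shown for the V2DI case that VI48_128 also covers, the temporary below plays the role of operand 3. This is a sketch under the assumption that GCC's generic vector extensions are available; `before_split`, `after_split` and `check_split` are hypothetical names, and nothing here is taken from sse.md itself.

```c
#include <assert.h>

typedef long long v2di __attribute__((vector_size (16)));

/* Before the split: not, then vec_duplicate, then and.  */
v2di
before_split (long long a, v2di b)
{
  long long na = ~a;            /* not */
  v2di dup = {na, na};          /* vec_duplicate */
  return dup & b;               /* and */
}

/* After the split: two steps, with "tmp" standing in for operand 3.  */
v2di
after_split (long long a, v2di b)
{
  v2di tmp = {a, a};            /* (set tmp (vec_duplicate a)) */
  return ~tmp & b;              /* (set dst (and (not tmp) b)) */
}

/* Compare both forms lane by lane on a few inputs.  */
void
check_split (void)
{
  const long long scalars[] = {0, -1, 0x00ff00ff00ff00ffLL, 42};
  v2di b = {0x123456789abcdef0LL, -1};

  for (int i = 0; i < 4; i++)
    {
      v2di x = before_split (scalars[i], b);
      v2di y = after_split (scalars[i], b);
      assert (x[0] == y[0] && x[1] == y[1]);
    }
}
```

Each of the two steps in `after_split` must correspond to an instruction the target can recognise on its own, which is why the availability of a broadcast pattern for the mode matters here.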

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (10 preceding siblings ...)
  2023-05-25  8:50 ` crazylht at gmail dot com
@ 2023-05-27  9:28 ` cvs-commit at gcc dot gnu.org
  2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-05-27  9:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:ed6a9a35799c9298321d1589533767c2bb6f8d42

commit r14-1307-ged6a9a35799c9298321d1589533767c2bb6f8d42
Author: liuhongt <hongtao.liu@intel.com>
Date:   Thu May 25 16:14:14 2023 +0800

    Split notl + pbroadcast + pand to pbroadcast + pandn for more modes.

    r12-5595-gc39d77f252e895306ef88c1efb3eff04e4232554 added 2 splitters to
    transform notl + pbroadcast + pand to pbroadcast + pandn for
    VI124_AVX2, which leaves out all DI-element-size modes as
    well as all 512-bit ones.
    This patch extends the splitter to VI_AVX2, which will handle DImode for
    AVX2, and V64QImode, V32HImode, V16SImode, V8DImode for AVX512.

    gcc/ChangeLog:

            PR target/100711
            * config/i386/sse.md (*andnot<mode>3): Extend below splitter
            to VI_AVX2 to cover more modes.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr100711-2.c: Add v4di/v2di testcases.
            * gcc.target/i386/pr100711-3.c: New test.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (11 preceding siblings ...)
  2023-05-27  9:28 ` cvs-commit at gcc dot gnu.org
@ 2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
  2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
  2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-07-05  7:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #13 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jan Beulich <jbeulich@gcc.gnu.org>:

https://gcc.gnu.org/g:3186ef0cb9e2d25e8455f9990e50187e3d1eee19

commit r14-2312-g3186ef0cb9e2d25e8455f9990e50187e3d1eee19
Author: Jan Beulich <jbeulich@suse.com>
Date:   Wed Jul 5 09:48:19 2023 +0200

    x86: allow memory operand for AVX2 splitter for PR target/100711

    The intended broadcast (with AVX512) can very well be done right from
    memory.

    gcc/

            PR target/100711
            * config/i386/sse.md: Permit non-immediate operand 1 in AVX2
            form of splitter for PR target/100711.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (12 preceding siblings ...)
  2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
@ 2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
  2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-07-05  7:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #14 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jan Beulich <jbeulich@gcc.gnu.org>:

https://gcc.gnu.org/g:fa58c2871a1235cb5ba475303a2bd11ae90416d5

commit r14-2313-gfa58c2871a1235cb5ba475303a2bd11ae90416d5
Author: Jan Beulich <jbeulich@suse.com>
Date:   Wed Jul 5 09:48:47 2023 +0200

    x86: further PR target/100711-like splitting

    With respective two-operand bitwise operations now expressible by a
    single VPTERNLOG, add splitters to also deal with ior and xor
    counterparts of the original and-only case. Note that the splitters need
    to be separate, as the placement of "not" differs in the final insns
    (*iornot<mode>3, *xnor<mode>3) which are intended to pick up one half of
    the result.

    gcc/

            PR target/100711
            * config/i386/sse.md: New splitters to simplify
            not;vec_duplicate;{ior,xor} as vec_duplicate;{iornot,xnor}.

    gcc/testsuite/

            PR target/100711
            * gcc.target/i386/pr100711-4.c: New test.
            * gcc.target/i386/pr100711-5.c: New test.
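
The ior and xor counterparts rest on the same kind of identity as the original and-only case, with the "not" ending up in a different place for xor. A minimal C sketch using GCC's generic vector extensions follows; the function names are hypothetical and the code only checks the algebra, not which insns are selected.

```c
#include <assert.h>

typedef int v4si __attribute__((vector_size (16)));

/* Broadcast one scalar into all four lanes.  */
v4si
dup4 (int a)
{
  v4si v = {a, a, a, a};
  return v;
}

/* ior case: not;vec_duplicate;ior equals vec_duplicate;iornot,
   where the "not" stays on the broadcast input.  */
int
ior_identity_holds (int a, v4si b)
{
  v4si lhs = dup4 (~a) | b;
  v4si rhs = ~dup4 (a) | b;
  for (int i = 0; i < 4; i++)
    if (lhs[i] != rhs[i])
      return 0;
  return 1;
}

/* xor case: not;vec_duplicate;xor equals vec_duplicate;xnor,
   where the "not" moves to the outside of the xor.  */
int
xor_identity_holds (int a, v4si b)
{
  v4si lhs = dup4 (~a) ^ b;
  v4si rhs = ~(dup4 (a) ^ b);
  for (int i = 0; i < 4; i++)
    if (lhs[i] != rhs[i])
      return 0;
  return 1;
}

/* Spot-check both identities.  */
void
check_ior_xor (void)
{
  v4si b = {0, -1, 0x55aa55aa, 0x12345678};
  assert (ior_identity_holds (0x0f0f0f0f, b));
  assert (xor_identity_holds (0x0f0f0f0f, b));
}
```

The different position of the complement in the two right-hand sides is one way to see why the ior and xor splitters cannot share a single template.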

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Bug target/100711] Miss optimization for pandn
  2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
                   ` (13 preceding siblings ...)
  2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
@ 2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
  14 siblings, 0 replies; 16+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-07-05  7:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711

--- Comment #15 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jan Beulich <jbeulich@gcc.gnu.org>:

https://gcc.gnu.org/g:e007369c8b67bcabd57c4fed8cff2a6db82e78e6

commit r14-2314-ge007369c8b67bcabd57c4fed8cff2a6db82e78e6
Author: Jan Beulich <jbeulich@suse.com>
Date:   Wed Jul 5 09:49:16 2023 +0200

    x86: yet more PR target/100711-like splitting

    Following two-operand bitwise operations, add another splitter to also
    deal with not followed by broadcast all on its own, which can be
    expressed as simple embedded broadcast instead once a broadcast operand
    is actually permitted in the respective insn. While there, also permit
    a broadcast operand in the corresponding expander.

    gcc/

            PR target/100711
            * config/i386/sse.md: New splitters to simplify
            not;vec_duplicate as a singular vpternlog.
            (one_cmpl<mode>2): Allow broadcast for operand 1.
            (<mask_codefor>one_cmpl<mode>2<mask_name>): Likewise.

    gcc/testsuite/

            PR target/100711
            * gcc.target/i386/pr100711-6.c: New test.
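
To illustrate how a three-input truth-table operation can express a plain "not", here is a small scalar emulation in C. It assumes the usual VPTERNLOG convention that the three input bits form an index (first source as the high bit, third source as the low bit) into the 8-bit immediate; the immediate 0x55 used below is derived from that convention, as the commit message does not state which immediate GCC uses. `ternlog32` and `check_ternlog_not` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar emulation of a 32-bit three-input truth-table operation:
   for each bit position, the bits of a, b and c select one bit of imm.  */
uint32_t
ternlog32 (uint32_t a, uint32_t b, uint32_t c, uint8_t imm)
{
  uint32_t r = 0;
  for (int i = 0; i < 32; i++)
    {
      unsigned idx = (((a >> i) & 1u) << 2)
                     | (((b >> i) & 1u) << 1)
                     | ((c >> i) & 1u);
      r |= (uint32_t) ((imm >> idx) & 1u) << i;
    }
  return r;
}

/* With imm = 0x55 the result bit is set exactly when the c bit is clear,
   so the operation computes ~c and ignores a and b.  */
void
check_ternlog_not (void)
{
  const uint32_t samples[] = {0u, 1u, 0x0f0f0f0fu, 0xdeadbeefu, 0xffffffffu};
  for (int i = 0; i < 5; i++)
    assert (ternlog32 (0x12345678u, 0x9abcdef0u, samples[i], 0x55)
            == (uint32_t) ~samples[i]);
}
```

Since only the third input matters for this immediate, that input is the natural place for a broadcast operand, which matches the idea of folding the broadcast into the complementing instruction.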

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-07-05  7:49 UTC | newest]

Thread overview: 16+ messages
2021-05-21  2:22 [Bug target/100711] New: Miss optimization for pandn crazylht at gmail dot com
2021-05-21  6:42 ` [Bug target/100711] " rguenth at gcc dot gnu.org
2021-05-21 10:36 ` segher at gcc dot gnu.org
2021-05-25  5:40 ` crazylht at gmail dot com
2021-05-25  6:52 ` crazylht at gmail dot com
2021-05-25 10:09 ` segher at gcc dot gnu.org
2021-11-30  8:37 ` cvs-commit at gcc dot gnu.org
2021-11-30  8:46 ` crazylht at gmail dot com
2023-05-24  8:42 ` jbeulich at suse dot com
2023-05-25  6:52 ` crazylht at gmail dot com
2023-05-25  8:36 ` jbeulich at suse dot com
2023-05-25  8:50 ` crazylht at gmail dot com
2023-05-27  9:28 ` cvs-commit at gcc dot gnu.org
2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
2023-07-05  7:49 ` cvs-commit at gcc dot gnu.org
