public inbox for gcc-bugs@sourceware.org
* [Bug target/111023] New: missing extendv4siv4hi (and friends)
@ 2023-08-15  7:33 rguenth at gcc dot gnu.org
  2023-08-15 15:23 ` [Bug target/111023] " ubizjak at gmail dot com
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-08-15  7:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

            Bug ID: 111023
           Summary: missing extendv4siv4hi (and friends)
           Product: gcc
           Version: 13.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

We could vectorize gcc.dg/vect/pr65947-7.c if we implement the
extendv4siv4hi pattern (sign-extend V4HI to V4SI).  We can already do
vec_unpacks_lo via

        pcmpgtw %xmm0, %xmm1
        movdqa  %xmm0, %xmm2
        punpcklwd       %xmm1, %xmm2

and that would trivially extend to the required pattern - just the
input is v4hi instead of v8hi.

Other related patterns are probably missing as well; wherever we can do
vec_unpack[s]_lo we should be able to implement [zero_]extend.
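
To make the trick above concrete, here is a minimal scalar sketch in portable C of what the PCMPGTW + PUNPCKLWD pair computes per lane. It assumes %xmm1 holds zero before the compare (the zeroing instruction is not shown in the snippet above), and the function name sign_extend_v4hi_model is hypothetical, chosen only for this illustration. It models the data flow; it is not the code GCC emits.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model of the SSE2 sequence, assuming the compare register starts
   out as zero:
     PCMPGTW   builds a per-lane mask, 0xFFFF where 0 > x, else 0x0000.
     PUNPCKLWD interleaves the value (low 16 bits) with the mask (high
               16 bits), producing one 32-bit lane per input lane.  */
void sign_extend_v4hi_model(const int16_t in[4], int32_t out[4])
{
    for (int i = 0; i < 4; i++) {
        uint16_t value = (uint16_t)in[i];
        uint16_t mask = (0 > in[i]) ? 0xFFFFu : 0x0000u;
        uint32_t bits = ((uint32_t)mask << 16) | value;
        /* Reinterpret the bit pattern as a signed 32-bit lane.  */
        memcpy(&out[i], &bits, sizeof bits);
    }
}
```

Feeding it { -1, 2, -32768, 32767 } yields the same four values as 32-bit integers, which is exactly what a V4HI to V4SI sign extension should produce.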

* [Bug target/111023] missing extendv4siv4hi (and friends)
  2023-08-15  7:33 [Bug target/111023] New: missing extendv4siv4hi (and friends) rguenth at gcc dot gnu.org
@ 2023-08-15 15:23 ` ubizjak at gmail dot com
  2023-08-17  6:46 ` rguenth at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: ubizjak at gmail dot com @ 2023-08-15 15:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

--- Comment #1 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Richard Biener from comment #0)
> We could vectorize gcc.dg/vect/pr65947-7.c if we implement the
> extendv4siv4hi pattern (sign-extend V4HI to V4SI).  We can already do
> vec_unpacks_lo via
> 
>         pcmpgtw %xmm0, %xmm1
>         movdqa  %xmm0, %xmm2
>         punpcklwd       %xmm1, %xmm2
> 
> and that would trivially extend to the required pattern - just the
> input is v4hi instead of v8hi.
> 
> Other related patterns are probably missing as well, where we can do
> vec_unpack[s]_lo we should be able to implement [zero_]extend.

We have:

(define_expand "<insn>v4hiv4si2"
  [(set (match_operand:V4SI 0 "register_operand")
        (any_extend:V4SI
          (match_operand:V4HI 1 "nonimmediate_operand")))]
  "TARGET_SSE4_1"

in sse.md, so the testcase should be vectorized using -msse4.1. Is there any
other pattern missing for efficient vectorization?
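
For readers who want to poke at the V4HI to V4SI extension directly, one possible sketch uses the GNU C generic vector extensions and __builtin_convertvector (available in GCC 9 and later, and in Clang). The type names v4hi and v4si and the function widen_v4hi are hypothetical, introduced only for this example; whether a given compiler version routes this conversion through the expander quoted above depends on the target flags and is an assumption worth checking in the generated assembly.

```c
#include <stdint.h>

/* Hypothetical vector typedefs for this sketch only.  */
typedef int16_t v4hi __attribute__((vector_size(8)));   /* 4 x 16-bit lanes */
typedef int32_t v4si __attribute__((vector_size(16)));  /* 4 x 32-bit lanes */

/* Sign-extend four 16-bit lanes to four 32-bit lanes.  The operands are
   passed by pointer so the example does not depend on vector
   argument-passing conventions.  */
void widen_v4hi(const v4hi *src, v4si *dst)
{
    *dst = __builtin_convertvector(*src, v4si);
}
```

Compiling this with and without -msse4.1 and comparing the assembly is a quick way to see which instruction sequence is chosen.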

* [Bug target/111023] missing extendv4siv4hi (and friends)
  2023-08-15  7:33 [Bug target/111023] New: missing extendv4siv4hi (and friends) rguenth at gcc dot gnu.org
  2023-08-15 15:23 ` [Bug target/111023] " ubizjak at gmail dot com
@ 2023-08-17  6:46 ` rguenth at gcc dot gnu.org
  2023-08-18  9:58 ` ubizjak at gmail dot com
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-08-17  6:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Huh, that escaped me.  I'll have to go back and compare where it derails
compared to aarch64 with neon.

* [Bug target/111023] missing extendv4siv4hi (and friends)
  2023-08-15  7:33 [Bug target/111023] New: missing extendv4siv4hi (and friends) rguenth at gcc dot gnu.org
  2023-08-15 15:23 ` [Bug target/111023] " ubizjak at gmail dot com
  2023-08-17  6:46 ` rguenth at gcc dot gnu.org
@ 2023-08-18  9:58 ` ubizjak at gmail dot com
  2023-08-18  9:59 ` ubizjak at gmail dot com
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: ubizjak at gmail dot com @ 2023-08-18  9:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com
             Status|UNCONFIRMED                   |ASSIGNED
   Last reconfirmed|                              |2023-08-18
     Ever confirmed|0                             |1

--- Comment #3 from Uroš Bizjak <ubizjak at gmail dot com> ---
The idea of implementing some sign/zero extensions using PUNPCKL?? is quite
interesting. We can implement extensions for all <= 64-bit vector modes that
extend to a wider vector mode, also for SSE2.

I have a patch.

* [Bug target/111023] missing extendv4siv4hi (and friends)
  2023-08-15  7:33 [Bug target/111023] New: missing extendv4siv4hi (and friends) rguenth at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-08-18  9:58 ` ubizjak at gmail dot com
@ 2023-08-18  9:59 ` ubizjak at gmail dot com
  2023-08-18 11:37 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: ubizjak at gmail dot com @ 2023-08-18  9:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

--- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
Created attachment 55753
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55753&action=edit
Proposed patch

Patch that implements zero/sign extend of <= 64-bit vector modes to a wider
vector mode, also for SSE2.

* [Bug target/111023] missing extendv4siv4hi (and friends)
  2023-08-15  7:33 [Bug target/111023] New: missing extendv4siv4hi (and friends) rguenth at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-08-18  9:59 ` ubizjak at gmail dot com
@ 2023-08-18 11:37 ` rguenth at gcc dot gnu.org
  2023-08-18 17:08 ` cvs-commit at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-08-18 11:37 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
So for gcc.dg/vect/pr65947-7.c the main difference is that aarch64 succeeds
with

t.c:12:21: note:  ***** Re-trying analysis with vector mode V4HI
t.c:12:21: note:   === vect_analyze_data_refs ===
t.c:12:21: note:   got vectype for stmt: aval_13 = *_3;
vector(4) short int
t.c:12:21: note:   got vectype for stmt: _7 = *_6;
vector(4) int

while x86_64 fails with both the default (V8HI) and V8QI

t.c:12:21: note:   === vect_analyze_data_refs ===
t.c:12:21: note:   got vectype for stmt: aval_13 = *_3;
vector(8) short int
t.c:12:21: note:   got vectype for stmt: _7 = *_6;
vector(4) int
...
t.c:12:21: note:  ***** Re-trying analysis with vector mode V8QI
t.c:12:21: note:   === vect_analyze_data_refs ===
t.c:12:21: note:   got vectype for stmt: aval_13 = *_3;
vector(4) short int
t.c:12:21: note:   got vectype for stmt: _7 = *_6;
vector(2) int

that is, aarch64 is special here in that it somehow tries V4HI which ends
up behaving differently than V8QI.  aarch64 also tries V2SI for the
epilogue which yields

t.c:12:21: note:   === vect_analyze_data_refs ===
t.c:12:21: note:   got vectype for stmt: aval_13 = *_3;
vector(4) short int 
t.c:12:21: note:   got vectype for stmt: _7 = *_6;
vector(2) int

aarch64 also fails for V8HI (same default).

The order for aarch64 is V8HI, V4HI ..., x86_64 tries V8HI, V8QI, V4QI.

That V4HI yields V4SI as related mode is a "fluke"(?) of
aarch64_vectorize_related_mode which has

  /* Prefer to use 1 128-bit vector instead of 2 64-bit vectors.  */
  if (TARGET_SIMD
      && (vec_flags & VEC_ADVSIMD)
      && known_eq (nunits, 0U)
      && known_eq (GET_MODE_BITSIZE (vector_mode), 64U)
      && maybe_ge (GET_MODE_BITSIZE (element_mode)
                   * GET_MODE_NUNITS (vector_mode), 128U))
    {
      machine_mode res = aarch64_simd_container_mode (element_mode, 128);
      if (VECTOR_MODE_P (res))
        return res;
    }

which essentially "violates" the one-vector-size design of the loop
vectorizer in these kinds of special cases.

So indeed x86 isn't going to vectorize this because of the inherent
limitation of the vectorizer which chooses vector types too early.
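
The lane counts in the dumps above follow from simple arithmetic, which the small sketch below spells out. The helper names lanes and lane_counts_match are hypothetical and exist only for this illustration; they are not vectorizer functions.

```c
/* Number of lanes in a vector of vector_bits bits whose elements are
   element_bits bits wide.  */
unsigned lanes(unsigned vector_bits, unsigned element_bits)
{
    return vector_bits / element_bits;
}

/* Nonzero when the narrow and the wide vector type have the same number
   of lanes, so one narrow vector maps onto exactly one wide vector.  */
int lane_counts_match(unsigned narrow_vec_bits, unsigned narrow_elt_bits,
                      unsigned wide_vec_bits, unsigned wide_elt_bits)
{
    return lanes(narrow_vec_bits, narrow_elt_bits)
           == lanes(wide_vec_bits, wide_elt_bits);
}
```

With a single 128-bit vector size, short gets 8 lanes and int gets 4; with a single 64-bit size, short gets 4 and int gets 2. Only the mixed case seen on aarch64 (64-bit short vector, 128-bit int vector) gives 4 lanes on both sides, which matches the vector(4) short int / vector(4) int pair in the first dump.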

* [Bug target/111023] missing extendv4siv4hi (and friends)
  2023-08-15  7:33 [Bug target/111023] New: missing extendv4siv4hi (and friends) rguenth at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2023-08-18 11:37 ` rguenth at gcc dot gnu.org
@ 2023-08-18 17:08 ` cvs-commit at gcc dot gnu.org
  2023-08-21  7:26 ` ubizjak at gmail dot com
  2023-08-21  9:25 ` rguenth at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-08-18 17:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Uros Bizjak <uros@gcc.gnu.org>:

https://gcc.gnu.org/g:4123b5609da53c8f8ac01c90aef127ad6375e9df

commit r14-3327-g4123b5609da53c8f8ac01c90aef127ad6375e9df
Author: Uros Bizjak <ubizjak@gmail.com>
Date:   Fri Aug 18 19:06:38 2023 +0200

    i386: Use PUNPCKL?? to implement vector extend and zero_extend for TARGET_SSE2.

    Implement vector extend and zero_extend functionality for TARGET_SSE2 using
    PUNPCKL?? family of instructions. The code for e.g. zero-extend from V2SI to
    V2DImode improves from:

            movd    %xmm0, %edx
            pshufd  $85, %xmm0, %xmm0
            movd    %xmm0, %eax
            movq    %rdx, (%rdi)
            movq    %rax, 8(%rdi)

    to:
            pxor    %xmm1, %xmm1
            punpckldq       %xmm1, %xmm0
            movaps  %xmm0, (%rdi)

    And the code for sign-extend from V2SI to V2DImode from:

            movd    %xmm0, %edx
            pshufd  $85, %xmm0, %xmm0
            movd    %xmm0, %eax
            movslq  %edx, %rdx
            cltq
            movq    %rdx, (%rdi)
            movq    %rax, 8(%rdi)

    to:
            pxor    %xmm1, %xmm1
            pcmpgtd %xmm0, %xmm1
            punpckldq       %xmm1, %xmm0
            movaps  %xmm0, (%rdi)

            PR target/111023

    gcc/ChangeLog:

            * config/i386/i386-expand.cc (ix86_split_mmx_punpck):
            Also handle V2QImode.
            (ix86_expand_sse_extend): New function.
            * config/i386/i386-protos.h (ix86_expand_sse_extend): New prototype.
            * config/i386/mmx.md (<any_extend:insn>v4qiv4hi2): Enable for
            TARGET_SSE2.  Expand through ix86_expand_sse_extend for
            !TARGET_SSE4_1.
            (<any_extend:insn>v2hiv2si2): Ditto.
            (<any_extend:insn>v2qiv2hi2): Ditto.
            * config/i386/sse.md (<any_extend:insn>v8qiv8hi2): Ditto.
            (<any_extend:insn>v4hiv4si2): Ditto.
            (<any_extend:insn>v2siv2di2): Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr111023-2.c: New test.
            * gcc.target/i386/pr111023-4b.c: New test.
            * gcc.target/i386/pr111023-8b.c: New test.
            * gcc.target/i386/pr111023.c: New test.
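
As a rough illustration of the two improved sequences in the commit message above, the following portable C sketch models what they compute for V2SI to V2DI. The function names are hypothetical and the code is a scalar model of the data flow under that reading of the assembly, not the implementation of ix86_expand_sse_extend.

```c
#include <stdint.h>
#include <string.h>

/* Zero-extend: PXOR produces a zero register, PUNPCKLDQ interleaves each
   32-bit value (low half) with a zero (high half).  */
void zero_extend_v2si_model(const uint32_t in[2], uint64_t out[2])
{
    for (int i = 0; i < 2; i++)
        out[i] = (uint64_t)in[i];   /* the high 32 bits come from the zero register */
}

/* Sign-extend: PXOR + PCMPGTD build a mask that is all-ones where 0 > x,
   then PUNPCKLDQ places that mask in the high half of each 64-bit lane.  */
void sign_extend_v2si_model(const int32_t in[2], int64_t out[2])
{
    for (int i = 0; i < 2; i++) {
        uint32_t mask = (0 > in[i]) ? 0xFFFFFFFFu : 0u;
        uint64_t bits = ((uint64_t)mask << 32) | (uint32_t)in[i];
        /* Reinterpret the bit pattern as a signed 64-bit lane.  */
        memcpy(&out[i], &bits, sizeof bits);
    }
}
```

The only difference between the two variants is where the high half comes from: a constant zero for zero-extension, or the compare mask for sign-extension.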

* [Bug target/111023] missing extendv4siv4hi (and friends)
  2023-08-15  7:33 [Bug target/111023] New: missing extendv4siv4hi (and friends) rguenth at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2023-08-18 17:08 ` cvs-commit at gcc dot gnu.org
@ 2023-08-21  7:26 ` ubizjak at gmail dot com
  2023-08-21  9:25 ` rguenth at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: ubizjak at gmail dot com @ 2023-08-21  7:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|ubizjak at gmail dot com       |unassigned at gcc dot gnu.org
             Status|ASSIGNED                      |NEW
                 CC|                              |ubizjak at gmail dot com

--- Comment #7 from Uroš Bizjak <ubizjak at gmail dot com> ---
The target part is now implemented (even for SSE2).

Should we keep this PR open as a tree-vectorizer enhancement?

* [Bug target/111023] missing extendv4siv4hi (and friends)
  2023-08-15  7:33 [Bug target/111023] New: missing extendv4siv4hi (and friends) rguenth at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2023-08-21  7:26 ` ubizjak at gmail dot com
@ 2023-08-21  9:25 ` rguenth at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: rguenth at gcc dot gnu.org @ 2023-08-21  9:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111023

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Let's close it, we very likely have a duplicate.
