public inbox for gcc-bugs@sourceware.org
* [Bug target/104946] New: [12 regression] Suboptimal gimple folding for blendvpd under sse4.1
@ 2022-03-16  5:54 crazylht at gmail dot com
  2022-03-16  5:58 ` [Bug target/104946] " crazylht at gmail dot com
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: crazylht at gmail dot com @ 2022-03-16  5:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104946

            Bug ID: 104946
           Summary: [12 regression] Suboptimal gimple folding for blendvpd
                    under sse4.1
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---

While working on PR104666, I found the following:

cat test.c

typedef double __m128d __attribute__((__vector_size__(16), __may_alias__));
__m128d sse4_1_blendvpd (__m128d a, __m128d b, __m128d c)
__attribute__((__target__("sse4.1")));

__m128d
generic_blendvpd (__m128d a, __m128d b, __m128d c)
{
  return __builtin_ia32_blendvpd (a, b, c);
}

gcc -O2 -msse4.1 -mno-sse4.2

generic_blendvpd:
        movq    rax, xmm2
        movapd  xmm3, xmm0
        test    rax, rax
        jns     .L3
        movapd  xmm0, xmm1
.L3:
        pextrq  rax, xmm2, 1
        unpckhpd        xmm3, xmm3
        test    rax, rax
        jns     .L5
        unpckhpd        xmm1, xmm1
        movapd  xmm3, xmm1
.L5:
        unpcklpd        xmm0, xmm3
        ret

This is because pcmpgtq requires SSE4.2. Without it, vec_cmpv2di is lowered to
scalar operations that are not combined back into a vector instruction.
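
For reference, the selection that the fold expresses, ((v2di) c) < 0 ? b : a,
can be sketched with GCC's generic vector extensions. This is only an
illustration of the semantics, not the code GCC uses internally; the names
blendv_sketch, v2df and v2di are assumptions made up for this example.

```c
/* Assumed illustrative typedefs, mirroring the __m128d typedef in test.c. */
typedef double v2df __attribute__((__vector_size__(16)));
typedef long long v2di __attribute__((__vector_size__(16)));

/* Hypothetical helper: for each 64-bit lane, pick b when the sign bit of c
   is set, otherwise pick a.  The signed compare against zero is the step
   that needs a V2DI vector compare (pcmpgtq) to stay in vector form.  */
static v2df
blendv_sketch (v2df a, v2df b, v2df c)
{
  const v2di zero = { 0, 0 };
  /* Each lane of mask is all-ones where the sign bit of c is set.  */
  v2di mask = (v2di) (((v2di) c) < zero);
  /* Bitwise select: b where mask is set, a elsewhere.  */
  return (v2df) ((mask & (v2di) b) | (~mask & (v2di) a));
}
```

Note that only the sign bit matters, so a lane of c holding -0.0 also selects
from b.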

With SSE4.2 enabled, GCC generates optimal code:

generic_blendvpd:
        movapd  xmm3, xmm0
        movdqa  xmm0, xmm2
        blendvpd        xmm3, xmm1, xmm0
        movapd  xmm0, xmm3
        ret


* [Bug target/104946] [12 regression] Suboptimal gimple folding for blendvpd under sse4.1
From: crazylht at gmail dot com @ 2022-03-16  5:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104946

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|                            |x86_64-*-* i?86-*-*

--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---
I think we should restrict gimple folding of __builtin_ia32_blendvpd to
TARGET_SSE4_2.

For the other blendv builtins, the corresponding vec_cmp pattern is available
whenever the builtin's own ISA requirement is met.
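
The reasoning above can be sketched as a small table-driven check. This is a
hypothetical model, not GCC source: the enum, struct and function names are
invented for illustration, and the table only records the well-known ISA
levels of the 128-bit instructions involved (blendvps, pblendvb and blendvpd
are SSE4.1; pcmpgtd and pcmpgtb are SSE2; pcmpgtq is SSE4.2).

```c
/* Hypothetical linear ordering of the ISA levels relevant here.  */
enum isa_level { ISA_SSE2, ISA_SSE4_1, ISA_SSE4_2 };

/* Hypothetical record: the ISA a blendv builtin needs, and the ISA needed
   by the vector compare its folded form relies on.  */
struct blendv_info
{
  const char *builtin;
  enum isa_level builtin_isa;
  enum isa_level cmp_isa;
};

static const struct blendv_info blendv_table[] = {
  { "__builtin_ia32_blendvps",    ISA_SSE4_1, ISA_SSE2   }, /* pcmpgtd */
  { "__builtin_ia32_pblendvb128", ISA_SSE4_1, ISA_SSE2   }, /* pcmpgtb */
  { "__builtin_ia32_blendvpd",    ISA_SSE4_1, ISA_SSE4_2 }, /* pcmpgtq */
};

/* Folding is only worthwhile when the compare the fold introduces is also
   available; otherwise it would be lowered to scalar code.  */
static int
fold_is_worthwhile (const struct blendv_info *info, enum isa_level enabled)
{
  return enabled >= info->builtin_isa && enabled >= info->cmp_isa;
}
```

In this model only the blendvpd entry has a compare that needs a higher ISA
level than the builtin itself, which is why it is the one case that needs an
extra guard.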


* [Bug target/104946] [12 regression] Suboptimal gimple folding for blendvpd under sse4.1
From: rguenth at gcc dot gnu.org @ 2022-03-16  7:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104946

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0


* [Bug target/104946] [12 regression] Suboptimal gimple folding for blendvpd under sse4.1
From: cvs-commit at gcc dot gnu.org @ 2022-03-16  8:57 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104946

--- Comment #2 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:570d5bff9af537265a3e0935140786e5fdf51de1

commit r12-7662-g570d5bff9af537265a3e0935140786e5fdf51de1
Author: liuhongt <hongtao.liu@intel.com>
Date:   Wed Mar 16 15:59:57 2022 +0800

    Don't fold __builtin_ia32_blendvpd w/o sse4.2.

    __builtin_ia32_blendvpd is defined under sse4.1 and gimple folded
    to ((v2di) c) < 0 ? b : a where vec_cmpv2di is under sse4.2 w/o which
    it's veclowered to scalar operations and not combined back in rtl.

    gcc/ChangeLog:

            PR target/104946
            * config/i386/i386-builtin.def (BDESC): Add
            CODE_FOR_sse4_1_blendvpd for IX86_BUILTIN_BLENDVPD.
            * config/i386/i386.cc (ix86_gimple_fold_builtin): Don't fold
            __builtin_ia32_blendvpd w/o sse4.2

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/sse4_1-blendvpd-1.c: New test.


* [Bug target/104946] [12 regression] Suboptimal gimple folding for blendvpd under sse4.1
From: crazylht at gmail dot com @ 2022-03-16  9:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104946

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC 12.

