public inbox for gcc-bugs@sourceware.org
* [Bug target/96246] New: [AVX512] inefficient code generation for vpblendm*
From: crazylht at gmail dot com @ 2020-07-20  6:07 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246

            Bug ID: 96246
           Summary: [AVX512] inefficient code generation for vpblendm*
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---
            Target: i386, x86-64

cat test.c

---
typedef int v8si __attribute__ ((__vector_size__ (32)));
v8si
foo (v8si a, v8si b, v8si c, v8si d)
{
    return a > b ? c : d;
}
---
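
For reference, a scalar model of what the test case computes may help when comparing the two instruction sequences below. This is only a sketch: select_gt_ref is an illustrative name that does not appear in the report, and the eight lanes correspond to the 32-byte v8si type.

```c
#include <stdint.h>

/* Scalar model of the vector conditional in foo: for each of the
   eight 32-bit lanes, pick c[i] when a[i] > b[i], otherwise d[i].
   select_gt_ref is a hypothetical name used only for illustration.  */
void
select_gt_ref (const int32_t a[8], const int32_t b[8],
               const int32_t c[8], const int32_t d[8], int32_t out[8])
{
  for (int i = 0; i < 8; i++)
    out[i] = a[i] > b[i] ? c[i] : d[i];
}
```

Any correct code sequence for foo has to agree with this lane-by-lane selection.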

gcc11 -O2 -mavx512f -mavx512vl

GCC generates:
---
        vpcmpd  $6, %ymm1, %ymm0, %k1
        vmovdqa32       %ymm2, %ymm3{%k1}
        vmovdqa %ymm3, %ymm0 
        ret
---

This could be optimized to:

---
        vpcmpd  $6, %ymm1, %ymm0, %k1
        vpblendmd       %ymm2, %ymm3, %ymm0 {%k1}
---

GCC fails to generate the optimal code because, in sse.md,

(define_insn "<avx512>_load<mode>_mask" has the same RTL pattern as
(define_insn "<avx512>_blendm<mode>" and appears earlier in the file. The RTL is
therefore always recognized as <avx512>_load<mode>_mask, which misses the
opportunity in pass_reload, and it cannot be combined into <avx512>_blendm<mode>
after reload.
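
The difference between the two sequences can be sketched in scalar C. This is a rough model under stated assumptions, not GCC or hardware code: masked_move stands in for the merge-masked vmovdqa32 (the destination is also an input), blend stands in for the three-operand vpblendmd, and all function names here are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define LANES 8

/* Model of vpcmpd $6: bit i of the mask is set when a[i] > b[i] (signed).  */
static uint8_t
cmp_gt_mask (const int32_t a[LANES], const int32_t b[LANES])
{
  uint8_t k = 0;
  for (int i = 0; i < LANES; i++)
    if (a[i] > b[i])
      k |= (uint8_t) (1u << i);
  return k;
}

/* Model of a merge-masked move: lanes whose mask bit is clear keep the
   old contents of dst, so dst is both an input and the output.  */
static void
masked_move (int32_t dst[LANES], const int32_t src[LANES], uint8_t k)
{
  for (int i = 0; i < LANES; i++)
    if (k & (1u << i))
      dst[i] = src[i];
}

/* Model of a blend: dst is a pure output chosen from two sources.  */
static void
blend (int32_t dst[LANES], const int32_t if_clear[LANES],
       const int32_t if_set[LANES], uint8_t k)
{
  for (int i = 0; i < LANES; i++)
    dst[i] = (k & (1u << i)) ? if_set[i] : if_clear[i];
}

/* Sequence GCC emitted: overwrite the register holding d in place,
   then copy it to the return register.  */
void
select_via_masked_move (const int32_t a[LANES], const int32_t b[LANES],
                        const int32_t c[LANES], const int32_t d[LANES],
                        int32_t ret[LANES])
{
  int32_t tmp[LANES];
  memcpy (tmp, d, sizeof tmp);            /* d lives in ymm3 */
  masked_move (tmp, c, cmp_gt_mask (a, b));
  memcpy (ret, tmp, sizeof tmp);          /* the extra vmovdqa */
}

/* Sequence proposed in the report: write the return register directly.  */
void
select_via_blend (const int32_t a[LANES], const int32_t b[LANES],
                  const int32_t c[LANES], const int32_t d[LANES],
                  int32_t ret[LANES])
{
  blend (ret, d, c, cmp_gt_mask (a, b));
}
```

Both functions produce the same lanes; the masked-move version needs the additional copy only because its destination is tied to one of its inputs.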

* [Bug target/96246] [AVX512] inefficient code generation for vpblendm*
From: rguenth at gcc dot gnu.org @ 2020-07-20  7:21 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2020-07-20
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
With -mavx2 it works:

        vpcmpgtd        %ymm1, %ymm0, %ymm0
        vpblendvb       %ymm0, %ymm2, %ymm3, %ymm0

not sure how _load<mode> comes into play - we expand from


  <bb 2> [local count: 1073741824]:
  _6 = .VCOND (a_2(D), b_3(D), c_4(D), d_5(D), 109);
  return _6;

* [Bug target/96246] [AVX512] inefficient code generation for vpblendm*
From: crazylht at gmail dot com @ 2020-07-20  7:55 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246

--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #1)
> With -mavx2 it works:
> 
>         vpcmpgtd        %ymm1, %ymm0, %ymm0
>         vpblendvb       %ymm0, %ymm2, %ymm3, %ymm0
> 
> not sure how _load<mode> comes into play - we expand from
<avx512>_load<mode>_mask has the same RTL pattern as <avx512>_blendm<mode>; the
only difference is the constraints (<avx512>_load<mode>_mask has '0C' for operand 2).

---
 1057 (define_insn "<avx512>_load<mode>_mask"
 1058   [(set (match_operand:V48_AVX512VL 0 "register_operand" "=v,v")
 1059         (vec_merge:V48_AVX512VL
 1060           (match_operand:V48_AVX512VL 1 "nonimmediate_operand" "v,m")
 1061           (match_operand:V48_AVX512VL 2 "nonimm_or_0_operand" "0C,0C")
 1062           (match_operand:<avx512fmaskmode> 3 "register_operand" "Yk,Yk")))]

...


 1159 (define_insn "<avx512>_blendm<mode>"
 1160   [(set (match_operand:V48_AVX512VL 0 "register_operand" "=v")
 1161         (vec_merge:V48_AVX512VL
 1162           (match_operand:V48_AVX512VL 2 "nonimmediate_operand" "vm")
 1163           (match_operand:V48_AVX512VL 1 "register_operand" "v")
 1164           (match_operand:<avx512fmaskmode> 3 "register_operand" "Yk")))]

---
Because <avx512>_load<mode>_mask appears earlier (line 1057) than
<avx512>_blendm<mode> (line 1159) in the md file, after expand the pattern is
always recognized as <avx512>_load<mode>_mask, so pass_reload can only match the
'0' constraint and never gets to use the 'v' constraint.
> 
> 
>   <bb 2> [local count: 1073741824]:
>   _6 = .VCOND (a_2(D), b_3(D), c_4(D), d_5(D), 109);
>   return _6;
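
The first-match behaviour described in this comment can be illustrated with a toy model. GCC's real recognizer is generated from the md file and is not a linear scan over strings, so the code below is only a sketch of the tie-breaking rule; toy_insn, toy_md, and toy_recog are hypothetical names, and the shape strings merely stand in for the RTL templates.

```c
#include <stddef.h>
#include <string.h>

/* Toy insn description: a name plus a string standing in for the
   RTL template.  Constraints play no part in recognition here.  */
struct toy_insn
{
  const char *name;
  const char *shape;
};

/* Entries are listed in "file order"; both have the same shape,
   mirroring the two define_insns quoted above.  */
static const struct toy_insn toy_md[] = {
  { "load_mask", "vec_merge(op, op, mask)" },
  { "blendm",    "vec_merge(op, op, mask)" },
};

/* Return the name of the first entry whose shape matches, or NULL.
   Because the scan stops at the first hit, "blendm" is unreachable.  */
const char *
toy_recog (const char *shape)
{
  for (size_t i = 0; i < sizeof toy_md / sizeof toy_md[0]; i++)
    if (strcmp (toy_md[i].shape, shape) == 0)
      return toy_md[i].name;
  return NULL;
}
```

In this model, every lookup of the shared shape returns "load_mask", so whatever alternatives the second entry offers are never considered.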

* [Bug target/96246] [AVX512] inefficient code generation for vpblendm*
From: cvs-commit at gcc dot gnu.org @ 2020-08-13  3:28 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246

--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:7123217afb33d4a2860f552ad778a819cc8dea5e

commit r11-2683-g7123217afb33d4a2860f552ad778a819cc8dea5e
Author: liuhongt <hongtao.liu@intel.com>
Date:   Tue Jul 21 15:25:20 2020 +0800

    Merge two define_insn: <avx512>_blendm<mode>, <avx512>_load<mode>_mask.

    Those two define_insns have same pattern, and <avx512>_load<mode>_mask
    would always be matched since it show up earlier in the md file, and
    it may lose some opportunity in pass_reload since
    <avx512>_load<mode>_mask only have constraint "0C" for operand2, and
    "v" constraint in <avx512>_vblendm<mode> would never be matched.

    2020-07-21  Hongtao Liu  <hongtao.liu@intel.com>

    gcc/
            PR target/96246
            * config/i386/sse.md (<avx512>_load<mode>_mask,
            <avx512>_load<mode>_mask): Extend to generate blendm
            instructions.
            (<avx512>_blendm<mode>, <avx512>_blendm<mode>): Change
            define_insn to define_expand.

    gcc/testsuite/
            * gcc.target/i386/avx512bw-pr96246-1.c: New test.
            * gcc.target/i386/avx512bw-pr96246-2.c: New test.
            * gcc.target/i386/avx512vl-pr96246-1.c: New test.
            * gcc.target/i386/avx512vl-pr96246-2.c: New test.
            * gcc.target/i386/avx512bw-vmovdqu16-1.c: Adjust test.
            * gcc.target/i386/avx512bw-vmovdqu8-1.c: Ditto.
            * gcc.target/i386/avx512f-vmovapd-1.c: Ditto.
            * gcc.target/i386/avx512f-vmovaps-1.c: Ditto.
            * gcc.target/i386/avx512f-vmovdqa32-1.c: Ditto.
            * gcc.target/i386/avx512f-vmovdqa64-1.c: Ditto.
            * gcc.target/i386/avx512vl-pr92686-movcc-1.c: Ditto.
            * gcc.target/i386/avx512vl-pr96246-1.c: Ditto.
            * gcc.target/i386/avx512vl-pr96246-2.c: Ditto.
            * gcc.target/i386/avx512vl-vmovapd-1.c: Ditto.
            * gcc.target/i386/avx512vl-vmovaps-1.c: Ditto.
            * gcc.target/i386/avx512vl-vmovdqa32-1.c: Ditto.
            * gcc.target/i386/avx512vl-vmovdqa64-1.c: Ditto.

* [Bug target/96246] [AVX512] inefficient code generation for vpblendm*
From: crazylht at gmail dot com @ 2020-08-13  3:29 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC11.

* [Bug target/96246] [AVX512] inefficient code generation for vpblendm*
From: nathan at gcc dot gnu.org @ 2020-08-28 13:33 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246

Nathan Sidwell <nathan at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nathan at gcc dot gnu.org
         Resolution|FIXED                       |---
             Status|RESOLVED                    |REOPENED

--- Comment #5 from Nathan Sidwell <nathan at gcc dot gnu.org> ---

FAIL: g++.target/i386/avx512bw-pr96246-2.C   execution test
FAIL: g++.target/i386/avx512vl-pr96246-2.C   execution test


The tests can fail at run time because they do not check for AVX512 support,
for example with
/* { dg-do run { target avx512f_runtime } } */

or, alternatively, by using
 #include "avx512f-check.h"
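
The idea behind such a run-time check can be sketched as follows. This is not the contents of avx512f-check.h; it is a minimal illustration that assumes a GCC-compatible compiler on x86 and uses the __builtin_cpu_supports built-in. have_avx512_for_test is a hypothetical name.

```c
/* Return 1 when the CPU running the program reports the AVX512
   features the pr96246 execution tests are compiled for, else 0.
   On other compilers or targets this sketch conservatively says 0.  */
int
have_avx512_for_test (void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init ();
  return __builtin_cpu_supports ("avx512f")
         && __builtin_cpu_supports ("avx512bw")
         && __builtin_cpu_supports ("avx512vl");
#else
  return 0;
#endif
}
```

An execution test would call this first and return early when it yields 0, so that the AVX512 code is never executed on hardware that lacks it.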

* [Bug target/96246] [AVX512] inefficient code generation for vpblendm*
From: crazylht at gmail dot com @ 2020-08-28 15:34 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246

--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Nathan Sidwell from comment #5)
> FAIL: g++.target/i386/avx512bw-pr96246-2.C   execution test
> FAIL: g++.target/i386/avx512vl-pr96246-2.C   execution test
> 
> 
> the tests can fail at runtime, because they do not check 
> /* { dg-do run { avx512f_runtime } } */
> 
> or alternatively use 
>  #include "avx512f-check.h"

Thanks for the report; I'll adjust the test.

* [Bug target/96246] [AVX512] inefficient code generation for vpblendm*
From: cvs-commit at gcc dot gnu.org @ 2020-09-03  2:26 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:8bd5530bfa136663f1fa79e9a1d3932b5adf15bd

commit r11-2990-g8bd5530bfa136663f1fa79e9a1d3932b5adf15bd
Author: liuhongt <hongtao.liu@intel.com>
Date:   Mon Aug 31 10:54:13 2020 +0800

    Adjust testcase.

    gcc/testsuite/ChangeLog:
            PR target/96246
            PR target/96855
            PR target/96856
            PR target/96857
            * g++.target/i386/avx512bw-pr96246-2.C: Add runtime check for
            AVX512BW.
            * g++.target/i386/avx512vl-pr96246-2.C: Add runtime check for
            AVX512BW and AVX512VL
            * g++.target/i386/avx512f-helper.h: New header.
            * gcc.target/i386/pr92658-avx512f.c: Add
            -mprefer-vector-width=512 to avoid impact of different default
            mtune which gcc is built with.
            * gcc.target/i386/avx512bw-pr95488-1.c: Ditto.
            * gcc.target/i386/pr92645-4.c: Add -mno-avx512f to avoid
            impact of different default march which gcc is built with.

* [Bug target/96246] [AVX512] inefficient code generation for vpblendm*
From: crazylht at gmail dot com @ 2020-09-03  2:32 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246

--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
Should be fixed in GCC11.

* [Bug target/96246] [AVX512] inefficient code generation for vpblendm*
From: crazylht at gmail dot com @ 2020-09-03  4:47 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96246

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|REOPENED                    |RESOLVED

--- Comment #9 from Hongtao.liu <crazylht at gmail dot com> ---
.
