From: Hongtao Liu <crazylht@gmail.com>
To: Jan Beulich <jbeulich@suse.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
Hongtao Liu <hongtao.liu@intel.com>,
Kirill Yukhin <kirill.yukhin@gmail.com>
Subject: Re: [PATCH 2/5] x86: use VPTERNLOG also for certain andnot forms
Date: Sun, 25 Jun 2023 12:58:08 +0800
Message-ID: <CAMZc-by_Q6P=zzT9fXEkU3o8ddsH_Jr6nybrc_S=7HByyfJcJA@mail.gmail.com>
In-Reply-To: <3cf55c98-d18a-d1ad-2fc2-015c63e217ca@suse.com>

On Wed, Jun 21, 2023 at 2:27 PM Jan Beulich via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> When it's the memory operand which is to be inverted, using VPANDN*
> requires a further load instruction. The same can be achieved by a
> single VPTERNLOG*. Add two new alternatives (for plain memory and
> embedded broadcast), adjusting the predicate for the first operand
> accordingly.
>
> Two pre-existing testcases actually end up being affected (improved) by
> the change, which is reflected in updated expectations there.
LGTM.
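
For readers following along, here is a rough sketch of where the $0x44
immediate in the new output templates comes from. It assumes the usual
VPTERNLOG truth-table indexing (destination bit as the most significant
index bit, then the first source, then the second source). The helper
names are made up for illustration and are not part of GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Per-bit function the pattern computes: (~op1) & op2.  */
static unsigned andnot_bit(unsigned op1, unsigned op2)
{
  return (~op1 & op2) & 1u;
}

/* Build the imm8 for the operand order of the new alternatives
   (Intel order %0, %2, %1): A = destination (%0, ignored),
   B = first source (%2, register), C = second source (%1, memory
   or broadcast).  Bit (A<<2 | B<<1 | C) holds the result.  */
static uint8_t ternlog_imm_for_andnot(void)
{
  uint8_t imm = 0;
  for (unsigned idx = 0; idx < 8; idx++) {
    unsigned b = (idx >> 1) & 1u;  /* operand 2 */
    unsigned c = idx & 1u;         /* operand 1, the one to invert */
    if (andnot_bit(c, b))
      imm |= (uint8_t)(1u << idx);
  }
  return imm;
}

/* Software model of a single 64-bit lane of VPTERNLOGQ.  */
static uint64_t ternlog64(uint64_t a, uint64_t b, uint64_t c, uint8_t imm)
{
  uint64_t r = 0;
  for (unsigned bit = 0; bit < 64; bit++) {
    unsigned idx = (unsigned)(((a >> bit) & 1) << 2
                              | ((b >> bit) & 1) << 1
                              | ((c >> bit) & 1));
    assert(idx < 8);
    r |= (uint64_t)((imm >> idx) & 1u) << bit;
  }
  return r;
}
```

Only the index values with B set and C clear (2 and 6) contribute, so the
old destination contents do not affect the result and no separate load of
the inverted operand is needed.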
>
> gcc/
>
> PR target/93768
> * config/i386/sse.md (*andnot<mode>3): Add new alternatives
> for memory form operand 1.
>
> gcc/testsuite/
>
> PR target/93768
> * gcc.target/i386/avx512f-andn-di-zmm-2.c: New test.
> * gcc.target/i386/avx512f-andn-si-zmm-2.c: Adjust expectations
> towards generated code.
> * gcc.target/i386/pr100711-3.c: Adjust expectations for 32-bit
> code.
>
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -17210,11 +17210,13 @@
> "TARGET_AVX512F")
>
> (define_insn "*andnot<mode>3"
> - [(set (match_operand:VI 0 "register_operand" "=x,x,v")
> + [(set (match_operand:VI 0 "register_operand" "=x,x,v,v,v")
> (and:VI
> - (not:VI (match_operand:VI 1 "vector_operand" "0,x,v"))
> - (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr")))]
> - "TARGET_SSE"
> + (not:VI (match_operand:VI 1 "bcst_vector_operand" "0,x,v,m,Br"))
> + (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr,v,v")))]
> + "TARGET_SSE
> + && (register_operand (operands[1], <MODE>mode)
> + || register_operand (operands[2], <MODE>mode))"
> {
> char buf[64];
> const char *ops;
> @@ -17281,6 +17283,15 @@
> case 2:
> ops = "v%s%s\t{%%2, %%1, %%0|%%0, %%1, %%2}";
> break;
> + case 3:
> + case 4:
> + tmp = "pternlog";
> + ssesuffix = "<ternlogsuffix>";
> + if (which_alternative != 4 || TARGET_AVX512VL)
> + ops = "v%s%s\t{$0x44, %%1, %%2, %%0|%%0, %%2, %%1, $0x44}";
> + else
> + ops = "v%s%s\t{$0x44, %%g1, %%g2, %%g0|%%g0, %%g2, %%g1, $0x44}";
> + break;
> default:
> gcc_unreachable ();
> }
> @@ -17289,7 +17300,7 @@
> output_asm_insn (buf, operands);
> return "";
> }
> - [(set_attr "isa" "noavx,avx,avx")
> + [(set_attr "isa" "noavx,avx,avx,*,*")
> (set_attr "type" "sselog")
> (set (attr "prefix_data16")
> (if_then_else
> @@ -17297,9 +17308,12 @@
> (eq_attr "mode" "TI"))
> (const_string "1")
> (const_string "*")))
> - (set_attr "prefix" "orig,vex,evex")
> + (set_attr "prefix" "orig,vex,evex,evex,evex")
> (set (attr "mode")
> - (cond [(match_test "TARGET_AVX2")
> + (cond [(and (eq_attr "alternative" "3,4")
> + (match_test "<MODE_SIZE> < 64 && !TARGET_AVX512VL"))
> + (const_string "XI")
> + (match_test "TARGET_AVX2")
> (const_string "<sseinsnmode>")
> (match_test "TARGET_AVX")
> (if_then_else
> @@ -17310,7 +17324,15 @@
> (match_test "optimize_function_for_size_p (cfun)"))
> (const_string "V4SF")
> ]
> - (const_string "<sseinsnmode>")))])
> + (const_string "<sseinsnmode>")))
> + (set (attr "enabled")
> + (cond [(eq_attr "alternative" "3")
> + (symbol_ref "<MODE_SIZE> == 64 || TARGET_AVX512VL")
> + (eq_attr "alternative" "4")
> + (symbol_ref "<MODE_SIZE> == 64 || TARGET_AVX512VL
> + || (TARGET_AVX512F && !TARGET_PREFER_AVX256)")
> + ]
> + (const_string "*")))])
>
> ;; PR target/100711: Split notl; vpbroadcastd; vpand as vpbroadcastd; vpandn
> (define_split
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-di-zmm-2.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=512 -O2" } */
> +/* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\\\$0x44, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
> +/* { dg-final { scan-assembler-not "vpbroadcast" } } */
> +
> +#define type __m512i
> +#define vec 512
> +#define op andnot
> +#define suffix epi64
> +#define SCALAR long long
> +
> +#include "avx512-binop-2.h"
> --- a/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c
> +++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c
> @@ -1,7 +1,7 @@
> /* { dg-do compile } */
> /* { dg-options "-mavx512f -O2" } */
> -/* { dg-final { scan-assembler-times "vpbroadcastd\[^\n\]*%zmm\[0-9\]+" 1 } } */
> -/* { dg-final { scan-assembler-times "vpandnd\[^\n\]*%zmm\[0-9\]+" 1 } } */
> +/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$0x44, \\(%(?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */
> +/* { dg-final { scan-assembler-not "vpbroadcast" } } */
>
> #define type __m512i
> #define vec 512
> --- a/gcc/testsuite/gcc.target/i386/pr100711-3.c
> +++ b/gcc/testsuite/gcc.target/i386/pr100711-3.c
> @@ -37,4 +37,6 @@ v8di foo_v8di (long long a, v8di b)
> return (__extension__ (v8di) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) & b;
> }
>
> -/* { dg-final { scan-assembler-times "vpandn" 4 } } */
> +/* { dg-final { scan-assembler-times "vpandn" 4 { target { ! ia32 } } } } */
> +/* { dg-final { scan-assembler-times "vpandn" 2 { target { ia32 } } } } */
> +/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0x44" 2 { target { ia32 } } } } */
>
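
As a reading aid, the "enabled" and "mode" attribute conditions for the
two new alternatives can be modelled roughly as below. This is only a
sketch of the conditions quoted above: struct target and the function
names are hypothetical stand-ins, not GCC internals, and alternatives
0-2 are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the target flags the pattern tests.  */
struct target {
  bool avx512f;
  bool avx512vl;
  bool prefer_avx256;
};

/* Alternative 3 (plain memory operand 1): needs the insn at native
   width, i.e. a 64-byte mode or AVX512VL.  */
static bool alt3_enabled(int mode_size, const struct target *t)
{
  return mode_size == 64 || t->avx512vl;
}

/* Alternative 4 (embedded broadcast): additionally allowed with plain
   AVX512F when 512-bit vectors are not being avoided.  */
static bool alt4_enabled(int mode_size, const struct target *t)
{
  return mode_size == 64 || t->avx512vl
         || (t->avx512f && !t->prefer_avx256);
}

/* Mirrors the new first arm of the "mode" attribute: the insn is
   treated as a 512-bit (XI) one when a narrower mode is used without
   AVX512VL.  */
static bool mode_is_xi(int alt, int mode_size, const struct target *t)
{
  return (alt == 3 || alt == 4) && mode_size < 64 && !t->avx512vl;
}
```

With -mavx512f -mno-avx512vl and a 16-byte mode, for instance, this
model leaves alternative 3 disabled while alternative 4 stays enabled
and is treated as XI, matching the %g operand forms in the output code.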
--
BR,
Hongtao