From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yw1-x112c.google.com (mail-yw1-x112c.google.com [IPv6:2607:f8b0:4864:20::112c]) by sourceware.org (Postfix) with ESMTPS id A39243858D35 for ; Sun, 25 Jun 2023 04:58:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A39243858D35 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-yw1-x112c.google.com with SMTP id 00721157ae682-57028539aadso22634027b3.2 for ; Sat, 24 Jun 2023 21:58:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1687669099; x=1690261099; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Yv/ViWEd8TVq7rwmWdi85zTAYX3ox6v3tl+qFE2/uLA=; b=OqYgmcftR8cn8l0ISbk6jUbxyRDvO10Pj6BulRTFpyvU8yFnfFqREKE/l9DM7uIUaf zqB41wbXwJSgjGhZy8WLkbcIIWybyZKozmJ1UH67P3InkSLyksy23zbYhNgCalH0+u9l WPmTb+57HH0yZynBKFFBP+YEjww5jktUdXWv28eBovMZ5yvZyzGdo7CORJ10pgkXq2Ek IZoTS5NnGnjmSOJEu2Wru3AcFlnhvLUFulk7nM81epls/Wd+QkKi/RYyAG9IuJctPM3d dnkrk7wHlnPdVS8VZDF08IE2VZz2gUrlQ60I0DapX0cfuwcxSC00PG5X1C/F885daKZ+ 8ssw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687669099; x=1690261099; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Yv/ViWEd8TVq7rwmWdi85zTAYX3ox6v3tl+qFE2/uLA=; b=krsu8OjcsNz2we+nWHsr87KIPOK/z5XRIyDDAsT1cIrAD7HrcHwvrd4nb+4pdxCTsF Ru5KQW9+/Oof3xROxPkvwzWSn26jt7BMn3BXt6U4LBEmLQGEgH/EgGRqGcggmXWLicW6 +2UtobfOqbwRCS4bxG1dxym99yNYLqYJIw5jJ/rpm6+D5hKRGslSjrZE6m8IskXNWvzv GF4n4BJuv7CoHyNKnXNiD705yJ4jI+0f+D8TE8qQhOWVxuIb7v45mKHff4QgJ3VeWDdu EloHMciWUONCt09q4//H+TvknGL21fwyxIQzgiDLbOPtlmi2VM5p7jX8lSok7avs5Bso UiSQ== X-Gm-Message-State: AC+VfDx4pgl/0QsIt/3o6dxJeD1DFPmREsQw61Eid5Z9lEb1cvjuAzF4 DnLkVnrWPFxYWQbryGyQhZitl58xPDE8zwWDM1I= X-Google-Smtp-Source: ACHHUZ4xIIQw7nr1uS0TDstdFrpOHxBh4eRLCtXNVRLfP/bGMACUgiqwr3xVZNqOtI4d6glNhOt2p0txZ+KmV0yqjho= X-Received: by 2002:a81:5211:0:b0:56c:e371:ab0 with SMTP id g17-20020a815211000000b0056ce3710ab0mr31018697ywb.5.1687669098918; Sat, 24 Jun 2023 21:58:18 -0700 (PDT) MIME-Version: 1.0 References: <04f99abe-a563-d093-23b7-4abf0f91633d@suse.com> <3cf55c98-d18a-d1ad-2fc2-015c63e217ca@suse.com> In-Reply-To: <3cf55c98-d18a-d1ad-2fc2-015c63e217ca@suse.com> From: Hongtao Liu Date: Sun, 25 Jun 2023 12:58:08 +0800 Message-ID: Subject: Re: [PATCH 2/5] x86: use VPTERNLOG also for certain andnot forms To: Jan Beulich Cc: "gcc-patches@gcc.gnu.org" , Hongtao Liu , Kirill Yukhin Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-1.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Wed, Jun 21, 2023 at 2:27=E2=80=AFPM Jan Beulich via Gcc-patches wrote: > > When it's the memory operand which is to be inverted, using VPANDN* > requires a further load instruction. The same can be achieved by a > single VPTERNLOG*. Add two new alternatives (for plain memory and > embedded broadcast), adjusting the predicate for the first operand > accordingly. > > Two pre-existing testcases actually end up being affected (improved) by > the change, which is reflected in updated expectations there. LGTM. > > gcc/ > > PR target/93768 > * config/i386/sse.md (*andnot3): Add new alternatives > for memory form operand 1. > > gcc/testsuite/ > > PR target/93768 > * gcc.target/i386/avx512f-andn-di-zmm-2.c: New test. > * gcc.target/i386/avx512f-andn-si-zmm-2.c: Adjust expecations > towards generated code. > * gcc.target/i386/pr100711-3.c: Adjust expectations for 32-bit > code. > > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -17210,11 +17210,13 @@ > "TARGET_AVX512F") > > (define_insn "*andnot3" > - [(set (match_operand:VI 0 "register_operand" "=3Dx,x,v") > + [(set (match_operand:VI 0 "register_operand" "=3Dx,x,v,v,v") > (and:VI > - (not:VI (match_operand:VI 1 "vector_operand" "0,x,v")) > - (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr")))] > - "TARGET_SSE" > + (not:VI (match_operand:VI 1 "bcst_vector_operand" "0,x,v,m,Br")= ) > + (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr,v,v")))] > + "TARGET_SSE > + && (register_operand (operands[1], mode) > + || register_operand (operands[2], mode))" > { > char buf[64]; > const char *ops; > @@ -17281,6 +17283,15 @@ > case 2: > ops =3D "v%s%s\t{%%2, %%1, %%0|%%0, %%1, %%2}"; > break; > + case 3: > + case 4: > + tmp =3D "pternlog"; > + ssesuffix =3D ""; > + if (which_alternative !=3D 4 || TARGET_AVX512VL) > + ops =3D "v%s%s\t{$0x44, %%1, %%2, %%0|%%0, %%2, %%1, $0x44}"; > + else > + ops =3D "v%s%s\t{$0x44, %%g1, %%g2, %%g0|%%g0, %%g2, %%g1, $0x44}= "; > + break; > default: > gcc_unreachable (); > } > @@ -17289,7 +17300,7 @@ > output_asm_insn (buf, operands); > return ""; > } > - [(set_attr "isa" "noavx,avx,avx") > + [(set_attr "isa" "noavx,avx,avx,*,*") > (set_attr "type" "sselog") > (set (attr "prefix_data16") > (if_then_else > @@ -17297,9 +17308,12 @@ > (eq_attr "mode" "TI")) > (const_string "1") > (const_string "*"))) > - (set_attr "prefix" "orig,vex,evex") > + (set_attr "prefix" "orig,vex,evex,evex,evex") > (set (attr "mode") > - (cond [(match_test "TARGET_AVX2") > + (cond [(and (eq_attr "alternative" "3,4") > + (match_test " < 64 && !TARGET_AVX512VL")) > + (const_string "XI") > + (match_test "TARGET_AVX2") > (const_string "") > (match_test "TARGET_AVX") > (if_then_else > @@ -17310,7 +17324,15 @@ > (match_test "optimize_function_for_size_p (cfun)")) > (const_string "V4SF") > ] > - (const_string "")))]) > + (const_string ""))) > + (set (attr "enabled") > + (cond [(eq_attr "alternative" "3") > + (symbol_ref " =3D=3D 64 || TARGET_AVX512VL") > + (eq_attr "alternative" "4") > + (symbol_ref " =3D=3D 64 || TARGET_AVX512VL > + || (TARGET_AVX512F && !TARGET_PREFER_AVX256= )") > + ] > + (const_string "*")))]) > > ;; PR target/100711: Split notl; vpbroadcastd; vpand as vpbroadcastd; vp= andn > (define_split > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-di-zmm-2.c > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-mavx512f -mno-avx512vl -mprefer-vector-width=3D512 -O2= " } */ > +/* { dg-final { scan-assembler-times "vpternlogq\[ \\t\]+\\\$0x44, \\(%(= ?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */ > +/* { dg-final { scan-assembler-not "vpbroadcast" } } */ > + > +#define type __m512i > +#define vec 512 > +#define op andnot > +#define suffix epi64 > +#define SCALAR long long > + > +#include "avx512-binop-2.h" > --- a/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c > +++ b/gcc/testsuite/gcc.target/i386/avx512f-andn-si-zmm-2.c > @@ -1,7 +1,7 @@ > /* { dg-do compile } */ > /* { dg-options "-mavx512f -O2" } */ > -/* { dg-final { scan-assembler-times "vpbroadcastd\[^\n\]*%zmm\[0-9\]+" = 1 } } */ > -/* { dg-final { scan-assembler-times "vpandnd\[^\n\]*%zmm\[0-9\]+" 1 } }= */ > +/* { dg-final { scan-assembler-times "vpternlogd\[ \\t\]+\\\$0x44, \\(%(= ?:eax|rdi|edi)\\)\\\{1to\[1-8\]+\\\}, %zmm\[0-9\]+, %zmm0" 1 } } */ > +/* { dg-final { scan-assembler-not "vpbroadcast" } } */ > > #define type __m512i > #define vec 512 > --- a/gcc/testsuite/gcc.target/i386/pr100711-3.c > +++ b/gcc/testsuite/gcc.target/i386/pr100711-3.c > @@ -37,4 +37,6 @@ v8di foo_v8di (long long a, v8di b) > return (__extension__ (v8di) {~a, ~a, ~a, ~a, ~a, ~a, ~a, ~a}) & b; > } > > -/* { dg-final { scan-assembler-times "vpandn" 4 } } */ > +/* { dg-final { scan-assembler-times "vpandn" 4 { target { ! ia32 } } } = } */ > +/* { dg-final { scan-assembler-times "vpandn" 2 { target { ia32 } } } } = */ > +/* { dg-final { scan-assembler-times "vpternlog\[dq\]\[ \\t\]+\\\$0x44" = 2 { target { ia32 } } } } */ > --=20 BR, Hongtao