From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-yb1-xb29.google.com (mail-yb1-xb29.google.com [IPv6:2607:f8b0:4864:20::b29]) by sourceware.org (Postfix) with ESMTPS id 6DBB93857700 for ; Wed, 24 May 2023 09:01:35 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6DBB93857700 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-yb1-xb29.google.com with SMTP id 3f1490d57ef6-ba82956d3e0so1317293276.0 for ; Wed, 24 May 2023 02:01:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1684918895; x=1687510895; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=8WCYsanBTXIYkHeH6GoAbVOw558GGvkbLyOeq/3nEw0=; b=bFqZbtoIkXHAcLO3rG3PYQM1TtOa+CP0Et7bnQwaY3TgZBoELc4/KozDf3cWwdC7LF t/acx8Q2xSFpiIT7cfQW0qJZllKdU8p3debzfnaNEm6Z+UQpCsXJ3C6N52sxIrPEje4n UWx5MdMUoAdK7nA8OYksjY+6pPuChkx7Vz7pX1+n9wX07nIemyEUiJsQiuLJz9/klmt6 BVBw8st4fzB6PJZprCkb3wmdpMeq0RDeZfVTEuDjI6a7/IRMhOylQQF20mrgcEKDCNm9 SnNIne3Dp2fX+uk43c29uzV8RQPpJvqLzzhS4ijlxc9Mt0sYRYMA+Ftvs80R8VNkB31L vuIg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684918895; x=1687510895; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8WCYsanBTXIYkHeH6GoAbVOw558GGvkbLyOeq/3nEw0=; b=f0fTqrLKdkCQ0FU/hBF/esNidTaWtpmo9r6voPF17C+MO9JmoCibcwoN4co33GfxEH KrCqM32BPb+LD9A3cxotPtpzPyiyqdv3E98FuXAUaW1RfUQn1FraHkUd7mTuHM0ukfoM kVXAIkBkBhAxbZ4zs3W/8zi5ne/SqYoQJqSOyExYAFwnMS8n8QtB6xzHHc/sqSncrMO1 yzV5bbJrcguNBd9zRcE17KjnUgOJPW7zNUwGH0XpgQNSWHfxh0yCCoMVWIFLUFtbsjn0 dUKJb40cVAKqeCXnJnP98lEZqkDeh8SydCBVAX2x934e3C5M/q/jArPddLMcyUZU7MhC zE8A== X-Gm-Message-State: AC+VfDwTUrepP0Fp2bsSDPkAHuDIXn8yUpoQPDNSc6mjColBwdu+EIE9 tRpmfnH2ENuWaLlWht5R9gMNtwyNme1y+fX05eM= X-Google-Smtp-Source: ACHHUZ43oOu5DptiTexfTeMBGQj33phynk3lQHw5WiEKEJ64bjip+boWuNebFkeyM84KFPSDJcCIIfgbANrYN8q1c18= X-Received: by 2002:a25:1e09:0:b0:bac:46f2:8d0f with SMTP id e9-20020a251e09000000b00bac46f28d0fmr7602247ybe.3.1684918894695; Wed, 24 May 2023 02:01:34 -0700 (PDT) MIME-Version: 1.0 References: <999cd9e7-c20a-2992-590e-82ef01506604@suse.com> In-Reply-To: <999cd9e7-c20a-2992-590e-82ef01506604@suse.com> From: Hongtao Liu Date: Wed, 24 May 2023 17:01:23 +0800 Message-ID: Subject: Re: x86: making better use of vpternlog{d,q} To: Jan Beulich Cc: "gcc@gcc.gnu.org" , Kirill Yukhin , Hongtao Liu Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Wed, May 24, 2023 at 3:58=E2=80=AFPM Jan Beulich via Gcc wrote: > > Hello, > > for a couple of years I was meaning to extend the use of these AVX512F > insns beyond the pretty minimalistic ones there are so far. Now that I've > got around to at least draft something, I ran into a couple of issues I > cannot explain. I'd like to start with understanding the unexpected > effects of a change to an existing insn I have made (reproduced at the > bottom). I certainly was prepared to observe testsuite failures, but it > ends up failing tests I didn't expect it would fail, and - upon looking > at sibling ones - also ends up leaving intact tests which I would expect > would then need adjustment (because of using the new alternative). > > In particular (all mentioned tests are in gcc.target/i386/) > - avx512f-andn-si-zmm-1.c (and its AVX512VL counterparts) fails because > for whatever reason generated code reverts back to using vpbroadcastd, > - avx512f-andn-di-zmm-1.c, otoh, is unaffected (i.e. continues to use > vpandnq with embedded broadcast), > - avx512f-andn-si-zmm-2.c doesn't use the new 4th insn alternative when > at the same time a made-up DI variant of the test (akin to what might > be an avx512f-andn-di-zmm-2.c testcase) does. > IOW: How is SI mode element size different here from DI mode one? Is > there anything wrong with the 4th alternative I'm adding, or is this > hinting at some anomaly elsewhere? __m512i is defined as __v8di, when it's used for _mm512_andnot_epi32, it's explicitlt converted to (__v16si) and creates an extra subreg which is not needed for DImode cases. And pass_combine try to match the below pattern but failed due to the condition REG_P (operands[1]) || REG_P (operands[2]). Here I think you want register_operand instead of REG_P. 157(set (reg:V16SI 91) 158 (and:V16SI (not:V16SI (subreg:V16SI (reg:V8DI 98) 0)) 159 (vec_duplicate:V16SI (mem:SI (reg:DI 99) [1 *f_3(D)+0 S4 A32])))= ) > > Just to mention it, avx512f-andn-si-zmm-5.c similarly fails > unexpectedly, but I guess for the same reason (and there aren't AVX512VL > or DI mode element counterparts thereof). > > Jan > > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -17019,11 +17019,11 @@ > "TARGET_AVX512F") > > (define_insn "*andnot3" > - [(set (match_operand:VI 0 "register_operand" "=3Dx,x,v") > + [(set (match_operand:VI 0 "register_operand" "=3Dx,x,v,v") > (and:VI > - (not:VI (match_operand:VI 1 "vector_operand" "0,x,v")) > - (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr")))] > - "TARGET_SSE" > + (not:VI (match_operand:VI 1 "bcst_vector_operand" "0,x,v,mBr")) > + (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr,v")))] > + "TARGET_SSE && (REG_P (operands[1]) || REG_P (operands[2]))" > { > char buf[64]; > const char *ops; > @@ -17090,6 +17090,11 @@ > case 2: > ops =3D "v%s%s\t{%%2, %%1, %%0|%%0, %%1, %%2}"; > break; > + case 3: > + tmp =3D "pternlog"; > + ssesuffix =3D ""; > + ops =3D "v%s%s\t{$0x44, %%1, %%2, %%0|%%0, %%2, %%1, $0x44}"; > + break; > default: > gcc_unreachable (); > } > @@ -17098,7 +17103,7 @@ > output_asm_insn (buf, operands); > return ""; > } > - [(set_attr "isa" "noavx,avx,avx") > + [(set_attr "isa" "noavx,avx,avx,avx512f") > (set_attr "type" "sselog") > (set (attr "prefix_data16") > (if_then_else > @@ -17106,7 +17111,7 @@ > (eq_attr "mode" "TI")) > (const_string "1") > (const_string "*"))) > - (set_attr "prefix" "orig,vex,evex") > + (set_attr "prefix" "orig,vex,evex,evex") > (set (attr "mode") > (cond [(match_test "TARGET_AVX2") > (const_string "") > @@ -17119,7 +17124,11 @@ > (match_test "optimize_function_for_size_p (cfun)")) > (const_string "V4SF") > ] > - (const_string "")))]) > + (const_string ""))) > + (set (attr "enabled") > + (if_then_else (eq_attr "alternative" "3") > + (symbol_ref " =3D=3D 64 ? TARGET_AVX512F= : TARGET_AVX512VL") > + (const_string "*")))]) > > ;; PR target/100711: Split notl; vpbroadcastd; vpand as vpbroadcastd; vp= andn > (define_split --=20 BR, Hongtao