From: Hongtao Liu
Date: Sun, 25 Jun 2023 15:30:14 +0800
Subject: Re: [PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations
To: Jan Beulich
Cc: "gcc-patches@gcc.gnu.org", Hongtao Liu, Kirill Yukhin

On Sun, Jun 25, 2023 at 3:23 PM Hongtao Liu wrote:
>
> On Sun, Jun 25, 2023 at 3:13 PM Hongtao Liu wrote:
> >
> > On Sun, Jun 25, 2023 at 1:52 PM Jan Beulich wrote:
> > >
> > > On 25.06.2023 06:42, Hongtao Liu wrote:
> > > > On Wed, Jun 21, 2023 at 2:26 PM Jan Beulich via Gcc-patches
> > > > wrote:
> > > >>
> > > >> +(define_code_iterator andor [and ior])
> > > >> +(define_code_attr nlogic [(and "nor") (ior "nand")])
> > > >> +(define_code_attr ternlog_nlogic [(and "0x11") (ior "0x77")])
> > > >> +
> > > >> +(define_insn "*3"
> > > >> +  [(set (match_operand:VI 0 "register_operand" "=v,v")
> > > >> +       (andor:VI
> > > >> +         (not:VI (match_operand:VI 1 "bcst_vector_operand" "%v,v"))
> > > >> +         (not:VI (match_operand:VI 2 "bcst_vector_operand" "vBr,m"))))]
> > > > I'm thinking of doing it in simplify_rtx or gimple match.pd to transform
> > > > (and (not op1)) (not op2)) -> (not: (ior: op1 op2))
> > >
> > > This wouldn't be a win (not + andn) -> (or + not), but what's
> > > more important is ...
> > >
> > > > (ior (not op1) (not op2)) -> (not : (and op1 op2))
> > > >
> > > > Even w/o avx512f, the transformation should also benefit since it
> > > > takes less logic operations 3 -> 2.(or 2 -> 2 for pandn).
> > >
> > > ... that these transformations (from the, as per the doc,
> > > canonical representation of nand and nor) are already occurring
> > I see, there're already such simplifications in the gimple phase, so
> > the question: is there any need for and/ior:not not pattern?
> > Can you provide a testcase to demonstrate that and/ior: not not
> > pattern is needed?
>
> typedef int v4si __attribute__((vector_size(16)));
> v4si
> foo1 (v4si a, v4si b)
> {
>   return ~a & ~b;
> }
>
> I only gimple have optimized it to
>
> [local count: 1073741824]:
> # DEBUG BEGIN_STMT
> _1 = a_2(D) | b_3(D);
> _4 = ~_1;
> return _4;
>
>
> But rtl still try to match
>
> (set (reg:V4SI 86)
>     (and:V4SI (not:V4SI (reg:V4SI 88))
>         (not:V4SI (reg:V4SI 89))))
>
> Hmm.
In rtl, we're using xor -1 for not, so it's

(insn 8 7 9 2 (set (reg:V4SI 87)
        (ior:V4SI (reg:V4SI 88)
            (reg:V4SI 89))) "/app/example.cpp":6:15 6830 {*iorv4si3}
     (expr_list:REG_DEAD (reg:V4SI 89)
        (expr_list:REG_DEAD (reg:V4SI 88)
            (nil))))
(insn 9 8 14 2 (set (reg:V4SI 86)
        (xor:V4SI (reg:V4SI 87)
            (const_vector:V4SI [
                    (const_int -1 [0xffffffffffffffff]) repeated x4
                ]))) "/app/example.cpp":6:18 6792 {*one_cmplv4si2}

Then simplified to
> (set (reg:V4SI 86)
>     (and:V4SI (not:V4SI (reg:V4SI 88))
>         (not:V4SI (reg:V4SI 89))))
>

by

    case XOR:
      if (trueop1 == CONST0_RTX (mode))
        return op0;
      if (INTEGRAL_MODE_P (mode) && trueop1 == CONSTM1_RTX (mode))
        return simplify_gen_unary (NOT, mode, op0, mode);

and

      /* Apply De Morgan's laws to reduce number of patterns for machines
         with negating logical insns (and-not, nand, etc.).  If result has
         only one NOT, put it first, since that is how the patterns are
         coded.  */
      if (GET_CODE (op) == IOR || GET_CODE (op) == AND)
        {
          rtx in1 = XEXP (op, 0), in2 = XEXP (op, 1);
          machine_mode op_mode;

          op_mode = GET_MODE (in1);
          in1 = simplify_gen_unary (NOT, op_mode, in1, op_mode);

          op_mode = GET_MODE (in2);
          if (op_mode == VOIDmode)
            op_mode = mode;
          in2 = simplify_gen_unary (NOT, op_mode, in2, op_mode);

          if (GET_CODE (in2) == NOT && GET_CODE (in1) != NOT)
            std::swap (in1, in2);

          return gen_rtx_fmt_ee (GET_CODE (op) == IOR ? AND : IOR,
                                 mode, in1, in2);
        }

Ok, got it, and/ior:not not pattern LGTM then.

> > > in common code, _if_ no suitable insn can be found. That was at
> > > least the conclusion I drew from looking around a lot, supported
> > > by the code that's generated prior to this change.
> > >
> > > Jan
> >
> >
> >
> > --
> > BR,
> > Hongtao
>
>
>
> --
> BR,
> Hongtao

-- 
BR,
Hongtao
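
As an illustrative aside, here is a minimal scalar sketch of the two facts the thread relies on: the De Morgan equivalence between the gimple form and the canonical RTL form, and where immediates such as 0x11 and 0x77 can come from. The function names and the TT_A/TT_B/TT_C constants are hypothetical helpers, not anything from the patch. The sketch assumes the usual truth-table way of deriving a VPTERNLOG imm8, and that the two vectors sit in the last two input positions; the patch itself may arrange its operands differently.

```c
#include <stdint.h>

/* Truth-table columns commonly used to derive a VPTERNLOG imm8:
   bit i of each constant is the value of that input in row i.
   (Hypothetical names, for illustration only.)  */
enum { TT_A = 0xF0, TT_B = 0xCC, TT_C = 0xAA };

/* imm8 for NOR of the last two inputs, ~(B | C); input A is ignored.  */
static inline uint8_t ternlog_imm_nor (void)
{
  return (uint8_t) ~(TT_B | TT_C);
}

/* imm8 for NAND of the last two inputs, ~(B & C); input A is ignored.  */
static inline uint8_t ternlog_imm_nand (void)
{
  return (uint8_t) ~(TT_B & TT_C);
}

/* NOR as written in the testcase and as matched in RTL: ~a & ~b.  */
static inline uint32_t nor_canonical (uint32_t a, uint32_t b)
{
  return ~a & ~b;
}

/* NOR as gimple leaves it: ~(a | b).  */
static inline uint32_t nor_gimple (uint32_t a, uint32_t b)
{
  return ~(a | b);
}

/* NAND in the canonical RTL form: ~a | ~b.  */
static inline uint32_t nand_canonical (uint32_t a, uint32_t b)
{
  return ~a | ~b;
}

/* NAND in the single-NOT form: ~(a & b).  */
static inline uint32_t nand_gimple (uint32_t a, uint32_t b)
{
  return ~(a & b);
}
```

Under these assumptions the two helpers evaluate to 0x11 and 0x77, matching the ternlog_nlogic values quoted above, and each canonical/gimple pair returns the same result for any inputs.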