From: Hongtao Liu
Date: Mon, 26 Jun 2023 08:42:36 +0800
Subject: Re: [PATCH 1/5] x86: use VPTERNLOG for further bitwise two-vector operations
To: Jan Beulich
Cc: "gcc-patches@gcc.gnu.org", Hongtao Liu, Kirill Yukhin

On Sun, Jun 25, 2023 at 9:35 PM Jan Beulich wrote:
>
> On 25.06.2023 09:30, Hongtao Liu wrote:
> > On Sun, Jun 25, 2023 at 3:23 PM Hongtao Liu wrote:
> >>
> >> On Sun, Jun 25, 2023 at 3:13 PM Hongtao Liu wrote:
> >>>
> >>> On Sun, Jun 25, 2023 at 1:52 PM Jan Beulich wrote:
> >>>>
> >>>> On 25.06.2023 06:42, Hongtao Liu wrote:
> >>>>> On Wed, Jun 21, 2023 at 2:26 PM Jan Beulich via Gcc-patches
> >>>>> wrote:
> >>>>>>
> >>>>>> +(define_code_iterator andor [and ior])
> >>>>>> +(define_code_attr nlogic [(and "nor") (ior "nand")])
> >>>>>> +(define_code_attr ternlog_nlogic [(and "0x11") (ior "0x77")])
> >>>>>> +
> >>>>>> +(define_insn "*3"
> >>>>>> +  [(set (match_operand:VI 0 "register_operand" "=v,v")
> >>>>>> +	(andor:VI
> >>>>>> +	  (not:VI (match_operand:VI 1 "bcst_vector_operand" "%v,v"))
> >>>>>> +	  (not:VI (match_operand:VI 2 "bcst_vector_operand" "vBr,m"))))]
> >>>>> I'm thinking of doing it in simplify_rtx or gimple match.pd to transform
> >>>>> (and (not op1) (not op2)) -> (not (ior op1 op2))
> >>>>
> >>>> This wouldn't be a win (not + andn) -> (or + not), but what's
> >>>> more important is ...
> >>>>
> >>>>> (ior (not op1) (not op2)) -> (not (and op1 op2))
> >>>>>
> >>>>> Even w/o avx512f, the transformation should also benefit since it
> >>>>> takes fewer logic operations, 3 -> 2 (or 2 -> 2 for pandn).
> >>>>
> >>>> ... that these transformations (from the, as per the doc,
> >>>> canonical representation of nand and nor) are already occurring
> >>> I see, there are already such simplifications in the gimple phase, so
> >>> the question is: is there any need for the and/ior:not not pattern?
> >>> Can you provide a testcase to demonstrate that the and/ior:not not
> >>> pattern is needed?
> >>
> >> typedef int v4si __attribute__((vector_size(16)));
> >> v4si
> >> foo1 (v4si a, v4si b)
> >> {
> >>   return ~a & ~b;
> >> }
> >>
> >> I see that gimple has optimized it to
> >>
> >> [local count: 1073741824]:
> >> # DEBUG BEGIN_STMT
> >> _1 = a_2(D) | b_3(D);
> >> _4 = ~_1;
> >> return _4;
> >>
> >>
> >> But RTL still tries to match
> >>
> >> (set (reg:V4SI 86)
> >>     (and:V4SI (not:V4SI (reg:V4SI 88))
> >>         (not:V4SI (reg:V4SI 89))))
> >>
> >> Hmm.
> > In rtl, we're using xor -1 for not, so it's
> >
> > (insn 8 7 9 2 (set (reg:V4SI 87)
> >         (ior:V4SI (reg:V4SI 88)
> >             (reg:V4SI 89))) "/app/example.cpp":6:15 6830 {*iorv4si3}
> >      (expr_list:REG_DEAD (reg:V4SI 89)
> >         (expr_list:REG_DEAD (reg:V4SI 88)
> >             (nil))))
> > (insn 9 8 14 2 (set (reg:V4SI 86)
> >         (xor:V4SI (reg:V4SI 87)
> >             (const_vector:V4SI [
> >                     (const_int -1 [0xffffffffffffffff]) repeated x4
> >                 ]))) "/app/example.cpp":6:18 6792 {*one_cmplv4si2}
> >
> > Then simplified to
> >> (set (reg:V4SI 86)
> >>     (and:V4SI (not:V4SI (reg:V4SI 88))
> >>         (not:V4SI (reg:V4SI 89))))
> >>
> >
> > by
> >
> >     case XOR:
> >       if (trueop1 == CONST0_RTX (mode))
> >         return op0;
> >       if (INTEGRAL_MODE_P (mode) && trueop1 == CONSTM1_RTX (mode))
> >         return simplify_gen_unary (NOT, mode, op0, mode);
> >
> > and
> >
> >       /* Apply De Morgan's laws to reduce number of patterns for machines
> >          with negating logical insns (and-not, nand, etc.).  If result has
> >          only one NOT, put it first, since that is how the patterns are
> >          coded.  */
> >       if (GET_CODE (op) == IOR || GET_CODE (op) == AND)
> >         {
> >           rtx in1 = XEXP (op, 0), in2 = XEXP (op, 1);
> >           machine_mode op_mode;
> >
> >           op_mode = GET_MODE (in1);
> >           in1 = simplify_gen_unary (NOT, op_mode, in1, op_mode);
> >
> >           op_mode = GET_MODE (in2);
> >           if (op_mode == VOIDmode)
> >             op_mode = mode;
> >           in2 = simplify_gen_unary (NOT, op_mode, in2, op_mode);
> >
> >           if (GET_CODE (in2) == NOT && GET_CODE (in1) != NOT)
> >             std::swap (in1, in2);
> >
> >           return gen_rtx_fmt_ee (GET_CODE (op) == IOR ? AND : IOR,
> >                                  mode, in1, in2);
> >         }
> >
> >
> > Ok, got it, and/ior:not not pattern LGTM then.
>
> Just to avoid misunderstandings - together with your initial
> reply that's then an "okay" to the patch as a whole, right?
Yes.
>
> Thanks, Jan



-- 
BR,
Hongtao
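
For reference, a minimal C sketch of why the quoted ternlog_nlogic values come out as 0x11 and 0x77. It is not part of the patch. It assumes the usual VPTERNLOG convention that bit (a<<2 | b<<1 | c) of the 8-bit immediate holds the result for input bits a, b, c, and it assumes the two logical operands sit in the second and third input positions, with the first input ignored. The names ternlog_imm, and_of_nots, ior_of_nots, nor_form, nand_form and check_ternlog_immediates are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A bitwise function of three one-bit inputs. */
typedef unsigned (*bit3_fn) (unsigned a, unsigned b, unsigned c);

/* Build an 8-bit truth table in the assumed VPTERNLOG layout:
   bit (a<<2 | b<<1 | c) of the result holds f (a, b, c).  */
static uint8_t
ternlog_imm (bit3_fn f)
{
  uint8_t imm = 0;
  for (unsigned idx = 0; idx < 8; idx++)
    {
      unsigned a = (idx >> 2) & 1, b = (idx >> 1) & 1, c = idx & 1;
      if (f (a, b, c) & 1)
        imm |= (uint8_t) (1u << idx);
    }
  return imm;
}

/* The two forms RTL ends up matching; the first input is ignored.  */
static unsigned and_of_nots (unsigned a, unsigned b, unsigned c)
{ (void) a; return ~b & ~c; }
static unsigned ior_of_nots (unsigned a, unsigned b, unsigned c)
{ (void) a; return ~b | ~c; }

/* Their De Morgan equivalents, as seen after the gimple phase.  */
static unsigned nor_form (unsigned a, unsigned b, unsigned c)
{ (void) a; return ~(b | c); }
static unsigned nand_form (unsigned a, unsigned b, unsigned c)
{ (void) a; return ~(b & c); }

/* Check that both forms of each operation give the same immediate.  */
static void
check_ternlog_immediates (void)
{
  assert (ternlog_imm (and_of_nots) == 0x11);  /* and -> "nor"  */
  assert (ternlog_imm (nor_form) == 0x11);
  assert (ternlog_imm (ior_of_nots) == 0x77);  /* ior -> "nand" */
  assert (ternlog_imm (nand_form) == 0x77);
}
```

Calling check_ternlog_immediates () from a small main should pass silently. If the operands were placed in other input positions the immediates would differ, so the 0x11/0x77 match only holds under the stated operand-position assumption.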