From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ej1-x62e.google.com (mail-ej1-x62e.google.com [IPv6:2a00:1450:4864:20::62e]) by sourceware.org (Postfix) with ESMTPS id 00EB33858405 for ; Mon, 9 May 2022 15:52:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 00EB33858405 Received: by mail-ej1-x62e.google.com with SMTP id kq17so27694751ejb.4 for ; Mon, 09 May 2022 08:52:22 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=Kp0HZE49wwmXwIKe6e1kNxJFYCGEjrUs8s+LGrSJTY8=; b=1U8uInHcvVisVSKbQm0FHKtWmDNtYvuZvxH1/sAvD1+RuQofdFZyQJE1hw1FOB3Ei6 nKeEb/0fsiQsOgm4vq8yyx4jYOnIYSa12pPVZOS/RKi3eSdViiZ7lJrdbM9PyPioOVOj Q0W0qKxaOMfVbJpaXhsZEMFVTdJSy7kUAEhOeDtFkMqHMck50pSHVbTpjNWefzoskveW KfW4Wl5T5GykLxiJ70Ru/iS6ytf+7SnSY0sEzFwhK12gyJ36FIFt5JEbhvBmO6O4k3Mk FS37IROwuZ2znAbDJJWIl8h+8/HEkaF2OpicptFNUG34ECl3ARWDWoP+UqE9ng+inqKT LWtQ== X-Gm-Message-State: AOAM533zq1r5pQYempqDCDGn7qwzhXb1lSUvfqlsq98OT3Sn/XLJ+lge IuB5Hczb4KQP9f1bLp0gId1xhlCEhgy3Gr6mKL8XFA== X-Google-Smtp-Source: ABdhPJxSYAVpz7TmVvwDOSip0TDc345AhonzLBxKMCIiyCGROBn3xOj7Kz0PDJ7scN5brQCxlsORyBXFltWm8hXcSPU= X-Received: by 2002:a17:907:c0d:b0:6f3:ed89:d9c with SMTP id ga13-20020a1709070c0d00b006f3ed890d9cmr14695588ejc.502.1652111541612; Mon, 09 May 2022 08:52:21 -0700 (PDT) MIME-Version: 1.0 References: <72p4o95q-697o-1o21-po9r-58r6rq3nq9n@fhfr.qr> <627013s-9227-o1rp-12p-sr4sp6rn897p@fhfr.qr> In-Reply-To: From: Prathamesh Kulkarni Date: Mon, 9 May 2022 21:21:44 +0530 Message-ID: Subject: Re: [2/2] PR96463 -- changes to type checking vec_perm_expr in middle end To: Prathamesh Kulkarni , Richard Biener , gcc Patches , richard.sandiford@arm.com Content-Type: multipart/mixed; boundary="0000000000009175fb05de963160" X-Spam-Status: No, score=-8.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 09 May 2022 15:52:25 -0000 --0000000000009175fb05de963160 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Mon, 9 May 2022 at 19:22, Richard Sandiford wrote: > > Prathamesh Kulkarni writes: > > On Tue, 3 May 2022 at 18:25, Richard Sandiford > > wrote: > >> > >> Prathamesh Kulkarni writes: > >> > On Tue, 4 Jan 2022 at 19:12, Richard Sandiford > >> > wrote: > >> >> > >> >> Richard Biener writes: > >> >> > On Tue, 4 Jan 2022, Richard Sandiford wrote: > >> >> > > >> >> >> Richard Biener writes: > >> >> >> > On Fri, 17 Dec 2021, Richard Sandiford wrote: > >> >> >> > > >> >> >> >> Prathamesh Kulkarni writes: > >> >> >> >> > Hi, > >> >> >> >> > The attached patch rearranges order of type-check for vec_p= erm_expr > >> >> >> >> > and relaxes type checking for > >> >> >> >> > lhs =3D vec_perm_expr > >> >> >> >> > > >> >> >> >> > when: > >> >> >> >> > rhs1 =3D=3D rhs2, > >> >> >> >> > lhs is variable length vector, > >> >> >> >> > rhs1 is fixed length vector, > >> >> >> >> > TREE_TYPE (lhs) =3D=3D TREE_TYPE (rhs1) > >> >> >> >> > > >> >> >> >> > I am not sure tho if this check is correct ? My intent was = to capture > >> >> >> >> > case when vec_perm_expr is used to "extend" fixed length ve= ctor to > >> >> >> >> > it's VLA equivalent. > >> >> >> >> > >> >> >> >> VLAness isn't really the issue. We want the same thing to wo= rk for > >> >> >> >> -msve-vector-bits=3D256, -msve-vector-bits=3D512, etc., even = though the > >> >> >> >> vectors are fixed-length in that case. > >> >> >> >> > >> >> >> >> The principle is that for: > >> >> >> >> > >> >> >> >> A =3D VEC_PERM_EXPR ; > >> >> >> >> > >> >> >> >> the requirements are: > >> >> >> >> > >> >> >> >> - A, B, C and D must be vectors > >> >> >> >> - A, B and C must have the same element type > >> >> >> >> - D must have an integer element type > >> >> >> >> - A and D must have the same number of elements (NA) > >> >> >> >> - B and C must have the same number of elements (NB) > >> >> >> >> > >> >> >> >> The semantics are that we create a joined vector BC (all elem= ents of B > >> >> >> >> followed by all element of C) and that: > >> >> >> >> > >> >> >> >> A[i] =3D BC[D[i] % (NB+NB)] > >> >> >> >> > >> >> >> >> for 0 =E2=89=A4 i < NA. > >> >> >> >> > >> >> >> >> This operation makes sense even if NA !=3D NB. > >> >> >> > > >> >> >> > But note that we don't currently expect NA !=3D NB and the opt= ab just > >> >> >> > has a single mode. > >> >> >> > >> >> >> True, but we only need this for constant permutes. They are alr= eady > >> >> >> special in that they allow the index elements to be wider than t= he data > >> >> >> elements. > >> >> > > >> >> > OK, then we should reflect this in the stmt verification and only= relax > >> >> > the constant permute vector case and also amend the > >> >> > TARGET_VECTORIZE_VEC_PERM_CONST accordingly. > >> >> > >> >> Sounds good. > >> >> > >> >> > For non-constant permutes the docs say the mode of vec_perm is > >> >> > the common mode of operands 1 and 2 whilst the mode of operand 0 > >> >> > is unspecified - even unconstrained by the docs. I'm not sure > >> >> > if vec_perm expansion is expected to eventually FAIL. Updating t= he > >> >> > docs of vec_perm would be appreciated as well. > >> >> > >> >> Yeah, I guess de facto operand 0 has to be the same mode as operand= s > >> >> 1 and 2. Maybe that was just an oversight, or maybe it seemed obvi= ous > >> >> or self-explanatory at the time. :-) > >> >> > >> >> > As said I prefer to not mangle the existing stmt checking too muc= h > >> >> > at this stage so minimal adjustment is prefered there. > >> >> > >> >> The PR is only an enhancement request rather than a bug, so I think= the > >> >> patch would need to wait for GCC 13 whatever happens. > >> > Hi, > >> > In attached patch, the type checking is relaxed only if mask is cons= tant. > >> > Does this look OK ? > >> > > >> > Thanks, > >> > Prathamesh > >> >> > >> >> Thanks, > >> >> Richard > >> > > >> > diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc > >> > index e321d929fd0..02b88f67855 100644 > >> > --- a/gcc/tree-cfg.cc > >> > +++ b/gcc/tree-cfg.cc > >> > @@ -4307,6 +4307,24 @@ verify_gimple_assign_ternary (gassign *stmt) > >> > break; > >> > > >> > case VEC_PERM_EXPR: > >> > + /* If permute is constant, then we allow for lhs and rhs > >> > + to have different vector types, provided: > >> > + (1) lhs, rhs1, rhs2, and rhs3 have same element type. > >> > >> This isn't a requirement for rhs3. > >> > >> > + (2) rhs3 vector has integer element type. > >> > + (3) len(lhs) =3D=3D len(rhs3) && len(rhs1) =3D=3D len(rhs2). = */ > >> > + > >> > + if (TREE_CONSTANT (rhs3) > >> > + && VECTOR_TYPE_P (lhs_type) > >> > + && VECTOR_TYPE_P (rhs1_type) > >> > + && VECTOR_TYPE_P (rhs2_type) > >> > + && VECTOR_TYPE_P (rhs3_type) > >> > + && TREE_TYPE (lhs_type) =3D=3D TREE_TYPE (rhs1_type) > >> > + && TREE_TYPE (lhs_type) =3D=3D TREE_TYPE (rhs2_type) > >> > + && INTEGRAL_TYPE_P (TREE_TYPE (rhs3_type)) > >> > + && known_eq (TYPE_VECTOR_SUBPARTS (lhs_type), TYPE_VECTOR_SU= BPARTS (rhs3_type)) > >> > + && known_eq (TYPE_VECTOR_SUBPARTS (rhs1_type), TYPE_VECTOR_S= UBPARTS (rhs2_type))) > >> > + return false; > >> > + > >> > >> I think this should be integrated into the existing conditions > >> rather than done as an initial special case. > >> > >> It might make sense to start with: > >> > >> if (TREE_CODE (rhs1_type) !=3D VECTOR_TYPE > >> || TREE_CODE (rhs2_type) !=3D VECTOR_TYPE > >> || TREE_CODE (rhs3_type) !=3D VECTOR_TYPE) > >> { > >> > >> but expanded to test lhs_type too. Then the other parts of the new te= st > >> should be distributed across the existing conditions. > >> > >> The type tests should use useless_type_conversion_p rather than =3D=3D= . > > Hi Richard, > > Thanks for the suggestions. In the attached patch, I tried to > > distribute the checks > > across existing conditions, does it look OK ? > > I am not sure how to write tests for the type checks tho, does > > gimple-fe support vec_perm_expr ? > > I did a quick testsuite run for vect.exp and the patch doesn't seem to > > cause any unexpected failures. > > > > Thanks, > > Prathamesh > >> > >> Thanks, > >> Richard > >> > >> > >> > >> > if (!useless_type_conversion_p (lhs_type, rhs1_type) > >> > || !useless_type_conversion_p (lhs_type, rhs2_type)) > >> > { > > > > diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc > > index e321d929fd0..a845c7fff93 100644 > > --- a/gcc/tree-cfg.cc > > +++ b/gcc/tree-cfg.cc > > @@ -4307,18 +4307,14 @@ verify_gimple_assign_ternary (gassign *stmt) > > break; > > > > case VEC_PERM_EXPR: > > - if (!useless_type_conversion_p (lhs_type, rhs1_type) > > - || !useless_type_conversion_p (lhs_type, rhs2_type)) > > - { > > - error ("type mismatch in %qs", code_name); > > - debug_generic_expr (lhs_type); > > - debug_generic_expr (rhs1_type); > > - debug_generic_expr (rhs2_type); > > - debug_generic_expr (rhs3_type); > > - return true; > > - } > > + /* If permute is constant, then we allow for lhs and rhs > > + to have different vector types, provided: > > + (1) lhs, rhs1, rhs2 have same element type. > > + (2) rhs3 vector has integer element type. > > + (3) len(lhs) =3D=3D len(rhs3) && len(rhs1) =3D=3D len(rhs2). */ > > > > - if (TREE_CODE (rhs1_type) !=3D VECTOR_TYPE > > + if (TREE_CODE (lhs_type) !=3D VECTOR_TYPE > > + || TREE_CODE (rhs1_type) !=3D VECTOR_TYPE > > || TREE_CODE (rhs2_type) !=3D VECTOR_TYPE > > || TREE_CODE (rhs3_type) !=3D VECTOR_TYPE) > > { > > @@ -4330,10 +4326,29 @@ verify_gimple_assign_ternary (gassign *stmt) > > return true; > > } > > > > + /* If lhs, rhs1, and rhs2 are different vector types, > > + then relax the check if rhs3 is constant and lhs, rhs1, and rhs2 > > + have same element types. */ > > + if ((!useless_type_conversion_p (lhs_type, rhs1_type) > > + || !useless_type_conversion_p (lhs_type, rhs2_type)) > > + && (!TREE_CONSTANT (rhs3) > > + || TREE_TYPE (lhs_type) !=3D TREE_TYPE (rhs1_type) > > + || TREE_TYPE (lhs_type) !=3D TREE_TYPE (rhs2_type))) > > These TREE_TYPE tests should use !useless_type_conversion_p too, > instead of !=3D. Maybe it would be easier to follow as: > > if (TREE_CONSTANT (rhs3) > ? ... > : ...) > > so that we don't have doubled useless_type_conversion_p checks > for the TREE_CONSTANT case. > > > + { > > + error ("type mismatch in %qs", code_name); > > + debug_generic_expr (lhs_type); > > + debug_generic_expr (rhs1_type); > > + debug_generic_expr (rhs2_type); > > + debug_generic_expr (rhs3_type); > > + return true; > > + } > > + > > + /* If rhs3 is constant, relax the check len(rhs2) =3D=3D len(rhs= 3). */ > > if (maybe_ne (TYPE_VECTOR_SUBPARTS (rhs1_type), > > TYPE_VECTOR_SUBPARTS (rhs2_type)) > > - || maybe_ne (TYPE_VECTOR_SUBPARTS (rhs2_type), > > + || (maybe_ne (TYPE_VECTOR_SUBPARTS (rhs2_type), > > TYPE_VECTOR_SUBPARTS (rhs3_type)) > > + && !TREE_CONSTANT (rhs3)) > > Very minor, but I think this reads better with the !TREE_CONSTANT first > in the && rather than second. There's no reason to compare the lengths > for TREE_CONSTANT. > > Otherwise it looks good to me, but Richard should have the final say. Thanks, I addressed the above suggestions in the attached patch. Does it look OK ? Thanks, Prathamesh > > Thanks, > Richard > > > || maybe_ne (TYPE_VECTOR_SUBPARTS (rhs3_type), > > TYPE_VECTOR_SUBPARTS (lhs_type))) > > { --0000000000009175fb05de963160 Content-Type: text/plain; charset="US-ASCII"; name="pr96463-7-midend.txt" Content-Disposition: attachment; filename="pr96463-7-midend.txt" Content-Transfer-Encoding: base64 Content-ID: X-Attachment-Id: f_l2ywhehi0 ZGlmZiAtLWdpdCBhL2djYy90cmVlLWNmZy5jYyBiL2djYy90cmVlLWNmZy5jYwppbmRleCBlMzIx ZDkyOWZkMC4uMzFmMjUxNGE0MDcgMTAwNjQ0Ci0tLSBhL2djYy90cmVlLWNmZy5jYworKysgYi9n Y2MvdHJlZS1jZmcuY2MKQEAgLTQzMDcsMTggKzQzMDcsMTQgQEAgdmVyaWZ5X2dpbXBsZV9hc3Np Z25fdGVybmFyeSAoZ2Fzc2lnbiAqc3RtdCkKICAgICAgIGJyZWFrOwogCiAgICAgY2FzZSBWRUNf UEVSTV9FWFBSOgotICAgICAgaWYgKCF1c2VsZXNzX3R5cGVfY29udmVyc2lvbl9wIChsaHNfdHlw ZSwgcmhzMV90eXBlKQotCSAgfHwgIXVzZWxlc3NfdHlwZV9jb252ZXJzaW9uX3AgKGxoc190eXBl LCByaHMyX3R5cGUpKQotCXsKLQkgIGVycm9yICgidHlwZSBtaXNtYXRjaCBpbiAlcXMiLCBjb2Rl X25hbWUpOwotCSAgZGVidWdfZ2VuZXJpY19leHByIChsaHNfdHlwZSk7Ci0JICBkZWJ1Z19nZW5l cmljX2V4cHIgKHJoczFfdHlwZSk7Ci0JICBkZWJ1Z19nZW5lcmljX2V4cHIgKHJoczJfdHlwZSk7 Ci0JICBkZWJ1Z19nZW5lcmljX2V4cHIgKHJoczNfdHlwZSk7Ci0JICByZXR1cm4gdHJ1ZTsKLQl9 CisgICAgICAvKiBJZiBwZXJtdXRlIGlzIGNvbnN0YW50LCB0aGVuIHdlIGFsbG93IGZvciBsaHMg YW5kIHJocworCSB0byBoYXZlIGRpZmZlcmVudCB2ZWN0b3IgdHlwZXMsIHByb3ZpZGVkOgorCSAo MSkgbGhzLCByaHMxLCByaHMyIGhhdmUgc2FtZSBlbGVtZW50IHR5cGUuCisJICgyKSByaHMzIHZl Y3RvciBoYXMgaW50ZWdlciBlbGVtZW50IHR5cGUuCisJICgzKSBsZW4obGhzKSA9PSBsZW4ocmhz MykgJiYgbGVuKHJoczEpID09IGxlbihyaHMyKS4gICovCiAKLSAgICAgIGlmIChUUkVFX0NPREUg KHJoczFfdHlwZSkgIT0gVkVDVE9SX1RZUEUKKyAgICAgIGlmIChUUkVFX0NPREUgKGxoc190eXBl KSAhPSBWRUNUT1JfVFlQRQorCSAgfHwgVFJFRV9DT0RFIChyaHMxX3R5cGUpICE9IFZFQ1RPUl9U WVBFCiAJICB8fCBUUkVFX0NPREUgKHJoczJfdHlwZSkgIT0gVkVDVE9SX1RZUEUKIAkgIHx8IFRS RUVfQ09ERSAocmhzM190eXBlKSAhPSBWRUNUT1JfVFlQRSkKIAl7CkBAIC00MzMwLDEwICs0MzI2 LDI4IEBAIHZlcmlmeV9naW1wbGVfYXNzaWduX3Rlcm5hcnkgKGdhc3NpZ24gKnN0bXQpCiAJICBy ZXR1cm4gdHJ1ZTsKIAl9CiAKKyAgICAgIC8qIElmIHJoczMgaXMgY29uc3RhbnQsIHdlIGFsbG93 IGxocywgcmhzMSBhbmQgcmhzMiB0byBiZSBkaWZmZXJlbnQgdmVjdG9yIHR5cGVzLAorCSBhcyBs b25nIGFzIGxocywgcmhzMSBhbmQgcmhzMiBoYXZlIHNhbWUgZWxlbWVudCB0eXBlLiAgKi8KKyAg ICAgIGlmIChUUkVFX0NPTlNUQU5UIChyaHMzKQorCSAgPyAoIXVzZWxlc3NfdHlwZV9jb252ZXJz aW9uX3AgKFRSRUVfVFlQRSAobGhzX3R5cGUpLCBUUkVFX1RZUEUgKHJoczFfdHlwZSkpCisJICAg ICB8fCAhdXNlbGVzc190eXBlX2NvbnZlcnNpb25fcCAoVFJFRV9UWVBFIChsaHNfdHlwZSksIFRS RUVfVFlQRSAocmhzMl90eXBlKSkpCisJICA6ICghdXNlbGVzc190eXBlX2NvbnZlcnNpb25fcCAo bGhzX3R5cGUsIHJoczFfdHlwZSkKKwkgICAgIHx8ICF1c2VsZXNzX3R5cGVfY29udmVyc2lvbl9w IChsaHNfdHlwZSwgcmhzMl90eXBlKSkpCisJeworCSAgICBlcnJvciAoInR5cGUgbWlzbWF0Y2gg aW4gJXFzIiwgY29kZV9uYW1lKTsKKwkgICAgZGVidWdfZ2VuZXJpY19leHByIChsaHNfdHlwZSk7 CisJICAgIGRlYnVnX2dlbmVyaWNfZXhwciAocmhzMV90eXBlKTsKKwkgICAgZGVidWdfZ2VuZXJp Y19leHByIChyaHMyX3R5cGUpOworCSAgICBkZWJ1Z19nZW5lcmljX2V4cHIgKHJoczNfdHlwZSk7 CisJICAgIHJldHVybiB0cnVlOworCX0KKworICAgICAgLyogSWYgcmhzMyBpcyBjb25zdGFudCwg cmVsYXggdGhlIGNoZWNrIGxlbihyaHMyKSA9PSBsZW4ocmhzMykuICAqLwogICAgICAgaWYgKG1h eWJlX25lIChUWVBFX1ZFQ1RPUl9TVUJQQVJUUyAocmhzMV90eXBlKSwKIAkJICAgIFRZUEVfVkVD VE9SX1NVQlBBUlRTIChyaHMyX3R5cGUpKQotCSAgfHwgbWF5YmVfbmUgKFRZUEVfVkVDVE9SX1NV QlBBUlRTIChyaHMyX3R5cGUpLAotCQkgICAgICAgVFlQRV9WRUNUT1JfU1VCUEFSVFMgKHJoczNf dHlwZSkpCisJICB8fCAoIVRSRUVfQ09OU1RBTlQocmhzMykKKwkgICAgICAmJiBtYXliZV9uZSAo VFlQRV9WRUNUT1JfU1VCUEFSVFMgKHJoczJfdHlwZSksCisJCQkgICBUWVBFX1ZFQ1RPUl9TVUJQ QVJUUyAocmhzM190eXBlKSkpCiAJICB8fCBtYXliZV9uZSAoVFlQRV9WRUNUT1JfU1VCUEFSVFMg KHJoczNfdHlwZSksCiAJCSAgICAgICBUWVBFX1ZFQ1RPUl9TVUJQQVJUUyAobGhzX3R5cGUpKSkK IAl7Cg== --0000000000009175fb05de963160--