From: Richard Sandiford
To: Tejas Belagod
Mail-Followup-To: Tejas Belagod, "gcc-patches@gcc.gnu.org", richard.sandiford@arm.com
Cc: "gcc-patches@gcc.gnu.org"
Subject: Re: [PATCH] [PR96339] AArch64: Optimise svlast[ab]
References: <20230316113927.4967-1-tejas.belagod@arm.com>
Date: Tue, 16 May 2023 13:06:19 +0100
In-Reply-To: (Tejas Belagod's message of "Tue, 16 May 2023 12:28:36 +0100")

Tejas Belagod writes:
>>> +    {
>>> +      b = build3 (BIT_FIELD_REF, TREE_TYPE (f.lhs), val,
>>> +                  bitsize_int (step * BITS_PER_UNIT),
>>> +                  bitsize_int ((16 - step) * BITS_PER_UNIT));
>>> +
>>> +      return gimple_build_assign (f.lhs, b);
>>> +    }
>>> +
>>> +  /* If VECTOR_CST_NELTS_PER_PATTERN (pred) == 2 and every multiple of
>>> +     'step_1' in
>>> +       [VECTOR_CST_NPATTERNS .. VECTOR_CST_ENCODED_NELTS - 1]
>>> +     is zero, then we can treat the vector as VECTOR_CST_NPATTERNS
>>> +     elements followed by all inactive elements.  */
>>> +  if (!const_vl && VECTOR_CST_NELTS_PER_PATTERN (pred) == 2)
>>
>> Following on from the above, maybe use:
>>
>>   !VECTOR_CST_NELTS (pred).is_constant ()
>>
>> instead of !const_vl here.
>>
>> I have a horrible suspicion that I'm contradicting our earlier discussion
>> here, sorry, but: I think we have to return null if NELTS_PER_PATTERN != 2.
>>
>>
>> IIUC, the NPATTERNS .. ENCODED_ELTS represent the repeated part of the encoded
>> constant. This means the repetition occurs if NELTS_PER_PATTERN == 2, IOW the
>> base1 repeats in the encoding. This loop is checking this condition and looks
>> for a 1 in the repeated part of the NELTS_PER_PATTERN == 2 in a VL vector.
>> Please correct me if I’m misunderstanding here.
>
> NELTS_PER_PATTERN == 1 is also a repeating pattern: it means that the
> entire sequence is repeated to fill a vector.  So if an NELTS_PER_PATTERN
> == 1 constant has elements {0, 1, 0, 0}, the vector is:
>
>    {0, 1, 0, 0, 0, 1, 0, 0, ...}
>
>
> Wouldn’t the vect_all_same(pred, step) cover this case for a given value of
> step?
>
>
> and the optimisation can't handle that.  NELTS_PER_PATTERN == 3 isn't
> likely to occur for predicates, but in principle it has the same problem.
>
>
> OK, I had misunderstood the encoding to always make base1 the repeating value
> by adjusting the NPATTERNS accordingly - I didn’t know you could also have the
> base2 value and beyond encoding the repeat value. In this case could I just
> remove NELTS_PER_PATTERN == 2 condition and the enclosed loop would check for a
> repeating ‘1’ in the repeated part of the encoded pattern?

But for NELTS_PER_PATTERN==1, the whole encoded sequence repeats.
So you would have to start the check at element 0 rather than NPATTERNS.
And then (for NELTS_PER_PATTERN==1) the loop would reject any constant
that has a nonzero element.  But all valid zero-vector cases have been
handled by this point, so the effect wouldn't be useful.

It should never be the case that all elements from NPATTERNS onwards
are zero for NELTS_PER_PATTERN==3; that case should be canonicalised
to NELTS_PER_PATTERN==2 instead.

So in practice it's simpler and more obviously correct to punt when
NELTS_PER_PATTERN != 2.

Thanks,
Richard
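
For anyone following the thread, the encoding rules discussed above can be
sketched as a small standalone model.  This is only an illustration, not GCC
code: EncodedVec, element and zero_after_first_npatterns are hypothetical
names, and the check looks at every encoded element from NPATTERNS onwards
rather than at every multiple of the step, as the patch does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a VECTOR_CST encoding (hypothetical type, not GCC's).
// The encoded values hold NPATTERNS interleaved patterns, each described
// by NELTS_PER_PATTERN (1, 2 or 3) leading elements.
struct EncodedVec {
  unsigned npatterns;
  unsigned nelts_per_pattern;
  std::vector<long> encoded;  // npatterns * nelts_per_pattern values
};

// Value of element I of the full, conceptually unbounded vector.
long element(const EncodedVec &v, std::size_t i) {
  assert(v.encoded.size() == std::size_t(v.npatterns) * v.nelts_per_pattern);
  std::size_t pattern = i % v.npatterns;  // which interleaved pattern
  std::size_t index = i / v.npatterns;    // position within that pattern
  if (index < v.nelts_per_pattern)
    return v.encoded[i];                  // still inside the encoded part
  // Index of the last encoded element of this pattern.
  std::size_t last = (v.nelts_per_pattern - 1) * v.npatterns + pattern;
  if (v.nelts_per_pattern < 3)
    return v.encoded[last];               // 1: base0 repeats; 2: base1 repeats
  // 3: a linear series continuing from the last two encoded elements.
  long step = v.encoded[last] - v.encoded[last - v.npatterns];
  return v.encoded[last] + step * static_cast<long>(index - 2);
}

// True only when the vector is known to be the first NPATTERNS elements
// followed by zeros.  Punts (returns false) unless NELTS_PER_PATTERN == 2.
bool zero_after_first_npatterns(const EncodedVec &v) {
  if (v.nelts_per_pattern != 2)
    return false;
  for (std::size_t i = v.npatterns; i < v.encoded.size(); ++i)
    if (v.encoded[i] != 0)
      return false;
  return true;
}
```

With this model, an encoding of {0, 1, 0, 0} with NPATTERNS == 4 and
NELTS_PER_PATTERN == 1 has a 1 at element 5 as well as at element 1, so the
check declines it, whereas {1, 1, 0, 0, 0, 0, 0, 0} with NPATTERNS == 4 and
NELTS_PER_PATTERN == 2 is accepted.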