From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 0E0703857809 for ; Tue, 20 Oct 2020 16:42:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 0E0703857809 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 93F381FB; Tue, 20 Oct 2020 09:42:27 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.98.126]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id E8A4A3F66B; Tue, 20 Oct 2020 09:42:26 -0700 (PDT) From: Richard Sandiford To: Hongtao Liu Mail-Followup-To: Hongtao Liu , Hongtao Liu via Gcc-patches , Segher Boessenkool , richard.sandiford@arm.com Cc: Hongtao Liu via Gcc-patches , Segher Boessenkool Subject: Re: [PATCH] [PR rtl-optimization/97249]Simplify vec_select of paradoxical subreg. References: <20201013195900.GA2672@gate.crashing.org> <20201014173532.GC2672@gate.crashing.org> Date: Tue, 20 Oct 2020 17:42:25 +0100 In-Reply-To: (Hongtao Liu's message of "Tue, 20 Oct 2020 11:20:48 +0800") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-6.8 required=5.0 tests=BAYES_00, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 20 Oct 2020 16:42:29 -0000 Hongtao Liu writes: >> > + && (GET_MODE_NUNITS (mode)).is_constant (&l1) >> > + && (GET_MODE_NUNITS (GET_MODE (SUBREG_REG (trueop0)))) >> > + .is_constant (&l2) >> > + && known_le (l1, l2) >> >> I'm not sure the last two &&s are really the important condition. >> I think we should drop them for the suggestion below. >> > > Changed, assume gcc also support something like (vec_select:v4di > (reg:v2di) (parallel [ (const_int 0) (const_int 1) (const_int 1) > (const_int 0)])) > as long as the range of selection guaranteed by > || maybe_ge (UINTVAL (idx) + subreg_offset, nunits)) Yeah, that vec_select looks OK. >> >> > + if (!CONST_INT_P (idx)) >> >> Here I think we should check: >> >> || maybe_ge (UINTVAL (idx) + subreg_offset, nunits= )) >> >> where: >> >> poly_uint64 nunits >> =3D GET_MODE_NUNITS (GET_MODE (SUBREG_REG (trueop0)))). >> > > Changed. > >> This makes sure that all indices are in range. In particular, it's >> valid for the SUBREG_REG to be narrower than mode, for appropriate >> vec_select indices >> > > Yes, that's what paradoxical subreg means. But I was comparing the mode of the vec_select with the mode of the SUBREG_REG (rather than the mode of trueop0 with the mode of the SUBREG_REG, which is what matters for paradoxical subregs). > + /* Simplify vec_select of a subreg of X to just a vec_select of X > + when X has same component mode as vec_select. */ > + unsigned HOST_WIDE_INT subreg_offset =3D 0; > + if (GET_CODE (trueop0) =3D=3D SUBREG > + && GET_MODE_INNER (mode) > + =3D=3D GET_MODE_INNER (GET_MODE (SUBREG_REG (trueop0))) > + && (GET_MODE_NUNITS (mode)).is_constant (&l1) Unnecessary brackets around =E2=80=9CGET_MODE_NUNITS (mode)=E2=80=9D. > + && constant_multiple_p (SUBREG_BYTE (trueop0), > + GET_MODE_UNIT_BITSIZE (mode), > + &subreg_offset)) Sorry, my bad, this should be: && constant_multiple_p (subreg_memory_offset (trueop0), GET_MODE_UNIT_BITSIZE (mode), &subreg_offset)) > + { > + gcc_assert (XVECLEN (trueop1, 0) =3D=3D l1); > + bool success =3D true; > + poly_uint64 nunits > + =3D GET_MODE_NUNITS (GET_MODE (SUBREG_REG (trueop0))); > + for (int i =3D 0; i !=3D l1; i++) > + { > + rtx idx =3D XVECEXP (trueop1, 0, i); > + if (!CONST_INT_P (idx) > + || maybe_ge (UINTVAL (idx) + subreg_offset, nunits)) > + { > + success =3D false; > + break; > + } > + } > + if (success) > + { > + rtx par =3D trueop1; > + if (subreg_offset) > + { > + rtvec vec =3D rtvec_alloc (l1); > + for (int i =3D 0; i < l1; i++) > + RTVEC_ELT (vec, i) > + =3D GEN_INT (INTVAL (XVECEXP (trueop1, 0, i) > + + subreg_offset)); This is applying subreg_offset to the pointer rather than the INTVAL. It should be: =3D GEN_INT (UINTVAL (XVECEXP (trueop1, 0, i)) + subreg_offset); OK with those changes, thanks. Richard