From: Richard Sandiford
To: Richard Biener
Cc: Andrea Corallo, gcc-patches@gcc.gnu.org, ook@ucw.cz, nd@arm.com
Subject: Re: [PATCH] vec: don't select partial vectors when looping on full vectors
Date: Fri, 11 Sep 2020 08:20:54 +0100

Richard Biener writes:
> On Wed, 9 Sep 2020, Andrea Corallo wrote:
>> Hi all,
>>
>> this patch is meant not to generate predication in a loop when the
>> loop is operating only on full vectors.
>>
>> Ex:
>>
>> #+BEGIN_SRC C
>> /* Vector length is 256.  */
>> void
>> f (int *restrict x, int *restrict y, unsigned int n) {
>>   for (unsigned int i = 0; i < n * 8; ++i)
>>     x[i] += y[i];
>> }
>> #+END_SRC
>>
>> Compiling on aarch64 with -O3 -msve-vector-bits=256 current trunk
>> gives:
>>
>> #+BEGIN_SRC asm
>> f:
>> .LFB0:
>>         .cfi_startproc
>>         lsl     w2, w2, 3
>>         cbz     w2, .L1
>>         mov     x3, 0
>>         whilelo p0.s, xzr, x2
>>         .p2align 3,,7
>> .L3:
>>         ld1w    z0.s, p0/z, [x0, x3, lsl 2]
>>         ld1w    z1.s, p0/z, [x1, x3, lsl 2]
>>         add     z0.s, z0.s, z1.s
>>         st1w    z0.s, p0, [x0, x3, lsl 2]
>>         add     x3, x3, 8
>>         whilelo p0.s, x3, x2
>>         b.any   .L3
>> .L1:
>>         ret
>>         .cfi_endproc
>> #+END_SRC
>>
>> With the patch applied:
>>
>> #+BEGIN_SRC asm
>> f:
>> .LFB0:
>>         .cfi_startproc
>>         lsl     w3, w2, 3
>>         cbz     w3, .L1
>>         mov     x2, 0
>>         ptrue   p0.b, vl32
>>         .p2align 3,,7
>> .L3:
>>         ld1w    z0.s, p0/z, [x0, x2, lsl 2]
>>         ld1w    z1.s, p0/z, [x1, x2, lsl 2]
>>         add     z0.s, z0.s, z1.s
>>         st1w    z0.s, p0, [x0, x2, lsl 2]
>>         add     x2, x2, 8
>>         cmp     x2, x3
>>         bne     .L3
>> .L1:
>>         ret
>>         .cfi_endproc
>> #+END_SRC
>>
>> To achieve this we check earlier if the loop needs peeling and if that
>> is not the case we do not set LOOP_VINFO_USING_PARTIAL_VECTORS_P to true.
>>
>> I moved some logic from 'determine_peel_for_niter' to
>> 'vect_need_peeling_or_part_vects_p' so it can be used for this purpose.
>>
>> Bootstrapped and regtested on aarch64-linux-gnu.
>
> Looks OK to me, the comment
>
> @@ -2267,7 +2278,10 @@ start_over:
>      {
>        if (param_vect_partial_vector_usage == 0)
>          LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) = false;
> -      else if (vect_verify_full_masking (loop_vinfo)
> +      else if ((vect_verify_full_masking (loop_vinfo)
> +                && vect_need_peeling_or_part_vects_p (loop_vinfo))
> +               /* Don't use partial vectors if we don't need to peel the
> +                  loop.  */
>                 || vect_verify_loop_lens (loop_vinfo))
>
> seems to be oddly misplaced (I'd put it before the call).

Yeah, IMO it'd be better to put it in the first “if”.

Also, very minor, but I think it'd be better not to shorten the name:

  vect_need_peeling_or_partial_vectors_p

Thanks,
Richard
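
For illustration only (this is not the patch): one possible reading of "put it in the first if" can be modelled in standalone C. Everything prefixed toy_ is hypothetical, and the modulo test is an assumed stand-in for the real analysis in GCC, which also has to cope with iteration counts that are not compile-time constants (such as n * 8 in the example).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the vectorizer's per-loop state.  All names are
   hypothetical and far simpler than GCC's real loop_vec_info.  */
struct toy_loop_info
{
  unsigned long niters;          /* Scalar iteration count (assumed known).  */
  unsigned int vf;               /* Elements handled per vector iteration.  */
  bool can_use_partial_vectors;  /* Target supports masks or lengths.  */
  int partial_vector_usage;      /* Models param_vect_partial_vector_usage.  */
};

/* Return true if full-vector iterations alone would leave scalar
   iterations over, so that the loop needs either peeling or partial
   vectors.  Assumption: a plain modulo test on a known count.  */
static bool
toy_need_peeling_or_partial_vectors_p (const struct toy_loop_info *info)
{
  assert (info->vf > 0);
  return info->niters % info->vf != 0;
}

/* Decide whether to use partial vectors.  The "do we need them at all?"
   test sits in the first condition, next to the param check.  */
static bool
toy_use_partial_vectors_p (const struct toy_loop_info *info)
{
  /* Don't use partial vectors if we don't need to peel the loop.  */
  if (info->partial_vector_usage == 0
      || !toy_need_peeling_or_partial_vectors_p (info))
    return false;
  return info->can_use_partial_vectors;
}
```

With niters = 64 and vf = 8 (the shape of f when n is 8 and vectors hold eight 32-bit ints), toy_use_partial_vectors_p returns false, which corresponds to the ptrue version of the loop. With niters = 67 it returns true whenever the toy target supports partial vectors.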