From: Richard Sandiford
To: 钟居哲
Cc: gcc-patches, rguenther
Subject: Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support
Date: Wed, 24 May 2023 16:31:39 +0100

钟居哲 writes:
> Hi, the .optimized dump is like this:
>
>  [local count: 21045336]:
>   ivtmp.26_36 = (unsigned long) &x;
>   ivtmp.27_3 = (unsigned long) &y;
>   ivtmp.30_6 = (unsigned long) &MEM [(void *)&y + 16B];
>   ivtmp.31_10 = (unsigned long) &MEM [(void *)&y + 32B];
>   ivtmp.32_14 = (unsigned long) &MEM [(void *)&y + 48B];
>
>  [local count: 273589366]:
>   # ivtmp_72 = PHI
>   # ivtmp.26_41 = PHI
>   # ivtmp.27_1 = PHI
>   # ivtmp.30_4 = PHI
>   # ivtmp.31_8 = PHI
>   # ivtmp.32_12 = PHI
>   loop_len_34 = MIN_EXPR ;
>   loop_len_48 = MIN_EXPR ;
>   _74 = loop_len_34 - loop_len_48;

Yeah, I think this needs to be:

  loop_len_48 = MIN_EXPR ;
  _74 = loop_len_34 * 2 - loop_len_48;

(as valid gimple).  The point is that...

>   loop_len_49 = MIN_EXPR <_74, 4>;
>   _75 = _74 - loop_len_49;
>   loop_len_50 = MIN_EXPR <_75, 4>;
>   loop_len_51 = _75 - loop_len_50;

...there are 4 lengths capped to 4, for a total element count of 16.
But loop_len_34 is never greater than 8.

So for this case we either need to multiply, or we need to create
a fresh IV for the second rgroup.  Both approaches are fine.

Thanks,
Richard
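
The arithmetic behind the point above can be checked with a minimal sketch in Python. This is not GCC code: `split_lengths` is a hypothetical helper that mimics the chain of MIN_EXPR and subtraction statements in the dump, and the sketch assumes the second rgroup covers `loop_len_34 * 2` elements. The MIN_EXPR operands are not visible in the text, so the doubled starting total is an assumption drawn from the suggested `_74` statement.

```python
def split_lengths(total, cap, nvectors):
    """Split `total` elements into `nvectors` lengths, each capped at `cap`.

    Hypothetical helper mirroring the MIN_EXPR / subtraction chain in the
    dump. The dump computes the last length by plain subtraction, which
    matches min() whenever total <= cap * nvectors.
    """
    lengths = []
    remaining = total
    for _ in range(nvectors):
        n = min(remaining, cap)
        lengths.append(n)
        remaining -= n
    return lengths


# loop_len_34 is never greater than 8, so take its maximum value.
loop_len_34 = 8

# As in the quoted dump: the second rgroup starts from loop_len_34 itself.
unscaled = split_lengths(loop_len_34, cap=4, nvectors=4)

# Assumed fix: start from loop_len_34 * 2 instead.
scaled = split_lengths(loop_len_34 * 2, cap=4, nvectors=4)

print("unscaled:", unscaled, "total", sum(unscaled))
print("scaled:  ", scaled, "total", sum(scaled))
```

Without the multiplication, the last two lengths are always zero even on a full iteration, so only 8 of the 16 elements are ever enabled. With it, all four lengths reach their cap of 4.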