From mboxrd@z Thu Jan  1 00:00:00 1970
From: Richard Sandiford <richard.sandiford@arm.com>
To: =?utf-8?B?6ZKf5bGF5ZOy?= <juzhe.zhong@rivai.ai>
Mail-Followup-To: =?utf-8?B?6ZKf5bGF5ZOy?=
 <juzhe.zhong@rivai.ai>,gcc-patches <gcc-patches@gcc.gnu.org>,  rguenther
 <rguenther@suse.de>, richard.sandiford@arm.com
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,  rguenther <rguenther@suse.de>
Subject: Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support
References: <20230524144801.73537-1-juzhe.zhong@rivai.ai>
	<mpty1ldepjk.fsf@arm.com>
	<7AF1D88A0988DC3D+2023052423130398041121@rivai.ai>
Date: Wed, 24 May 2023 16:31:39 +0100
In-Reply-To: <7AF1D88A0988DC3D+2023052423130398041121@rivai.ai>
 (=?utf-8?B?IumSn+WxheWTsiIncw==?=
	message of "Wed, 24 May 2023 23:13:04 +0800")
Message-ID: <mpto7m9eof8.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
List-Id: <gcc-patches.gcc.gnu.org>

钟居哲 <juzhe.zhong@rivai.ai> writes:
> Hi, the .optimized dump is like this:
>
>   <bb 2> [local count: 21045336]:
>   ivtmp.26_36 = (unsigned long) &x;
>   ivtmp.27_3 = (unsigned long) &y;
>   ivtmp.30_6 = (unsigned long) &MEM <int[200]> [(void *)&y + 16B];
>   ivtmp.31_10 = (unsigned long) &MEM <int[200]> [(void *)&y + 32B];
>   ivtmp.32_14 = (unsigned long) &MEM <int[200]> [(void *)&y + 48B];
>
>   <bb 3> [local count: 273589366]:
>   # ivtmp_72 = PHI <ivtmp_73(3), 100(2)>
>   # ivtmp.26_41 = PHI <ivtmp.26_37(3), ivtmp.26_36(2)>
>   # ivtmp.27_1 = PHI <ivtmp.27_2(3), ivtmp.27_3(2)>
>   # ivtmp.30_4 = PHI <ivtmp.30_5(3), ivtmp.30_6(2)>
>   # ivtmp.31_8 = PHI <ivtmp.31_9(3), ivtmp.31_10(2)>
>   # ivtmp.32_12 = PHI <ivtmp.32_13(3), ivtmp.32_14(2)>
>   loop_len_34 = MIN_EXPR <ivtmp_72, 8>;
>   loop_len_48 = MIN_EXPR <loop_len_34, 4>;
>   _74 = loop_len_34 - loop_len_48;

Yeah, I think this needs to be:

  loop_len_48 = MIN_EXPR <loop_len_34 * 2, 4>;
  _74 = loop_len_34 * 2 - loop_len_48;

(written as valid gimple, with the multiplication in a separate
statement).  The point is that...

>   loop_len_49 =3D MIN_EXPR <_74, 4>;
>   _75 =3D _74 - loop_len_49;
>   loop_len_50 =3D MIN_EXPR <_75, 4>;
>   loop_len_51 =3D _75 - loop_len_50;

...there are 4 lengths capped to 4, for a total element count of 16.
But loop_len_34 is never greater than 8.

So for this case we either need to multiply, or we need to create
a fresh IV for the second rgroup.  Both approaches are fine.
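
To make the arithmetic concrete, here is a minimal Python sketch (not GCC
code) of the MIN_EXPR/subtract chain from the dump.  The helper name
split_lengths is hypothetical.  The factor of 2 comes from the suggested
fix above; the starting value of 200 for a fresh IV is only an assumption
inferred from the int[200] array and the initial count of 100.

```python
def split_lengths(total, nvectors=4, cap=4):
    """Mimic the dump: cap each length with MIN_EXPR, subtract, and let
    the last length take whatever remains."""
    lengths = []
    remaining = total
    for _ in range(nvectors - 1):
        n = min(remaining, cap)      # loop_len_N = MIN_EXPR <remaining, 4>
        lengths.append(n)
        remaining -= n               # _M = remaining - loop_len_N
    lengths.append(remaining)        # final length is the plain remainder
    return lengths


# First rgroup: loop_len_34 = MIN_EXPR <ivtmp_72, 8>, with ivtmp_72 = 100.
loop_len_34 = min(100, 8)

# As dumped: the second rgroup reuses loop_len_34 directly.
as_dumped = split_lengths(loop_len_34)

# Option 1: multiply before splitting.
with_multiply = split_lengths(loop_len_34 * 2)

# Option 2 (assumed shape): a fresh IV that counts elements of the second
# rgroup, starting at 200 and capped at 4 vectors * 4 lanes = 16.
fresh_iv = 200
with_fresh_iv = split_lengths(min(fresh_iv, 16))

# A short final iteration (3 scalar iterations left) with the multiply.
tail = split_lengths(min(3, 8) * 2)

print(as_dumped, with_multiply, with_fresh_iv, tail)
```

With loop_len_34 at its maximum of 8, the dumped sequence fills only the
first two of the four lengths, whereas either fix yields four full lengths
for a total of 16 elements.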

Thanks,
Richard