From: Richard Sandiford <richard.sandiford@arm.com>
To: 钟居哲 <juzhe.zhong@rivai.ai>
Mail-Followup-To: 钟居哲 <juzhe.zhong@rivai.ai>,
 gcc-patches <gcc-patches@gcc.gnu.org>, rguenther <rguenther@suse.de>,
 richard.sandiford@arm.com
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,  rguenther <rguenther@suse.de>
Subject: Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support
References: <20230524144801.73537-1-juzhe.zhong@rivai.ai>
	<mpty1ldepjk.fsf@arm.com>
	<7AF1D88A0988DC3D+2023052423130398041121@rivai.ai>
	<mpto7m9eof8.fsf@arm.com>
	<5B94EE89DA317A86+2023052423421230674834@rivai.ai>
	<mpth6s1enph.fsf@arm.com>
	<70D20B75C645F088+2023052423522166255038@rivai.ai>
Date: Wed, 24 May 2023 17:00:58 +0100
In-Reply-To: <70D20B75C645F088+2023052423522166255038@rivai.ai>
 ("钟居哲"'s
	message of "Wed, 24 May 2023 23:52:22 +0800")
Message-ID: <mpt8rdden2d.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
List-Id: <gcc-patches.gcc.gnu.org>

钟居哲 <juzhe.zhong@rivai.ai> writes:
> Oh, I see. Thank you so much for pointing this out.
> Could you tell me what I should do in the code?
> It seems that I should adjust it in
> vect_adjust_loop_lens_control.
>
> Multiply by some factor? Is it correct to multiply by
> max_nscalars_per_iter?

Multiply by max_nscalars_per_iter * factor rather than just
max_nscalars_per_iter.

Note that it's possible for a later rgroup's max_nscalars_per_iter * factor
to be smaller, so a division might be needed in rare cases.  E.g.:

#include <stdint.h>

uint64_t x[100];
uint16_t y[200];

void f() {
  for (int i = 0, j = 0; i < 100; i += 2, j += 4) {
    x[i + 0] += 1;
    x[i + 1] += 2;
    y[j + 0] += 1;
    y[j + 1] += 2;
    y[j + 2] += 3;
    y[j + 3] += 4;
  }
}

where y has a single-control rgroup with max_nscalars_per_iter == 4
and x has a 2-control rgroup with max_nscalars_per_iter == 2.
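
As a rough sketch of the arithmetic only: assume 128-bit vectors and a
factor of 1 for both rgroups, so that a full vector iteration gives y's
control a length of 8 items.  In x's units that is 8 / (4 / 2) = 4 items,
split across x's two controls.  The helper names below are hypothetical
and are not GCC functions; the sketch also assumes that one scale is
always a multiple of the other.

```c
#include <assert.h>

/* Hypothetical helper, not a GCC function.  Convert an item count
   expressed in one rgroup's units (max_nscalars_per_iter * factor)
   into another rgroup's units.  Assumes one scale divides the other.  */
static unsigned
rescale_items (unsigned items, unsigned from_scale, unsigned to_scale)
{
  assert (from_scale != 0 && to_scale != 0);
  if (to_scale >= from_scale)
    {
      assert (to_scale % from_scale == 0);
      /* Usual case: the target rgroup counts more items per iteration.  */
      return items * (to_scale / from_scale);
    }
  assert (from_scale % to_scale == 0);
  /* Rare case: the target rgroup counts fewer items per iteration.  */
  return items / (from_scale / to_scale);
}

/* Hypothetical helper.  Length for control INDEX of an rgroup whose
   controls each cover NITEMS_PER_CTRL items, given TOTAL items to
   process in this vector iteration.  */
static unsigned
control_len (unsigned total, unsigned nitems_per_ctrl, unsigned index)
{
  unsigned start = index * nitems_per_ctrl;
  if (total <= start)
    return 0;
  unsigned left = total - start;
  return left < nitems_per_ctrl ? left : nitems_per_ctrl;
}
```

With these assumptions, rescale_items (8, 4, 2) gives the 4 items for x,
and control_len (4, 2, 0) and control_len (4, 2, 1) give a length of 2
for each of x's two controls.  A dedicated IV for x would avoid the
division at the cost of an extra induction variable.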

What gives the best code in these cases?  Is emitting a multiplication
better?  Or is using a new IV better?

Thanks,
Richard