From: Richard Sandiford
To: 钟居哲
Cc: gcc-patches, rguenther
Subject: Re: [PATCH V14] VECT: Add decrement IV iteration loop control by variable amount support
Date: Wed, 24 May 2023 17:00:58 +0100
In-Reply-To: <70D20B75C645F088+2023052423522166255038@rivai.ai>

钟居哲 writes:
> Oh.
> I see. Thank you so much for pointing this out.
> Could you tell me what I should do in the code?
> It seems that I should adjust it in
> vect_adjust_loop_lens_control
>
> multiply by some factor? Is it correct to multiply by max_nscalars_per_iter?

It should be max_nscalars_per_iter * factor rather than just
max_nscalars_per_iter.

Note that it's possible for a later rgroup's max_nscalars_per_iter * factor
to be smaller, so a division might be needed in rare cases.  E.g.:

    uint64_t x[100];
    uint16_t y[200];

    void f() {
      for (int i = 0, j = 0; i < 100; i += 2, j += 4) {
        x[i + 0] += 1;
        x[i + 1] += 2;
        y[j + 0] += 1;
        y[j + 1] += 2;
        y[j + 2] += 3;
        y[j + 3] += 4;
      }
    }

where y has a single-control rgroup with max_nscalars_per_iter == 4
and x has a 2-control rgroup with max_nscalars_per_iter == 2.

What gives the best code in these cases?  Is emitting a multiplication
better?  Or is using a new IV better?

Thanks,
Richard
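
To make the multiply-versus-divide point concrete, here is a minimal sketch of the rescaling arithmetic. It is not GCC code: rescale_len is a hypothetical helper, the "scale" arguments stand for each rgroup's max_nscalars_per_iter * factor, and it assumes that one scale is always an exact multiple of the other (as with power-of-two values).

```c
#include <assert.h>

/* Hypothetical helper (not a GCC function): convert a length counted in
   one rgroup's units into the matching length for another rgroup.
   FROM_SCALE and TO_SCALE stand for each rgroup's
   max_nscalars_per_iter * factor.  Assumes one scale divides the other.  */
static unsigned
rescale_len (unsigned len, unsigned from_scale, unsigned to_scale)
{
  if (to_scale >= from_scale)
    {
      /* Usual case: the target scale is a multiple of the source scale,
         so a multiplication is enough.  */
      assert (to_scale % from_scale == 0);
      return len * (to_scale / from_scale);
    }

  /* Rare case: the target scale is smaller, so a division is needed.  */
  assert (from_scale % to_scale == 0);
  return len / (from_scale / to_scale);
}
```

For the loop above, taking factor as 1 for both rgroups (an assumption), a length of 16 elements of y corresponds to rescale_len (16, 4, 2) == 8 elements of x, which is the division case. Going the other way, rescale_len (8, 2, 4) == 16 needs only a multiplication.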