From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=N2Yc=CU=arm.com=richard.sandiford@sourceware.org>
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
	by sourceware.org (Postfix) with ESMTP id BBD2C3858412
	for <gcc-patches@gcc.gnu.org>; Sun,  2 Jul 2023 09:13:02 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BBD2C3858412
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
	by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E6E9B2F4;
	Sun,  2 Jul 2023 02:13:44 -0700 (PDT)
Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72])
	by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6F27E3F64C;
	Sun,  2 Jul 2023 02:13:01 -0700 (PDT)
From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <richard.guenther@gmail.com>
Mail-Followup-To: Richard Biener <richard.guenther@gmail.com>,Kewen Lin <linkw@linux.ibm.com>,  gcc-patches@gcc.gnu.org,  segher@kernel.crashing.org,  bergner@linux.ibm.com, richard.sandiford@arm.com
Cc: Kewen Lin <linkw@linux.ibm.com>,  gcc-patches@gcc.gnu.org,  segher@kernel.crashing.org,  bergner@linux.ibm.com
Subject: Re: [PATCH 0/9] vect: Move costing next to the transform for vect load
References: <cover.1686573640.git.linkw@linux.ibm.com>
	<CAFiYyc1VT0G7_x59KgU4nseAnBdJ18V_N81L5JguGgx-av1n4Q@mail.gmail.com>
Date: Sun, 02 Jul 2023 10:13:00 +0100
In-Reply-To: <CAFiYyc1VT0G7_x59KgU4nseAnBdJ18V_N81L5JguGgx-av1n4Q@mail.gmail.com>
	(Richard Biener's message of "Fri, 30 Jun 2023 13:37:23 +0200")
Message-ID: <mptlefywuar.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-21.1 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Jun 13, 2023 at 4:07=E2=80=AFAM Kewen Lin <linkw@linux.ibm.com> w=
rote:
>>
>> This patch series follows Richi's suggestion at the link [1],
>> which suggest structuring vectorizable_load to make costing
>> next to the transform, in order to make it easier to keep
>> costing and the transform in sync.

FTR, I was keeping quiet given that this was following an agreed plan :)

Thanks for organising the series this way.  It made it easier to review.

>> For now, it's a known
>> issue that what we cost can be inconsistent with what we
>> transform, as the case in PR82255 and some other associated
>> test cases in the patches of this series show.
>>
>> Basically this patch series makes costing not call function
>> vect_model_load_cost any more.  To make the review and
>> bisection easy, I organized the changes according to the
>> memory access types of vector load.  For each memory access
>> type, firstly it follows the handlings in the function
>> vect_model_load_costto avoid any missing, then refines
>> further by referring to the transform code, I also checked
>> them with some typical test cases to verify.  Hope the
>> subjects of patches are clear enough.
>>
>> The whole series can be bootstrapped and regtested
>> incrementally on:
>>   - x86_64-redhat-linux
>>   - aarch64-linux-gnu
>>   - powerpc64-linux-gnu P7, P8 and P9
>>   - powerpc64le-linux-gnu P8, P9 and P10
>>
>> By considering the current vector test buckets are mainly
>> tested without cost model, I also verified the whole patch
>> series was neutral for SPEC2017 int/fp on Power9 at O2,
>> O3 and Ofast separately.
>
> I went through the series now and I like it overall (well, I suggested
> the change).
> Looking at the changes I think we want some followup to reduce the
> mess in the final loop nest.  We already have some VMAT_* cases handled
> separately, maybe we can split out some more cases.  Maybe we should
> bite the bullet and duplicate that loop nest for the different VMAT_* cas=
es.
> Maybe we can merge some of the if (!costing_p) checks by clever
> re-ordering.  So what
> this series doesn't improve is overall readability of the code (indent an=
d our
> 80 char line limit).
>
> The change also makes it more difficult(?) to separate analysis and trans=
form
> though in the end I hope that analysis will actually "code generate" to a=
 (SLP)
> data structure so the target will have a chance to see the actual flow of=
 insns.
>
> That said, I'd like to hear from Richard whether he thinks this is a step
> in the right direction.

Yeah, agree that it's probably better on balance.  It's going to need a
bit of discipline to make sure that we don't accidentally change the IR
during the analysis phase, but I guess that already exists to a lesser
extent with the =E2=80=9Cbefore !vec_stmt=E2=80=9D/=E2=80=9Cafter !vec_stmt=
=E2=80=9D split.

Thanks,
Richard