From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id BBD2C3858412 for ; Sun, 2 Jul 2023 09:13:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BBD2C3858412 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E6E9B2F4; Sun, 2 Jul 2023 02:13:44 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6F27E3F64C; Sun, 2 Jul 2023 02:13:01 -0700 (PDT) From: Richard Sandiford To: Richard Biener Mail-Followup-To: Richard Biener ,Kewen Lin , gcc-patches@gcc.gnu.org, segher@kernel.crashing.org, bergner@linux.ibm.com, richard.sandiford@arm.com Cc: Kewen Lin , gcc-patches@gcc.gnu.org, segher@kernel.crashing.org, bergner@linux.ibm.com Subject: Re: [PATCH 0/9] vect: Move costing next to the transform for vect load References: Date: Sun, 02 Jul 2023 10:13:00 +0100 In-Reply-To: (Richard Biener's message of "Fri, 30 Jun 2023 13:37:23 +0200") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-21.1 required=5.0 tests=BAYES_00,KAM_DMARC_NONE,KAM_DMARC_STATUS,KAM_LAZY_DOMAIN_SECURITY,SPF_HELO_NONE,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Richard Biener writes: > On Tue, Jun 13, 2023 at 4:07=E2=80=AFAM Kewen Lin w= rote: >> >> This patch series follows Richi's suggestion at the link [1], >> which suggest structuring vectorizable_load to make costing >> next to the transform, in order to make it easier to keep >> costing and the transform in sync. FTR, I was keeping quiet given that this was following an agreed plan :) Thanks for organising the series this way. It made it easier to review. >> For now, it's a known >> issue that what we cost can be inconsistent with what we >> transform, as the case in PR82255 and some other associated >> test cases in the patches of this series show. >> >> Basically this patch series makes costing not call function >> vect_model_load_cost any more. To make the review and >> bisection easy, I organized the changes according to the >> memory access types of vector load. For each memory access >> type, firstly it follows the handlings in the function >> vect_model_load_costto avoid any missing, then refines >> further by referring to the transform code, I also checked >> them with some typical test cases to verify. Hope the >> subjects of patches are clear enough. >> >> The whole series can be bootstrapped and regtested >> incrementally on: >> - x86_64-redhat-linux >> - aarch64-linux-gnu >> - powerpc64-linux-gnu P7, P8 and P9 >> - powerpc64le-linux-gnu P8, P9 and P10 >> >> By considering the current vector test buckets are mainly >> tested without cost model, I also verified the whole patch >> series was neutral for SPEC2017 int/fp on Power9 at O2, >> O3 and Ofast separately. > > I went through the series now and I like it overall (well, I suggested > the change). > Looking at the changes I think we want some followup to reduce the > mess in the final loop nest. We already have some VMAT_* cases handled > separately, maybe we can split out some more cases. Maybe we should > bite the bullet and duplicate that loop nest for the different VMAT_* cas= es. > Maybe we can merge some of the if (!costing_p) checks by clever > re-ordering. So what > this series doesn't improve is overall readability of the code (indent an= d our > 80 char line limit). > > The change also makes it more difficult(?) to separate analysis and trans= form > though in the end I hope that analysis will actually "code generate" to a= (SLP) > data structure so the target will have a chance to see the actual flow of= insns. > > That said, I'd like to hear from Richard whether he thinks this is a step > in the right direction. Yeah, agree that it's probably better on balance. It's going to need a bit of discipline to make sure that we don't accidentally change the IR during the analysis phase, but I guess that already exists to a lesser extent with the =E2=80=9Cbefore !vec_stmt=E2=80=9D/=E2=80=9Cafter !vec_stmt= =E2=80=9D split. Thanks, Richard