From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=sEL2=CS=gmail.com=richard.guenther@sourceware.org>
Received: from mail-lj1-x22b.google.com (mail-lj1-x22b.google.com [IPv6:2a00:1450:4864:20::22b])
	by sourceware.org (Postfix) with ESMTPS id 113FC3860758
	for <gcc-patches@gcc.gnu.org>; Fri, 30 Jun 2023 11:37:37 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 113FC3860758
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-lj1-x22b.google.com with SMTP id 38308e7fff4ca-2b6a5fd1f46so26670721fa.1
        for <gcc-patches@gcc.gnu.org>; Fri, 30 Jun 2023 04:37:37 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20221208; t=1688125055; x=1690717055;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:from:to:cc:subject:date
         :message-id:reply-to;
        bh=+B+CpnF1jfCG96PDGK3kJoBRDuk6yesYNsDy1lkUwO0=;
        b=cs9mhGUxD6JCJl+z/13FitpLv3ToTarThBnFcScK159IVW4HewTYD7lHAu4AvTbY0X
         d/FM3OjcyVZQbmzGmpcgx6gyBplap80b51/eDvx6twGYw3pqd2ST1wtNnQaAD0N9reNn
         ZvYrT/6B++eyNW5iigvl+AZtKD9+EI2B2/drTMbvg5KdBDdcCqb2iaFINgZMwBY6gUBh
         u6/GMoiH9nDEJwL/HlLVw2y2866txDDTswZERGMhUdSly2edIgMD8dLxaKNM8HrkWQ+Y
         mWh8JGsNUW0wewLcD9X1/H8CfeEikhd1wSk6GmnJfUSzPqyy4qPAvFEk3dWVc/iGePbG
         4gFg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1688125055; x=1690717055;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=+B+CpnF1jfCG96PDGK3kJoBRDuk6yesYNsDy1lkUwO0=;
        b=KdpiqlDECsOtstKbmB8EMo0tMTMqsIxyE/hwM/Vz5w6hnSqL+KAe8S7CsPPwbOTuyA
         vKYoTMfmW2rYrdSLrdsUpcVruyriDzqPO0bxlm0TVKzMmv7ujoWSNwVr1sRfEbAawrE/
         u7FFQXxjddk0a6jpbGdg+ZcmYhhR6wk6ChKV9LjofiXiEbzJCi7Ml49xZkHBB73yJLab
         QlfEo5OLNjN8VsZOQxi723I1kNzs/8cC80/2u4yg9ZRnhlt+3pfmSzQSO+HiNkJDejm2
         AG2316oySlCo410PZtmwXKMhCwDihp+18UbKDRUs05STy5dT/3c7biYuAHST/YRLOvIM
         LhyQ==
X-Gm-Message-State: ABy/qLZ9e1RKFkuoQLaCmHS62EDcMdeYhf8yrxYpbRlxBQxDgekDD8Yw
	B8HroTdPDmgVE71C2Sk5qwyUw9uNye2GDB/vOWFewdG9
X-Google-Smtp-Source: APBJJlFoaKxkzekfEvo9FWTkEZ7BC18XACfGr3NlaB0MWhe3nInidqZVeTxicamBYiBFX9G/MMMBhsNlk1A1k/jFS4g=
X-Received: by 2002:a2e:9844:0:b0:2b6:9bd3:840e with SMTP id
 e4-20020a2e9844000000b002b69bd3840emr2200405ljj.21.1688125055229; Fri, 30 Jun
 2023 04:37:35 -0700 (PDT)
MIME-Version: 1.0
References: <cover.1686573640.git.linkw@linux.ibm.com>
In-Reply-To: <cover.1686573640.git.linkw@linux.ibm.com>
From: Richard Biener <richard.guenther@gmail.com>
Date: Fri, 30 Jun 2023 13:37:23 +0200
Message-ID: <CAFiYyc1VT0G7_x59KgU4nseAnBdJ18V_N81L5JguGgx-av1n4Q@mail.gmail.com>
Subject: Re: [PATCH 0/9] vect: Move costing next to the transform for vect load
To: Kewen Lin <linkw@linux.ibm.com>
Cc: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com, 
	segher@kernel.crashing.org, bergner@linux.ibm.com
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-1.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Tue, Jun 13, 2023 at 4:07=E2=80=AFAM Kewen Lin <linkw@linux.ibm.com> wro=
te:
>
> This patch series follows Richi's suggestion at the link [1],
> which suggest structuring vectorizable_load to make costing
> next to the transform, in order to make it easier to keep
> costing and the transform in sync.  For now, it's a known
> issue that what we cost can be inconsistent with what we
> transform, as the case in PR82255 and some other associated
> test cases in the patches of this series show.
>
> Basically this patch series makes costing not call function
> vect_model_load_cost any more.  To make the review and
> bisection easy, I organized the changes according to the
> memory access types of vector load.  For each memory access
> type, firstly it follows the handlings in the function
> vect_model_load_costto avoid any missing, then refines
> further by referring to the transform code, I also checked
> them with some typical test cases to verify.  Hope the
> subjects of patches are clear enough.
>
> The whole series can be bootstrapped and regtested
> incrementally on:
>   - x86_64-redhat-linux
>   - aarch64-linux-gnu
>   - powerpc64-linux-gnu P7, P8 and P9
>   - powerpc64le-linux-gnu P8, P9 and P10
>
> By considering the current vector test buckets are mainly
> tested without cost model, I also verified the whole patch
> series was neutral for SPEC2017 int/fp on Power9 at O2,
> O3 and Ofast separately.

I went through the series now and I like it overall (well, I suggested
the change).
Looking at the changes I think we want some followup to reduce the
mess in the final loop nest.  We already have some VMAT_* cases handled
separately, maybe we can split out some more cases.  Maybe we should
bite the bullet and duplicate that loop nest for the different VMAT_* cases=
.
Maybe we can merge some of the if (!costing_p) checks by clever
re-ordering.  So what
this series doesn't improve is overall readability of the code (indent and =
our
80 char line limit).

The change also makes it more difficult(?) to separate analysis and transfo=
rm
though in the end I hope that analysis will actually "code generate" to a (=
SLP)
data structure so the target will have a chance to see the actual flow of i=
nsns.

That said, I'd like to hear from Richard whether he thinks this is a step
in the right direction.

Are you willing to followup with doing the same re-structuring to
vectorizable_store?

OK from my side with the few comments addressed.  The patch likely needs re=
fresh
after the RVV changes in this area?

Thanks,
Richard.

> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563624.html
>
> Kewen Lin (9):
>   vect: Move vect_model_load_cost next to the transform in vectorizable_l=
oad
>   vect: Adjust vectorizable_load costing on VMAT_GATHER_SCATTER && gs_inf=
o.decl
>   vect: Adjust vectorizable_load costing on VMAT_INVARIANT
>   vect: Adjust vectorizable_load costing on VMAT_ELEMENTWISE and VMAT_STR=
IDED_SLP
>   vect: Adjust vectorizable_load costing on VMAT_GATHER_SCATTER
>   vect: Adjust vectorizable_load costing on VMAT_LOAD_STORE_LANES
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_REVERSE
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_PERMUTE
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS
>
>  .../vect/costmodel/ppc/costmodel-pr82255.c    |  31 +
>  .../costmodel/ppc/costmodel-vect-reversed.c   |  22 +
>  gcc/testsuite/gcc.target/i386/pr70021.c       |   2 +-
>  gcc/tree-vect-stmts.cc                        | 651 ++++++++++--------
>  4 files changed, 432 insertions(+), 274 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr8=
2255.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vec=
t-reversed.c
>
> --
> 2.31.1
>