From: Philipp Tomsich
Date: Mon, 3 Apr 2023 13:45:59 +0200
Subject: Re: [PATCH] aarch64: update ampere1 vectorization cost
To: Kyrylo Tkachov
Cc: "gcc-patches@gcc.gnu.org", Richard Sandiford, Tamar Christina, Manolis Tsamis
References: <20230327074654.1126912-1-philipp.tomsich@vrull.eu>

Kyrill,

We reran on GCC12 and GCC11, reproducing the same improvements (e.g.,
on fotonik3d) that prompted the changes.
I'll apply the backports later this week, unless you have any further
concerns…

Thanks,
Philipp.

On Mon, 27 Mar 2023 at 11:24, Kyrylo Tkachov wrote:
>
>
>
> > -----Original Message-----
> > From: Philipp Tomsich
> > Sent: Monday, March 27, 2023 9:50 AM
> > To: Kyrylo Tkachov
> > Cc: gcc-patches@gcc.gnu.org; Richard Sandiford; Tamar Christina;
> > Manolis Tsamis
> > Subject: Re: [PATCH] aarch64: update ampere1 vectorization cost
> >
> > On Mon, 27 Mar 2023 at 16:45, Kyrylo Tkachov wrote:
> > >
> > > Hi Philipp,
> > >
> > > > -----Original Message-----
> > > > From: Gcc-patches
> > > > bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Philipp
> > > > Tomsich
> > > > Sent: Monday, March 27, 2023 8:47 AM
> > > > To: gcc-patches@gcc.gnu.org
> > > > Cc: Richard Sandiford; Tamar Christina; Philipp Tomsich;
> > > > Manolis Tsamis
> > > > Subject: [PATCH] aarch64: update ampere1 vectorization cost
> > > >
> > > > The original submission of AmpereOne (-mcpu=ampere1) costs occurred
> > > > prior to exhaustive testing of vectorizable workloads against
> > > > hardware.
> > > >
> > > > Adjust the vector costs to achieve the best results and more closely
> > > > match the underlying hardware.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >       * config/aarch64/aarch64.cc: Update vector costs for ampere1.
> > > >
> > > > Co-Authored-By: Manolis Tsamis
> > > >
> > > > Signed-off-by: Philipp Tomsich
> > > > ---
> > > > We would like to get this into GCC 13 to avoid having to backport at
> > > > the start of the next cycle.
> > > >
> > >
> > > Given this affects only the ampere1 costs that sounds fine to me and
> > > fairly low risk, you are being trusted that these costs are actually
> > > desirable and properly validated on the hardware involved.
> > >
> > > > OK for backports?
> > >
> > > This is ok for trunk (GCC 13). Do you also want to backport this to
> > > other branches?
> >
> > Ampere1 (with the older vector costs) are in GCC12 and GCC11.
> > I would like to backport to those as well.
>
> Ok then, though you may want to run the benchmarks on the branches as
> well to make sure the costs give the expected benefit there as well.
> Thanks,
> Kyrill
> >
> > Thanks,
> > Philipp.
> >
> > > Thanks,
> > > Kyrill
> > >
> > >
> > > >
> > > >  gcc/config/aarch64/aarch64.cc | 12 ++++++------
> > > >  1 file changed, 6 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > > > index b27f4354031..661fff65cea 100644
> > > > --- a/gcc/config/aarch64/aarch64.cc
> > > > +++ b/gcc/config/aarch64/aarch64.cc
> > > > @@ -1132,7 +1132,7 @@ static const struct cpu_vector_cost thunderx3t110_vector_cost =
> > > >
> > > >  static const advsimd_vec_cost ampere1_advsimd_vector_cost =
> > > >  {
> > > > -  3, /* int_stmt_cost */
> > > > +  1, /* int_stmt_cost */
> > > >    3, /* fp_stmt_cost */
> > > >    0, /* ld2_st2_permute_cost */
> > > >    0, /* ld3_st3_permute_cost */
> > > > @@ -1148,17 +1148,17 @@ static const advsimd_vec_cost ampere1_advsimd_vector_cost =
> > > >    8, /* store_elt_extra_cost */
> > > >    6, /* vec_to_scalar_cost */
> > > >    7, /* scalar_to_vec_cost */
> > > > -  5, /* align_load_cost */
> > > > -  5, /* unalign_load_cost */
> > > > -  2, /* unalign_store_cost */
> > > > -  2 /* store_cost */
> > > > +  4, /* align_load_cost */
> > > > +  4, /* unalign_load_cost */
> > > > +  1, /* unalign_store_cost */
> > > > +  1 /* store_cost */
> > > >  };
> > > >
> > > >  /* Ampere-1 costs for vector insn classes. */
> > > >  static const struct cpu_vector_cost ampere1_vector_cost =
> > > >  {
> > > >    1, /* scalar_int_stmt_cost */
> > > > -  1, /* scalar_fp_stmt_cost */
> > > > +  3, /* scalar_fp_stmt_cost */
> > > >    4, /* scalar_load_cost */
> > > >    1, /* scalar_store_cost */
> > > >    1, /* cond_taken_branch_cost */
> > > > --
> > > > 2.34.1
> > >
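
For readers following the thread, here is a rough sketch of how the changed entries shift the balance between scalar and vector code. It is a toy model, not GCC's actual cost computation, which also weighs prologue and epilogue costs, permutes, reductions and more. The names ToyCosts, kOld, kNew, scalar_cost and vector_cost are hypothetical; only the numeric values come from the diff above. The loop assumed is a[i] = b[i] + c[i] on floats, with four lanes per Advanced SIMD vector and unaligned accesses.

```cpp
// Toy model only: NOT how GCC computes vectorization profitability.
// It sums per-statement costs for a[i] = b[i] + c[i] on floats.
struct ToyCosts {
  int scalar_fp_stmt;     // scalar_fp_stmt_cost
  int scalar_load;        // scalar_load_cost
  int scalar_store;       // scalar_store_cost
  int vec_fp_stmt;        // fp_stmt_cost (Advanced SIMD)
  int vec_unalign_load;   // unalign_load_cost
  int vec_unalign_store;  // unalign_store_cost
};

// Values copied from the diff: before and after the patch.
constexpr ToyCosts kOld{1, 4, 1, 3, 5, 2};
constexpr ToyCosts kNew{3, 4, 1, 3, 4, 1};

// Scalar loop body: two loads, one FP add, one store per element.
constexpr int scalar_cost(const ToyCosts &c, int elems) {
  return elems * (2 * c.scalar_load + c.scalar_fp_stmt + c.scalar_store);
}

// Vector loop body: the same statements once per group of `lanes` elements.
// Assumes elems is a multiple of lanes (no epilogue modelled).
constexpr int vector_cost(const ToyCosts &c, int elems, int lanes) {
  return (elems / lanes) *
         (2 * c.vec_unalign_load + c.vec_fp_stmt + c.vec_unalign_store);
}
```

Under these assumptions, four elements cost 40 (scalar) against 15 (vector) with the old table and 48 against 12 with the new one, so the patched costs make vectorizing this kind of FP loop look more attractive. Whether that carries over to a real workload is what the benchmark reruns on the branches are meant to confirm.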