From: Richard Biener
Date: Wed, 4 Aug 2021 13:43:58 +0200
Subject: Re: [PATCH 5/8] aarch64: Tweak the cost of elementwise stores
To: Richard Sandiford, GCC Patches

On Tue, Aug 3, 2021 at 2:09 PM Richard Sandiford via Gcc-patches wrote:
>
> When the vectoriser scalarises a strided store, it counts one
> scalar_store for each element plus one vec_to_scalar extraction
> for each element.  However, extracting element 0 is free on AArch64,
> so it should have zero cost.
>
> I don't have a testcase that requires this for existing -mtune
> options, but it becomes more important with a later patch.
>
> gcc/
>         * config/aarch64/aarch64.c (aarch64_is_store_elt_extraction): New
>         function, split out from...
>         (aarch64_detect_vector_stmt_subtype): ...here.
>         (aarch64_add_stmt_cost): Treat extracting element 0 as free.
> ---
>  gcc/config/aarch64/aarch64.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 36f11808916..084f8caa0da 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -14622,6 +14622,18 @@ aarch64_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
>      }
>  }
>
> +/* Return true if an operaton of kind KIND for STMT_INFO represents
> +   the extraction of an element from a vector in preparation for
> +   storing the element to memory.  */
> +static bool
> +aarch64_is_store_elt_extraction (vect_cost_for_stmt kind,
> +                                 stmt_vec_info stmt_info)
> +{
> +  return (kind == vec_to_scalar
> +          && STMT_VINFO_DATA_REF (stmt_info)
> +          && DR_IS_WRITE (STMT_VINFO_DATA_REF (stmt_info)));
> +}

It would be nice to put functions like this in tree-vectorizer.h
in some section marked with a comment to contain helpers for
the target add_stmt_cost.

>  /* Return true if STMT_INFO represents part of a reduction.  */
>  static bool
>  aarch64_is_reduction (stmt_vec_info stmt_info)
> @@ -14959,9 +14971,7 @@ aarch64_detect_vector_stmt_subtype (vec_info *vinfo, vect_cost_for_stmt kind,
>    /* Detect cases in which vec_to_scalar is describing the extraction of a
>       vector element in preparation for a scalar store.  The store itself is
>       costed separately.  */
> -  if (kind == vec_to_scalar
> -      && STMT_VINFO_DATA_REF (stmt_info)
> -      && DR_IS_WRITE (STMT_VINFO_DATA_REF (stmt_info)))
> +  if (aarch64_is_store_elt_extraction (kind, stmt_info))
>      return simd_costs->store_elt_extra_cost;
>
>    /* Detect SVE gather loads, which are costed as a single scalar_load
> @@ -15382,6 +15392,12 @@ aarch64_add_stmt_cost (class vec_info *vinfo, void *data, int count,
>        if (vectype && aarch64_sve_only_stmt_p (stmt_info, vectype))
>          costs->saw_sve_only_op = true;
>
> +      /* If we scalarize a strided store, the vectorizer costs one
> +         vec_to_scalar for each element.  However, we can store the first
> +         element using an FP store without a separate extract step.  */
> +      if (aarch64_is_store_elt_extraction (kind, stmt_info))
> +        count -= 1;
> +
>        stmt_cost = aarch64_detect_scalar_stmt_subtype
>          (vinfo, kind, stmt_info, stmt_cost);
>
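
For reference, the count adjustment in the quoted aarch64_add_stmt_cost
hunk can be modelled outside GCC roughly as follows.  This is only a
sketch under stated assumptions: cost_kind, stmt_model and
adjusted_count are hypothetical stand-ins made up for illustration, not
GCC types or functions, and the two bool fields merely model
STMT_VINFO_DATA_REF and DR_IS_WRITE.

```cpp
// Hypothetical stand-ins for GCC's vect_cost_for_stmt and stmt_vec_info.
// These are NOT the real GCC types, only a model of the logic in the patch.
enum cost_kind { scalar_store, vec_to_scalar, vector_stmt };

struct stmt_model {
  bool has_data_ref;  // models STMT_VINFO_DATA_REF (stmt_info) being non-null
  bool is_write;      // models DR_IS_WRITE on that data reference
};

// Mirrors the predicate aarch64_is_store_elt_extraction from the patch:
// a vec_to_scalar operation on a statement whose data reference is a write.
static bool is_store_elt_extraction(cost_kind kind, const stmt_model &stmt) {
  return kind == vec_to_scalar && stmt.has_data_ref && stmt.is_write;
}

// Mirrors the "count -= 1" adjustment: one extraction (element 0) is
// treated as free, so it is dropped from the number of costed operations.
static int adjusted_count(cost_kind kind, const stmt_model &stmt, int count) {
  if (is_store_elt_extraction(kind, stmt))
    count -= 1;
  return count;
}
```

With a four-element store, adjusted_count (vec_to_scalar, {true, true}, 4)
gives 3, so only three extractions are costed; the scalar_store count
and any other kind are left unchanged.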