From mboxrd@z Thu Jan  1 00:00:00 1970
From: Richard Biener
Date: Wed, 4 Aug 2021 13:45:45 +0200
Subject: Re: [PATCH 6/8] aarch64: Tweak MLA vector costs
To: Richard Sandiford, GCC Patches
Content-Type: text/plain; charset="UTF-8"
List-Id: Gcc-patches mailing list

On Tue, Aug 3, 2021 at 2:10 PM Richard Sandiford via Gcc-patches wrote:
>
> The issue-based vector costs currently assume that a multiply-add
> sequence can be implemented using a single instruction.  This is
> generally true for scalars (which have a 4-operand instruction)
> and SVE (which allows the output to be tied to any input).
> However, for Advanced SIMD, multiplying two values and adding
> an invariant will end up being a move and an MLA.
>
> The only target to use the issue-based vector costs is Neoverse V1,
> which would generally prefer SVE in this case anyway.  I therefore
> don't have a self-contained testcase.  However, the distinction
> becomes more important with a later patch.

But we do cost any invariants separately (for the prologue), so they
should be available in a register.  How doesn't that work?

> gcc/
>         * config/aarch64/aarch64.c (aarch64_multiply_add_p): Add a vec_flags
>         parameter.  Detect cases in which an Advanced SIMD MLA would almost
>         certainly require a MOV.
>         (aarch64_count_ops): Update accordingly.
> ---
>  gcc/config/aarch64/aarch64.c | 25 ++++++++++++++++++++++---
>  1 file changed, 22 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 084f8caa0da..19045ef6944 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -14767,9 +14767,12 @@ aarch64_integer_truncation_p (stmt_vec_info stmt_info)
>
>  /* Return true if STMT_INFO is the second part of a two-statement multiply-add
>     or multiply-subtract sequence that might be suitable for fusing into a
> -   single instruction.  */
> +   single instruction.  If VEC_FLAGS is zero, analyze the operation as
> +   a scalar one, otherwise analyze it as an operation on vectors with those
> +   VEC_* flags.  */
>  static bool
> -aarch64_multiply_add_p (vec_info *vinfo, stmt_vec_info stmt_info)
> +aarch64_multiply_add_p (vec_info *vinfo, stmt_vec_info stmt_info,
> +                        unsigned int vec_flags)
>  {
>    gassign *assign = dyn_cast<gassign *> (stmt_info->stmt);
>    if (!assign)
> @@ -14797,6 +14800,22 @@ aarch64_multiply_add_p (vec_info *vinfo, stmt_vec_info stmt_info)
>        if (!rhs_assign || gimple_assign_rhs_code (rhs_assign) != MULT_EXPR)
>          continue;
>
> +      if (vec_flags & VEC_ADVSIMD)
> +        {
> +          /* Scalar and SVE code can tie the result to any FMLA input (or none,
> +             although that requires a MOVPRFX for SVE).  However, Advanced SIMD
> +             only supports MLA forms, so will require a move if the result
> +             cannot be tied to the accumulator.  The most important case in
> +             which this is true is when the accumulator input is invariant.  */
> +          rhs = gimple_op (assign, 3 - i);
> +          if (TREE_CODE (rhs) != SSA_NAME)
> +            return false;
> +          def_stmt_info = vinfo->lookup_def (rhs);
> +          if (!def_stmt_info
> +              || STMT_VINFO_DEF_TYPE (def_stmt_info) == vect_external_def)
> +            return false;
> +        }
> +
>        return true;
>      }
>    return false;
> @@ -15232,7 +15251,7 @@ aarch64_count_ops (class vec_info *vinfo, aarch64_vector_costs *costs,
>      }
>
>    /* Assume that multiply-adds will become a single operation.  */
> -  if (stmt_info && aarch64_multiply_add_p (vinfo, stmt_info))
> +  if (stmt_info && aarch64_multiply_add_p (vinfo, stmt_info, vec_flags))
>      return;
>
>    /* When costing scalar statements in vector code, the count already