Re: [PATCH][RFC] tree-optimization/101908 - avoid STLF fails when vectorizing

From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <rguenther@suse.de>
Cc: gcc-patches@gcc.gnu.org,  hongtao.liu@intel.com
Subject: Re: [PATCH][RFC] tree-optimization/101908 - avoid STLF fails when vectorizing
Date: Mon, 28 Mar 2022 15:35:56 +0100
Message-ID: <mptilryywj7.fsf@arm.com>
In-Reply-To: <7243n6n9-n4sq-8or-p523-q568r7891oo9@fhfr.qr> (Richard Biener's message of "Mon, 28 Mar 2022 15:49:29 +0200 (CEST)")

Richard Biener <rguenther@suse.de> writes:
> On Mon, 28 Mar 2022, Richard Sandiford wrote:
>
>> Richard Biener <rguenther@suse.de> writes:
>> > Since we're now vectorizing by default at -O2 issues like PR101908
>> > become more important where we apply basic-block vectorization to
>> > parts of the function covering loads from function parameters passed
>> > on the stack.  Since we have no good idea how the stack pushing
>> > was performed but we do have a good idea that it was done recently
>> > when at the start of the function a STLF failure is inevitable
>> > if the argument passing code didn't use the same or larger vector
>> > stores and also has them aligned the same way.
>> >
>> > Until there's a robust IPA based solution the following implements
>> > target independent heuristics in the vectorizer to retain scalar
>> > loads for loads from parameters likely passed in memory (I use
>> > a BLKmode DECL_MODE check for this rather than firing up
>> > cumulative-args).  I've restricted this also to loads from the
>> > first "block" (that can be less than the first basic block if there's
>> > a call for example), since that covers the testcase.
>> >
>> > Note that for the testcase (but not c-ray from the bugreport) there's
>> > an x86 peephole2 that vectorizes things back, so the patch is
>> > not effective there.
>> >
>> > Any comments?  I know we're also looking at x86 port specific
>> > mitigations but the issue will hit arm and power/z as well I think.
>> 
>> I'm not sure this is a target-independent win.  In a loop that:
>> 
>>   stores 2 scalars
>>   loads the stored scalars as a vector
>>   adds a vector
>>   stores a vector
>> 
>> (no feedback), I see a 20% regression using elementwise accesses for
>> the load vs. using a normal vector load (measured on a Cortex-A72).
>> With feedback the elementwise version is still slower, but obviously
>> not by such a big factor.
>
> I see, so that's even without a call in between the scalar stores
> and the vector load as is the case we're trying to cover.  I
> would suspect that maybe the two elementwise accesses execute
> too close to the two scalar stores to benefit from any forwarding
> with the A72 micro-architecture?  Do you see a speedup when
> performing a vector store instead of two scalar stores?

Yeah, it's faster with a vector store than with 2 scalar stores.

The difference between elementwise loads and vector loads still
reproduces when execution of the stores and the load is forced
further apart.

Note that (unlike x86?) the elementwise loads are still done on the
vector side, so this is not a scalar->vector vs. scalar->scalar
trade-off.
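
For reference, a rough sketch of the kind of loop described above (the
"no feedback" form).  This is not the benchmark that was measured: the
function names, the v2df type and the use of doubles are assumptions
made for illustration, and a real timing test would also need something
(volatile accesses or an asm barrier, say) to stop the compiler from
forwarding the stored values itself.

```c
#include <string.h>

/* Two-lane double vector via the GCC/Clang vector extension
   (an assumption; the original message does not give the types).  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Variant A: store 2 scalars, then reload them with one vector load.  */
static void
kernel_vector_load (double *buf, const double *a, const double *b,
		    const v2df *addend, v2df *out, int n)
{
  for (int i = 0; i < n; i++)
    {
      buf[0] = a[i];			/* scalar store 1 */
      buf[1] = b[i];			/* scalar store 2 */
      v2df v;
      memcpy (&v, buf, sizeof v);	/* one 16-byte load covering both */
      out[i] = v + addend[i];		/* vector add, vector store */
    }
}

/* Variant B: same stores, but the vector is rebuilt from two
   element-sized loads (the "elementwise" access).  */
static void
kernel_elementwise_load (double *buf, const double *a, const double *b,
			 const v2df *addend, v2df *out, int n)
{
  for (int i = 0; i < n; i++)
    {
      buf[0] = a[i];			/* scalar store 1 */
      buf[1] = b[i];			/* scalar store 2 */
      v2df v = { buf[0], buf[1] };	/* two element loads */
      out[i] = v + addend[i];		/* vector add, vector store */
    }
}
```

Both variants compute the same result; only the shape of the load that
follows the two scalar stores differs.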

Thanks,
Richard
