From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: Hao Liu OS <hliu@os.amperecomputing.com>,
	 "GCC-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]
Date: Wed, 26 Jul 2023 11:12:19 +0100	[thread overview]
Message-ID: <mptjzunc77g.fsf@arm.com> (raw)
In-Reply-To: <CAFiYyc0F-vvXpu_UUCs_=poJw=m5Rzdgbnb8NjQVbpDZ6frYYA@mail.gmail.com> (Richard Biener's message of "Wed, 26 Jul 2023 12:02:01 +0200")

Richard Biener <richard.guenther@gmail.com> writes:
> On Wed, Jul 26, 2023 at 11:14 AM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Richard Biener <richard.guenther@gmail.com> writes:
>> > On Wed, Jul 26, 2023 at 4:02 AM Hao Liu OS via Gcc-patches
>> > <gcc-patches@gcc.gnu.org> wrote:
>> >>
>> >> > When was STMT_VINFO_REDUC_DEF empty?  I just want to make sure that we're not papering over an issue elsewhere.
>> >>
>> >> Yes, I also wonder if this is an issue in vectorizable_reduction.  Below is the GIMPLE of "gcc.target/aarch64/sve/cost_model_13.c":
>> >>
>> >>   <bb 3>:
>> >>   # res_18 = PHI <res_15(7), 0(6)>
>> >>   # i_20 = PHI <i_16(7), 0(6)>
>> >>   _1 = (long unsigned int) i_20;
>> >>   _2 = _1 * 2;
>> >>   _3 = x_14(D) + _2;
>> >>   _4 = *_3;
>> >>   _5 = (unsigned short) _4;
>> >>   res.0_6 = (unsigned short) res_18;
>> >>   _7 = _5 + res.0_6;                             <-- The current stmt_info
>> >>   res_15 = (short int) _7;
>> >>   i_16 = i_20 + 1;
>> >>   if (n_11(D) > i_16)
>> >>     goto <bb 7>;
>> >>   else
>> >>     goto <bb 4>;
>> >>
>> >>   <bb 7>:
>> >>   goto <bb 3>;
>> >>
>> >> It looks like STMT_VINFO_REDUC_DEF should be "res_18 = PHI <res_15(7), 0(6)>"?
>> >> The status here is:
>> >>   STMT_VINFO_REDUC_IDX (stmt_info): 1
>> >>   STMT_VINFO_REDUC_TYPE (stmt_info): TREE_CODE_REDUCTION
>> >>   STMT_VINFO_REDUC_VECTYPE (stmt_info): 0x0
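
(For reference, the GIMPLE above corresponds to a short reduction along
these lines; this is a reconstruction from the dump rather than the
verbatim cost_model_13.c source, and the function name is made up:

  short
  f (short *x, int n)
  {
    short res = 0;
    for (int i = 0; i < n; ++i)
      res += x[i];    /* the unsigned short add is _7 above */
    return res;
  }

so the accumulation into res is the only reduction in the loop.)
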
>> >
>> > Not all stmts in the SSA cycle forming the reduction have
>> > STMT_VINFO_REDUC_DEF set; at the moment only the last (latch def)
>> > and live stmts have it.
>>
>> Ah, thanks.  In that case, Hao, I think we can avoid the ICE by changing:
>>
>>   if ((kind == scalar_stmt || kind == vector_stmt || kind == vec_to_scalar)
>>       && vect_is_reduction (stmt_info))
>>
>> to:
>>
>>   if ((kind == scalar_stmt || kind == vector_stmt || kind == vec_to_scalar)
>>       && STMT_VINFO_LIVE_P (stmt_info)
>>       && vect_is_reduction (stmt_info))
>>
>> instead of using a null check.
>
> But as seen, you will then miss stmts that are part of the reduction?

Yeah, but the code takes the maximum over all the reductions in the loop:

      /* ??? Ideally we'd do COUNT reductions in parallel, but unfortunately
	 that's not yet the case.  */
      ops->reduction_latency = MAX (ops->reduction_latency, base * count);

So as it stands, we only need to see each reduction (as opposed to each
reduction statement) once.
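
For concreteness, the combination I'm suggesting would look roughly like
this (a sketch rather than an exact diff; "base" stands for however we
compute the statement's in-loop reduction latency):

  if ((kind == scalar_stmt || kind == vector_stmt || kind == vec_to_scalar)
      && STMT_VINFO_LIVE_P (stmt_info)
      && vect_is_reduction (stmt_info))
    {
      unsigned int base = ...;  /* in-loop latency of this reduction stmt */
      /* ??? Ideally we'd do COUNT reductions in parallel, but unfortunately
	 that's not yet the case.  */
      ops->reduction_latency = MAX (ops->reduction_latency, base * count);
    }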

But we do want to know the length of each reduction...

> In principle we could set STMT_VINFO_REDUC_DEF on the other stmts
> as well.  See vectorizable_reduction in the
>
>   while (reduc_def != PHI_RESULT (reduc_def_phi))
>
> loop.

Nod.  That's where I'd got the STMT_VINFO_LIVE_P thing from.

>> I see that vectorizable_reduction calculates a reduc_chain_length.
>> Would it be OK to store that in the stmt_vec_info?  I suppose the
>> AArch64 code should be multiplying by that as well.  (It would be a
>> separate patch from this one though.)
>
> I don't think that's too relevant here (it also counts noop conversions).

Bah.  I'm loath to copy that loop and just pick out the relevant
statements though.

I suppose if every statement had a STMT_VINFO_REDUC_DEF, aarch64 could
maintain a hash map from STMT_VINFO_REDUC_DEF to total latencies, and
then take the maximum of those total latencies.  It sounds pretty
complex though...
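
As a rough standalone illustration of that scheme (plain C++ rather than
the vectorizer's types; the statement records, latencies and names below
are all made up):

  #include <algorithm>
  #include <cstdio>
  #include <unordered_map>

  /* Stand-in for a vectorizer statement: the reduction PHI it belongs to
     (what STMT_VINFO_REDUC_DEF would give us if it were set on every
     statement in the cycle) and its latency contribution.  */
  struct stmt { const void *reduc_def; unsigned latency; };

  static unsigned
  max_total_reduction_latency (const stmt *stmts, unsigned n)
  {
    /* Accumulate a total latency per reduction...  */
    std::unordered_map<const void *, unsigned> totals;
    for (unsigned i = 0; i < n; ++i)
      if (stmts[i].reduc_def)
	totals[stmts[i].reduc_def] += stmts[i].latency;

    /* ...and take the maximum of those per-reduction totals.  */
    unsigned max_latency = 0;
    for (const auto &entry : totals)
      max_latency = std::max (max_latency, entry.second);
    return max_latency;
  }

  int
  main ()
  {
    /* Reduction A has two statements of latency 2; reduction B has one of
       latency 3.  Taking the maximum of individual statement latencies
       would give 3, but A's critical path is really 4.  */
    const char *phi_a = "A", *phi_b = "B";
    stmt stmts[] = { { phi_a, 2 }, { phi_a, 2 }, { phi_b, 3 }, { nullptr, 1 } };
    printf ("max reduction latency: %u\n",
	    max_total_reduction_latency (stmts, 4));
    return 0;
  }

The map itself is the easy part, of course; the open question is whether
every statement in the cycle would have a STMT_VINFO_REDUC_DEF to key on.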

Thanks,
Richard


