From: Richard Sandiford <richard.sandiford@arm.com>
To: Richard Biener <richard.guenther@gmail.com>
Cc: Hao Liu OS <hliu@os.amperecomputing.com>,
"GCC-patches\@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] AArch64: Do not increase the vect reduction latency by multiplying count [PR110625]
Date: Wed, 26 Jul 2023 11:12:19 +0100 [thread overview]
Message-ID: <mptjzunc77g.fsf@arm.com> (raw)
In-Reply-To: <CAFiYyc0F-vvXpu_UUCs_=poJw=m5Rzdgbnb8NjQVbpDZ6frYYA@mail.gmail.com> (Richard Biener's message of "Wed, 26 Jul 2023 12:02:01 +0200")
Richard Biener <richard.guenther@gmail.com> writes:
> On Wed, Jul 26, 2023 at 11:14 AM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Richard Biener <richard.guenther@gmail.com> writes:
>> > On Wed, Jul 26, 2023 at 4:02 AM Hao Liu OS via Gcc-patches
>> > <gcc-patches@gcc.gnu.org> wrote:
>> >>
>> >> > When was STMT_VINFO_REDUC_DEF empty? I just want to make sure that we're not papering over an issue elsewhere.
>> >>
>> >> Yes, I also wonder if this is an issue in vectorizable_reduction. Below is the the gimple of "gcc.target/aarch64/sve/cost_model_13.c":
>> >>
>> >> <bb 3>:
>> >> # res_18 = PHI <res_15(7), 0(6)>
>> >> # i_20 = PHI <i_16(7), 0(6)>
>> >> _1 = (long unsigned int) i_20;
>> >> _2 = _1 * 2;
>> >> _3 = x_14(D) + _2;
>> >> _4 = *_3;
>> >> _5 = (unsigned short) _4;
>> >> res.0_6 = (unsigned short) res_18;
>> >> _7 = _5 + res.0_6; <-- The current stmt_info
>> >> res_15 = (short int) _7;
>> >> i_16 = i_20 + 1;
>> >> if (n_11(D) > i_16)
>> >> goto <bb 7>;
>> >> else
>> >> goto <bb 4>;
>> >>
>> >> <bb 7>:
>> >> goto <bb 3>;
>> >>
>> >> It looks like that STMT_VINFO_REDUC_DEF should be "res_18 = PHI <res_15(7), 0(6)>"?
>> >> The status here is:
>> >> STMT_VINFO_REDUC_IDX (stmt_info): 1
>> >> STMT_VINFO_REDUC_TYPE (stmt_info): TREE_CODE_REDUCTION
>> >> STMT_VINFO_REDUC_VECTYPE (stmt_info): 0x0
>> >
>> > Not all stmts in the SSA cycle forming the reduction have
>> > STMT_VINFO_REDUC_DEF set,
>> > only the last (latch def) and live stmts have at the moment.
>>
>> Ah, thanks. In that case, Hao, I think we can avoid the ICE by changing:
>>
>> if ((kind == scalar_stmt || kind == vector_stmt || kind == vec_to_scalar)
>> && vect_is_reduction (stmt_info))
>>
>> to:
>>
>> if ((kind == scalar_stmt || kind == vector_stmt || kind == vec_to_scalar)
>> && STMT_VINFO_LIVE_P (stmt_info)
>> && vect_is_reduction (stmt_info))
>>
>> instead of using a null check.
>
> But as seen you will miss stmts that are part of the reduction then?
Yeah, but the code is doing a maximum of all the reductions in the loop:
/* ??? Ideally we'd do COUNT reductions in parallel, but unfortunately
that's not yet the case. */
ops->reduction_latency = MAX (ops->reduction_latency, base * count);
So as it stands, we only need to see each reduction (as opposed to each
reduction statement) once.
But we do want to know the length of each reduction...
> In principle we could put STMT_VINFO_REDUC_DEF to other stmts
> as well. See vectorizable_reduction in the
>
> while (reduc_def != PHI_RESULT (reduc_def_phi))
>
> loop.
Nod. That's where I'd got the STMT_VINFO_LIVE_P thing from.
>> I see that vectorizable_reduction calculates a reduc_chain_length.
>> Would it be OK to store that in the stmt_vec_info? I suppose the
>> AArch64 code should be multiplying by that as well. (It would be a
>> separate patch from this one though.)
>
> I don't think that's too relevant here (it also counts noop conversions).
Bah. I'm loath to copy that loop and just pick out the relevant
statements though.
I suppose if every statement had a STMT_VINFO_REDUC_DEF, aarch64 could
maintain a hash map from STMT_VINFO_REDUC_DEF to total latencies, and
then take the maximum of those total latencies. It sounds pretty
complex though...
Thanks,
Richard
next prev parent reply other threads:[~2023-07-26 10:12 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-19 4:33 Hao Liu OS
2023-07-24 1:58 ` Hao Liu OS
2023-07-24 11:10 ` Richard Sandiford
2023-07-25 9:10 ` Hao Liu OS
2023-07-25 9:44 ` Richard Sandiford
2023-07-26 2:01 ` Hao Liu OS
2023-07-26 8:47 ` Richard Biener
2023-07-26 9:14 ` Richard Sandiford
2023-07-26 10:02 ` Richard Biener
2023-07-26 10:12 ` Richard Sandiford [this message]
2023-07-26 12:00 ` Richard Biener
2023-07-26 12:54 ` Hao Liu OS
2023-07-28 10:06 ` Hao Liu OS
2023-07-28 17:35 ` Richard Sandiford
2023-07-31 2:39 ` Hao Liu OS
2023-07-31 9:11 ` Richard Sandiford
2023-07-31 9:25 ` Hao Liu OS
2023-08-01 9:43 ` Hao Liu OS
2023-08-02 3:45 ` Hao Liu OS
2023-08-03 9:33 ` Hao Liu OS
2023-08-03 10:10 ` Richard Sandiford
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=mptjzunc77g.fsf@arm.com \
--to=richard.sandiford@arm.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=hliu@os.amperecomputing.com \
--cc=richard.guenther@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).