From: "juzhe.zhong@rivai.ai" <juzhe.zhong@rivai.ai>
To: "Robin Dapp" <rdapp.gcc@gmail.com>,
gcc-patches <gcc-patches@gcc.gnu.org>
Cc: "Robin Dapp" <rdapp.gcc@gmail.com>,
kito.cheng <kito.cheng@gmail.com>,
Kito.cheng <kito.cheng@sifive.com>,
jeffreyalaw <jeffreyalaw@gmail.com>
Subject: Re: Re: [PATCH V4] RISC-V: Support Dynamic LMUL Cost model
Date: Tue, 12 Sep 2023 17:14:04 +0800 [thread overview]
Message-ID: <951EBCCD5204EC72+2023091217140349725937@rivai.ai> (raw)
In-Reply-To: <6acec37d-162f-5aa2-3ce9-d10abbb5468f@gmail.com>
Thanks Robin.
I have tried your code. It works fine and the tests pass.
Is your code O(n log n) in complexity?
juzhe.zhong@rivai.ai
From: Robin Dapp
Date: 2023-09-12 16:19
To: Juzhe-Zhong; gcc-patches
CC: rdapp.gcc; kito.cheng; kito.cheng; jeffreyalaw
Subject: Re: [PATCH V4] RISC-V: Support Dynamic LMUL Cost model
Hi Juzhe,
> +max_number_of_live_regs (const basic_block bb,
> + const hash_map<tree, pair> &live_ranges,
> + unsigned int max_point, machine_mode biggest_mode,
> + int lmul)
> +{
> + unsigned int max_nregs = 0;
> + unsigned int i;
> + unsigned int live_point = 0;
> + auto_vec<unsigned int> live_vars_vec;
> + live_vars_vec.safe_grow (max_point + 1, true);
> + for (i = 0; i < live_vars_vec.length (); ++i)
> + live_vars_vec[i] = 0;
> + for (hash_map<tree, pair>::iterator iter = live_ranges.begin ();
> + iter != live_ranges.end (); ++iter)
> + {
> + tree var = (*iter).first;
> + pair live_range = (*iter).second;
> + for (i = live_range.first; i <= live_range.second; i++)
> + {
> + machine_mode mode = TYPE_MODE (TREE_TYPE (var));
> + unsigned int nregs
> + = compute_nregs_for_mode (mode, biggest_mode, lmul);
> + live_vars_vec[i] += nregs;
> + if (live_vars_vec[i] > max_nregs)
> + max_nregs = live_vars_vec[i];
> + }
> + }
My concern is that we have O(nm) here, where n = number of live_ranges
and m = size of live range. In large basic blocks (think calculix of
SPECfp 2006 which can reach up to 2000 instructions IIRC) this might
become prohibitive.
I'm going to do a quick benchmark with calculix and report back. If
there is no noticeable difference we can ditch my idea.
For short live ranges (like < 10) the O(nm) could be better. As of now,
we still calculate the nregs n*m times, though. I have something like
the following in mind (it is definitely not shorter, though):
struct range {
  unsigned int pt;
  bool start;
  unsigned int nregs;
};

auto_vec<range> ranges (2 * live_ranges.elements ());
for (hash_map<tree, pair>::iterator iter = live_ranges.begin ();
     iter != live_ranges.end (); ++iter)
  {
    tree var = (*iter).first;
    machine_mode mode = TYPE_MODE (TREE_TYPE (var));
    unsigned int nregs
      = compute_nregs_for_mode (mode, biggest_mode, lmul);
    ranges.quick_push ({(*iter).second.first, true, nregs});
    ranges.quick_push ({(*iter).second.second, false, nregs});
  }

ranges.qsort ([] (const void *a, const void *b) -> int {
  unsigned int aa = ((const range *) a)->pt;
  unsigned int bb = ((const range *) b)->pt;
  if (aa < bb)
    return -1;
  if (aa == bb)
    return 0;
  return 1;
});

unsigned int cur = 0;
max_nregs = ranges[0].nregs;
for (auto r : ranges)
  {
    if (r.start)
      cur += r.nregs;
    else
      cur -= r.nregs;
    max_nregs = MAX (max_nregs, cur);
  }
> + for (i = 0; i < cfun->gimple_df->ssa_names->length (); i++)
> + {
> + tree t = ssa_name (i);
> + if (!t)
> + continue;
Could likely be replaced by
tree t;
FOR_EACH_SSA_NAME (i, t, cfun)
> +static void
> +update_local_live_ranges (
> + vec_info *vinfo,
> + hash_map<basic_block, vec<stmt_point>> &program_points_per_bb,
> + hash_map<basic_block, hash_map<tree, pair>> &live_ranges_per_bb)
> +{
I just realized (sorry) that this is "nested" a bit far. Can we still
have e.g.
> + if (loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (vinfo))
> + {
this,
> + if (STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info))
> + != undef_vec_info_type)
this,
> + if (live_range)
> + {
and this just "continue"?
Apart from that, LGTM.
Regards
Robin
Thread overview: 9+ messages
2023-09-12 6:49 Juzhe-Zhong
2023-09-12 8:19 ` Robin Dapp
2023-09-12 9:14 ` juzhe.zhong [this message]
2023-09-12 9:17 ` Robin Dapp
2023-09-12 9:25 ` juzhe.zhong
2023-09-12 9:31 ` Robin Dapp
2023-09-12 9:36 ` juzhe.zhong
2023-09-12 10:57 ` Robin Dapp
[not found] ` <2023091217364091739043@rivai.ai>
2023-09-12 9:38 ` juzhe.zhong