public inbox for gcc-bugs@sourceware.org
From: "rguenther at suse dot de" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/113583] Main loop in 519.lbm not vectorized.
Date: Fri, 26 Jan 2024 10:21:55 +0000
Message-ID: <bug-113583-4-eihsZpoLHP@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-113583-4@http.gcc.gnu.org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
--- Comment #10 from rguenther at suse dot de <rguenther at suse dot de> ---
On Fri, 26 Jan 2024, rdapp at gcc dot gnu.org wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113583
>
> --- Comment #9 from Robin Dapp <rdapp at gcc dot gnu.org> ---
> (In reply to rguenther@suse.de from comment #6)
>
> > t.c:47:21: missed: the size of the group of accesses is not a power of 2
> > or not equal to 3
> > t.c:47:21: missed: not falling back to elementwise accesses
> > t.c:58:15: missed: not vectorized: relevant stmt not supported: _4 =
> > *_3;
> > t.c:47:21: missed: bad operation or unsupported loop bound.
> >
> > where we don't consider using gather because we have a known constant
> > stride (20). Since the stores are really scatters we don't attempt
> > to SLP either.
> >
> > Disabling the above heuristic we get this vectorized as well, avoiding
> > gather/scatter by manually implementing them and using a quite high
> > VF of 8 (with -mprefer-vector-width=256 you get VF 4 and likely
> > faster code in the end).
>
> I suppose you're referring to this?
>
>   /* FIXME: At the moment the cost model seems to underestimate the
>      cost of using elementwise accesses.  This check preserves the
>      traditional behavior until that can be fixed.  */
>   stmt_vec_info first_stmt_info = DR_GROUP_FIRST_ELEMENT (stmt_info);
>   if (!first_stmt_info)
>     first_stmt_info = stmt_info;
>   if (*memory_access_type == VMAT_ELEMENTWISE
>       && !STMT_VINFO_STRIDED_P (first_stmt_info)
>       && !(stmt_info == DR_GROUP_FIRST_ELEMENT (stmt_info)
>            && !DR_GROUP_NEXT_ELEMENT (stmt_info)
>            && !pow2p_hwi (DR_GROUP_SIZE (stmt_info))))
>     {
>       if (dump_enabled_p ())
>         dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>                          "not falling back to elementwise accesses\n");
>       return false;
>     }
>
>
> I did some more tests on my laptop. As said above the whole loop in lbm is
> larger and contains two ifs. The first one prevents clang and GCC from
> vectorizing the loop, the second one
>
>   if( TEST_FLAG_SWEEP( srcGrid, ACCEL )) {
>     ux = 0.005;
>     uy = 0.002;
>     uz = 0.000;
>   }
>
> seems to be if-converted by clang, or at least doesn't inhibit vectorization.
>
> Now if I comment out the first, larger if, clang does vectorize the loop. With
> the return false commented out in the above GCC snippet, GCC also vectorizes,
> but only when both ifs are commented out.
>
> Results (with both ifs commented out), -march=native (resulting in avx2), best
> of 3 as lbm is notoriously fickle:
>
> gcc trunk vanilla: 156.04s
> gcc trunk with elementwise: 132.10s
> clang 17: 143.06s
>
> Of course even the comment already said that costing is difficult and the
> change will surely cause regressions elsewhere. However, the 15% improvement
> with vectorization (or the 8% improvement of clang) IMHO shows that it's surely
> useful to look into this further. On top of that, the riscv clang seems to not
> care about the first if either and still vectorizes. I haven't looked closer at
> what happens there, though.
Yes. I think this shows we should remove the above hack and instead
try to fix the costing next stage1.
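
For readers following along, here is a minimal sketch of what "manually
implementing" gather/scatter for a known constant stride amounts to. It is
not the lbm source and not what GCC actually emits. The function names, the
omega parameter, the single strided element per cell and the VF of 4 are all
assumptions made for illustration; only the stride of 20 comes from the
discussion above.

```c
#include <stddef.h>

/* Hypothetical layout: each cell occupies 20 doubles, matching the
   constant stride of 20 mentioned in the thread.  */
enum { N_CELL_ENTRIES = 20 };

/* Scalar reference: one strided load and one strided store per cell.  */
void
relax_scalar (const double *src, double *dst, size_t n_cells, double omega)
{
  for (size_t i = 0; i < n_cells; i++)
    dst[i * N_CELL_ENTRIES] = omega * src[i * N_CELL_ENTRIES];
}

/* Hand-written "elementwise" variant with an assumed VF of 4: the lanes
   are filled by scalar loads, the arithmetic is done on the whole group,
   and the results are written back by scalar stores.  No gather or
   scatter instruction is needed because the stride is a compile-time
   constant.  */
void
relax_elementwise (const double *src, double *dst, size_t n_cells,
                   double omega)
{
  enum { VF = 4 };
  size_t i = 0;

  for (; i + VF <= n_cells; i += VF)
    {
      double lane[VF];

      /* "Gather": VF scalar loads at stride N_CELL_ENTRIES.  */
      for (int l = 0; l < VF; l++)
        lane[l] = src[(i + l) * N_CELL_ENTRIES];

      /* The part that maps to real vector arithmetic.  */
      for (int l = 0; l < VF; l++)
        lane[l] *= omega;

      /* "Scatter": VF scalar stores at the same stride.  */
      for (int l = 0; l < VF; l++)
        dst[(i + l) * N_CELL_ENTRIES] = lane[l];
    }

  /* Scalar epilogue for the remaining cells.  */
  for (; i < n_cells; i++)
    dst[i * N_CELL_ENTRIES] = omega * src[i * N_CELL_ENTRIES];
}
```

Whether this shape pays off depends on how much arithmetic sits between the
loads and the stores, which is the part the cost model has to estimate.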