From: Richard Biener <rguenther@suse.de>
To: Jakub Jelinek <jakub@redhat.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] tree-optimization/113552 - fix num_call accounting in simd clone vectorization
Date: Tue, 23 Jan 2024 13:40:15 +0100 (CET) [thread overview]
Message-ID: <96o9o1on-0523-6s9p-n00q-o7qn476p0470@fhfr.qr> (raw)
In-Reply-To: <Za+vL+OdLFaA4UU6@tucnak>
On Tue, 23 Jan 2024, Jakub Jelinek wrote:
> On Tue, Jan 23, 2024 at 01:03:46PM +0100, Richard Biener wrote:
> > On Tue, 23 Jan 2024, Jakub Jelinek wrote:
> >
> > > On Tue, Jan 23, 2024 at 12:56:52PM +0100, Richard Biener wrote:
> > > > The following avoids using exact_log2 on the number of SIMD clone calls
> > > > to be emitted when vectorizing calls since that can easily be not
> > > > a power of two in which case it will return -1. For different simd
> > > > clones the number of calls will differ by a multiply with a power of two
> > > > only so using floor_log2 is good enough here.
> > > >
> > > > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> > > >
> > > > PR tree-optimization/113552
> > > > * tree-vect-stmts.cc (vectorizable_simd_clone_call): Use
> > > > floor_log2 instead of exact_log2 on the number of calls.
> > >
> > > Is there any target which supports non-power-of-two simdlen?
> > > If not, perhaps we should add !pow2p_hwi (num_calls) to the continue;
> > > condition a few lines earlier?
> >
> > Is non-power-of-two simdlen a thing? Note there's nothing wrong
> > with non-power-of-two num_calls, with VF == 4 and a group size
> > of 3 you get 12 lanes and either 3 (simdlen == 4) or 6 (simdlen == 2)
> > calls.
> >
> > Iff non-power-of-two simdlen is a thing then we could bias
> > by + num_calls (no idea why we use *_log2 in the first place, but it
> > was that way since the beginning).
>
> Ah, with SLP I can understand it doesn't have to be a power of two,
> the original
> + if (n->simdclone->simdlen
> + < (unsigned) LOOP_VINFO_VECT_FACTOR (loop_vinfo))
> + this_badness += (exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo))
> + - exact_log2 (n->simdclone->simdlen)) * 1024;
> was written for loop vectorization only and I think correctly assumed
> power of 2 loop vectorization factors as well as simdlens.
Note even without SLP we could vectorize a group size of 3 with
interleaving. But yes, VF for interleaving is a power of two and
so is simdlen so the above should have worked.
> I admit I don't remember why log2 rather than the count has been used,
> the first version of the patch is
> https://gcc.gnu.org/pipermail/gcc-patches/2013-November/374728.html
> But if we keep using log2, perhaps better ceil_log2 because num_calls of
> 3 is certainly more expensive than 2.
I think only the relative accounting to the other this_badness
increments matter since as said, at least with power-of-two simdlen
we'd get different {ceil,floor}_log2 values for different simdlen.
It's never going to be 3 vs 2 but 3 * 2^n vs. 3 * 2^m so floor or
ceil doesn't matter. In fact we could have just using
some inverse of exact_log2 (n->simdclone->simdlen). That is,
it's only simdlen that's varying in this addend to this_badness.
For the exact_log2 case the behavior is unchanged and we now only
get a sensible number for the others now. Maybe log2 was for the
fear of overflow or over-accounting compared to the other increments.
When overflow is not an issue we could also use
floor_log2 (num_calls * 4096) to be more "precise". I don't know
why we have all these different weights and why they are the way
they are - but it's a lot of apples and oranges we put together
in a magic number to compare ;)
I prefer to do a minimal change now, but both floor and ceil work
for me, so if you prefer I can change (but as said, it would only
matter with the mixing with the other cost factors, and in unknown
ways since the desire is not spelled out).
Richard.
next prev parent reply other threads:[~2024-01-23 12:41 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-23 11:56 Richard Biener
2024-01-23 12:01 ` Jakub Jelinek
2024-01-23 12:03 ` Richard Biener
2024-01-23 12:21 ` Jakub Jelinek
2024-01-23 12:40 ` Richard Biener [this message]
2024-01-23 12:43 ` Jakub Jelinek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=96o9o1on-0523-6s9p-n00q-o7qn476p0470@fhfr.qr \
--to=rguenther@suse.de \
--cc=gcc-patches@gcc.gnu.org \
--cc=jakub@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).