Re: [PATCH v3] Re: OpenMP: Generate SIMD clones for functions with "declare target"

public inbox for gcc-patches@gcc.gnu.org

From: Jakub Jelinek <jakub@redhat.com>
To: Sandra Loosemore <sandra@codesourcery.com>
Cc: Jan Hubicka <jh@suse.cz>,
	"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>,
	Thomas Schwinge <thomas@codesourcery.com>
Subject: Re: [PATCH v3] Re: OpenMP: Generate SIMD clones for functions with "declare target"
Date: Thu, 20 Oct 2022 16:07:02 +0200
Message-ID: <Y1FWBg4O5gLnhx/F@tucnak>
In-Reply-To: <c8a948f0-ecc0-96ca-c36f-b8707fce5b24@codesourcery.com>

On Sun, Oct 16, 2022 at 07:23:05PM -0600, Sandra Loosemore wrote:
> My sense is that the first approach would be more straightforward than the
> second one, and I am willing to continue to work on that.  However, I think
> I need some direction to get started, as I presently know nothing about
> cgraph and I was unable to find any useful overview or interface
> documentation in the GCC internals manual.  Is this as simple as inserting
> an existing pass into the passlist to clean up after vectorization, or does
> it involve writing something more or less from scratch?

We have (as I discovered while working on assumptions)
TODO_discard_function, which, when returned from a pass's execute method,
throws the function away completely (except, now, for assumption functions,
whose body it doesn't release).  This could be done in some pass shortly
after IPA, or alternatively before expansion.  The other thing that needs
to be done is for the non-public declare simd clones (both the explicit
ones and the implicit ones from your patch) to be ordered in cgraph after
anything that has a cgraph edge to their original function.  I don't know
how to do that; you should talk to Honza, Richi or Martin about it.
I think the current behavior is that callees are processed before callers
if possible (unless there are cycles), which is certainly what we want for
say assume functions, or IPA RA etc.  But in case of non-public simd clones
we want to do it the other way around (at the expense of IPA RA), so that
we can throw away functions which aren't needed.
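
To illustrate why the order matters, here is a minimal toy model in plain C.
It is only a sketch: struct toy_node and the toy_* functions are hypothetical
names made up for this example, not GCC's cgraph API, and it says nothing
about how the ordering would actually be implemented in cgraph.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a call-graph node; NOT GCC's cgraph_node.  */
struct toy_node
{
  const char *name;
  bool is_public;   /* Externally visible, so it must always be kept.  */
  bool processed;   /* Has this node been compiled already?  */
  bool referenced;  /* Did an already processed caller end up calling it?  */
  bool discarded;   /* Thrown away instead of being compiled.  */
};

/* "Compile" CALLER.  It ends up calling CALLEE only if USES_CALLEE is
   true, e.g. because the vectorizer really picked that simd clone.  */
void
toy_process_caller (struct toy_node *caller, struct toy_node *callee,
                    bool uses_callee)
{
  assert (caller != NULL && callee != NULL);
  caller->processed = true;
  if (uses_callee)
    callee->referenced = true;
}

/* "Compile" CLONE.  It may be discarded only if it is non-public, no
   processed caller references it, and every caller has been processed;
   otherwise a later caller might still need it.  */
void
toy_process_clone (struct toy_node *clone, struct toy_node **callers,
                   size_t n_callers)
{
  assert (clone != NULL && (callers != NULL || n_callers == 0));

  bool all_callers_done = true;
  for (size_t i = 0; i < n_callers; i++)
    if (!callers[i]->processed)
      all_callers_done = false;

  clone->processed = true;
  clone->discarded = (!clone->is_public
                      && all_callers_done
                      && !clone->referenced);
}
```

With callee-first ordering, toy_process_clone runs while its caller is still
unprocessed, so the clone has to be kept.  With callers first, an unused
non-public clone is discarded.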

> > I admit I don't remember where exactly the simd clone happens wrt. other
> > IPA passes, but I think it is late pass; so, does it happen for GCN
> > offloading only in the lto1 offloading compiler?
> > Shouldn't the auto optimization be then done only in the offloading
> > lto1 for GCN then (say guard on targetm boolean)?
> 
> I'm afraid I don't know much about offloading, but I was under the
> impression it all goes through the same compilation process, just with a
> different target?

I've looked at it today and it seems late ipa passes are executed after LTO
bytecode is streamed back in.
If you try, say:
#pragma omp declare simd
int foo (int x) { return x; }

int
main ()
{
  int a[64] = {};
  #pragma omp target map(a)
  #pragma omp simd
  for (int i = 0; i < 64; i++)
    a[i] = foo (a[i]);
}
with
gcc -foffload-options='-fdump-tree-all -fdump-ipa-all' -fdump-tree-all -fdump-ipa-all -O2 -fopenmp a.c -o a
you ought to see the simdclone dump both as a.c.*i.simdclone and a.x*.mkoffload.*i.simdclone
where the former is what is done for the host code (and host fallback),
while the latter is what is done in the offloading lto.
Can't verify it 100% because I have only nvptx-none offloading configured
and in that case pass_omp_simd_clone::gate is disabled in offloading lto
because targetm.simd_clone.compute_vecsize_and_simdlen is NULL for nvptx.
But it is non-NULL for gcn.
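
The gating described above can be modelled in a few lines of plain C.  This
is a sketch only: struct toy_target, toy_gcn_hook and toy_simd_clone_gate are
hypothetical names invented for the example, and the hook's signature and
return value are made up; only the "hook is NULL, so the pass is disabled"
idea comes from the behaviour described here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a target hook table; NOT GCC's targetm.  */
struct toy_target
{
  const char *name;
  /* NULL when the target provides no simd clone support.  The signature
     and the meaning of the return value are invented for this sketch.  */
  int (*compute_vecsize_and_simdlen) (int requested_simdlen);
};

/* Made-up hook for a target that does support simd clones.  */
static int
toy_gcn_hook (int requested_simdlen)
{
  (void) requested_simdlen;
  return 1;
}

/* Toy targets mirroring the two cases discussed: no hook vs. a hook.  */
const struct toy_target toy_nvptx = { "nvptx", NULL };
const struct toy_target toy_gcn = { "gcn", toy_gcn_hook };

/* The pass runs only when the target provides the hook.  */
bool
toy_simd_clone_gate (const struct toy_target *target)
{
  assert (target != NULL);
  return target->compute_vecsize_and_simdlen != NULL;
}
```

In this model toy_simd_clone_gate (&toy_nvptx) is false and
toy_simd_clone_gate (&toy_gcn) is true, which is why the offloading-side
simdclone dump would only be expected with gcn.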

Thus, IMHO it is exactly the pass_omp_simd_clone pass where you want to
implement this auto-simdization discovery, guarded with
#ifdef ACCEL_COMPILER and the new option (which means it will be done
only for gcn and not on the host right now).  And do it at the start of
ipa_omp_simd_clone, before the
  FOR_EACH_FUNCTION (node)
    expand_simd_clones (node);
loop, or, if it is a purely local decision for each function, at the
start of expand_simd_clones with similar guarding.  Punt on functions
with the "noclone" attribute or with !node->definition.  You need to repeat the
  if (node->has_gimple_body_p ())
    node->get_body ();
to get body before you analyze it.
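
Put together, the checks could look roughly like the following toy model in
plain C.  It is only a sketch: struct toy_fn and toy_auto_simd_candidate_p
are hypothetical, the accel_compiler parameter stands in for the
#ifdef ACCEL_COMPILER guard, option_enabled stands in for the new option,
and treating a function without a loadable body as a non-candidate is an
assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the per-function data consulted; NOT cgraph_node.  */
struct toy_fn
{
  bool definition;        /* Models node->definition.  */
  bool has_noclone_attr;  /* Function carries the "noclone" attribute.  */
  bool has_gimple_body;   /* Models node->has_gimple_body_p ().  */
  bool body_loaded;       /* Set once the body has been fetched.  */
};

/* Return true if FN should be analyzed for automatic simd cloning.
   ACCEL_COMPILER models the #ifdef ACCEL_COMPILER guard and
   OPTION_ENABLED the new command-line option.  */
bool
toy_auto_simd_candidate_p (struct toy_fn *fn, bool accel_compiler,
                           bool option_enabled)
{
  assert (fn != NULL);

  /* Only in the offloading compiler and only when requested.  */
  if (!accel_compiler || !option_enabled)
    return false;

  /* Punt on declarations and on functions marked "noclone".  */
  if (!fn->definition || fn->has_noclone_attr)
    return false;

  /* Fetch the body before analyzing it; stands in for node->get_body ().  */
  if (fn->has_gimple_body)
    fn->body_loaded = true;

  /* Assumption of this sketch: without a body there is nothing to analyze.  */
  return fn->body_loaded;
}
```

The guards are checked before the body is fetched, so on the host (or with
the option off) the function is rejected without its body being touched.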

And please put the new functions for such analysis into omp-simd-clone.cc
where they belong.

	Jakub
