nvptx offloading patches [2/n]

public inbox for gcc-patches@gcc.gnu.org

* nvptx offloading patches [2/n]
@ 2014-11-01 11:51 Bernd Schmidt
  2014-11-03 22:23 ` Jeff Law
  2015-02-04 10:56 ` Jakub Jelinek
  0 siblings, 2 replies; 13+ messages in thread
From: Bernd Schmidt @ 2014-11-01 11:51 UTC (permalink / raw)
  To: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 675 bytes --]

LTO has a mechanism not to stream out common nodes that are expected to 
be identical on each run. When using LTO to communicate between 
compilers for different targets, the va_list_type_node and related ones 
must be excluded from this.

Richard B mentioned in a recent mail that the i386 backend uses direct 
comparisons to va_list_type_node. After investigating a bit it seems to 
me that this is not actually a problem: what's being compared is the 
return value of ix86_canonical_va_list_type, which always chooses one of 
va_list_type_node or its ABI variants, so the comparison should hold 
even with this patch.

Bootstrapped and tested on x86_64-linux, ok?

Bernd

[-- Attachment #2: valist.diff --]
[-- Type: text/x-patch, Size: 876 bytes --]

	* tree-streamer.c (preload_common_nodes): Skip TI_VA_LIST_TYPE and
	related nodes.

Index: gcc/tree-streamer.c
===================================================================
--- gcc/tree-streamer.c.orig
+++ gcc/tree-streamer.c
@@ -309,10 +309,14 @@ preload_common_nodes (struct streamer_tr
     record_common_node (cache, sizetype_tab[i]);

   for (i = 0; i < TI_MAX; i++)
-    /* Skip boolean type and constants, they are frontend dependent.  */
+    /* Skip boolean type and constants, they are frontend dependent.
+       Skip va_list types, target dependent and may not survive offloading.  */
     if (i != TI_BOOLEAN_TYPE
 	&& i != TI_BOOLEAN_FALSE
-	&& i != TI_BOOLEAN_TRUE)
+	&& i != TI_BOOLEAN_TRUE
+	&& i != TI_VA_LIST_TYPE
+	&& i != TI_VA_LIST_GPR_COUNTER_FIELD
+	&& i != TI_VA_LIST_FPR_COUNTER_FIELD)
       record_common_node (cache, global_trees[i]);
 }
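
A rough model of what the hunk above does, for readers unfamiliar with the streamer: the preload loop walks a table of well-known nodes and records all of them except a short skip list. The enum, the names and the helper functions below are hypothetical stand-ins, not GCC's real global_trees[] or streamer cache.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a handful of global_trees[] indices;
   the real enum tree_index in GCC is much larger.  */
enum model_index {
  MI_VOID_TYPE,
  MI_BOOLEAN_TYPE,
  MI_VA_LIST_TYPE,
  MI_VA_LIST_GPR_COUNTER_FIELD,
  MI_VA_LIST_FPR_COUNTER_FIELD,
  MI_PTR_TYPE,
  MI_MAX
};

static const char *const model_names[MI_MAX] = {
  "void_type", "boolean_type", "va_list_type",
  "va_list_gpr_counter_field", "va_list_fpr_counter_field", "ptr_type"
};

/* Nonzero if node I is left out of the preloaded cache.  */
static int
model_skip_p (int i)
{
  return (i == MI_BOOLEAN_TYPE
	  || i == MI_VA_LIST_TYPE
	  || i == MI_VA_LIST_GPR_COUNTER_FIELD
	  || i == MI_VA_LIST_FPR_COUNTER_FIELD);
}

/* Fill CACHE with the names of the preloaded nodes and return how
   many were recorded.  */
static size_t
model_preload (const char *cache[MI_MAX])
{
  size_t n = 0;
  for (int i = 0; i < MI_MAX; i++)
    if (!model_skip_p (i))
      cache[n++] = model_names[i];
  assert (n <= MI_MAX);
  return n;
}

/* Return the cache slot holding NAME, or -1 if it was not preloaded.  */
static int
model_cache_find (const char *const *cache, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (cache[i], name) == 0)
      return (int) i;
  return -1;
}
```

In this model a skipped node has no cache slot, so a reference to it would have to be written out in full rather than by slot number.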


* Re: nvptx offloading patches [2/n]
  2014-11-01 11:51 nvptx offloading patches [2/n] Bernd Schmidt
@ 2014-11-03 22:23 ` Jeff Law
  2014-11-14 18:53   ` Bernd Schmidt
  2015-02-04 10:56 ` Jakub Jelinek
  1 sibling, 1 reply; 13+ messages in thread
From: Jeff Law @ 2014-11-03 22:23 UTC (permalink / raw)
  To: Bernd Schmidt, GCC Patches

On 11/01/14 05:51, Bernd Schmidt wrote:
> LTO has a mechanism not to stream out common nodes that are expected to
> be identical on each run. When using LTO to communicate between
> compilers for different targets, the va_list_type_node and related ones
> must be excluded from this.
>
> Richard B mentioned in a recent mail that the i386 backend uses direct
> comparisons to va_list_type_node. After investigating a bit it seems to
> me that this is not actually a problem: what's being compared is the
> return value of ix86_canonical_va_list_type, which always chooses one of
> va_list_type_node or its ABI variants, so the comparison should hold
> even with this patch.
>
> Bootstrapped and tested on x86_64-linux, ok?
Would like the SuSE guys to chime in here since they know more about the 
streamer than I.

jeff


* Re: nvptx offloading patches [2/n]
  2014-11-03 22:23 ` Jeff Law
@ 2014-11-14 18:53   ` Bernd Schmidt
  0 siblings, 0 replies; 13+ messages in thread
From: Bernd Schmidt @ 2014-11-14 18:53 UTC (permalink / raw)
  To: Jeff Law, GCC Patches

On 11/03/2014 11:23 PM, Jeff Law wrote:
> On 11/01/14 05:51, Bernd Schmidt wrote:
>> LTO has a mechanism not to stream out common nodes that are expected to
>> be identical on each run. When using LTO to communicate between
>> compilers for different targets, the va_list_type_node and related ones
>> must be excluded from this.
>>
>> Richard B mentioned in a recent mail that the i386 backend uses direct
>> comparisons to va_list_type_node. After investigating a bit it seems to
>> me that this is not actually a problem: what's being compared is the
>> return value of ix86_canonical_va_list_type, which always chooses one of
>> va_list_type_node or its ABI variants, so the comparison should hold
>> even with this patch.
>>
>> Bootstrapped and tested on x86_64-linux, ok?
> Would like the SuSE guys to chime in here since they know more about the
> streamer than I.

Ping.


Bernd



* Re: nvptx offloading patches [2/n]
  2014-11-01 11:51 nvptx offloading patches [2/n] Bernd Schmidt
  2014-11-03 22:23 ` Jeff Law
@ 2015-02-04 10:56 ` Jakub Jelinek
  2015-02-04 10:59   ` Jakub Jelinek
  2015-02-09 10:16   ` Richard Biener
  1 sibling, 2 replies; 13+ messages in thread
From: Jakub Jelinek @ 2015-02-04 10:56 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches

On Sat, Nov 01, 2014 at 12:51:32PM +0100, Bernd Schmidt wrote:
> LTO has a mechanism not to stream out common nodes that are expected to be
> identical on each run. When using LTO to communicate between compilers for
> different targets, the va_list_type_node and related ones must be excluded
> from this.
> 
> Richard B mentioned in a recent mail that the i386 backend uses direct
> comparisons to va_list_type_node. After investigating a bit it seems to me
> that this is not actually a problem: what's being compared is the return
> value of ix86_canonical_va_list_type, which always chooses one of
> va_list_type_node or its ABI variants, so the comparison should hold even
> with this patch.
> 
> Bootstrapped and tested on x86_64-linux, ok?

How can the offloading of functions using va_start/va_end/va_arg work,
until we apply (in GCC 6?) Michael's patches and extend them - make
all those 3 internal functions lowered only after IPA?

I mean, nvptx supposedly contains different va_list type (from quick glance
it uses void *, while e.g. x86_64 uses a struct [1]), and we gimplify it
early, so for GCC 5 the only option is IMHO to refuse to compile (sorry?)
when streaming functions that use the host va_list type.

For GCC 6, presumably if it is lowered late, if the host va_list would be
at least as big as target va_list, we could stick stuff in there, or rewrite
to the target va_list.  Still, if e.g. va_list is embedded in structures, or
used in global vars, we'd need to pad the structures or something.

	Jakub
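
To make the size argument above concrete, here is a small sketch. It assumes the host va_list looks like the x86_64 SysV element (two offsets and two pointers) and the offload target's va_list is a single pointer, as described in the message. Both types and both helpers are illustrative models, not GCC's internal representation.

```c
#include <assert.h>
#include <stddef.h>

/* Model of the x86_64 SysV va_list element.  Field names follow the
   psABI; this is an illustration only.  */
struct host_va_list_model {
  unsigned int gp_offset;
  unsigned int fp_offset;
  void *overflow_arg_area;
  void *reg_save_area;
};

/* Model of a target whose va_list is a single pointer.  */
typedef void *target_va_list_model;

/* Nonzero if a target va_list could be stored in the space the host
   reserved for its own va_list.  */
static int
target_fits_in_host_p (size_t host_size, size_t target_size)
{
  return target_size <= host_size;
}

/* Bytes of padding the target would have to add after its va_list so
   that an enclosing structure keeps the host's layout.  Assumes the
   target va_list is not larger than the host one.  */
static size_t
target_padding (size_t host_size, size_t target_size)
{
  assert (target_fits_in_host_p (host_size, target_size));
  return host_size - target_size;
}
```

With typical LP64 sizes (24 bytes for the host model, 8 for the pointer) the target would need 16 bytes of padding wherever a va_list is embedded in a structure or a global variable. Whether padding is the right answer is left open in the message.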


* Re: nvptx offloading patches [2/n]
  2015-02-04 10:56 ` Jakub Jelinek
@ 2015-02-04 10:59   ` Jakub Jelinek
  2015-02-09 10:16   ` Richard Biener
  1 sibling, 0 replies; 13+ messages in thread
From: Jakub Jelinek @ 2015-02-04 10:59 UTC (permalink / raw)
  To: Bernd Schmidt, Richard Biener; +Cc: GCC Patches

On Wed, Feb 04, 2015 at 11:55:54AM +0100, Jakub Jelinek wrote:
> On Sat, Nov 01, 2014 at 12:51:32PM +0100, Bernd Schmidt wrote:
> > LTO has a mechanism not to stream out common nodes that are expected to be
> > identical on each run. When using LTO to communicate between compilers for
> > different targets, the va_list_type_node and related ones must be excluded
> > from this.
> > 
> > Richard B mentioned in a recent mail that the i386 backend uses direct
> > comparisons to va_list_type_node. After investigating a bit it seems to me
> > that this is not actually a problem: what's being compared is the return
> > value of ix86_canonical_va_list_type, which always chooses one of
> > va_list_type_node or its ABI variants, so the comparison should hold even
> > with this patch.
> > 
> > Bootstrapped and tested on x86_64-linux, ok?
> 
> How can the offloading of functions using va_start/va_end/va_arg work,
> until we apply (in GCC 6?) Michael's patches and extend them - make
> all those 3 internal functions lowered only after IPA?
> 
> I mean, nvptx supposedly contains different va_list type (from quick glance
> it uses void *, while e.g. x86_64 uses a struct [1]), and we gimplify it
> early, so for GCC 5 the only option is IMHO to refuse to compile (sorry?)
> when streaming functions that use the host va_list type.
> 
> For GCC 6, presumably if it is lowered late, if the host va_list would be
> at least as big as target va_list, we could stick stuff in there, or rewrite
> to the target va_list.  Still, if e.g. va_list is embedded in structures, or
> used in global vars, we'd need to pad the structures or something.

That said, if your patch doesn't break normal LTO, I agree it doesn't make
much sense to 
> 
> 	Jakub

	Jakub


* Re: nvptx offloading patches [2/n]
  2015-02-04 10:56 ` Jakub Jelinek
  2015-02-04 10:59   ` Jakub Jelinek
@ 2015-02-09 10:16   ` Richard Biener
  2015-02-17 16:37     ` Bernd Schmidt
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Biener @ 2015-02-09 10:16 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Schmidt, GCC Patches

On Wed, Feb 4, 2015 at 11:55 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Sat, Nov 01, 2014 at 12:51:32PM +0100, Bernd Schmidt wrote:
>> LTO has a mechanism not to stream out common nodes that are expected to be
>> identical on each run. When using LTO to communicate between compilers for
>> different targets, the va_list_type_node and related ones must be excluded
>> from this.
>>
>> Richard B mentioned in a recent mail that the i386 backend uses direct
>> comparisons to va_list_type_node. After investigating a bit it seems to me
>> that this is not actually a problem: what's being compared is the return
>> value of ix86_canonical_va_list_type, which always chooses one of
>> va_list_type_node or its ABI variants, so the comparison should hold even
>> with this patch.
>>
>> Bootstrapped and tested on x86_64-linux, ok?
>
> How can the offloading of functions using va_start/va_end/va_arg work,
> until we apply (in GCC 6?) Michael's patches and extend them - make
> all those 3 internal functions lowered only after IPA?
>
> I mean, nvptx supposedly contains different va_list type (from quick glance
> it uses void *, while e.g. x86_64 uses a struct [1]), and we gimplify it
> early, so for GCC 5 the only option is IMHO to refuse to compile (sorry?)
> when streaming functions that use the host va_list type.
>
> For GCC 6, presumably if it is lowered late, if the host va_list would be
> at least as big as target va_list, we could stick stuff in there, or rewrite
> to the target va_list.  Still, if e.g. va_list is embedded in structures, or
> used in global vars, we'd need to pad the structures or something.

In principle I am always happy these days to preload fewer nodes.

Thus, if your patch survives LTO bootstrap and you can still LTO
a TU with ms_abi valist functions successfully (not sure if that's
exercised in the testsuite) then it is fine.

Note that I _did_ run into issues with exempting nodes from
preloading because of pointer comparisons.  The issue is that
types created by the backends and the middle-end do not
participate in the type merging done by LTO.  Thus the actual
issue may be not on x86 (because it implements
the canonical_va_list_type hook) but on other targets that
end up using std_canonical_va_list_type.

Thanks,
Richard.
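
The pointer-comparison concern can be sketched as follows. This assumes a simplified reader that either hands back the compiler's own node (the preloaded case) or materialises a fresh copy (the streamed case, with no later merging). The struct and all three functions are hypothetical and much simpler than GCC's tree nodes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily simplified type node.  */
struct type_model {
  const char *name;
  unsigned int size;
};

/* Structural comparison: same name and same size.  */
static int
same_structure_p (const struct type_model *a, const struct type_model *b)
{
  return a->size == b->size && strcmp (a->name, b->name) == 0;
}

/* Identity comparison, the kind of check done when code compares a
   type directly against va_list_type_node.  */
static int
same_node_p (const struct type_model *a, const struct type_model *b)
{
  return a == b;
}

/* Model of reading a type reference.  If PRELOADED, the reader
   returns the compiler's OWN node; otherwise it fills in COPY and
   returns that.  */
static const struct type_model *
stream_in_type (const struct type_model *own, int preloaded,
		struct type_model *copy)
{
  if (preloaded)
    return own;
  *copy = *own;
  return copy;
}
```

In this model the streamed copy is structurally identical to the original but fails the identity check, which is the failure mode a target relying on pointer equality would see.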

>         Jakub


* Re: nvptx offloading patches [2/n]
  2015-02-09 10:16   ` Richard Biener
@ 2015-02-17 16:37     ` Bernd Schmidt
  2015-02-17 17:10       ` Jakub Jelinek
  0 siblings, 1 reply; 13+ messages in thread
From: Bernd Schmidt @ 2015-02-17 16:37 UTC (permalink / raw)
  To: Richard Biener, Jakub Jelinek; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 854 bytes --]

On 02/09/2015 11:16 AM, Richard Biener wrote:

> Thus, if your patch survives LTO bootstrap and you can still LTO
> a TU with ms_abi valist functions successfully (not sure if that's
> exercised in the testsuite) then it is fine.

I've now done the LTO bootstrap, and the program below compiled with 
-flto still works. Does that seem sufficient?

> Note that I _did_ run into issues with exempting nodes from
> preloading because of pointer comparisons.  The issue is that
> types created by the backends and the middle-end do not
> participate in the type merging done by LTO.  Thus the actual
> issue may be not on x86 (because it implements
> the canonical_va_list_type hook) but on other targets that
> end up using std_canonical_va_list_type.

Hmm. That doesn't really make me want to commit it at this stage of the 
development process.


Bernd


[-- Attachment #2: valist.c --]
[-- Type: text/plain, Size: 919 bytes --]

#include <stdarg.h>
#include <cross-stdarg.h>
#include <stdio.h>
#include <stdlib.h>

static int x;
static const char *y;

__attribute__((noinline,noclone)) static void verror_msg(va_list p)
{
  x = va_arg (p, int);
  y = va_arg (p, const char *);
}

__attribute__((noinline,noclone)) static void err(int errnum, const char *s, ...)
{
  va_list p;

  va_start(p, s);
  verror_msg(p);
  va_end(p);
}

__attribute__((noinline,noclone,ms_abi)) static void verror_msg2(ms_va_list p)
{
  x = va_arg (p, int);
  y = va_arg (p, const char *);
}

__attribute__((noinline,noclone,ms_abi)) static void err2(int errnum, const char *s, ...)
{
  ms_va_list p;

  __ms_va_start (p, s);
  verror_msg2 (p);
  __ms_va_end(p);
}

int main ()
{ 
  const char *p1 = "t1";
  const char *p2 = "t2";
  err (0, "test", 3, p1);
  if (x != 3 || y != p1)
    abort ();
  err2 (0, "ms", 2, p2);
  if (x != 2 || y != p2)
    abort ();
  exit(0);
}


* Re: nvptx offloading patches [2/n]
  2015-02-17 16:37     ` Bernd Schmidt
@ 2015-02-17 17:10       ` Jakub Jelinek
  2015-02-17 20:55         ` Bernd Schmidt
  0 siblings, 1 reply; 13+ messages in thread
From: Jakub Jelinek @ 2015-02-17 17:10 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Richard Biener, GCC Patches

On Tue, Feb 17, 2015 at 05:37:04PM +0100, Bernd Schmidt wrote:
> On 02/09/2015 11:16 AM, Richard Biener wrote:
> 
> >Thus, if your patch survives LTO bootstrap and you can still LTO
> >a TU with ms_abi valist functions successfully (not sure if that's
> >exercised in the testsuite) then it is fine.
> 
> I've now done the LTO bootstrap, and the program below compiled with -flto
> still works. Does that seem sufficient?

E.g. va_list_gpr_counter_field and va_list_fpr_counter_field are compared
for equality though, for va_list_type_node TYPE_MAIN_VARIANT is compared for
equality.  Otherwise the stdarg pass might misbehave.

What exact testcase are you trying to fix with this patch, and how do you
think offloading of code using va_list can work?

	Jakub


* Re: nvptx offloading patches [2/n]
  2015-02-17 17:10       ` Jakub Jelinek
@ 2015-02-17 20:55         ` Bernd Schmidt
  2015-02-19 13:10           ` Jakub Jelinek
  0 siblings, 1 reply; 13+ messages in thread
From: Bernd Schmidt @ 2015-02-17 20:55 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, GCC Patches

On 02/17/2015 06:10 PM, Jakub Jelinek wrote:
>
> What exact testcase are you trying to fix with this patch, and how do you
> think offloading of code using va_list can work?

The exact testcase is any offloaded program - streaming in lto will 
crash if there is a mismatch in these preloaded nodes.

For OpenACC programs using va_list - I don't expect them to work at all. 
I don't believe the spec considers such issues, and ptx isn't expected 
to support variadic functions in the first place ("The current version 
of PTX does not support variadic functions" is what the spec has to say; 
the gcc port overachieves a little by implementing them anyway).

Bernd


* Re: nvptx offloading patches [2/n]
  2015-02-17 20:55         ` Bernd Schmidt
@ 2015-02-19 13:10           ` Jakub Jelinek
  2015-02-19 13:46             ` Richard Biener
  2015-02-20  9:40             ` Offloading vs va_list (was: nvptx offloading patches [2/n]) Thomas Schwinge
  0 siblings, 2 replies; 13+ messages in thread
From: Jakub Jelinek @ 2015-02-19 13:10 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Richard Biener, GCC Patches

On Tue, Feb 17, 2015 at 09:55:32PM +0100, Bernd Schmidt wrote:
> On 02/17/2015 06:10 PM, Jakub Jelinek wrote:
> >
> >What exact testcase are you trying to fix with this patch, and how do you
> >think offloading of code using va_list can work?
> 
> The exact testcase is any offloaded program - streaming in lto will crash if
> there is a mismatch in these preloaded nodes.
> 
> For OpenACC programs using va_list - I don't expect them to work at all. I
> don't believe the spec considers such issues, and ptx isn't expected to
> support variadic functions in the first place ("The current version of PTX
> does not support variadic functions" is what the spec has to say; the gcc
> port overachieves a little by implementing them anyway).

How do you support printf etc.?  Those are all varargs functions.

Anyway, could following untested patch be used as a temporary hack?
It might pessimize for GCC5 slightly the intelmic offloading, but I hope
the way forward is the stdarg late lowering for GCC 6.

Richard on IRC said it might be better for the lto_stream_offload_p
path to whitelist nodes it has to preload rather than blacklist the ones
that it doesn't, otherwise e.g. if you try to offload from say x86_64-mingw
with 32-bit long, but 64-bit pointers to 64-bit intelmic, preloading
the long_type_node will certainly break lots of things, while not preloading
them would only be problematic for builtins, we'll need some pass over the
builtins in the IL in any case, to find out if they are compatible or not,
adjust if needed and give up otherwise.  But I'd hope it can be worked on
incrementally, if this patch (plus the approved nvptx offloading patches,
plus mode_table streaming) makes the nvptx offloading work.
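
One possible reading of the whitelist idea, sketched under assumptions: the blacklist preloads everything except known problem nodes, while the whitelist preloads a node only when the host and target data models agree on it. The abi_model struct, the node kinds and both predicates are invented for illustration and are not taken from any patch in this thread.

```c
#include <assert.h>

/* Hypothetical summary of an ABI's data model, in bits.  */
struct abi_model {
  unsigned int long_bits;
  unsigned int pointer_bits;
};

/* Hypothetical node kinds standing in for a few common nodes.  */
enum node_kind { NK_VOID, NK_LONG, NK_POINTER, NK_VA_LIST, NK_MAX };

/* Blacklist: preload everything except nodes already known to differ.  */
static int
blacklist_preload_p (enum node_kind k)
{
  return k != NK_VA_LIST;
}

/* Whitelist: preload only nodes known to match between HOST and
   TARGET; anything not listed is streamed normally.  */
static int
whitelist_preload_p (enum node_kind k, const struct abi_model *host,
		     const struct abi_model *target)
{
  switch (k)
    {
    case NK_VOID:
      return 1;
    case NK_LONG:
      return host->long_bits == target->long_bits;
    case NK_POINTER:
      return host->pointer_bits == target->pointer_bits;
    default:
      return 0;
    }
}
```

For a host with 32-bit long and 64-bit pointers offloading to a target with 64-bit long, the blacklist still preloads the long node, while the whitelist declines to.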

2015-02-19  Bernd Schmidt  <bernds@codesourcery.com>
	    Jakub Jelinek  <jakub@redhat.com>

	* tree-streamer.c (preload_common_nodes): Don't preload
	TI_VA_LIST* for offloading.
	* tree-stdarg.c (pass_stdarg::gate): Disable for ACCEL_COMPILER
	in_lto_p.

--- gcc/tree-streamer.c.jj	2015-02-18 12:36:20.000000000 +0100
+++ gcc/tree-streamer.c	2015-02-19 13:57:26.089626006 +0100
@@ -342,7 +342,14 @@ preload_common_nodes (struct streamer_tr
 	&& i != TI_TARGET_OPTION_DEFAULT
 	&& i != TI_TARGET_OPTION_CURRENT
 	&& i != TI_CURRENT_TARGET_PRAGMA
-	&& i != TI_CURRENT_OPTIMIZE_PRAGMA)
+	&& i != TI_CURRENT_OPTIMIZE_PRAGMA
+	/* Skip va_list* related nodes if offloading.  For native LTO
+	   we want them to be merged for the stdarg pass, for offloading
+	   they might not be identical between host and offloading target.  */
+	&& (!lto_stream_offload_p
+	    || (i != TI_VA_LIST_TYPE
+		&& i != TI_VA_LIST_GPR_COUNTER_FIELD
+		&& i != TI_VA_LIST_FPR_COUNTER_FIELD)))
       record_common_node (cache, global_trees[i]);
 }
 
--- gcc/tree-stdarg.c.jj	2015-02-18 22:55:52.000000000 +0100
+++ gcc/tree-stdarg.c	2015-02-19 14:00:11.882905823 +0100
@@ -705,6 +705,13 @@ public:
   virtual bool gate (function *fun)
     {
       return (flag_stdarg_opt
+#ifdef ACCEL_COMPILER
+	      /* Disable for GCC5 in the offloading compilers, as
+		 va_list and gpr/fpr counter fields are not merged.
+		 In GCC6 when stdarg is lowered late this shouldn't be
+		 an issue.  */
+	      && !in_lto_p
+#endif
 	      /* This optimization is only for stdarg functions.  */
 	      && fun->stdarg != 0);
     }

	Jakub
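
The two conditions added by the patch above can be restated as standalone predicates. This is a sketch, not the GCC code: the enum is a hypothetical subset of the tree indices, and ACCEL_COMPILER, which is a compile-time macro in the patch, is modelled here as an ordinary run-time argument.

```c
#include <assert.h>

/* Hypothetical subset of the tree indices touched by the patch.  */
enum offload_index {
  OI_VOID_TYPE,
  OI_VA_LIST_TYPE,
  OI_VA_LIST_GPR_COUNTER_FIELD,
  OI_VA_LIST_FPR_COUNTER_FIELD,
  OI_MAX
};

/* Nonzero for the three va_list related nodes.  */
static int
va_list_related_p (int i)
{
  return (i == OI_VA_LIST_TYPE
	  || i == OI_VA_LIST_GPR_COUNTER_FIELD
	  || i == OI_VA_LIST_FPR_COUNTER_FIELD);
}

/* Mirrors the added streamer condition: va_list nodes are skipped
   only when streaming for offloading.  */
static int
preload_p (int i, int stream_offload_p)
{
  assert (i >= 0 && i < OI_MAX);
  return !stream_offload_p || !va_list_related_p (i);
}

/* Mirrors the stdarg gate: in an offloading compiler reading LTO
   input the pass is disabled; otherwise it runs for stdarg functions
   when the optimization is enabled.  */
static int
stdarg_gate_p (int stdarg_opt, int accel_compiler, int in_lto,
	       int fun_is_stdarg)
{
  if (!stdarg_opt)
    return 0;
  if (accel_compiler && in_lto)
    return 0;
  return fun_is_stdarg != 0;
}
```

Native LTO is unaffected in this model: with the offload flag clear, every node is still preloaded and the stdarg gate behaves as before.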


* Re: nvptx offloading patches [2/n]
  2015-02-19 13:10           ` Jakub Jelinek
@ 2015-02-19 13:46             ` Richard Biener
  2015-02-20  9:40             ` Offloading vs va_list (was: nvptx offloading patches [2/n]) Thomas Schwinge
  1 sibling, 0 replies; 13+ messages in thread
From: Richard Biener @ 2015-02-19 13:46 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Schmidt, GCC Patches

On Thu, Feb 19, 2015 at 2:09 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Feb 17, 2015 at 09:55:32PM +0100, Bernd Schmidt wrote:
>> On 02/17/2015 06:10 PM, Jakub Jelinek wrote:
>> >
>> >What exact testcase are you trying to fix with this patch, and how do you
>> >think offloading of code using va_list can work?
>>
>> The exact testcase is any offloaded program - streaming in lto will crash if
>> there is a mismatch in these preloaded nodes.
>>
>> For OpenACC programs using va_list - I don't expect them to work at all. I
>> don't believe the spec considers such issues, and ptx isn't expected to
>> support variadic functions in the first place ("The current version of PTX
>> does not support variadic functions" is what the spec has to say; the gcc
>> port overachieves a little by implementing them anyway).
>
> How do you support printf etc.?  Those are all varargs functions.
>
> Anyway, could following untested patch be used as a temporary hack?
> It might pessimize for GCC5 slightly the intelmic offloading, but I hope
> the way forward is the stdarg late lowering for GCC 6.
>
> Richard on IRC said it might be better for the lto_stream_offload_p
> path to whitelist nodes it has to preload rather than blacklist the ones
> that it doesn't, otherwise e.g. if you try to offload from say x86_64-mingw
> with 32-bit long, but 64-bit pointers to 64-bit intelmic, preloading
> the long_type_node will certainly break lots of things, while not preloading
> them would only be problematic for builtins, we'll need some pass over the
> builtins in the IL in any case, to find out if they are compatible or not,
> adjust if needed and give up otherwise.  But I'd hope it can be worked on
> incrementally, if this patch (plus the approved nvptx offloading patches,
> plus mode_table streaming) makes the nvptx offloading work.

The patch works for me if it helps anything.

Thanks,
Richard.

> 2015-02-19  Bernd Schmidt  <bernds@codesourcery.com>
>             Jakub Jelinek  <jakub@redhat.com>
>
>         * tree-streamer.c (preload_common_nodes): Don't preload
>         TI_VA_LIST* for offloading.
>         * tree-stdarg.c (pass_stdarg::gate): Disable for ACCEL_COMPILER
>         in_lto_p.
>
> --- gcc/tree-streamer.c.jj      2015-02-18 12:36:20.000000000 +0100
> +++ gcc/tree-streamer.c 2015-02-19 13:57:26.089626006 +0100
> @@ -342,7 +342,14 @@ preload_common_nodes (struct streamer_tr
>         && i != TI_TARGET_OPTION_DEFAULT
>         && i != TI_TARGET_OPTION_CURRENT
>         && i != TI_CURRENT_TARGET_PRAGMA
> -       && i != TI_CURRENT_OPTIMIZE_PRAGMA)
> +       && i != TI_CURRENT_OPTIMIZE_PRAGMA
> +       /* Skip va_list* related nodes if offloading.  For native LTO
> +          we want them to be merged for the stdarg pass, for offloading
> +          they might not be identical between host and offloading target.  */
> +       && (!lto_stream_offload_p
> +           || (i != TI_VA_LIST_TYPE
> +               && i != TI_VA_LIST_GPR_COUNTER_FIELD
> +               && i != TI_VA_LIST_FPR_COUNTER_FIELD)))
>        record_common_node (cache, global_trees[i]);
>  }
>
> --- gcc/tree-stdarg.c.jj        2015-02-18 22:55:52.000000000 +0100
> +++ gcc/tree-stdarg.c   2015-02-19 14:00:11.882905823 +0100
> @@ -705,6 +705,13 @@ public:
>    virtual bool gate (function *fun)
>      {
>        return (flag_stdarg_opt
> +#ifdef ACCEL_COMPILER
> +             /* Disable for GCC5 in the offloading compilers, as
> +                va_list and gpr/fpr counter fields are not merged.
> +                In GCC6 when stdarg is lowered late this shouldn't be
> +                an issue.  */
> +             && !in_lto_p
> +#endif
>               /* This optimization is only for stdarg functions.  */
>               && fun->stdarg != 0);
>      }
>
>         Jakub


* Offloading vs va_list (was: nvptx offloading patches [2/n])
  2015-02-19 13:10           ` Jakub Jelinek
  2015-02-19 13:46             ` Richard Biener
@ 2015-02-20  9:40             ` Thomas Schwinge
  2015-02-20  9:42               ` Jakub Jelinek
  1 sibling, 1 reply; 13+ messages in thread
From: Thomas Schwinge @ 2015-02-20  9:40 UTC (permalink / raw)
  To: Jakub Jelinek, Bernd Schmidt, Richard Biener; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1748 bytes --]

Hi!

On Thu, 19 Feb 2015 14:09:29 +0100, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Feb 17, 2015 at 09:55:32PM +0100, Bernd Schmidt wrote:
> > On 02/17/2015 06:10 PM, Jakub Jelinek wrote:
> > >
> > >What exact testcase are you trying to fix with this patch, and how do you
> > >think offloading of code using va_list can work?
> > 
> > The exact testcase is any offloaded program - streaming in lto will crash if
> > there is a mismatch in these preloaded nodes.

> could following untested patch be used as a temporary hack?

Thanks!  I'll leave the approval to Bernd, but can already report that
this works fine in my testing, for intelmic and nvptx offloading.

> It might pessimize for GCC5 slightly the intelmic offloading, but I hope
> the way forward is the stdarg late lowering for GCC 6.

> Richard on IRC said it might be better for the lto_stream_offload_p
> path to whitelist nodes it has to preload rather than blacklist the ones
> that it doesn't, otherwise e.g. if you try to offload from say x86_64-mingw
> with 32-bit long, but 64-bit pointers to 64-bit intelmic

That's a good consideration (for the future), but we're not currently
supporting the case of offloading with non-matching ABIs (data types).

> preloading
> the long_type_node will certainly break lots of things, while not preloading
> them would only be problematic for builtins, we'll need some pass over the
> builtins in the IL in any case, to find out if they are compatible or not,
> adjust if needed and give up otherwise.  But I'd hope it can be worked on
> incrementally, if this patch (plus the approved nvptx offloading patches,
> plus mode_table streaming) makes the nvptx offloading work.


Grüße,
 Thomas

[-- Attachment #2: Type: application/pgp-signature, Size: 472 bytes --]


* Re: Offloading vs va_list (was: nvptx offloading patches [2/n])
  2015-02-20  9:40             ` Offloading vs va_list (was: nvptx offloading patches [2/n]) Thomas Schwinge
@ 2015-02-20  9:42               ` Jakub Jelinek
  0 siblings, 0 replies; 13+ messages in thread
From: Jakub Jelinek @ 2015-02-20  9:42 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Bernd Schmidt, Richard Biener, GCC Patches

On Fri, Feb 20, 2015 at 10:33:38AM +0100, Thomas Schwinge wrote:
> On Thu, 19 Feb 2015 14:09:29 +0100, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Tue, Feb 17, 2015 at 09:55:32PM +0100, Bernd Schmidt wrote:
> > > On 02/17/2015 06:10 PM, Jakub Jelinek wrote:
> > > >
> > > >What exact testcase are you trying to fix with this patch, and how do you
> > > >think offloading of code using va_list can work?
> > > 
> > > The exact testcase is any offloaded program - streaming in lto will crash if
> > > there is a mismatch in these preloaded nodes.
> 
> > could following untested patch be used as a temporary hack?
> 
> Thanks!  I'll leave the approval to Bernd, but can already report that
> this works fine in my testing, for intelmic and nvptx offloading.

Richard already approved it if it helps anything.  So, if your testing
suggests it helps something, I'll apply it.

The mode_table patch is still awaiting approval, and Bernd's approved patches
aren't applied, ditto your toplevel configure patch.

	Jakub


