public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Fix PR58115
@ 2013-11-03 10:26 Bernd Edlinger
  2013-11-19 12:11 ` [PING] " Bernd Edlinger
                   ` (3 more replies)
  0 siblings, 4 replies; 39+ messages in thread
From: Bernd Edlinger @ 2013-11-03 10:26 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener, Jakub Jelinek

[-- Attachment #1: Type: text/plain, Size: 588 bytes --]

Hello,

on i686-pc-linux-gnu the test case gcc.target/i386/intrinsics_4.c fails because of
an internal compiler error, see PR58155.

The reason for this is that the optab CODE_FOR_movv8sf is disabled when it
should be enabled.

This happens because invoke_set_current_function_hook changes the pointer
"this_fn_optabs" after targetm.set_current_function has already modified the
optab to enable/disable CODE_FOR_movv8sf, leaving that optab entry
in an undefined state.

Boot-strapped and regression-tested on i686-pc-linux-gnu.

Ok for trunk?

Regards
Bernd. 		 	   		  

[-- Attachment #2: changelog-pr58115.txt --]
[-- Type: text/plain, Size: 190 bytes --]

2013-11-03  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	PR target/58115
	* function.c (invoke_set_current_function_hook): Call
	targetm.set_current_function after setting this_fn_optabs.


[-- Attachment #3: patch-pr58115.diff --]
[-- Type: application/octet-stream, Size: 669 bytes --]

Index: gcc/function.c
===================================================================
--- gcc/function.c	(revision 204101)
+++ gcc/function.c	(working copy)
@@ -4426,7 +4426,6 @@ invoke_set_current_function_hook (tree fndecl)
 	  cl_optimization_restore (&global_options, TREE_OPTIMIZATION (opts));
 	}
 
-      targetm.set_current_function (fndecl);
       this_fn_optabs = this_target_optabs;
 
       if (opts != optimization_default_node)
@@ -4436,6 +4435,8 @@ invoke_set_current_function_hook (tree fndecl)
 	    this_fn_optabs = (struct target_optabs *)
 	      TREE_OPTIMIZATION_OPTABS (opts);
 	}
+
+      targetm.set_current_function (fndecl);
     }
 }
 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PING] Fix PR58115
  2013-11-03 10:26 [PATCH] Fix PR58115 Bernd Edlinger
@ 2013-11-19 12:11 ` Bernd Edlinger
  2013-11-19 14:59 ` [PATCH] " H.J. Lu
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 39+ messages in thread
From: Bernd Edlinger @ 2013-11-19 12:11 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener, Jakub Jelinek

PING...

this patch still needs review: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00133.html

Thanks.


> on i686-pc-linux-gnu the test case gcc.target/i386/intrinsics_4.c fails because of
> an internal compiler error, see PR58155.
>
> The reason for this is that the optab CODE_FOR_movv8sf is disabled when it
> should be enabled.
>
> This happens because invoke_set_current_function_hook changes the pointer
> "this_fn_optabs" after targetm.set_current_function has already modified the
> optab to enable/disable CODE_FOR_movv8sf, leaving that optab entry
> in an undefined state.
>
> Boot-strapped and regression-tested on i686-pc-linux-gnu.
>
> Ok for trunk?
>
> Regards
> Bernd. 		 	   		  

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2013-11-03 10:26 [PATCH] Fix PR58115 Bernd Edlinger
  2013-11-19 12:11 ` [PING] " Bernd Edlinger
@ 2013-11-19 14:59 ` H.J. Lu
  2013-11-19 15:02   ` Bernd Edlinger
  2013-12-02 15:08 ` Richard Biener
  2014-01-06 10:27 ` Richard Sandiford
  3 siblings, 1 reply; 39+ messages in thread
From: H.J. Lu @ 2013-11-19 14:59 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: gcc-patches, Richard Biener, Jakub Jelinek

On Sun, Nov 3, 2013 at 2:25 AM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
> Hello,
>
> on i686-pc-linux-gnu the test case gcc.target/i386/intrinsics_4.c fails because of
> an internal compiler error, see PR58155.
>
> The reason for this is that the optab CODE_FOR_movv8sf is disabled when it
> should be enabled.
>
> This happens because invoke_set_current_function_hook changes the pointer
> "this_fn_optabs" after targetm.set_current_function has already modified the
> optab to enable/disable CODE_FOR_movv8sf, leaving that optab entry
> in an undefined state.
>
> Boot-strapped and regression-tested on i686-pc-linux-gnu.
>
> Ok for trunk?
>
> Regards
> Bernd.

Are you sure your patch is for

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58155


-- 
H.J.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCH] Fix PR58115
  2013-11-19 14:59 ` [PATCH] " H.J. Lu
@ 2013-11-19 15:02   ` Bernd Edlinger
  0 siblings, 0 replies; 39+ messages in thread
From: Bernd Edlinger @ 2013-11-19 15:02 UTC (permalink / raw)
  To: H.J. Lu; +Cc: gcc-patches, Richard Biener, Jakub Jelinek

On Tue, 19 Nov 2013 06:21:22, H.J. Lu wrote:
>
> On Sun, Nov 3, 2013 at 2:25 AM, Bernd Edlinger
> <bernd.edlinger@hotmail.de> wrote:
>> Hello,
>>
>> on i686-pc-linux-gnu the test case gcc.target/i386/intrinsics_4.c fails because of
>> an internal compiler error, see PR58155.
>>
>> The reason for this is that the optab CODE_FOR_movv8sf is disabled when it
>> should be enabled.
>>
>> This happens because invoke_set_current_function_hook changes the pointer
>> "this_fn_optabs" after targetm.set_current_function has already modified the
>> optab to enable/disable CODE_FOR_movv8sf, leaving that optab entry
>> in an undefined state.
>>
>> Boot-strapped and regression-tested on i686-pc-linux-gnu.
>>
>> Ok for trunk?
>>
>> Regards
>> Bernd.
>
> Are you sure your patch is for
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58155

Oh. Sorry.

I meant http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58115


Thanks
Bernd.

>
>
> --
> H.J. 		 	   		  

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2013-11-03 10:26 [PATCH] Fix PR58115 Bernd Edlinger
  2013-11-19 12:11 ` [PING] " Bernd Edlinger
  2013-11-19 14:59 ` [PATCH] " H.J. Lu
@ 2013-12-02 15:08 ` Richard Biener
  2014-01-06 10:27 ` Richard Sandiford
  3 siblings, 0 replies; 39+ messages in thread
From: Richard Biener @ 2013-12-02 15:08 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: gcc-patches, Jakub Jelinek

On Sun, Nov 3, 2013 at 11:25 AM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
> Hello,
>
> on i686-pc-linux-gnu the test case gcc.target/i386/intrinsics_4.c fails because of
> an internal compiler error, see PR58155.
>
> The reason for this is that the optab CODE_FOR_movv8sf is disabled when it
> should be enabled.
>
> This happens because invoke_set_current_function_hook changes the pointer
> "this_fn_optabs" after targetm.set_current_function has already modified the
> optab to enable/disable CODE_FOR_movv8sf, leaving that optab entry
> in an undefined state.
>
> Boot-strapped and regression-tested on i686-pc-linux-gnu.
>
> Ok for trunk?

Ok.

Thanks,
Richard.

> Regards
> Bernd.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2013-11-03 10:26 [PATCH] Fix PR58115 Bernd Edlinger
                   ` (2 preceding siblings ...)
  2013-12-02 15:08 ` Richard Biener
@ 2014-01-06 10:27 ` Richard Sandiford
  2014-01-06 11:21   ` Jakub Jelinek
  2014-01-07 20:11   ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115) Jakub Jelinek
  3 siblings, 2 replies; 39+ messages in thread
From: Richard Sandiford @ 2014-01-06 10:27 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: gcc-patches, Richard Biener, Jakub Jelinek

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
> Hello,
>
> on i686-pc-linux-gnu the test case gcc.target/i386/intrinsics_4.c fails
> because of
> an internal compiler error, see PR58155.
>
> The reason for this is that the optab CODE_FOR_movv8sf is disabled when it
> should be enabled.
>
> This happens because invoke_set_current_function_hook changes the pointer
> "this_fn_optabs" after targetm.set_current_function has already modified the
> optab to enable/disable CODE_FOR_movv8sf, leaving that optab entry
> in an undefined state.
>
> Boot-strapped and regression-tested on i686-pc-linux-gnu.
>
> Ok for trunk?
>
> Regards
> Bernd. 		 	   		  
>
>
> Index: gcc/function.c
> ===================================================================
> --- gcc/function.c	(revision 204101)
> +++ gcc/function.c	(working copy)
> @@ -4426,7 +4426,6 @@ invoke_set_current_function_hook (tree fndecl)
>  	  cl_optimization_restore (&global_options, TREE_OPTIMIZATION (opts));
>  	}
>  
> -      targetm.set_current_function (fndecl);
>        this_fn_optabs = this_target_optabs;
>  
>        if (opts != optimization_default_node)
> @@ -4436,6 +4435,8 @@ invoke_set_current_function_hook (tree fndecl)
>  	    this_fn_optabs = (struct target_optabs *)
>  	      TREE_OPTIMIZATION_OPTABS (opts);
>  	}
> +
> +      targetm.set_current_function (fndecl);
>      }
>  }
>  

Sorry for only realising now, but this breaks the mips16 attribute.
The idea with the old order was that set_current_function would set
up this_target, then that choice would percolate through.

I don't think the patch is really correct for i386 either.  Although
it fixes the intrinsics_4.c test for -m32, it fails for this extended
version, both with and without -m32:

  #include <immintrin.h>

  __m256 a[10], b[10], c[10];

  void __attribute__((target ("avx")))
  foo (void)
  {
    a[0] = _mm256_and_ps (b[0], c[0]);
  }

  void __attribute__((target ("avx"), optimize (3)))
  bar (void)
  {
    a[0] = _mm256_and_ps (b[0], c[0]);
  }

The failure without -m32 is a regression from before the patch.

i386.c tries to detect when we're moving between different target options
and when a target_reinit is needed, but it doesn't take into account the
fact that optabs are cached on a per-optimisation-node basis.  It's
possible to move between functions with the same target node but
different optimisation nodes.  With the old order this wasn't
really something i386.c should have to worry about, since it was
invoke_set_current_function that handled optimisation nodes.
But with the new order i386.c would need to reinitialise at least
the optabs when moving between functions with the same target node
but different optimisation nodes.  And I'd really rather not force
every target that uses DECL_FUNCTION_SPECIFIC_TARGET to copy-&-paste
the code to do that.

Of course, IMO, the cleanest fix would be to use switchable targets
for i386...

Thanks,
Richard

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2014-01-06 10:27 ` Richard Sandiford
@ 2014-01-06 11:21   ` Jakub Jelinek
  2014-01-06 12:16     ` Richard Biener
  2014-01-07 20:11   ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115) Jakub Jelinek
  1 sibling, 1 reply; 39+ messages in thread
From: Jakub Jelinek @ 2014-01-06 11:21 UTC (permalink / raw)
  To: Bernd Edlinger, gcc-patches, Richard Biener, rdsandiford

On Mon, Jan 06, 2014 at 10:27:06AM +0000, Richard Sandiford wrote:
> Of course, IMO, the cleanest fix would be to use switchable targets
> for i386...

We IMHO want that anyway, e.g. #pragma omp declare simd tests take eons to
compile because even with just a few routines in a CU there are hundreds or
thousands of IRA reinitializations.

	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2014-01-06 11:21   ` Jakub Jelinek
@ 2014-01-06 12:16     ` Richard Biener
  2014-01-06 17:00       ` Bernd Edlinger
  0 siblings, 1 reply; 39+ messages in thread
From: Richard Biener @ 2014-01-06 12:16 UTC (permalink / raw)
  To: Jakub Jelinek, Jakub Jelinek, Bernd Edlinger, gcc-patches, rdsandiford

Jakub Jelinek <jakub@redhat.com> wrote:
>On Mon, Jan 06, 2014 at 10:27:06AM +0000, Richard Sandiford wrote:
>> Of course, IMO, the cleanest fix would be to use switchable targets
>> for i386...
>
>We IMHO want that anyway, e.g. #pragma omp declare simd tests take eons
>to
>compile because even with just a few routines in a CU there are
>hundreds or
>thousands of IRA reinitializations.

We also want a big fat comment before the non-obvious order of inits when switching cfuns.

Bernd, please revert your patch for now.

Thanks,
Richard.

>	Jakub


^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCH] Fix PR58115
  2014-01-06 12:16     ` Richard Biener
@ 2014-01-06 17:00       ` Bernd Edlinger
  2014-01-06 19:17         ` Richard Sandiford
  0 siblings, 1 reply; 39+ messages in thread
From: Bernd Edlinger @ 2014-01-06 17:00 UTC (permalink / raw)
  To: Richard Biener, Jakub Jelinek, gcc-patches, rdsandiford

>
> Jakub Jelinek <jakub@redhat.com> wrote:
>>On Mon, Jan 06, 2014 at 10:27:06AM +0000, Richard Sandiford wrote:
>>> Of course, IMO, the cleanest fix would be to use switchable targets
>>> for i386...
>>
>>We IMHO want that anyway, e.g. #pragma omp declare simd tests take eons
>>to
>>compile because even with just a few routines in a CU there are
>>hundreds or
>>thousands of IRA reinitializations.
>
> We also want a big fat comment before the non-obvious order of inits when switching cfuns.
>
> Bernd, please revert your patch for now.
>

OK, reverted for now.

Bernd.

> Thanks,
> Richard.
>
>> Jakub
>
> 		 	   		  

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2014-01-06 17:00       ` Bernd Edlinger
@ 2014-01-06 19:17         ` Richard Sandiford
  2014-01-06 21:43           ` Bernd Edlinger
  0 siblings, 1 reply; 39+ messages in thread
From: Richard Sandiford @ 2014-01-06 19:17 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: Richard Biener, Jakub Jelinek, gcc-patches

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>>
>> Jakub Jelinek <jakub@redhat.com> wrote:
>>>On Mon, Jan 06, 2014 at 10:27:06AM +0000, Richard Sandiford wrote:
>>>> Of course, IMO, the cleanest fix would be to use switchable targets
>>>> for i386...
>>>
>>>We IMHO want that anyway, e.g. #pragma omp declare simd tests take eons
>>>to
>>>compile because even with just a few routines in a CU there are
>>>hundreds or
>>>thousands of IRA reinitializations.
>>
>> We also want a big fat comment before the non-obvious order of inits
>> when switching cfuns.
>>
>> Bernd, please revert your patch for now.
>>
>
> OK, reverted for now.

Thanks Bernd.  (And sorry for the tone of my original message -- wasn't happy
with that when I read it back.)

How about this patch for the big comment?

Thanks,
Richard


gcc/
	* function.c (invoke_set_current_function_hook): Add more commentary.

Index: gcc/function.c
===================================================================
--- gcc/function.c	2014-01-06 19:09:27.821167117 +0000
+++ gcc/function.c	2014-01-06 19:15:27.764240857 +0000
@@ -4405,9 +4405,23 @@ invoke_set_current_function_hook (tree f
 	  cl_optimization_restore (&global_options, TREE_OPTIMIZATION (opts));
 	}
 
+      /* Give the target an opportunity to change the global state in
+	 response to function attributes.  This includes picking the
+	 correct value of this_target on SWITCHABLE_TARGET targets.
+
+	 Note that this_target (and in particular this_target_optabs)
+	 should always reflect the target state with default optimization
+	 options.  We therefore call the target hook first and then apply
+	 any function-specific optimization options on top.  */
       targetm.set_current_function (fndecl);
+
+      /* Set this_fn_optabs appropriately, on the assumption that we
+	 want the default optimization options.  */
       this_fn_optabs = this_target_optabs;
 
+      /* Now adjust this_fn_optabs if using non-default optimization options.
+	 init_tree_optimization_optabs ensures that the cached value of
+	 TREE_OPTIMIZATION_OPTABS (opts) is correct for this_target.  */
       if (opts != optimization_default_node)
 	{
 	  init_tree_optimization_optabs (opts);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCH] Fix PR58115
  2014-01-06 19:17         ` Richard Sandiford
@ 2014-01-06 21:43           ` Bernd Edlinger
  2014-01-07 12:12             ` Richard Sandiford
  0 siblings, 1 reply; 39+ messages in thread
From: Bernd Edlinger @ 2014-01-06 21:43 UTC (permalink / raw)
  To: Richard Sandiford; +Cc: Richard Biener, Jakub Jelinek, gcc-patches

On Mon, 6 Jan 2014 19:16:57, Richard Saniford wrote:
>
> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>>>
>>> Jakub Jelinek <jakub@redhat.com> wrote:
>>>>On Mon, Jan 06, 2014 at 10:27:06AM +0000, Richard Sandiford wrote:
>>>>> Of course, IMO, the cleanest fix would be to use switchable targets
>>>>> for i386...
>>>>
>>>>We IMHO want that anyway, e.g. #pragma omp declare simd tests take eons
>>>>to
>>>>compile because even with just a few routines in a CU there are
>>>>hundreds or
>>>>thousands of IRA reinitializations.
>>>
>>> We also want a big fat comment before the non-obvious order of inits
>>> when switching cfuns.
>>>
>>> Bernd, please revert your patch for now.
>>>
>>
>> OK, reverted for now.
>
> Thanks Bernd. (And sorry for the tone of my original message -- wasn't happy
> with that when I read it back.)
>

No problem, Richard, I don't see this personally in any way.
 
> How about this patch for the big comment?
>

The comment should say that target_set_current_function()
cannot call target_reinit() because:

target_reinit()=>lang_dependent_init_target()
=>init_optabs()=>init_all_optabs(this_fn_optabs);

uses this_fn_optabs which is undefined here.

However many targets (nios2, rx, i386, rs6000) do exactly that.

Is there currently any target, that sets this_target_optab in the
target_set_current_function?


Regards,
Bernd.

> Thanks,
> Richard
>
>
> gcc/
> * function.c (invoke_set_current_function_hook): Add more commentary.
>
> Index: gcc/function.c
> ===================================================================
> --- gcc/function.c 2014-01-06 19:09:27.821167117 +0000
> +++ gcc/function.c 2014-01-06 19:15:27.764240857 +0000
> @@ -4405,9 +4405,23 @@ invoke_set_current_function_hook (tree f
> cl_optimization_restore (&global_options, TREE_OPTIMIZATION (opts));
> }
>
> + /* Give the target an opportunity to change the global state in
> + response to function attributes. This includes picking the
> + correct value of this_target on SWITCHABLE_TARGET targets.
> +
> + Note that this_target (and in particular this_target_optabs)
> + should always reflect the target state with default optimization
> + options. We therefore call the target hook first and then apply
> + any function-specific optimization options on top. */
> targetm.set_current_function (fndecl);
> +
> + /* Set this_fn_optabs appropriately, on the assumption that we
> + want the default optimization options. */
> this_fn_optabs = this_target_optabs;
>
> + /* Now adjust this_fn_optabs if using non-default optimization options.
> + init_tree_optimization_optabs ensures that the cached value of
> + TREE_OPTIMIZATION_OPTABS (opts) is correct for this_target. */
> if (opts != optimization_default_node)
> {
> init_tree_optimization_optabs (opts); 		 	   		  

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2014-01-06 21:43           ` Bernd Edlinger
@ 2014-01-07 12:12             ` Richard Sandiford
  2014-01-07 14:10               ` Richard Biener
  0 siblings, 1 reply; 39+ messages in thread
From: Richard Sandiford @ 2014-01-07 12:12 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: Richard Biener, Jakub Jelinek, gcc-patches

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>> How about this patch for the big comment?
>>
>
> The comment should say that target_set_current_function()
> cannot call target_reinit() because:
>
> target_reinit()=>lang_dependent_init_target()
> =>init_optabs()=>init_all_optabs(this_fn_optabs);
>
> uses this_fn_optabs which is undefined here.
>
> However many targets (nios2, rx, i386, rs6000) do exactly that.
>
> Is there currently any target, that sets this_target_optab in the
> target_set_current_function?

MIPS :-)  (via save_target_globals_default_opts=>save_target_globals)

I think other targets need to do the same thing in order for tests
like that extended intrinsics_4.c to work.  How does this patch look?
Tested on x86_64-linux-gnu.

I didn't remove save_target_globals_default_opts because there the
temporary optimization_current_node also protects a call to init_reg_sets.

Thanks,
Richard


gcc/
	PR target/58115
	* target-globals.c (save_target_globals): Remove this_fn_optab
	handling.
	* toplev.c: Include optabs.h.
	(target_reinit): Temporarily restore the global options if another
	set of options are in force.

gcc/testsuite/
	* gcc.target/i386/intrinsics_4.c (bar): New function.

Index: gcc/target-globals.c
===================================================================
--- gcc/target-globals.c	2014-01-02 22:16:03.042278971 +0000
+++ gcc/target-globals.c	2014-01-07 12:08:33.569900970 +0000
@@ -68,7 +68,6 @@ struct target_globals *
 save_target_globals (void)
 {
   struct target_globals *g;
-  struct target_optabs *saved_this_fn_optabs = this_fn_optabs;
 
   g = ggc_alloc_target_globals ();
   g->flag_state = XCNEW (struct target_flag_state);
@@ -88,10 +87,8 @@ save_target_globals (void)
   g->bb_reorder = XCNEW (struct target_bb_reorder);
   g->lower_subreg = XCNEW (struct target_lower_subreg);
   restore_target_globals (g);
-  this_fn_optabs = this_target_optabs;
   init_reg_sets ();
   target_reinit ();
-  this_fn_optabs = saved_this_fn_optabs;
   return g;
 }
 
Index: gcc/toplev.c
===================================================================
--- gcc/toplev.c	2014-01-07 08:11:43.888058805 +0000
+++ gcc/toplev.c	2014-01-07 12:10:19.448096479 +0000
@@ -78,6 +78,7 @@ Software Foundation; either version 3, o
 #include "diagnostic-color.h"
 #include "context.h"
 #include "pass_manager.h"
+#include "optabs.h"
 
 #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
 #include "dbxout.h"
@@ -1752,6 +1753,23 @@ target_reinit (void)
 {
   struct rtl_data saved_x_rtl;
   rtx *saved_regno_reg_rtx;
+  tree saved_optimization_current_node;
+  struct target_optabs *saved_this_fn_optabs;
+
+  /* Temporarily switch to the default optimization node, so that
+     *this_target_optabs is set to the default, not reflecting
+     whatever a previous function used for the optimize
+     attribute.  */
+  saved_optimization_current_node = optimization_current_node;
+  saved_this_fn_optabs = this_fn_optabs;
+  if (saved_optimization_current_node != optimization_default_node)
+    {
+      optimization_current_node = optimization_default_node;
+      cl_optimization_restore
+	(&global_options,
+	 TREE_OPTIMIZATION (optimization_default_node));
+    }
+  this_fn_optabs = this_target_optabs;
 
   /* Save *crtl and regno_reg_rtx around the reinitialization
      to allow target_reinit being called even after prepare_function_start.  */
@@ -1769,7 +1787,16 @@ target_reinit (void)
   /* Reinitialize lang-dependent parts.  */
   lang_dependent_init_target ();
 
-  /* And restore it at the end, as free_after_compilation from
+  /* Restore the original optimization node.  */
+  if (saved_optimization_current_node != optimization_default_node)
+    {
+      optimization_current_node = saved_optimization_current_node;
+      cl_optimization_restore (&global_options,
+			       TREE_OPTIMIZATION (optimization_current_node));
+    }
+  this_fn_optabs = saved_this_fn_optabs;
+
+  /* Restore regno_reg_rtx at the end, as free_after_compilation from
      expand_dummy_function_end clears it.  */
   if (saved_regno_reg_rtx)
     {
Index: gcc/testsuite/gcc.target/i386/intrinsics_4.c
===================================================================
--- gcc/testsuite/gcc.target/i386/intrinsics_4.c	2013-06-27 22:30:03.877275875 +0100
+++ gcc/testsuite/gcc.target/i386/intrinsics_4.c	2014-01-07 08:15:51.509627927 +0000
@@ -12,3 +12,10 @@ foo (void)
 {
   a[0] = _mm256_and_ps (b[0], c[0]);
 }
+
+/* Try again with a combination of target and optimization attributes.  */
+void __attribute__((target ("avx"), optimize(3)))
+bar (void)
+{
+  a[0] = _mm256_and_ps (b[0], c[0]);
+}

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2014-01-07 12:12             ` Richard Sandiford
@ 2014-01-07 14:10               ` Richard Biener
  2014-01-09  0:31                 ` Bernd Edlinger
  0 siblings, 1 reply; 39+ messages in thread
From: Richard Biener @ 2014-01-07 14:10 UTC (permalink / raw)
  To: Bernd Edlinger, Richard Biener, Jakub Jelinek, gcc-patches,
	Richard Sandiford

On Tue, Jan 7, 2014 at 1:12 PM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>>> How about this patch for the big comment?
>>>
>>
>> The comment should say that target_set_current_function()
>> cannot call target_reinit() because:
>>
>> target_reinit()=>lang_dependent_init_target()
>> =>init_optabs()=>init_all_optabs(this_fn_optabs);
>>
>> uses this_fn_optabs which is undefined here.
>>
>> However many targets (nios2, rx, i386, rs6000) do exactly that.
>>
>> Is there currently any target, that sets this_target_optab in the
>> target_set_current_function?
>
> MIPS :-)  (via save_target_globals_default_opts=>save_target_globals)
>
> I think other targets need to do the same thing in order for tests
> like that extended intrinsics_4.c to work.  How does this patch look?
> Tested on x86_64-linux-gnu.
>
> I didn't remove save_target_globals_default_opts because there the
> temporary optimization_current_node also protects a call to init_reg_sets.

Well, if it works the patch is ok.  You're way more familiar with the
details of this machinery.

Richard.

> Thanks,
> Richard
>
>
> gcc/
>         PR target/58115
>         * target-globals.c (save_target_globals): Remove this_fn_optab
>         handling.
>         * toplev.c: Include optabs.h.
>         (target_reinit): Temporarily restore the global options if another
>         set of options are in force.
>
> gcc/testsuite/
>         * gcc.target/i386/intrinsics_4.c (bar): New function.
>
> Index: gcc/target-globals.c
> ===================================================================
> --- gcc/target-globals.c        2014-01-02 22:16:03.042278971 +0000
> +++ gcc/target-globals.c        2014-01-07 12:08:33.569900970 +0000
> @@ -68,7 +68,6 @@ struct target_globals *
>  save_target_globals (void)
>  {
>    struct target_globals *g;
> -  struct target_optabs *saved_this_fn_optabs = this_fn_optabs;
>
>    g = ggc_alloc_target_globals ();
>    g->flag_state = XCNEW (struct target_flag_state);
> @@ -88,10 +87,8 @@ save_target_globals (void)
>    g->bb_reorder = XCNEW (struct target_bb_reorder);
>    g->lower_subreg = XCNEW (struct target_lower_subreg);
>    restore_target_globals (g);
> -  this_fn_optabs = this_target_optabs;
>    init_reg_sets ();
>    target_reinit ();
> -  this_fn_optabs = saved_this_fn_optabs;
>    return g;
>  }
>
> Index: gcc/toplev.c
> ===================================================================
> --- gcc/toplev.c        2014-01-07 08:11:43.888058805 +0000
> +++ gcc/toplev.c        2014-01-07 12:10:19.448096479 +0000
> @@ -78,6 +78,7 @@ Software Foundation; either version 3, o
>  #include "diagnostic-color.h"
>  #include "context.h"
>  #include "pass_manager.h"
> +#include "optabs.h"
>
>  #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
>  #include "dbxout.h"
> @@ -1752,6 +1753,23 @@ target_reinit (void)
>  {
>    struct rtl_data saved_x_rtl;
>    rtx *saved_regno_reg_rtx;
> +  tree saved_optimization_current_node;
> +  struct target_optabs *saved_this_fn_optabs;
> +
> +  /* Temporarily switch to the default optimization node, so that
> +     *this_target_optabs is set to the default, not reflecting
> +     whatever a previous function used for the optimize
> +     attribute.  */
> +  saved_optimization_current_node = optimization_current_node;
> +  saved_this_fn_optabs = this_fn_optabs;
> +  if (saved_optimization_current_node != optimization_default_node)
> +    {
> +      optimization_current_node = optimization_default_node;
> +      cl_optimization_restore
> +       (&global_options,
> +        TREE_OPTIMIZATION (optimization_default_node));
> +    }
> +  this_fn_optabs = this_target_optabs;
>
>    /* Save *crtl and regno_reg_rtx around the reinitialization
>       to allow target_reinit being called even after prepare_function_start.  */
> @@ -1769,7 +1787,16 @@ target_reinit (void)
>    /* Reinitialize lang-dependent parts.  */
>    lang_dependent_init_target ();
>
> -  /* And restore it at the end, as free_after_compilation from
> +  /* Restore the original optimization node.  */
> +  if (saved_optimization_current_node != optimization_default_node)
> +    {
> +      optimization_current_node = saved_optimization_current_node;
> +      cl_optimization_restore (&global_options,
> +                              TREE_OPTIMIZATION (optimization_current_node));
> +    }
> +  this_fn_optabs = saved_this_fn_optabs;
> +
> +  /* Restore regno_reg_rtx at the end, as free_after_compilation from
>       expand_dummy_function_end clears it.  */
>    if (saved_regno_reg_rtx)
>      {
> Index: gcc/testsuite/gcc.target/i386/intrinsics_4.c
> ===================================================================
> --- gcc/testsuite/gcc.target/i386/intrinsics_4.c        2013-06-27 22:30:03.877275875 +0100
> +++ gcc/testsuite/gcc.target/i386/intrinsics_4.c        2014-01-07 08:15:51.509627927 +0000
> @@ -12,3 +12,10 @@ foo (void)
>  {
>    a[0] = _mm256_and_ps (b[0], c[0]);
>  }
> +
> +/* Try again with a combination of target and optimization attributes.  */
> +void __attribute__((target ("avx"), optimize(3)))
> +bar (void)
> +{
> +  a[0] = _mm256_and_ps (b[0], c[0]);
> +}

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115)
  2014-01-06 10:27 ` Richard Sandiford
  2014-01-06 11:21   ` Jakub Jelinek
@ 2014-01-07 20:11   ` Jakub Jelinek
  2014-01-07 20:52     ` Richard Sandiford
  2014-01-08 11:33     ` Richard Biener
  1 sibling, 2 replies; 39+ messages in thread
From: Jakub Jelinek @ 2014-01-07 20:11 UTC (permalink / raw)
  To: Bernd Edlinger, gcc-patches, Richard Biener, rdsandiford

On Mon, Jan 06, 2014 at 10:27:06AM +0000, Richard Sandiford wrote:
> Of course, IMO, the cleanest fix would be to use switchable targets
> for i386...

The following patch does that, bootstrapped/regtested on x86_64-linux and
i686-linux.  The only problem with the patch is PCH,
+FAIL: 17_intro/headers/c++200x/stdc++.cc (test for excess errors)
+FAIL: 17_intro/headers/c++200x/stdc++_multiple_inclusion.cc (test for excess errors)
(both 32-bit and 64-bit regtests), where it ICEs.  I guess the problem is
that the target globals are allocated partly in GC, partly in heap and
even if they were allocated completely in GC and GTY(()) marked fully all
the individual pointed structures, we IMNSHO still don't want it to be
saved during PCH and restored later, what we have is basically just a cache
of the target globals.

Dunno what is the best way to handle that though.
Either before writing PCH c-common.c could call some tree.c routine that
would traverse the cl_option_hash_table hash table and for every
TARGET_OPTION_NODE in the hash table clear TREE_TARGET_GLOBALS.
Or perhaps some gengtype extension to run some routine before PCH saving
on the tree_target_option structs and clear the globals field in there.
Or use GTY((user)) on tree_target_option, but then dunno how we'd handle the
marking of the embedded opts field (and common).
Any ideas?

2014-01-07  Jakub Jelinek  <jakub@redhat.com>

	PR target/58115
	* tree-core.h (struct target_globals): New forward declaration.
	(struct tree_target_option): Add globals field.
	* tree.h (TREE_TARGET_GLOBALS): Define.
	* target-globals.h (struct target_globals): Define even if
	!SWITCHABLE_TARGET.
	* config/i386/i386.h (SWITCHABLE_TARGET): Define.
	* config/i386/i386.c: Include target-globals.h.
	(ix86_set_current_function): Instead of doing target_reinit
	unconditionally, use save_target_globals_default_opts and
	restore_target_globals.

--- gcc/tree-core.h.jj	2014-01-07 08:47:24.000000000 +0100
+++ gcc/tree-core.h	2014-01-07 16:44:35.591358235 +0100
@@ -1557,11 +1557,18 @@ struct GTY(()) tree_optimization_option
   struct target_optabs *GTY ((skip)) base_optabs;
 };
 
+/* Forward declaration, defined in target-globals.h.  */
+
+struct GTY(()) target_globals;
+
 /* Target options used by a function.  */
 
 struct GTY(()) tree_target_option {
   struct tree_common common;
 
+  /* Target globals for the corresponding target option.  */
+  struct target_globals *globals;
+
   /* The optimization options used by the user.  */
   struct cl_target_option opts;
 };
--- gcc/tree.h.jj	2014-01-03 11:40:33.000000000 +0100
+++ gcc/tree.h	2014-01-07 12:55:39.137295100 +0100
@@ -2695,6 +2695,9 @@ extern tree build_optimization_node (str
 #define TREE_TARGET_OPTION(NODE) \
   (&TARGET_OPTION_NODE_CHECK (NODE)->target_option.opts)
 
+#define TREE_TARGET_GLOBALS(NODE) \
+  (TARGET_OPTION_NODE_CHECK (NODE)->target_option.globals)
+
 /* Return a tree node that encapsulates the target options in OPTS.  */
 extern tree build_target_option_node (struct gcc_options *opts);
 
--- gcc/target-globals.h.jj	2014-01-03 11:40:46.000000000 +0100
+++ gcc/target-globals.h	2014-01-07 17:08:51.113880947 +0100
@@ -37,6 +37,7 @@ extern struct target_builtins *this_targ
 extern struct target_gcse *this_target_gcse;
 extern struct target_bb_reorder *this_target_bb_reorder;
 extern struct target_lower_subreg *this_target_lower_subreg;
+#endif
 
 struct GTY(()) target_globals {
   struct target_flag_state *GTY((skip)) flag_state;
@@ -57,6 +58,7 @@ struct GTY(()) target_globals {
   struct target_lower_subreg *GTY((skip)) lower_subreg;
 };
 
+#if SWITCHABLE_TARGET
 extern struct target_globals default_target_globals;
 
 extern struct target_globals *save_target_globals (void);
--- gcc/config/i386/i386.h.jj	2014-01-06 22:37:19.000000000 +0100
+++ gcc/config/i386/i386.h	2014-01-07 12:13:06.480486755 +0100
@@ -2510,6 +2510,9 @@ extern void debug_dispatch_window (int);
 #define IX86_HLE_ACQUIRE (1 << 16)
 #define IX86_HLE_RELEASE (1 << 17)
 
+/* For switching between functions with different target attributes.  */
+#define SWITCHABLE_TARGET 1
+
 /*
 Local variables:
 version-control: t
--- gcc/config/i386/i386.c.jj	2014-01-06 22:37:19.000000000 +0100
+++ gcc/config/i386/i386.c	2014-01-07 16:52:32.597904760 +0100
@@ -80,6 +80,7 @@ along with GCC; see the file COPYING3.
 #include "tree-pass.h"
 #include "context.h"
 #include "pass_manager.h"
+#include "target-globals.h"
 
 static rtx legitimize_dllimport_symbol (rtx, bool);
 static rtx legitimize_pe_coff_extern_decl (rtx, bool);
@@ -4868,16 +4869,25 @@ ix86_set_current_function (tree fndecl)
 	{
 	  cl_target_option_restore (&global_options,
 				    TREE_TARGET_OPTION (new_tree));
-	  target_reinit ();
+	  if (TREE_TARGET_GLOBALS (new_tree))
+	    restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
+	  else
+	    TREE_TARGET_GLOBALS (new_tree)
+	      = save_target_globals_default_opts ();
 	}
 
       else if (old_tree)
 	{
-	  struct cl_target_option *def
-	    = TREE_TARGET_OPTION (target_option_current_node);
-
-	  cl_target_option_restore (&global_options, def);
-	  target_reinit ();
+	  new_tree = target_option_current_node;
+	  cl_target_option_restore (&global_options,
+				    TREE_TARGET_OPTION (new_tree));
+	  if (TREE_TARGET_GLOBALS (new_tree))
+	    restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
+	  else if (new_tree == target_option_default_node)
+	    restore_target_globals (&default_target_globals);
+	  else
+	    TREE_TARGET_GLOBALS (new_tree)
+	      = save_target_globals_default_opts ();
 	}
     }
 }


	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115)
  2014-01-07 20:11   ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115) Jakub Jelinek
@ 2014-01-07 20:52     ` Richard Sandiford
  2014-01-08 11:33     ` Richard Biener
  1 sibling, 0 replies; 39+ messages in thread
From: Richard Sandiford @ 2014-01-07 20:52 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Edlinger, gcc-patches, Richard Biener

Jakub Jelinek <jakub@redhat.com> writes:
> On Mon, Jan 06, 2014 at 10:27:06AM +0000, Richard Sandiford wrote:
>> Of course, IMO, the cleanest fix would be to use switchable targets
>> for i386...
>
> The following patch does that, bootstrapped/regtested on x86_64-linux and
> i686-linux.

Neat, thanks.  Looks like it's a lot less intrusive than I thought it
might be.

> The only problem with the patch is PCH,
> +FAIL: 17_intro/headers/c++200x/stdc++.cc (test for excess errors)
> +FAIL: 17_intro/headers/c++200x/stdc++_multiple_inclusion.cc (test for excess errors)
> (both 32-bit and 64-bit regtests), where it ICEs.  I guess the problem is
> that the target globals are allocated partly in GC, partly in heap and
> even if they were allocated completely in GC and GTY(()) marked fully all
> the individual pointed structures, we IMNSHO still don't want it to be
> saved during PCH and restored later, what we have is basically just a cache
> of the target globals.
>
> Dunno what is the best way to handle that though.
> Either before writing PCH c-common.c could call some tree.c routine that
> would traverse the cl_option_hash_table hash table and for every
> TARGET_OPTION_NODE in the hash table clear TREE_TARGET_GLOBALS.
> Or perhaps some gengtype extension to run some routine before PCH saving
> on the tree_target_option structs and clear the globals field in there.
> Or use GTY((user)) on tree_target_option, but then dunno how we'd handle the
> marking of the embedded opts field (and common).
> Any ideas?

Not really.  For MIPS this was relatively easy since there was only one
non-default instance:

/* Implement TARGET_PREPARE_PCH_SAVE.  */

static void
mips_prepare_pch_save (void)
{
  /* We are called in a context where the current MIPS16 vs. non-MIPS16
     setting should be irrelevant.  The question then is: which setting
     makes most sense at load time?

     The PCH is loaded before the first token is read.  We should never
     have switched into MIPS16 mode by that point, and thus should not
     have populated mips16_globals.  Nor can we load the entire contents
     of mips16_globals from the PCH file, because mips16_globals contains
     a combination of GGC and non-GGC data.

     There is therefore no point in trying save the GGC part of
     mips16_globals to the PCH file, or to preserve MIPS16ness across
     the PCH save and load.  The loading compiler would not have access
     to the non-GGC parts of mips16_globals (either from the PCH file,
     or from a copy that the loading compiler generated itself) and would
     have to call target_reinit anyway.

     It therefore seems best to switch back to non-MIPS16 mode at
     save time, and to ensure that mips16_globals remains null after
     a PCH load.  */
  mips_set_compression_mode (0);
  mips16_globals = 0;
}

The idea of iterating over cl_option_hash_table sounded good though.
It might also be the easiest option.

Hopefully at some point we can move the main body of
ix86_set_current_function into generic code, but that's probably
4.10 material.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115)
  2014-01-07 20:11   ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115) Jakub Jelinek
  2014-01-07 20:52     ` Richard Sandiford
@ 2014-01-08 11:33     ` Richard Biener
  2014-01-08 12:45       ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2) Jakub Jelinek
  1 sibling, 1 reply; 39+ messages in thread
From: Richard Biener @ 2014-01-08 11:33 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Edlinger, gcc-patches, Richard Sandiford

On Tue, Jan 7, 2014 at 8:39 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Jan 06, 2014 at 10:27:06AM +0000, Richard Sandiford wrote:
>> Of course, IMO, the cleanest fix would be to use switchable targets
>> for i386...
>
> The following patch does that, bootstrapped/regtested on x86_64-linux and
> i686-linux.  The only problem with the patch is PCH,
> +FAIL: 17_intro/headers/c++200x/stdc++.cc (test for excess errors)
> +FAIL: 17_intro/headers/c++200x/stdc++_multiple_inclusion.cc (test for excess errors)
> (both 32-bit and 64-bit regtests), where it ICEs.  I guess the problem is
> that the target globals are allocated partly in GC, partly in heap and
> even if they were allocated completely in GC and GTY(()) marked fully all
> the individual pointed structures, we IMNSHO still don't want it to be
> saved during PCH and restored later, what we have is basically just a cache
> of the target globals.
>
> Dunno what is the best way to handle that though.
> Either before writing PCH c-common.c could call some tree.c routine that
> would traverse the cl_option_hash_table hash table and for every
> TARGET_OPTION_NODE in the hash table clear TREE_TARGET_GLOBALS.
> Or perhaps some gengtype extension to run some routine before PCH saving
> on the tree_target_option structs and clear the globals field in there.
> Or use GTY((user)) on tree_target_option, but then dunno how we'd handle the
> marking of the embedded opts field (and common).
> Any ideas?

Well, a GTY((skip_pch)) would probably work.  Or move the thing
out-of GC land (thus make cl_option_hash_table persistant) and
simply GTY((skip)) the pointer completely.  Not sure if we ever
collect from it.

Richard.

> 2014-01-07  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/58115
>         * tree-core.h (struct target_globals): New forward declaration.
>         (struct tree_target_option): Add globals field.
>         * tree.h (TREE_TARGET_GLOBALS): Define.
>         * target-globals.h (struct target_globals): Define even if
>         !SWITCHABLE_TARGET.
>         * config/i386/i386.h (SWITCHABLE_TARGET): Define.
>         * config/i386/i386.c: Include target-globals.h.
>         (ix86_set_current_function): Instead of doing target_reinit
>         unconditionally, use save_target_globals_default_opts and
>         restore_target_globals.
>
> --- gcc/tree-core.h.jj  2014-01-07 08:47:24.000000000 +0100
> +++ gcc/tree-core.h     2014-01-07 16:44:35.591358235 +0100
> @@ -1557,11 +1557,18 @@ struct GTY(()) tree_optimization_option
>    struct target_optabs *GTY ((skip)) base_optabs;
>  };
>
> +/* Forward declaration, defined in target-globals.h.  */
> +
> +struct GTY(()) target_globals;
> +
>  /* Target options used by a function.  */
>
>  struct GTY(()) tree_target_option {
>    struct tree_common common;
>
> +  /* Target globals for the corresponding target option.  */
> +  struct target_globals *globals;
> +
>    /* The optimization options used by the user.  */
>    struct cl_target_option opts;
>  };
> --- gcc/tree.h.jj       2014-01-03 11:40:33.000000000 +0100
> +++ gcc/tree.h  2014-01-07 12:55:39.137295100 +0100
> @@ -2695,6 +2695,9 @@ extern tree build_optimization_node (str
>  #define TREE_TARGET_OPTION(NODE) \
>    (&TARGET_OPTION_NODE_CHECK (NODE)->target_option.opts)
>
> +#define TREE_TARGET_GLOBALS(NODE) \
> +  (TARGET_OPTION_NODE_CHECK (NODE)->target_option.globals)
> +
>  /* Return a tree node that encapsulates the target options in OPTS.  */
>  extern tree build_target_option_node (struct gcc_options *opts);
>
> --- gcc/target-globals.h.jj     2014-01-03 11:40:46.000000000 +0100
> +++ gcc/target-globals.h        2014-01-07 17:08:51.113880947 +0100
> @@ -37,6 +37,7 @@ extern struct target_builtins *this_targ
>  extern struct target_gcse *this_target_gcse;
>  extern struct target_bb_reorder *this_target_bb_reorder;
>  extern struct target_lower_subreg *this_target_lower_subreg;
> +#endif
>
>  struct GTY(()) target_globals {
>    struct target_flag_state *GTY((skip)) flag_state;
> @@ -57,6 +58,7 @@ struct GTY(()) target_globals {
>    struct target_lower_subreg *GTY((skip)) lower_subreg;
>  };
>
> +#if SWITCHABLE_TARGET
>  extern struct target_globals default_target_globals;
>
>  extern struct target_globals *save_target_globals (void);
> --- gcc/config/i386/i386.h.jj   2014-01-06 22:37:19.000000000 +0100
> +++ gcc/config/i386/i386.h      2014-01-07 12:13:06.480486755 +0100
> @@ -2510,6 +2510,9 @@ extern void debug_dispatch_window (int);
>  #define IX86_HLE_ACQUIRE (1 << 16)
>  #define IX86_HLE_RELEASE (1 << 17)
>
> +/* For switching between functions with different target attributes.  */
> +#define SWITCHABLE_TARGET 1
> +
>  /*
>  Local variables:
>  version-control: t
> --- gcc/config/i386/i386.c.jj   2014-01-06 22:37:19.000000000 +0100
> +++ gcc/config/i386/i386.c      2014-01-07 16:52:32.597904760 +0100
> @@ -80,6 +80,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-pass.h"
>  #include "context.h"
>  #include "pass_manager.h"
> +#include "target-globals.h"
>
>  static rtx legitimize_dllimport_symbol (rtx, bool);
>  static rtx legitimize_pe_coff_extern_decl (rtx, bool);
> @@ -4868,16 +4869,25 @@ ix86_set_current_function (tree fndecl)
>         {
>           cl_target_option_restore (&global_options,
>                                     TREE_TARGET_OPTION (new_tree));
> -         target_reinit ();
> +         if (TREE_TARGET_GLOBALS (new_tree))
> +           restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
> +         else
> +           TREE_TARGET_GLOBALS (new_tree)
> +             = save_target_globals_default_opts ();
>         }
>
>        else if (old_tree)
>         {
> -         struct cl_target_option *def
> -           = TREE_TARGET_OPTION (target_option_current_node);
> -
> -         cl_target_option_restore (&global_options, def);
> -         target_reinit ();
> +         new_tree = target_option_current_node;
> +         cl_target_option_restore (&global_options,
> +                                   TREE_TARGET_OPTION (new_tree));
> +         if (TREE_TARGET_GLOBALS (new_tree))
> +           restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
> +         else if (new_tree == target_option_default_node)
> +           restore_target_globals (&default_target_globals);
> +         else
> +           TREE_TARGET_GLOBALS (new_tree)
> +             = save_target_globals_default_opts ();
>         }
>      }
>  }
>
>
>         Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2)
  2014-01-08 11:33     ` Richard Biener
@ 2014-01-08 12:45       ` Jakub Jelinek
  2014-01-08 14:29         ` Richard Biener
                           ` (2 more replies)
  0 siblings, 3 replies; 39+ messages in thread
From: Jakub Jelinek @ 2014-01-08 12:45 UTC (permalink / raw)
  To: Richard Biener; +Cc: Bernd Edlinger, gcc-patches, Richard Sandiford

On Wed, Jan 08, 2014 at 12:32:59PM +0100, Richard Biener wrote:
> > Either before writing PCH c-common.c could call some tree.c routine that
> > would traverse the cl_option_hash_table hash table and for every
> > TARGET_OPTION_NODE in the hash table clear TREE_TARGET_GLOBALS.
> > Or perhaps some gengtype extension to run some routine before PCH saving
> > on the tree_target_option structs and clear the globals field in there.
> > Or use GTY((user)) on tree_target_option, but then dunno how we'd handle the
> > marking of the embedded opts field (and common).
> > Any ideas?
> 
> Well, a GTY((skip_pch)) would probably work.  Or move the thing
> out-of GC land (thus make cl_option_hash_table persistant) and
> simply GTY((skip)) the pointer completely.  Not sure if we ever
> collect from it.

Even if the pointer was out of GCC land and GTY((skip)), we'd need to clear
it somewhere during PCH saving, as the containing structure is GC allocated.

I've already implemented in the mean time the variant with the
htab_traverse, all still reachable TARGET_OPTION_NODE trees should be in that
hash table.

Bootstrapped/regtested on x86_64-linux and i686-linux (in both cases with
--enable-checking=yes,rtl and --enable-checking=release, for the
i686-linux/release checking I had to fix an unrelated compare debug issue
I'll post when I manage to reduce testcase).

I'd like to get rid of all the XCNEW calls in target-globals.c as a
follow-up.

As for performance, for --enable-checking=release from very rough check
on make -j48 bootstrap and make -j48 check times the patch is compile time
neutral, on e.g. declare-simd-1.C testcase g++ is twice as fast with the
patch though (~ 0.8 sec without the patch, ~ 0.3 sec with the patch, both
for x86_64 and i686).

Ok for trunk?

2014-01-07  Jakub Jelinek  <jakub@redhat.com>

	PR target/58115
	* tree-core.h (struct target_globals): New forward declaration.
	(struct tree_target_option): Add globals field.
	* tree.h (TREE_TARGET_GLOBALS): Define.
	(prepare_target_option_nodes_for_pch): New prototype.
	* target-globals.h (struct target_globals): Define even if
	!SWITCHABLE_TARGET.
	* tree.c (prepare_target_option_node_for_pch,
	prepare_target_option_nodes_for_pch): New functions.
	* config/i386/i386.h (SWITCHABLE_TARGET): Define.
	* config/i386/i386.c: Include target-globals.h.
	(ix86_set_current_function): Instead of doing target_reinit
	unconditionally, use save_target_globals_default_opts and
	restore_target_globals.
c-family/
	* c-pch.c (c_common_write_pch): Call
	prepare_target_option_nodes_for_pch.

--- gcc/tree-core.h.jj	2014-01-07 08:47:24.000000000 +0100
+++ gcc/tree-core.h	2014-01-07 16:44:35.591358235 +0100
@@ -1557,11 +1557,18 @@ struct GTY(()) tree_optimization_option
   struct target_optabs *GTY ((skip)) base_optabs;
 };
 
+/* Forward declaration, defined in target-globals.h.  */
+
+struct GTY(()) target_globals;
+
 /* Target options used by a function.  */
 
 struct GTY(()) tree_target_option {
   struct tree_common common;
 
+  /* Target globals for the corresponding target option.  */
+  struct target_globals *globals;
+
   /* The optimization options used by the user.  */
   struct cl_target_option opts;
 };
--- gcc/tree.h.jj	2014-01-03 11:40:33.000000000 +0100
+++ gcc/tree.h	2014-01-07 21:28:15.038061120 +0100
@@ -2695,9 +2695,14 @@ extern tree build_optimization_node (str
 #define TREE_TARGET_OPTION(NODE) \
   (&TARGET_OPTION_NODE_CHECK (NODE)->target_option.opts)
 
+#define TREE_TARGET_GLOBALS(NODE) \
+  (TARGET_OPTION_NODE_CHECK (NODE)->target_option.globals)
+
 /* Return a tree node that encapsulates the target options in OPTS.  */
 extern tree build_target_option_node (struct gcc_options *opts);
 
+extern void prepare_target_option_nodes_for_pch (void);
+
 #if defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 2007)
 
 inline tree
--- gcc/target-globals.h.jj	2014-01-03 11:40:46.000000000 +0100
+++ gcc/target-globals.h	2014-01-07 17:08:51.113880947 +0100
@@ -37,6 +37,7 @@ extern struct target_builtins *this_targ
 extern struct target_gcse *this_target_gcse;
 extern struct target_bb_reorder *this_target_bb_reorder;
 extern struct target_lower_subreg *this_target_lower_subreg;
+#endif
 
 struct GTY(()) target_globals {
   struct target_flag_state *GTY((skip)) flag_state;
@@ -57,6 +58,7 @@ struct GTY(()) target_globals {
   struct target_lower_subreg *GTY((skip)) lower_subreg;
 };
 
+#if SWITCHABLE_TARGET
 extern struct target_globals default_target_globals;
 
 extern struct target_globals *save_target_globals (void);
--- gcc/tree.c.jj	2014-01-03 11:40:33.000000000 +0100
+++ gcc/tree.c	2014-01-07 21:27:35.590268195 +0100
@@ -11527,6 +11527,28 @@ build_target_option_node (struct gcc_opt
   return t;
 }
 
+/* Reset TREE_TARGET_GLOBALS cache for TARGET_OPTION_NODE.
+   Called through htab_traverse.  */
+
+static int
+prepare_target_option_node_for_pch (void **slot, void *)
+{
+  tree node = (tree) *slot;
+  if (TREE_CODE (node) == TARGET_OPTION_NODE)
+    TREE_TARGET_GLOBALS (node) = NULL;
+  return 1;
+}
+
+/* Clear TREE_TARGET_GLOBALS of all TARGET_OPTION_NODE trees,
+   so that they aren't saved during PCH writing.  */
+
+void
+prepare_target_option_nodes_for_pch (void)
+{
+  htab_traverse (cl_option_hash_table, prepare_target_option_node_for_pch,
+		 NULL);
+}
+
 /* Determine the "ultimate origin" of a block.  The block may be an inlined
    instance of an inlined instance of a block which is local to an inline
    function, so we have to trace all of the way back through the origin chain
--- gcc/c-family/c-pch.c.jj	2014-01-03 11:40:29.000000000 +0100
+++ gcc/c-family/c-pch.c	2014-01-07 21:29:42.985600418 +0100
@@ -180,6 +180,8 @@ c_common_write_pch (void)
 
   (*debug_hooks->handle_pch) (1);
 
+  prepare_target_option_nodes_for_pch ();
+
   cpp_write_pch_deps (parse_in, pch_outfile);
 
   gt_pch_save (pch_outfile);
--- gcc/config/i386/i386.h.jj	2014-01-06 22:37:19.000000000 +0100
+++ gcc/config/i386/i386.h	2014-01-07 17:21:09.049106675 +0100
@@ -2510,6 +2510,9 @@ extern void debug_dispatch_window (int);
 #define IX86_HLE_ACQUIRE (1 << 16)
 #define IX86_HLE_RELEASE (1 << 17)
 
+/* For switching between functions with different target attributes.  */
+#define SWITCHABLE_TARGET 1
+
 /*
 Local variables:
 version-control: t
--- gcc/config/i386/i386.c.jj	2014-01-06 22:37:19.000000000 +0100
+++ gcc/config/i386/i386.c	2014-01-07 17:21:09.049106675 +0100
@@ -80,6 +80,7 @@ along with GCC; see the file COPYING3.
 #include "tree-pass.h"
 #include "context.h"
 #include "pass_manager.h"
+#include "target-globals.h"
 
 static rtx legitimize_dllimport_symbol (rtx, bool);
 static rtx legitimize_pe_coff_extern_decl (rtx, bool);
@@ -4868,16 +4869,25 @@ ix86_set_current_function (tree fndecl)
 	{
 	  cl_target_option_restore (&global_options,
 				    TREE_TARGET_OPTION (new_tree));
-	  target_reinit ();
+	  if (TREE_TARGET_GLOBALS (new_tree))
+	    restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
+	  else
+	    TREE_TARGET_GLOBALS (new_tree)
+	      = save_target_globals_default_opts ();
 	}
 
       else if (old_tree)
 	{
-	  struct cl_target_option *def
-	    = TREE_TARGET_OPTION (target_option_current_node);
-
-	  cl_target_option_restore (&global_options, def);
-	  target_reinit ();
+	  new_tree = target_option_current_node;
+	  cl_target_option_restore (&global_options,
+				    TREE_TARGET_OPTION (new_tree));
+	  if (TREE_TARGET_GLOBALS (new_tree))
+	    restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
+	  else if (new_tree == target_option_default_node)
+	    restore_target_globals (&default_target_globals);
+	  else
+	    TREE_TARGET_GLOBALS (new_tree)
+	      = save_target_globals_default_opts ();
 	}
     }
 }


	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2)
  2014-01-08 12:45       ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2) Jakub Jelinek
@ 2014-01-08 14:29         ` Richard Biener
  2014-01-08 18:34         ` [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs Jakub Jelinek
  2014-01-09 16:16         ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2) Richard Henderson
  2 siblings, 0 replies; 39+ messages in thread
From: Richard Biener @ 2014-01-08 14:29 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Edlinger, gcc-patches, Richard Sandiford

On Wed, Jan 8, 2014 at 1:45 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Wed, Jan 08, 2014 at 12:32:59PM +0100, Richard Biener wrote:
>> > Either before writing PCH c-common.c could call some tree.c routine that
>> > would traverse the cl_option_hash_table hash table and for every
>> > TARGET_OPTION_NODE in the hash table clear TREE_TARGET_GLOBALS.
>> > Or perhaps some gengtype extension to run some routine before PCH saving
>> > on the tree_target_option structs and clear the globals field in there.
>> > Or use GTY((user)) on tree_target_option, but then dunno how we'd handle the
>> > marking of the embedded opts field (and common).
>> > Any ideas?
>>
>> Well, a GTY((skip_pch)) would probably work.  Or move the thing
>> out-of GC land (thus make cl_option_hash_table persistant) and
>> simply GTY((skip)) the pointer completely.  Not sure if we ever
>> collect from it.
>
> Even if the pointer was out of GCC land and GTY((skip)), we'd need to clear
> it somewhere during PCH saving, as the containing structure is GC allocated.
>
> I've already implemented in the mean time the variant with the
> htab_traverse, all still reachable TARGET_OPTION_NODE trees should be in that
> hash table.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux (in both cases with
> --enable-checking=yes,rtl and --enable-checking=release, for the
> i686-linux/release checking I had to fix an unrelated compare debug issue
> I'll post when I manage to reduce testcase).
>
> I'd like to get rid of all the XCNEW calls in target-globals.c as a
> follow-up.
>
> As for performance, for --enable-checking=release from very rough check
> on make -j48 bootstrap and make -j48 check times the patch is compile time
> neutral, on e.g. declare-simd-1.C testcase g++ is twice as fast with the
> patch though (~ 0.8 sec without the patch, ~ 0.3 sec with the patch, both
> for x86_64 and i686).
>
> Ok for trunk?

Works for me.  Wait a bit for others to comment though.

Thanks,
Richard.

> 2014-01-07  Jakub Jelinek  <jakub@redhat.com>
>
>         PR target/58115
>         * tree-core.h (struct target_globals): New forward declaration.
>         (struct tree_target_option): Add globals field.
>         * tree.h (TREE_TARGET_GLOBALS): Define.
>         (prepare_target_option_nodes_for_pch): New prototype.
>         * target-globals.h (struct target_globals): Define even if
>         !SWITCHABLE_TARGET.
>         * tree.c (prepare_target_option_node_for_pch,
>         prepare_target_option_nodes_for_pch): New functions.
>         * config/i386/i386.h (SWITCHABLE_TARGET): Define.
>         * config/i386/i386.c: Include target-globals.h.
>         (ix86_set_current_function): Instead of doing target_reinit
>         unconditionally, use save_target_globals_default_opts and
>         restore_target_globals.
> c-family/
>         * c-pch.c (c_common_write_pch): Call
>         prepare_target_option_nodes_for_pch.
>
> --- gcc/tree-core.h.jj  2014-01-07 08:47:24.000000000 +0100
> +++ gcc/tree-core.h     2014-01-07 16:44:35.591358235 +0100
> @@ -1557,11 +1557,18 @@ struct GTY(()) tree_optimization_option
>    struct target_optabs *GTY ((skip)) base_optabs;
>  };
>
> +/* Forward declaration, defined in target-globals.h.  */
> +
> +struct GTY(()) target_globals;
> +
>  /* Target options used by a function.  */
>
>  struct GTY(()) tree_target_option {
>    struct tree_common common;
>
> +  /* Target globals for the corresponding target option.  */
> +  struct target_globals *globals;
> +
>    /* The optimization options used by the user.  */
>    struct cl_target_option opts;
>  };
> --- gcc/tree.h.jj       2014-01-03 11:40:33.000000000 +0100
> +++ gcc/tree.h  2014-01-07 21:28:15.038061120 +0100
> @@ -2695,9 +2695,14 @@ extern tree build_optimization_node (str
>  #define TREE_TARGET_OPTION(NODE) \
>    (&TARGET_OPTION_NODE_CHECK (NODE)->target_option.opts)
>
> +#define TREE_TARGET_GLOBALS(NODE) \
> +  (TARGET_OPTION_NODE_CHECK (NODE)->target_option.globals)
> +
>  /* Return a tree node that encapsulates the target options in OPTS.  */
>  extern tree build_target_option_node (struct gcc_options *opts);
>
> +extern void prepare_target_option_nodes_for_pch (void);
> +
>  #if defined ENABLE_TREE_CHECKING && (GCC_VERSION >= 2007)
>
>  inline tree
> --- gcc/target-globals.h.jj     2014-01-03 11:40:46.000000000 +0100
> +++ gcc/target-globals.h        2014-01-07 17:08:51.113880947 +0100
> @@ -37,6 +37,7 @@ extern struct target_builtins *this_targ
>  extern struct target_gcse *this_target_gcse;
>  extern struct target_bb_reorder *this_target_bb_reorder;
>  extern struct target_lower_subreg *this_target_lower_subreg;
> +#endif
>
>  struct GTY(()) target_globals {
>    struct target_flag_state *GTY((skip)) flag_state;
> @@ -57,6 +58,7 @@ struct GTY(()) target_globals {
>    struct target_lower_subreg *GTY((skip)) lower_subreg;
>  };
>
> +#if SWITCHABLE_TARGET
>  extern struct target_globals default_target_globals;
>
>  extern struct target_globals *save_target_globals (void);
> --- gcc/tree.c.jj       2014-01-03 11:40:33.000000000 +0100
> +++ gcc/tree.c  2014-01-07 21:27:35.590268195 +0100
> @@ -11527,6 +11527,28 @@ build_target_option_node (struct gcc_opt
>    return t;
>  }
>
> +/* Reset TREE_TARGET_GLOBALS cache for TARGET_OPTION_NODE.
> +   Called through htab_traverse.  */
> +
> +static int
> +prepare_target_option_node_for_pch (void **slot, void *)
> +{
> +  tree node = (tree) *slot;
> +  if (TREE_CODE (node) == TARGET_OPTION_NODE)
> +    TREE_TARGET_GLOBALS (node) = NULL;
> +  return 1;
> +}
> +
> +/* Clear TREE_TARGET_GLOBALS of all TARGET_OPTION_NODE trees,
> +   so that they aren't saved during PCH writing.  */
> +
> +void
> +prepare_target_option_nodes_for_pch (void)
> +{
> +  htab_traverse (cl_option_hash_table, prepare_target_option_node_for_pch,
> +                NULL);
> +}
> +
>  /* Determine the "ultimate origin" of a block.  The block may be an inlined
>     instance of an inlined instance of a block which is local to an inline
>     function, so we have to trace all of the way back through the origin chain
> --- gcc/c-family/c-pch.c.jj     2014-01-03 11:40:29.000000000 +0100
> +++ gcc/c-family/c-pch.c        2014-01-07 21:29:42.985600418 +0100
> @@ -180,6 +180,8 @@ c_common_write_pch (void)
>
>    (*debug_hooks->handle_pch) (1);
>
> +  prepare_target_option_nodes_for_pch ();
> +
>    cpp_write_pch_deps (parse_in, pch_outfile);
>
>    gt_pch_save (pch_outfile);
> --- gcc/config/i386/i386.h.jj   2014-01-06 22:37:19.000000000 +0100
> +++ gcc/config/i386/i386.h      2014-01-07 17:21:09.049106675 +0100
> @@ -2510,6 +2510,9 @@ extern void debug_dispatch_window (int);
>  #define IX86_HLE_ACQUIRE (1 << 16)
>  #define IX86_HLE_RELEASE (1 << 17)
>
> +/* For switching between functions with different target attributes.  */
> +#define SWITCHABLE_TARGET 1
> +
>  /*
>  Local variables:
>  version-control: t
> --- gcc/config/i386/i386.c.jj   2014-01-06 22:37:19.000000000 +0100
> +++ gcc/config/i386/i386.c      2014-01-07 17:21:09.049106675 +0100
> @@ -80,6 +80,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-pass.h"
>  #include "context.h"
>  #include "pass_manager.h"
> +#include "target-globals.h"
>
>  static rtx legitimize_dllimport_symbol (rtx, bool);
>  static rtx legitimize_pe_coff_extern_decl (rtx, bool);
> @@ -4868,16 +4869,25 @@ ix86_set_current_function (tree fndecl)
>         {
>           cl_target_option_restore (&global_options,
>                                     TREE_TARGET_OPTION (new_tree));
> -         target_reinit ();
> +         if (TREE_TARGET_GLOBALS (new_tree))
> +           restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
> +         else
> +           TREE_TARGET_GLOBALS (new_tree)
> +             = save_target_globals_default_opts ();
>         }
>
>        else if (old_tree)
>         {
> -         struct cl_target_option *def
> -           = TREE_TARGET_OPTION (target_option_current_node);
> -
> -         cl_target_option_restore (&global_options, def);
> -         target_reinit ();
> +         new_tree = target_option_current_node;
> +         cl_target_option_restore (&global_options,
> +                                   TREE_TARGET_OPTION (new_tree));
> +         if (TREE_TARGET_GLOBALS (new_tree))
> +           restore_target_globals (TREE_TARGET_GLOBALS (new_tree));
> +         else if (new_tree == target_option_default_node)
> +           restore_target_globals (&default_target_globals);
> +         else
> +           TREE_TARGET_GLOBALS (new_tree)
> +             = save_target_globals_default_opts ();
>         }
>      }
>  }
>
>
>         Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
  2014-01-08 12:45       ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2) Jakub Jelinek
  2014-01-08 14:29         ` Richard Biener
@ 2014-01-08 18:34         ` Jakub Jelinek
  2014-01-08 19:41           ` Richard Sandiford
  2014-01-09 16:25           ` Richard Henderson
  2014-01-09 16:16         ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2) Richard Henderson
  2 siblings, 2 replies; 39+ messages in thread
From: Jakub Jelinek @ 2014-01-08 18:34 UTC (permalink / raw)
  To: Richard Biener, Richard Sandiford; +Cc: gcc-patches

On Wed, Jan 08, 2014 at 01:45:40PM +0100, Jakub Jelinek wrote:
> I'd like to get rid of all the XCNEW calls in target-globals.c as a
> follow-up.

Here it is.  The rationale is both to avoid many separate heap allocations
and if TARGET_OPTION_NODE is no longer needed (all FUNCTION_DECLs
referencing it are e.g. optimized away, say static unused functions)
to avoid leaking memory.

Bootstrapped/regtested on x86_64-linux and i686-linux (together
with the i386 SWITCHABLE_TARGET patch).

Though, looking at the sizes, i686-linux allocates 0x67928
bytes which I think with ggc-page.c we allocate 0.5MB for it (acceptable),
on x86_64-linux the allocation size is 0x83aa8 and thus only ~ 15KB over
to fit into 0.5MB, thus I think we allocate 1MB.
So, if we wanted to tune for x86_64, we could not allocate say
target_flag_state (size 0x5008) in the big chunk, but instead make
it GTY((atomic)) and allocate separately.

Or perhaps do that for other very large structs?  In any case, that doesn't
look like something that probably would need to be retuned for every
release.

The current sizes of the structs are:
struct target_globals		0x80		0x40
struct target_flag_state	0x20		0x20
struct target_regs		0x5008		0x5008
struct target_hard_regs		0x35c8		0x33f8
struct target_reload		0xef70		0xef70
struct target_expmed		0x180b0		0xf4b0
struct target_optabs		0x4f0		0x4b9
struct target_cfgloop		0x1c		0x1c
struct target_ira		0x9628		0x9620
struct target_ira_int		0x3fca8		0x322e4
struct target_lra_int		0xa718		0x4e70
struct target_builtins		0x268		0x268
struct target_gcse		0x62		0x62
struct target_bb_reorder	0x4		0x4
struct target_lower_subreg	0x24c		0x18c

Perhaps use cut-off of 4KB with current sizes, anything below that
would be allocated in the single block, anything above it separately.
So 7 structs allocated together, 7 separately.

2014-01-08  Jakub Jelinek  <jakub@redhat.com>

	* target-globals.c (save_target_globals): Allocate most of the
	structs using GC in payload of target_globals struct instead
	of allocating them on the heap.

--- gcc/target-globals.c.jj	2014-01-08 10:23:22.000000000 +0100
+++ gcc/target-globals.c	2014-01-08 14:00:13.183231122 +0100
@@ -68,24 +68,43 @@ struct target_globals *
 save_target_globals (void)
 {
   struct target_globals *g;
-
-  g = ggc_alloc_target_globals ();
-  g->flag_state = XCNEW (struct target_flag_state);
-  g->regs = XCNEW (struct target_regs);
+  struct target_globals_extra {
+    struct target_globals g;
+    struct target_flag_state flag_state;
+    struct target_regs regs;
+    struct target_hard_regs hard_regs;
+    struct target_reload reload;
+    struct target_expmed expmed;
+    struct target_optabs optabs;
+    struct target_cfgloop cfgloop;
+    struct target_ira ira;
+    struct target_ira_int ira_int;
+    struct target_lra_int lra_int;
+    struct target_builtins builtins;
+    struct target_gcse gcse;
+    struct target_bb_reorder bb_reorder;
+    struct target_lower_subreg lower_subreg;
+  } *p;
+  p = (struct target_globals_extra *)
+      ggc_internal_cleared_alloc_stat (sizeof (struct target_globals_extra)
+				       PASS_MEM_STAT);
+  g = (struct target_globals *) p;
+  g->flag_state = &p->flag_state;
+  g->regs = &p->regs;
   g->rtl = ggc_alloc_cleared_target_rtl ();
-  g->hard_regs = XCNEW (struct target_hard_regs);
-  g->reload = XCNEW (struct target_reload);
-  g->expmed = XCNEW (struct target_expmed);
-  g->optabs = XCNEW (struct target_optabs);
+  g->hard_regs = &p->hard_regs;
+  g->reload = &p->reload;
+  g->expmed = &p->expmed;
+  g->optabs = &p->optabs;
   g->libfuncs = ggc_alloc_cleared_target_libfuncs ();
-  g->cfgloop = XCNEW (struct target_cfgloop);
-  g->ira = XCNEW (struct target_ira);
-  g->ira_int = XCNEW (struct target_ira_int);
-  g->lra_int = XCNEW (struct target_lra_int);
-  g->builtins = XCNEW (struct target_builtins);
-  g->gcse = XCNEW (struct target_gcse);
-  g->bb_reorder = XCNEW (struct target_bb_reorder);
-  g->lower_subreg = XCNEW (struct target_lower_subreg);
+  g->cfgloop = &p->cfgloop;
+  g->ira = &p->ira;
+  g->ira_int = &p->ira_int;
+  g->lra_int = &p->lra_int;
+  g->builtins = &p->builtins;
+  g->gcse = &p->gcse;
+  g->bb_reorder = &p->bb_reorder;
+  g->lower_subreg = &p->lower_subreg;
   restore_target_globals (g);
   init_reg_sets ();
   target_reinit ();


	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
  2014-01-08 18:34         ` [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs Jakub Jelinek
@ 2014-01-08 19:41           ` Richard Sandiford
  2014-01-08 19:54             ` Jakub Jelinek
  2014-01-09 16:25           ` Richard Henderson
  1 sibling, 1 reply; 39+ messages in thread
From: Richard Sandiford @ 2014-01-08 19:41 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, gcc-patches

Jakub Jelinek <jakub@redhat.com> writes:
> 2014-01-08  Jakub Jelinek  <jakub@redhat.com>
>
> 	* target-globals.c (save_target_globals): Allocate most of the
> 	structs using GC in payload of target_globals struct instead
> 	of allocating them on the heap.

Looks good to me FWIW.  I don't know either way about the one-big-blob thing.

Note that we'll still leak memory when deleting TARGET_OPTION_NODEs
because target_ira_int and target_lra_int have pointers to heap-allocated
storage.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
  2014-01-08 19:41           ` Richard Sandiford
@ 2014-01-08 19:54             ` Jakub Jelinek
  0 siblings, 0 replies; 39+ messages in thread
From: Jakub Jelinek @ 2014-01-08 19:54 UTC (permalink / raw)
  To: Richard Biener, gcc-patches, rdsandiford

On Wed, Jan 08, 2014 at 07:41:26PM +0000, Richard Sandiford wrote:
> Jakub Jelinek <jakub@redhat.com> writes:
> > 2014-01-08  Jakub Jelinek  <jakub@redhat.com>
> >
> > 	* target-globals.c (save_target_globals): Allocate most of the
> > 	structs using GC in payload of target_globals struct instead
> > 	of allocating them on the heap.
> 
> Looks good to me FWIW.  I don't know either way about the one-big-blob thing.
> 
> Note that we'll still leak memory when deleting TARGET_OPTION_NODEs
> because target_ira_int and target_lra_int have pointers to heap-allocated
> storage.

Yeah, perhaps that is something to fix incrementally.

But, at least we will not leak ~ 0.5MB per (unique) target attribute
used on some unused function.

	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCH] Fix PR58115
  2014-01-07 14:10               ` Richard Biener
@ 2014-01-09  0:31                 ` Bernd Edlinger
  2014-01-09  9:02                   ` Richard Sandiford
  0 siblings, 1 reply; 39+ messages in thread
From: Bernd Edlinger @ 2014-01-09  0:31 UTC (permalink / raw)
  To: Richard Biener, Jakub Jelinek, gcc-patches, Richard Sandiford

[-- Attachment #1: Type: text/plain, Size: 5980 bytes --]

Hi,

On Tue, 7 Jan 2014 15:10:20, Richard Biener wrote:
>
> On Tue, Jan 7, 2014 at 1:12 PM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
>>>> How about this patch for the big comment?
>>>>
>>>
>>> The comment should say that target_set_current_function()
>>> cannot call target_reinit() because:
>>>
>>> target_reinit()=>lang_dependent_init_target()
>>> =>init_optabs()=>init_all_optabs(this_fn_optabs);
>>>
>>> uses this_fn_optabs which is undefined here.
>>>
>>> However many targets (nios2, rx, i386, rs6000) do exactly that.
>>>
>>> Is there currently any target, that sets this_target_optab in the
>>> target_set_current_function?
>>
>> MIPS :-) (via save_target_globals_default_opts=>save_target_globals)
>>
>> I think other targets need to do the same thing in order for tests
>> like that extended intrinsics_4.c to work. How does this patch look?
>> Tested on x86_64-linux-gnu.
>>
>> I didn't remove save_target_globals_default_opts because there the
>> temporary optimization_current_node also protects a call to init_reg_sets.
>
> Well, if it works the patch is ok. You're way more familiar with the
> details of this machinery.
>
> Richard.
>

I found another test case that still fails with today's trunk:

#include <immintrin.h>

__m256 a[10], b[10], c[10];

void __attribute__((target ("sse2"), optimize (3)))
foo (void)
{
}

void __attribute__((target ("avx"), optimize (3)))
bar (void)
{
  a[0] = _mm256_and_ps (b[0], c[0]);
}

compile with i686-pc-linux-gnu-gcc -O2 -msse2 -mno-avx -S  

The attached patch seems to fix this test case for
targets that do not have SWITCHABLE_TARGET.

What do you think about it?

I think Jakub's patch will fix this case, but I did not try.
However even if the i368 is now clean, there are
still many targets that use target_reinit() in
target_set_current_function.


Bernd.

>> Thanks,
>> Richard
>>
>>
>> gcc/
>> PR target/58115
>> * target-globals.c (save_target_globals): Remove this_fn_optab
>> handling.
>> * toplev.c: Include optabs.h.
>> (target_reinit): Temporarily restore the global options if another
>> set of options are in force.
>>
>> gcc/testsuite/
>> * gcc.target/i386/intrinsics_4.c (bar): New function.
>>
>> Index: gcc/target-globals.c
>> ===================================================================
>> --- gcc/target-globals.c 2014-01-02 22:16:03.042278971 +0000
>> +++ gcc/target-globals.c 2014-01-07 12:08:33.569900970 +0000
>> @@ -68,7 +68,6 @@ struct target_globals *
>> save_target_globals (void)
>> {
>> struct target_globals *g;
>> - struct target_optabs *saved_this_fn_optabs = this_fn_optabs;
>>
>> g = ggc_alloc_target_globals ();
>> g->flag_state = XCNEW (struct target_flag_state);
>> @@ -88,10 +87,8 @@ save_target_globals (void)
>> g->bb_reorder = XCNEW (struct target_bb_reorder);
>> g->lower_subreg = XCNEW (struct target_lower_subreg);
>> restore_target_globals (g);
>> - this_fn_optabs = this_target_optabs;
>> init_reg_sets ();
>> target_reinit ();
>> - this_fn_optabs = saved_this_fn_optabs;
>> return g;
>> }
>>
>> Index: gcc/toplev.c
>> ===================================================================
>> --- gcc/toplev.c 2014-01-07 08:11:43.888058805 +0000
>> +++ gcc/toplev.c 2014-01-07 12:10:19.448096479 +0000
>> @@ -78,6 +78,7 @@ Software Foundation; either version 3, o
>> #include "diagnostic-color.h"
>> #include "context.h"
>> #include "pass_manager.h"
>> +#include "optabs.h"
>>
>> #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
>> #include "dbxout.h"
>> @@ -1752,6 +1753,23 @@ target_reinit (void)
>> {
>> struct rtl_data saved_x_rtl;
>> rtx *saved_regno_reg_rtx;
>> + tree saved_optimization_current_node;
>> + struct target_optabs *saved_this_fn_optabs;
>> +
>> + /* Temporarily switch to the default optimization node, so that
>> + *this_target_optabs is set to the default, not reflecting
>> + whatever a previous function used for the optimize
>> + attribute. */
>> + saved_optimization_current_node = optimization_current_node;
>> + saved_this_fn_optabs = this_fn_optabs;
>> + if (saved_optimization_current_node != optimization_default_node)
>> + {
>> + optimization_current_node = optimization_default_node;
>> + cl_optimization_restore
>> + (&global_options,
>> + TREE_OPTIMIZATION (optimization_default_node));
>> + }
>> + this_fn_optabs = this_target_optabs;
>>
>> /* Save *crtl and regno_reg_rtx around the reinitialization
>> to allow target_reinit being called even after prepare_function_start. */
>> @@ -1769,7 +1787,16 @@ target_reinit (void)
>> /* Reinitialize lang-dependent parts. */
>> lang_dependent_init_target ();
>>
>> - /* And restore it at the end, as free_after_compilation from
>> + /* Restore the original optimization node. */
>> + if (saved_optimization_current_node != optimization_default_node)
>> + {
>> + optimization_current_node = saved_optimization_current_node;
>> + cl_optimization_restore (&global_options,
>> + TREE_OPTIMIZATION (optimization_current_node));
>> + }
>> + this_fn_optabs = saved_this_fn_optabs;
>> +
>> + /* Restore regno_reg_rtx at the end, as free_after_compilation from
>> expand_dummy_function_end clears it. */
>> if (saved_regno_reg_rtx)
>> {
>> Index: gcc/testsuite/gcc.target/i386/intrinsics_4.c
>> ===================================================================
>> --- gcc/testsuite/gcc.target/i386/intrinsics_4.c 2013-06-27 22:30:03.877275875 +0100
>> +++ gcc/testsuite/gcc.target/i386/intrinsics_4.c 2014-01-07 08:15:51.509627927 +0000
>> @@ -12,3 +12,10 @@ foo (void)
>> {
>> a[0] = _mm256_and_ps (b[0], c[0]);
>> }
>> +
>> +/* Try again with a combination of target and optimization attributes. */
>> +void __attribute__((target ("avx"), optimize(3)))
>> +bar (void)
>> +{
>> + a[0] = _mm256_and_ps (b[0], c[0]);
>> +} 		 	   		  

[-- Attachment #2: patch-optabs.diff --]
[-- Type: application/octet-stream, Size: 1159 bytes --]

Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	(revision 206437)
+++ gcc/optabs.c	(working copy)
@@ -6227,11 +6227,23 @@ void
 init_tree_optimization_optabs (tree optnode)
 {
   /* Quick exit if we have already computed optabs for this target.  */
+#if SWITCHABLE_TARGET
   if (TREE_OPTIMIZATION_BASE_OPTABS (optnode) == this_target_optabs)
     return;
+  TREE_OPTIMIZATION_BASE_OPTABS (optnode) = this_target_optabs;
+#else
+  if (TREE_OPTIMIZATION_BASE_OPTABS (optnode) == NULL)
+    {
+      TREE_OPTIMIZATION_BASE_OPTABS (optnode) = (struct target_optabs *)
+	ggc_alloc_atomic (sizeof (struct target_optabs));
+    }
+  else if (memcmp (TREE_OPTIMIZATION_BASE_OPTABS (optnode),
+		   this_target_optabs, sizeof (struct target_optabs)) == 0)
+    return;
+  *TREE_OPTIMIZATION_BASE_OPTABS (optnode) = *this_target_optabs;
+#endif
 
   /* Forget any previous information and set up for the current target.  */
-  TREE_OPTIMIZATION_BASE_OPTABS (optnode) = this_target_optabs;
   struct target_optabs *tmp_optabs = (struct target_optabs *)
     TREE_OPTIMIZATION_OPTABS (optnode);
   if (tmp_optabs)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2014-01-09  0:31                 ` Bernd Edlinger
@ 2014-01-09  9:02                   ` Richard Sandiford
  2014-01-09  9:28                     ` Jakub Jelinek
  2014-01-09 10:24                     ` Jakub Jelinek
  0 siblings, 2 replies; 39+ messages in thread
From: Richard Sandiford @ 2014-01-09  9:02 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: Richard Biener, Jakub Jelinek, gcc-patches

Bernd Edlinger <bernd.edlinger@hotmail.de> writes:
> I found another test case that still fails with today's trunk:
>
> #include <immintrin.h>
>
> __m256 a[10], b[10], c[10];
>
> void __attribute__((target ("sse2"), optimize (3)))
> foo (void)
> {
> }
>
> void __attribute__((target ("avx"), optimize (3)))
> bar (void)
> {
>   a[0] = _mm256_and_ps (b[0], c[0]);
> }
>
> compile with i686-pc-linux-gnu-gcc -O2 -msse2 -mno-avx -S  
>
> The attached patch seems to fix this test case for
> targets that do not have SWITCHABLE_TARGET.
>
> What do you think about it?

It looks like a correct fix, but the memcpy is going to be pretty
expensive, since in most cases there will be no difference.

Calling target_reinit is the rare case, and already very slow itself,
so maybe an easier option would be to have a target_reinit counter.
I.e. for !SWITCHABLE_TARGETs only, replace TREE_OPTIMIZATION_BASE_OPTABS
with a "number of target_reinit calls" field.

Not sure it's worth the effort though.  The other targets should
really move to SWITCHABLE_TARGET too.  One of the reasons why I made
SWITCHABLE_TARGET optional was that I was worried it might slow down
the compiler for targets that didn't need it.  Jakub's measurements
suggest that any compile-time effect is in the noise though.

> I think Jakub's patch will fix this case, but I did not try.
> However even if the i368 is now clean, there are
> still many targets that use target_reinit() in
> target_set_current_function.

FWIW I only see three others (nios, rs6000 and rx).  nios and rs6000
are direct cut-&-pastes of the i386 version so should be easy to switch.
rx looks more like MIPS in that it's switching between two specific
subtargets.

Thanks,
Richard

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2014-01-09  9:02                   ` Richard Sandiford
@ 2014-01-09  9:28                     ` Jakub Jelinek
  2014-01-09  9:36                       ` Bernd Edlinger
  2014-01-09 10:24                     ` Jakub Jelinek
  1 sibling, 1 reply; 39+ messages in thread
From: Jakub Jelinek @ 2014-01-09  9:28 UTC (permalink / raw)
  To: Bernd Edlinger, Richard Biener, gcc-patches, rdsandiford

On Thu, Jan 09, 2014 at 09:02:28AM +0000, Richard Sandiford wrote:
> > I think Jakub's patch will fix this case, but I did not try.
> > However even if the i368 is now clean, there are
> > still many targets that use target_reinit() in
> > target_set_current_function.
> 
> FWIW I only see three others (nios, rs6000 and rx).  nios and rs6000
> are direct cut-&-pastes of the i386 version so should be easy to switch.
> rx looks more like MIPS in that it's switching between two specific
> subtargets.

Yeah, if i386 is changed into SWITCHABLE_TARGET, then I'd strongly encourage
rs6000 and nios folks to follow the suit.

	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCH] Fix PR58115
  2014-01-09  9:28                     ` Jakub Jelinek
@ 2014-01-09  9:36                       ` Bernd Edlinger
  2014-01-09  9:43                         ` Jakub Jelinek
  2014-01-11 17:19                         ` Peter Bergner
  0 siblings, 2 replies; 39+ messages in thread
From: Bernd Edlinger @ 2014-01-09  9:36 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener, gcc-patches, rdsandiford

On Thu, 9 Jan 2014 10:28:43, Jakub Jelinek wrote:
>
> On Thu, Jan 09, 2014 at 09:02:28AM +0000, Richard Sandiford wrote:
>>> I think Jakub's patch will fix this case, but I did not try.
>>> However even if the i368 is now clean, there are
>>> still many targets that use target_reinit() in
>>> target_set_current_function.
>>
>> FWIW I only see three others (nios, rs6000 and rx). nios and rs6000
>> are direct cut-&-pastes of the i386 version so should be easy to switch.
>> rx looks more like MIPS in that it's switching between two specific
>> subtargets.
>
> Yeah, if i386 is changed into SWITCHABLE_TARGET, then I'd strongly encourage
> rs6000 and nios folks to follow the suit.
>
> Jakub

Ok for me. Hope they read this thread...

If that is our policy for 4.9.0, then the comment in function.c where
the targetm.set_current_function (fndecl); is called should
_very_ clearly say that this callback is no longer allowed to call
target_reinit() any more.


Thanks
Bernd. 		 	   		  

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2014-01-09  9:36                       ` Bernd Edlinger
@ 2014-01-09  9:43                         ` Jakub Jelinek
  2014-01-11 17:19                         ` Peter Bergner
  1 sibling, 0 replies; 39+ messages in thread
From: Jakub Jelinek @ 2014-01-09  9:43 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: Richard Biener, gcc-patches, rdsandiford

On Thu, Jan 09, 2014 at 10:36:43AM +0100, Bernd Edlinger wrote:
> On Thu, 9 Jan 2014 10:28:43, Jakub Jelinek wrote:
> >
> > On Thu, Jan 09, 2014 at 09:02:28AM +0000, Richard Sandiford wrote:
> >>> I think Jakub's patch will fix this case, but I did not try.
> >>> However even if the i368 is now clean, there are
> >>> still many targets that use target_reinit() in
> >>> target_set_current_function.
> >>
> >> FWIW I only see three others (nios, rs6000 and rx). nios and rs6000
> >> are direct cut-&-pastes of the i386 version so should be easy to switch.
> >> rx looks more like MIPS in that it's switching between two specific
> >> subtargets.
> >
> > Yeah, if i386 is changed into SWITCHABLE_TARGET, then I'd strongly encourage
> > rs6000 and nios folks to follow the suit.
> >
> > Jakub
> 
> Ok for me. Hope they read this thread...
> 
> If that is our policy for 4.9.0, then the comment in function.c where
> the targetm.set_current_function (fndecl); is called should
> _very_ clearly say that this callback is no longer allowed to call
> target_reinit() any more.

But that is not the case, even with the i386 SWITCHABLE_TARGET patch it may
call target_reinit, because that is what save_target_globals_default_opts
calls.  It just calls it temporarily with the optimization_default_node.

	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2014-01-09  9:02                   ` Richard Sandiford
  2014-01-09  9:28                     ` Jakub Jelinek
@ 2014-01-09 10:24                     ` Jakub Jelinek
  2014-01-09 11:11                       ` Richard Biener
  1 sibling, 1 reply; 39+ messages in thread
From: Jakub Jelinek @ 2014-01-09 10:24 UTC (permalink / raw)
  To: Bernd Edlinger, Richard Biener, gcc-patches, rdsandiford

On Thu, Jan 09, 2014 at 09:02:28AM +0000, Richard Sandiford wrote:
> It looks like a correct fix, but the memcpy is going to be pretty
> expensive, since in most cases there will be no difference.

Perhaps we should add another tree code, which would represent the
combination of TARGET_OPTION_NODE and OPTIMIZATION_NODE, FUNCTION_DECL
would then refer to this combo node only and that new tree would
refer to both TARGET_OPTION_NODE and OPTIMIZATION_NODE.
That way we could stick the saved optabs into the new node rather than
having default opt cached target optabs, non-default opt with default
target optabs cached too, but for non-default target non-default opt we
don't cache anything and always recompute.

Or perhaps just merge both TARGET_OPTION_NODE and OPTIMIZATION_NODE
into one and let both target and optimize attributes adjust it?

	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2014-01-09 10:24                     ` Jakub Jelinek
@ 2014-01-09 11:11                       ` Richard Biener
  2014-01-09 11:16                         ` Jakub Jelinek
  0 siblings, 1 reply; 39+ messages in thread
From: Richard Biener @ 2014-01-09 11:11 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Edlinger, gcc-patches, Richard Sandiford

On Thu, Jan 9, 2014 at 11:24 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Jan 09, 2014 at 09:02:28AM +0000, Richard Sandiford wrote:
>> It looks like a correct fix, but the memcpy is going to be pretty
>> expensive, since in most cases there will be no difference.
>
> Perhaps we should add another tree code, which would represent the
> combination of TARGET_OPTION_NODE and OPTIMIZATION_NODE, FUNCTION_DECL
> would then refer to this combo node only and that new tree would
> refer to both TARGET_OPTION_NODE and OPTIMIZATION_NODE.
> That way we could stick the saved optabs into the new node rather than
> having default opt cached target optabs, non-default opt with default
> target optabs cached too, but for non-default target non-default opt we
> don't cache anything and always recompute.
>
> Or perhaps just merge both TARGET_OPTION_NODE and OPTIMIZATION_NODE
> into one and let both target and optimize attributes adjust it?

Yeah - I fail to see why we have two different tree nodes here anyway.

Richard.

>         Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Fix PR58115
  2014-01-09 11:11                       ` Richard Biener
@ 2014-01-09 11:16                         ` Jakub Jelinek
  0 siblings, 0 replies; 39+ messages in thread
From: Jakub Jelinek @ 2014-01-09 11:16 UTC (permalink / raw)
  To: Richard Biener; +Cc: Bernd Edlinger, gcc-patches, Richard Sandiford

On Thu, Jan 09, 2014 at 12:11:12PM +0100, Richard Biener wrote:
> On Thu, Jan 9, 2014 at 11:24 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Thu, Jan 09, 2014 at 09:02:28AM +0000, Richard Sandiford wrote:
> >> It looks like a correct fix, but the memcpy is going to be pretty
> >> expensive, since in most cases there will be no difference.
> >
> > Perhaps we should add another tree code, which would represent the
> > combination of TARGET_OPTION_NODE and OPTIMIZATION_NODE, FUNCTION_DECL
> > would then refer to this combo node only and that new tree would
> > refer to both TARGET_OPTION_NODE and OPTIMIZATION_NODE.
> > That way we could stick the saved optabs into the new node rather than
> > having default opt cached target optabs, non-default opt with default
> > target optabs cached too, but for non-default target non-default opt we
> > don't cache anything and always recompute.
> >
> > Or perhaps just merge both TARGET_OPTION_NODE and OPTIMIZATION_NODE
> > into one and let both target and optimize attributes adjust it?
> 
> Yeah - I fail to see why we have two different tree nodes here anyway.

Well, if the target_reinit stuff (except for optabs) is only dependent on
the target flags and not on the optimization flags (is that really the
case?), then by not having a single node only it is possible to save the
0.5MB or what target blob only once per target specific option combination,
while with only one node we'd need to duplicate that.  And, I think
OPTIMIZATION_NODE works regardless of target support, doesn't need
target_reinit etc.

	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2)
  2014-01-08 12:45       ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2) Jakub Jelinek
  2014-01-08 14:29         ` Richard Biener
  2014-01-08 18:34         ` [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs Jakub Jelinek
@ 2014-01-09 16:16         ` Richard Henderson
  2 siblings, 0 replies; 39+ messages in thread
From: Richard Henderson @ 2014-01-09 16:16 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener
  Cc: Bernd Edlinger, gcc-patches, Richard Sandiford

On 01/08/2014 04:45 AM, Jakub Jelinek wrote:
> 2014-01-07  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/58115
> 	* tree-core.h (struct target_globals): New forward declaration.
> 	(struct tree_target_option): Add globals field.
> 	* tree.h (TREE_TARGET_GLOBALS): Define.
> 	(prepare_target_option_nodes_for_pch): New prototype.
> 	* target-globals.h (struct target_globals): Define even if
> 	!SWITCHABLE_TARGET.
> 	* tree.c (prepare_target_option_node_for_pch,
> 	prepare_target_option_nodes_for_pch): New functions.
> 	* config/i386/i386.h (SWITCHABLE_TARGET): Define.
> 	* config/i386/i386.c: Include target-globals.h.
> 	(ix86_set_current_function): Instead of doing target_reinit
> 	unconditionally, use save_target_globals_default_opts and
> 	restore_target_globals.
> c-family/
> 	* c-pch.c (c_common_write_pch): Call
> 	prepare_target_option_nodes_for_pch.

Ok.


r~

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
  2014-01-08 18:34         ` [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs Jakub Jelinek
  2014-01-08 19:41           ` Richard Sandiford
@ 2014-01-09 16:25           ` Richard Henderson
  2014-01-09 16:35             ` Jakub Jelinek
  1 sibling, 1 reply; 39+ messages in thread
From: Richard Henderson @ 2014-01-09 16:25 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Biener, Richard Sandiford; +Cc: gcc-patches

On 01/08/2014 10:34 AM, Jakub Jelinek wrote:
>    struct target_globals *g;
> -
> -  g = ggc_alloc_target_globals ();
> -  g->flag_state = XCNEW (struct target_flag_state);
> -  g->regs = XCNEW (struct target_regs);
> +  struct target_globals_extra {
> +    struct target_globals g;
> +    struct target_flag_state flag_state;
> +    struct target_regs regs;
> +    struct target_hard_regs hard_regs;
> +    struct target_reload reload;
> +    struct target_expmed expmed;
> +    struct target_optabs optabs;
> +    struct target_cfgloop cfgloop;
> +    struct target_ira ira;
> +    struct target_ira_int ira_int;
> +    struct target_lra_int lra_int;
> +    struct target_builtins builtins;
> +    struct target_gcse gcse;
> +    struct target_bb_reorder bb_reorder;
> +    struct target_lower_subreg lower_subreg;
> +  } *p;
> +  p = (struct target_globals_extra *)
> +      ggc_internal_cleared_alloc_stat (sizeof (struct target_globals_extra)
> +				       PASS_MEM_STAT);
> +  g = (struct target_globals *) p;
> +  g->flag_state = &p->flag_state;
> +  g->regs = &p->regs;
>    g->rtl = ggc_alloc_cleared_target_rtl ();

So, we're relying on something pointing to G, thus keeping the whole P alive?
I suppose that works but it's fairly ugly that's for sure.

As for the extra ~500k wasted on x86_64, we can either fix our gc allocator to
do something sensible with these high-order allocations, or we can do nearly
this same trick only with libc.  I.e.

  struct target_globals_extra {
    struct target_flag_state flag_state;
    struct target_regs regs;
    struct target_hard_regs hard_regs;
    struct target_reload reload;
    struct target_expmed expmed;
    struct target_optabs optabs;
    struct target_cfgloop cfgloop;
    struct target_ira ira;
    struct target_ira_int ira_int;
    struct target_lra_int lra_int;
    struct target_builtins builtins;
    struct target_gcse gcse;
    struct target_bb_reorder bb_reorder;
    struct target_lower_subreg lower_subreg;
  } *p;

  g = ggc_alloc_target_globals ();
  p = XCNEW (target_globals_extra);
  ...


r~

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
  2014-01-09 16:25           ` Richard Henderson
@ 2014-01-09 16:35             ` Jakub Jelinek
  2014-01-09 16:42               ` Richard Henderson
  0 siblings, 1 reply; 39+ messages in thread
From: Jakub Jelinek @ 2014-01-09 16:35 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Richard Biener, Richard Sandiford, gcc-patches

On Thu, Jan 09, 2014 at 08:25:31AM -0800, Richard Henderson wrote:
> > +  p = (struct target_globals_extra *)
> > +      ggc_internal_cleared_alloc_stat (sizeof (struct target_globals_extra)
> > +				       PASS_MEM_STAT);
> > +  g = (struct target_globals *) p;
> > +  g->flag_state = &p->flag_state;
> > +  g->regs = &p->regs;
> >    g->rtl = ggc_alloc_cleared_target_rtl ();
> 
> So, we're relying on something pointing to G, thus keeping the whole P alive?
> I suppose that works but it's fairly ugly that's for sure.

The separate structures aren't really installed individually, they are
always installed together through restore_target_globals.  As long as the
any FUNCTION_DECL with such TARGET_OPTION_NODE exists, it will be reachable.

The reason why it needs to be GC is:
1) in two of these target_* structures there are embedded rtxes etc. the GC
needs to see
2) if all FUNCTION_DECL with such combination of target attributes are GCed,
we'd leak memory

> As for the extra ~500k wasted on x86_64, we can either fix our gc allocator to
> do something sensible with these high-order allocations, or we can do nearly
> this same trick only with libc.  I.e.
> 
>   struct target_globals_extra {
>     struct target_flag_state flag_state;
>     struct target_regs regs;
>     struct target_hard_regs hard_regs;
>     struct target_reload reload;
>     struct target_expmed expmed;
>     struct target_optabs optabs;
>     struct target_cfgloop cfgloop;
>     struct target_ira ira;
>     struct target_ira_int ira_int;
>     struct target_lra_int lra_int;
>     struct target_builtins builtins;
>     struct target_gcse gcse;
>     struct target_bb_reorder bb_reorder;
>     struct target_lower_subreg lower_subreg;
>   } *p;
> 
>   g = ggc_alloc_target_globals ();
>   p = XCNEW (target_globals_extra);
>   ...

That would be fine for 1), but would mean 2).  It is also fine to GC
allocate each structure individually, but some (like bb_reorder) are say
just 4 bytes long, so it might be overkill.

As noted by Richard S., IRA/LRA still puts pointers to heap allocated
objects into some of the structures, so there is some leak anyway, but not
as big.

	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
  2014-01-09 16:35             ` Jakub Jelinek
@ 2014-01-09 16:42               ` Richard Henderson
  2014-01-09 23:35                 ` Jakub Jelinek
  0 siblings, 1 reply; 39+ messages in thread
From: Richard Henderson @ 2014-01-09 16:42 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, Richard Sandiford, gcc-patches

On 01/09/2014 08:35 AM, Jakub Jelinek wrote:
> That would be fine for 1), but would mean 2).  It is also fine to GC
> allocate each structure individually, but some (like bb_reorder) are say
> just 4 bytes long, so it might be overkill.


Hmm..  Perhaps define the whole structure as you do, but somewhere global
enough that ggc-page.c can see it, and add to the extra_order_size_table?
I don't know how much memory wastage there would be there, but I can't imagine
it's as much as 0.5MB.


r~

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
  2014-01-09 16:42               ` Richard Henderson
@ 2014-01-09 23:35                 ` Jakub Jelinek
  2014-01-10 17:37                   ` Richard Henderson
  0 siblings, 1 reply; 39+ messages in thread
From: Jakub Jelinek @ 2014-01-09 23:35 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Richard Biener, Richard Sandiford, gcc-patches

On Thu, Jan 09, 2014 at 08:42:03AM -0800, Richard Henderson wrote:
> On 01/09/2014 08:35 AM, Jakub Jelinek wrote:
> > That would be fine for 1), but would mean 2).  It is also fine to GC
> > allocate each structure individually, but some (like bb_reorder) are say
> > just 4 bytes long, so it might be overkill.
> 
> 
> Hmm..  Perhaps define the whole structure as you do, but somewhere global
> enough that ggc-page.c can see it, and add to the extra_order_size_table?
> I don't know how much memory wastage there would be there, but I can't imagine
> it's as much as 0.5MB.

Here is the solution I was talking about earlier, allocate smaller structs
through GC together and larger separately.  Bootstrapped/regtested on
x86_64-linux and i686-linux.

2014-01-09  Jakub Jelinek  <jakub@redhat.com>

	* target-globals.c (save_target_globals): Allocate < 4KB structs using
	GC in payload of target_globals struct instead of allocating them on
	the heap and the larger structs separately using GC.
	* target-globals.h (struct target_globals): Make regs, hard_regs,
	reload, expmed, ira, ira_int and lra_fields GTY((atomic)) instead
	of GTY((skip)) and change type to void *.
	(reset_target_globals): Cast loads from those fields to corresponding
	types.

--- gcc/target-globals.h.jj	2014-01-09 19:24:20.000000000 +0100
+++ gcc/target-globals.h	2014-01-09 19:39:43.879348712 +0100
@@ -41,17 +41,17 @@ extern struct target_lower_subreg *this_
 
 struct GTY(()) target_globals {
   struct target_flag_state *GTY((skip)) flag_state;
-  struct target_regs *GTY((skip)) regs;
+  void *GTY((atomic)) regs;
   struct target_rtl *rtl;
-  struct target_hard_regs *GTY((skip)) hard_regs;
-  struct target_reload *GTY((skip)) reload;
-  struct target_expmed *GTY((skip)) expmed;
+  void *GTY((atomic)) hard_regs;
+  void *GTY((atomic)) reload;
+  void *GTY((atomic)) expmed;
   struct target_optabs *GTY((skip)) optabs;
   struct target_libfuncs *libfuncs;
   struct target_cfgloop *GTY((skip)) cfgloop;
-  struct target_ira *GTY((skip)) ira;
-  struct target_ira_int *GTY((skip)) ira_int;
-  struct target_lra_int *GTY((skip)) lra_int;
+  void *GTY((atomic)) ira;
+  void *GTY((atomic)) ira_int;
+  void *GTY((atomic)) lra_int;
   struct target_builtins *GTY((skip)) builtins;
   struct target_gcse *GTY((skip)) gcse;
   struct target_bb_reorder *GTY((skip)) bb_reorder;
@@ -68,17 +68,17 @@ static inline void
 restore_target_globals (struct target_globals *g)
 {
   this_target_flag_state = g->flag_state;
-  this_target_regs = g->regs;
+  this_target_regs = (struct target_regs *) g->regs;
   this_target_rtl = g->rtl;
-  this_target_hard_regs = g->hard_regs;
-  this_target_reload = g->reload;
-  this_target_expmed = g->expmed;
+  this_target_hard_regs = (struct target_hard_regs *) g->hard_regs;
+  this_target_reload = (struct target_reload *) g->reload;
+  this_target_expmed = (struct target_expmed *) g->expmed;
   this_target_optabs = g->optabs;
   this_target_libfuncs = g->libfuncs;
   this_target_cfgloop = g->cfgloop;
-  this_target_ira = g->ira;
-  this_target_ira_int = g->ira_int;
-  this_target_lra_int = g->lra_int;
+  this_target_ira = (struct target_ira *) g->ira;
+  this_target_ira_int = (struct target_ira_int *) g->ira_int;
+  this_target_lra_int = (struct target_lra_int *) g->lra_int;
   this_target_builtins = g->builtins;
   this_target_gcse = g->gcse;
   this_target_bb_reorder = g->bb_reorder;
--- gcc/target-globals.c.jj	2014-01-08 17:44:57.551583153 +0100
+++ gcc/target-globals.c	2014-01-09 19:38:21.013760564 +0100
@@ -68,24 +68,44 @@ struct target_globals *
 save_target_globals (void)
 {
   struct target_globals *g;
-
-  g = ggc_alloc_target_globals ();
-  g->flag_state = XCNEW (struct target_flag_state);
-  g->regs = XCNEW (struct target_regs);
+  struct target_globals_extra {
+    struct target_globals g;
+    struct target_flag_state flag_state;
+    struct target_optabs optabs;
+    struct target_cfgloop cfgloop;
+    struct target_builtins builtins;
+    struct target_gcse gcse;
+    struct target_bb_reorder bb_reorder;
+    struct target_lower_subreg lower_subreg;
+  } *p;
+  p = (struct target_globals_extra *)
+      ggc_internal_cleared_alloc_stat (sizeof (struct target_globals_extra)
+				       PASS_MEM_STAT);
+  g = (struct target_globals *) p;
+  g->flag_state = &p->flag_state;
+  g->regs = ggc_internal_cleared_alloc_stat (sizeof (struct target_regs)
+					     PASS_MEM_STAT);
   g->rtl = ggc_alloc_cleared_target_rtl ();
-  g->hard_regs = XCNEW (struct target_hard_regs);
-  g->reload = XCNEW (struct target_reload);
-  g->expmed = XCNEW (struct target_expmed);
-  g->optabs = XCNEW (struct target_optabs);
+  g->hard_regs
+    = ggc_internal_cleared_alloc_stat (sizeof (struct target_hard_regs)
+				       PASS_MEM_STAT);
+  g->reload = ggc_internal_cleared_alloc_stat (sizeof (struct target_reload)
+					       PASS_MEM_STAT);
+  g->expmed =  ggc_internal_cleared_alloc_stat (sizeof (struct target_expmed)
+						PASS_MEM_STAT);
+  g->optabs = &p->optabs;
   g->libfuncs = ggc_alloc_cleared_target_libfuncs ();
-  g->cfgloop = XCNEW (struct target_cfgloop);
-  g->ira = XCNEW (struct target_ira);
-  g->ira_int = XCNEW (struct target_ira_int);
-  g->lra_int = XCNEW (struct target_lra_int);
-  g->builtins = XCNEW (struct target_builtins);
-  g->gcse = XCNEW (struct target_gcse);
-  g->bb_reorder = XCNEW (struct target_bb_reorder);
-  g->lower_subreg = XCNEW (struct target_lower_subreg);
+  g->cfgloop = &p->cfgloop;
+  g->ira = ggc_internal_cleared_alloc_stat (sizeof (struct target_ira)
+					    PASS_MEM_STAT);
+  g->ira_int = ggc_internal_cleared_alloc_stat (sizeof (struct target_ira_int)
+						PASS_MEM_STAT);
+  g->lra_int = ggc_internal_cleared_alloc_stat (sizeof (struct target_lra_int)
+						PASS_MEM_STAT);
+  g->builtins = &p->builtins;
+  g->gcse = &p->gcse;
+  g->bb_reorder = &p->bb_reorder;
+  g->lower_subreg = &p->lower_subreg;
   restore_target_globals (g);
   init_reg_sets ();
   target_reinit ();


	Jakub

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
  2014-01-09 23:35                 ` Jakub Jelinek
@ 2014-01-10 17:37                   ` Richard Henderson
  2014-01-12 13:23                     ` Richard Biener
  0 siblings, 1 reply; 39+ messages in thread
From: Richard Henderson @ 2014-01-10 17:37 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, Richard Sandiford, gcc-patches

On 01/09/2014 03:34 PM, Jakub Jelinek wrote:
> 2014-01-09  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* target-globals.c (save_target_globals): Allocate < 4KB structs using
> 	GC in payload of target_globals struct instead of allocating them on
> 	the heap and the larger structs separately using GC.
> 	* target-globals.h (struct target_globals): Make regs, hard_regs,
> 	reload, expmed, ira, ira_int and lra_fields GTY((atomic)) instead
> 	of GTY((skip)) and change type to void *.
> 	(reset_target_globals): Cast loads from those fields to corresponding
> 	types.
> 
> --- gcc/target-globals.h.jj	2014-01-09 19:24:20.000000000 +0100
> +++ gcc/target-globals.h	2014-01-09 19:39:43.879348712 +0100
> @@ -41,17 +41,17 @@ extern struct target_lower_subreg *this_
>  
>  struct GTY(()) target_globals {
>    struct target_flag_state *GTY((skip)) flag_state;
> -  struct target_regs *GTY((skip)) regs;
> +  void *GTY((atomic)) regs;

I'm not entirely fond of this either, for the obvious reason.  Clearly a
deficiency in gengtype, but after 2 hours of poking around I can see that
it isn't a quick fix.

I guess I'm ok with the patch, since the use of the target_globals structure
is so restricted.


r~

^ permalink raw reply	[flat|nested] 39+ messages in thread

* RE: [PATCH] Fix PR58115
  2014-01-09  9:36                       ` Bernd Edlinger
  2014-01-09  9:43                         ` Jakub Jelinek
@ 2014-01-11 17:19                         ` Peter Bergner
  1 sibling, 0 replies; 39+ messages in thread
From: Peter Bergner @ 2014-01-11 17:19 UTC (permalink / raw)
  To: Bernd Edlinger; +Cc: Jakub Jelinek, Richard Biener, gcc-patches, rdsandiford

On Thu, 2014-01-09 at 10:36 +0100, Bernd Edlinger wrote:
> On Thu, 9 Jan 2014 10:28:43, Jakub Jelinek wrote:
> > Yeah, if i386 is changed into SWITCHABLE_TARGET, then I'd strongly encourage
> > rs6000 and nios folks to follow the suit.
> 
> Ok for me. Hope they read this thread...

The rs600 patch was submitted here:

  http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00675.html

Peter


^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
  2014-01-10 17:37                   ` Richard Henderson
@ 2014-01-12 13:23                     ` Richard Biener
  2014-01-12 21:52                       ` Trevor Saunders
  0 siblings, 1 reply; 39+ messages in thread
From: Richard Biener @ 2014-01-12 13:23 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Jakub Jelinek, Richard Sandiford, GCC Patches

On Fri, Jan 10, 2014 at 6:37 PM, Richard Henderson <rth@redhat.com> wrote:
> On 01/09/2014 03:34 PM, Jakub Jelinek wrote:
>> 2014-01-09  Jakub Jelinek  <jakub@redhat.com>
>>
>>       * target-globals.c (save_target_globals): Allocate < 4KB structs using
>>       GC in payload of target_globals struct instead of allocating them on
>>       the heap and the larger structs separately using GC.
>>       * target-globals.h (struct target_globals): Make regs, hard_regs,
>>       reload, expmed, ira, ira_int and lra_fields GTY((atomic)) instead
>>       of GTY((skip)) and change type to void *.
>>       (reset_target_globals): Cast loads from those fields to corresponding
>>       types.
>>
>> --- gcc/target-globals.h.jj   2014-01-09 19:24:20.000000000 +0100
>> +++ gcc/target-globals.h      2014-01-09 19:39:43.879348712 +0100
>> @@ -41,17 +41,17 @@ extern struct target_lower_subreg *this_
>>
>>  struct GTY(()) target_globals {
>>    struct target_flag_state *GTY((skip)) flag_state;
>> -  struct target_regs *GTY((skip)) regs;
>> +  void *GTY((atomic)) regs;
>
> I'm not entirely fond of this either, for the obvious reason.  Clearly a
> deficiency in gengtype, but after 2 hours of poking around I can see that
> it isn't a quick fix.
>
> I guess I'm ok with the patch, since the use of the target_globals structure
> is so restricted.

Yeah.  At some time we need a way to specify a finalization hook called
if an object is collected and eventually a hook that walks extra roots
indirectly
reachable via an object (so you can have GC -> heap -> GC memory layouts
more easily).

Richard.

>
> r~
>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
  2014-01-12 13:23                     ` Richard Biener
@ 2014-01-12 21:52                       ` Trevor Saunders
  2014-01-13  9:32                         ` Richard Biener
  0 siblings, 1 reply; 39+ messages in thread
From: Trevor Saunders @ 2014-01-12 21:52 UTC (permalink / raw)
  To: gcc-patches

On Sun, Jan 12, 2014 at 02:23:21PM +0100, Richard Biener wrote:
> On Fri, Jan 10, 2014 at 6:37 PM, Richard Henderson <rth@redhat.com> wrote:
> > On 01/09/2014 03:34 PM, Jakub Jelinek wrote:
> >> 2014-01-09  Jakub Jelinek  <jakub@redhat.com>
> >>
> >>       * target-globals.c (save_target_globals): Allocate < 4KB structs using
> >>       GC in payload of target_globals struct instead of allocating them on
> >>       the heap and the larger structs separately using GC.
> >>       * target-globals.h (struct target_globals): Make regs, hard_regs,
> >>       reload, expmed, ira, ira_int and lra_fields GTY((atomic)) instead
> >>       of GTY((skip)) and change type to void *.
> >>       (reset_target_globals): Cast loads from those fields to corresponding
> >>       types.
> >>
> >> --- gcc/target-globals.h.jj   2014-01-09 19:24:20.000000000 +0100
> >> +++ gcc/target-globals.h      2014-01-09 19:39:43.879348712 +0100
> >> @@ -41,17 +41,17 @@ extern struct target_lower_subreg *this_
> >>
> >>  struct GTY(()) target_globals {
> >>    struct target_flag_state *GTY((skip)) flag_state;
> >> -  struct target_regs *GTY((skip)) regs;
> >> +  void *GTY((atomic)) regs;
> >
> > I'm not entirely fond of this either, for the obvious reason.  Clearly a
> > deficiency in gengtype, but after 2 hours of poking around I can see that
> > it isn't a quick fix.
> >
> > I guess I'm ok with the patch, since the use of the target_globals structure
> > is so restricted.
> 
> Yeah.  At some time we need a way to specify a finalization hook called
> if an object is collected and eventually a hook that walks extra roots
> indirectly
> reachable via an object (so you can have GC -> heap -> GC memory layouts
> more easily).

I actually tried to add finalizers a couple weeks ago, but it seems
pretty non trivial.  ggc seems to basically just allocate by searching
for the first unmarked block. It doesn't even sweep unmarked stuff, it
just marks and then waits for the space to be allocated over.  I believe
it deals with size by using different pages for each size class? So even
if it did sweep it would be somewhat tricky to know what finalizer to
call. Perhaps a solution is to have separate pages for each type that
needs a finalizer, and be able to mark things as being in one of three
states (in use, needs finalization but not in use, finalized and not in
use).  That might hurt memory consumption in the short term, but I think
finalizers will be really useful in getting stuff out of gc memory so
that's probably not too bad.

Trev

> 
> Richard.
> 
> >
> > r~
> >

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs
  2014-01-12 21:52                       ` Trevor Saunders
@ 2014-01-13  9:32                         ` Richard Biener
  0 siblings, 0 replies; 39+ messages in thread
From: Richard Biener @ 2014-01-13  9:32 UTC (permalink / raw)
  To: Trevor Saunders; +Cc: GCC Patches

On Sun, Jan 12, 2014 at 10:51 PM, Trevor Saunders <tsaunders@mozilla.com> wrote:
> On Sun, Jan 12, 2014 at 02:23:21PM +0100, Richard Biener wrote:
>> On Fri, Jan 10, 2014 at 6:37 PM, Richard Henderson <rth@redhat.com> wrote:
>> > On 01/09/2014 03:34 PM, Jakub Jelinek wrote:
>> >> 2014-01-09  Jakub Jelinek  <jakub@redhat.com>
>> >>
>> >>       * target-globals.c (save_target_globals): Allocate < 4KB structs using
>> >>       GC in payload of target_globals struct instead of allocating them on
>> >>       the heap and the larger structs separately using GC.
>> >>       * target-globals.h (struct target_globals): Make regs, hard_regs,
>> >>       reload, expmed, ira, ira_int and lra_fields GTY((atomic)) instead
>> >>       of GTY((skip)) and change type to void *.
>> >>       (reset_target_globals): Cast loads from those fields to corresponding
>> >>       types.
>> >>
>> >> --- gcc/target-globals.h.jj   2014-01-09 19:24:20.000000000 +0100
>> >> +++ gcc/target-globals.h      2014-01-09 19:39:43.879348712 +0100
>> >> @@ -41,17 +41,17 @@ extern struct target_lower_subreg *this_
>> >>
>> >>  struct GTY(()) target_globals {
>> >>    struct target_flag_state *GTY((skip)) flag_state;
>> >> -  struct target_regs *GTY((skip)) regs;
>> >> +  void *GTY((atomic)) regs;
>> >
>> > I'm not entirely fond of this either, for the obvious reason.  Clearly a
>> > deficiency in gengtype, but after 2 hours of poking around I can see that
>> > it isn't a quick fix.
>> >
>> > I guess I'm ok with the patch, since the use of the target_globals structure
>> > is so restricted.
>>
>> Yeah.  At some time we need a way to specify a finalization hook called
>> if an object is collected and eventually a hook that walks extra roots
>> indirectly
>> reachable via an object (so you can have GC -> heap -> GC memory layouts
>> more easily).
>
> I actually tried to add finalizers a couple weeks ago, but it seems
> pretty non trivial.  ggc seems to basically just allocate by searching
> for the first unmarked block. It doesn't even sweep unmarked stuff, it
> just marks and then waits for the space to be allocated over.  I believe
> it deals with size by using different pages for each size class? So even
> if it did sweep it would be somewhat tricky to know what finalizer to
> call. Perhaps a solution is to have separate pages for each type that
> needs a finalizer, and be able to mark things as being in one of three
> states (in use, needs finalization but not in use, finalized and not in
> use).  That might hurt memory consumption in the short term, but I think
> finalizers will be really useful in getting stuff out of gc memory so
> that's probably not too bad.

I think you would need to have a list of object/finalizer per GC page
and do finalization at sweep_pages () time.

Yes, per-type pools would also work (for types with finalizers).

Or rework how the GC works - surely advanced techs like incremental
or copying collection might benefit GCC.

Richard.

> Trev
>
>>
>> Richard.
>>
>> >
>> > r~
>> >

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2014-01-13  9:32 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-03 10:26 [PATCH] Fix PR58115 Bernd Edlinger
2013-11-19 12:11 ` [PING] " Bernd Edlinger
2013-11-19 14:59 ` [PATCH] " H.J. Lu
2013-11-19 15:02   ` Bernd Edlinger
2013-12-02 15:08 ` Richard Biener
2014-01-06 10:27 ` Richard Sandiford
2014-01-06 11:21   ` Jakub Jelinek
2014-01-06 12:16     ` Richard Biener
2014-01-06 17:00       ` Bernd Edlinger
2014-01-06 19:17         ` Richard Sandiford
2014-01-06 21:43           ` Bernd Edlinger
2014-01-07 12:12             ` Richard Sandiford
2014-01-07 14:10               ` Richard Biener
2014-01-09  0:31                 ` Bernd Edlinger
2014-01-09  9:02                   ` Richard Sandiford
2014-01-09  9:28                     ` Jakub Jelinek
2014-01-09  9:36                       ` Bernd Edlinger
2014-01-09  9:43                         ` Jakub Jelinek
2014-01-11 17:19                         ` Peter Bergner
2014-01-09 10:24                     ` Jakub Jelinek
2014-01-09 11:11                       ` Richard Biener
2014-01-09 11:16                         ` Jakub Jelinek
2014-01-07 20:11   ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115) Jakub Jelinek
2014-01-07 20:52     ` Richard Sandiford
2014-01-08 11:33     ` Richard Biener
2014-01-08 12:45       ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2) Jakub Jelinek
2014-01-08 14:29         ` Richard Biener
2014-01-08 18:34         ` [PATCH] Allocate all target globals using GC for SWITCHABLE_TARGETs Jakub Jelinek
2014-01-08 19:41           ` Richard Sandiford
2014-01-08 19:54             ` Jakub Jelinek
2014-01-09 16:25           ` Richard Henderson
2014-01-09 16:35             ` Jakub Jelinek
2014-01-09 16:42               ` Richard Henderson
2014-01-09 23:35                 ` Jakub Jelinek
2014-01-10 17:37                   ` Richard Henderson
2014-01-12 13:23                     ` Richard Biener
2014-01-12 21:52                       ` Trevor Saunders
2014-01-13  9:32                         ` Richard Biener
2014-01-09 16:16         ` [PATCH] Change i?86/x86_64 into SWITCHABLE_TARGET (PR58115, take 2) Richard Henderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).