public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* -fstrict-aliasing fixes 6/6: permit inlining of comdats
@ 2015-12-04  7:37 Jan Hubicka
  2015-12-04 11:20 ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Hubicka @ 2015-12-04  7:37 UTC (permalink / raw)
  To: gcc-patches, rguenther

Hi,
this is the last patch of the series.  It makes operand_equal_p to compare
alias sets even in !flag_strict_aliasing before inlining so inlining 
!flag_strict_aliasing to flag_strict_aliasing is possible when callee is
merged comdat.  I tried to explain it in greater detail in the comment
in ipa-inline-tranform.

While working on the code I noticed that I managed to overload merged with
two meanings. One is that the function had bodies defined in multiple units
(and thus its inlining should not be considered cross-modulo) and other is
that it used to be comdat.  This is usually the same, but not always - one
can manually define weak functions where the bypass for OPTIMIZAITON_NODE
checks can not apply.

Since the first only affects heuristics and I do not think I need to care
about weaks much, I dropped it and renamed the flag to merged_comdat to make
it more obvious what it means.

Bootstrapped/regtested x86_64-linux, OK?

I will work on some testcases for the ICF and fold-const that would lead
to wrong code if alias sets was ignored early.

Honza
	* fold-const.c (operand_equal_p): Before inlining do not permit
	transformations that would break with strict aliasing.
	* ipa-inline.c (can_inline_edge_p) Use merged_comdat.
	* ipa-inline-transform.c (inline_call): When inlining merged comdat do
	not drop strict_aliasing flag of caller.
	* cgraphclones.c (cgraph_node::create_clone): Use merged_comdat.
	* cgraph.c (cgraph_node::dump): Dump merged_comdat.
	* ipa-icf.c (sem_function::merge): Drop merged_comdat when merging
	comdat and non-comdat.
	* cgraph.h (cgraph_node): Rename merged to merged_comdat.
	* ipa-inline-analysis.c (simple_edge_hints): Check both merged_comdat
	and icf_merged.

	* lto-symtab.c (lto_cgraph_replace_node): Update code computing
	merged_comdat.
Index: fold-const.c
===================================================================
--- fold-const.c	(revision 231239)
+++ fold-const.c	(working copy)
@@ -2987,7 +2987,7 @@ operand_equal_p (const_tree arg0, const_
 					   flags)))
 		return 0;
 	      /* Verify that accesses are TBAA compatible.  */
-	      if (flag_strict_aliasing
+	      if ((flag_strict_aliasing || !cfun->after_inlining)
 		  && (!alias_ptr_types_compatible_p
 		        (TREE_TYPE (TREE_OPERAND (arg0, 1)),
 		         TREE_TYPE (TREE_OPERAND (arg1, 1)))
Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 231239)
+++ ipa-inline.c	(working copy)
@@ -466,7 +466,7 @@ can_inline_edge_p (struct cgraph_edge *e
          optimized with the optimization flags of module they are used in.
 	 Also do not care about mixing up size/speed optimization when
 	 DECL_DISREGARD_INLINE_LIMITS is set.  */
-      else if ((callee->merged
+      else if ((callee->merged_comdat
 	        && !lookup_attribute ("optimize",
 				      DECL_ATTRIBUTES (caller->decl)))
 	       || DECL_DISREGARD_INLINE_LIMITS (callee->decl))
Index: ipa-inline-transform.c
===================================================================
--- ipa-inline-transform.c	(revision 231239)
+++ ipa-inline-transform.c	(working copy)
@@ -322,11 +322,26 @@ inline_call (struct cgraph_edge *e, bool
   if (DECL_FUNCTION_PERSONALITY (callee->decl))
     DECL_FUNCTION_PERSONALITY (to->decl)
       = DECL_FUNCTION_PERSONALITY (callee->decl);
+
+  /* merged_comdat indicate that function was originally COMDAT and merged
+     from multiple units.  Because every unit using COMDAT must also define it,
+     we know that the function is safe to build with each of the optimization
+     flags used used to compile them.
+
+     If one unit is compiled with -fstrict-aliasing and
+     other with -fno-strict-aliasing we may bypass dropping the
+     flag_strict_aliasing because we know it would be valid to inline
+     -fstrict-aliaisng variant of the calee, too.  Unless optimization
+     attribute was used, the caller and COMDAT callee must have been
+     compiled with the same flags.  */
   if (!opt_for_fn (callee->decl, flag_strict_aliasing)
-      && opt_for_fn (to->decl, flag_strict_aliasing))
+      && opt_for_fn (to->decl, flag_strict_aliasing)
+      && (!callee->merged_comdat
+	  || lookup_attribute ("optimization",
+			       DECL_ATTRIBUTES (e->caller->decl))
+	  || lookup_attribute ("optimization", DECL_ATTRIBUTES (callee->decl))))
     {
       struct gcc_options opts = global_options;
-
       cl_optimization_restore (&opts,
 	 TREE_OPTIMIZATION (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl)));
       opts.x_flag_strict_aliasing = false;
Index: cgraphclones.c
===================================================================
--- cgraphclones.c	(revision 231239)
+++ cgraphclones.c	(working copy)
@@ -433,7 +433,7 @@ cgraph_node::create_clone (tree new_decl
   new_node->tp_first_run = tp_first_run;
   new_node->tm_clone = tm_clone;
   new_node->icf_merged = icf_merged;
-  new_node->merged = merged;
+  new_node->merged_comdat = merged_comdat;
 
   new_node->clone.tree_map = NULL;
   new_node->clone.args_to_skip = args_to_skip;
Index: lto/lto-symtab.c
===================================================================
--- lto/lto-symtab.c	(revision 231239)
+++ lto/lto-symtab.c	(working copy)
@@ -63,8 +63,9 @@ lto_cgraph_replace_node (struct cgraph_n
       gcc_assert (!prevailing_node->global.inlined_to);
       prevailing_node->mark_address_taken ();
     }
-  if (node->definition && prevailing_node->definition)
-    prevailing_node->merged = true;
+  if (node->definition && prevailing_node->definition
+      && DECL_COMDAT (node->decl) && DECL_COMDAT (prevailing_node->decl))
+    prevailing_node->merged_comdat = true;
 
   /* Redirect all incoming edges.  */
   compatible_p
Index: cgraph.c
===================================================================
--- cgraph.c	(revision 231239)
+++ cgraph.c	(working copy)
@@ -1994,6 +1994,8 @@ cgraph_node::dump (FILE *f)
     fprintf (f, " tm_clone");
   if (icf_merged)
     fprintf (f, " icf_merged");
+  if (merged_comdat)
+    fprintf (f, " merged_comdat");
   if (nonfreeing_fn)
     fprintf (f, " nonfreeing_fn");
   if (DECL_STATIC_CONSTRUCTOR (decl))
Index: ipa-icf.c
===================================================================
--- ipa-icf.c	(revision 231239)
+++ ipa-icf.c	(working copy)
@@ -1352,10 +1352,15 @@ sem_function::merge (sem_item *alias_ite
   gcc_assert (alias->icf_merged || remove || redirect_callers);
   original->icf_merged = true;
 
-  /* Inform the inliner about cross-module merging.  */
-  if ((original->lto_file_data || alias->lto_file_data)
-      && original->lto_file_data != alias->lto_file_data)
-    local_original->merged = original->merged = true;
+  /* We use merged flag to track cases where COMDAT function is known to be
+     compatible its callers.  If we merged in non-COMDAT, we need to give up
+     on this optimization.  */
+  if (original->merged_comdat && !alias->merged_comdat)
+    {
+      if (dump_file)
+	fprintf (dump_file, "Dropping merged_comdat flag.\n\n");
+      local_original->merged_comdat = original->merged_comdat = false;
+    }
 
   if (remove)
     {
Index: cgraph.h
===================================================================
--- cgraph.h	(revision 231239)
+++ cgraph.h	(working copy)
@@ -1333,7 +1333,7 @@ public:
      accesses trapping.  */
   unsigned nonfreeing_fn : 1;
   /* True if there was multiple COMDAT bodies merged by lto-symtab.  */
-  unsigned merged : 1;
+  unsigned merged_comdat : 1;
   /* True if function was created to be executed in parallel.  */
   unsigned parallelized_function : 1;
   /* True if function is part split out by ipa-split.  */
Index: ipa-inline-analysis.c
===================================================================
--- ipa-inline-analysis.c	(revision 231239)
+++ ipa-inline-analysis.c	(working copy)
@@ -3708,7 +3708,7 @@ simple_edge_hints (struct cgraph_edge *e
 
   if (callee->lto_file_data && edge->caller->lto_file_data
       && edge->caller->lto_file_data != callee->lto_file_data
-      && !callee->merged)
+      && !callee->merged_comdat && !callee->icf_merged)
     hints |= INLINE_HINT_cross_module;
 
   return hints;

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: -fstrict-aliasing fixes 6/6: permit inlining of comdats
  2015-12-04  7:37 -fstrict-aliasing fixes 6/6: permit inlining of comdats Jan Hubicka
@ 2015-12-04 11:20 ` Richard Biener
  2015-12-04 16:57   ` Jan Hubicka
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2015-12-04 11:20 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On Fri, 4 Dec 2015, Jan Hubicka wrote:

> Hi,
> this is the last patch of the series.  It makes operand_equal_p to compare
> alias sets even in !flag_strict_aliasing before inlining so inlining 
> !flag_strict_aliasing to flag_strict_aliasing is possible when callee is
> merged comdat.  I tried to explain it in greater detail in the comment
> in ipa-inline-tranform.
> 
> While working on the code I noticed that I managed to overload merged with
> two meanings. One is that the function had bodies defined in multiple units
> (and thus its inlining should not be considered cross-modulo) and other is
> that it used to be comdat.  This is usually the same, but not always - one
> can manually define weak functions where the bypass for OPTIMIZAITON_NODE
> checks can not apply.
> 
> Since the first only affects heuristics and I do not think I need to care
> about weaks much, I dropped it and renamed the flag to merged_comdat to make
> it more obvious what it means.

I wonder if you can split out the re-naming at this stage.  Further
comments below.

> Bootstrapped/regtested x86_64-linux, OK?
> 
> I will work on some testcases for the ICF and fold-const that would lead
> to wrong code if alias sets was ignored early.

Would be nice to have a wrong-code testcase go with the commit.

> Honza
> 	* fold-const.c (operand_equal_p): Before inlining do not permit
> 	transformations that would break with strict aliasing.
> 	* ipa-inline.c (can_inline_edge_p) Use merged_comdat.
> 	* ipa-inline-transform.c (inline_call): When inlining merged comdat do
> 	not drop strict_aliasing flag of caller.
> 	* cgraphclones.c (cgraph_node::create_clone): Use merged_comdat.
> 	* cgraph.c (cgraph_node::dump): Dump merged_comdat.
> 	* ipa-icf.c (sem_function::merge): Drop merged_comdat when merging
> 	comdat and non-comdat.
> 	* cgraph.h (cgraph_node): Rename merged to merged_comdat.
> 	* ipa-inline-analysis.c (simple_edge_hints): Check both merged_comdat
> 	and icf_merged.
> 
> 	* lto-symtab.c (lto_cgraph_replace_node): Update code computing
> 	merged_comdat.
> Index: fold-const.c
> ===================================================================
> --- fold-const.c	(revision 231239)
> +++ fold-const.c	(working copy)
> @@ -2987,7 +2987,7 @@ operand_equal_p (const_tree arg0, const_
>  					   flags)))
>  		return 0;
>  	      /* Verify that accesses are TBAA compatible.  */
> -	      if (flag_strict_aliasing
> +	      if ((flag_strict_aliasing || !cfun->after_inlining)
>  		  && (!alias_ptr_types_compatible_p
>  		        (TREE_TYPE (TREE_OPERAND (arg0, 1)),
>  		         TREE_TYPE (TREE_OPERAND (arg1, 1)))

Sooo....  first of all the code is broken anyway as it guards
the restrict checking (MR_DEPENDENCE_*) stuff with flag_strict_aliasing
(ick).  Second, I wouldn't mind if we drop the flag_strict_aliasing
check alltogether, a cfun->after_inlining checks makes me just too
nervous.

> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c	(revision 231239)
> +++ ipa-inline.c	(working copy)
> @@ -466,7 +466,7 @@ can_inline_edge_p (struct cgraph_edge *e
>           optimized with the optimization flags of module they are used in.
>  	 Also do not care about mixing up size/speed optimization when
>  	 DECL_DISREGARD_INLINE_LIMITS is set.  */
> -      else if ((callee->merged
> +      else if ((callee->merged_comdat
>  	        && !lookup_attribute ("optimize",
>  				      DECL_ATTRIBUTES (caller->decl)))
>  	       || DECL_DISREGARD_INLINE_LIMITS (callee->decl))
> Index: ipa-inline-transform.c
> ===================================================================
> --- ipa-inline-transform.c	(revision 231239)
> +++ ipa-inline-transform.c	(working copy)
> @@ -322,11 +322,26 @@ inline_call (struct cgraph_edge *e, bool
>    if (DECL_FUNCTION_PERSONALITY (callee->decl))
>      DECL_FUNCTION_PERSONALITY (to->decl)
>        = DECL_FUNCTION_PERSONALITY (callee->decl);
> +
> +  /* merged_comdat indicate that function was originally COMDAT and merged
> +     from multiple units.  Because every unit using COMDAT must also define it,
> +     we know that the function is safe to build with each of the optimization
> +     flags used used to compile them.
> +
> +     If one unit is compiled with -fstrict-aliasing and
> +     other with -fno-strict-aliasing we may bypass dropping the
> +     flag_strict_aliasing because we know it would be valid to inline
> +     -fstrict-aliaisng variant of the calee, too.  Unless optimization
> +     attribute was used, the caller and COMDAT callee must have been
> +     compiled with the same flags.  */

So your logic relies on the fact that the -fno-strict-aliasing was
not necessary on copy A if copy B was compiled without that flag
because otherwise copy B would invoke undefined behavior?

This menans it's a language semantics thing but you simply look at
whether it's "comdat"?  Shouldn't this use some ODR thing instead?

Also as undefined behavior only applies at runtime consider copy A
(with -fno-strict-aliasing) is used in contexts where undefined
behavior would occur while copy B not.  Say,

int foo (int *p, short *q)
{
  *p = 1;
  return *q;
}

and the copy A use is foo (&x, &x) while the copy B use foo (&x, &y).

Yes, the case is lame here as we'd miscompile this in copy B and
comdat makes us eventually use that copy for A.  But if we don't
manage to miscompile this without inlining there isn't any undefined
behavior (at runtime) you can rely on.

Just want to know whether you thought about the above cases, I would
declare them invalid but I am not sure the C++ standard agrees here.
Do other FEs end up emitting comdats?  Can the user produce them for C?

>    if (!opt_for_fn (callee->decl, flag_strict_aliasing)
> -      && opt_for_fn (to->decl, flag_strict_aliasing))
> +      && opt_for_fn (to->decl, flag_strict_aliasing)
> +      && (!callee->merged_comdat
> +	  || lookup_attribute ("optimization",
> +			       DECL_ATTRIBUTES (e->caller->decl))
> +	  || lookup_attribute ("optimization", DECL_ATTRIBUTES (callee->decl))))
>      {
>        struct gcc_options opts = global_options;
> -
>        cl_optimization_restore (&opts,
>  	 TREE_OPTIMIZATION (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (to->decl)));
>        opts.x_flag_strict_aliasing = false;
> Index: cgraphclones.c
> ===================================================================
> --- cgraphclones.c	(revision 231239)
> +++ cgraphclones.c	(working copy)
> @@ -433,7 +433,7 @@ cgraph_node::create_clone (tree new_decl
>    new_node->tp_first_run = tp_first_run;
>    new_node->tm_clone = tm_clone;
>    new_node->icf_merged = icf_merged;
> -  new_node->merged = merged;
> +  new_node->merged_comdat = merged_comdat;
>  
>    new_node->clone.tree_map = NULL;
>    new_node->clone.args_to_skip = args_to_skip;
> Index: lto/lto-symtab.c
> ===================================================================
> --- lto/lto-symtab.c	(revision 231239)
> +++ lto/lto-symtab.c	(working copy)
> @@ -63,8 +63,9 @@ lto_cgraph_replace_node (struct cgraph_n
>        gcc_assert (!prevailing_node->global.inlined_to);
>        prevailing_node->mark_address_taken ();
>      }
> -  if (node->definition && prevailing_node->definition)
> -    prevailing_node->merged = true;
> +  if (node->definition && prevailing_node->definition
> +      && DECL_COMDAT (node->decl) && DECL_COMDAT (prevailing_node->decl))
> +    prevailing_node->merged_comdat = true;
>  
>    /* Redirect all incoming edges.  */
>    compatible_p
> Index: cgraph.c
> ===================================================================
> --- cgraph.c	(revision 231239)
> +++ cgraph.c	(working copy)
> @@ -1994,6 +1994,8 @@ cgraph_node::dump (FILE *f)
>      fprintf (f, " tm_clone");
>    if (icf_merged)
>      fprintf (f, " icf_merged");
> +  if (merged_comdat)
> +    fprintf (f, " merged_comdat");
>    if (nonfreeing_fn)
>      fprintf (f, " nonfreeing_fn");
>    if (DECL_STATIC_CONSTRUCTOR (decl))
> Index: ipa-icf.c
> ===================================================================
> --- ipa-icf.c	(revision 231239)
> +++ ipa-icf.c	(working copy)
> @@ -1352,10 +1352,15 @@ sem_function::merge (sem_item *alias_ite
>    gcc_assert (alias->icf_merged || remove || redirect_callers);
>    original->icf_merged = true;
>  
> -  /* Inform the inliner about cross-module merging.  */
> -  if ((original->lto_file_data || alias->lto_file_data)
> -      && original->lto_file_data != alias->lto_file_data)
> -    local_original->merged = original->merged = true;
> +  /* We use merged flag to track cases where COMDAT function is known to be
> +     compatible its callers.  If we merged in non-COMDAT, we need to give up
> +     on this optimization.  */
> +  if (original->merged_comdat && !alias->merged_comdat)
> +    {
> +      if (dump_file)
> +	fprintf (dump_file, "Dropping merged_comdat flag.\n\n");
> +      local_original->merged_comdat = original->merged_comdat = false;
> +    }
>  
>    if (remove)
>      {
> Index: cgraph.h
> ===================================================================
> --- cgraph.h	(revision 231239)
> +++ cgraph.h	(working copy)
> @@ -1333,7 +1333,7 @@ public:
>       accesses trapping.  */
>    unsigned nonfreeing_fn : 1;
>    /* True if there was multiple COMDAT bodies merged by lto-symtab.  */
> -  unsigned merged : 1;
> +  unsigned merged_comdat : 1;
>    /* True if function was created to be executed in parallel.  */
>    unsigned parallelized_function : 1;
>    /* True if function is part split out by ipa-split.  */
> Index: ipa-inline-analysis.c
> ===================================================================
> --- ipa-inline-analysis.c	(revision 231239)
> +++ ipa-inline-analysis.c	(working copy)
> @@ -3708,7 +3708,7 @@ simple_edge_hints (struct cgraph_edge *e
>  
>    if (callee->lto_file_data && edge->caller->lto_file_data
>        && edge->caller->lto_file_data != callee->lto_file_data
> -      && !callee->merged)
> +      && !callee->merged_comdat && !callee->icf_merged)
>      hints |= INLINE_HINT_cross_module;
>  
>    return hints;
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: -fstrict-aliasing fixes 6/6: permit inlining of comdats
  2015-12-04 11:20 ` Richard Biener
@ 2015-12-04 16:57   ` Jan Hubicka
  2015-12-04 18:04     ` Jan Hubicka
  2015-12-07  8:16     ` Richard Biener
  0 siblings, 2 replies; 10+ messages in thread
From: Jan Hubicka @ 2015-12-04 16:57 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc-patches

> 
> I wonder if you can split out the re-naming at this stage.  Further
> comments below.

OK, I will commit the renaming and ipa-icf fix separately.
> 
> > Bootstrapped/regtested x86_64-linux, OK?
> > 
> > I will work on some testcases for the ICF and fold-const that would lead
> > to wrong code if alias sets was ignored early.
> 
> Would be nice to have a wrong-code testcase go with the commit.
> 
> > Honza
> > 	* fold-const.c (operand_equal_p): Before inlining do not permit
> > 	transformations that would break with strict aliasing.
> > 	* ipa-inline.c (can_inline_edge_p) Use merged_comdat.
> > 	* ipa-inline-transform.c (inline_call): When inlining merged comdat do
> > 	not drop strict_aliasing flag of caller.
> > 	* cgraphclones.c (cgraph_node::create_clone): Use merged_comdat.
> > 	* cgraph.c (cgraph_node::dump): Dump merged_comdat.
> > 	* ipa-icf.c (sem_function::merge): Drop merged_comdat when merging
> > 	comdat and non-comdat.
> > 	* cgraph.h (cgraph_node): Rename merged to merged_comdat.
> > 	* ipa-inline-analysis.c (simple_edge_hints): Check both merged_comdat
> > 	and icf_merged.
> > 
> > 	* lto-symtab.c (lto_cgraph_replace_node): Update code computing
> > 	merged_comdat.
> > Index: fold-const.c
> > ===================================================================
> > --- fold-const.c	(revision 231239)
> > +++ fold-const.c	(working copy)
> > @@ -2987,7 +2987,7 @@ operand_equal_p (const_tree arg0, const_
> >  					   flags)))
> >  		return 0;
> >  	      /* Verify that accesses are TBAA compatible.  */
> > -	      if (flag_strict_aliasing
> > +	      if ((flag_strict_aliasing || !cfun->after_inlining)
> >  		  && (!alias_ptr_types_compatible_p
> >  		        (TREE_TYPE (TREE_OPERAND (arg0, 1)),
> >  		         TREE_TYPE (TREE_OPERAND (arg1, 1)))
> 
> Sooo....  first of all the code is broken anyway as it guards
> the restrict checking (MR_DEPENDENCE_*) stuff with flag_strict_aliasing
> (ick).  Second, I wouldn't mind if we drop the flag_strict_aliasing
> check alltogether, a cfun->after_inlining checks makes me just too
> nervous.

OK, I will drop the check separately, too.  
Next stage1 we need to look into code merging across alias classes. ipa-icf
scores are currently 40% down compared to GCC 5 at Firefox.
> 
> So your logic relies on the fact that the -fno-strict-aliasing was
> not necessary on copy A if copy B was compiled without that flag
> because otherwise copy B would invoke undefined behavior?

Yes.
> 
> This menans it's a language semantics thing but you simply look at
> whether it's "comdat"?  Shouldn't this use some ODR thing instead?

It is definition of COMDAT. COMDAT functions are output in every unit
used and no matter what body wins the linking is correct.  Only C++
produce comdat functions, so they all comply ODR rule, so we could rely
on the fact that all function bodies should be equivalent on a source
level.
> 
> Also as undefined behavior only applies at runtime consider copy A
> (with -fno-strict-aliasing) is used in contexts where undefined
> behavior would occur while copy B not.  Say,
> 
> int foo (int *p, short *q)
> {
>   *p = 1;
>   return *q;
> }
> 
> and the copy A use is foo (&x, &x) while the copy B use foo (&x, &y).
> 
> Yes, the case is lame here as we'd miscompile this in copy B and
> comdat makes us eventually use that copy for A.  But if we don't
> manage to miscompile this without inlining there isn't any undefined
> behavior (at runtime) you can rely on.

Well, it is ODR violation in this case :)
> 
> Just want to know whether you thought about the above cases, I would
> declare them invalid but I am not sure the C++ standard agrees here.

Well, not exactly of the case mentioned above, but still think that this is
safe (ugly, too). An alternative is to keep around the bodies until after
inlining.  I have infrastructure for that in my tree, but it is hard to tune to
do: first the alternative function body may have different symbol references
(as a result of different early inlinin) which may not be resolved in current
binary so we can not use it at all. Second keepin many alternatives of every
body around makes code size estimates in inliner go crazy.

Honza

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: -fstrict-aliasing fixes 6/6: permit inlining of comdats
  2015-12-04 16:57   ` Jan Hubicka
@ 2015-12-04 18:04     ` Jan Hubicka
  2015-12-04 18:08       ` Jan Hubicka
  2015-12-04 23:39       ` H.J. Lu
  2015-12-07  8:16     ` Richard Biener
  1 sibling, 2 replies; 10+ messages in thread
From: Jan Hubicka @ 2015-12-04 18:04 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, gcc-patches

Hi,
this is the patch implementing renaming and fixing logic in ipa-icf/lto-symtab
WRT compuattion of this flag.  I don't seem to be able to construct testcase for
this: the merged flag is currently used in inliner only to decide whether to
ingore optimize_size/optimize levels which should not lead to wrong code and
I can't grep LTO dumps from testsuite.

Bootstrapped/regtested x86_64-linux, comitted
Honza

	* ipa-inline.c (can_inline_edge_p) Use merged_comdat.
	* cgraphclones.c (cgraph_node::create_clone): Use merged_comdat.
	* cgraph.c (cgraph_node::dump): Dump merged_comdat.
	* ipa-icf.c (sem_function::merge): Drop merged_comdat when merging
	comdat and non-comdat.
	* cgraph.h (cgraph_node): Rename merged to merged_comdat.
	* ipa-inline-analysis.c (simple_edge_hints): Check both merged_comdat
	and icf_merged.

	* lto-symtab.c (lto_cgraph_replace_node): Update code computing
	merged_comdat.
Index: cgraphclones.c
===================================================================
--- cgraphclones.c	(revision 231239)
+++ cgraphclones.c	(working copy)
@@ -433,7 +433,7 @@ cgraph_node::create_clone (tree new_decl
   new_node->tp_first_run = tp_first_run;
   new_node->tm_clone = tm_clone;
   new_node->icf_merged = icf_merged;
-  new_node->merged = merged;
+  new_node->merged_comdat = merged_comdat;
 
   new_node->clone.tree_map = NULL;
   new_node->clone.args_to_skip = args_to_skip;
Index: ipa-icf.c
===================================================================
--- ipa-icf.c	(revision 231239)
+++ ipa-icf.c	(working copy)
@@ -1352,10 +1352,15 @@ sem_function::merge (sem_item *alias_ite
   gcc_assert (alias->icf_merged || remove || redirect_callers);
   original->icf_merged = true;
 
-  /* Inform the inliner about cross-module merging.  */
-  if ((original->lto_file_data || alias->lto_file_data)
-      && original->lto_file_data != alias->lto_file_data)
-    local_original->merged = original->merged = true;
+  /* We use merged flag to track cases where COMDAT function is known to be
+     compatible its callers.  If we merged in non-COMDAT, we need to give up
+     on this optimization.  */
+  if (original->merged_comdat && !alias->merged_comdat)
+    {
+      if (dump_file)
+	fprintf (dump_file, "Dropping merged_comdat flag.\n\n");
+      local_original->merged_comdat = original->merged_comdat = false;
+    }
 
   if (remove)
     {
Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 231239)
+++ ipa-inline.c	(working copy)
@@ -466,7 +466,7 @@ can_inline_edge_p (struct cgraph_edge *e
          optimized with the optimization flags of module they are used in.
 	 Also do not care about mixing up size/speed optimization when
 	 DECL_DISREGARD_INLINE_LIMITS is set.  */
-      else if ((callee->merged
+      else if ((callee->merged_comdat
 	        && !lookup_attribute ("optimize",
 				      DECL_ATTRIBUTES (caller->decl)))
 	       || DECL_DISREGARD_INLINE_LIMITS (callee->decl))
Index: ipa-inline-analysis.c
===================================================================
--- ipa-inline-analysis.c	(revision 231239)
+++ ipa-inline-analysis.c	(working copy)
@@ -3708,7 +3708,7 @@ simple_edge_hints (struct cgraph_edge *e
 
   if (callee->lto_file_data && edge->caller->lto_file_data
       && edge->caller->lto_file_data != callee->lto_file_data
-      && !callee->merged)
+      && !callee->merged_comdat && !callee->icf_merged)
     hints |= INLINE_HINT_cross_module;
 
   return hints;
Index: cgraph.c
===================================================================
--- cgraph.c	(revision 231239)
+++ cgraph.c	(working copy)
@@ -1994,6 +1994,8 @@ cgraph_node::dump (FILE *f)
     fprintf (f, " tm_clone");
   if (icf_merged)
     fprintf (f, " icf_merged");
+  if (merged_comdat)
+    fprintf (f, " merged_comdat");
   if (nonfreeing_fn)
     fprintf (f, " nonfreeing_fn");
   if (DECL_STATIC_CONSTRUCTOR (decl))
Index: cgraph.h
===================================================================
--- cgraph.h	(revision 231239)
+++ cgraph.h	(working copy)
@@ -1333,7 +1333,7 @@ public:
      accesses trapping.  */
   unsigned nonfreeing_fn : 1;
   /* True if there was multiple COMDAT bodies merged by lto-symtab.  */
-  unsigned merged : 1;
+  unsigned merged_comdat : 1;
   /* True if function was created to be executed in parallel.  */
   unsigned parallelized_function : 1;
   /* True if function is part split out by ipa-split.  */
Index: lto/lto-symtab.c
===================================================================
--- lto/lto-symtab.c	(revision 231239)
+++ lto/lto-symtab.c	(working copy)
@@ -63,8 +63,9 @@ lto_cgraph_replace_node (struct cgraph_n
       gcc_assert (!prevailing_node->global.inlined_to);
       prevailing_node->mark_address_taken ();
     }
-  if (node->definition && prevailing_node->definition)
-    prevailing_node->merged = true;
+  if (node->definition && prevailing_node->definition
+      && DECL_COMDAT (node->decl) && DECL_COMDAT (prevailing_node->decl))
+    prevailing_node->merged_comdat = true;
 
   /* Redirect all incoming edges.  */
   compatible_p

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: -fstrict-aliasing fixes 6/6: permit inlining of comdats
  2015-12-04 18:04     ` Jan Hubicka
@ 2015-12-04 18:08       ` Jan Hubicka
  2015-12-07  8:20         ` Richard Biener
  2015-12-04 23:39       ` H.J. Lu
  1 sibling, 1 reply; 10+ messages in thread
From: Jan Hubicka @ 2015-12-04 18:08 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, gcc-patches

Hi,
this is the patch for fold-const.c. Can you think of some testcase for the
MR_DEPENDENCE_CLIQUE comparsion? I am not that familiar with the code to
be able to construct it :(

Bootstrapped/regtested x86_64-linux, OK?

Honza

	* fold-const.c (operand_equal_p): Do not use flag_strict_aliasing.
Index: fold-const.c
===================================================================
--- fold-const.c	(revision 231290)
+++ fold-const.c	(working copy)
@@ -2987,14 +2987,13 @@ operand_equal_p (const_tree arg0, const_
 					   flags)))
 		return 0;
 	      /* Verify that accesses are TBAA compatible.  */
-	      if (flag_strict_aliasing
-		  && (!alias_ptr_types_compatible_p
-		        (TREE_TYPE (TREE_OPERAND (arg0, 1)),
-		         TREE_TYPE (TREE_OPERAND (arg1, 1)))
-		      || (MR_DEPENDENCE_CLIQUE (arg0)
-			  != MR_DEPENDENCE_CLIQUE (arg1))
-		      || (MR_DEPENDENCE_BASE (arg0)
-			  != MR_DEPENDENCE_BASE (arg1))))
+	      if (!alias_ptr_types_compatible_p
+		     (TREE_TYPE (TREE_OPERAND (arg0, 1)),
+		      TREE_TYPE (TREE_OPERAND (arg1, 1)))
+		  || (MR_DEPENDENCE_CLIQUE (arg0)
+		      != MR_DEPENDENCE_CLIQUE (arg1))
+		  || (MR_DEPENDENCE_BASE (arg0)
+		      != MR_DEPENDENCE_BASE (arg1)))
 		return 0;
 	     /* Verify that alignment is compatible.  */
 	     if (TYPE_ALIGN (TREE_TYPE (arg0))

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: -fstrict-aliasing fixes 6/6: permit inlining of comdats
  2015-12-04 18:04     ` Jan Hubicka
  2015-12-04 18:08       ` Jan Hubicka
@ 2015-12-04 23:39       ` H.J. Lu
  1 sibling, 0 replies; 10+ messages in thread
From: H.J. Lu @ 2015-12-04 23:39 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Richard Biener, GCC Patches

On Fri, Dec 4, 2015 at 10:04 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> this is the patch implementing renaming and fixing logic in ipa-icf/lto-symtab
> WRT compuattion of this flag.  I don't seem to be able to construct testcase for
> this: the merged flag is currently used in inliner only to decide whether to
> ingore optimize_size/optimize levels which should not lead to wrong code and
> I can't grep LTO dumps from testsuite.
>
> Bootstrapped/regtested x86_64-linux, comitted
> Honza
>
>         * ipa-inline.c (can_inline_edge_p) Use merged_comdat.
>         * cgraphclones.c (cgraph_node::create_clone): Use merged_comdat.
>         * cgraph.c (cgraph_node::dump): Dump merged_comdat.
>         * ipa-icf.c (sem_function::merge): Drop merged_comdat when merging
>         comdat and non-comdat.
>         * cgraph.h (cgraph_node): Rename merged to merged_comdat.
>         * ipa-inline-analysis.c (simple_edge_hints): Check both merged_comdat
>         and icf_merged.
>
>         * lto-symtab.c (lto_cgraph_replace_node): Update code computing
>         merged_comdat.

This may have caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68708


H.J.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: -fstrict-aliasing fixes 6/6: permit inlining of comdats
  2015-12-04 16:57   ` Jan Hubicka
  2015-12-04 18:04     ` Jan Hubicka
@ 2015-12-07  8:16     ` Richard Biener
  2015-12-07 18:52       ` Jan Hubicka
  1 sibling, 1 reply; 10+ messages in thread
From: Richard Biener @ 2015-12-07  8:16 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On Fri, 4 Dec 2015, Jan Hubicka wrote:

> > 
> > I wonder if you can split out the re-naming at this stage.  Further
> > comments below.
> 
> OK, I will commit the renaming and ipa-icf fix separately.
> > 
> > > Bootstrapped/regtested x86_64-linux, OK?
> > > 
> > > I will work on some testcases for the ICF and fold-const that would lead
> > > to wrong code if alias sets was ignored early.
> > 
> > Would be nice to have a wrong-code testcase go with the commit.
> > 
> > > Honza
> > > 	* fold-const.c (operand_equal_p): Before inlining do not permit
> > > 	transformations that would break with strict aliasing.
> > > 	* ipa-inline.c (can_inline_edge_p) Use merged_comdat.
> > > 	* ipa-inline-transform.c (inline_call): When inlining merged comdat do
> > > 	not drop strict_aliasing flag of caller.
> > > 	* cgraphclones.c (cgraph_node::create_clone): Use merged_comdat.
> > > 	* cgraph.c (cgraph_node::dump): Dump merged_comdat.
> > > 	* ipa-icf.c (sem_function::merge): Drop merged_comdat when merging
> > > 	comdat and non-comdat.
> > > 	* cgraph.h (cgraph_node): Rename merged to merged_comdat.
> > > 	* ipa-inline-analysis.c (simple_edge_hints): Check both merged_comdat
> > > 	and icf_merged.
> > > 
> > > 	* lto-symtab.c (lto_cgraph_replace_node): Update code computing
> > > 	merged_comdat.
> > > Index: fold-const.c
> > > ===================================================================
> > > --- fold-const.c	(revision 231239)
> > > +++ fold-const.c	(working copy)
> > > @@ -2987,7 +2987,7 @@ operand_equal_p (const_tree arg0, const_
> > >  					   flags)))
> > >  		return 0;
> > >  	      /* Verify that accesses are TBAA compatible.  */
> > > -	      if (flag_strict_aliasing
> > > +	      if ((flag_strict_aliasing || !cfun->after_inlining)
> > >  		  && (!alias_ptr_types_compatible_p
> > >  		        (TREE_TYPE (TREE_OPERAND (arg0, 1)),
> > >  		         TREE_TYPE (TREE_OPERAND (arg1, 1)))
> > 
> > Sooo....  first of all the code is broken anyway as it guards
> > the restrict checking (MR_DEPENDENCE_*) stuff with flag_strict_aliasing
> > (ick).  Second, I wouldn't mind if we drop the flag_strict_aliasing
> > check alltogether, a cfun->after_inlining checks makes me just too
> > nervous.
> 
> OK, I will drop the check separately, too.  
> Next stage1 we need to look into code merging across alias classes. ipa-icf
> scores are currently 40% down compared to GCC 5 at Firefox.
> > 
> > So your logic relies on the fact that the -fno-strict-aliasing was
> > not necessary on copy A if copy B was compiled without that flag
> > because otherwise copy B would invoke undefined behavior?
> 
> Yes.
> > 
> > This menans it's a language semantics thing but you simply look at
> > whether it's "comdat"?  Shouldn't this use some ODR thing instead?
> 
> It is definition of COMDAT. COMDAT functions are output in every unit
> used and no matter what body wins the linking is correct.  Only C++
> produce comdat functions, so they all comply ODR rule, so we could rely
> on the fact that all function bodies should be equivalent on a source
> level.
> > 
> > Also as undefined behavior only applies at runtime consider copy A
> > (with -fno-strict-aliasing) is used in contexts where undefined
> > behavior would occur while copy B not.  Say,
> > 
> > int foo (int *p, short *q)
> > {
> >   *p = 1;
> >   return *q;
> > }
> > 
> > and the copy A use is foo (&x, &x) while the copy B use foo (&x, &y).
> > 
> > Yes, the case is lame here as we'd miscompile this in copy B and
> > comdat makes us eventually use that copy for A.  But if we don't
> > manage to miscompile this without inlining there isn't any undefined
> > behavior (at runtime) you can rely on.
> 
> Well, it is ODR violation in this case :)
> > 
> > Just want to know whether you thought about the above cases, I would
> > declare them invalid but I am not sure the C++ standard agrees here.
> 
> Well, not exactly of the case mentioned above, but still think that this is
> safe (ugly, too). An alternative is to keep around the bodies until after
> inlining.  I have infrastructure for that in my tree, but it is hard to tune to
> do: first the alternative function body may have different symbol references
> (as a result of different early inlinin) which may not be resolved in current
> binary so we can not use it at all. Second keepin many alternatives of every
> body around makes code size estimates in inliner go crazy.

/me inserts his usual "partitioning to the rescue" comment...

;)

Richard.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: -fstrict-aliasing fixes 6/6: permit inlining of comdats
  2015-12-04 18:08       ` Jan Hubicka
@ 2015-12-07  8:20         ` Richard Biener
  2015-12-07 18:53           ` Jan Hubicka
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2015-12-07  8:20 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches

On Fri, 4 Dec 2015, Jan Hubicka wrote:

> Hi,
> this is the patch for fold-const.c. Can you think of some testcase for the
> MR_DEPENDENCE_CLIQUE comparsion? I am not that familiar with the code to
> be able to construct it :(

With ICF it would involve a variant using restrict args vs.
non-restrict args.  For other optimizers it's more difficult
to construct a testcase that would fail.  But if any alias related
compare is necessary in operand_equal_p then the dependence check
is required as well.

> Bootstrapped/regtested x86_64-linux, OK?

Ok.

Thanks,
Richard.

> Honza
> 
> 	* fold-const.c (operand_equal_p): Do not use flag_strict_aliasing.
> Index: fold-const.c
> ===================================================================
> --- fold-const.c	(revision 231290)
> +++ fold-const.c	(working copy)
> @@ -2987,14 +2987,13 @@ operand_equal_p (const_tree arg0, const_
>  					   flags)))
>  		return 0;
>  	      /* Verify that accesses are TBAA compatible.  */
> -	      if (flag_strict_aliasing
> -		  && (!alias_ptr_types_compatible_p
> -		        (TREE_TYPE (TREE_OPERAND (arg0, 1)),
> -		         TREE_TYPE (TREE_OPERAND (arg1, 1)))
> -		      || (MR_DEPENDENCE_CLIQUE (arg0)
> -			  != MR_DEPENDENCE_CLIQUE (arg1))
> -		      || (MR_DEPENDENCE_BASE (arg0)
> -			  != MR_DEPENDENCE_BASE (arg1))))
> +	      if (!alias_ptr_types_compatible_p
> +		     (TREE_TYPE (TREE_OPERAND (arg0, 1)),
> +		      TREE_TYPE (TREE_OPERAND (arg1, 1)))
> +		  || (MR_DEPENDENCE_CLIQUE (arg0)
> +		      != MR_DEPENDENCE_CLIQUE (arg1))
> +		  || (MR_DEPENDENCE_BASE (arg0)
> +		      != MR_DEPENDENCE_BASE (arg1)))
>  		return 0;
>  	     /* Verify that alignment is compatible.  */
>  	     if (TYPE_ALIGN (TREE_TYPE (arg0))

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: -fstrict-aliasing fixes 6/6: permit inlining of comdats
  2015-12-07  8:16     ` Richard Biener
@ 2015-12-07 18:52       ` Jan Hubicka
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Hubicka @ 2015-12-07 18:52 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc-patches

> > 
> > Well, not exactly of the case mentioned above, but still think that this is
> > safe (ugly, too). An alternative is to keep around the bodies until after
> > inlining.  I have infrastructure for that in my tree, but it is hard to tune to
> > do: first the alternative function body may have different symbol references
> > (as a result of different early inlinin) which may not be resolved in current
> > binary so we can not use it at all. Second keepin many alternatives of every
> > body around makes code size estimates in inliner go crazy.
> 
> /me inserts his usual "partitioning to the rescue" comment...
> 
> ;)

Hmm, OK. I fail to see how that would save the day here :)  It is not a problem about
the fact that we can not switch flags according the function but about the linking
that happens prior partitioning.

Honza
> 
> Richard.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: -fstrict-aliasing fixes 6/6: permit inlining of comdats
  2015-12-07  8:20         ` Richard Biener
@ 2015-12-07 18:53           ` Jan Hubicka
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Hubicka @ 2015-12-07 18:53 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jan Hubicka, gcc-patches

> On Fri, 4 Dec 2015, Jan Hubicka wrote:
> 
> > Hi,
> > this is the patch for fold-const.c. Can you think of some testcase for the
> > MR_DEPENDENCE_CLIQUE comparsion? I am not that familiar with the code to
> > be able to construct it :(
> 
> With ICF it would involve a variant using restrict args vs.
> non-restrict args.  For other optimizers it's more difficult
> to construct a testcase that would fail.  But if any alias related
> compare is necessary in operand_equal_p then the dependence check
> is required as well.
> 
> > Bootstrapped/regtested x86_64-linux, OK?
> 
> Ok.
OK, thanks.
Comitted w/o a testcase for now.  Once ipa-icf-gimple transition to operand_equal_p
is complete, we can invent a testcase.  Obviously restrict args are checked earlier
independently of the operand comparsion ATM, so this testcase would not fail either.

Honza
> 
> Thanks,
> Richard.
> 
> > Honza
> > 
> > 	* fold-const.c (operand_equal_p): Do not use flag_strict_aliasing.
> > Index: fold-const.c
> > ===================================================================
> > --- fold-const.c	(revision 231290)
> > +++ fold-const.c	(working copy)
> > @@ -2987,14 +2987,13 @@ operand_equal_p (const_tree arg0, const_
> >  					   flags)))
> >  		return 0;
> >  	      /* Verify that accesses are TBAA compatible.  */
> > -	      if (flag_strict_aliasing
> > -		  && (!alias_ptr_types_compatible_p
> > -		        (TREE_TYPE (TREE_OPERAND (arg0, 1)),
> > -		         TREE_TYPE (TREE_OPERAND (arg1, 1)))
> > -		      || (MR_DEPENDENCE_CLIQUE (arg0)
> > -			  != MR_DEPENDENCE_CLIQUE (arg1))
> > -		      || (MR_DEPENDENCE_BASE (arg0)
> > -			  != MR_DEPENDENCE_BASE (arg1))))
> > +	      if (!alias_ptr_types_compatible_p
> > +		     (TREE_TYPE (TREE_OPERAND (arg0, 1)),
> > +		      TREE_TYPE (TREE_OPERAND (arg1, 1)))
> > +		  || (MR_DEPENDENCE_CLIQUE (arg0)
> > +		      != MR_DEPENDENCE_CLIQUE (arg1))
> > +		  || (MR_DEPENDENCE_BASE (arg0)
> > +		      != MR_DEPENDENCE_BASE (arg1)))
> >  		return 0;
> >  	     /* Verify that alignment is compatible.  */
> >  	     if (TYPE_ALIGN (TREE_TYPE (arg0))

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2015-12-07 18:53 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-04  7:37 -fstrict-aliasing fixes 6/6: permit inlining of comdats Jan Hubicka
2015-12-04 11:20 ` Richard Biener
2015-12-04 16:57   ` Jan Hubicka
2015-12-04 18:04     ` Jan Hubicka
2015-12-04 18:08       ` Jan Hubicka
2015-12-07  8:20         ` Richard Biener
2015-12-07 18:53           ` Jan Hubicka
2015-12-04 23:39       ` H.J. Lu
2015-12-07  8:16     ` Richard Biener
2015-12-07 18:52       ` Jan Hubicka

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).