public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] Make multiple_target.c aware of LTO (PR lto/66295)
@ 2017-01-16  9:30 Martin Liška
  2017-01-17  1:03 ` Jan Hubicka
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Martin Liška @ 2017-01-16  9:30 UTC (permalink / raw)
  To: GCC Patches; +Cc: evstupac, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 532 bytes --]

Hello.

Not being expert in multi_target area, however it consists of 2 passes. The first
one (ipa_target_clone) is responsible for creation of multiple targets for functions
decorated with __attribute__((target_clones("xxx"))). I guess the pass should be
called just in LGEN phase and consecutive clone materialization takes care of these
clones. I'm also adding lto test-case.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
mvc test-cases work find on x86_64-linux-gnu.

Ready to be installed?
Martin

[-- Attachment #2: 0001-Make-multiple_target.c-aware-of-LTO-PR-lto-66295.patch --]
[-- Type: text/x-patch, Size: 1663 bytes --]

From 7ec7045680e10838c43b2713a4fa34b205ba5004 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Fri, 13 Jan 2017 15:26:45 +0100
Subject: [PATCH] Make multiple_target.c aware of LTO (PR lto/66295)

gcc/ChangeLog:

2017-01-13  Martin Liska  <mliska@suse.cz>

	PR lto/66295
	* multiple_target.c (pass_target_clone::gate): Run the pass
	just in LGEN (in LTO mode).

gcc/testsuite/ChangeLog:

2017-01-13  Martin Liska  <mliska@suse.cz>

	PR lto/66295
	* gcc.target/i386/mvc9.c: New test.
---
 gcc/multiple_target.c                |  2 +-
 gcc/testsuite/gcc.target/i386/mvc9.c | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/mvc9.c

diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
index 5be3980db20..3df1e297122 100644
--- a/gcc/multiple_target.c
+++ b/gcc/multiple_target.c
@@ -378,7 +378,7 @@ public:
 bool
 pass_target_clone::gate (function *)
 {
-  return true;
+  return !flag_wpa && !flag_ltrans;
 }
 
 } // anon namespace
diff --git a/gcc/testsuite/gcc.target/i386/mvc9.c b/gcc/testsuite/gcc.target/i386/mvc9.c
new file mode 100644
index 00000000000..d510b6c18b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mvc9.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-require-ifunc "" } */
+/* { dg-require-effective-target lto } */
+/* { dg-options "-flto" } */
+
+__attribute__((target_clones("avx","arch=slm","arch=core-avx2","default")))
+int
+foo ()
+{
+  return -2;
+}
+
+int
+bar ()
+{
+  return 2;
+}
+
+int
+main ()
+{
+  int r = 0;
+  r += bar ();
+  r += foo ();
+  r += bar ();
+  r += foo ();
+  r += bar ();
+  return r - 2;
+}
-- 
2.11.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Make multiple_target.c aware of LTO (PR lto/66295)
  2017-01-16  9:30 [PATCH] Make multiple_target.c aware of LTO (PR lto/66295) Martin Liška
@ 2017-01-17  1:03 ` Jan Hubicka
  2017-01-17  9:33   ` Richard Biener
  2017-02-02 13:18   ` Martin Liška
  2017-02-02 13:20 ` [PATCH] IPA: enhance dump output Martin Liška
  2017-03-13 15:19 ` [PATCH 2] Fix multiple target clones nodes (PR lto/66295) Martin Liška
  2 siblings, 2 replies; 17+ messages in thread
From: Jan Hubicka @ 2017-01-17  1:03 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, evstupac, Jan Hubicka

> Hello.
> 
> Not being expert in multi_target area, however it consists of 2 passes. The first
> one (ipa_target_clone) is responsible for creation of multiple targets for functions
> decorated with __attribute__((target_clones("xxx"))). I guess the pass should be
> called just in LGEN phase and consecutive clone materialization takes care of these
> clones. I'm also adding lto test-case.

It is declared as SIMPLE_IPA_PASS. Then it should either run early (i.e. be in
the queue after pass_ipa_lower_emutls (to run at compile stage) or before pass_ipa_pta
(to run at lgen stage).

If it really needs to run in the middle of IPA pass queue, then it should be
declared as IPA_PASS and run at WPA only.

Honza
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> mvc test-cases work find on x86_64-linux-gnu.
> 
> Ready to be installed?
> Martin

> >From 7ec7045680e10838c43b2713a4fa34b205ba5004 Mon Sep 17 00:00:00 2001
> From: marxin <mliska@suse.cz>
> Date: Fri, 13 Jan 2017 15:26:45 +0100
> Subject: [PATCH] Make multiple_target.c aware of LTO (PR lto/66295)
> 
> gcc/ChangeLog:
> 
> 2017-01-13  Martin Liska  <mliska@suse.cz>
> 
> 	PR lto/66295
> 	* multiple_target.c (pass_target_clone::gate): Run the pass
> 	just in LGEN (in LTO mode).
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-01-13  Martin Liska  <mliska@suse.cz>
> 
> 	PR lto/66295
> 	* gcc.target/i386/mvc9.c: New test.
> ---
>  gcc/multiple_target.c                |  2 +-
>  gcc/testsuite/gcc.target/i386/mvc9.c | 29 +++++++++++++++++++++++++++++
>  2 files changed, 30 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/mvc9.c
> 
> diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
> index 5be3980db20..3df1e297122 100644
> --- a/gcc/multiple_target.c
> +++ b/gcc/multiple_target.c
> @@ -378,7 +378,7 @@ public:
>  bool
>  pass_target_clone::gate (function *)
>  {
> -  return true;
> +  return !flag_wpa && !flag_ltrans;
>  }
>  
>  } // anon namespace
> diff --git a/gcc/testsuite/gcc.target/i386/mvc9.c b/gcc/testsuite/gcc.target/i386/mvc9.c
> new file mode 100644
> index 00000000000..d510b6c18b4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mvc9.c
> @@ -0,0 +1,29 @@
> +/* { dg-do run } */
> +/* { dg-require-ifunc "" } */
> +/* { dg-require-effective-target lto } */
> +/* { dg-options "-flto" } */
> +
> +__attribute__((target_clones("avx","arch=slm","arch=core-avx2","default")))
> +int
> +foo ()
> +{
> +  return -2;
> +}
> +
> +int
> +bar ()
> +{
> +  return 2;
> +}
> +
> +int
> +main ()
> +{
> +  int r = 0;
> +  r += bar ();
> +  r += foo ();
> +  r += bar ();
> +  r += foo ();
> +  r += bar ();
> +  return r - 2;
> +}
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Make multiple_target.c aware of LTO (PR lto/66295)
  2017-01-17  1:03 ` Jan Hubicka
@ 2017-01-17  9:33   ` Richard Biener
  2017-02-02 13:18   ` Martin Liška
  1 sibling, 0 replies; 17+ messages in thread
From: Richard Biener @ 2017-01-17  9:33 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Martin Liška, GCC Patches, evstupac

On Tue, Jan 17, 2017 at 2:03 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> Hello.
>>
>> Not being expert in multi_target area, however it consists of 2 passes. The first
>> one (ipa_target_clone) is responsible for creation of multiple targets for functions
>> decorated with __attribute__((target_clones("xxx"))). I guess the pass should be
>> called just in LGEN phase and consecutive clone materialization takes care of these
>> clones. I'm also adding lto test-case.
>
> It is declared as SIMPLE_IPA_PASS. Then it should either run early (i.e. be in
> the queue after pass_ipa_lower_emutls (to run at compile stage) or before pass_ipa_pta
> (to run at lgen stage).
>
> If it really needs to run in the middle of IPA pass queue, then it should be
> declared as IPA_PASS and run at WPA only.

The related pass_dispatcher_calls simple-IPA pass runs at LTRANS so
I'd try putting
this before.  Or make it a true IPA pass.

Richard.

> Honza
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>> mvc test-cases work find on x86_64-linux-gnu.
>>
>> Ready to be installed?
>> Martin
>
>> >From 7ec7045680e10838c43b2713a4fa34b205ba5004 Mon Sep 17 00:00:00 2001
>> From: marxin <mliska@suse.cz>
>> Date: Fri, 13 Jan 2017 15:26:45 +0100
>> Subject: [PATCH] Make multiple_target.c aware of LTO (PR lto/66295)
>>
>> gcc/ChangeLog:
>>
>> 2017-01-13  Martin Liska  <mliska@suse.cz>
>>
>>       PR lto/66295
>>       * multiple_target.c (pass_target_clone::gate): Run the pass
>>       just in LGEN (in LTO mode).
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2017-01-13  Martin Liska  <mliska@suse.cz>
>>
>>       PR lto/66295
>>       * gcc.target/i386/mvc9.c: New test.
>> ---
>>  gcc/multiple_target.c                |  2 +-
>>  gcc/testsuite/gcc.target/i386/mvc9.c | 29 +++++++++++++++++++++++++++++
>>  2 files changed, 30 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/testsuite/gcc.target/i386/mvc9.c
>>
>> diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
>> index 5be3980db20..3df1e297122 100644
>> --- a/gcc/multiple_target.c
>> +++ b/gcc/multiple_target.c
>> @@ -378,7 +378,7 @@ public:
>>  bool
>>  pass_target_clone::gate (function *)
>>  {
>> -  return true;
>> +  return !flag_wpa && !flag_ltrans;
>>  }
>>
>>  } // anon namespace
>> diff --git a/gcc/testsuite/gcc.target/i386/mvc9.c b/gcc/testsuite/gcc.target/i386/mvc9.c
>> new file mode 100644
>> index 00000000000..d510b6c18b4
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/i386/mvc9.c
>> @@ -0,0 +1,29 @@
>> +/* { dg-do run } */
>> +/* { dg-require-ifunc "" } */
>> +/* { dg-require-effective-target lto } */
>> +/* { dg-options "-flto" } */
>> +
>> +__attribute__((target_clones("avx","arch=slm","arch=core-avx2","default")))
>> +int
>> +foo ()
>> +{
>> +  return -2;
>> +}
>> +
>> +int
>> +bar ()
>> +{
>> +  return 2;
>> +}
>> +
>> +int
>> +main ()
>> +{
>> +  int r = 0;
>> +  r += bar ();
>> +  r += foo ();
>> +  r += bar ();
>> +  r += foo ();
>> +  r += bar ();
>> +  return r - 2;
>> +}
>> --
>> 2.11.0
>>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Make multiple_target.c aware of LTO (PR lto/66295)
  2017-01-17  1:03 ` Jan Hubicka
  2017-01-17  9:33   ` Richard Biener
@ 2017-02-02 13:18   ` Martin Liška
  2017-02-03 13:42     ` Martin Liška
  2017-02-03 13:43     ` Martin Liška
  1 sibling, 2 replies; 17+ messages in thread
From: Martin Liška @ 2017-02-02 13:18 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: GCC Patches, evstupac

[-- Attachment #1: Type: text/plain, Size: 1270 bytes --]

Ok, I spent more time with understanding how that pass works and I believe it can be
significantly simplified. I guess target_clones are very close to 'target' attribute
that is handled by C++ FE. It creates cgraph_function_version_info and function dispatcher
is generated.

I hope doing the very same by an early simple IPA pass should be the right approach.
Works fine apart from an assert it triggers:

$ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.target/i386/mvc9.c -flto -O1
lto1: internal compiler error: in binds_to_current_def_p, at symtab.c:2239
0x8c580a symtab_node::binds_to_current_def_p(symtab_node*)
	../../gcc/symtab.c:2239
0x18cb40b worse_state
	../../gcc/ipa-pure-const.c:477
0x18cd61f propagate_pure_const
	../../gcc/ipa-pure-const.c:1346
0x18ce304 execute
	../../gcc/ipa-pure-const.c:1679

triggered for foo.ifunc:

foo.ifunc/6 (foo.ifunc) @0x7f9535b138a0
  Type: function definition analyzed alias
  Visibility: prevailing_def_ironly artificial
  References: foo.resolver/7 (alias)
  Referring: 
  Read from file: /tmp/ccdj2ikS.o
  Availability: overwritable
  First run: 0
  Function flags:
  Called by: main/2 (1.00 per call) 
  Calls: 

The assert is removed in attached patch, but maybe there's a better approach?

Thanks,
Martin

[-- Attachment #2: 0001-Simplify-creation-of-target_clones-PR-lto-66295.patch --]
[-- Type: text/x-patch, Size: 6279 bytes --]

From 4dbaad270f77d3bf79d4d980371f120a19f7d3a3 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Tue, 24 Jan 2017 13:41:25 +0100
Subject: [PATCH] Simplify creation of target_clones (PR lto/66295)

gcc/ChangeLog:

2017-01-24  Martin Liska  <mliska@suse.cz>

	* multiple_target.c (create_dispatcher_calls): Redirect edge
	from a caller of a dispatcher.
	(expand_target_clones): Make the clones local.
	(ipa_target_clone): Do both target clones and resolvers.
	(ipa_dispatcher_calls): Remove the pass.
	(pass_dispatcher_calls::gate): Likewise.
	(make_pass_dispatcher_calls): Likewise.
	* passes.def (pass_target_clone): Put as very first IPA early
	pass.
	* symtab.c (symtab_node::binds_to_current_def_p): Remove assert.

gcc/testsuite/ChangeLog:

2017-01-24  Martin Liska  <mliska@suse.cz>

	* gcc.target/i386/mvc9.c: New test.
---
 gcc/multiple_target.c                | 71 +++++-------------------------------
 gcc/passes.def                       |  3 +-
 gcc/symtab.c                         |  2 -
 gcc/testsuite/gcc.target/i386/mvc9.c | 28 ++++++++++++++
 4 files changed, 39 insertions(+), 65 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/mvc9.c

diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
index 5be3980db20..7b735ae81ae 100644
--- a/gcc/multiple_target.c
+++ b/gcc/multiple_target.c
@@ -87,6 +87,7 @@ create_dispatcher_calls (struct cgraph_node *node)
 	inode->resolve_alias (cgraph_node::get (resolver_decl));
 
       e->redirect_callee (inode);
+      e->redirect_call_stmt_to_callee ();
       /*  Since REDIRECT_CALLEE modifies NEXT_CALLER field we move to
 	  previously set NEXT_CALLER.  */
       e = NULL;
@@ -283,6 +284,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
       create_new_asm_name (attr, suffix);
       /* Create new target clone.  */
       cgraph_node *new_node = create_target_clone (node, definition, suffix);
+      new_node->local.local = false;
       XDELETEVEC (suffix);
 
       /* Set new attribute for the clone.  */
@@ -334,17 +336,19 @@ expand_target_clones (struct cgraph_node *node, bool definition)
   return ret;
 }
 
-static bool target_clone_pass;
-
 static unsigned int
 ipa_target_clone (void)
 {
   struct cgraph_node *node;
 
-  target_clone_pass = false;
+  bool target_clone_pass = false;
   FOR_EACH_FUNCTION (node)
-    if (node->definition)
-      target_clone_pass |= expand_target_clones (node, true);
+    target_clone_pass |= expand_target_clones (node, node->definition);
+
+  if (target_clone_pass)
+    FOR_EACH_FUNCTION (node)
+      create_dispatcher_calls (node);
+
   return 0;
 }
 
@@ -360,7 +364,7 @@ const pass_data pass_data_target_clone =
   0,				/* properties_provided */
   0,				/* properties_destroyed */
   0,				/* todo_flags_start */
-  0				/* todo_flags_finish */
+  TODO_update_ssa		/* todo_flags_finish */
 };
 
 class pass_target_clone : public simple_ipa_opt_pass
@@ -388,58 +392,3 @@ make_pass_target_clone (gcc::context *ctxt)
 {
   return new pass_target_clone (ctxt);
 }
-
-static unsigned int
-ipa_dispatcher_calls (void)
-{
-  struct cgraph_node *node;
-
-  FOR_EACH_FUNCTION (node)
-    if (!node->definition)
-      target_clone_pass |= expand_target_clones (node, false);
-  if (target_clone_pass)
-    FOR_EACH_FUNCTION (node)
-      create_dispatcher_calls (node);
-  return 0;
-}
-
-namespace {
-
-const pass_data pass_data_dispatcher_calls =
-{
-  SIMPLE_IPA_PASS,		/* type */
-  "dispatchercalls",		/* name */
-  OPTGROUP_NONE,		/* optinfo_flags */
-  TV_NONE,			/* tv_id */
-  ( PROP_ssa | PROP_cfg ),	/* properties_required */
-  0,				/* properties_provided */
-  0,				/* properties_destroyed */
-  0,				/* todo_flags_start */
-  0				/* todo_flags_finish */
-};
-
-class pass_dispatcher_calls : public simple_ipa_opt_pass
-{
-public:
-  pass_dispatcher_calls (gcc::context *ctxt)
-    : simple_ipa_opt_pass (pass_data_dispatcher_calls, ctxt)
-  {}
-
-  /* opt_pass methods: */
-  virtual bool gate (function *);
-  virtual unsigned int execute (function *) { return ipa_dispatcher_calls (); }
-};
-
-bool
-pass_dispatcher_calls::gate (function *)
-{
-  return true;
-}
-
-} // anon namespace
-
-simple_ipa_opt_pass *
-make_pass_dispatcher_calls (gcc::context *ctxt)
-{
-  return new pass_dispatcher_calls (ctxt);
-}
diff --git a/gcc/passes.def b/gcc/passes.def
index 131d659e7a0..c09ec220d70 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -136,6 +136,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
   POP_INSERT_PASSES ()
 
+  NEXT_PASS (pass_target_clone);
   NEXT_PASS (pass_ipa_chkp_produce_thunks);
   NEXT_PASS (pass_ipa_auto_profile);
   NEXT_PASS (pass_ipa_free_inline_summary);
@@ -155,7 +156,6 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_ipa_devirt);
   NEXT_PASS (pass_ipa_cp);
   NEXT_PASS (pass_ipa_cdtor_merge);
-  NEXT_PASS (pass_target_clone);
   NEXT_PASS (pass_ipa_hsa);
   NEXT_PASS (pass_ipa_inline);
   NEXT_PASS (pass_ipa_pure_const);
@@ -174,7 +174,6 @@ along with GCC; see the file COPYING3.  If not see
   INSERT_PASSES_AFTER (all_late_ipa_passes)
   NEXT_PASS (pass_materialize_all_clones);
   NEXT_PASS (pass_ipa_pta);
-  NEXT_PASS (pass_dispatcher_calls);
   NEXT_PASS (pass_omp_simd_clone);
   TERMINATE_PASS_LIST (all_late_ipa_passes)
 
diff --git a/gcc/symtab.c b/gcc/symtab.c
index 0078896c8a8..3f43b07dac7 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -2236,8 +2236,6 @@ symtab_node::binds_to_current_def_p (symtab_node *ref)
   if (DECL_EXTERNAL (decl))
     return false;
 
-  gcc_assert (externally_visible);
-
   if (ref)
     {
       cgraph_node *cref = dyn_cast <cgraph_node *> (ref);
diff --git a/gcc/testsuite/gcc.target/i386/mvc9.c b/gcc/testsuite/gcc.target/i386/mvc9.c
new file mode 100644
index 00000000000..69e3cefb7d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mvc9.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-flto -O2" { target lto } } */
+
+__attribute__((target_clones("avx","arch=slm","arch=core-avx2","default")))
+int
+foo ()
+{
+  return -2;
+}
+
+int
+bar ()
+{
+  return 2;
+}
+
+int
+main ()
+{
+  int r = 0;
+  r += bar ();
+  r += foo ();
+  r += bar ();
+  r += foo ();
+  r += bar ();
+  return r - 2;
+}
-- 
2.11.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] IPA: enhance dump output
  2017-01-16  9:30 [PATCH] Make multiple_target.c aware of LTO (PR lto/66295) Martin Liška
  2017-01-17  1:03 ` Jan Hubicka
@ 2017-02-02 13:20 ` Martin Liška
  2017-02-02 14:54   ` Jan Hubicka
  2017-03-13 15:19 ` [PATCH 2] Fix multiple target clones nodes (PR lto/66295) Martin Liška
  2 siblings, 1 reply; 17+ messages in thread
From: Martin Liška @ 2017-02-02 13:20 UTC (permalink / raw)
  To: GCC Patches; +Cc: evstupac, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 181 bytes --]

Following patch was helpful to deal with the PR.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed or is it stage1 material?
Martin

[-- Attachment #2: 0001-IPA-enhance-dump-output.patch --]
[-- Type: text/x-patch, Size: 1450 bytes --]

From dc02af2bbbe67d9d1fb6119b606b3c1fea726062 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Tue, 24 Jan 2017 13:33:58 +0100
Subject: [PATCH] IPA: enhance dump output

gcc/ChangeLog:

2017-01-24  Martin Liska  <mliska@suse.cz>

	* cgraph.c (cgraph_node::dump): Dump function version info.
	* symtab.c (symtab_node::dump_base): Add missing new line.
---
 gcc/cgraph.c | 10 ++++++++++
 gcc/symtab.c |  1 +
 2 files changed, 11 insertions(+)

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index ef2dc50282c..74839f7d993 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -2066,6 +2066,16 @@ cgraph_node::dump (FILE *f)
     fprintf (f, "  Profile id: %i\n",
 	     profile_id);
   fprintf (f, "  First run: %i\n", tp_first_run);
+  cgraph_function_version_info *vi = function_version ();
+  if (vi != NULL)
+    {
+      /* Iterate to first item in the chain.  */
+      while (vi->prev != NULL)
+	vi = vi->prev;
+      fprintf (f, "  Version info: ");
+      dump_addr (f, "@", (void *)vi);
+      fprintf (f, "\n");
+    }
   fprintf (f, "  Function flags:");
   if (count)
     fprintf (f, " executed %" PRId64"x",
diff --git a/gcc/symtab.c b/gcc/symtab.c
index 87febdc212f..0078896c8a8 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -890,6 +890,7 @@ symtab_node::dump_base (FILE *f)
     {
       fprintf (f, "  Aux:");
       dump_addr (f, " @", (void *)aux);
+      fprintf (f, "\n");
     }
 
   fprintf (f, "  References: ");
-- 
2.11.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] IPA: enhance dump output
  2017-02-02 13:20 ` [PATCH] IPA: enhance dump output Martin Liška
@ 2017-02-02 14:54   ` Jan Hubicka
  2017-02-03 13:40     ` Martin Liška
  0 siblings, 1 reply; 17+ messages in thread
From: Jan Hubicka @ 2017-02-02 14:54 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, evstupac, Jan Hubicka

> 2017-01-24  Martin Liska  <mliska@suse.cz>
> 
> 	* cgraph.c (cgraph_node::dump): Dump function version info.
> 	* symtab.c (symtab_node::dump_base): Add missing new line.
> ---
>  gcc/cgraph.c | 10 ++++++++++
>  gcc/symtab.c |  1 +
>  2 files changed, 11 insertions(+)
> 
> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> index ef2dc50282c..74839f7d993 100644
> --- a/gcc/cgraph.c
> +++ b/gcc/cgraph.c
> @@ -2066,6 +2066,16 @@ cgraph_node::dump (FILE *f)
>      fprintf (f, "  Profile id: %i\n",
>  	     profile_id);
>    fprintf (f, "  First run: %i\n", tp_first_run);
> +  cgraph_function_version_info *vi = function_version ();
> +  if (vi != NULL)
> +    {
> +      /* Iterate to first item in the chain.  */
> +      while (vi->prev != NULL)
> +	vi = vi->prev;
> +      fprintf (f, "  Version info: ");
> +      dump_addr (f, "@", (void *)vi);
> +      fprintf (f, "\n");

I suppose it is useful to know that version info is attached, but instead of
dumping an address, i would rather meaningfully print its contents (i.e.
dispatcher and prev/next pointers in list).

OK with that change.

honza
> +    }
>    fprintf (f, "  Function flags:");
>    if (count)
>      fprintf (f, " executed %" PRId64"x",
> diff --git a/gcc/symtab.c b/gcc/symtab.c
> index 87febdc212f..0078896c8a8 100644
> --- a/gcc/symtab.c
> +++ b/gcc/symtab.c
> @@ -890,6 +890,7 @@ symtab_node::dump_base (FILE *f)
>      {
>        fprintf (f, "  Aux:");
>        dump_addr (f, " @", (void *)aux);
> +      fprintf (f, "\n");
>      }
>  
>    fprintf (f, "  References: ");
> -- 
> 2.11.0
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] IPA: enhance dump output
  2017-02-02 14:54   ` Jan Hubicka
@ 2017-02-03 13:40     ` Martin Liška
  0 siblings, 0 replies; 17+ messages in thread
From: Martin Liška @ 2017-02-03 13:40 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: GCC Patches, evstupac

[-- Attachment #1: Type: text/plain, Size: 1757 bytes --]

On 02/02/2017 03:54 PM, Jan Hubicka wrote:
>> 2017-01-24  Martin Liska  <mliska@suse.cz>
>>
>> 	* cgraph.c (cgraph_node::dump): Dump function version info.
>> 	* symtab.c (symtab_node::dump_base): Add missing new line.
>> ---
>>  gcc/cgraph.c | 10 ++++++++++
>>  gcc/symtab.c |  1 +
>>  2 files changed, 11 insertions(+)
>>
>> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
>> index ef2dc50282c..74839f7d993 100644
>> --- a/gcc/cgraph.c
>> +++ b/gcc/cgraph.c
>> @@ -2066,6 +2066,16 @@ cgraph_node::dump (FILE *f)
>>      fprintf (f, "  Profile id: %i\n",
>>  	     profile_id);
>>    fprintf (f, "  First run: %i\n", tp_first_run);
>> +  cgraph_function_version_info *vi = function_version ();
>> +  if (vi != NULL)
>> +    {
>> +      /* Iterate to first item in the chain.  */
>> +      while (vi->prev != NULL)
>> +	vi = vi->prev;
>> +      fprintf (f, "  Version info: ");
>> +      dump_addr (f, "@", (void *)vi);
>> +      fprintf (f, "\n");
> 
> I suppose it is useful to know that version info is attached, but instead of
> dumping an address, i would rather meaningfully print its contents (i.e.
> dispatcher and prev/next pointers in list).
> 
> OK with that change.

Yep, there's final version of patch, which I'll install just after testing.

Martin

> 
> honza
>> +    }
>>    fprintf (f, "  Function flags:");
>>    if (count)
>>      fprintf (f, " executed %" PRId64"x",
>> diff --git a/gcc/symtab.c b/gcc/symtab.c
>> index 87febdc212f..0078896c8a8 100644
>> --- a/gcc/symtab.c
>> +++ b/gcc/symtab.c
>> @@ -890,6 +890,7 @@ symtab_node::dump_base (FILE *f)
>>      {
>>        fprintf (f, "  Aux:");
>>        dump_addr (f, " @", (void *)aux);
>> +      fprintf (f, "\n");
>>      }
>>  
>>    fprintf (f, "  References: ");
>> -- 
>> 2.11.0
>>
> 


[-- Attachment #2: 0001-IPA-enhance-dump-output.patch --]
[-- Type: text/x-patch, Size: 1791 bytes --]

From 9e0affe017bf9420c14f5a53d7f6f282fea43e4a Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Tue, 24 Jan 2017 13:33:58 +0100
Subject: [PATCH] IPA: enhance dump output

gcc/ChangeLog:

2017-01-24  Martin Liska  <mliska@suse.cz>

	* cgraph.c (cgraph_node::dump): Dump function version info.
	* symtab.c (symtab_node::dump_base): Add missing new line.
---
 gcc/cgraph.c | 22 ++++++++++++++++++++++
 gcc/symtab.c |  1 +
 2 files changed, 23 insertions(+)

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index ef2dc50282c..f2f763e31b3 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -2066,6 +2066,28 @@ cgraph_node::dump (FILE *f)
     fprintf (f, "  Profile id: %i\n",
 	     profile_id);
   fprintf (f, "  First run: %i\n", tp_first_run);
+  cgraph_function_version_info *vi = function_version ();
+  if (vi != NULL)
+    {
+      fprintf (f, "  Version info: ");
+      if (vi->prev != NULL)
+	{
+	  fprintf (f, "prev: ");
+	  fprintf (f, "%s/%i ", vi->prev->this_node->asm_name (),
+		   vi->prev->this_node->order);
+	}
+      if (vi->next != NULL)
+	{
+	  fprintf (f, "next: ");
+	  fprintf (f, "%s/%i ", vi->next->this_node->asm_name (),
+		   vi->next->this_node->order);
+	}
+      if (vi->dispatcher_resolver != NULL_TREE)
+	fprintf (f, "dispatcher: %s",
+		 lang_hooks.decl_printable_name (vi->dispatcher_resolver, 2));
+
+      fprintf (f, "\n");
+    }
   fprintf (f, "  Function flags:");
   if (count)
     fprintf (f, " executed %" PRId64"x",
diff --git a/gcc/symtab.c b/gcc/symtab.c
index 87febdc212f..0078896c8a8 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -890,6 +890,7 @@ symtab_node::dump_base (FILE *f)
     {
       fprintf (f, "  Aux:");
       dump_addr (f, " @", (void *)aux);
+      fprintf (f, "\n");
     }
 
   fprintf (f, "  References: ");
-- 
2.11.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Make multiple_target.c aware of LTO (PR lto/66295)
  2017-02-02 13:18   ` Martin Liška
@ 2017-02-03 13:42     ` Martin Liška
  2017-02-03 13:43     ` Martin Liška
  1 sibling, 0 replies; 17+ messages in thread
From: Martin Liška @ 2017-02-03 13:42 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: GCC Patches, evstupac

[-- Attachment #1: Type: text/plain, Size: 1584 bytes --]

On 02/02/2017 02:18 PM, Martin Liška wrote:
> Ok, I spent more time with understanding how that pass works and I believe it can be
> significantly simplified. I guess target_clones are very close to 'target' attribute
> that is handled by C++ FE. It creates cgraph_function_version_info and function dispatcher
> is generated.
> 
> I hope doing the very same by an early simple IPA pass should be the right approach.
> Works fine apart from an assert it triggers:
> 
> $ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.target/i386/mvc9.c -flto -O1
> lto1: internal compiler error: in binds_to_current_def_p, at symtab.c:2239
> 0x8c580a symtab_node::binds_to_current_def_p(symtab_node*)
> 	../../gcc/symtab.c:2239
> 0x18cb40b worse_state
> 	../../gcc/ipa-pure-const.c:477
> 0x18cd61f propagate_pure_const
> 	../../gcc/ipa-pure-const.c:1346
> 0x18ce304 execute
> 	../../gcc/ipa-pure-const.c:1679
> 
> triggered for foo.ifunc:
> 
> foo.ifunc/6 (foo.ifunc) @0x7f9535b138a0
>   Type: function definition analyzed alias
>   Visibility: prevailing_def_ironly artificial
>   References: foo.resolver/7 (alias)
>   Referring: 
>   Read from file: /tmp/ccdj2ikS.o
>   Availability: overwritable
>   First run: 0
>   Function flags:
>   Called by: main/2 (1.00 per call) 
>   Calls: 
> 
> The assert is removed in attached patch, but maybe there's a better approach?

After discussion with Honza, we agreed that the function should bail out when
declaration has ifunc attribute. I'm going to install it as separate patch
right after proper testing.

Martin

> 
> Thanks,
> Martin
> 


[-- Attachment #2: 0001-Bail-out-binds_to_current_def_p-for-ifunc-functions.patch --]
[-- Type: text/x-patch, Size: 887 bytes --]

From 3f1d907737cf132f52c7a85d697e1f9d267efa86 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Fri, 3 Feb 2017 14:33:16 +0100
Subject: [PATCH] Bail out binds_to_current_def_p for ifunc functions.

gcc/ChangeLog:

2017-02-03  Martin Liska  <mliska@suse.cz>

	* symtab.c (symtab_node::binds_to_current_def_p): Bail out
	in case of a function with ifunc attribute.
---
 gcc/symtab.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/symtab.c b/gcc/symtab.c
index 0078896c8a8..f0baf081040 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -2225,6 +2225,8 @@ symtab_node::binds_to_current_def_p (symtab_node *ref)
   if (transparent_alias)
     return definition
 	   && get_alias_target()->binds_to_current_def_p (ref);
+  if (lookup_attribute ("ifunc", DECL_ATTRIBUTES (decl)))
+    return false;
   if (decl_binds_to_current_def_p (decl))
     return true;
 
-- 
2.11.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] Make multiple_target.c aware of LTO (PR lto/66295)
  2017-02-02 13:18   ` Martin Liška
  2017-02-03 13:42     ` Martin Liška
@ 2017-02-03 13:43     ` Martin Liška
  1 sibling, 0 replies; 17+ messages in thread
From: Martin Liška @ 2017-02-03 13:43 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: GCC Patches, evstupac

[-- Attachment #1: Type: text/plain, Size: 1607 bytes --]

On 02/02/2017 02:18 PM, Martin Liška wrote:
> Ok, I spent more time with understanding how that pass works and I believe it can be
> significantly simplified. I guess target_clones are very close to 'target' attribute
> that is handled by C++ FE. It creates cgraph_function_version_info and function dispatcher
> is generated.
> 
> I hope doing the very same by an early simple IPA pass should be the right approach.
> Works fine apart from an assert it triggers:

After reading of the original thread, it's still unclear why 2 passes we introduced.
As I read the original patch, there was just one. Having pre-approved patch by Honza,
I'll install that after testing.

Martin

> 
> $ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.target/i386/mvc9.c -flto -O1
> lto1: internal compiler error: in binds_to_current_def_p, at symtab.c:2239
> 0x8c580a symtab_node::binds_to_current_def_p(symtab_node*)
> 	../../gcc/symtab.c:2239
> 0x18cb40b worse_state
> 	../../gcc/ipa-pure-const.c:477
> 0x18cd61f propagate_pure_const
> 	../../gcc/ipa-pure-const.c:1346
> 0x18ce304 execute
> 	../../gcc/ipa-pure-const.c:1679
> 
> triggered for foo.ifunc:
> 
> foo.ifunc/6 (foo.ifunc) @0x7f9535b138a0
>   Type: function definition analyzed alias
>   Visibility: prevailing_def_ironly artificial
>   References: foo.resolver/7 (alias)
>   Referring: 
>   Read from file: /tmp/ccdj2ikS.o
>   Availability: overwritable
>   First run: 0
>   Function flags:
>   Called by: main/2 (1.00 per call) 
>   Calls: 
> 
> The assert is removed in attached patch, but maybe there's a better approach?
> 
> Thanks,
> Martin
> 


[-- Attachment #2: 0001-Simplify-creation-of-target_clones-PR-lto-66295.patch --]
[-- Type: text/x-patch, Size: 5808 bytes --]

From e2ff9bb3ebe8f005e7334780ede92dab6afb1a07 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Tue, 24 Jan 2017 13:41:25 +0100
Subject: [PATCH] Simplify creation of target_clones (PR lto/66295)

gcc/ChangeLog:

2017-01-24  Martin Liska  <mliska@suse.cz>

	* multiple_target.c (create_dispatcher_calls): Redirect edge
	from a caller of a dispatcher.
	(expand_target_clones): Make the clones local.
	(ipa_target_clone): Do both target clones and resolvers.
	(ipa_dispatcher_calls): Remove the pass.
	(pass_dispatcher_calls::gate): Likewise.
	(make_pass_dispatcher_calls): Likewise.
	* passes.def (pass_target_clone): Put as very first IPA early
	pass.

gcc/testsuite/ChangeLog:

2017-01-24  Martin Liska  <mliska@suse.cz>

	* gcc.target/i386/mvc9.c: New test.
---
 gcc/multiple_target.c                | 71 +++++-------------------------------
 gcc/passes.def                       |  3 +-
 gcc/testsuite/gcc.target/i386/mvc9.c | 28 ++++++++++++++
 3 files changed, 39 insertions(+), 63 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/mvc9.c

diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
index 5be3980db20..7b735ae81ae 100644
--- a/gcc/multiple_target.c
+++ b/gcc/multiple_target.c
@@ -87,6 +87,7 @@ create_dispatcher_calls (struct cgraph_node *node)
 	inode->resolve_alias (cgraph_node::get (resolver_decl));
 
       e->redirect_callee (inode);
+      e->redirect_call_stmt_to_callee ();
       /*  Since REDIRECT_CALLEE modifies NEXT_CALLER field we move to
 	  previously set NEXT_CALLER.  */
       e = NULL;
@@ -283,6 +284,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
       create_new_asm_name (attr, suffix);
       /* Create new target clone.  */
       cgraph_node *new_node = create_target_clone (node, definition, suffix);
+      new_node->local.local = false;
       XDELETEVEC (suffix);
 
       /* Set new attribute for the clone.  */
@@ -334,17 +336,19 @@ expand_target_clones (struct cgraph_node *node, bool definition)
   return ret;
 }
 
-static bool target_clone_pass;
-
 static unsigned int
 ipa_target_clone (void)
 {
   struct cgraph_node *node;
 
-  target_clone_pass = false;
+  bool target_clone_pass = false;
   FOR_EACH_FUNCTION (node)
-    if (node->definition)
-      target_clone_pass |= expand_target_clones (node, true);
+    target_clone_pass |= expand_target_clones (node, node->definition);
+
+  if (target_clone_pass)
+    FOR_EACH_FUNCTION (node)
+      create_dispatcher_calls (node);
+
   return 0;
 }
 
@@ -360,7 +364,7 @@ const pass_data pass_data_target_clone =
   0,				/* properties_provided */
   0,				/* properties_destroyed */
   0,				/* todo_flags_start */
-  0				/* todo_flags_finish */
+  TODO_update_ssa		/* todo_flags_finish */
 };
 
 class pass_target_clone : public simple_ipa_opt_pass
@@ -388,58 +392,3 @@ make_pass_target_clone (gcc::context *ctxt)
 {
   return new pass_target_clone (ctxt);
 }
-
-static unsigned int
-ipa_dispatcher_calls (void)
-{
-  struct cgraph_node *node;
-
-  FOR_EACH_FUNCTION (node)
-    if (!node->definition)
-      target_clone_pass |= expand_target_clones (node, false);
-  if (target_clone_pass)
-    FOR_EACH_FUNCTION (node)
-      create_dispatcher_calls (node);
-  return 0;
-}
-
-namespace {
-
-const pass_data pass_data_dispatcher_calls =
-{
-  SIMPLE_IPA_PASS,		/* type */
-  "dispatchercalls",		/* name */
-  OPTGROUP_NONE,		/* optinfo_flags */
-  TV_NONE,			/* tv_id */
-  ( PROP_ssa | PROP_cfg ),	/* properties_required */
-  0,				/* properties_provided */
-  0,				/* properties_destroyed */
-  0,				/* todo_flags_start */
-  0				/* todo_flags_finish */
-};
-
-class pass_dispatcher_calls : public simple_ipa_opt_pass
-{
-public:
-  pass_dispatcher_calls (gcc::context *ctxt)
-    : simple_ipa_opt_pass (pass_data_dispatcher_calls, ctxt)
-  {}
-
-  /* opt_pass methods: */
-  virtual bool gate (function *);
-  virtual unsigned int execute (function *) { return ipa_dispatcher_calls (); }
-};
-
-bool
-pass_dispatcher_calls::gate (function *)
-{
-  return true;
-}
-
-} // anon namespace
-
-simple_ipa_opt_pass *
-make_pass_dispatcher_calls (gcc::context *ctxt)
-{
-  return new pass_dispatcher_calls (ctxt);
-}
diff --git a/gcc/passes.def b/gcc/passes.def
index 131d659e7a0..c09ec220d70 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -136,6 +136,7 @@ along with GCC; see the file COPYING3.  If not see
       POP_INSERT_PASSES ()
   POP_INSERT_PASSES ()
 
+  NEXT_PASS (pass_target_clone);
   NEXT_PASS (pass_ipa_chkp_produce_thunks);
   NEXT_PASS (pass_ipa_auto_profile);
   NEXT_PASS (pass_ipa_free_inline_summary);
@@ -155,7 +156,6 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_ipa_devirt);
   NEXT_PASS (pass_ipa_cp);
   NEXT_PASS (pass_ipa_cdtor_merge);
-  NEXT_PASS (pass_target_clone);
   NEXT_PASS (pass_ipa_hsa);
   NEXT_PASS (pass_ipa_inline);
   NEXT_PASS (pass_ipa_pure_const);
@@ -174,7 +174,6 @@ along with GCC; see the file COPYING3.  If not see
   INSERT_PASSES_AFTER (all_late_ipa_passes)
   NEXT_PASS (pass_materialize_all_clones);
   NEXT_PASS (pass_ipa_pta);
-  NEXT_PASS (pass_dispatcher_calls);
   NEXT_PASS (pass_omp_simd_clone);
   TERMINATE_PASS_LIST (all_late_ipa_passes)
 
diff --git a/gcc/testsuite/gcc.target/i386/mvc9.c b/gcc/testsuite/gcc.target/i386/mvc9.c
new file mode 100644
index 00000000000..69e3cefb7d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/mvc9.c
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-flto -O2" { target lto } } */
+
+__attribute__((target_clones("avx","arch=slm","arch=core-avx2","default")))
+int
+foo ()
+{
+  return -2;
+}
+
+int
+bar ()
+{
+  return 2;
+}
+
+int
+main ()
+{
+  int r = 0;
+  r += bar ();
+  r += foo ();
+  r += bar ();
+  r += foo ();
+  r += bar ();
+  return r - 2;
+}
-- 
2.11.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 2] Fix multiple target clones nodes (PR lto/66295).
  2017-01-16  9:30 [PATCH] Make multiple_target.c aware of LTO (PR lto/66295) Martin Liška
  2017-01-17  1:03 ` Jan Hubicka
  2017-02-02 13:20 ` [PATCH] IPA: enhance dump output Martin Liška
@ 2017-03-13 15:19 ` Martin Liška
  2017-03-14  9:46   ` Richard Biener
  2017-03-15  9:30   ` Rainer Orth
  2 siblings, 2 replies; 17+ messages in thread
From: Martin Liška @ 2017-03-13 15:19 UTC (permalink / raw)
  To: GCC Patches; +Cc: evstupac, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 220 bytes --]

Hello.

This is a small follow-up patch, where local.local flag should be also dropped
for a default implementation.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Martin

[-- Attachment #2: 0001-Fix-multiple-target-clones-nodes-PR-lto-66295.patch --]
[-- Type: text/x-patch, Size: 2213 bytes --]

From 0d7b793b23e6dad738abaee32a9e0fbf0f4c70c5 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Mon, 13 Mar 2017 09:55:40 +0100
Subject: [PATCH] Fix multiple target clones nodes (PR lto/66295).

gcc/ChangeLog:

2017-03-13  Martin Liska  <mliska@suse.cz>

	PR lto/66295
	* multiple_target.c (expand_target_clones): Drop local.local
	flag for default implementation.

gcc/testsuite/ChangeLog:

2017-03-13  Martin Liska  <mliska@suse.cz>

	PR lto/66295
	* gcc.dg/tree-prof/pr66295.c: New test.
---
 gcc/multiple_target.c                    |  1 +
 gcc/testsuite/gcc.dg/tree-prof/pr66295.c | 34 ++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/pr66295.c

diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
index 7b735ae81ae..4a835bbcc17 100644
--- a/gcc/multiple_target.c
+++ b/gcc/multiple_target.c
@@ -327,6 +327,7 @@ expand_target_clones (struct cgraph_node *node, bool definition)
   tree attributes = make_attribute ("target", "default",
 				    DECL_ATTRIBUTES (node->decl));
   DECL_ATTRIBUTES (node->decl) = attributes;
+  node->local.local = false;
   location_t saved_loc = input_location;
   input_location = DECL_SOURCE_LOCATION (node->decl);
   bool ret
diff --git a/gcc/testsuite/gcc.dg/tree-prof/pr66295.c b/gcc/testsuite/gcc.dg/tree-prof/pr66295.c
new file mode 100644
index 00000000000..1ab7e6c8f64
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/pr66295.c
@@ -0,0 +1,34 @@
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O2" } */
+
+static double bar (double *__restrict, double *__restrict, int)
+__attribute__ ((target_clones("avx,avx2,avx512f,default")));
+
+double
+foo (double *__restrict a, double *__restrict b, int n)
+{
+  return bar (a,b,n);
+}
+
+double
+bar (double *__restrict a, double *__restrict b, int n)	/* { dg-error "attribute\[^\n\r]*foo\[^\n\r]* is unknown" } */
+{
+  double s;
+  int i;
+  s = 0.0;
+  for (i=0; i<n; i++)
+    s += a[i] + b[i];
+
+  return s;
+}
+
+#define N 5
+
+int main ()
+{
+  double a[N] = {1.2f, 1.2f, 1.2f, 1.2f, 1.2f };
+  double b[N] = {1.2f, 1.2f, 1.2f, 1.2f, 1.2f };
+
+  __builtin_printf ("value: %.5f\n", foo (a, b, N));
+  return 0;
+}
-- 
2.11.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2] Fix multiple target clones nodes (PR lto/66295).
  2017-03-13 15:19 ` [PATCH 2] Fix multiple target clones nodes (PR lto/66295) Martin Liška
@ 2017-03-14  9:46   ` Richard Biener
  2017-03-14 10:09     ` Martin Liška
  2017-03-15  9:30   ` Rainer Orth
  1 sibling, 1 reply; 17+ messages in thread
From: Richard Biener @ 2017-03-14  9:46 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, evstupac, Jan Hubicka

On Mon, Mar 13, 2017 at 4:19 PM, Martin Liška <mliska@suse.cz> wrote:
> Hello.
>
> This is a small follow-up patch, where local.local flag should be also dropped
> for a default implementation.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

I see we have clered the flag on the clones this way.  But isn't it
bogus to have
it set in the first place?  That is, isn't analysis sofar given bogus info?

Shouldn't we instead fix local_p to not mark functions with MV attribute local
in the first place?

Richard.

> Martin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2] Fix multiple target clones nodes (PR lto/66295).
  2017-03-14  9:46   ` Richard Biener
@ 2017-03-14 10:09     ` Martin Liška
  2017-03-14 10:29       ` Richard Biener
  0 siblings, 1 reply; 17+ messages in thread
From: Martin Liška @ 2017-03-14 10:09 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, evstupac, Jan Hubicka

On 03/14/2017 10:46 AM, Richard Biener wrote:
> On Mon, Mar 13, 2017 at 4:19 PM, Martin Liška <mliska@suse.cz> wrote:
>> Hello.
>>
>> This is a small follow-up patch, where local.local flag should be also dropped
>> for a default implementation.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
> 
> I see we have clered the flag on the clones this way.  But isn't it
> bogus to have
> it set in the first place?  That is, isn't analysis sofar given bogus info?

Yes, I did it for clones. Reason why we mark it is that local flag is set
by pass_ipa_function_and_variable_visibility pass, which runs before MV pass.

I can imagine MV can bail out for a non-trivial reason and the visibility pass
should somehow simulate and predict what happens in the MV pass?

Martin

> 
> Shouldn't we instead fix local_p to not mark functions with MV attribute local
> in the first place?
> 
> Richard.
> 
>> Martin

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2] Fix multiple target clones nodes (PR lto/66295).
  2017-03-14 10:09     ` Martin Liška
@ 2017-03-14 10:29       ` Richard Biener
  2017-03-14 11:04         ` Martin Liška
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Biener @ 2017-03-14 10:29 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, evstupac, Jan Hubicka

On Tue, Mar 14, 2017 at 11:09 AM, Martin Liška <mliska@suse.cz> wrote:
> On 03/14/2017 10:46 AM, Richard Biener wrote:
>> On Mon, Mar 13, 2017 at 4:19 PM, Martin Liška <mliska@suse.cz> wrote:
>>> Hello.
>>>
>>> This is a small follow-up patch, where local.local flag should be also dropped
>>> for a default implementation.
>>>
>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>
>>> Ready to be installed?
>>
>> I see we have clered the flag on the clones this way.  But isn't it
>> bogus to have
>> it set in the first place?  That is, isn't analysis sofar given bogus info?
>
> Yes, I did it for clones. Reason why we mark it is that local flag is set
> by pass_ipa_function_and_variable_visibility pass, which runs before MV pass.
>
> I can imagine MV can bail out for a non-trivial reason and the visibility pass
> should somehow simulate and predict what happens in the MV pass?

Setting .local to true is an optimization, isn't it?  So just not doing it when
the attribute is present should be safe.  Just "undoing" .local = true later
is asking for trouble if some intermediate pass decides to do some transform
based on .local = true, no?

Richard.

> Martin
>
>>
>> Shouldn't we instead fix local_p to not mark functions with MV attribute local
>> in the first place?
>>
>> Richard.
>>
>>> Martin
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2] Fix multiple target clones nodes (PR lto/66295).
  2017-03-14 10:29       ` Richard Biener
@ 2017-03-14 11:04         ` Martin Liška
  2017-03-14 11:11           ` Richard Biener
  0 siblings, 1 reply; 17+ messages in thread
From: Martin Liška @ 2017-03-14 11:04 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, evstupac, Jan Hubicka

On 03/14/2017 11:29 AM, Richard Biener wrote:
> On Tue, Mar 14, 2017 at 11:09 AM, Martin Liška <mliska@suse.cz> wrote:
>> On 03/14/2017 10:46 AM, Richard Biener wrote:
>>> On Mon, Mar 13, 2017 at 4:19 PM, Martin Liška <mliska@suse.cz> wrote:
>>>> Hello.
>>>>
>>>> This is a small follow-up patch, where local.local flag should be also dropped
>>>> for a default implementation.
>>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>
>>> I see we have clered the flag on the clones this way.  But isn't it
>>> bogus to have
>>> it set in the first place?  That is, isn't analysis sofar given bogus info?
>>
>> Yes, I did it for clones. Reason why we mark it is that local flag is set
>> by pass_ipa_function_and_variable_visibility pass, which runs before MV pass.
>>
>> I can imagine MV can bail out for a non-trivial reason and the visibility pass
>> should somehow simulate and predict what happens in the MV pass?
> 
> Setting .local to true is an optimization, isn't it?  So just not doing it when
> the attribute is present should be safe.  Just "undoing" .local = true later
> is asking for trouble if some intermediate pass decides to do some transform
> based on .local = true, no?

Ok, there's more detail analysis. When a MV function is set local (either it's a static symbol),
or IPA split (or IPA inline) makes a clone. This is all fine because we define local as:

struct GTY(()) cgraph_local_info {
  /* Set when function is visible in current compilation unit only and
     its address is never taken.  */
  unsigned local : 1;

All IPA passes can happily makes optimization based on that. However, in moment where we
introduce a dispatcher (that obviously takes an address), then the function starts to not
be local.

Thus I believe my original patch (and the part which is already in trunk) should work
in the right way.

Martin

> 
> Richard.
> 
>> Martin
>>
>>>
>>> Shouldn't we instead fix local_p to not mark functions with MV attribute local
>>> in the first place?
>>>
>>> Richard.
>>>
>>>> Martin
>>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2] Fix multiple target clones nodes (PR lto/66295).
  2017-03-14 11:04         ` Martin Liška
@ 2017-03-14 11:11           ` Richard Biener
  0 siblings, 0 replies; 17+ messages in thread
From: Richard Biener @ 2017-03-14 11:11 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, evstupac, Jan Hubicka

On Tue, Mar 14, 2017 at 12:04 PM, Martin Liška <mliska@suse.cz> wrote:
> On 03/14/2017 11:29 AM, Richard Biener wrote:
>> On Tue, Mar 14, 2017 at 11:09 AM, Martin Liška <mliska@suse.cz> wrote:
>>> On 03/14/2017 10:46 AM, Richard Biener wrote:
>>>> On Mon, Mar 13, 2017 at 4:19 PM, Martin Liška <mliska@suse.cz> wrote:
>>>>> Hello.
>>>>>
>>>>> This is a small follow-up patch, where local.local flag should be also dropped
>>>>> for a default implementation.
>>>>>
>>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>>>
>>>>> Ready to be installed?
>>>>
>>>> I see we have clered the flag on the clones this way.  But isn't it
>>>> bogus to have
>>>> it set in the first place?  That is, isn't analysis sofar given bogus info?
>>>
>>> Yes, I did it for clones. Reason why we mark it is that local flag is set
>>> by pass_ipa_function_and_variable_visibility pass, which runs before MV pass.
>>>
>>> I can imagine MV can bail out for a non-trivial reason and the visibility pass
>>> should somehow simulate and predict what happens in the MV pass?
>>
>> Setting .local to true is an optimization, isn't it?  So just not doing it when
>> the attribute is present should be safe.  Just "undoing" .local = true later
>> is asking for trouble if some intermediate pass decides to do some transform
>> based on .local = true, no?
>
> Ok, there's more detail analysis. When a MV function is set local (either it's a static symbol),
> or IPA split (or IPA inline) makes a clone. This is all fine because we define local as:
>
> struct GTY(()) cgraph_local_info {
>   /* Set when function is visible in current compilation unit only and
>      its address is never taken.  */
>   unsigned local : 1;
>
> All IPA passes can happily makes optimization based on that. However, in moment where we
> introduce a dispatcher (that obviously takes an address), then the function starts to not
> be local.
>
> Thus I believe my original patch (and the part which is already in trunk) should work
> in the right way.

Ah, ok.  I thought it was also supposed to preserve the ABI but if
only the ABI of the
dispatcher (the original fndecl) matters then this is moot.

Patch is ok then.

Thanks,
Richard.

> Martin
>
>>
>> Richard.
>>
>>> Martin
>>>
>>>>
>>>> Shouldn't we instead fix local_p to not mark functions with MV attribute local
>>>> in the first place?
>>>>
>>>> Richard.
>>>>
>>>>> Martin
>>>
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2] Fix multiple target clones nodes (PR lto/66295).
  2017-03-13 15:19 ` [PATCH 2] Fix multiple target clones nodes (PR lto/66295) Martin Liška
  2017-03-14  9:46   ` Richard Biener
@ 2017-03-15  9:30   ` Rainer Orth
  2017-03-15 10:04     ` Martin Liška
  1 sibling, 1 reply; 17+ messages in thread
From: Rainer Orth @ 2017-03-15  9:30 UTC (permalink / raw)
  To: Martin Liška; +Cc: GCC Patches, evstupac, Jan Hubicka

Hi Martin,

> This is a small follow-up patch, where local.local flag should be also dropped
> for a default implementation.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

the testcase causes

WARNING: profopt.exp does not support dg-error

I suspect it can just as well go into gcc.dg with

/* { dg-require-profiling "-fprofile-generate" } */

and -fprofile-generate added to dg-options.

	Rainer

-- 
-----------------------------------------------------------------------------
Rainer Orth, Center for Biotechnology, Bielefeld University

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2] Fix multiple target clones nodes (PR lto/66295).
  2017-03-15  9:30   ` Rainer Orth
@ 2017-03-15 10:04     ` Martin Liška
  0 siblings, 0 replies; 17+ messages in thread
From: Martin Liška @ 2017-03-15 10:04 UTC (permalink / raw)
  To: Rainer Orth; +Cc: GCC Patches, evstupac, Jan Hubicka

[-- Attachment #1: Type: text/plain, Size: 664 bytes --]

On 03/15/2017 10:30 AM, Rainer Orth wrote:
> Hi Martin,
>
>> This is a small follow-up patch, where local.local flag should be also dropped
>> for a default implementation.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>
> the testcase causes
>
> WARNING: profopt.exp does not support dg-error
>
> I suspect it can just as well go into gcc.dg with

Thanks for pointing out. As I read the test-case, the dg-error is not needed.
I'm going to install following patch as obvious.

Martin

>
> /* { dg-require-profiling "-fprofile-generate" } */
>
> and -fprofile-generate added to dg-options.
>
> 	Rainer
>


[-- Attachment #2: 0001-Removed-unused-dg-error.patch --]
[-- Type: text/x-patch, Size: 940 bytes --]

From 3ac3656787ded319f19c347a3576474adc48a6d9 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Wed, 15 Mar 2017 11:03:27 +0100
Subject: [PATCH] Removed unused dg-error.

gcc/testsuite/ChangeLog:

2017-03-15  Martin Liska  <mliska@suse.cz>

	* gcc.dg/tree-prof/pr66295.c: Removed unused dg-error.
---
 gcc/testsuite/gcc.dg/tree-prof/pr66295.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-prof/pr66295.c b/gcc/testsuite/gcc.dg/tree-prof/pr66295.c
index 1ab7e6c8f64..b90ef84ed10 100644
--- a/gcc/testsuite/gcc.dg/tree-prof/pr66295.c
+++ b/gcc/testsuite/gcc.dg/tree-prof/pr66295.c
@@ -11,7 +11,7 @@ foo (double *__restrict a, double *__restrict b, int n)
 }
 
 double
-bar (double *__restrict a, double *__restrict b, int n)	/* { dg-error "attribute\[^\n\r]*foo\[^\n\r]* is unknown" } */
+bar (double *__restrict a, double *__restrict b, int n)
 {
   double s;
   int i;
-- 
2.11.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-03-15 10:04 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-16  9:30 [PATCH] Make multiple_target.c aware of LTO (PR lto/66295) Martin Liška
2017-01-17  1:03 ` Jan Hubicka
2017-01-17  9:33   ` Richard Biener
2017-02-02 13:18   ` Martin Liška
2017-02-03 13:42     ` Martin Liška
2017-02-03 13:43     ` Martin Liška
2017-02-02 13:20 ` [PATCH] IPA: enhance dump output Martin Liška
2017-02-02 14:54   ` Jan Hubicka
2017-02-03 13:40     ` Martin Liška
2017-03-13 15:19 ` [PATCH 2] Fix multiple target clones nodes (PR lto/66295) Martin Liška
2017-03-14  9:46   ` Richard Biener
2017-03-14 10:09     ` Martin Liška
2017-03-14 10:29       ` Richard Biener
2017-03-14 11:04         ` Martin Liška
2017-03-14 11:11           ` Richard Biener
2017-03-15  9:30   ` Rainer Orth
2017-03-15 10:04     ` Martin Liška

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).