public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [gomp4] Add tables generation
@ 2014-03-20 17:01 Bernd Schmidt
  2014-03-20 19:24 ` Jakub Jelinek
                   ` (3 more replies)
  0 siblings, 4 replies; 48+ messages in thread
From: Bernd Schmidt @ 2014-03-20 17:01 UTC (permalink / raw)
  To: GCC Patches; +Cc: Ilya Verbin, Michael Zolotukhin

[-- Attachment #1: Type: text/plain, Size: 415 bytes --]

This is based on Michael Zolotukhin's patch 2/3 from a while ago. It 
adds functionality to build function/variable tables that will allow 
libgomp to look up offload target code based on the address of the 
corresponding host function. There are two alternatives, one based on 
named sections, and one based on a target hook when named sections are 
unavailable (as on ptx).

Committed on gomp-4_0-branch.


Bernd

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: zolotukhin2-g4.diff --]
[-- Type: text/x-patch; name="zolotukhin2-g4.diff", Size: 15777 bytes --]

Index: libgcc/ChangeLog
===================================================================
--- libgcc/ChangeLog	(revision 208706)
+++ libgcc/ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2014-03-20  Bernd Schmidt  <bernds@codesourcery.com>
+
+	* crtstuff.c (_omp_func_table, _omp_var_table, _omp_funcs_end,
+	_omp_vars_end): New array fragments.
+	(__OPENMP_TARGET__): New variable.
+
 2014-02-28  Joey Ye  <joey.ye@arm.com>
 
 	PR libgcc/60166
Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 208720)
+++ gcc/ChangeLog	(working copy)
@@ -1,5 +1,20 @@
 2014-03-20  Bernd Schmidt  <bernds@codesourcery.com>
 
+	Mostly by Michael Zolotukhin:
+	* omp-low.c: Include "common/common-target.h".
+	(expand_omp_target): Pass in address of __OPENMP_TARGET__.
+	(add_decls_addresses_to_decl_constructor, omp_finish_file): New
+	functions.
+	* omp-low.h (omp_finish_file): Declare.
+	* toplev.c: Include "omp-low.h".
+	(compile_file): Call omp_finish_file.
+	* target.def (record_offload_symbol): New hook.
+	* doc/tm.texi.in (TARGET_RECORD_OFFLOAD_SYMBOL): Add.
+	* doc/tm.texi: Regenerate.
+	* configure.ac (ENABLE_OFFLOADING): Define if we have offload_targets.
+	* configure: Regenerate.
+	* config.in: Regenerate.
+
 	* config/darwin.c: Include "lto-section-names.h".
 	(LTO_SEGMENT_NAME): Don't define.
 	* config/i386/winnt.c: Include "lto-section-names.h".
Index: gcc/config.in
===================================================================
--- gcc/config.in	(revision 208715)
+++ gcc/config.in	(working copy)
@@ -139,6 +139,12 @@
 #endif
 
 
+/* Define this to enable support for offloading. */
+#ifndef USED_FOR_TARGET
+#undef ENABLE_OFFLOADING
+#endif
+
+
 /* Define to enable plugin support. */
 #ifndef USED_FOR_TARGET
 #undef ENABLE_PLUGIN
Index: gcc/configure
===================================================================
--- gcc/configure	(revision 208715)
+++ gcc/configure	(working copy)
@@ -7363,6 +7363,11 @@ cat >>confdefs.h <<_ACEOF
 #define OFFLOAD_TARGETS "$offload_targets"
 _ACEOF
 
+if test x$offload_targets != x; then
+
+$as_echo "#define ENABLE_OFFLOADING 1" >>confdefs.h
+
+fi
 
 
 # Check whether --with-multilib-list was given.
@@ -18008,7 +18013,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18011 "configure"
+#line 18016 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -18114,7 +18119,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 18117 "configure"
+#line 18122 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
Index: gcc/configure.ac
===================================================================
--- gcc/configure.ac	(revision 208715)
+++ gcc/configure.ac	(working copy)
@@ -887,6 +887,10 @@ AC_SUBST(enable_accelerator)
 offload_targets=`echo $offload_targets | sed -e 's#,#:#'`
 AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
  [Define to hold the list of target names suitable for offloading.])
+if test x$offload_targets != x; then
+  AC_DEFINE(ENABLE_OFFLOADING, 1,
+    [Define this to enable support for offloading.])
+fi
 
 AC_ARG_WITH(multilib-list,
 [AS_HELP_STRING([--with-multilib-list], [select multilibs (AArch64, SH and x86-64 only)])],
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	(revision 208706)
+++ gcc/doc/tm.texi	(working copy)
@@ -11418,3 +11418,9 @@ If defined, this function returns an app
 @deftypefn {Target Hook} void TARGET_ATOMIC_ASSIGN_EXPAND_FENV (tree *@var{hold}, tree *@var{clear}, tree *@var{update})
 ISO C11 requires atomic compound assignments that may raise floating-point exceptions to raise exceptions corresponding to the arithmetic operation whose result was successfully stored in a compare-and-exchange sequence.  This requires code equivalent to calls to @code{feholdexcept}, @code{feclearexcept} and @code{feupdateenv} to be generated at appropriate points in the compare-and-exchange sequence.  This hook should set @code{*@var{hold}} to an expression equivalent to the call to @code{feholdexcept}, @code{*@var{clear}} to an expression equivalent to the call to @code{feclearexcept} and @code{*@var{update}} to an expression equivalent to the call to @code{feupdateenv}.  The three expressions are @code{NULL_TREE} on entry to the hook and may be left as @code{NULL_TREE} if no code is required in a particular place.  The default implementation leaves all three expressions as @code{NULL_TREE}.  The @code{__atomic_feraiseexcept} function from @code{libatomic} may be of use as part of the code generated in @code{*@var{update}}.
 @end deftypefn
+
+@deftypefn {Target Hook} void TARGET_RECORD_OFFLOAD_SYMBOL (tree)
+Used when offloaded functions are seen in the compilation unit and no named
+sections are available.  It is called once for each symbol that must be
+recorded in the offload function and variable table.
+@end deftypefn
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	(revision 208706)
+++ gcc/doc/tm.texi.in	(working copy)
@@ -8414,3 +8414,5 @@ and the associated definitions of those
 @hook TARGET_ATOMIC_ALIGN_FOR_MODE
 
 @hook TARGET_ATOMIC_ASSIGN_EXPAND_FENV
+
+@hook TARGET_RECORD_OFFLOAD_SYMBOL
Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c	(revision 208706)
+++ gcc/omp-low.c	(working copy)
@@ -64,6 +64,7 @@ along with GCC; see the file COPYING3.
 #include "optabs.h"
 #include "cfgloop.h"
 #include "target.h"
+#include "common/common-target.h"
 #include "omp-low.h"
 #include "gimple-low.h"
 #include "tree-cfgcleanup.h"
@@ -8671,19 +8672,22 @@ expand_omp_target (struct omp_region *re
     }
 
   gimple g;
-  /* FIXME: This will be address of
-     extern char __OPENMP_TARGET__[] __attribute__((visibility ("hidden")))
-     symbol, as soon as the linker plugin is able to create it for us.  */
-  tree openmp_target = build_zero_cst (ptr_type_node);
+  tree openmp_target
+    = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+		  get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
+  TREE_PUBLIC (openmp_target) = 1;
+  DECL_EXTERNAL (openmp_target) = 1;
   if (kind == GF_OMP_TARGET_KIND_REGION)
     {
       tree fnaddr = build_fold_addr_expr (child_fn);
-      g = gimple_build_call (builtin_decl_explicit (start_ix), 7,
-			     device, fnaddr, openmp_target, t1, t2, t3, t4);
+      g = gimple_build_call (builtin_decl_explicit (start_ix), 7, device,
+			     fnaddr, build_fold_addr_expr (openmp_target),
+			     t1, t2, t3, t4);
     }
   else
-    g = gimple_build_call (builtin_decl_explicit (start_ix), 6,
-			   device, openmp_target, t1, t2, t3, t4);
+    g = gimple_build_call (builtin_decl_explicit (start_ix), 6, device,
+			   build_fold_addr_expr (openmp_target),
+			   t1, t2, t3, t4);
   gimple_set_location (g, gimple_location (entry_stmt));
   gsi_insert_before (&gsi, g, GSI_SAME_STMT);
   if (kind != GF_OMP_TARGET_KIND_REGION)
@@ -12801,4 +12805,139 @@ make_pass_omp_simd_clone (gcc::context *
   return new pass_omp_simd_clone (ctxt);
 }
 
+/* Helper function for omp_finish_file routine.
+   Takes decls from V_DECLS and adds their addresses and sizes to
+   constructor-vector V_CTOR.  It will be later used as DECL_INIT for decl
+   representing a global symbol for OpenMP descriptor.  */
+static void
+add_decls_addresses_to_decl_constructor (vec<tree, va_gc> *v_decls,
+					 vec<constructor_elt, va_gc> *v_ctor)
+{
+  unsigned len = vec_safe_length (v_decls);
+  for (unsigned i = 0; i < len; i++)
+    {
+      tree it = (*v_decls)[i];
+      bool is_function = TREE_CODE (it) != VAR_DECL;
+
+      CONSTRUCTOR_APPEND_ELT (v_ctor, NULL_TREE, build_fold_addr_expr (it));
+      if (!is_function)
+	CONSTRUCTOR_APPEND_ELT (v_ctor, NULL_TREE,
+				fold_convert (const_ptr_type_node,
+					      DECL_SIZE (it)));
+    }
+}
+
+/* Create new symbol containing (address, size) pairs for omp-marked
+   functions and global variables.  */
+void
+omp_finish_file (void)
+{
+  struct cgraph_node *node;
+  struct varpool_node *vnode;
+  const char *funcs_section_name = ".offload_func_table_section";
+  const char *vars_section_name = ".offload_var_table_section";
+  vec<tree, va_gc> *v_funcs, *v_vars;
+
+  vec_alloc (v_vars, 0);
+  vec_alloc (v_funcs, 0);
+
+  /* Collect all omp-target functions.  */
+  FOR_EACH_DEFINED_FUNCTION (node)
+    {
+      /* TODO: This check could fail on functions, created by omp
+	 parallel/task pragmas.  It's better to name outlined for offloading
+	 functions in some different way and to check here the function name.
+	 It could be something like "*_omp_tgtfn" in contrast with "*_omp_fn"
+	 for functions from omp parallel/task pragmas.  */
+      if (!lookup_attribute ("omp declare target",
+			     DECL_ATTRIBUTES (node->decl))
+	  || !DECL_ARTIFICIAL (node->decl))
+	continue;
+      vec_safe_push (v_funcs, node->decl);
+    }
+  /* Collect all omp-target global variables.  */
+  FOR_EACH_DEFINED_VARIABLE (vnode)
+    {
+      if (!lookup_attribute ("omp declare target",
+			     DECL_ATTRIBUTES (vnode->decl))
+	  || TREE_CODE (vnode->decl) != VAR_DECL
+	  || DECL_SIZE (vnode->decl) == 0)
+	continue;
+
+      vec_safe_push (v_vars, vnode->decl);
+    }
+  unsigned num_vars = vec_safe_length (v_vars);
+  unsigned num_funcs = vec_safe_length (v_funcs);
+
+  if (num_vars == 0 && num_funcs == 0)
+    return;
+
+#ifdef ACCEL_COMPILER
+  /* Decls are placed in reversed order in fat-objects, so we need to
+     revert them back if we compile target.  */
+  for (unsigned i = 0; i < num_funcs / 2; i++)
+    {
+      tree it = (*v_funcs)[i];
+      (*v_funcs)[i] = (*v_funcs)[num_funcs - i - 1];
+      (*v_funcs)[num_funcs - i - 1] = it;
+    }
+  for (unsigned i = 0; i < num_vars / 2; i++)
+    {
+      tree it = (*v_vars)[i];
+      (*v_vars)[i] = (*v_vars)[num_vars - i - 1];
+      (*v_vars)[num_vars - i - 1] = it;
+    }
+#endif
+
+  if (targetm_common.have_named_sections)
+    {
+      vec<constructor_elt, va_gc> *v_f, *v_v;
+      vec_alloc (v_f, num_funcs);
+      vec_alloc (v_v, num_vars * 2);
+
+      add_decls_addresses_to_decl_constructor (v_funcs, v_f);
+      add_decls_addresses_to_decl_constructor (v_vars, v_v);
+
+      tree vars_decl_type = build_array_type_nelts (pointer_sized_int_node,
+						    num_vars * 2);
+      tree funcs_decl_type = build_array_type_nelts (pointer_sized_int_node,
+						     num_funcs);
+      TYPE_ALIGN (vars_decl_type) = TYPE_ALIGN (pointer_sized_int_node);
+      TYPE_ALIGN (funcs_decl_type) = TYPE_ALIGN (pointer_sized_int_node);
+      tree ctor_v = build_constructor (vars_decl_type, v_v);
+      tree ctor_f = build_constructor (funcs_decl_type, v_f);
+      TREE_CONSTANT (ctor_v) = TREE_CONSTANT (ctor_f) = 1;
+      TREE_STATIC (ctor_v) = TREE_STATIC (ctor_f) = 1;
+      tree funcs_decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+				    get_identifier (".omp_func_table"),
+				    funcs_decl_type);
+      tree vars_decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+				   get_identifier (".omp_var_table"),
+				   vars_decl_type);
+      TREE_STATIC (funcs_decl) = TREE_STATIC (vars_decl) = 1;
+      DECL_INITIAL (funcs_decl) = ctor_f;
+      DECL_INITIAL (vars_decl) = ctor_v;
+      DECL_SECTION_NAME (funcs_decl)
+	= build_string (strlen (funcs_section_name), funcs_section_name);
+      DECL_SECTION_NAME (vars_decl)
+	= build_string (strlen (vars_section_name), vars_section_name);
+ 
+      varpool_assemble_decl (varpool_node_for_decl (vars_decl));
+      varpool_assemble_decl (varpool_node_for_decl (funcs_decl));
+   }
+  else
+    {
+      for (unsigned i = 0; i < num_funcs; i++)
+	{
+	  tree it = (*v_funcs)[i];
+	  targetm.record_offload_symbol (it);
+	}  
+      for (unsigned i = 0; i < num_funcs; i++)
+	{
+	  tree it = (*v_vars)[i];
+	  targetm.record_offload_symbol (it);
+	}  
+    }
+}
+
 #include "gt-omp-low.h"
Index: gcc/omp-low.h
===================================================================
--- gcc/omp-low.h	(revision 208706)
+++ gcc/omp-low.h	(working copy)
@@ -27,5 +27,6 @@ extern void omp_expand_local (basic_bloc
 extern void free_omp_regions (void);
 extern tree omp_reduction_init (tree, tree);
 extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
+extern void omp_finish_file (void);
 
 #endif /* GCC_OMP_LOW_H */
Index: gcc/target.def
===================================================================
--- gcc/target.def	(revision 208706)
+++ gcc/target.def	(working copy)
@@ -1772,6 +1772,14 @@ HOOK_VECTOR_END (vectorize)
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_"
 
+DEFHOOK
+(record_offload_symbol,
+ "Used when offloaded functions are seen in the compilation unit and no named\n\
+sections are available.  It is called once for each symbol that must be\n\
+recorded in the offload function and variable table.",
+ void, (tree),
+ hook_void_tree)
+
 /* Allow target specific overriding of option settings after options have
   been changed by an attribute or pragma or when it is reset at the
   end of the code affected by an attribute or pragma.  */
Index: gcc/toplev.c
===================================================================
--- gcc/toplev.c	(revision 208706)
+++ gcc/toplev.c	(working copy)
@@ -79,6 +79,7 @@ along with GCC; see the file COPYING3.
 #include "context.h"
 #include "pass_manager.h"
 #include "optabs.h"
+#include "omp-low.h"
 
 #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO)
 #include "dbxout.h"
@@ -577,6 +578,8 @@ compile_file (void)
       if (flag_sanitize & SANITIZE_THREAD)
 	tsan_finish_file ();
 
+      omp_finish_file ();
+
       output_shared_constant_pool ();
       output_object_blocks ();
       finish_tm_clone_pairs ();
Index: libgcc/crtstuff.c
===================================================================
--- libgcc/crtstuff.c	(revision 208706)
+++ libgcc/crtstuff.c	(working copy)
@@ -311,6 +311,15 @@ register_tm_clones (void)
 }
 #endif /* USE_TM_CLONE_REGISTRY */
 
+#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
+void *_omp_func_table[0]
+  __attribute__ ((__used__, visibility ("protected"),
+		  section (".offload_func_table_section"))) = { };
+void *_omp_var_table[0]
+  __attribute__ ((__used__, visibility ("protected"),
+		  section (".offload_var_table_section"))) = { };
+#endif
+
 #if defined(INIT_SECTION_ASM_OP) || defined(INIT_ARRAY_SECTION_ASM_OP)
 
 #ifdef OBJECT_FORMAT_ELF
@@ -752,6 +761,23 @@ __do_global_ctors (void)
 #error "What are you doing with crtstuff.c, then?"
 #endif
 
+#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
+void *_omp_funcs_end[0]
+  __attribute__ ((__used__, visibility ("protected"),
+		  section (".offload_func_table_section"))) = { };
+void *_omp_vars_end[0]
+  __attribute__ ((__used__, visibility ("protected"),
+		  section (".offload_var_table_section"))) = { };
+extern void *_omp_func_table[];
+extern void *_omp_var_table[];
+void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("protected"))) =
+{
+  &_omp_func_table, &_omp_funcs_end,
+  &_omp_var_table, &_omp_vars_end
+};
+#endif
+
+
 #else /* ! CRT_BEGIN && ! CRT_END */
 #error "One of CRT_BEGIN or CRT_END must be defined."
 #endif

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-20 17:01 [gomp4] Add tables generation Bernd Schmidt
@ 2014-03-20 19:24 ` Jakub Jelinek
  2014-03-21 15:21   ` Bernd Schmidt
  2014-03-27 13:44 ` Ilya Verbin
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 48+ messages in thread
From: Jakub Jelinek @ 2014-03-20 19:24 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches, Ilya Verbin, Michael Zolotukhin

On Thu, Mar 20, 2014 at 05:50:13PM +0100, Bernd Schmidt wrote:
> --- libgcc/crtstuff.c	(revision 208706)
> +++ libgcc/crtstuff.c	(working copy)
> @@ -311,6 +311,15 @@ register_tm_clones (void)
>  }
>  #endif /* USE_TM_CLONE_REGISTRY */
>  
> +#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
> +void *_omp_func_table[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +		  section (".offload_func_table_section"))) = { };
> +void *_omp_var_table[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +		  section (".offload_var_table_section"))) = { };
> +#endif
> +
>  #if defined(INIT_SECTION_ASM_OP) || defined(INIT_ARRAY_SECTION_ASM_OP)
>  
>  #ifdef OBJECT_FORMAT_ELF
> @@ -752,6 +761,23 @@ __do_global_ctors (void)
>  #error "What are you doing with crtstuff.c, then?"
>  #endif
>  
> +#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
> +void *_omp_funcs_end[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +		  section (".offload_func_table_section"))) = { };
> +void *_omp_vars_end[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +		  section (".offload_var_table_section"))) = { };
> +extern void *_omp_func_table[];
> +extern void *_omp_var_table[];
> +void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("protected"))) =
> +{
> +  &_omp_func_table, &_omp_funcs_end,
> +  &_omp_var_table, &_omp_vars_end
> +};
> +#endif
> +
> +
>  #else /* ! CRT_BEGIN && ! CRT_END */
>  #error "One of CRT_BEGIN or CRT_END must be defined."
>  #endif

I don't like these libgcc changes at all.

First of all, we should find a way which has no runtime costs for at least
programs not compiled with -fopenmp/-fopenacc at all, preferrably no runtime
cost for any program or shared library that actually doesn't contain any
offloading code.  The above costs every single binary/shared library 5
exported symbols (with the worst ever visibility, protected should basically
never be used, it is even more costly than normal symbol visibility, why it
isn't just hidden?) and 4 * sizeof (void *) bytes in data section and 4
runtime relocations (with the protected visibility costly ones).

When we were discussing the design last year, my strong preference was that
either this lives in some other crt object that mkoffload/linker plugin adds
to link, or that it would be completely mkoffload synthetized.

Also, I'd prefer if __OPENMP_TARGET__ header was as compact as possible for the
case when there is nothing to offload (ideally, if __OPENMP_TARGET__ symbol
is never referenced, not create it at all, if it is referenced, but there is
nothing to offload, say just a single 0 byte, otherwise say an uleb128
number how many different kinds of offload data there are and then for each
one some identification which offload it is for, the tables, where to find
it.

	Jakub

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-20 19:24 ` Jakub Jelinek
@ 2014-03-21 15:21   ` Bernd Schmidt
  2014-03-21 15:28     ` Jakub Jelinek
  0 siblings, 1 reply; 48+ messages in thread
From: Bernd Schmidt @ 2014-03-21 15:21 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: GCC Patches, Ilya Verbin, Michael Zolotukhin

[-- Attachment #1: Type: text/plain, Size: 520 bytes --]

On 03/20/2014 07:56 PM, Jakub Jelinek wrote:
> When we were discussing the design last year, my strong preference was that
> either this lives in some other crt object that mkoffload/linker plugin adds
> to link, or that it would be completely mkoffload synthetized.

mkoffload is only concerned with generating target images. These 
fragments are for the host tables.

How's this? It moves everything to ompbegin.o/ompend.o and only links in 
these files if we have produced at least one target offload image.


Bernd


[-- Attachment #2: ompbeginend.diff --]
[-- Type: text/x-patch, Size: 11158 bytes --]

Index: gomp-4_0-branch/gcc/lto-wrapper.c
===================================================================
--- gomp-4_0-branch.orig/gcc/lto-wrapper.c
+++ gomp-4_0-branch/gcc/lto-wrapper.c
@@ -67,6 +67,7 @@ static unsigned int nr;
 static char **input_names;
 static char **output_names;
 static char **offload_names;
+static const char *ompbegin, *ompend;
 static char *makefile;
 
 const char tool_name[] = "lto-wrapper";
@@ -479,6 +480,61 @@ compile_images_for_openmp_targets (unsig
   free_array_of_ptrs ((void**) names, num_targets);
 }
 
+/* Copy a file from SRC to DEST.  */
+static void
+copy_file (const char *dest, const char *src)
+{
+  FILE *d = fopen (dest, "wb");
+  FILE *s = fopen (src, "rb");
+  char buffer[512];
+  while (!feof (s))
+    {
+      size_t len = fread (buffer, 1, 512, s);
+      if (ferror (s) != 0)
+	fatal ("reading input file");
+      if (len > 0)
+	{
+	  fwrite (buffer, 1, len, d);
+	  if (ferror (d) != 0)
+	    fatal ("writing output file");
+	}
+    }
+}
+
+/* Find the omp_begin.o and omp_end.o files in LIBRARY_PATH, make copies
+   and store the names of the copies in ompbegin and ompend.  */
+
+static void
+find_ompbeginend (void)
+{
+  char **paths;
+  const char *library_path = getenv ("LIBRARY_PATH");
+  if (library_path == NULL)
+    return;
+  int n_paths = parse_env_var (library_path, &paths, "/ompbegin.o");
+
+  for (int i = 0; i < n_paths; i++)
+    if (access_check (paths[i], R_OK) == 0)
+      {
+	size_t len = strlen (paths[i]);
+	char *tmp = xstrdup (paths[i]);
+	strcpy (paths[i] + len - 7, "end.o");
+	if (access_check (paths[i], R_OK) != 0)
+	  fatal ("installation error, can't find ompend.o");
+	/* The linker will delete the filenames we give it, so make
+	   copies.  */
+	const char *omptmp1 = make_temp_file (".o");
+	const char *omptmp2 = make_temp_file (".o");
+	copy_file (omptmp1, tmp);
+	ompbegin = omptmp1;
+	copy_file (omptmp2, paths[i]);
+	ompend = oindmptmp2;
+	free (tmp);
+	break;
+      }
+
+  free_array_of_ptrs ((void**) paths, n_paths);
+}
 
 /* Execute gcc. ARGC is the number of arguments. ARGV contains the arguments. */
 
@@ -964,6 +1020,7 @@ cont:
 	  compile_images_for_openmp_targets (argc, argv);
 	  if (offload_names)
 	    {
+	      find_ompbeginend ();
 	      for (i = 0; offload_names[i]; i++)
 		{
 		  fputs (offload_names[i], stdout);
@@ -972,12 +1029,23 @@ cont:
 	      free_array_of_ptrs ((void **)offload_names, i);
 	    }
 	}
+      if (ompbegin)
+	{
+	  fputs (ompbegin, stdout);
+	  putc ('\n', stdout);
+	}
+
       for (i = 0; i < nr; ++i)
 	{
 	  fputs (output_names[i], stdout);
 	  putc ('\n', stdout);
 	  free (input_names[i]);
 	}
+      if (ompend)
+	{
+	  fputs (ompend, stdout);
+	  putc ('\n', stdout);
+	}
       nr = 0;
       free (output_names);
       free (input_names);
Index: gomp-4_0-branch/libgcc/configure
===================================================================
--- gomp-4_0-branch.orig/libgcc/configure
+++ gomp-4_0-branch/libgcc/configure
@@ -566,6 +566,7 @@ sfp_machine_header
 set_use_emutls
 set_have_cc_tls
 vis_hide
+enable_accelerator
 fixed_point
 enable_decimal_float
 decimal_float
@@ -664,6 +665,8 @@ with_build_libsubdir
 enable_decimal_float
 with_system_libunwind
 enable_sjlj_exceptions
+enable_accelerator
+enable_offload_targets
 enable_tls
 '
       ac_precious_vars='build_alias
@@ -1301,6 +1304,9 @@ Optional Features:
 			to use
   --enable-sjlj-exceptions
                           force use of builtin_setjmp for exceptions
+  --enable-accelerator    build accelerator [ARG={no,device-triplet}]
+  --enable-offload-targets=LIST
+                          enable offloading to devices from LIST
   --enable-tls            Use thread-local storage [default=yes]
 
 Optional Packages:
@@ -4357,6 +4363,43 @@ esac
 # Collect host-machine-specific information.
 . ${srcdir}/config.host
 
+offload_targets=
+# Check whether --enable-accelerator was given.
+if test "${enable_accelerator+set}" = set; then :
+  enableval=$enable_accelerator;
+  case $enable_accelerator in
+  no) ;;
+  *)
+    offload_targets=$enable_accelerator
+    ;;
+  esac
+
+fi
+
+
+
+# Check whether --enable-offload-targets was given.
+if test "${enable_offload_targets+set}" = set; then :
+  enableval=$enable_offload_targets;
+  if test x$enable_offload_targets = x; then
+    as_fn_error "no offload targets specified" "$LINENO" 5
+  else
+    if test x$offload_targets = x; then
+      offload_targets=$enable_offload_targets
+    else
+      offload_targets=$offload_targets,$enable_offload_targets
+    fi
+  fi
+
+else
+  enable_accelerator=no
+fi
+
+
+if test x$offload_targets != x; then
+  extra_parts="${extra_parts} ompbegin.o ompend.o"
+fi
+
 # Check if Solaris/x86 linker supports ZERO terminator unwind entries.
 # This is after config.host so we can augment tmake_file.
 # Link with -nostartfiles -nodefaultlibs since neither are present while
Index: gomp-4_0-branch/libgcc/configure.ac
===================================================================
--- gomp-4_0-branch.orig/libgcc/configure.ac
+++ gomp-4_0-branch/libgcc/configure.ac
@@ -307,6 +307,38 @@ esac
 # Collect host-machine-specific information.
 . ${srcdir}/config.host
 
+offload_targets=
+AC_ARG_ENABLE(accelerator,
+[AS_HELP_STRING([--enable-accelerator], [build accelerator @<:@ARG={no,device-triplet}@:>@])],
+[
+  case $enable_accelerator in
+  no) ;;
+  *)
+    offload_targets=$enable_accelerator
+    ;;
+  esac
+], [])
+AC_SUBST(enable_accelerator)
+
+AC_ARG_ENABLE(offload-targets,
+[AS_HELP_STRING([--enable-offload-targets=LIST],
+ [enable offloading to devices from LIST])],
+[
+  if test x$enable_offload_targets = x; then
+    AC_MSG_ERROR([no offload targets specified])
+  else
+    if test x$offload_targets = x; then
+      offload_targets=$enable_offload_targets
+    else
+      offload_targets=$offload_targets,$enable_offload_targets
+    fi
+  fi
+], [enable_accelerator=no])
+AC_SUBST(enable_accelerator)
+if test x$offload_targets != x; then
+  extra_parts="${extra_parts} ompbegin.o ompend.o"
+fi
+
 # Check if Solaris/x86 linker supports ZERO terminator unwind entries.
 # This is after config.host so we can augment tmake_file.
 # Link with -nostartfiles -nodefaultlibs since neither are present while
Index: gomp-4_0-branch/libgcc/crtstuff.c
===================================================================
--- gomp-4_0-branch.orig/libgcc/crtstuff.c
+++ gomp-4_0-branch/libgcc/crtstuff.c
@@ -311,15 +311,6 @@ register_tm_clones (void)
 }
 #endif /* USE_TM_CLONE_REGISTRY */
 
-#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
-void *_omp_func_table[0]
-  __attribute__ ((__used__, visibility ("protected"),
-		  section (".offload_func_table_section"))) = { };
-void *_omp_var_table[0]
-  __attribute__ ((__used__, visibility ("protected"),
-		  section (".offload_var_table_section"))) = { };
-#endif
-
 #if defined(INIT_SECTION_ASM_OP) || defined(INIT_ARRAY_SECTION_ASM_OP)
 
 #ifdef OBJECT_FORMAT_ELF
@@ -761,22 +752,6 @@ __do_global_ctors (void)
 #error "What are you doing with crtstuff.c, then?"
 #endif
 
-#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
-void *_omp_funcs_end[0]
-  __attribute__ ((__used__, visibility ("protected"),
-		  section (".offload_func_table_section"))) = { };
-void *_omp_vars_end[0]
-  __attribute__ ((__used__, visibility ("protected"),
-		  section (".offload_var_table_section"))) = { };
-extern void *_omp_func_table[];
-extern void *_omp_var_table[];
-void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("protected"))) =
-{
-  &_omp_func_table, &_omp_funcs_end,
-  &_omp_var_table, &_omp_vars_end
-};
-#endif
-
 
 #else /* ! CRT_BEGIN && ! CRT_END */
 #error "One of CRT_BEGIN or CRT_END must be defined."
Index: gomp-4_0-branch/libgcc/Makefile.in
===================================================================
--- gomp-4_0-branch.orig/libgcc/Makefile.in
+++ gomp-4_0-branch/libgcc/Makefile.in
@@ -975,6 +975,12 @@ crtbegin$(objext): $(srcdir)/crtstuff.c
 crtend$(objext): $(srcdir)/crtstuff.c
 	$(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END
 
+ompbegin$(objext): $(srcdir)/ompstuff.c
+	$(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_BEGIN
+
+ompend$(objext): $(srcdir)/ompstuff.c
+	$(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END
+
 # These are versions of crtbegin and crtend for shared libraries.
 crtbeginS$(objext): $(srcdir)/crtstuff.c
 	$(crt_compile) $(CRTSTUFF_T_CFLAGS_S) -c $< -DCRT_BEGIN -DCRTSTUFFS_O
Index: gomp-4_0-branch/libgcc/ompstuff.c
===================================================================
--- /dev/null
+++ gomp-4_0-branch/libgcc/ompstuff.c
@@ -0,0 +1,73 @@
+/* Specialized bits of code needed for the OpenMP offloading tables.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/* Target machine header files require this define. */
+#define IN_LIBGCC2
+
+/* FIXME: Including auto-host is incorrect, but until we have
+   identified the set of defines that need to go into auto-target.h,
+   this will have to do.  */
+#include "auto-host.h"
+#undef pid_t
+#undef rlim_t
+#undef ssize_t
+#undef vfork
+#include "tconfig.h"
+#include "tsystem.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "libgcc_tm.h"
+
+#ifdef CRT_BEGIN
+
+#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
+void *_omp_func_table[0]
+  __attribute__ ((__used__, visibility ("protected"),
+		  section (".offload_func_table_section"))) = { };
+void *_omp_var_table[0]
+  __attribute__ ((__used__, visibility ("protected"),
+		  section (".offload_var_table_section"))) = { };
+#endif
+
+#elif defined CRT_END
+
+#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
+void *_omp_funcs_end[0]
+  __attribute__ ((__used__, visibility ("protected"),
+		  section (".offload_func_table_section"))) = { };
+void *_omp_vars_end[0]
+  __attribute__ ((__used__, visibility ("protected"),
+		  section (".offload_var_table_section"))) = { };
+extern void *_omp_func_table[];
+extern void *_omp_var_table[];
+void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("protected"))) =
+{
+  &_omp_func_table, &_omp_funcs_end,
+  &_omp_var_table, &_omp_vars_end
+};
+#endif
+
+#else /* ! CRT_BEGIN && ! CRT_END */
+#error "One of CRT_BEGIN or CRT_END must be defined."
+#endif

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-21 15:21   ` Bernd Schmidt
@ 2014-03-21 15:28     ` Jakub Jelinek
  2014-03-21 16:16       ` Bernd Schmidt
  2014-04-04  9:33       ` Bernd Schmidt
  0 siblings, 2 replies; 48+ messages in thread
From: Jakub Jelinek @ 2014-03-21 15:28 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: GCC Patches, Ilya Verbin, Michael Zolotukhin

On Fri, Mar 21, 2014 at 04:13:45PM +0100, Bernd Schmidt wrote:
> On 03/20/2014 07:56 PM, Jakub Jelinek wrote:
> >When we were discussing the design last year, my strong preference was that
> >either this lives in some other crt object that mkoffload/linker plugin adds
> >to link, or that it would be completely mkoffload synthetized.
> 
> mkoffload is only concerned with generating target images. These
> fragments are for the host tables.
> 
> How's this? It moves everything to ompbegin.o/ompend.o and only
> links in these files if we have produced at least one target offload
> image.

I'd call the files crtompbegin.o/crtompend.o instead.
And, what is the exact reason why you are using protected visibility rather
than hidden?
Also, supposedly if you've used section names without . in them, the linker
itself would provide the symbols automatically and you wouldn't actually
need begin/end, but just one object that would reference the linker created
symbols.  Just use say __gnu_offload_whatever__ or similar section names.
As for the __OPENMP_TARGET__ header format, that can be certainly resolved
later on.

	Jakub

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-21 15:28     ` Jakub Jelinek
@ 2014-03-21 16:16       ` Bernd Schmidt
  2014-04-04  9:33       ` Bernd Schmidt
  1 sibling, 0 replies; 48+ messages in thread
From: Bernd Schmidt @ 2014-03-21 16:16 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: GCC Patches, Ilya Verbin, Michael Zolotukhin

On 03/21/2014 04:20 PM, Jakub Jelinek wrote:
> And, what is the exact reason why you are using protected visibility rather
> than hidden?
> Also, supposedly if you've used section names without . in them, the linker
> itself would provide the symbols automatically and you wouldn't actually
> need begin/end, but just one object that would reference the linker created
> symbols.  Just use say __gnu_offload_whatever__ or similar section names.

Hmm, okay. No real reason for any of these except things were set up 
like this in Michael Zolotukhin's original patch. I'll tweak it some more.


Bernd


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-20 17:01 [gomp4] Add tables generation Bernd Schmidt
  2014-03-20 19:24 ` Jakub Jelinek
@ 2014-03-27 13:44 ` Ilya Verbin
  2014-03-27 13:52   ` Bernd Schmidt
                     ` (2 more replies)
  2014-04-02  7:34 ` Thomas Schwinge
  2014-04-02  8:34 ` Thomas Schwinge
  3 siblings, 3 replies; 48+ messages in thread
From: Ilya Verbin @ 2014-03-27 13:44 UTC (permalink / raw)
  To: Bernd Schmidt, Jakub Jelinek; +Cc: GCC Patches, Michael Zolotukhin

>+#ifdef ACCEL_COMPILER
>+  /* Decls are placed in reversed order in fat-objects, so we need to
>+     revert them back if we compile target.  */
>...

Actually this change is incorrect.  If host binary is built with -flto, then
both host gcc and target gcc read decls from lto and target_lto sections in the
same order, and resulting tables are identical.
So, in this case there is no need to change the order.

But what if one wants to link non-lto host object files with a target image,
produced from target_lto sections?
In this case the order of host table, produced during ordinary compilation will
differ from the order of target table, produced during lto compilation.

Jakub, what do you think?


Here is a simple example with 4 functions and 4 global variables:

#define N 100

#pragma omp declare target
int arr1[N];
int arr2[N];
int arr3[N];
int arr4[N];
#pragma omp end declare target

void foo ()
{
  #pragma omp target
  for (int i = 0; i < N; i++)
    arr1[i] = 41 + i;

  #pragma omp target
  for (int i = 0; i < N; i++)
    arr2[i] = 42 + i;

  #pragma omp target
  for (int i = 0; i < N; i++)
    arr3[i] = 43 + i;

  #pragma omp target
  for (int i = 0; i < N; i++)
    arr4[i] = 44 + i;
}


I print DECL_NAME ((*v_funcs)[i]) and DECL_NAME ((*v_vars)[i]) in
omp_finish_file:

Host compilation:
$ gcc -std=c99 -fopenmp -flto -c test.c -o test.o

host func 0: foo._omp_fn.0
host func 1: foo._omp_fn.1
host func 2: foo._omp_fn.2
host func 3: foo._omp_fn.3
host var 0:  arr4
host var 1:  arr3
host var 2:  arr2
host var 3:  arr1

Host lto and target lto:
$ gcc -std=c99 -fopenmp -flto test.o -o test

host func 0: foo._omp_fn.3
host func 1: foo._omp_fn.2
host func 2: foo._omp_fn.1
host func 3: foo._omp_fn.0
host var 0:  arr4
host var 1:  arr3
host var 2:  arr2
host var 3:  arr1

target func 0: foo._omp_fn.3
target func 1: foo._omp_fn.2
target func 2: foo._omp_fn.1
target func 3: foo._omp_fn.0
target var 0:  arr4
target var 1:  arr3
target var 2:  arr2
target var 3:  arr1

The func tables produced during ordinary compilation and lto are different.

  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-27 13:44 ` Ilya Verbin
@ 2014-03-27 13:52   ` Bernd Schmidt
  2014-03-27 13:59   ` Bernd Schmidt
  2014-03-27 15:12   ` Jakub Jelinek
  2 siblings, 0 replies; 48+ messages in thread
From: Bernd Schmidt @ 2014-03-27 13:52 UTC (permalink / raw)
  To: Ilya Verbin, Jakub Jelinek; +Cc: GCC Patches, Michael Zolotukhin

On 03/27/2014 02:31 PM, Ilya Verbin wrote:
>> +#ifdef ACCEL_COMPILER
>> +  /* Decls are placed in reversed order in fat-objects, so we need to
>> +     revert them back if we compile target.  */
>> ...
>
> Actually this change is incorrect.  If host binary is built with -flto, then
> both host gcc and target gcc read decls from lto and target_lto sections in the
> same order, and resulting tables are identical.
> So, in this case there is no need to change the order.
>
> But what if one wants to link non-lto host object files with a target image,
> produced from target_lto sections?
> In this case the order of host table, produced during ordinary compilation will
> differ from the order of target table, produced during lto compilation.

I haven't looked into the ordering issue here (the reversing of the 
order is from Michael's original patch), because I still think the whole 
scheme can't work and I was intending to produce a testcase to 
demonstrate that. Looks like you saved me some time here :)

My suggestion would be to augment the tables with the unique-name scheme 
I posted previously. I think the objections against it were a little 
exaggerated, and it would ensure reliability.


Bernd

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-27 13:44 ` Ilya Verbin
  2014-03-27 13:52   ` Bernd Schmidt
@ 2014-03-27 13:59   ` Bernd Schmidt
  2014-03-27 15:12   ` Jakub Jelinek
  2 siblings, 0 replies; 48+ messages in thread
From: Bernd Schmidt @ 2014-03-27 13:59 UTC (permalink / raw)
  To: Ilya Verbin, Jakub Jelinek; +Cc: GCC Patches, Michael Zolotukhin

On 03/27/2014 02:31 PM, Ilya Verbin wrote:
>> +#ifdef ACCEL_COMPILER
>> +  /* Decls are placed in reversed order in fat-objects, so we need to
>> +     revert them back if we compile target.  */
>> ...
>
> Actually this change is incorrect.  If host binary is built with -flto, then
> both host gcc and target gcc read decls from lto and target_lto sections in the
> same order, and resulting tables are identical.
> So, in this case there is no need to change the order.
>
> But what if one wants to link non-lto host object files with a target image,
> produced from target_lto sections?
> In this case the order of host table, produced during ordinary compilation will
> differ from the order of target table, produced during lto compilation.

I haven't looked into the ordering issue here (the reversing of the 
order is from Michael's original patch), because I still think the whole 
scheme can't work and I was intending to produce a testcase to 
demonstrate that. Looks like you saved me some time here :)

My suggestion would be to augment the tables with the unique-name scheme 
I posted previously. I think the objections against it were a little 
exaggerated, and it would ensure reliability.


Bernd

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-27 13:44 ` Ilya Verbin
  2014-03-27 13:52   ` Bernd Schmidt
  2014-03-27 13:59   ` Bernd Schmidt
@ 2014-03-27 15:12   ` Jakub Jelinek
  2014-03-27 16:17     ` Ilya Verbin
  2 siblings, 1 reply; 48+ messages in thread
From: Jakub Jelinek @ 2014-03-27 15:12 UTC (permalink / raw)
  To: Ilya Verbin; +Cc: Bernd Schmidt, GCC Patches, Michael Zolotukhin

On Thu, Mar 27, 2014 at 05:31:29PM +0400, Ilya Verbin wrote:
> >+#ifdef ACCEL_COMPILER
> >+  /* Decls are placed in reversed order in fat-objects, so we need to
> >+     revert them back if we compile target.  */
> >...
> 
> Actually this change is incorrect.  If host binary is built with -flto, then
> both host gcc and target gcc read decls from lto and target_lto sections in the
> same order, and resulting tables are identical.
> So, in this case there is no need to change the order.
> 
> But what if one wants to link non-lto host object files with a target image,
> produced from target_lto sections?
> In this case the order of host table, produced during ordinary compilation will
> differ from the order of target table, produced during lto compilation.
> 
> Jakub, what do you think?

The tables need to be created before IPA, that way it really shouldn't
matter in what order you emit them.  E.g. the outlined target functions
could be added to the table during ompexp pass which actually creates the
outlined functions, the vars need to be added before target lto or host lto
is streamed.

	Jakub

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-27 15:12   ` Jakub Jelinek
@ 2014-03-27 16:17     ` Ilya Verbin
  2014-03-27 16:23       ` Jakub Jelinek
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Verbin @ 2014-03-27 16:17 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Schmidt, GCC Patches, Michael Zolotukhin

On 27 Mar 15:02, Jakub Jelinek wrote:
> The tables need to be created before IPA, that way it really shouldn't
> matter in what order you emit them.  E.g. the outlined target functions
> could be added to the table during ompexp pass which actually creates the
> outlined functions, the vars need to be added before target lto or host lto
> is streamed.

For host tables it's ok, but when target compiler will create tables with functions?
It reads bytecode from target_lto sections, so it never executes ompexp pass.

  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-27 16:17     ` Ilya Verbin
@ 2014-03-27 16:23       ` Jakub Jelinek
  2014-03-27 18:51         ` Ilya Verbin
  2014-04-17 18:44         ` Ilya Verbin
  0 siblings, 2 replies; 48+ messages in thread
From: Jakub Jelinek @ 2014-03-27 16:23 UTC (permalink / raw)
  To: Ilya Verbin; +Cc: Bernd Schmidt, GCC Patches, Michael Zolotukhin

On Thu, Mar 27, 2014 at 08:13:00PM +0400, Ilya Verbin wrote:
> On 27 Mar 15:02, Jakub Jelinek wrote:
> > The tables need to be created before IPA, that way it really shouldn't
> > matter in what order you emit them.  E.g. the outlined target functions
> > could be added to the table during ompexp pass which actually creates the
> > outlined functions, the vars need to be added before target lto or host lto
> > is streamed.
> 
> For host tables it's ok, but when target compiler will create tables with functions?
> It reads bytecode from target_lto sections, so it never executes ompexp pass.

Which is why the table created for host by the ompexp pass should be
streamed into the target_lto sections (marked specially somehow, special
attribute or whatever), and then corresponding target table created from
that, rather then created from some possibly different ordering there.

	Jakub

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-27 16:23       ` Jakub Jelinek
@ 2014-03-27 18:51         ` Ilya Verbin
  2014-04-17 18:44         ` Ilya Verbin
  1 sibling, 0 replies; 48+ messages in thread
From: Ilya Verbin @ 2014-03-27 18:51 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Schmidt, GCC Patches, Michael Zolotukhin

On 27 Mar 17:16, Jakub Jelinek wrote:
> Which is why the table created for host by the ompexp pass should be
> streamed into the target_lto sections (marked specially somehow, special
> attribute or whatever), and then corresponding target table created from
> that, rather then created from some possibly different ordering there.

Ok, this should work.  I'll rewrite tables generation.

  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-20 17:01 [gomp4] Add tables generation Bernd Schmidt
  2014-03-20 19:24 ` Jakub Jelinek
  2014-03-27 13:44 ` Ilya Verbin
@ 2014-04-02  7:34 ` Thomas Schwinge
  2014-04-02  8:36   ` Thomas Schwinge
  2014-04-02  8:34 ` Thomas Schwinge
  3 siblings, 1 reply; 48+ messages in thread
From: Thomas Schwinge @ 2014-04-02  7:34 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Ilya Verbin, Michael Zolotukhin, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 5233 bytes --]

Hi!

On Thu, 20 Mar 2014 17:50:13 +0100, Bernd Schmidt <bernds@codesourcery.com> wrote:
> This is based on Michael Zolotukhin's patch 2/3 from a while ago. It 
> adds functionality to build function/variable tables that will allow 
> libgomp to look up offload target code based on the address of the 
> corresponding host function. There are two alternatives, one based on 
> named sections, and one based on a target hook when named sections are 
> unavailable (as on ptx).
> 
> Committed on gomp-4_0-branch.

I see regressions in the libgomp testsuite for configurations where
offloading is not enabled:

    spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ [...]/source/libgomp/testsuite/libgomp.c/for-3.c -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -I[...]/build/x86_64-unknown-linux-gnu/./libgomp -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 -fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o ./for-3.exe
    /tmp/ccGnT0ei.o: In function `main':
    for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__'
    collect2: error: ld returned 1 exit status

I suppose that's because even if...

> --- gcc/configure.ac	(revision 208715)
> +++ gcc/configure.ac	(working copy)
> @@ -887,6 +887,10 @@ AC_SUBST(enable_accelerator)
>  offload_targets=`echo $offload_targets | sed -e 's#,#:#'`
>  AC_DEFINE_UNQUOTED(OFFLOAD_TARGETS, "$offload_targets",
>   [Define to hold the list of target names suitable for offloading.])
> +if test x$offload_targets != x; then
> +  AC_DEFINE(ENABLE_OFFLOADING, 1,
> +    [Define this to enable support for offloading.])
> +fi

... offloading is not enabled, this...

> --- gcc/omp-low.c	(revision 208706)
> +++ gcc/omp-low.c	(working copy)
> @@ -8671,19 +8672,22 @@ expand_omp_target (struct omp_region *re
>      }
>  
>    gimple g;
> -  /* FIXME: This will be address of
> -     extern char __OPENMP_TARGET__[] __attribute__((visibility ("hidden")))
> -     symbol, as soon as the linker plugin is able to create it for us.  */
> -  tree openmp_target = build_zero_cst (ptr_type_node);
> +  tree openmp_target
> +    = build_decl (UNKNOWN_LOCATION, VAR_DECL,
> +		  get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
> +  TREE_PUBLIC (openmp_target) = 1;
> +  DECL_EXTERNAL (openmp_target) = 1;
>    if (kind == GF_OMP_TARGET_KIND_REGION)
>      {
>        tree fnaddr = build_fold_addr_expr (child_fn);
> -      g = gimple_build_call (builtin_decl_explicit (start_ix), 7,
> -			     device, fnaddr, openmp_target, t1, t2, t3, t4);
> +      g = gimple_build_call (builtin_decl_explicit (start_ix), 7, device,
> +			     fnaddr, build_fold_addr_expr (openmp_target),
> +			     t1, t2, t3, t4);
>      }
>    else
> -    g = gimple_build_call (builtin_decl_explicit (start_ix), 6,
> -			   device, openmp_target, t1, t2, t3, t4);
> +    g = gimple_build_call (builtin_decl_explicit (start_ix), 6, device,
> +			   build_fold_addr_expr (openmp_target),
> +			   t1, t2, t3, t4);

... will now cause a reference to __OPENMP_TARGET__, but...

> --- libgcc/crtstuff.c	(revision 208706)
> +++ libgcc/crtstuff.c	(working copy)
> @@ -311,6 +311,15 @@ register_tm_clones (void)
>  }
>  #endif /* USE_TM_CLONE_REGISTRY */
>  
> +#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
> +void *_omp_func_table[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +		  section (".offload_func_table_section"))) = { };
> +void *_omp_var_table[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +		  section (".offload_var_table_section"))) = { };
> +#endif
> +
>  #if defined(INIT_SECTION_ASM_OP) || defined(INIT_ARRAY_SECTION_ASM_OP)
>  
>  #ifdef OBJECT_FORMAT_ELF
> @@ -752,6 +761,23 @@ __do_global_ctors (void)
>  #error "What are you doing with crtstuff.c, then?"
>  #endif
>  
> +#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
> +void *_omp_funcs_end[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +		  section (".offload_func_table_section"))) = { };
> +void *_omp_vars_end[0]
> +  __attribute__ ((__used__, visibility ("protected"),
> +		  section (".offload_var_table_section"))) = { };
> +extern void *_omp_func_table[];
> +extern void *_omp_var_table[];
> +void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("protected"))) =
> +{
> +  &_omp_func_table, &_omp_funcs_end,
> +  &_omp_var_table, &_omp_vars_end
> +};
> +#endif

... __OPENMP_TARGET__ is not being defined here for the
!ENABLE_OFFLOADING case.  In
<http://news.gmane.org/find-root.php?message_id=%3C20130905082455.GH23437%40tucnak.redhat.com%3E>,
Jakub had suggested this to be a weak symbol, so we'd get NULL in this
case, which would be what's needed here, I think?


Also, I'd suggest to rename __OPENMP_TARGET__ (and similar ones) to
__GNU_OFFLOAD__ (or similar).  As we're using this offloading stuff for
both OpenACC and OpenMP target, it makes sense to me to use a generic
name; we still have the chance to do so now while this stuff is not yet
in trunk.


Grüße,
 Thomas

[-- Attachment #2: Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-20 17:01 [gomp4] Add tables generation Bernd Schmidt
                   ` (2 preceding siblings ...)
  2014-04-02  7:34 ` Thomas Schwinge
@ 2014-04-02  8:34 ` Thomas Schwinge
  3 siblings, 0 replies; 48+ messages in thread
From: Thomas Schwinge @ 2014-04-02  8:34 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Ilya Verbin, Michael Zolotukhin, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 6441 bytes --]

Hi!

On Thu, 20 Mar 2014 17:50:13 +0100, Bernd Schmidt <bernds@codesourcery.com> wrote:
> This is based on Michael Zolotukhin's patch 2/3 from a while ago. It 
> adds functionality to build function/variable tables that will allow 
> libgomp to look up offload target code based on the address of the 
> corresponding host function. There are two alternatives, one based on 
> named sections, and one based on a target hook when named sections are 
> unavailable (as on ptx).
> 
> Committed on gomp-4_0-branch.

> --- gcc/omp-low.c	(revision 208706)
> +++ gcc/omp-low.c	(working copy)
> @@ -8671,19 +8672,22 @@ expand_omp_target (struct omp_region *re
>      }
>  
>    gimple g;
> -  /* FIXME: This will be address of
> -     extern char __OPENMP_TARGET__[] __attribute__((visibility ("hidden")))
> -     symbol, as soon as the linker plugin is able to create it for us.  */
> -  tree openmp_target = build_zero_cst (ptr_type_node);
> +  tree openmp_target
> +    = build_decl (UNKNOWN_LOCATION, VAR_DECL,
> +		  get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
> +  TREE_PUBLIC (openmp_target) = 1;
> +  DECL_EXTERNAL (openmp_target) = 1;
>    if (kind == GF_OMP_TARGET_KIND_REGION)
>      {
>        tree fnaddr = build_fold_addr_expr (child_fn);
> -      g = gimple_build_call (builtin_decl_explicit (start_ix), 7,
> -			     device, fnaddr, openmp_target, t1, t2, t3, t4);
> +      g = gimple_build_call (builtin_decl_explicit (start_ix), 7, device,
> +			     fnaddr, build_fold_addr_expr (openmp_target),
> +			     t1, t2, t3, t4);
>      }
>    else
> -    g = gimple_build_call (builtin_decl_explicit (start_ix), 6,
> -			   device, openmp_target, t1, t2, t3, t4);
> +    g = gimple_build_call (builtin_decl_explicit (start_ix), 6, device,
> +			   build_fold_addr_expr (openmp_target),
> +			   t1, t2, t3, t4);

Committed in r209013:

commit 1f54e08135bd8be59438977b4edbc102e7cef2d7
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Wed Apr 2 08:28:54 2014 +0000

    Handle __OPENMP_TARGET__ symbol for OpenACC offloading functions, too.
    
    	gcc/
    	* omp-low.c (expand_oacc_offload): Handle __OPENMP_TARGET__
    	symbol.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@209013 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |  5 +++++
 gcc/omp-low.c      | 14 ++++++++------
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 1d35b58..8983632 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,8 @@
+2014-04-02  Thomas Schwinge  <thomas@codesourcery.com>
+
+	* omp-low.c (expand_oacc_offload): Handle __OPENMP_TARGET__
+	symbol.
+
 2014-03-20  Thomas Schwinge  <thomas@codesourcery.com>
 
 	* gimple.h (enum gf_mask): Add GF_OMP_FOR_KIND_OACC_LOOP.
diff --git gcc/omp-low.c gcc/omp-low.c
index a7b93bc..01eda9d 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -5138,13 +5138,15 @@ expand_oacc_offload (struct omp_region *region)
     }
 
   gimple g;
-  /* FIXME: This will be address of
-     extern char __OPENMP_TARGET__[] __attribute__((visibility ("hidden")))
-     symbol, as soon as the linker plugin is able to create it for us.  */
-  tree openmp_target = build_zero_cst (ptr_type_node);
+  tree openmp_target
+    = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+		  get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
+  TREE_PUBLIC (openmp_target) = 1;
+  DECL_EXTERNAL (openmp_target) = 1;
   tree fnaddr = build_fold_addr_expr (child_fn);
-  g = gimple_build_call (builtin_decl_explicit (start_ix),
-			 10, device, fnaddr, openmp_target, t1, t2, t3, t4,
+  g = gimple_build_call (builtin_decl_explicit (start_ix), 10, device,
+			 fnaddr, build_fold_addr_expr (openmp_target),
+			 t1, t2, t3, t4,
 			 t_num_gangs, t_num_workers, t_vector_length);
   gimple_set_location (g, gimple_location (entry_stmt));
   gsi_insert_before (&gsi, g, GSI_SAME_STMT);


> +/* Create new symbol containing (address, size) pairs for omp-marked
> +   functions and global variables.  */
> +void
> +omp_finish_file (void)
> +{
> +  struct cgraph_node *node;
> +  struct varpool_node *vnode;
> +  const char *funcs_section_name = ".offload_func_table_section";
> +  const char *vars_section_name = ".offload_var_table_section";
> +  vec<tree, va_gc> *v_funcs, *v_vars;
> +
> +  vec_alloc (v_vars, 0);
> +  vec_alloc (v_funcs, 0);
> +
> +  [...]
> +  unsigned num_vars = vec_safe_length (v_vars);
> +  unsigned num_funcs = vec_safe_length (v_funcs);
> +  [...]
> +  if (targetm_common.have_named_sections)
> +    {
> +      [...]
> +   }
> +  else
> +    {
> +      for (unsigned i = 0; i < num_funcs; i++)
> +	{
> +	  tree it = (*v_funcs)[i];
> +	  targetm.record_offload_symbol (it);
> +	}  
> +      for (unsigned i = 0; i < num_funcs; i++)
> +	{
> +	  tree it = (*v_vars)[i];
> +	  targetm.record_offload_symbol (it);
> +	}  
> +    }
> +}

Committed in r209014:

commit abae7b762c0b9787dd21e863561af44472096eb3
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Wed Apr 2 08:29:07 2014 +0000

    Fix typo/copy'n'pasto.
    
    	gcc/
    	* omp-low.c (omp_finish_file): Use num_vars instead of num_funcs
    	when recording offload symbols v_vars.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@209014 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp | 3 +++
 gcc/omp-low.c      | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 8983632..64e0c35 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,8 @@
 2014-04-02  Thomas Schwinge  <thomas@codesourcery.com>
 
+	* omp-low.c (omp_finish_file): Use num_vars instead of num_funcs
+	when recording offload symbols v_vars.
+
 	* omp-low.c (expand_oacc_offload): Handle __OPENMP_TARGET__
 	symbol.
 
diff --git gcc/omp-low.c gcc/omp-low.c
index 01eda9d..6c803a8 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -12934,7 +12934,7 @@ omp_finish_file (void)
 	  tree it = (*v_funcs)[i];
 	  targetm.record_offload_symbol (it);
 	}  
-      for (unsigned i = 0; i < num_funcs; i++)
+      for (unsigned i = 0; i < num_vars; i++)
 	{
 	  tree it = (*v_vars)[i];
 	  targetm.record_offload_symbol (it);


Grüße,
 Thomas

[-- Attachment #2: Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-02  7:34 ` Thomas Schwinge
@ 2014-04-02  8:36   ` Thomas Schwinge
  2014-04-03 16:15     ` Bernd Schmidt
  0 siblings, 1 reply; 48+ messages in thread
From: Thomas Schwinge @ 2014-04-02  8:36 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Ilya Verbin, Michael Zolotukhin, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 3258 bytes --]

Hi!

On Wed, 02 Apr 2014 09:34:29 +0200, I wrote:
> On Thu, 20 Mar 2014 17:50:13 +0100, Bernd Schmidt <bernds@codesourcery.com> wrote:
> > This is based on Michael Zolotukhin's patch 2/3 from a while ago. It 
> > adds functionality to build function/variable tables that will allow 
> > libgomp to look up offload target code based on the address of the 
> > corresponding host function. There are two alternatives, one based on 
> > named sections, and one based on a target hook when named sections are 
> > unavailable (as on ptx).
> > 
> > Committed on gomp-4_0-branch.
> 
> I see regressions in the libgomp testsuite for configurations where
> offloading is not enabled:
> 
>     spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ [...]/source/libgomp/testsuite/libgomp.c/for-3.c -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -I[...]/build/x86_64-unknown-linux-gnu/./libgomp -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 -fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o ./for-3.exe
>     /tmp/ccGnT0ei.o: In function `main':
>     for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__'
>     collect2: error: ld returned 1 exit status
> 
> I suppose that's because [...]

Workaround committed in r209015:

commit 6a015f81a5fafe32cf45656e3de121f4088dbf41
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Wed Apr 2 08:29:17 2014 +0000

    Work around __OPENMP_TARGET__ not being defined for !ENABLE_OFFLOADING.
    
    	libgcc/
    	* crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to
    	NULL.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@209015 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgcc/ChangeLog.gomp | 10 ++++++++++
 libgcc/crtstuff.c     |  2 ++
 2 files changed, 12 insertions(+)

diff --git libgcc/ChangeLog.gomp libgcc/ChangeLog.gomp
new file mode 100644
index 0000000..7d08efa
--- /dev/null
+++ libgcc/ChangeLog.gomp
@@ -0,0 +1,10 @@
+2014-04-02  Thomas Schwinge  <thomas@codesourcery.com>
+
+	* crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to
+	NULL.
+\f
+Copyright (C) 2014 Free Software Foundation, Inc.
+
+Copying and distribution of this file, with or without modification,
+are permitted in any medium without royalty provided the copyright
+notice and this notice are preserved.
diff --git libgcc/crtstuff.c libgcc/crtstuff.c
index cda0bae..79af7f0 100644
--- libgcc/crtstuff.c
+++ libgcc/crtstuff.c
@@ -775,6 +775,8 @@ void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("protected"))) =
   &_omp_func_table, &_omp_funcs_end,
   &_omp_var_table, &_omp_vars_end
 };
+#else
+void **__OPENMP_TARGET__ __attribute__ ((__visibility__ ("protected"))) = NULL;
 #endif
 


> Also, I'd suggest to rename __OPENMP_TARGET__ (and similar ones) to
> __GNU_OFFLOAD__ (or similar).  As we're using this offloading stuff for
> both OpenACC and OpenMP target, it makes sense to me to use a generic
> name; we still have the chance to do so now while this stuff is not yet
> in trunk.


Grüße,
 Thomas

[-- Attachment #2: Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-02  8:36   ` Thomas Schwinge
@ 2014-04-03 16:15     ` Bernd Schmidt
  2014-04-03 16:53       ` Ilya Verbin
  2014-04-04  5:55       ` Thomas Schwinge
  0 siblings, 2 replies; 48+ messages in thread
From: Bernd Schmidt @ 2014-04-03 16:15 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Ilya Verbin, Michael Zolotukhin, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1094 bytes --]

On 04/02/2014 10:36 AM, Thomas Schwinge wrote:
>> I see regressions in the libgomp testsuite for configurations where
>> offloading is not enabled:
>>
>>      spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ [...]/source/libgomp/testsuite/libgomp.c/for-3.c -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -I[...]/build/x86_64-unknown-linux-gnu/./libgomp -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 -fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o ./for-3.exe
>>      /tmp/ccGnT0ei.o: In function `main':
>>      for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__'
>>      collect2: error: ld returned 1 exit status
>>
>> I suppose that's because [...]
>
> Workaround committed in r209015:

>      	libgcc/
>      	* crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to
>      	NULL.

The patch below should be a better fix, making the references to 
__OPENMP_TARGET__ weak. Does this work for you?


Bernd


[-- Attachment #2: weak-omp.diff --]
[-- Type: text/x-patch, Size: 1908 bytes --]

Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c	(revision 429741)
+++ gcc/omp-low.c	(working copy)
@@ -221,6 +221,28 @@ static tree scan_omp_1_op (tree *, int *
       *handled_ops_p = false; \
       break;
 
+static GTY(()) tree offload_symbol_decl;
+
+/* Get the __OPENMP_TARGET__ symbol.  */
+static tree
+get_offload_symbol_decl (void)
+{
+  if (!offload_symbol_decl)
+    {
+      tree decl = build_decl (UNKNOWN_LOCATION, VAR_DECL,
+			      get_identifier ("__OPENMP_TARGET__"),
+			      ptr_type_node);
+      TREE_PUBLIC (decl) = 1;
+      DECL_EXTERNAL (decl) = 1;
+      DECL_WEAK (decl) = 1;
+      DECL_ATTRIBUTES (decl)
+	= tree_cons (get_identifier ("weak"),
+		     NULL_TREE, DECL_ATTRIBUTES (decl));
+      offload_symbol_decl = decl;
+    }
+  return offload_symbol_decl;
+}
+
 /* Convenience function for calling scan_omp_1_op on tree operands.  */
 
 static inline tree
@@ -5148,11 +5170,7 @@ expand_oacc_offload (struct omp_region *
     }
 
   gimple g;
-  tree openmp_target
-    = build_decl (UNKNOWN_LOCATION, VAR_DECL,
-		  get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
-  TREE_PUBLIC (openmp_target) = 1;
-  DECL_EXTERNAL (openmp_target) = 1;
+  tree openmp_target = get_offload_symbol_decl ();
   tree fnaddr = build_fold_addr_expr (child_fn);
   g = gimple_build_call (builtin_decl_explicit (start_ix), 10, device,
 			 fnaddr, build_fold_addr_expr (openmp_target),
@@ -8686,11 +8704,7 @@ expand_omp_target (struct omp_region *re
     }
 
   gimple g;
-  tree openmp_target
-    = build_decl (UNKNOWN_LOCATION, VAR_DECL,
-		  get_identifier ("__OPENMP_TARGET__"), ptr_type_node);
-  TREE_PUBLIC (openmp_target) = 1;
-  DECL_EXTERNAL (openmp_target) = 1;
+  tree openmp_target = get_offload_symbol_decl ();
   if (kind == GF_OMP_TARGET_KIND_REGION)
     {
       tree fnaddr = build_fold_addr_expr (child_fn);

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-03 16:15     ` Bernd Schmidt
@ 2014-04-03 16:53       ` Ilya Verbin
  2014-04-03 17:08         ` Bernd Schmidt
  2014-04-04  5:55       ` Thomas Schwinge
  1 sibling, 1 reply; 48+ messages in thread
From: Ilya Verbin @ 2014-04-03 16:53 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Thomas Schwinge, Michael Zolotukhin, GCC Patches

2014-04-03 20:13 GMT+04:00 Bernd Schmidt <bernds@codesourcery.com>:
> The patch below should be a better fix, making the references to > __OPENMP_TARGET__ weak. Does this work for you?

Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target,
since we decided to pass it to GOMP_offload_register?

  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-03 16:53       ` Ilya Verbin
@ 2014-04-03 17:08         ` Bernd Schmidt
  2014-04-03 17:25           ` Ilya Verbin
  0 siblings, 1 reply; 48+ messages in thread
From: Bernd Schmidt @ 2014-04-03 17:08 UTC (permalink / raw)
  To: Ilya Verbin; +Cc: Thomas Schwinge, Michael Zolotukhin, GCC Patches

On 04/03/2014 06:53 PM, Ilya Verbin wrote:
> 2014-04-03 20:13 GMT+04:00 Bernd Schmidt <bernds@codesourcery.com>:
>> The patch below should be a better fix, making the references to > __OPENMP_TARGET__ weak. Does this work for you?
>
> Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target,
> since we decided to pass it to GOMP_offload_register?

I thought it was used to look up the right function? With shared 
libraries you'd get multiple __OPENMP_TARGET__ tables.


Bernd


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-03 17:08         ` Bernd Schmidt
@ 2014-04-03 17:25           ` Ilya Verbin
  2014-04-03 17:31             ` Bernd Schmidt
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Verbin @ 2014-04-03 17:25 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Thomas Schwinge, Michael Zolotukhin, GCC Patches

2014-04-03 21:06 GMT+04:00 Bernd Schmidt <bernds@codesourcery.com>:
> On 04/03/2014 06:53 PM, Ilya Verbin wrote:
>>
>> 2014-04-03 20:13 GMT+04:00 Bernd Schmidt <bernds@codesourcery.com>:
>>>
>>> The patch below should be a better fix, making the references to >
>>> __OPENMP_TARGET__ weak. Does this work for you?
>>
>>
>> Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target,
>> since we decided to pass it to GOMP_offload_register?
>
>
> I thought it was used to look up the right function? With shared libraries
> you'd get multiple __OPENMP_TARGET__ tables.
>
>
> Bernd
>

Yes, initially the idea was to use it for look up the right function.
But now each DSO will call GOMP_offload_register, and pass unique
pointer to __OPENMP_TARGET__ (host_table) for this DSO.  Then
gomp_register_images_for_device registers all this host tables in the
plugin.  And when libgomp calls device_get_table_func, the plugin
returns the joint table for all DSO's.

  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-03 17:25           ` Ilya Verbin
@ 2014-04-03 17:31             ` Bernd Schmidt
  2014-04-03 17:39               ` Ilya Verbin
  0 siblings, 1 reply; 48+ messages in thread
From: Bernd Schmidt @ 2014-04-03 17:31 UTC (permalink / raw)
  To: Ilya Verbin; +Cc: Thomas Schwinge, Michael Zolotukhin, GCC Patches

On 04/03/2014 07:25 PM, Ilya Verbin wrote:
> Yes, initially the idea was to use it for look up the right function.
> But now each DSO will call GOMP_offload_register, and pass unique
> pointer to __OPENMP_TARGET__ (host_table) for this DSO.  Then
> gomp_register_images_for_device registers all this host tables in the
> plugin.  And when libgomp calls device_get_table_func, the plugin
> returns the joint table for all DSO's.

Why make a joint table? It seems better to use the __OPENMP_TARGET__ 
symbol to restrict lookups to the subset of symbols that could actually 
be found.
BTW, I still expect that the lookup by ordering will turn out to be 
fundamentally unreliable and we'll need to use the unique id patch I 
posted a while ago. In that case using __OPENMP_TARGET__ as a first 
order key for the lookups eliminates any problem with duplicate names 
across multiple libraries.


Bernd

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-03 17:31             ` Bernd Schmidt
@ 2014-04-03 17:39               ` Ilya Verbin
  0 siblings, 0 replies; 48+ messages in thread
From: Ilya Verbin @ 2014-04-03 17:39 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Thomas Schwinge, Michael Zolotukhin, GCC Patches

2014-04-03 21:28 GMT+04:00 Bernd Schmidt <bernds@codesourcery.com>:
> On 04/03/2014 07:25 PM, Ilya Verbin wrote:
>>
>> Yes, initially the idea was to use it for look up the right function.
>> But now each DSO will call GOMP_offload_register, and pass unique
>> pointer to __OPENMP_TARGET__ (host_table) for this DSO.  Then
>> gomp_register_images_for_device registers all this host tables in the
>> plugin.  And when libgomp calls device_get_table_func, the plugin
>> returns the joint table for all DSO's.
>
>
> Why make a joint table? It seems better to use the __OPENMP_TARGET__ symbol
> to restrict lookups to the subset of symbols that could actually be found.
> BTW, I still expect that the lookup by ordering will turn out to be
> fundamentally unreliable and we'll need to use the unique id patch I posted
> a while ago. In that case using __OPENMP_TARGET__ as a first order key for
> the lookups eliminates any problem with duplicate names across multiple
> libraries.
>
>
> Bernd
>

In current implementation each gomp_device_descr contains one
dev_splay_tree.  And all addresses are inserted into this splay tree.
There is no need to restrict lookup, because the addresses from
multiple DSO's can't overlap.

  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-03 16:15     ` Bernd Schmidt
  2014-04-03 16:53       ` Ilya Verbin
@ 2014-04-04  5:55       ` Thomas Schwinge
  2014-04-04  9:25         ` Bernd Schmidt
  1 sibling, 1 reply; 48+ messages in thread
From: Thomas Schwinge @ 2014-04-04  5:55 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Ilya Verbin, Michael Zolotukhin, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 1515 bytes --]

Hi!

On Thu, 3 Apr 2014 18:13:08 +0200, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 04/02/2014 10:36 AM, Thomas Schwinge wrote:
> >> I see regressions in the libgomp testsuite for configurations where
> >> offloading is not enabled:
> >>
> >>      spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ [...]/source/libgomp/testsuite/libgomp.c/for-3.c -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -I[...]/build/x86_64-unknown-linux-gnu/./libgomp -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 -fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o ./for-3.exe
> >>      /tmp/ccGnT0ei.o: In function `main':
> >>      for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__'
> >>      collect2: error: ld returned 1 exit status
> >>
> >> I suppose that's because [...]
> >
> > Workaround committed in r209015:
> 
> >      	libgcc/
> >      	* crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to
> >      	NULL.
> 
> The patch below should be a better fix, making the references to 
> __OPENMP_TARGET__ weak. Does this work for you?

Yes, it does, thanks!  Please revert my patch when committing yours.


Oh, and please use ChangeLog.gomp files on gomp-4_0-branch; also please
move the entries for your recent commits from the ChangeLog file(s) to
the respective ChangeLog.gomp one(s).


Grüße,
 Thomas

[-- Attachment #2: Type: application/pgp-signature, Size: 489 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-04  5:55       ` Thomas Schwinge
@ 2014-04-04  9:25         ` Bernd Schmidt
  0 siblings, 0 replies; 48+ messages in thread
From: Bernd Schmidt @ 2014-04-04  9:25 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Ilya Verbin, Michael Zolotukhin, GCC Patches

On 04/04/2014 07:55 AM, Thomas Schwinge wrote:
> Hi!
>
> On Thu, 3 Apr 2014 18:13:08 +0200, Bernd Schmidt <bernds@codesourcery.com> wrote:
>> The patch below should be a better fix, making the references to
>> __OPENMP_TARGET__ weak. Does this work for you?
>
> Yes, it does, thanks!  Please revert my patch when committing yours.
>
>
> Oh, and please use ChangeLog.gomp files on gomp-4_0-branch; also please
> move the entries for your recent commits from the ChangeLog file(s) to
> the respective ChangeLog.gomp one(s).

All done.


Bernd


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-21 15:28     ` Jakub Jelinek
  2014-03-21 16:16       ` Bernd Schmidt
@ 2014-04-04  9:33       ` Bernd Schmidt
  2014-04-05 15:05         ` Thomas Schwinge
  1 sibling, 1 reply; 48+ messages in thread
From: Bernd Schmidt @ 2014-04-04  9:33 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: GCC Patches, Ilya Verbin, Michael Zolotukhin

[-- Attachment #1: Type: text/plain, Size: 1176 bytes --]

On 03/21/2014 04:20 PM, Jakub Jelinek wrote:
> On Fri, Mar 21, 2014 at 04:13:45PM +0100, Bernd Schmidt wrote:
>> On 03/20/2014 07:56 PM, Jakub Jelinek wrote:
>>> When we were discussing the design last year, my strong preference was that
>>> either this lives in some other crt object that mkoffload/linker plugin adds
>>> to link, or that it would be completely mkoffload synthetized.
>>
>> mkoffload is only concerned with generating target images. These
>> fragments are for the host tables.
>>
>> How's this? It moves everything to ompbegin.o/ompend.o and only
>> links in these files if we have produced at least one target offload
>> image.
>
> I'd call the files crtompbegin.o/crtompend.o instead.
> And, what is the exact reason why you are using protected visibility rather
> than hidden?
> Also, supposedly if you've used section names without . in them, the linker
> itself would provide the symbols automatically and you wouldn't actually
> need begin/end, but just one object that would reference the linker created
> symbols.  Just use say __gnu_offload_whatever__ or similar section names.

I've checked in the following which should address all this.


Bernd


[-- Attachment #2: ompstuff.diff --]
[-- Type: text/x-patch, Size: 13400 bytes --]

Index: gcc/ChangeLog.gomp
===================================================================
--- gcc/ChangeLog.gomp	(revision 209074)
+++ gcc/ChangeLog.gomp	(working copy)
@@ -1,5 +1,15 @@
 2014-04-04  Bernd Schmidt  <bernds@codesourcery.com>
 
+	* lto-section-names.h (OFFLOAD_VAR_TABLE_SECTION_NAME,
+	OFFLOAD_FUNC_TABLE_SECTION_NAME): Define.
+	* lto-wrapper.c (OFFLOAD_FUNC_TABLE_SECTION_NAME): Don't define.
+	(ompend): New static variable.
+	(copy_file, find_ompend): New static functions.
+	(run_gcc): Call find_ompend if we have offload images.  Add its
+	return value to the output.
+	* omp-low.c: Include "lto-section-names.h".
+	(omp_finish_file): Initialize section names from macros defined there.
+	
 	* omp-low.c (offload_symbol_decl): New static variable.
 	(get_offload_symbol_decl): New static function.
 	(expand_oacc_offload, expand_omp_target): Use it.
Index: gcc/lto-section-names.h
===================================================================
--- gcc/lto-section-names.h	(revision 209072)
+++ gcc/lto-section-names.h	(working copy)
@@ -31,3 +31,6 @@ along with GCC; see the file COPYING3.
 /* Can be either OMP_SECTION_NAME_PREFIX when we stream pragma omp target
    stuff, or LTO_SECTION_NAME_PREFIX for lto case.  */
 extern const char  *section_name_prefix;
+
+#define OFFLOAD_VAR_TABLE_SECTION_NAME "__gnu_offload_vars"
+#define OFFLOAD_FUNC_TABLE_SECTION_NAME "__gnu_offload_funcs"
Index: gcc/lto-wrapper.c
===================================================================
--- gcc/lto-wrapper.c	(revision 209072)
+++ gcc/lto-wrapper.c	(working copy)
@@ -49,7 +49,6 @@ along with GCC; see the file COPYING3.
 #include "lto-section-names.h"
 #include "collect-utils.h"
 
-#define OFFLOAD_FUNC_TABLE_SECTION_NAME ".offload_func_table_section"
 #define OFFLOAD_TARGET_NAMES_ENV	"OFFLOAD_TARGET_NAMES"
 
 enum lto_mode_d {
@@ -67,6 +66,7 @@ static unsigned int nr;
 static char **input_names;
 static char **output_names;
 static char **offload_names;
+static const char *ompend;
 static char *makefile;
 
 const char tool_name[] = "lto-wrapper";
@@ -479,6 +479,54 @@ compile_images_for_openmp_targets (unsig
   free_array_of_ptrs ((void**) names, num_targets);
 }
 
+/* Copy a file from SRC to DEST.  */
+static void
+copy_file (const char *dest, const char *src)
+{
+  FILE *d = fopen (dest, "wb");
+  FILE *s = fopen (src, "rb");
+  char buffer[512];
+  while (!feof (s))
+    {
+      size_t len = fread (buffer, 1, 512, s);
+      if (ferror (s) != 0)
+	fatal ("reading input file");
+      if (len > 0)
+	{
+	  fwrite (buffer, 1, len, d);
+	  if (ferror (d) != 0)
+	    fatal ("writing output file");
+	}
+    }
+}
+
+/* Find the crtompend.o file in LIBRARY_PATH, make a copy and store
+   the name of the copy in ompend.  */
+
+static void
+find_ompend (void)
+{
+  char **paths;
+  const char *library_path = getenv ("LIBRARY_PATH");
+  if (library_path == NULL)
+    return;
+  int n_paths = parse_env_var (library_path, &paths, "/crtompend.o");
+
+  for (int i = 0; i < n_paths; i++)
+    if (access_check (paths[i], R_OK) == 0)
+      {
+	/* The linker will delete the filenames we give it, so make
+	   copies.  */
+	const char *omptmp = make_temp_file (".o");
+	copy_file (omptmp, paths[i]);
+	ompend = omptmp;
+	break;
+      }
+  if (ompend == 0)
+    fatal ("installation error, can't find crtompend.o");
+
+  free_array_of_ptrs ((void**) paths, n_paths);
+}
 
 /* Execute gcc. ARGC is the number of arguments. ARGV contains the arguments. */
 
@@ -964,6 +1012,7 @@ cont:
 	  compile_images_for_openmp_targets (argc, argv);
 	  if (offload_names)
 	    {
+	      find_ompend ();
 	      for (i = 0; offload_names[i]; i++)
 		{
 		  fputs (offload_names[i], stdout);
@@ -972,12 +1021,18 @@ cont:
 	      free_array_of_ptrs ((void **)offload_names, i);
 	    }
 	}
+
       for (i = 0; i < nr; ++i)
 	{
 	  fputs (output_names[i], stdout);
 	  putc ('\n', stdout);
 	  free (input_names[i]);
 	}
+      if (ompend)
+	{
+	  fputs (ompend, stdout);
+	  putc ('\n', stdout);
+	}
       nr = 0;
       free (output_names);
       free (input_names);
Index: gcc/omp-low.c
===================================================================
--- gcc/omp-low.c	(revision 209074)
+++ gcc/omp-low.c	(working copy)
@@ -72,7 +72,7 @@ along with GCC; see the file COPYING3.
 #include "ipa-prop.h"
 #include "tree-nested.h"
 #include "tree-eh.h"
-
+#include "lto-section-names.h"
 
 /* Lowering of OpenMP parallel and workshare constructs proceeds in two
    phases.  The first phase scans the function looking for OMP statements
@@ -12851,8 +12851,8 @@ omp_finish_file (void)
 {
   struct cgraph_node *node;
   struct varpool_node *vnode;
-  const char *funcs_section_name = ".offload_func_table_section";
-  const char *vars_section_name = ".offload_var_table_section";
+  const char *funcs_section_name = OFFLOAD_FUNC_TABLE_SECTION_NAME;
+  const char *vars_section_name = OFFLOAD_VAR_TABLE_SECTION_NAME;
   vec<tree, va_gc> *v_funcs, *v_vars;
 
   vec_alloc (v_vars, 0);
Index: libgcc/ChangeLog.gomp
===================================================================
--- libgcc/ChangeLog.gomp	(revision 209074)
+++ libgcc/ChangeLog.gomp	(working copy)
@@ -1,6 +1,16 @@
 2014-04-04  Bernd Schmidt  <bernds@codesourcery.com>
 
-	* crtstuff.c (__OPENMP_TARGET__): Revert previous change.
+	* Makefile.in (crtompend$(objext)): New rule.
+	* configure.ac (--enable-accelerator, --enable-offload-targets):
+	Handle options.
+	(offload_targets): Compute list.
+	(extra_parts): Add crtompend.o if offload_targets is not empty.
+	* configure: Regenerate.
+	* crtstuff.c (_omp_func_table, _omp_var_table, _omp_funcs_end)
+	_omp_vars_end, __OPENMP_TARGET__): Remove.
+	* ompstuff.c: New file.
+
+	(* crtstuff.c (__OPENMP_TARGET__): Revert previous change.
 
 2014-04-02  Thomas Schwinge  <thomas@codesourcery.com>
 
Index: libgcc/Makefile.in
===================================================================
--- libgcc/Makefile.in	(revision 209072)
+++ libgcc/Makefile.in	(working copy)
@@ -975,6 +975,9 @@ crtbegin$(objext): $(srcdir)/crtstuff.c
 crtend$(objext): $(srcdir)/crtstuff.c
 	$(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END
 
+crtompend$(objext): $(srcdir)/ompstuff.c
+	$(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END
+
 # These are versions of crtbegin and crtend for shared libraries.
 crtbeginS$(objext): $(srcdir)/crtstuff.c
 	$(crt_compile) $(CRTSTUFF_T_CFLAGS_S) -c $< -DCRT_BEGIN -DCRTSTUFFS_O
Index: libgcc/configure
===================================================================
--- libgcc/configure	(revision 209072)
+++ libgcc/configure	(working copy)
@@ -566,6 +566,7 @@ sfp_machine_header
 set_use_emutls
 set_have_cc_tls
 vis_hide
+enable_accelerator
 fixed_point
 enable_decimal_float
 decimal_float
@@ -664,6 +665,8 @@ with_build_libsubdir
 enable_decimal_float
 with_system_libunwind
 enable_sjlj_exceptions
+enable_accelerator
+enable_offload_targets
 enable_tls
 '
       ac_precious_vars='build_alias
@@ -1301,6 +1304,9 @@ Optional Features:
 			to use
   --enable-sjlj-exceptions
                           force use of builtin_setjmp for exceptions
+  --enable-accelerator    build accelerator [ARG={no,device-triplet}]
+  --enable-offload-targets=LIST
+                          enable offloading to devices from LIST
   --enable-tls            Use thread-local storage [default=yes]
 
 Optional Packages:
@@ -4357,6 +4363,43 @@ esac
 # Collect host-machine-specific information.
 . ${srcdir}/config.host
 
+offload_targets=
+# Check whether --enable-accelerator was given.
+if test "${enable_accelerator+set}" = set; then :
+  enableval=$enable_accelerator;
+  case $enable_accelerator in
+  no) ;;
+  *)
+    offload_targets=$enable_accelerator
+    ;;
+  esac
+
+fi
+
+
+
+# Check whether --enable-offload-targets was given.
+if test "${enable_offload_targets+set}" = set; then :
+  enableval=$enable_offload_targets;
+  if test x$enable_offload_targets = x; then
+    as_fn_error "no offload targets specified" "$LINENO" 5
+  else
+    if test x$offload_targets = x; then
+      offload_targets=$enable_offload_targets
+    else
+      offload_targets=$offload_targets,$enable_offload_targets
+    fi
+  fi
+
+else
+  enable_accelerator=no
+fi
+
+
+if test x$offload_targets != x; then
+  extra_parts="${extra_parts} crtompend.o"
+fi
+
 # Check if Solaris/x86 linker supports ZERO terminator unwind entries.
 # This is after config.host so we can augment tmake_file.
 # Link with -nostartfiles -nodefaultlibs since neither are present while
Index: libgcc/configure.ac
===================================================================
--- libgcc/configure.ac	(revision 209072)
+++ libgcc/configure.ac	(working copy)
@@ -307,6 +307,38 @@ esac
 # Collect host-machine-specific information.
 . ${srcdir}/config.host
 
+offload_targets=
+AC_ARG_ENABLE(accelerator,
+[AS_HELP_STRING([--enable-accelerator], [build accelerator @<:@ARG={no,device-triplet}@:>@])],
+[
+  case $enable_accelerator in
+  no) ;;
+  *)
+    offload_targets=$enable_accelerator
+    ;;
+  esac
+], [])
+AC_SUBST(enable_accelerator)
+
+AC_ARG_ENABLE(offload-targets,
+[AS_HELP_STRING([--enable-offload-targets=LIST],
+ [enable offloading to devices from LIST])],
+[
+  if test x$enable_offload_targets = x; then
+    AC_MSG_ERROR([no offload targets specified])
+  else
+    if test x$offload_targets = x; then
+      offload_targets=$enable_offload_targets
+    else
+      offload_targets=$offload_targets,$enable_offload_targets
+    fi
+  fi
+], [enable_accelerator=no])
+AC_SUBST(enable_accelerator)
+if test x$offload_targets != x; then
+  extra_parts="${extra_parts} crtompend.o"
+fi
+
 # Check if Solaris/x86 linker supports ZERO terminator unwind entries.
 # This is after config.host so we can augment tmake_file.
 # Link with -nostartfiles -nodefaultlibs since neither are present while
Index: libgcc/crtstuff.c
===================================================================
--- libgcc/crtstuff.c	(revision 209074)
+++ libgcc/crtstuff.c	(working copy)
@@ -311,15 +311,6 @@ register_tm_clones (void)
 }
 #endif /* USE_TM_CLONE_REGISTRY */
 
-#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
-void *_omp_func_table[0]
-  __attribute__ ((__used__, visibility ("protected"),
-		  section (".offload_func_table_section"))) = { };
-void *_omp_var_table[0]
-  __attribute__ ((__used__, visibility ("protected"),
-		  section (".offload_var_table_section"))) = { };
-#endif
-
 #if defined(INIT_SECTION_ASM_OP) || defined(INIT_ARRAY_SECTION_ASM_OP)
 
 #ifdef OBJECT_FORMAT_ELF
@@ -761,22 +752,6 @@ __do_global_ctors (void)
 #error "What are you doing with crtstuff.c, then?"
 #endif
 
-#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
-void *_omp_funcs_end[0]
-  __attribute__ ((__used__, visibility ("protected"),
-		  section (".offload_func_table_section"))) = { };
-void *_omp_vars_end[0]
-  __attribute__ ((__used__, visibility ("protected"),
-		  section (".offload_var_table_section"))) = { };
-extern void *_omp_func_table[];
-extern void *_omp_var_table[];
-void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("protected"))) =
-{
-  &_omp_func_table, &_omp_funcs_end,
-  &_omp_var_table, &_omp_vars_end
-};
-#endif
-
 
 #else /* ! CRT_BEGIN && ! CRT_END */
 #error "One of CRT_BEGIN or CRT_END must be defined."
Index: libgcc/ompstuff.c
===================================================================
--- libgcc/ompstuff.c	(revision 0)
+++ libgcc/ompstuff.c	(working copy)
@@ -0,0 +1,52 @@
+/* Specialized bits of code needed for the OpenMP offloading tables.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/* Target machine header files require this define. */
+#define IN_LIBGCC2
+
+/* FIXME: Including auto-host is incorrect, but until we have
+   identified the set of defines that need to go into auto-target.h,
+   this will have to do.  */
+#include "auto-host.h"
+#undef pid_t
+#undef rlim_t
+#undef ssize_t
+#undef vfork
+#include "tconfig.h"
+#include "tsystem.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "libgcc_tm.h"
+
+#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
+extern void __start___gnu_offload_funcs;
+extern void __stop___gnu_offload_funcs;
+extern void __start___gnu_offload_vars;
+extern void __stop___gnu_offload_vars;
+void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("hidden"))) =
+{
+  &__start___gnu_offload_funcs, &__stop___gnu_offload_funcs,
+  &__start___gnu_offload_vars, &__stop___gnu_offload_vars
+};
+#endif

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-04  9:33       ` Bernd Schmidt
@ 2014-04-05 15:05         ` Thomas Schwinge
  2014-04-05 15:24           ` Bernd Schmidt
  0 siblings, 1 reply; 48+ messages in thread
From: Thomas Schwinge @ 2014-04-05 15:05 UTC (permalink / raw)
  To: Bernd Schmidt, Jakub Jelinek; +Cc: GCC Patches, Ilya Verbin, Michael Zolotukhin

[-- Attachment #1: Type: text/plain, Size: 3377 bytes --]

Hi!

On Fri, 4 Apr 2014 11:30:49 +0200, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 03/21/2014 04:20 PM, Jakub Jelinek wrote:
> > On Fri, Mar 21, 2014 at 04:13:45PM +0100, Bernd Schmidt wrote:
> >> On 03/20/2014 07:56 PM, Jakub Jelinek wrote:
> >>> When we were discussing the design last year, my strong preference was that
> >>> either this lives in some other crt object that mkoffload/linker plugin adds
> >>> to link, or that it would be completely mkoffload synthetized.
> >>
> >> mkoffload is only concerned with generating target images. These
> >> fragments are for the host tables.
> >>
> >> How's this? It moves everything to ompbegin.o/ompend.o and only
> >> links in these files if we have produced at least one target offload
> >> image.
> >
> > I'd call the files crtompbegin.o/crtompend.o instead.

I'd go with crtoffload* (or similar).  ;-)


> > Also, supposedly if you've used section names without . in them, the linker
> > itself would provide the symbols automatically and you wouldn't actually
> > need begin/end, but just one object that would reference the linker created
> > symbols.  Just use say __gnu_offload_whatever__ or similar section names.
> 
> I've checked in the following which should address all this.

Is it a linker bug that I need to add something like the following?

--- libgcc/ompstuff.c
+++ libgcc/ompstuff.c
@@ -40,6 +40,12 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #include "libgcc_tm.h"
 
 #if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
+# if 1
+/* TODO: Without the following, will get missing symbols for __start_* and
+   __stop_*.  Linker bug?  */
+static void *_funcs[0] __attribute__ ((section ("__gnu_offload_funcs"))) = { };
+static void *_vars[0] __attribute__ ((section ("__gnu_offload_vars"))) = { };
+# endif
 extern void __start___gnu_offload_funcs;
 extern void __stop___gnu_offload_funcs;
 extern void __start___gnu_offload_vars;

    $ ld --version
    GNU ld (Sourcery CodeBench 2013.11-17) 2.23.52.20130912
    [...]


> --- libgcc/ompstuff.c	(revision 0)
> +++ libgcc/ompstuff.c	(working copy)

> +extern void __start___gnu_offload_funcs;
> +extern void __stop___gnu_offload_funcs;
> +extern void __start___gnu_offload_vars;
> +extern void __stop___gnu_offload_vars;
> +void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("hidden"))) =
> +{
> +  &__start___gnu_offload_funcs, &__stop___gnu_offload_funcs,
> +  &__start___gnu_offload_vars, &__stop___gnu_offload_vars
> +};

    ../../../source/libgcc/ompstuff.c:49:3: warning: taking address of expression of type 'void'
       &__start___gnu_offload_funcs, &__stop___gnu_offload_funcs,
       ^
    ../../../source/libgcc/ompstuff.c:49:33: warning: taking address of expression of type 'void'
       &__start___gnu_offload_funcs, &__stop___gnu_offload_funcs,
                                     ^
    ../../../source/libgcc/ompstuff.c:50:3: warning: taking address of expression of type 'void'
       &__start___gnu_offload_vars, &__stop___gnu_offload_vars
       ^
    ../../../source/libgcc/ompstuff.c:50:32: warning: taking address of expression of type 'void'
       &__start___gnu_offload_vars, &__stop___gnu_offload_vars
                                    ^

s%void%char makes this go away.


Grüße,
 Thomas

[-- Attachment #2: Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-05 15:05         ` Thomas Schwinge
@ 2014-04-05 15:24           ` Bernd Schmidt
  2014-04-08  6:03             ` Jakub Jelinek
  2014-05-06 15:32             ` Ilya Verbin
  0 siblings, 2 replies; 48+ messages in thread
From: Bernd Schmidt @ 2014-04-05 15:24 UTC (permalink / raw)
  To: Thomas Schwinge, Jakub Jelinek
  Cc: GCC Patches, Ilya Verbin, Michael Zolotukhin

On 04/05/2014 05:04 PM, Thomas Schwinge wrote:
> Is it a linker bug that I need to add something like the following?
>
> --- libgcc/ompstuff.c
> +++ libgcc/ompstuff.c
> @@ -40,6 +40,12 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>   #include "libgcc_tm.h"
>
>   #if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
> +# if 1
> +/* TODO: Without the following, will get missing symbols for __start_* and
> +   __stop_*.  Linker bug?  */
> +static void *_funcs[0] __attribute__ ((section ("__gnu_offload_funcs"))) = { };
> +static void *_vars[0] __attribute__ ((section ("__gnu_offload_vars"))) = { };
> +# endif

Things seemed to work over here, but now I'm not certain whether the 
__start_/__stop_ functionality is GNU ld specific? Maybe we should just 
go back to the previous version of this patch which didn't try to use this.


Bernd

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-05 15:24           ` Bernd Schmidt
@ 2014-04-08  6:03             ` Jakub Jelinek
  2014-05-06 15:32             ` Ilya Verbin
  1 sibling, 0 replies; 48+ messages in thread
From: Jakub Jelinek @ 2014-04-08  6:03 UTC (permalink / raw)
  To: Bernd Schmidt
  Cc: Thomas Schwinge, GCC Patches, Ilya Verbin, Michael Zolotukhin

On Sat, Apr 05, 2014 at 05:22:09PM +0200, Bernd Schmidt wrote:
> On 04/05/2014 05:04 PM, Thomas Schwinge wrote:
> >Is it a linker bug that I need to add something like the following?
> >
> >--- libgcc/ompstuff.c
> >+++ libgcc/ompstuff.c
> >@@ -40,6 +40,12 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> >  #include "libgcc_tm.h"
> >
> >  #if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
> >+# if 1
> >+/* TODO: Without the following, will get missing symbols for __start_* and
> >+   __stop_*.  Linker bug?  */
> >+static void *_funcs[0] __attribute__ ((section ("__gnu_offload_funcs"))) = { };
> >+static void *_vars[0] __attribute__ ((section ("__gnu_offload_vars"))) = { };
> >+# endif
> 
> Things seemed to work over here, but now I'm not certain whether the
> __start_/__stop_ functionality is GNU ld specific? Maybe we should
> just go back to the previous version of this patch which didn't try
> to use this.

Somebody needs to try it with gold, I think it should support the same.
As for other linkers, don't we need linker plugin support anyway, which is
not available for other linkers?

	Jakub

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-03-27 16:23       ` Jakub Jelinek
  2014-03-27 18:51         ` Ilya Verbin
@ 2014-04-17 18:44         ` Ilya Verbin
  2014-04-25 11:55           ` Ilya Verbin
  2014-06-10 13:52           ` Bernd Schmidt
  1 sibling, 2 replies; 48+ messages in thread
From: Ilya Verbin @ 2014-04-17 18:44 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Schmidt, GCC Patches, Michael Zolotukhin

On 27 Mar 17:16, Jakub Jelinek wrote:
> On Thu, Mar 27, 2014 at 08:13:00PM +0400, Ilya Verbin wrote:
> > On 27 Mar 15:02, Jakub Jelinek wrote:
> > > The tables need to be created before IPA, that way it really shouldn't
> > > matter in what order you emit them.  E.g. the outlined target functions
> > > could be added to the table during ompexp pass which actually creates the
> > > outlined functions, the vars need to be added before target lto or host lto
> > > is streamed.
> > 
> > For host tables it's ok, but when target compiler will create tables with functions?
> > It reads bytecode from target_lto sections, so it never executes ompexp pass.
> 
> Which is why the table created for host by the ompexp pass should be
> streamed into the target_lto sections (marked specially somehow, special
> attribute or whatever), and then corresponding target table created from
> that, rather then created from some possibly different ordering there.
> 
> 	Jakub

Hi Jakub,

Could you please take a look at this patch?  It fixes the ordering issue in the
tables stated above, and passes all the tests that I have.  But I'm not sure
about its correctness from the architectural point of view.


---
 gcc/lto-cgraph.c       | 93 ++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/lto-section-in.c   |  3 +-
 gcc/lto-streamer-out.c |  2 ++
 gcc/lto-streamer.h     |  3 ++
 gcc/lto/lto.c          |  2 ++
 gcc/omp-low.c          | 68 +++++++-----------------------------
 6 files changed, 115 insertions(+), 56 deletions(-)

diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index 544f04b..3d6637e 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -82,6 +82,8 @@ enum LTO_symtab_tags
   LTO_symtab_last_tag
 };
 
+extern vec<tree, va_gc> *offload_funcs, *offload_vars;
+
 /* Create a new symtab encoder.
    if FOR_INPUT, the encoder allocate only datastructures needed
    to read the symtab.  */
@@ -958,6 +960,51 @@ output_symtab (void)
   output_refs (encoder);
 }
 
+void
+output_offload_tables (void)
+{
+  /* Collect all omp-target global variables to offload_vars, if they have not
+     been gathered earlier by input_offload_tables.  */
+  if (vec_safe_is_empty (offload_vars))
+    {
+      struct varpool_node *vnode;
+      FOR_EACH_DEFINED_VARIABLE (vnode)
+	{
+	  if (!lookup_attribute ("omp declare target",
+				 DECL_ATTRIBUTES (vnode->decl))
+	      || TREE_CODE (vnode->decl) != VAR_DECL
+	      || DECL_SIZE (vnode->decl) == 0)
+	    continue;
+	  vec_safe_push (offload_vars, vnode->decl);
+	}
+    }
+
+  if (vec_safe_is_empty (offload_funcs) && vec_safe_is_empty (offload_vars))
+    return;
+
+  struct lto_simple_output_block *ob
+    = lto_create_simple_output_block (LTO_section_offload_table);
+
+  for (unsigned i = 0; i < vec_safe_length (offload_funcs); i++)
+    {
+      streamer_write_enum (ob->main_stream, LTO_symtab_tags,
+			   LTO_symtab_last_tag, LTO_symtab_unavail_node);
+      lto_output_fn_decl_index (ob->decl_state, ob->main_stream,
+				(*offload_funcs)[i]);
+    }
+
+  for (unsigned i = 0; i < vec_safe_length (offload_vars); i++)
+    {
+      streamer_write_enum (ob->main_stream, LTO_symtab_tags,
+			   LTO_symtab_last_tag, LTO_symtab_variable);
+      lto_output_var_decl_index (ob->decl_state, ob->main_stream,
+				 (*offload_vars)[i]);
+    }
+
+  streamer_write_uhwi_stream (ob->main_stream, 0);
+  lto_destroy_simple_output_block (ob);
+}
+
 /* Overwrite the information in NODE based on FILE_DATA, TAG, FLAGS,
    STACK_SIZE, SELF_TIME and SELF_SIZE.  This is called either to initialize
    NODE or to replace the values in it, for instance because the first
@@ -1611,6 +1658,52 @@ input_symtab (void)
     }
 }
 
+void
+input_offload_tables (void)
+{
+  struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
+  struct lto_file_decl_data *file_data;
+  unsigned int j = 0;
+
+  while ((file_data = file_data_vec[j++]))
+    {
+      const char *data;
+      size_t len;
+      struct lto_input_block *ib
+	= lto_create_simple_input_block (file_data, LTO_section_offload_table,
+					 &data, &len);
+      if (!ib)
+	continue;
+
+      enum LTO_symtab_tags tag
+	= streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag);
+      while (tag)
+	{
+	  if (tag == LTO_symtab_unavail_node)
+	    {
+	      int decl_index = streamer_read_uhwi (ib);
+	      tree fn_decl
+		= lto_file_decl_data_get_fn_decl (file_data, decl_index);
+	      vec_safe_push (offload_funcs, fn_decl);
+	    }
+	  else if (tag == LTO_symtab_variable)
+	    {
+	      int decl_index = streamer_read_uhwi (ib);
+	      tree var_decl
+		= lto_file_decl_data_get_var_decl (file_data, decl_index);
+	      vec_safe_push (offload_vars, var_decl);
+	    }
+	  else
+	    fatal_error ("invalid offload table in %s", file_data->file_name);
+
+	  tag = streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag);
+	}
+
+      lto_destroy_simple_input_block (file_data, LTO_section_offload_table,
+				      ib, data, len);
+    }
+}
+
 /* True when we need optimization summary for NODE.  */
 
 static int
diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c
index 9aa7639..df2fd8f 100644
--- a/gcc/lto-section-in.c
+++ b/gcc/lto-section-in.c
@@ -60,7 +60,8 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
   "opts",
   "cgraphopt",
   "inline",
-  "ipcp_trans"
+  "ipcp_trans",
+  "offload_table"
 };
 
 
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 0f37f1c..2358a5e 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -2072,6 +2072,8 @@ lto_output (void)
      statements using the statement UIDs.  */
   output_symtab ();
 
+  output_offload_tables ();
+
 #ifdef ENABLE_CHECKING
   lto_bitmap_free (output);
 #endif
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index b1dc7dc..edc5be4 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -247,6 +247,7 @@ enum lto_section_type
   LTO_section_cgraph_opt_sum,
   LTO_section_inline_summary,
   LTO_section_ipcp_transform,
+  LTO_section_offload_table,
   LTO_N_SECTION_TYPES		/* Must be last.  */
 };
 
@@ -883,6 +884,8 @@ bool lto_symtab_encoder_encode_initializer_p (lto_symtab_encoder_t,
 					      varpool_node *);
 void output_symtab (void);
 void input_symtab (void);
+void output_offload_tables (void);
+void input_offload_tables (void);
 bool referenced_from_other_partition_p (struct ipa_ref_list *,
 				        lto_symtab_encoder_t);
 bool reachable_from_other_partition_p (struct cgraph_node *,
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 8aaf8d3..7a2506d 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -3020,6 +3020,8 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames)
   /* Read the symtab.  */
   input_symtab ();
 
+  input_offload_tables ();
+
   /* Store resolutions into the symbol table.  */
 
   FOR_EACH_SYMBOL (snode)
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 45a8eb2..117021d 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -224,6 +224,9 @@ static tree scan_omp_1_op (tree *, int *, void *);
 /* Holds a decl for __OPENMP_TARGET__.  */
 static GTY(()) tree offload_symbol_decl;
 
+/* Holds offload tables with decls.  */
+vec<tree, va_gc> *offload_funcs, *offload_vars;
+
 /* Get the __OPENMP_TARGET__ symbol.  */
 static tree
 get_offload_symbol_decl (void)
@@ -8548,6 +8551,9 @@ expand_omp_target (struct omp_region *region)
       DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
       cgraph_add_new_function (child_fn, true);
 
+      /* Add the new function to the offload table.  */
+      vec_safe_push (offload_funcs, child_fn);
+
       /* Fix the callgraph edges for child_cfun.  Those for cfun will be
 	 fixed in a following pass.  */
       push_cfun (child_cfun);
@@ -12849,71 +12855,23 @@ add_decls_addresses_to_decl_constructor (vec<tree, va_gc> *v_decls,
 void
 omp_finish_file (void)
 {
-  struct cgraph_node *node;
-  struct varpool_node *vnode;
   const char *funcs_section_name = OFFLOAD_FUNC_TABLE_SECTION_NAME;
   const char *vars_section_name = OFFLOAD_VAR_TABLE_SECTION_NAME;
-  vec<tree, va_gc> *v_funcs, *v_vars;
-
-  vec_alloc (v_vars, 0);
-  vec_alloc (v_funcs, 0);
-
-  /* Collect all omp-target functions.  */
-  FOR_EACH_DEFINED_FUNCTION (node)
-    {
-      /* TODO: This check could fail on functions, created by omp
-	 parallel/task pragmas.  It's better to name outlined for offloading
-	 functions in some different way and to check here the function name.
-	 It could be something like "*_omp_tgtfn" in contrast with "*_omp_fn"
-	 for functions from omp parallel/task pragmas.  */
-      if (!lookup_attribute ("omp declare target",
-			     DECL_ATTRIBUTES (node->decl))
-	  || !DECL_ARTIFICIAL (node->decl))
-	continue;
-      vec_safe_push (v_funcs, node->decl);
-    }
-  /* Collect all omp-target global variables.  */
-  FOR_EACH_DEFINED_VARIABLE (vnode)
-    {
-      if (!lookup_attribute ("omp declare target",
-			     DECL_ATTRIBUTES (vnode->decl))
-	  || TREE_CODE (vnode->decl) != VAR_DECL
-	  || DECL_SIZE (vnode->decl) == 0)
-	continue;
 
-      vec_safe_push (v_vars, vnode->decl);
-    }
-  unsigned num_vars = vec_safe_length (v_vars);
-  unsigned num_funcs = vec_safe_length (v_funcs);
+  unsigned num_funcs = vec_safe_length (offload_funcs);
+  unsigned num_vars = vec_safe_length (offload_vars);
 
-  if (num_vars == 0 && num_funcs == 0)
+  if (num_funcs == 0 && num_vars == 0)
     return;
 
-#ifdef ACCEL_COMPILER
-  /* Decls are placed in reversed order in fat-objects, so we need to
-     revert them back if we compile target.  */
-  for (unsigned i = 0; i < num_funcs / 2; i++)
-    {
-      tree it = (*v_funcs)[i];
-      (*v_funcs)[i] = (*v_funcs)[num_funcs - i - 1];
-      (*v_funcs)[num_funcs - i - 1] = it;
-    }
-  for (unsigned i = 0; i < num_vars / 2; i++)
-    {
-      tree it = (*v_vars)[i];
-      (*v_vars)[i] = (*v_vars)[num_vars - i - 1];
-      (*v_vars)[num_vars - i - 1] = it;
-    }
-#endif
-
   if (targetm_common.have_named_sections)
     {
       vec<constructor_elt, va_gc> *v_f, *v_v;
       vec_alloc (v_f, num_funcs);
       vec_alloc (v_v, num_vars * 2);
 
-      add_decls_addresses_to_decl_constructor (v_funcs, v_f);
-      add_decls_addresses_to_decl_constructor (v_vars, v_v);
+      add_decls_addresses_to_decl_constructor (offload_funcs, v_f);
+      add_decls_addresses_to_decl_constructor (offload_vars, v_v);
 
       tree vars_decl_type = build_array_type_nelts (pointer_sized_int_node,
 						    num_vars * 2);
@@ -12946,12 +12904,12 @@ omp_finish_file (void)
     {
       for (unsigned i = 0; i < num_funcs; i++)
 	{
-	  tree it = (*v_funcs)[i];
+	  tree it = (*offload_funcs)[i];
 	  targetm.record_offload_symbol (it);
 	}  
       for (unsigned i = 0; i < num_vars; i++)
 	{
-	  tree it = (*v_vars)[i];
+	  tree it = (*offload_vars)[i];
 	  targetm.record_offload_symbol (it);
 	}  
     }
-- 
1.7.11.7


Thanks,
  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-17 18:44         ` Ilya Verbin
@ 2014-04-25 11:55           ` Ilya Verbin
  2014-06-10 13:52           ` Bernd Schmidt
  1 sibling, 0 replies; 48+ messages in thread
From: Ilya Verbin @ 2014-04-25 11:55 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Bernd Schmidt, GCC Patches, Michael Zolotukhin

On 17 Apr 22:33, Ilya Verbin wrote:
> Hi Jakub,
> 
> Could you please take a look at this patch?  It fixes the ordering issue in the
> tables stated above, and passes all the tests that I have.  But I'm not sure
> about its correctness from the architectural point of view.
> 
> Thanks,
>   -- Ilya

Ping.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-05 15:24           ` Bernd Schmidt
  2014-04-08  6:03             ` Jakub Jelinek
@ 2014-05-06 15:32             ` Ilya Verbin
  2014-05-08 10:11               ` Bernd Schmidt
  1 sibling, 1 reply; 48+ messages in thread
From: Ilya Verbin @ 2014-05-06 15:32 UTC (permalink / raw)
  To: Bernd Schmidt, Jakub Jelinek
  Cc: Thomas Schwinge, GCC Patches, Michael Zolotukhin

On 05 Apr 17:22, Bernd Schmidt wrote:
> Things seemed to work over here, but now I'm not certain whether the
> __start_/__stop_ functionality is GNU ld specific? Maybe we should
> just go back to the previous version of this patch which didn't try
> to use this.
> 
> Bernd

This approach does not work with shared libraries.

The automatically inserted symbols have GLOBAL binding, therefore the
__start_/__stop_ from the executable overwrite the respective symbols in DSO.

Here is a simple example with 2 DSOs and one executable.  The function
GOMP_offload_register is called with the following pointers in HOST_TABLE:

1. (funcs 0x604880:0x604898, vars 0x604840:0x604880)
2. (funcs 0x604880:0x604898, vars 0x604840:0x604880)
3. (funcs 0x604880:0x604898, vars 0x604840:0x604880)

But with "manually" added start/stop and LOCAL binding everything works fine:

1. (funcs 0x7f286b425530:0x7f286b425540, vars 0x7f286b425540:0x7f286b425540)
2. (funcs 0x7f286b8624a0:0x7f286b8624b0, vars 0x7f286b8624b0:0x7f286b8624b0)
3. (funcs 0x604760:0x604778, vars 0x604780:0x6047c0)

  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-05-06 15:32             ` Ilya Verbin
@ 2014-05-08 10:11               ` Bernd Schmidt
  2014-05-12 20:44                 ` Bernd Schmidt
  0 siblings, 1 reply; 48+ messages in thread
From: Bernd Schmidt @ 2014-05-08 10:11 UTC (permalink / raw)
  To: Ilya Verbin, Jakub Jelinek
  Cc: Thomas Schwinge, GCC Patches, Michael Zolotukhin

[-- Attachment #1: Type: text/plain, Size: 667 bytes --]

On 05/06/2014 05:32 PM, Ilya Verbin wrote:
> On 05 Apr 17:22, Bernd Schmidt wrote:
>> Things seemed to work over here, but now I'm not certain whether the
>> __start_/__stop_ functionality is GNU ld specific? Maybe we should
>> just go back to the previous version of this patch which didn't try
>> to use this.
>>
>> Bernd
>
> This approach does not work with shared libraries.
>
> The automatically inserted symbols have GLOBAL binding, therefore the
> __start_/__stop_ from the executable overwrite the respective symbols in DSO.

Ok, I guess we should just go back to what we had previously. Here's 
what I intend to commit if there are no objections.


Bernd




[-- Attachment #2: ompbegend.diff --]
[-- Type: text/x-patch, Size: 5542 bytes --]

Index: gcc/lto-wrapper.c
===================================================================
--- gcc/lto-wrapper.c	(revision 210170)
+++ gcc/lto-wrapper.c	(working copy)
@@ -66,7 +66,7 @@ static unsigned int nr;
 static char **input_names;
 static char **output_names;
 static char **offload_names;
-static const char *ompend;
+static const char *ompbegin, *ompend;
 static char *makefile;
 
 const char tool_name[] = "lto-wrapper";
@@ -554,30 +554,40 @@ copy_file (const char *dest, const char
     }
 }
 
-/* Find the crtompend.o file in LIBRARY_PATH, make a copy and store
-   the name of the copy in ompend.  */
+/* Find the omp_begin.o and omp_end.o files in LIBRARY_PATH, make copies
+   and store the names of the copies in ompbegin and ompend.  */
 
 static void
-find_ompend (void)
+find_ompbeginend (void)
 {
   char **paths;
   const char *library_path = getenv ("LIBRARY_PATH");
   if (library_path == NULL)
     return;
-  int n_paths = parse_env_var (library_path, &paths, "/crtompend.o");
+  int n_paths = parse_env_var (library_path, &paths, "/crtompbegin.o");
 
-  for (int i = 0; i < n_paths; i++)
+  int i;
+  for (i = 0; i < n_paths; i++)
     if (access_check (paths[i], R_OK) == 0)
       {
+	size_t len = strlen (paths[i]);
+	char *tmp = xstrdup (paths[i]);
+	strcpy (paths[i] + len - 7, "end.o");
+	if (access_check (paths[i], R_OK) != 0)
+	  fatal ("installation error, can't find crtompend.o");
 	/* The linker will delete the filenames we give it, so make
 	   copies.  */
-	const char *omptmp = make_temp_file (".o");
-	copy_file (omptmp, paths[i]);
-	ompend = omptmp;
+	const char *omptmp1 = make_temp_file (".o");
+	const char *omptmp2 = make_temp_file (".o");
+	copy_file (omptmp1, tmp);
+	ompbegin = omptmp1;
+	copy_file (omptmp2, paths[i]);
+	ompend = omptmp2;
+	free (tmp);
 	break;
       }
-  if (ompend == 0)
-    fatal ("installation error, can't find crtompend.o");
+  if (i == n_paths)
+    fatal ("installation error, can't find crtompbegin.o");
 
   free_array_of_ptrs ((void**) paths, n_paths);
 }
@@ -1073,7 +1083,7 @@ cont:
 	  compile_images_for_openmp_targets (argc, argv);
 	  if (offload_names)
 	    {
-	      find_ompend ();
+	      find_ompbeginend ();
 	      for (i = 0; offload_names[i]; i++)
 		{
 		  fputs (offload_names[i], stdout);
@@ -1082,6 +1092,11 @@ cont:
 	      free_array_of_ptrs ((void **)offload_names, i);
 	    }
 	}
+      if (ompbegin)
+	{
+	  fputs (ompbegin, stdout);
+	  putc ('\n', stdout);
+	}
 
       for (i = 0; i < nr; ++i)
 	{
Index: libgcc/Makefile.in
===================================================================
--- libgcc/Makefile.in	(revision 210170)
+++ libgcc/Makefile.in	(working copy)
@@ -975,6 +975,9 @@ crtbegin$(objext): $(srcdir)/crtstuff.c
 crtend$(objext): $(srcdir)/crtstuff.c
 	$(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END
 
+crtompbegin$(objext): $(srcdir)/ompstuff.c
+	$(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_BEGIN
+
 crtompend$(objext): $(srcdir)/ompstuff.c
 	$(crt_compile) $(CRTSTUFF_T_CFLAGS) -c $< -DCRT_END
 
Index: libgcc/configure
===================================================================
--- libgcc/configure	(revision 210170)
+++ libgcc/configure	(working copy)
@@ -4397,7 +4397,7 @@ fi
 
 
 if test x$offload_targets != x; then
-  extra_parts="${extra_parts} crtompend.o"
+  extra_parts="${extra_parts} crtompbegin.o crtompend.o"
 fi
 
 # Check if Solaris/x86 linker supports ZERO terminator unwind entries.
Index: libgcc/configure.ac
===================================================================
--- libgcc/configure.ac	(revision 210170)
+++ libgcc/configure.ac	(working copy)
@@ -336,7 +336,7 @@ AC_ARG_ENABLE(offload-targets,
 ], [enable_accelerator=no])
 AC_SUBST(enable_accelerator)
 if test x$offload_targets != x; then
-  extra_parts="${extra_parts} crtompend.o"
+  extra_parts="${extra_parts} crtompbegin.o crtompend.o"
 fi
 
 # Check if Solaris/x86 linker supports ZERO terminator unwind entries.
Index: libgcc/ompstuff.c
===================================================================
--- libgcc/ompstuff.c	(revision 210170)
+++ libgcc/ompstuff.c	(working copy)
@@ -39,14 +39,35 @@ see the files COPYING3 and COPYING.RUNTI
 #include "tm.h"
 #include "libgcc_tm.h"
 
+#ifdef CRT_BEGIN
+
 #if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
-extern void __start___gnu_offload_funcs;
-extern void __stop___gnu_offload_funcs;
-extern void __start___gnu_offload_vars;
-extern void __stop___gnu_offload_vars;
+void *_omp_func_table[0]
+  __attribute__ ((__used__, visibility ("hidden"),
+		  section (".offload_func_table_section"))) = { };
+void *_omp_var_table[0]
+  __attribute__ ((__used__, visibility ("hidden"),
+		  section (".offload_var_table_section"))) = { };
+#endif
+
+#elif defined CRT_END
+
+#if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
+void *_omp_funcs_end[0]
+  __attribute__ ((__used__, visibility ("hidden"),
+		  section (".offload_func_table_section"))) = { };
+void *_omp_vars_end[0]
+  __attribute__ ((__used__, visibility ("hidden"),
+		  section (".offload_var_table_section"))) = { };
+extern void *_omp_func_table[];
+extern void *_omp_var_table[];
 void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("hidden"))) =
 {
-  &__start___gnu_offload_funcs, &__stop___gnu_offload_funcs,
-  &__start___gnu_offload_vars, &__stop___gnu_offload_vars
+  &_omp_func_table, &_omp_funcs_end,
+  &_omp_var_table, &_omp_vars_end
 };
 #endif
+
+#else /* ! CRT_BEGIN && ! CRT_END */
+#error "One of CRT_BEGIN or CRT_END must be defined."
+#endif

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-05-08 10:11               ` Bernd Schmidt
@ 2014-05-12 20:44                 ` Bernd Schmidt
  0 siblings, 0 replies; 48+ messages in thread
From: Bernd Schmidt @ 2014-05-12 20:44 UTC (permalink / raw)
  To: Ilya Verbin, Jakub Jelinek
  Cc: Thomas Schwinge, GCC Patches, Michael Zolotukhin

[-- Attachment #1: Type: text/plain, Size: 804 bytes --]

On 05/08/2014 12:11 PM, Bernd Schmidt wrote:
> On 05/06/2014 05:32 PM, Ilya Verbin wrote:
>> On 05 Apr 17:22, Bernd Schmidt wrote:
>>> Things seemed to work over here, but now I'm not certain whether the
>>> __start_/__stop_ functionality is GNU ld specific? Maybe we should
>>> just go back to the previous version of this patch which didn't try
>>> to use this.
>>>
>>> Bernd
>>
>> This approach does not work with shared libraries.
>>
>> The automatically inserted symbols have GLOBAL binding, therefore the
>> __start_/__stop_ from the executable overwrite the respective symbols
>> in DSO.
>
> Ok, I guess we should just go back to what we had previously. Here's
> what I intend to commit if there are no objections.

... plus the following to make it work with the changed section names.


Bernd



[-- Attachment #2: fixup-secnames.diff --]
[-- Type: text/x-patch, Size: 1270 bytes --]

Index: libgcc/ompstuff.c
===================================================================
--- libgcc/ompstuff.c	(revision 432834)
+++ libgcc/ompstuff.c	(working copy)
@@ -44,10 +44,10 @@ see the files COPYING3 and COPYING.RUNTI
 #if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
 void *_omp_func_table[0]
   __attribute__ ((__used__, visibility ("hidden"),
-		  section (".offload_func_table_section"))) = { };
+		  section ("__gnu_offload_funcs"))) = { };
 void *_omp_var_table[0]
   __attribute__ ((__used__, visibility ("hidden"),
-		  section (".offload_var_table_section"))) = { };
+		  section ("__gnu_offload_vars"))) = { };
 #endif
 
 #elif defined CRT_END
@@ -55,10 +55,10 @@ void *_omp_var_table[0]
 #if defined(HAVE_GAS_HIDDEN) && defined(ENABLE_OFFLOADING)
 void *_omp_funcs_end[0]
   __attribute__ ((__used__, visibility ("hidden"),
-		  section (".offload_func_table_section"))) = { };
+		  section ("__gnu_offload_funcs"))) = { };
 void *_omp_vars_end[0]
   __attribute__ ((__used__, visibility ("hidden"),
-		  section (".offload_var_table_section"))) = { };
+		  section ("__gnu_offload_vars"))) = { };
 extern void *_omp_func_table[];
 extern void *_omp_var_table[];
 void *__OPENMP_TARGET__[] __attribute__ ((__visibility__ ("hidden"))) =

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-04-17 18:44         ` Ilya Verbin
  2014-04-25 11:55           ` Ilya Verbin
@ 2014-06-10 13:52           ` Bernd Schmidt
  2014-06-10 18:07             ` Ilya Verbin
  1 sibling, 1 reply; 48+ messages in thread
From: Bernd Schmidt @ 2014-06-10 13:52 UTC (permalink / raw)
  To: Ilya Verbin, Jakub Jelinek; +Cc: GCC Patches, Michael Zolotukhin

On 04/17/2014 08:33 PM, Ilya Verbin wrote:
> Could you please take a look at this patch?  It fixes the ordering issue in the
> tables stated above, and passes all the tests that I have.  But I'm not sure
> about its correctness from the architectural point of view.

I'm still skeptical relying on ordering is going to work in the long 
run, but in the meantime this looks better than what we have at the 
moment. So I think this should probably go in for now, but first it 
needs a few small changes:

> --- a/gcc/lto-cgraph.c
> +++ b/gcc/lto-cgraph.c
> @@ -82,6 +82,8 @@ enum LTO_symtab_tags
>     LTO_symtab_last_tag
>   };
>
> +extern vec<tree, va_gc> *offload_funcs, *offload_vars;

Declarations go into header files.

> +void
> +output_offload_tables (void)

All functions should have a comment.

> +{
> +  /* Collect all omp-target global variables to offload_vars, if they have not
> +     been gathered earlier by input_offload_tables.  */
> +  if (vec_safe_is_empty (offload_vars))

What if a variable was entered into the table by something other than 
input_offload_tables? We'll skip this code entirely, which doesn't seem 
right. Can we even get here after input_offload_tables has been called, 
and if so, maybe this step of collecting variables belongs elsewhere?

Also, the previous code did the same for functions, and I can't find 
anything corresponding to that after the patch. Is this intentional?


Bernd

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-06-10 13:52           ` Bernd Schmidt
@ 2014-06-10 18:07             ` Ilya Verbin
  2014-08-13 16:19               ` Ilya Verbin
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Verbin @ 2014-06-10 18:07 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Jakub Jelinek, GCC Patches, Richard Biener

On 10 Jun 15:52, Bernd Schmidt wrote:
> On 04/17/2014 08:33 PM, Ilya Verbin wrote:
> >+{
> >+  /* Collect all omp-target global variables to offload_vars, if they have not
> >+     been gathered earlier by input_offload_tables.  */
> >+  if (vec_safe_is_empty (offload_vars))
> 
> What if a variable was entered into the table by something other
> than input_offload_tables? We'll skip this code entirely, which
> doesn't seem right. Can we even get here after input_offload_tables
> has been called, and if so, maybe this step of collecting variables
> belongs elsewhere?
> 
> Also, the previous code did the same for functions, and I can't find
> anything corresponding to that after the patch. Is this intentional?

I'll try to explain with an example bellow:

Suppose there are 2 source files: test1.c and test2.c.

    1. During the compilation of test1.c:
  1.1. In expand_omp_target gcc adds new target functions into offload_funcs;
  1.2. In output_offload_tables gcc adds all target variables into offload_vars;
  1.3. In output_offload_tables gcc streams offload_funcs/vars into TARGET LTO_section_offload_table.
       And if there is -flto, it also streams them into the HOST LTO_section_offload_table;
  1.4. In omp_finish_file gcc writes addresses from offload_funcs/vars into test1.o.

    2. The same steps happen for test2.c.

   3a. If there is no -flto, ld will join raw tables from test1.o and test2.o.
       And accel compiler will join tables from target LTO_section_offload_table.
       For now this mode isn't implemented, to run accel compiler we need -flto.
   3b. If there is -flto (let's consider WHOPR mode, since LTO mode is simpler), there are 2 stages:
  3.1. WPA:
3.1.1. In input_offload_tables gcc reads host LTO_section_offload_table from test1.o and test2.o;
3.1.2. In output_offload_tables gcc streams the joined tables into LTO_section_offload_table in the new partition xxx.ltrans0.o;
  3.2. LTRANS:
3.2.1. In input_offload_tables gcc reads host LTO_section_offload_table from xxx.ltrans0.o;
3.2.2. In omp_finish_file gcc writes addresses from offload_funcs/vars into the final xxx.ltrans0.ltrans.o.

So, the question is what is the right place for collecting decls into offload_funcs/vars?
I collect offload_funcs in expand_omp_target where they're created.
But for offload_vars I couldn't find a place better than output_offload_tables.
That's why I added "if (vec_safe_is_empty (offload_vars))".
If the var decls have been read by input_offload_tables on the step 3.1.1, there is no need to
collect them from FOR_EACH_DEFINED_VARIABLE on the step 3.1.2, because that order might be incorrect.

Thanks,
  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-06-10 18:07             ` Ilya Verbin
@ 2014-08-13 16:19               ` Ilya Verbin
  2014-08-18 16:08                 ` Ilya Verbin
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Verbin @ 2014-08-13 16:19 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Thomas Schwinge, Jakub Jelinek, GCC Patches

Hi,

Here is the updated patch.  offload_funcs/vars are now declared in omp-low.h,
the functions have a comment.  Also it fixes the issue of offload_funcs/vars
corruption by the garbage collector.  OK for gomp-4_0-branch?

  -- Ilya

---
 gcc/Makefile.in        |    1 +
 gcc/gengtype.c         |    2 +-
 gcc/lto-cgraph.c       |  110 ++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/lto-section-in.c   |    3 +-
 gcc/lto-streamer-out.c |    2 +
 gcc/lto-streamer.h     |    3 +
 gcc/lto/lto.c          |    2 +
 gcc/omp-low.c          |   68 ++++++------------------------
 gcc/omp-low.h          |    3 +
 9 files changed, 137 insertions(+), 57 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index bfa5f32..372f586 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2290,6 +2290,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/tree-profile.c $(srcdir)/tree-nested.c \
   $(srcdir)/tree-parloops.c \
   $(srcdir)/omp-low.c \
+  $(srcdir)/omp-low.h \
   $(srcdir)/targhooks.c $(out_file) $(srcdir)/passes.c $(srcdir)/cgraphunit.c \
   $(srcdir)/cgraphclones.c \
   $(srcdir)/tree-phinodes.c \
diff --git a/gcc/gengtype.c b/gcc/gengtype.c
index ffe3f94..5bcbbe2 100644
--- a/gcc/gengtype.c
+++ b/gcc/gengtype.c
@@ -1800,7 +1800,7 @@ open_base_files (void)
       "tree-ssa.h", "reload.h", "cpp-id-data.h", "tree-chrec.h",
       "except.h", "output.h",  "cfgloop.h",
       "target.h", "ipa-prop.h", "lto-streamer.h", "target-globals.h",
-      "ipa-inline.h", "dwarf2out.h", NULL
+      "ipa-inline.h", "dwarf2out.h", "omp-low.h", NULL
     };
     const char *const *ifp;
     outf_p gtype_desc_c;
diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index bc05400..64ad599 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "context.h"
 #include "pass_manager.h"
 #include "ipa-utils.h"
+#include "omp-low.h"
 
 /* True when asm nodes has been output.  */
 bool asm_nodes_output = false;
@@ -1044,6 +1045,66 @@ read_string (struct lto_input_block *ib)
   return str;
 }
 
+/* Output function/variable tables that will allow libgomp to look up offload
+   target code.  OFFLOAD_FUNCS is filled in expand_omp_target, OFFLOAD_VARS is
+   filled here just before streaming.  In WHOPR (partitioned) mode during the
+   WPA stage both OFFLOAD_FUNCS and OFFLOAD_VARS are filled by
+   input_offload_tables.  */
+
+void
+output_offload_tables (void)
+{
+  /* Collect all omp-target global variables to offload_vars, if they have not
+     been gathered earlier by input_offload_tables on the WPA stage.  */
+  if (!flag_wpa && vec_safe_is_empty (offload_vars))
+    {
+      struct varpool_node *vnode;
+      FOR_EACH_DEFINED_VARIABLE (vnode)
+	{
+	  if (!lookup_attribute ("omp declare target",
+				 DECL_ATTRIBUTES (vnode->decl))
+	      || TREE_CODE (vnode->decl) != VAR_DECL
+	      || DECL_SIZE (vnode->decl) == 0)
+	    continue;
+	  vec_safe_push (offload_vars, vnode->decl);
+	}
+    }
+
+  if (vec_safe_is_empty (offload_funcs) && vec_safe_is_empty (offload_vars))
+    return;
+
+  struct lto_simple_output_block *ob
+    = lto_create_simple_output_block (LTO_section_offload_table);
+
+  for (unsigned i = 0; i < vec_safe_length (offload_funcs); i++)
+    {
+      streamer_write_enum (ob->main_stream, LTO_symtab_tags,
+			   LTO_symtab_last_tag, LTO_symtab_unavail_node);
+      lto_output_fn_decl_index (ob->decl_state, ob->main_stream,
+				(*offload_funcs)[i]);
+    }
+
+  for (unsigned i = 0; i < vec_safe_length (offload_vars); i++)
+    {
+      streamer_write_enum (ob->main_stream, LTO_symtab_tags,
+			   LTO_symtab_last_tag, LTO_symtab_variable);
+      lto_output_var_decl_index (ob->decl_state, ob->main_stream,
+				 (*offload_vars)[i]);
+    }
+
+  streamer_write_uhwi_stream (ob->main_stream, 0);
+  lto_destroy_simple_output_block (ob);
+
+  /* In WHOPR mode during the WPA stage the joint offload tables need to be
+     streamed to one partition only.  That's why we free offload_funcs and
+     offload_vars after the first call of output_offload_tables.  */
+  if (flag_wpa)
+    {
+      vec_free (offload_funcs);
+      vec_free (offload_vars);
+    }
+}
+
 /* Overwrite the information in NODE based on FILE_DATA, TAG, FLAGS,
    STACK_SIZE, SELF_TIME and SELF_SIZE.  This is called either to initialize
    NODE or to replace the values in it, for instance because the first
@@ -1739,6 +1800,55 @@ input_symtab (void)
     }
 }
 
+/* Input function/variable tables that will allow libgomp to look up offload
+   target code, and store them into OFFLOAD_FUNCS and OFFLOAD_VARS.  */
+
+void
+input_offload_tables (void)
+{
+  struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
+  struct lto_file_decl_data *file_data;
+  unsigned int j = 0;
+
+  while ((file_data = file_data_vec[j++]))
+    {
+      const char *data;
+      size_t len;
+      struct lto_input_block *ib
+	= lto_create_simple_input_block (file_data, LTO_section_offload_table,
+					 &data, &len);
+      if (!ib)
+	continue;
+
+      enum LTO_symtab_tags tag
+	= streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag);
+      while (tag)
+	{
+	  if (tag == LTO_symtab_unavail_node)
+	    {
+	      int decl_index = streamer_read_uhwi (ib);
+	      tree fn_decl
+		= lto_file_decl_data_get_fn_decl (file_data, decl_index);
+	      vec_safe_push (offload_funcs, fn_decl);
+	    }
+	  else if (tag == LTO_symtab_variable)
+	    {
+	      int decl_index = streamer_read_uhwi (ib);
+	      tree var_decl
+		= lto_file_decl_data_get_var_decl (file_data, decl_index);
+	      vec_safe_push (offload_vars, var_decl);
+	    }
+	  else
+	    fatal_error ("invalid offload table in %s", file_data->file_name);
+
+	  tag = streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag);
+	}
+
+      lto_destroy_simple_input_block (file_data, LTO_section_offload_table,
+				      ib, data, len);
+    }
+}
+
 /* True when we need optimization summary for NODE.  */
 
 static int
diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c
index d887763..b705c75 100644
--- a/gcc/lto-section-in.c
+++ b/gcc/lto-section-in.c
@@ -60,7 +60,8 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
   "opts",
   "cgraphopt",
   "inline",
-  "ipcp_trans"
+  "ipcp_trans",
+  "offload_table"
 };
 
 
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 3064562..ff8572d 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -2108,6 +2108,8 @@ lto_output (void)
      statements using the statement UIDs.  */
   output_symtab ();
 
+  output_offload_tables ();
+
 #ifdef ENABLE_CHECKING
   lto_bitmap_free (output);
 #endif
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index eedec95..3607634 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -248,6 +248,7 @@ enum lto_section_type
   LTO_section_cgraph_opt_sum,
   LTO_section_inline_summary,
   LTO_section_ipcp_transform,
+  LTO_section_offload_table,
   LTO_N_SECTION_TYPES		/* Must be last.  */
 };
 
@@ -884,6 +885,8 @@ bool lto_symtab_encoder_encode_initializer_p (lto_symtab_encoder_t,
 					      varpool_node *);
 void output_symtab (void);
 void input_symtab (void);
+void output_offload_tables (void);
+void input_offload_tables (void);
 bool referenced_from_other_partition_p (struct ipa_ref_list *,
 				        lto_symtab_encoder_t);
 bool reachable_from_other_partition_p (struct cgraph_node *,
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 28c896d..a0b606c 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -3015,6 +3015,8 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames)
   /* Read the symtab.  */
   input_symtab ();
 
+  input_offload_tables ();
+
   /* Store resolutions into the symbol table.  */
 
   FOR_EACH_SYMBOL (snode)
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index ce97a0e..6bea2c3 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -240,6 +240,9 @@ omp_get_id (tree node)
 /* Holds a decl for __OPENMP_TARGET__.  */
 static GTY(()) tree offload_symbol_decl;
 
+/* Holds offload tables with decls.  */
+vec<tree, va_gc> *offload_funcs, *offload_vars;
+
 /* Get the __OPENMP_TARGET__ symbol.  */
 static tree
 get_offload_symbol_decl (void)
@@ -8906,6 +8909,9 @@ expand_omp_target (struct omp_region *region)
       DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
       cgraph_add_new_function (child_fn, true);
 
+      /* Add the new function to the offload table.  */
+      vec_safe_push (offload_funcs, child_fn);
+
       /* Fix the callgraph edges for child_cfun.  Those for cfun will be
 	 fixed in a following pass.  */
       push_cfun (child_cfun);
@@ -13730,71 +13736,23 @@ add_decls_addresses_to_decl_constructor (vec<tree, va_gc> *v_decls,
 void
 omp_finish_file (void)
 {
-  struct cgraph_node *node;
-  struct varpool_node *vnode;
   const char *funcs_section_name = OFFLOAD_FUNC_TABLE_SECTION_NAME;
   const char *vars_section_name = OFFLOAD_VAR_TABLE_SECTION_NAME;
-  vec<tree, va_gc> *v_funcs, *v_vars;
-
-  vec_alloc (v_vars, 0);
-  vec_alloc (v_funcs, 0);
-
-  /* Collect all omp-target functions.  */
-  FOR_EACH_DEFINED_FUNCTION (node)
-    {
-      /* TODO: This check could fail on functions, created by omp
-	 parallel/task pragmas.  It's better to name outlined for offloading
-	 functions in some different way and to check here the function name.
-	 It could be something like "*_omp_tgtfn" in contrast with "*_omp_fn"
-	 for functions from omp parallel/task pragmas.  */
-      if (!lookup_attribute ("omp declare target",
-			     DECL_ATTRIBUTES (node->decl))
-	  || !DECL_ARTIFICIAL (node->decl))
-	continue;
-      vec_safe_push (v_funcs, node->decl);
-    }
-  /* Collect all omp-target global variables.  */
-  FOR_EACH_DEFINED_VARIABLE (vnode)
-    {
-      if (!lookup_attribute ("omp declare target",
-			     DECL_ATTRIBUTES (vnode->decl))
-	  || TREE_CODE (vnode->decl) != VAR_DECL
-	  || DECL_SIZE (vnode->decl) == 0)
-	continue;
 
-      vec_safe_push (v_vars, vnode->decl);
-    }
-  unsigned num_vars = vec_safe_length (v_vars);
-  unsigned num_funcs = vec_safe_length (v_funcs);
+  unsigned num_funcs = vec_safe_length (offload_funcs);
+  unsigned num_vars = vec_safe_length (offload_vars);
 
-  if (num_vars == 0 && num_funcs == 0)
+  if (num_funcs == 0 && num_vars == 0)
     return;
 
-#ifdef ACCEL_COMPILER
-  /* Decls are placed in reversed order in fat-objects, so we need to
-     revert them back if we compile target.  */
-  for (unsigned i = 0; i < num_funcs / 2; i++)
-    {
-      tree it = (*v_funcs)[i];
-      (*v_funcs)[i] = (*v_funcs)[num_funcs - i - 1];
-      (*v_funcs)[num_funcs - i - 1] = it;
-    }
-  for (unsigned i = 0; i < num_vars / 2; i++)
-    {
-      tree it = (*v_vars)[i];
-      (*v_vars)[i] = (*v_vars)[num_vars - i - 1];
-      (*v_vars)[num_vars - i - 1] = it;
-    }
-#endif
-
   if (targetm_common.have_named_sections)
     {
       vec<constructor_elt, va_gc> *v_f, *v_v;
       vec_alloc (v_f, num_funcs);
       vec_alloc (v_v, num_vars * 2);
 
-      add_decls_addresses_to_decl_constructor (v_funcs, v_f);
-      add_decls_addresses_to_decl_constructor (v_vars, v_v);
+      add_decls_addresses_to_decl_constructor (offload_funcs, v_f);
+      add_decls_addresses_to_decl_constructor (offload_vars, v_v);
 
       tree vars_decl_type = build_array_type_nelts (pointer_sized_int_node,
 						    num_vars * 2);
@@ -13825,12 +13783,12 @@ omp_finish_file (void)
     {
       for (unsigned i = 0; i < num_funcs; i++)
 	{
-	  tree it = (*v_funcs)[i];
+	  tree it = (*offload_funcs)[i];
 	  targetm.record_offload_symbol (it);
 	}  
       for (unsigned i = 0; i < num_vars; i++)
 	{
-	  tree it = (*v_vars)[i];
+	  tree it = (*offload_vars)[i];
 	  targetm.record_offload_symbol (it);
 	}  
     }
diff --git a/gcc/omp-low.h b/gcc/omp-low.h
index f904eda..ac587d0 100644
--- a/gcc/omp-low.h
+++ b/gcc/omp-low.h
@@ -29,4 +29,7 @@ extern tree omp_reduction_init (tree, tree);
 extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
 extern void omp_finish_file (void);
 
+extern GTY(()) vec<tree, va_gc> *offload_funcs;
+extern GTY(()) vec<tree, va_gc> *offload_vars;
+
 #endif /* GCC_OMP_LOW_H */
-- 
1.7.1

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-08-13 16:19               ` Ilya Verbin
@ 2014-08-18 16:08                 ` Ilya Verbin
  2014-08-18 16:25                   ` Bernd Schmidt
  2014-09-03 19:24                   ` Thomas Schwinge
  0 siblings, 2 replies; 48+ messages in thread
From: Ilya Verbin @ 2014-08-18 16:08 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Thomas Schwinge, Jakub Jelinek, GCC Patches

Hi Bernd,

I discovered an issue in the LTO streaming out for target - currently any file (even without any pragma) compiled with -fopenmp/-fopenacc contains .gnu.target_lto_* sections.  This increases the size of an object file and makes lto-wrapper to run mkoffload.

Therefore, I propose to replace the condition before ipa_write_summaries:
- if (flag_openacc || flag_openmp)
+ if ((flag_openacc || flag_openmp) && !(vec_safe_is_empty (offload_funcs) && vec_safe_is_empty (offload_vars)))

But to do this, the offload_vars must be filled before the check (offload_funcs is already filled in expand_omp_target).
Here is the updated patch.  Bootstrap passed.  OK for gomp-4_0-branch?


On 13 Aug 20:19, Ilya Verbin wrote:
> Here is the updated patch.  offload_funcs/vars are now declared in omp-low.h,
> the functions have a comment.  Also it fixes the issue of offload_funcs/vars
> corruption by the garbage collector.  OK for gomp-4_0-branch?


---
 gcc/Makefile.in        |    1 +
 gcc/cgraphunit.c       |   25 ++++++++++++-
 gcc/gengtype.c         |    2 +-
 gcc/lto-cgraph.c       |   93 ++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/lto-section-in.c   |    3 +-
 gcc/lto-streamer-out.c |    2 +
 gcc/lto-streamer.h     |    3 ++
 gcc/lto/lto.c          |    2 +
 gcc/omp-low.c          |   74 ++++++++------------------------------
 gcc/omp-low.h          |    3 ++
 10 files changed, 147 insertions(+), 61 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index bfa5f32..372f586 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2290,6 +2290,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/tree-profile.c $(srcdir)/tree-nested.c \
   $(srcdir)/tree-parloops.c \
   $(srcdir)/omp-low.c \
+  $(srcdir)/omp-low.h \
   $(srcdir)/targhooks.c $(out_file) $(srcdir)/passes.c $(srcdir)/cgraphunit.c \
   $(srcdir)/cgraphclones.c \
   $(srcdir)/tree-phinodes.c \
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 308c534..f0c9f5c 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -211,6 +211,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-nested.h"
 #include "gimplify.h"
 #include "dbgcnt.h"
+#include "omp-low.h"
 #include "lto-section-names.h"
 
 /* Queue of cgraph nodes scheduled to be added into cgraph.  This is a
@@ -2039,6 +2040,24 @@ output_in_order (void)
   free (nodes);
 }
 
+/* Collect all global variables with "omp declare target" attribute into
+   OFFLOAD_VARS.  It will be streamed out in ipa_write_summaries.  */
+
+static void
+init_offload_var_table (void)
+{
+  struct varpool_node *vnode;
+  FOR_EACH_DEFINED_VARIABLE (vnode)
+    {
+      if (!lookup_attribute ("omp declare target",
+			     DECL_ATTRIBUTES (vnode->decl))
+	  || TREE_CODE (vnode->decl) != VAR_DECL
+	  || DECL_SIZE (vnode->decl) == 0)
+	continue;
+      vec_safe_push (offload_vars, vnode->decl);
+    }
+}
+
 static void
 ipa_passes (void)
 {
@@ -2089,7 +2108,11 @@ ipa_passes (void)
 
   if (!in_lto_p)
     {
-      if (flag_openacc || flag_openmp)
+      init_offload_var_table ();
+
+      if ((flag_openacc || flag_openmp)
+	  && !(vec_safe_is_empty (offload_funcs)
+	       && vec_safe_is_empty (offload_vars)))
 	{
 	  section_name_prefix = OMP_SECTION_NAME_PREFIX;
 	  ipa_write_summaries (true);
diff --git a/gcc/gengtype.c b/gcc/gengtype.c
index ffe3f94..5bcbbe2 100644
--- a/gcc/gengtype.c
+++ b/gcc/gengtype.c
@@ -1800,7 +1800,7 @@ open_base_files (void)
       "tree-ssa.h", "reload.h", "cpp-id-data.h", "tree-chrec.h",
       "except.h", "output.h",  "cfgloop.h",
       "target.h", "ipa-prop.h", "lto-streamer.h", "target-globals.h",
-      "ipa-inline.h", "dwarf2out.h", NULL
+      "ipa-inline.h", "dwarf2out.h", "omp-low.h", NULL
     };
     const char *const *ifp;
     outf_p gtype_desc_c;
diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index bc05400..8fb7078 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -52,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "context.h"
 #include "pass_manager.h"
 #include "ipa-utils.h"
+#include "omp-low.h"
 
 /* True when asm nodes has been output.  */
 bool asm_nodes_output = false;
@@ -1044,6 +1045,49 @@ read_string (struct lto_input_block *ib)
   return str;
 }
 
+/* Output function/variable tables that will allow libgomp to look up offload
+   target code.  OFFLOAD_FUNCS is filled in expand_omp_target, OFFLOAD_VARS is
+   filled in ipa_passes.  In WHOPR (partitioned) mode during the WPA stage both
+   OFFLOAD_FUNCS and OFFLOAD_VARS are filled by input_offload_tables.  */
+
+void
+output_offload_tables (void)
+{
+  if (vec_safe_is_empty (offload_funcs) && vec_safe_is_empty (offload_vars))
+    return;
+
+  struct lto_simple_output_block *ob
+    = lto_create_simple_output_block (LTO_section_offload_table);
+
+  for (unsigned i = 0; i < vec_safe_length (offload_funcs); i++)
+    {
+      streamer_write_enum (ob->main_stream, LTO_symtab_tags,
+			   LTO_symtab_last_tag, LTO_symtab_unavail_node);
+      lto_output_fn_decl_index (ob->decl_state, ob->main_stream,
+				(*offload_funcs)[i]);
+    }
+
+  for (unsigned i = 0; i < vec_safe_length (offload_vars); i++)
+    {
+      streamer_write_enum (ob->main_stream, LTO_symtab_tags,
+			   LTO_symtab_last_tag, LTO_symtab_variable);
+      lto_output_var_decl_index (ob->decl_state, ob->main_stream,
+				 (*offload_vars)[i]);
+    }
+
+  streamer_write_uhwi_stream (ob->main_stream, 0);
+  lto_destroy_simple_output_block (ob);
+
+  /* In WHOPR mode during the WPA stage the joint offload tables need to be
+     streamed to one partition only.  That's why we free offload_funcs and
+     offload_vars after the first call of output_offload_tables.  */
+  if (flag_wpa)
+    {
+      vec_free (offload_funcs);
+      vec_free (offload_vars);
+    }
+}
+
 /* Overwrite the information in NODE based on FILE_DATA, TAG, FLAGS,
    STACK_SIZE, SELF_TIME and SELF_SIZE.  This is called either to initialize
    NODE or to replace the values in it, for instance because the first
@@ -1739,6 +1783,55 @@ input_symtab (void)
     }
 }
 
+/* Input function/variable tables that will allow libgomp to look up offload
+   target code, and store them into OFFLOAD_FUNCS and OFFLOAD_VARS.  */
+
+void
+input_offload_tables (void)
+{
+  struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
+  struct lto_file_decl_data *file_data;
+  unsigned int j = 0;
+
+  while ((file_data = file_data_vec[j++]))
+    {
+      const char *data;
+      size_t len;
+      struct lto_input_block *ib
+	= lto_create_simple_input_block (file_data, LTO_section_offload_table,
+					 &data, &len);
+      if (!ib)
+	continue;
+
+      enum LTO_symtab_tags tag
+	= streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag);
+      while (tag)
+	{
+	  if (tag == LTO_symtab_unavail_node)
+	    {
+	      int decl_index = streamer_read_uhwi (ib);
+	      tree fn_decl
+		= lto_file_decl_data_get_fn_decl (file_data, decl_index);
+	      vec_safe_push (offload_funcs, fn_decl);
+	    }
+	  else if (tag == LTO_symtab_variable)
+	    {
+	      int decl_index = streamer_read_uhwi (ib);
+	      tree var_decl
+		= lto_file_decl_data_get_var_decl (file_data, decl_index);
+	      vec_safe_push (offload_vars, var_decl);
+	    }
+	  else
+	    fatal_error ("invalid offload table in %s", file_data->file_name);
+
+	  tag = streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag);
+	}
+
+      lto_destroy_simple_input_block (file_data, LTO_section_offload_table,
+				      ib, data, len);
+    }
+}
+
 /* True when we need optimization summary for NODE.  */
 
 static int
diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c
index d887763..b705c75 100644
--- a/gcc/lto-section-in.c
+++ b/gcc/lto-section-in.c
@@ -60,7 +60,8 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
   "opts",
   "cgraphopt",
   "inline",
-  "ipcp_trans"
+  "ipcp_trans",
+  "offload_table"
 };
 
 
diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 3064562..ff8572d 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -2108,6 +2108,8 @@ lto_output (void)
      statements using the statement UIDs.  */
   output_symtab ();
 
+  output_offload_tables ();
+
 #ifdef ENABLE_CHECKING
   lto_bitmap_free (output);
 #endif
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index eedec95..3607634 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -248,6 +248,7 @@ enum lto_section_type
   LTO_section_cgraph_opt_sum,
   LTO_section_inline_summary,
   LTO_section_ipcp_transform,
+  LTO_section_offload_table,
   LTO_N_SECTION_TYPES		/* Must be last.  */
 };
 
@@ -884,6 +885,8 @@ bool lto_symtab_encoder_encode_initializer_p (lto_symtab_encoder_t,
 					      varpool_node *);
 void output_symtab (void);
 void input_symtab (void);
+void output_offload_tables (void);
+void input_offload_tables (void);
 bool referenced_from_other_partition_p (struct ipa_ref_list *,
 				        lto_symtab_encoder_t);
 bool reachable_from_other_partition_p (struct cgraph_node *,
diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index 28c896d..a0b606c 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -3015,6 +3015,8 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames)
   /* Read the symtab.  */
   input_symtab ();
 
+  input_offload_tables ();
+
   /* Store resolutions into the symbol table.  */
 
   FOR_EACH_SYMBOL (snode)
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index ce97a0e..1ad98ab 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -240,6 +240,9 @@ omp_get_id (tree node)
 /* Holds a decl for __OPENMP_TARGET__.  */
 static GTY(()) tree offload_symbol_decl;
 
+/* Holds offload tables with decls.  */
+vec<tree, va_gc> *offload_funcs, *offload_vars;
+
 /* Get the __OPENMP_TARGET__ symbol.  */
 static tree
 get_offload_symbol_decl (void)
@@ -8906,6 +8909,9 @@ expand_omp_target (struct omp_region *region)
       DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
       cgraph_add_new_function (child_fn, true);
 
+      /* Add the new function to the offload table.  */
+      vec_safe_push (offload_funcs, child_fn);
+
       /* Fix the callgraph edges for child_cfun.  Those for cfun will be
 	 fixed in a following pass.  */
       push_cfun (child_cfun);
@@ -13730,71 +13736,23 @@ add_decls_addresses_to_decl_constructor (vec<tree, va_gc> *v_decls,
 void
 omp_finish_file (void)
 {
-  struct cgraph_node *node;
-  struct varpool_node *vnode;
   const char *funcs_section_name = OFFLOAD_FUNC_TABLE_SECTION_NAME;
   const char *vars_section_name = OFFLOAD_VAR_TABLE_SECTION_NAME;
-  vec<tree, va_gc> *v_funcs, *v_vars;
-
-  vec_alloc (v_vars, 0);
-  vec_alloc (v_funcs, 0);
-
-  /* Collect all omp-target functions.  */
-  FOR_EACH_DEFINED_FUNCTION (node)
-    {
-      /* TODO: This check could fail on functions, created by omp
-	 parallel/task pragmas.  It's better to name outlined for offloading
-	 functions in some different way and to check here the function name.
-	 It could be something like "*_omp_tgtfn" in contrast with "*_omp_fn"
-	 for functions from omp parallel/task pragmas.  */
-      if (!lookup_attribute ("omp declare target",
-			     DECL_ATTRIBUTES (node->decl))
-	  || !DECL_ARTIFICIAL (node->decl))
-	continue;
-      vec_safe_push (v_funcs, node->decl);
-    }
-  /* Collect all omp-target global variables.  */
-  FOR_EACH_DEFINED_VARIABLE (vnode)
-    {
-      if (!lookup_attribute ("omp declare target",
-			     DECL_ATTRIBUTES (vnode->decl))
-	  || TREE_CODE (vnode->decl) != VAR_DECL
-	  || DECL_SIZE (vnode->decl) == 0)
-	continue;
 
-      vec_safe_push (v_vars, vnode->decl);
-    }
-  unsigned num_vars = vec_safe_length (v_vars);
-  unsigned num_funcs = vec_safe_length (v_funcs);
+  unsigned num_funcs = vec_safe_length (offload_funcs);
+  unsigned num_vars = vec_safe_length (offload_vars);
 
-  if (num_vars == 0 && num_funcs == 0)
+  if (num_funcs == 0 && num_vars == 0)
     return;
 
-#ifdef ACCEL_COMPILER
-  /* Decls are placed in reversed order in fat-objects, so we need to
-     revert them back if we compile target.  */
-  for (unsigned i = 0; i < num_funcs / 2; i++)
-    {
-      tree it = (*v_funcs)[i];
-      (*v_funcs)[i] = (*v_funcs)[num_funcs - i - 1];
-      (*v_funcs)[num_funcs - i - 1] = it;
-    }
-  for (unsigned i = 0; i < num_vars / 2; i++)
-    {
-      tree it = (*v_vars)[i];
-      (*v_vars)[i] = (*v_vars)[num_vars - i - 1];
-      (*v_vars)[num_vars - i - 1] = it;
-    }
-#endif
-
   if (targetm_common.have_named_sections)
     {
       vec<constructor_elt, va_gc> *v_f, *v_v;
       vec_alloc (v_f, num_funcs);
       vec_alloc (v_v, num_vars * 2);
 
-      add_decls_addresses_to_decl_constructor (v_funcs, v_f);
-      add_decls_addresses_to_decl_constructor (v_vars, v_v);
+      add_decls_addresses_to_decl_constructor (offload_funcs, v_f);
+      add_decls_addresses_to_decl_constructor (offload_vars, v_v);
 
       tree vars_decl_type = build_array_type_nelts (pointer_sized_int_node,
 						    num_vars * 2);
@@ -13817,7 +13775,7 @@ omp_finish_file (void)
       DECL_INITIAL (vars_decl) = ctor_v;
       set_decl_section_name (funcs_decl, funcs_section_name);
       set_decl_section_name (vars_decl, vars_section_name);
- 
+
       varpool_assemble_decl (varpool_node_for_decl (vars_decl));
       varpool_assemble_decl (varpool_node_for_decl (funcs_decl));
    }
@@ -13825,14 +13783,14 @@ omp_finish_file (void)
     {
       for (unsigned i = 0; i < num_funcs; i++)
 	{
-	  tree it = (*v_funcs)[i];
+	  tree it = (*offload_funcs)[i];
 	  targetm.record_offload_symbol (it);
-	}  
+	}
       for (unsigned i = 0; i < num_vars; i++)
 	{
-	  tree it = (*v_vars)[i];
+	  tree it = (*offload_vars)[i];
 	  targetm.record_offload_symbol (it);
-	}  
+	}
     }
 }
 
diff --git a/gcc/omp-low.h b/gcc/omp-low.h
index f904eda..ac587d0 100644
--- a/gcc/omp-low.h
+++ b/gcc/omp-low.h
@@ -29,4 +29,7 @@ extern tree omp_reduction_init (tree, tree);
 extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
 extern void omp_finish_file (void);
 
+extern GTY(()) vec<tree, va_gc> *offload_funcs;
+extern GTY(()) vec<tree, va_gc> *offload_vars;
+
 #endif /* GCC_OMP_LOW_H */
-- 
1.7.1


Thanks,
   -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-08-18 16:08                 ` Ilya Verbin
@ 2014-08-18 16:25                   ` Bernd Schmidt
  2014-08-19 10:41                     ` Ilya Verbin
  2014-09-03 19:24                   ` Thomas Schwinge
  1 sibling, 1 reply; 48+ messages in thread
From: Bernd Schmidt @ 2014-08-18 16:25 UTC (permalink / raw)
  To: Ilya Verbin; +Cc: Thomas Schwinge, Jakub Jelinek, GCC Patches

On 08/18/2014 06:07 PM, Ilya Verbin wrote:
> I discovered an issue in the LTO streaming out for target - currently any file (even without any pragma) compiled with -fopenmp/-fopenacc contains .gnu.target_lto_* sections.  This increases the size of an object file and makes lto-wrapper to run mkoffload.
>
> Therefore, I propose to replace the condition before ipa_write_summaries:
> - if (flag_openacc || flag_openmp)
> + if ((flag_openacc || flag_openmp) && !(vec_safe_is_empty (offload_funcs) && vec_safe_is_empty (offload_vars)))
>
> But to do this, the offload_vars must be filled before the check (offload_funcs is already filled in expand_omp_target).
> Here is the updated patch.  Bootstrap passed.  OK for gomp-4_0-branch?

I think I'd be happier if the function was called init_offload_table and 
also collected functions, rather than leaving that to expand_omp_target. 
I think the patch would be ok with that change.


Bernd

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-08-18 16:25                   ` Bernd Schmidt
@ 2014-08-19 10:41                     ` Ilya Verbin
  2014-08-19 11:55                       ` Bernd Schmidt
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Verbin @ 2014-08-19 10:41 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Thomas Schwinge, Jakub Jelinek, GCC Patches

On 18 Aug 18:24, Bernd Schmidt wrote:
> I think I'd be happier if the function was called init_offload_table
> and also collected functions, rather than leaving that to
> expand_omp_target. I think the patch would be ok with that change.

For the functions it's not so easy to identify which of them to add into the table, e.g.:
  #pragma omp target
    #pragma omp parallel
      x++;
Here 2 functions with "omp declare target" attribute are created.  But only the outer must be added to the table.
So I believe that expand_omp_target is better place for the functions.

  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-08-19 10:41                     ` Ilya Verbin
@ 2014-08-19 11:55                       ` Bernd Schmidt
  2014-08-19 12:26                         ` Ilya Verbin
  2014-08-19 13:27                         ` Ilya Verbin
  0 siblings, 2 replies; 48+ messages in thread
From: Bernd Schmidt @ 2014-08-19 11:55 UTC (permalink / raw)
  To: Ilya Verbin; +Cc: Thomas Schwinge, Jakub Jelinek, GCC Patches

On 08/19/2014 12:41 PM, Ilya Verbin wrote:
> On 18 Aug 18:24, Bernd Schmidt wrote:
>> I think I'd be happier if the function was called init_offload_table
>> and also collected functions, rather than leaving that to
>> expand_omp_target. I think the patch would be ok with that change.
>
> For the functions it's not so easy to identify which of them to add into the table, e.g.:
>    #pragma omp target
>      #pragma omp parallel
>        x++;
> Here 2 functions with "omp declare target" attribute are created.  But only the outer must be added to the table.
> So I believe that expand_omp_target is better place for the functions.

Hmm, ok. Can you elaborate how this happens and why only one must be 
added to the table?

In any case, I think let's get this checked in for now and iterate later.


Bernd


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-08-19 11:55                       ` Bernd Schmidt
@ 2014-08-19 12:26                         ` Ilya Verbin
  2014-08-19 13:27                         ` Ilya Verbin
  1 sibling, 0 replies; 48+ messages in thread
From: Ilya Verbin @ 2014-08-19 12:26 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Thomas Schwinge, Jakub Jelinek, GCC Patches

On 19 Aug 13:55, Bernd Schmidt wrote:
> On 08/19/2014 12:41 PM, Ilya Verbin wrote:
> >For the functions it's not so easy to identify which of them to add into the table, e.g.:
> >   #pragma omp target
> >     #pragma omp parallel
> >       x++;
> >Here 2 functions with "omp declare target" attribute are created.  But only the outer must be added to the table.
> >So I believe that expand_omp_target is better place for the functions.
> 
> Hmm, ok. Can you elaborate how this happens and why only one must be
> added to the table?

Here is gimple for this testcase:

foo ()
{
  /* prepare data */
  __builtin_GOMP_target (-1, foo._omp_fn.0, /* data */);
}

foo._omp_fn.0 (struct .omp_data_t.0 * .omp_data_i)
{
  /* prepare data */
  __builtin_GOMP_parallel (foo._omp_fn.1, /* data */);
}

foo._omp_fn.1 (struct .omp_data_s.1 * .omp_data_i)
{
  _3 = .omp_data_i_2(D)->x;
  _4 = _3 + 1;
  .omp_data_i_2(D)->x = _4;
}

Both fn.0 and fn.1 can be executed on host and on target, therefore they have "omp declare target" attribute.  And there are 2 alternatives during runtime:
1. GOMP_target calls fn.0 on host, which calls fn.1 on host.
2. GOMP_target offloads fn.0 and fn.1, and runs fn.0 on target, which calls fn.1.
So, there is only one "entry point" for GOMP_target - fn.0, and GOMP_target can't run fn.1 on target directly, that's why only fn.0 must be added to the table.

  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-08-19 11:55                       ` Bernd Schmidt
  2014-08-19 12:26                         ` Ilya Verbin
@ 2014-08-19 13:27                         ` Ilya Verbin
  2014-09-02 17:50                           ` Ilya Verbin
  1 sibling, 1 reply; 48+ messages in thread
From: Ilya Verbin @ 2014-08-19 13:27 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Thomas Schwinge, Jakub Jelinek, GCC Patches

On 19 Aug 13:55, Bernd Schmidt wrote:
> In any case, I think let's get this checked in for now and iterate later.

Committed revision 214148.

  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-08-19 13:27                         ` Ilya Verbin
@ 2014-09-02 17:50                           ` Ilya Verbin
  2014-09-04 13:08                             ` Thomas Schwinge
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Verbin @ 2014-09-02 17:50 UTC (permalink / raw)
  To: Bernd Schmidt; +Cc: Thomas Schwinge, Jakub Jelinek, GCC Patches

Hi Bernd,

This patch allows to compile binaries with offloading without passing -flto option, and
w/o performing link-time optimizations of the host code.

How it works:
1.  If there is at least one function or global variable to offload, gcc sets flag_generate_lto.
This enables writing the bytecode produced by ipa_write_summaries into
.gnu.target_lto_* sections (.gnu.lto_* sections are not created).
Also this flag emits LTO marker (__gnu_lto_v1).
2.  This step is not changed: collect2 scans object files for the LTO marker and fills the list
of LTO objects.  If the list is not empty, it runs lto-wrapper to perform link-time recompilation.
3.  lto-wrapper compiles images for targets.  And if -flto option is absent
(lto_mode == LTO_MODE_NONE), then it just returns the list of input objects without recompilation.

One known issue -- the final binary contains temporary .gnu.target_lto_* sections.
This can be solved by adding the following linker script to the list of input files:
SECTIONS { /DISCARD/ : { *(.gnu.target_lto_*) } }
But I'm sure what is the best way to this automatically.

Bootstrap and make check passed, tests with '#pragma omp target' without -flto passed.
What do you think?

Thanks,
  -- Ilya


---
 gcc/cgraphunit.c  | 39 +++++++++++++++++++++++--------
 gcc/lto-wrapper.c | 68 +++++++++++++++++++++++++++++--------------------------
 gcc/omp-low.c     |  6 +++++
 gcc/passes.c      |  2 +-
 4 files changed, 73 insertions(+), 42 deletions(-)

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index f0c9f5c..32b35f3 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2040,13 +2040,26 @@ output_in_order (void)
   free (nodes);
 }
 
-/* Collect all global variables with "omp declare target" attribute into
-   OFFLOAD_VARS.  It will be streamed out in ipa_write_summaries.  */
+/* Check whether there is at least one function or global variable to offload.
+   Also collect all such global variables into OFFLOAD_VARS, the functions were
+   already collected in omp-low.c.  They will be streamed out in
+   ipa_write_summaries.  */
 
-static void
-init_offload_var_table (void)
+static bool
+initialize_offload (void)
 {
+  bool have_offload = false;
+  struct cgraph_node *node;
   struct varpool_node *vnode;
+
+  FOR_EACH_DEFINED_FUNCTION (node)
+    if (lookup_attribute ("omp declare target", DECL_ATTRIBUTES (node->decl)))
+      {
+	have_offload = true;
+	break;
+      }
+
   FOR_EACH_DEFINED_VARIABLE (vnode)
     {
       if (!lookup_attribute ("omp declare target",
@@ -2054,13 +2067,17 @@ init_offload_var_table (void)
 	  || TREE_CODE (vnode->decl) != VAR_DECL
 	  || DECL_SIZE (vnode->decl) == 0)
 	continue;
+      have_offload = true;
       vec_safe_push (offload_vars, vnode->decl);
     }
+
+  return have_offload;
 }
 
 static void
 ipa_passes (void)
 {
+  bool have_offload = false;
   gcc::pass_manager *passes = g->get_passes ();
 
   set_cfun (NULL);
@@ -2068,6 +2085,14 @@ ipa_passes (void)
   gimple_register_cfg_hooks ();
   bitmap_obstack_initialize (NULL);
 
+  if (!in_lto_p && (flag_openacc || flag_openmp))
+    {
+      have_offload = initialize_offload ();
+      /* OpenACC / OpenMP offloading requires LTO infrastructure.  */
+      if (have_offload)
+	flag_generate_lto = 1;
+    }
+
   invoke_plugin_callbacks (PLUGIN_ALL_IPA_PASSES_START, NULL);
 
   if (!in_lto_p)
@@ -2108,11 +2133,7 @@ ipa_passes (void)
 
   if (!in_lto_p)
     {
-      init_offload_var_table ();
-
-      if ((flag_openacc || flag_openmp)
-	  && !(vec_safe_is_empty (offload_funcs)
-	       && vec_safe_is_empty (offload_vars)))
+      if (have_offload)
 	{
 	  section_name_prefix = OMP_SECTION_NAME_PREFIX;
 	  ipa_write_summaries (true);
diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 80d10f3..e9245f1 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -668,6 +668,11 @@ run_gcc (unsigned argc, char *argv[])
 	  close (fd);
 	  continue;
 	}
+      /* We may choose not to write out this .opts section in the future.  In
+	 that case we'll have to use something else to look for.  */
+      if (simple_object_find_section (sobj, OMP_SECTION_NAME_PREFIX "." "opts",
+				      &offset, &length, &errmsg, &err))
+	have_offload = true;
       if (!simple_object_find_section (sobj, LTO_SECTION_NAME_PREFIX "." "opts",
 				       &offset, &length, &errmsg, &err))
 	{
@@ -675,11 +680,6 @@ run_gcc (unsigned argc, char *argv[])
 	  close (fd);
 	  continue;
 	}
-      /* We may choose not to write out this .opts section in the future.  In
-	 that case we'll have to use something else to look for.  */
-      if (simple_object_find_section (sobj, OMP_SECTION_NAME_PREFIX "." "opts",
-				      &offset, &length, &errmsg, &err))
-	have_offload = true;
       lseek (fd, file_offset + offset, SEEK_SET);
       data = (char *)xmalloc (length);
       read (fd, data, length);
@@ -871,7 +871,31 @@ run_gcc (unsigned argc, char *argv[])
   /* Remember at which point we can scrub args to re-use the commons.  */
   new_head_argc = obstack_object_size (&argv_obstack) / sizeof (void *);
 
-  if (lto_mode == LTO_MODE_LTO)
+  if (have_offload)
+    {
+      compile_images_for_openmp_targets (argc, argv);
+      if (offload_names)
+	{
+	  find_ompbeginend ();
+	  for (i = 0; offload_names[i]; i++)
+	    printf ("%s\n", offload_names[i]);
+	  free_array_of_ptrs ((void **) offload_names, i);
+	}
+    }
+
+  if (ompbegin)
+    printf ("%s\n", ompbegin);
+
+  if (lto_mode == LTO_MODE_NONE)
+    {
+      /* If we are in lto-wrapper, but -flto option is absent, it means that
+	 there is no need to perform a link-time recompilation, i.e. lto-wrapper
+	 is used only for compiling offload images.  */
+      for (i = 1; i < argc; ++i)
+	printf ("%s\n", argv[i]);
+      goto finish;
+    }
+  else if (lto_mode == LTO_MODE_LTO)
     {
       flto_out = make_temp_file (".lto.o");
       if (linker_output)
@@ -879,7 +903,7 @@ run_gcc (unsigned argc, char *argv[])
       obstack_ptr_grow (&argv_obstack, "-o");
       obstack_ptr_grow (&argv_obstack, flto_out);
     }
-  else 
+  else if (lto_mode == LTO_MODE_WHOPR)
     {
       const char *list_option = "-fltrans-output-list=";
       size_t list_option_len = strlen (list_option);
@@ -939,7 +963,7 @@ run_gcc (unsigned argc, char *argv[])
       free (flto_out);
       flto_out = NULL;
     }
-  else
+  else if (lto_mode == LTO_MODE_WHOPR)
     {
       FILE *stream = fopen (ltrans_output_file, "r");
       FILE *mstream = NULL;
@@ -1084,25 +1108,6 @@ cont:
 	  for (i = 0; i < nr; ++i)
 	    maybe_unlink (input_names[i]);
 	}
-      if (have_offload)
-	{
-	  compile_images_for_openmp_targets (argc, argv);
-	  if (offload_names)
-	    {
-	      find_ompbeginend ();
-	      for (i = 0; offload_names[i]; i++)
-		{
-		  fputs (offload_names[i], stdout);
-		  putc ('\n', stdout);
-		}
-	      free_array_of_ptrs ((void **)offload_names, i);
-	    }
-	}
-      if (ompbegin)
-	{
-	  fputs (ompbegin, stdout);
-	  putc ('\n', stdout);
-	}
 
       for (i = 0; i < nr; ++i)
 	{
@@ -1110,11 +1115,6 @@ cont:
 	  putc ('\n', stdout);
 	  free (input_names[i]);
 	}
-      if (ompend)
-	{
-	  fputs (ompend, stdout);
-	  putc ('\n', stdout);
-	}
       nr = 0;
       free (output_names);
       free (input_names);
@@ -1122,6 +1122,10 @@ cont:
       obstack_free (&env_obstack, NULL);
     }
 
+finish:
+  if (ompend)
+    printf ("%s\n", ompend);
+
   obstack_free (&argv_obstack, NULL);
 }
 
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 1ad98ab..9289031 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -13771,6 +13771,12 @@ omp_finish_file (void)
 				   get_identifier (".omp_var_table"),
 				   vars_decl_type);
       TREE_STATIC (funcs_decl) = TREE_STATIC (vars_decl) = 1;
+      /* Do not align tables more than TYPE_ALIGN (pointer_sized_int_node),
+	 otherwise a joint table in a binary will contain padding between
+	 tables from multiple object files.  */
+      DECL_USER_ALIGN (funcs_decl) = DECL_USER_ALIGN (vars_decl) = 1;
+      DECL_ALIGN (funcs_decl) = TYPE_ALIGN (funcs_decl_type);
+      DECL_ALIGN (vars_decl) = TYPE_ALIGN (vars_decl_type);
       DECL_INITIAL (funcs_decl) = ctor_f;
       DECL_INITIAL (vars_decl) = ctor_v;
       set_decl_section_name (funcs_decl, funcs_section_name);
diff --git a/gcc/passes.c b/gcc/passes.c
index 8172185..e776059 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -2303,7 +2303,7 @@ ipa_write_summaries (bool is_omp)
   struct cgraph_node *node;
   struct cgraph_node **order;
 
-  if (!(flag_generate_lto || flag_openacc || flag_openmp) || seen_error () )
+  if (!flag_generate_lto || seen_error ())
     return;
 
   select_what_to_dump (is_omp);
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-08-18 16:08                 ` Ilya Verbin
  2014-08-18 16:25                   ` Bernd Schmidt
@ 2014-09-03 19:24                   ` Thomas Schwinge
  1 sibling, 0 replies; 48+ messages in thread
From: Thomas Schwinge @ 2014-09-03 19:24 UTC (permalink / raw)
  To: Ilya Verbin; +Cc: Jakub Jelinek, GCC Patches, Bernd Schmidt

[-- Attachment #1: Type: text/plain, Size: 3139 bytes --]

Hi!

On Mon, 18 Aug 2014 20:07:59 +0400, Ilya Verbin <iverbin@gmail.com> wrote:
> I discovered an issue in the LTO streaming out for target - currently any file (even without any pragma) compiled with -fopenmp/-fopenacc contains .gnu.target_lto_* sections.  This increases the size of an object file and makes lto-wrapper to run mkoffload.
> 
> Therefore, I propose to replace the condition before ipa_write_summaries:
> - if (flag_openacc || flag_openmp)
> + if ((flag_openacc || flag_openmp) && !(vec_safe_is_empty (offload_funcs) && vec_safe_is_empty (offload_vars)))
> 
> But to do this, the offload_vars must be filled before the check (offload_funcs is already filled in expand_omp_target).
> Here is the updated patch.  Bootstrap passed.  OK for gomp-4_0-branch?
> 
> 
> On 13 Aug 20:19, Ilya Verbin wrote:
> > Here is the updated patch.  offload_funcs/vars are now declared in omp-low.h,
> > the functions have a comment.  Also it fixes the issue of offload_funcs/vars
> > corruption by the garbage collector.  OK for gomp-4_0-branch?

> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c

> @@ -8906,6 +8909,9 @@ expand_omp_target (struct omp_region *region)
>        DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
>        cgraph_add_new_function (child_fn, true);
>  
> +      /* Add the new function to the offload table.  */
> +      vec_safe_push (offload_funcs, child_fn);
> +
>        /* Fix the callgraph edges for child_cfun.  Those for cfun will be
>  	 fixed in a following pass.  */
>        push_cfun (child_cfun);

The same change needs to be done for OpenACC offloading; addressed in
r214892:

commit 9fb900482bd3bca9bfa89301e417174caabd7176
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Wed Sep 3 19:10:43 2014 +0000

    Restore OpenACC offloading.
    
    	gcc/
    	* omp-low.c (expand_oacc_offload): Add child_fn to offload_funcs.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@214892 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp | 4 ++++
 gcc/omp-low.c      | 3 +++
 2 files changed, 7 insertions(+)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 40688df..0c55814 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,7 @@
+2014-09-03  Thomas Schwinge  <thomas@codesourcery.com>
+
+	* omp-low.c (expand_oacc_offload): Add child_fn to offload_funcs.
+
 2014-08-19  Ilya Verbin  <ilya.verbin@intel.com>
 
 	* Makefile.in (GTFILES): Add omp-low.h.
diff --git gcc/omp-low.c gcc/omp-low.c
index 1ad98ab..6ed8239 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -5351,6 +5351,9 @@ expand_oacc_offload (struct omp_region *region)
       DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties;
       cgraph_add_new_function (child_fn, true);
 
+      /* Add the new function to the offload table.  */
+      vec_safe_push (offload_funcs, child_fn);
+
       /* Fix the callgraph edges for child_cfun.  Those for cfun will be
 	 fixed in a following pass.  */
       push_cfun (child_cfun);


Grüße,
 Thomas

[-- Attachment #2: Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-09-02 17:50                           ` Ilya Verbin
@ 2014-09-04 13:08                             ` Thomas Schwinge
  2014-09-04 13:47                               ` Ilya Verbin
  2014-09-04 13:54                               ` Thomas Schwinge
  0 siblings, 2 replies; 48+ messages in thread
From: Thomas Schwinge @ 2014-09-04 13:08 UTC (permalink / raw)
  To: Ilya Verbin, Bernd Schmidt; +Cc: Jakub Jelinek, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 2014 bytes --]

Hi!

On Tue, 2 Sep 2014 21:49:46 +0400, Ilya Verbin <iverbin@gmail.com> wrote:
> This patch allows to compile binaries with offloading without passing -flto option, and
> w/o performing link-time optimizations of the host code.

Thanks for working on this!

> How it works:
> 1.  If there is at least one function or global variable to offload, gcc sets flag_generate_lto.
> This enables writing the bytecode produced by ipa_write_summaries into
> .gnu.target_lto_* sections (.gnu.lto_* sections are not created).
> Also this flag emits LTO marker (__gnu_lto_v1).
> 2.  This step is not changed: collect2 scans object files for the LTO marker and fills the list
> of LTO objects.  If the list is not empty, it runs lto-wrapper to perform link-time recompilation.
> 3.  lto-wrapper compiles images for targets.  And if -flto option is absent
> (lto_mode == LTO_MODE_NONE), then it just returns the list of input objects without recompilation.

That seems sane to me.  (But you guys have looked into this design/code
in much more detail than I have.)

I'm facing one problem; I guess the crucial detail is that in my scenario
I'm using the linker plugin.  The lto-wrapper is not being executed (and
thus no mkoffload being run), because »num_claimed_files == 0«.  In
lto-plugin/lto-plugin.c:process_symtab, only LTO_SECTION_PREFIX
(".gnu.lto_.symtab") is considered, which (correctly so) is not generated
anymore by GCC in the new scenario, but ".gnu.target_lto_" is not
considered there.  (Should this maybe look only for the LTO marker
"__gnu_lto_v1", or am I misunderstanding what this is doing?)  If I make
that also accept the offloading section, the compilation process proceeds
further, but still fails, because no resolution file is available:
»[...]/ld: cannot find -fresolution=/tmp/cc7xeiW0.res: No such file or
directory«.  Is this enough information for someone who is more familiar
with the design/code to already see what needs to be done?


Grüße,
 Thomas

[-- Attachment #2: Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-09-04 13:08                             ` Thomas Schwinge
@ 2014-09-04 13:47                               ` Ilya Verbin
  2014-09-04 13:54                               ` Thomas Schwinge
  1 sibling, 0 replies; 48+ messages in thread
From: Ilya Verbin @ 2014-09-04 13:47 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Bernd Schmidt, Jakub Jelinek, GCC Patches

On 04 Sep 15:08, Thomas Schwinge wrote:
> I'm facing one problem; I guess the crucial detail is that in my scenario
> I'm using the linker plugin.  The lto-wrapper is not being executed (and
> thus no mkoffload being run), because »num_claimed_files == 0«.

Yeah I missed that.  I tested this patch on the system with old ld, therefore
the scenario with linker plugin wasn't tested.  I'm going to fix my patch.

  -- Ilya

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-09-04 13:08                             ` Thomas Schwinge
  2014-09-04 13:47                               ` Ilya Verbin
@ 2014-09-04 13:54                               ` Thomas Schwinge
  2014-09-05 15:09                                 ` Ilya Verbin
  1 sibling, 1 reply; 48+ messages in thread
From: Thomas Schwinge @ 2014-09-04 13:54 UTC (permalink / raw)
  To: Ilya Verbin, Bernd Schmidt; +Cc: Jakub Jelinek, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 4142 bytes --]

Hi!

On Thu, 04 Sep 2014 15:08:06 +0200, I wrote:
> On Tue, 2 Sep 2014 21:49:46 +0400, Ilya Verbin <iverbin@gmail.com> wrote:
> > This patch allows to compile binaries with offloading without passing -flto option, and
> > w/o performing link-time optimizations of the host code.
> 
> Thanks for working on this!
> 
> > How it works:
> > 1.  If there is at least one function or global variable to offload, gcc sets flag_generate_lto.
> > This enables writing the bytecode produced by ipa_write_summaries into
> > .gnu.target_lto_* sections (.gnu.lto_* sections are not created).
> > Also this flag emits LTO marker (__gnu_lto_v1).
> > 2.  This step is not changed: collect2 scans object files for the LTO marker and fills the list
> > of LTO objects.  If the list is not empty, it runs lto-wrapper to perform link-time recompilation.
> > 3.  lto-wrapper compiles images for targets.  And if -flto option is absent
> > (lto_mode == LTO_MODE_NONE), then it just returns the list of input objects without recompilation.
> 
> That seems sane to me.  (But you guys have looked into this design/code
> in much more detail than I have.)
> 
> I'm facing one problem; I guess the crucial detail is that in my scenario
> I'm using the linker plugin.  The lto-wrapper is not being executed (and
> thus no mkoffload being run), because »num_claimed_files == 0«.  In
> lto-plugin/lto-plugin.c:process_symtab, only LTO_SECTION_PREFIX
> (".gnu.lto_.symtab") is considered, which (correctly so) is not generated
> anymore by GCC in the new scenario, but ".gnu.target_lto_" is not
> considered there.  (Should this maybe look only for the LTO marker
> "__gnu_lto_v1", or am I misunderstanding what this is doing?)  If I make
> that also accept the offloading section, the compilation process proceeds
> further, but still fails, because no resolution file is available:
> »[...]/ld: cannot find -fresolution=/tmp/cc7xeiW0.res: No such file or
> directory«.  Is this enough information for someone who is more familiar
> with the design/code to already see what needs to be done?

Aha, it's gcc/gcc.c:LINK_PLUGIN_SPEC that is unconditionally adding the
-fresolution option.  Here is a hack that seems to make it work, but that
most certainly should be done differently:

commit 9de71e209f5a75454ddb6922009425eb1f6bec1c
Author: Thomas Schwinge <thomas@codesourcery.com>
Date:   Thu Sep 4 15:44:37 2014 +0200

    Hack for offloading without -flto, with linker plugin.

diff --git gcc/lto-wrapper.c gcc/lto-wrapper.c
index d40f8ae..9556cdc 100644
--- gcc/lto-wrapper.c
+++ gcc/lto-wrapper.c
@@ -910,7 +910,8 @@ run_gcc (unsigned argc, char *argv[])
 	 there is no need to perform a link-time recompilation, i.e. lto-wrapper
 	 is used only for compiling offload images.  */
       for (i = 1; i < argc; ++i)
-	printf ("%s\n", argv[i]);
+	if (strncmp (argv[i], "-fresolution=", sizeof ("-fresolution=") - 1))
+	  printf ("%s\n", argv[i]);
       goto finish;
     }
   else if (lto_mode == LTO_MODE_LTO)
diff --git lto-plugin/lto-plugin.c lto-plugin/lto-plugin.c
index 910e23c..a397276 100644
--- lto-plugin/lto-plugin.c
+++ lto-plugin/lto-plugin.c
@@ -84,8 +84,8 @@ along with this program; see the file COPYING3.  If not see
 
 /* LTO magic section name.  */
 
-#define LTO_SECTION_PREFIX	".gnu.lto_.symtab"
-#define LTO_SECTION_PREFIX_LEN	(sizeof (LTO_SECTION_PREFIX) - 1)
+#define LTO_SECTION_SYMTAB	".gnu.lto_.symtab"
+#define OMP_SECTION_SYMTAB	".gnu.target_lto_.symtab"
 
 /* The part of the symbol table the plugin has to keep track of. Note that we
    must keep SYMS until all_symbols_read is called to give the linker time to
@@ -820,7 +820,8 @@ process_symtab (void *data, const char *name, off_t offset, off_t length)
   char *s;
   char *secdatastart, *secdata;
 
-  if (strncmp (name, LTO_SECTION_PREFIX, LTO_SECTION_PREFIX_LEN) != 0)
+  if (strncmp (name, LTO_SECTION_SYMTAB, strlen (LTO_SECTION_SYMTAB)) != 0
+      && strncmp (name, OMP_SECTION_SYMTAB, strlen (OMP_SECTION_SYMTAB)) != 0)
     return 1;
 
   s = strrchr (name, '.');


Grüße,
 Thomas

[-- Attachment #2: Type: application/pgp-signature, Size: 472 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-09-04 13:54                               ` Thomas Schwinge
@ 2014-09-05 15:09                                 ` Ilya Verbin
  2014-09-08 18:46                                   ` Ilya Verbin
  0 siblings, 1 reply; 48+ messages in thread
From: Ilya Verbin @ 2014-09-05 15:09 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Bernd Schmidt, Jakub Jelinek, GCC Patches

Hi,

On 04 Sep 15:54, Thomas Schwinge wrote:
> Aha, it's gcc/gcc.c:LINK_PLUGIN_SPEC that is unconditionally adding the
> -fresolution option.  Here is a hack that seems to make it work, but that
> most certainly should be done differently:

Here is updated patch, which solves the problem with linker plugin.
I hope that it will not affect the regular LTO compilation.  At least, the
'make check' did not reveal any regression on the system with linker plugin.
Bootstrap also passed.  Ok for gomp-4_0-branch?

  -- Ilya


---
 gcc/cgraphunit.c        | 38 ++++++++++++++++++++-------
 gcc/lto-wrapper.c       | 69 ++++++++++++++++++++++++++-----------------------
 gcc/omp-low.c           |  6 +++++
 gcc/passes.c            |  2 +-
 lto-plugin/lto-plugin.c | 25 +++++++++++++++++-
 5 files changed, 97 insertions(+), 43 deletions(-)

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index f0c9f5c..b87e6dd 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2040,13 +2040,25 @@ output_in_order (void)
   free (nodes);
 }
 
-/* Collect all global variables with "omp declare target" attribute into
-   OFFLOAD_VARS.  It will be streamed out in ipa_write_summaries.  */
+/* Check whether there is at least one function or global variable to offload.
+   Also collect all such global variables into OFFLOAD_VARS, the functions were
+   already collected in omp-low.c.  They will be streamed out in
+   ipa_write_summaries.  */
 
-static void
-init_offload_var_table (void)
+static bool
+initialize_offload (void)
 {
+  bool have_offload = false;
+  struct cgraph_node *node;
   struct varpool_node *vnode;
+
+  FOR_EACH_DEFINED_FUNCTION (node)
+    if (lookup_attribute ("omp declare target", DECL_ATTRIBUTES (node->decl)))
+      {
+	have_offload = true;
+	break;
+      }
+
   FOR_EACH_DEFINED_VARIABLE (vnode)
     {
       if (!lookup_attribute ("omp declare target",
@@ -2054,13 +2066,17 @@ init_offload_var_table (void)
 	  || TREE_CODE (vnode->decl) != VAR_DECL
 	  || DECL_SIZE (vnode->decl) == 0)
 	continue;
+      have_offload = true;
       vec_safe_push (offload_vars, vnode->decl);
     }
+
+  return have_offload;
 }
 
 static void
 ipa_passes (void)
 {
+  bool have_offload = false;
   gcc::pass_manager *passes = g->get_passes ();
 
   set_cfun (NULL);
@@ -2068,6 +2084,14 @@ ipa_passes (void)
   gimple_register_cfg_hooks ();
   bitmap_obstack_initialize (NULL);
 
+  if (!in_lto_p && (flag_openacc || flag_openmp))
+    {
+      have_offload = initialize_offload ();
+      /* OpenACC / OpenMP offloading requires LTO infrastructure.  */
+      if (have_offload)
+	flag_generate_lto = 1;
+    }
+
   invoke_plugin_callbacks (PLUGIN_ALL_IPA_PASSES_START, NULL);
 
   if (!in_lto_p)
@@ -2108,11 +2132,7 @@ ipa_passes (void)
 
   if (!in_lto_p)
     {
-      init_offload_var_table ();
-
-      if ((flag_openacc || flag_openmp)
-	  && !(vec_safe_is_empty (offload_funcs)
-	       && vec_safe_is_empty (offload_vars)))
+      if (have_offload)
 	{
 	  section_name_prefix = OMP_SECTION_NAME_PREFIX;
 	  ipa_write_summaries (true);
diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 80d10f3..01e8f06 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -668,6 +668,11 @@ run_gcc (unsigned argc, char *argv[])
 	  close (fd);
 	  continue;
 	}
+      /* We may choose not to write out this .opts section in the future.  In
+	 that case we'll have to use something else to look for.  */
+      if (simple_object_find_section (sobj, OMP_SECTION_NAME_PREFIX "." "opts",
+				      &offset, &length, &errmsg, &err))
+	have_offload = true;
       if (!simple_object_find_section (sobj, LTO_SECTION_NAME_PREFIX "." "opts",
 				       &offset, &length, &errmsg, &err))
 	{
@@ -675,11 +680,6 @@ run_gcc (unsigned argc, char *argv[])
 	  close (fd);
 	  continue;
 	}
-      /* We may choose not to write out this .opts section in the future.  In
-	 that case we'll have to use something else to look for.  */
-      if (simple_object_find_section (sobj, OMP_SECTION_NAME_PREFIX "." "opts",
-				      &offset, &length, &errmsg, &err))
-	have_offload = true;
       lseek (fd, file_offset + offset, SEEK_SET);
       data = (char *)xmalloc (length);
       read (fd, data, length);
@@ -871,7 +871,32 @@ run_gcc (unsigned argc, char *argv[])
   /* Remember at which point we can scrub args to re-use the commons.  */
   new_head_argc = obstack_object_size (&argv_obstack) / sizeof (void *);
 
-  if (lto_mode == LTO_MODE_LTO)
+  if (have_offload)
+    {
+      compile_images_for_openmp_targets (argc, argv);
+      if (offload_names)
+	{
+	  find_ompbeginend ();
+	  for (i = 0; offload_names[i]; i++)
+	    printf ("%s\n", offload_names[i]);
+	  free_array_of_ptrs ((void **) offload_names, i);
+	}
+    }
+
+  if (ompbegin)
+    printf ("%s\n", ompbegin);
+
+  if (lto_mode == LTO_MODE_NONE)
+    {
+      /* If we are in lto-wrapper, but -flto option is absent, it means that
+	 there is no need to perform a link-time recompilation, i.e. lto-wrapper
+	 is used only for compiling offload images.  */
+      for (i = 1; i < argc; ++i)
+	if (strncmp (argv[i], "-fresolution=", sizeof ("-fresolution=") - 1))
+	  printf ("%s\n", argv[i]);
+      goto finish;
+    }
+  else if (lto_mode == LTO_MODE_LTO)
     {
       flto_out = make_temp_file (".lto.o");
       if (linker_output)
@@ -879,7 +904,7 @@ run_gcc (unsigned argc, char *argv[])
       obstack_ptr_grow (&argv_obstack, "-o");
       obstack_ptr_grow (&argv_obstack, flto_out);
     }
-  else 
+  else if (lto_mode == LTO_MODE_WHOPR)
     {
       const char *list_option = "-fltrans-output-list=";
       size_t list_option_len = strlen (list_option);
@@ -939,7 +964,7 @@ run_gcc (unsigned argc, char *argv[])
       free (flto_out);
       flto_out = NULL;
     }
-  else
+  else if (lto_mode == LTO_MODE_WHOPR)
     {
       FILE *stream = fopen (ltrans_output_file, "r");
       FILE *mstream = NULL;
@@ -1084,25 +1109,6 @@ cont:
 	  for (i = 0; i < nr; ++i)
 	    maybe_unlink (input_names[i]);
 	}
-      if (have_offload)
-	{
-	  compile_images_for_openmp_targets (argc, argv);
-	  if (offload_names)
-	    {
-	      find_ompbeginend ();
-	      for (i = 0; offload_names[i]; i++)
-		{
-		  fputs (offload_names[i], stdout);
-		  putc ('\n', stdout);
-		}
-	      free_array_of_ptrs ((void **)offload_names, i);
-	    }
-	}
-      if (ompbegin)
-	{
-	  fputs (ompbegin, stdout);
-	  putc ('\n', stdout);
-	}
 
       for (i = 0; i < nr; ++i)
 	{
@@ -1110,11 +1116,6 @@ cont:
 	  putc ('\n', stdout);
 	  free (input_names[i]);
 	}
-      if (ompend)
-	{
-	  fputs (ompend, stdout);
-	  putc ('\n', stdout);
-	}
       nr = 0;
       free (output_names);
       free (input_names);
@@ -1122,6 +1123,10 @@ cont:
       obstack_free (&env_obstack, NULL);
     }
 
+finish:
+  if (ompend)
+    printf ("%s\n", ompend);
+
   obstack_free (&argv_obstack, NULL);
 }
 
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 6ed8239..cf00407 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -13774,6 +13774,12 @@ omp_finish_file (void)
 				   get_identifier (".omp_var_table"),
 				   vars_decl_type);
       TREE_STATIC (funcs_decl) = TREE_STATIC (vars_decl) = 1;
+      /* Do not align tables more than TYPE_ALIGN (pointer_sized_int_node),
+	 otherwise a joint table in a binary will contain padding between
+	 tables from multiple object files.  */
+      DECL_USER_ALIGN (funcs_decl) = DECL_USER_ALIGN (vars_decl) = 1;
+      DECL_ALIGN (funcs_decl) = TYPE_ALIGN (funcs_decl_type);
+      DECL_ALIGN (vars_decl) = TYPE_ALIGN (vars_decl_type);
       DECL_INITIAL (funcs_decl) = ctor_f;
       DECL_INITIAL (vars_decl) = ctor_v;
       set_decl_section_name (funcs_decl, funcs_section_name);
diff --git a/gcc/passes.c b/gcc/passes.c
index 8172185..e776059 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -2303,7 +2303,7 @@ ipa_write_summaries (bool is_omp)
   struct cgraph_node *node;
   struct cgraph_node **order;
 
-  if (!(flag_generate_lto || flag_openacc || flag_openmp) || seen_error () )
+  if (!flag_generate_lto || seen_error ())
     return;
 
   select_what_to_dump (is_omp);
diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
index 910e23c..f53d9e2 100644
--- a/lto-plugin/lto-plugin.c
+++ b/lto-plugin/lto-plugin.c
@@ -86,6 +86,8 @@ along with this program; see the file COPYING3.  If not see
 
 #define LTO_SECTION_PREFIX	".gnu.lto_.symtab"
 #define LTO_SECTION_PREFIX_LEN	(sizeof (LTO_SECTION_PREFIX) - 1)
+#define OFFLOAD_SECTION		".gnu.target_lto_.opts"
+#define OFFLOAD_SECTION_LEN	(sizeof (OFFLOAD_SECTION) - 1)
 
 /* The part of the symbol table the plugin has to keep track of. Note that we
    must keep SYMS until all_symbols_read is called to give the linker time to
@@ -111,6 +113,7 @@ struct plugin_symtab
 struct plugin_objfile
 {
   int found;
+  int offload;
   simple_object_read *objfile;
   struct plugin_symtab *out;
   const struct ld_plugin_input_file *file;
@@ -862,6 +865,21 @@ err:
   return 0;
 }
 
+/* Find an offload section of an object file.  */
+
+static int
+process_offload_section (void *data, const char *name, off_t offset, off_t len)
+{
+  if (!strncmp (name, OFFLOAD_SECTION, OFFLOAD_SECTION_LEN))
+    {
+      struct plugin_objfile *obj = (struct plugin_objfile *) data;
+      obj->offload = 1;
+      return 0;
+    }
+
+  return 1;
+}
+
 /* Callback used by gold to check if the plugin will claim FILE. Writes
    the result in CLAIMED. */
 
@@ -899,6 +917,7 @@ claim_file_handler (const struct ld_plugin_input_file *file, int *claimed)
   *claimed = 0;
   obj.file = file;
   obj.found = 0;
+  obj.offload = 0;
   obj.out = &lto_file.symtab;
   errmsg = NULL;
   obj.objfile = simple_object_start_read (file->fd, file->offset, LTO_SEGMENT_NAME,
@@ -920,7 +939,11 @@ claim_file_handler (const struct ld_plugin_input_file *file, int *claimed)
       goto err;
     }
 
-  if (obj.found == 0)
+  if (obj.objfile)
+    simple_object_find_sections (obj.objfile, process_offload_section,
+				 &obj, &err);
+
+  if (obj.found == 0 && obj.offload == 0)
     goto err;
 
   if (obj.found > 1)
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [gomp4] Add tables generation
  2014-09-05 15:09                                 ` Ilya Verbin
@ 2014-09-08 18:46                                   ` Ilya Verbin
  0 siblings, 0 replies; 48+ messages in thread
From: Ilya Verbin @ 2014-09-08 18:46 UTC (permalink / raw)
  To: Thomas Schwinge; +Cc: Bernd Schmidt, Jakub Jelinek, GCC Patches

Hi,

On 05 Sep 19:09, Ilya Verbin wrote:
> I hope that it will not affect the regular LTO compilation.  At least, the
> 'make check' did not reveal any regression on the system with linker plugin.

I found that relying on -flto option in lto-wrapper was a bad idea.
E.g., this simple case is not working:

$ gcc -c -flto test.c
$ gcc test.o  # Here -flto is absent, but lto-wrapper must recompile IR from test.o

So I reverted back all conditions with lto_mode in lto-wrapper, and instead introduced a new flag have_lto.
The following patch looks like working now.

Thanks,
  -- Ilya


---
 gcc/cgraphunit.c        | 38 +++++++++++++++++++++-------
 gcc/lto-wrapper.c       | 66 +++++++++++++++++++++++++++----------------------
 gcc/omp-low.c           |  6 +++++
 gcc/passes.c            |  2 +-
 lto-plugin/lto-plugin.c | 25 ++++++++++++++++++-
 5 files changed, 97 insertions(+), 40 deletions(-)

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index f0c9f5c..b87e6dd 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -2040,13 +2040,25 @@ output_in_order (void)
   free (nodes);
 }
 
-/* Collect all global variables with "omp declare target" attribute into
-   OFFLOAD_VARS.  It will be streamed out in ipa_write_summaries.  */
+/* Check whether there is at least one function or global variable to offload.
+   Also collect all such global variables into OFFLOAD_VARS, the functions were
+   already collected in omp-low.c.  They will be streamed out in
+   ipa_write_summaries.  */
 
-static void
-init_offload_var_table (void)
+static bool
+initialize_offload (void)
 {
+  bool have_offload = false;
+  struct cgraph_node *node;
   struct varpool_node *vnode;
+
+  FOR_EACH_DEFINED_FUNCTION (node)
+    if (lookup_attribute ("omp declare target", DECL_ATTRIBUTES (node->decl)))
+      {
+	have_offload = true;
+	break;
+      }
+
   FOR_EACH_DEFINED_VARIABLE (vnode)
     {
       if (!lookup_attribute ("omp declare target",
@@ -2054,13 +2066,17 @@ init_offload_var_table (void)
 	  || TREE_CODE (vnode->decl) != VAR_DECL
 	  || DECL_SIZE (vnode->decl) == 0)
 	continue;
+      have_offload = true;
       vec_safe_push (offload_vars, vnode->decl);
     }
+
+  return have_offload;
 }
 
 static void
 ipa_passes (void)
 {
+  bool have_offload = false;
   gcc::pass_manager *passes = g->get_passes ();
 
   set_cfun (NULL);
@@ -2068,6 +2084,14 @@ ipa_passes (void)
   gimple_register_cfg_hooks ();
   bitmap_obstack_initialize (NULL);
 
+  if (!in_lto_p && (flag_openacc || flag_openmp))
+    {
+      have_offload = initialize_offload ();
+      /* OpenACC / OpenMP offloading requires LTO infrastructure.  */
+      if (have_offload)
+	flag_generate_lto = 1;
+    }
+
   invoke_plugin_callbacks (PLUGIN_ALL_IPA_PASSES_START, NULL);
 
   if (!in_lto_p)
@@ -2108,11 +2132,7 @@ ipa_passes (void)
 
   if (!in_lto_p)
     {
-      init_offload_var_table ();
-
-      if ((flag_openacc || flag_openmp)
-	  && !(vec_safe_is_empty (offload_funcs)
-	       && vec_safe_is_empty (offload_vars)))
+      if (have_offload)
 	{
 	  section_name_prefix = OMP_SECTION_NAME_PREFIX;
 	  ipa_write_summaries (true);
diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index 80d10f3..86669cf 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -619,6 +619,7 @@ run_gcc (unsigned argc, char *argv[])
   unsigned int decoded_options_count;
   struct obstack argv_obstack;
   int new_head_argc;
+  bool have_lto = false;
   bool have_offload = false;
 
   /* Get the driver and options.  */
@@ -668,6 +669,11 @@ run_gcc (unsigned argc, char *argv[])
 	  close (fd);
 	  continue;
 	}
+      /* We may choose not to write out this .opts section in the future.  In
+	 that case we'll have to use something else to look for.  */
+      if (simple_object_find_section (sobj, OMP_SECTION_NAME_PREFIX "." "opts",
+				      &offset, &length, &errmsg, &err))
+	have_offload = true;
       if (!simple_object_find_section (sobj, LTO_SECTION_NAME_PREFIX "." "opts",
 				       &offset, &length, &errmsg, &err))
 	{
@@ -675,11 +681,7 @@ run_gcc (unsigned argc, char *argv[])
 	  close (fd);
 	  continue;
 	}
-      /* We may choose not to write out this .opts section in the future.  In
-	 that case we'll have to use something else to look for.  */
-      if (simple_object_find_section (sobj, OMP_SECTION_NAME_PREFIX "." "opts",
-				      &offset, &length, &errmsg, &err))
-	have_offload = true;
+      have_lto = true;
       lseek (fd, file_offset + offset, SEEK_SET);
       data = (char *)xmalloc (length);
       read (fd, data, length);
@@ -871,6 +873,32 @@ run_gcc (unsigned argc, char *argv[])
   /* Remember at which point we can scrub args to re-use the commons.  */
   new_head_argc = obstack_object_size (&argv_obstack) / sizeof (void *);
 
+  if (have_offload)
+    {
+      compile_images_for_openmp_targets (argc, argv);
+      if (offload_names)
+	{
+	  find_ompbeginend ();
+	  for (i = 0; offload_names[i]; i++)
+	    printf ("%s\n", offload_names[i]);
+	  free_array_of_ptrs ((void **) offload_names, i);
+	}
+    }
+
+  if (ompbegin)
+    printf ("%s\n", ompbegin);
+
+  /* If object files contain offload sections, but do not contain LTO sections,
+     then there is no need to perform a link-time recompilation, i.e.
+     lto-wrapper is used only for a compilation of offload images.  */
+  if (have_offload && !have_lto)
+    {
+      for (i = 1; i < argc; ++i)
+	if (strncmp (argv[i], "-fresolution=", sizeof ("-fresolution=") - 1))
+	  printf ("%s\n", argv[i]);
+      goto finish;
+    }
+
   if (lto_mode == LTO_MODE_LTO)
     {
       flto_out = make_temp_file (".lto.o");
@@ -1084,25 +1112,6 @@ cont:
 	  for (i = 0; i < nr; ++i)
 	    maybe_unlink (input_names[i]);
 	}
-      if (have_offload)
-	{
-	  compile_images_for_openmp_targets (argc, argv);
-	  if (offload_names)
-	    {
-	      find_ompbeginend ();
-	      for (i = 0; offload_names[i]; i++)
-		{
-		  fputs (offload_names[i], stdout);
-		  putc ('\n', stdout);
-		}
-	      free_array_of_ptrs ((void **)offload_names, i);
-	    }
-	}
-      if (ompbegin)
-	{
-	  fputs (ompbegin, stdout);
-	  putc ('\n', stdout);
-	}
 
       for (i = 0; i < nr; ++i)
 	{
@@ -1110,11 +1119,6 @@ cont:
 	  putc ('\n', stdout);
 	  free (input_names[i]);
 	}
-      if (ompend)
-	{
-	  fputs (ompend, stdout);
-	  putc ('\n', stdout);
-	}
       nr = 0;
       free (output_names);
       free (input_names);
@@ -1122,6 +1126,10 @@ cont:
       obstack_free (&env_obstack, NULL);
     }
 
+finish:
+  if (ompend)
+    printf ("%s\n", ompend);
+
   obstack_free (&argv_obstack, NULL);
 }
 
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 6ed8239..cf00407 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -13774,6 +13774,12 @@ omp_finish_file (void)
 				   get_identifier (".omp_var_table"),
 				   vars_decl_type);
       TREE_STATIC (funcs_decl) = TREE_STATIC (vars_decl) = 1;
+      /* Do not align tables more than TYPE_ALIGN (pointer_sized_int_node),
+	 otherwise a joint table in a binary will contain padding between
+	 tables from multiple object files.  */
+      DECL_USER_ALIGN (funcs_decl) = DECL_USER_ALIGN (vars_decl) = 1;
+      DECL_ALIGN (funcs_decl) = TYPE_ALIGN (funcs_decl_type);
+      DECL_ALIGN (vars_decl) = TYPE_ALIGN (vars_decl_type);
       DECL_INITIAL (funcs_decl) = ctor_f;
       DECL_INITIAL (vars_decl) = ctor_v;
       set_decl_section_name (funcs_decl, funcs_section_name);
diff --git a/gcc/passes.c b/gcc/passes.c
index 8172185..e776059 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -2303,7 +2303,7 @@ ipa_write_summaries (bool is_omp)
   struct cgraph_node *node;
   struct cgraph_node **order;
 
-  if (!(flag_generate_lto || flag_openacc || flag_openmp) || seen_error () )
+  if (!flag_generate_lto || seen_error ())
     return;
 
   select_what_to_dump (is_omp);
diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
index 910e23c..f53d9e2 100644
--- a/lto-plugin/lto-plugin.c
+++ b/lto-plugin/lto-plugin.c
@@ -86,6 +86,8 @@ along with this program; see the file COPYING3.  If not see
 
 #define LTO_SECTION_PREFIX	".gnu.lto_.symtab"
 #define LTO_SECTION_PREFIX_LEN	(sizeof (LTO_SECTION_PREFIX) - 1)
+#define OFFLOAD_SECTION		".gnu.target_lto_.opts"
+#define OFFLOAD_SECTION_LEN	(sizeof (OFFLOAD_SECTION) - 1)
 
 /* The part of the symbol table the plugin has to keep track of. Note that we
    must keep SYMS until all_symbols_read is called to give the linker time to
@@ -111,6 +113,7 @@ struct plugin_symtab
 struct plugin_objfile
 {
   int found;
+  int offload;
   simple_object_read *objfile;
   struct plugin_symtab *out;
   const struct ld_plugin_input_file *file;
@@ -862,6 +865,21 @@ err:
   return 0;
 }
 
+/* Find an offload section of an object file.  */
+
+static int
+process_offload_section (void *data, const char *name, off_t offset, off_t len)
+{
+  if (!strncmp (name, OFFLOAD_SECTION, OFFLOAD_SECTION_LEN))
+    {
+      struct plugin_objfile *obj = (struct plugin_objfile *) data;
+      obj->offload = 1;
+      return 0;
+    }
+
+  return 1;
+}
+
 /* Callback used by gold to check if the plugin will claim FILE. Writes
    the result in CLAIMED. */
 
@@ -899,6 +917,7 @@ claim_file_handler (const struct ld_plugin_input_file *file, int *claimed)
   *claimed = 0;
   obj.file = file;
   obj.found = 0;
+  obj.offload = 0;
   obj.out = &lto_file.symtab;
   errmsg = NULL;
   obj.objfile = simple_object_start_read (file->fd, file->offset, LTO_SEGMENT_NAME,
@@ -920,7 +939,11 @@ claim_file_handler (const struct ld_plugin_input_file *file, int *claimed)
       goto err;
     }
 
-  if (obj.found == 0)
+  if (obj.objfile)
+    simple_object_find_sections (obj.objfile, process_offload_section,
+				 &obj, &err);
+
+  if (obj.found == 0 && obj.offload == 0)
     goto err;
 
   if (obj.found > 1)
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2014-09-08 18:46 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-20 17:01 [gomp4] Add tables generation Bernd Schmidt
2014-03-20 19:24 ` Jakub Jelinek
2014-03-21 15:21   ` Bernd Schmidt
2014-03-21 15:28     ` Jakub Jelinek
2014-03-21 16:16       ` Bernd Schmidt
2014-04-04  9:33       ` Bernd Schmidt
2014-04-05 15:05         ` Thomas Schwinge
2014-04-05 15:24           ` Bernd Schmidt
2014-04-08  6:03             ` Jakub Jelinek
2014-05-06 15:32             ` Ilya Verbin
2014-05-08 10:11               ` Bernd Schmidt
2014-05-12 20:44                 ` Bernd Schmidt
2014-03-27 13:44 ` Ilya Verbin
2014-03-27 13:52   ` Bernd Schmidt
2014-03-27 13:59   ` Bernd Schmidt
2014-03-27 15:12   ` Jakub Jelinek
2014-03-27 16:17     ` Ilya Verbin
2014-03-27 16:23       ` Jakub Jelinek
2014-03-27 18:51         ` Ilya Verbin
2014-04-17 18:44         ` Ilya Verbin
2014-04-25 11:55           ` Ilya Verbin
2014-06-10 13:52           ` Bernd Schmidt
2014-06-10 18:07             ` Ilya Verbin
2014-08-13 16:19               ` Ilya Verbin
2014-08-18 16:08                 ` Ilya Verbin
2014-08-18 16:25                   ` Bernd Schmidt
2014-08-19 10:41                     ` Ilya Verbin
2014-08-19 11:55                       ` Bernd Schmidt
2014-08-19 12:26                         ` Ilya Verbin
2014-08-19 13:27                         ` Ilya Verbin
2014-09-02 17:50                           ` Ilya Verbin
2014-09-04 13:08                             ` Thomas Schwinge
2014-09-04 13:47                               ` Ilya Verbin
2014-09-04 13:54                               ` Thomas Schwinge
2014-09-05 15:09                                 ` Ilya Verbin
2014-09-08 18:46                                   ` Ilya Verbin
2014-09-03 19:24                   ` Thomas Schwinge
2014-04-02  7:34 ` Thomas Schwinge
2014-04-02  8:36   ` Thomas Schwinge
2014-04-03 16:15     ` Bernd Schmidt
2014-04-03 16:53       ` Ilya Verbin
2014-04-03 17:08         ` Bernd Schmidt
2014-04-03 17:25           ` Ilya Verbin
2014-04-03 17:31             ` Bernd Schmidt
2014-04-03 17:39               ` Ilya Verbin
2014-04-04  5:55       ` Thomas Schwinge
2014-04-04  9:25         ` Bernd Schmidt
2014-04-02  8:34 ` Thomas Schwinge

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).