From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 27475 invoked by alias); 12 Nov 2014 09:26:51 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Received: (qmail 27395 invoked by uid 89); 12 Nov 2014 09:26:51 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_50,RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mx2.suse.de Received: from cantor2.suse.de (HELO mx2.suse.de) (195.135.220.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Wed, 12 Nov 2014 09:26:48 +0000 Received: from relay1.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id C186EAAF1; Wed, 12 Nov 2014 09:26:44 +0000 (UTC) Date: Wed, 12 Nov 2014 09:29:00 -0000 From: Richard Biener To: Ilya Verbin cc: gcc-patches@gcc.gnu.org, Jakub Jelinek , Jan Hubicka , Bernd Schmidt , Thomas Schwinge , Kirill Yukhin , Andrey Turetskiy , Jeff Law Subject: Re: [PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables In-Reply-To: <20141105131916.GB42356@msticlxl57.ims.intel.com> Message-ID: References: <20140930145320.GA19623@msticlxl57.ims.intel.com> <20141008092302.GU1986@tucnak.redhat.com> <20141105131916.GB42356@msticlxl57.ims.intel.com> User-Agent: Alpine 2.11 (LSU 23 2013-08-11) MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="-1609908220-1648939029-1415784046=:374" X-SW-Source: 2014-11/txt/msg01143.txt.bz2 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. ---1609908220-1648939029-1415784046=:374 Content-Type: TEXT/PLAIN; charset=US-ASCII Content-Transfer-Encoding: 8BIT Content-length: 18353 On Wed, 5 Nov 2014, Ilya Verbin wrote: > On 08 Oct 11:23, Jakub Jelinek wrote: > > LGTM, with the requested var/section renames. > > Would like if Honza and/or Richard had a look at the cgraph/LTO stuff > > in the patch though. > > Since patch 2 was updated, this patch also should be updated. > Now the offload_vars array is filled in varpool_node::get_create . > > Richard, is it OK for trunk? The LTO parts are ok for trunk. Thanks, Richard. > Thanks, > -- Ilya > > > --- > > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index f31af05..3db30bf 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -2303,6 +2303,7 @@ GTFILES = $(CPP_ID_DATA_H) $(srcdir)/input.h $(srcdir)/coretypes.h \ > $(srcdir)/tree-profile.c $(srcdir)/tree-nested.c \ > $(srcdir)/tree-parloops.c \ > $(srcdir)/omp-low.c \ > + $(srcdir)/omp-low.h \ > $(srcdir)/targhooks.c $(out_file) $(srcdir)/passes.c $(srcdir)/cgraphunit.c \ > $(srcdir)/cgraphclones.c \ > $(srcdir)/tree-phinodes.c \ > diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c > index 83ab419..bafbadb 100644 > --- a/gcc/cgraphunit.c > +++ b/gcc/cgraphunit.c > @@ -218,6 +218,7 @@ along with GCC; see the file COPYING3. If not see > #include "tree-nested.h" > #include "gimplify.h" > #include "dbgcnt.h" > +#include "omp-low.h" > #include "lto-section-names.h" > > /* Queue of cgraph nodes scheduled to be added into cgraph. This is a > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > index 5036d4f..6a5a031 100644 > --- a/gcc/doc/tm.texi > +++ b/gcc/doc/tm.texi > @@ -11205,6 +11205,12 @@ If defined, this function returns an appropriate alignment in bits for an atomic > ISO C11 requires atomic compound assignments that may raise floating-point exceptions to raise exceptions corresponding to the arithmetic operation whose result was successfully stored in a compare-and-exchange sequence. This requires code equivalent to calls to @code{feholdexcept}, @code{feclearexcept} and @code{feupdateenv} to be generated at appropriate points in the compare-and-exchange sequence. This hook should set @code{*@var{hold}} to an expression equivalent to the call to @code{feholdexcept}, @code{*@var{clear}} to an expression equivalent to the call to @code{feclearexcept} and @code{*@var{update}} to an expression equivalent to the call to @code{feupdateenv}. The three expressions are @code{NULL_TREE} on entry to the hook and may be left as @code{NULL_TREE} if no code is required in a particular place. The default implementation leaves all three expressions as @code{NULL_TREE}. The @code{__atomic_feraiseexcept} function from @code{libatomic} may be of use as part of the code generated in @code{*@var{update}}. > @end deftypefn > > +@deftypefn {Target Hook} void TARGET_RECORD_OFFLOAD_SYMBOL (tree) > +Used when offloaded functions are seen in the compilation unit and no named > +sections are available. It is called once for each symbol that must be > +recorded in the offload function and variable table. > +@end deftypefn > + > @defmac TARGET_SUPPORTS_WIDE_INT > > On older ports, large integers are stored in @code{CONST_DOUBLE} rtl > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in > index 5674e6c..cadf05d 100644 > --- a/gcc/doc/tm.texi.in > +++ b/gcc/doc/tm.texi.in > @@ -8167,6 +8167,8 @@ and the associated definitions of those functions. > > @hook TARGET_ATOMIC_ASSIGN_EXPAND_FENV > > +@hook TARGET_RECORD_OFFLOAD_SYMBOL > + > @defmac TARGET_SUPPORTS_WIDE_INT > > On older ports, large integers are stored in @code{CONST_DOUBLE} rtl > diff --git a/gcc/gengtype.c b/gcc/gengtype.c > index e48b448..06c37d5 100644 > --- a/gcc/gengtype.c > +++ b/gcc/gengtype.c > @@ -1843,7 +1843,7 @@ open_base_files (void) > "tree-ssa.h", "reload.h", "cpp-id-data.h", "tree-chrec.h", > "except.h", "output.h", "cfgloop.h", > "target.h", "ipa-prop.h", "lto-streamer.h", "target-globals.h", > - "ipa-inline.h", "dwarf2out.h", NULL > + "ipa-inline.h", "dwarf2out.h", "omp-low.h", NULL > }; > const char *const *ifp; > outf_p gtype_desc_c; > diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c > index 45655ba..6c442cf 100644 > --- a/gcc/lto-cgraph.c > +++ b/gcc/lto-cgraph.c > @@ -57,6 +57,7 @@ along with GCC; see the file COPYING3. If not see > #include "context.h" > #include "pass_manager.h" > #include "ipa-utils.h" > +#include "omp-low.h" > > /* True when asm nodes has been output. */ > bool asm_nodes_output = false; > @@ -1057,6 +1058,49 @@ read_string (struct lto_input_block *ib) > return str; > } > > +/* Output function/variable tables that will allow libgomp to look up offload > + target code. OFFLOAD_FUNCS is filled in expand_omp_target, OFFLOAD_VARS is > + filled in ipa_passes. In WHOPR (partitioned) mode during the WPA stage both > + OFFLOAD_FUNCS and OFFLOAD_VARS are filled by input_offload_tables. */ > + > +void > +output_offload_tables (void) > +{ > + if (vec_safe_is_empty (offload_funcs) && vec_safe_is_empty (offload_vars)) > + return; > + > + struct lto_simple_output_block *ob > + = lto_create_simple_output_block (LTO_section_offload_table); > + > + for (unsigned i = 0; i < vec_safe_length (offload_funcs); i++) > + { > + streamer_write_enum (ob->main_stream, LTO_symtab_tags, > + LTO_symtab_last_tag, LTO_symtab_unavail_node); > + lto_output_fn_decl_index (ob->decl_state, ob->main_stream, > + (*offload_funcs)[i]); > + } > + > + for (unsigned i = 0; i < vec_safe_length (offload_vars); i++) > + { > + streamer_write_enum (ob->main_stream, LTO_symtab_tags, > + LTO_symtab_last_tag, LTO_symtab_variable); > + lto_output_var_decl_index (ob->decl_state, ob->main_stream, > + (*offload_vars)[i]); > + } > + > + streamer_write_uhwi_stream (ob->main_stream, 0); > + lto_destroy_simple_output_block (ob); > + > + /* In WHOPR mode during the WPA stage the joint offload tables need to be > + streamed to one partition only. That's why we free offload_funcs and > + offload_vars after the first call of output_offload_tables. */ > + if (flag_wpa) > + { > + vec_free (offload_funcs); > + vec_free (offload_vars); > + } > +} > + > /* Overwrite the information in NODE based on FILE_DATA, TAG, FLAGS, > STACK_SIZE, SELF_TIME and SELF_SIZE. This is called either to initialize > NODE or to replace the values in it, for instance because the first > @@ -1756,6 +1800,55 @@ input_symtab (void) > } > } > > +/* Input function/variable tables that will allow libgomp to look up offload > + target code, and store them into OFFLOAD_FUNCS and OFFLOAD_VARS. */ > + > +void > +input_offload_tables (void) > +{ > + struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data (); > + struct lto_file_decl_data *file_data; > + unsigned int j = 0; > + > + while ((file_data = file_data_vec[j++])) > + { > + const char *data; > + size_t len; > + struct lto_input_block *ib > + = lto_create_simple_input_block (file_data, LTO_section_offload_table, > + &data, &len); > + if (!ib) > + continue; > + > + enum LTO_symtab_tags tag > + = streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag); > + while (tag) > + { > + if (tag == LTO_symtab_unavail_node) > + { > + int decl_index = streamer_read_uhwi (ib); > + tree fn_decl > + = lto_file_decl_data_get_fn_decl (file_data, decl_index); > + vec_safe_push (offload_funcs, fn_decl); > + } > + else if (tag == LTO_symtab_variable) > + { > + int decl_index = streamer_read_uhwi (ib); > + tree var_decl > + = lto_file_decl_data_get_var_decl (file_data, decl_index); > + vec_safe_push (offload_vars, var_decl); > + } > + else > + fatal_error ("invalid offload table in %s", file_data->file_name); > + > + tag = streamer_read_enum (ib, LTO_symtab_tags, LTO_symtab_last_tag); > + } > + > + lto_destroy_simple_input_block (file_data, LTO_section_offload_table, > + ib, data, len); > + } > +} > + > /* True when we need optimization summary for NODE. */ > > static int > diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c > index 2485da9..83c8166 100644 > --- a/gcc/lto-section-in.c > +++ b/gcc/lto-section-in.c > @@ -66,7 +66,8 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] = > "cgraphopt", > "inline", > "ipcp_trans", > - "icf" > + "icf", > + "offload_table" > }; > > > diff --git a/gcc/lto-section-names.h b/gcc/lto-section-names.h > index f5dbed2..d84deac 100644 > --- a/gcc/lto-section-names.h > +++ b/gcc/lto-section-names.h > @@ -35,4 +35,7 @@ extern const char *section_name_prefix; > > #define LTO_SEGMENT_NAME "__GNU_LTO" > > +#define OFFLOAD_VAR_TABLE_SECTION_NAME ".gnu.offload_vars" > +#define OFFLOAD_FUNC_TABLE_SECTION_NAME ".gnu.offload_funcs" > + > #endif /* GCC_LTO_SECTION_NAMES_H */ > diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c > index 39931e3..6d99439 100644 > --- a/gcc/lto-streamer-out.c > +++ b/gcc/lto-streamer-out.c > @@ -2304,6 +2304,8 @@ lto_output (void) > statements using the statement UIDs. */ > output_symtab (); > > + output_offload_tables (); > + > #ifdef ENABLE_CHECKING > lto_bitmap_free (output); > #endif > diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h > index 0b3fb6a..3f9a8a9 100644 > --- a/gcc/lto-streamer.h > +++ b/gcc/lto-streamer.h > @@ -248,6 +248,7 @@ enum lto_section_type > LTO_section_inline_summary, > LTO_section_ipcp_transform, > LTO_section_ipa_icf, > + LTO_section_offload_table, > LTO_N_SECTION_TYPES /* Must be last. */ > }; > > @@ -823,6 +824,8 @@ bool lto_symtab_encoder_encode_initializer_p (lto_symtab_encoder_t, > varpool_node *); > void output_symtab (void); > void input_symtab (void); > +void output_offload_tables (void); > +void input_offload_tables (void); > bool referenced_from_other_partition_p (struct ipa_ref_list *, > lto_symtab_encoder_t); > bool reachable_from_other_partition_p (struct cgraph_node *, > diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c > index 0451e71..06887a8 100644 > --- a/gcc/lto/lto.c > +++ b/gcc/lto/lto.c > @@ -3024,6 +3024,8 @@ read_cgraph_and_symbols (unsigned nfiles, const char **fnames) > /* Read the symtab. */ > input_symtab (); > > + input_offload_tables (); > + > /* Store resolutions into the symbol table. */ > > ld_plugin_symbol_resolution_t *res; > diff --git a/gcc/omp-low.c b/gcc/omp-low.c > index 1404b5e..79e6ab3 100644 > --- a/gcc/omp-low.c > +++ b/gcc/omp-low.c > @@ -73,6 +73,7 @@ along with GCC; see the file COPYING3. If not see > #include "optabs.h" > #include "cfgloop.h" > #include "target.h" > +#include "common/common-target.h" > #include "omp-low.h" > #include "gimple-low.h" > #include "tree-cfgcleanup.h" > @@ -82,6 +83,7 @@ along with GCC; see the file COPYING3. If not see > #include "tree-eh.h" > #include "cilk.h" > #include "context.h" > +#include "lto-section-names.h" > > > /* Lowering of OpenMP parallel and workshare constructs proceeds in two > @@ -230,6 +232,9 @@ static tree scan_omp_1_op (tree *, int *, void *); > *handled_ops_p = false; \ > break; > > +/* Holds offload tables with decls. */ > +vec *offload_funcs, *offload_vars; > + > /* Convenience function for calling scan_omp_1_op on tree operands. */ > > static inline tree > @@ -8406,6 +8411,9 @@ expand_omp_target (struct omp_region *region) > DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties; > cgraph_node::add_new_function (child_fn, true); > > + /* Add the new function to the offload table. */ > + vec_safe_push (offload_funcs, child_fn); > + > /* Fix the callgraph edges for child_cfun. Those for cfun will be > fixed in a following pass. */ > push_cfun (child_cfun); > @@ -12376,4 +12384,91 @@ make_pass_omp_simd_clone (gcc::context *ctxt) > return new pass_omp_simd_clone (ctxt); > } > > +/* Helper function for omp_finish_file routine. Takes decls from V_DECLS and > + adds their addresses and sizes to constructor-vector V_CTOR. */ > +static void > +add_decls_addresses_to_decl_constructor (vec *v_decls, > + vec *v_ctor) > +{ > + unsigned len = vec_safe_length (v_decls); > + for (unsigned i = 0; i < len; i++) > + { > + tree it = (*v_decls)[i]; > + bool is_function = TREE_CODE (it) != VAR_DECL; > + > + CONSTRUCTOR_APPEND_ELT (v_ctor, NULL_TREE, build_fold_addr_expr (it)); > + if (!is_function) > + CONSTRUCTOR_APPEND_ELT (v_ctor, NULL_TREE, > + fold_convert (const_ptr_type_node, > + DECL_SIZE_UNIT (it))); > + } > +} > + > +/* Create new symbols containing (address, size) pairs for global variables, > + marked with "omp declare target" attribute, as well as addresses for the > + functions, which are outlined target regions. */ > +void > +omp_finish_file (void) > +{ > + unsigned num_funcs = vec_safe_length (offload_funcs); > + unsigned num_vars = vec_safe_length (offload_vars); > + > + if (num_funcs == 0 && num_vars == 0) > + return; > + > + if (targetm_common.have_named_sections) > + { > + vec *v_f, *v_v; > + vec_alloc (v_f, num_funcs); > + vec_alloc (v_v, num_vars * 2); > + > + add_decls_addresses_to_decl_constructor (offload_funcs, v_f); > + add_decls_addresses_to_decl_constructor (offload_vars, v_v); > + > + tree vars_decl_type = build_array_type_nelts (pointer_sized_int_node, > + num_vars * 2); > + tree funcs_decl_type = build_array_type_nelts (pointer_sized_int_node, > + num_funcs); > + TYPE_ALIGN (vars_decl_type) = TYPE_ALIGN (pointer_sized_int_node); > + TYPE_ALIGN (funcs_decl_type) = TYPE_ALIGN (pointer_sized_int_node); > + tree ctor_v = build_constructor (vars_decl_type, v_v); > + tree ctor_f = build_constructor (funcs_decl_type, v_f); > + TREE_CONSTANT (ctor_v) = TREE_CONSTANT (ctor_f) = 1; > + TREE_STATIC (ctor_v) = TREE_STATIC (ctor_f) = 1; > + tree funcs_decl = build_decl (UNKNOWN_LOCATION, VAR_DECL, > + get_identifier (".offload_func_table"), > + funcs_decl_type); > + tree vars_decl = build_decl (UNKNOWN_LOCATION, VAR_DECL, > + get_identifier (".offload_var_table"), > + vars_decl_type); > + TREE_STATIC (funcs_decl) = TREE_STATIC (vars_decl) = 1; > + /* Do not align tables more than TYPE_ALIGN (pointer_sized_int_node), > + otherwise a joint table in a binary will contain padding between > + tables from multiple object files. */ > + DECL_USER_ALIGN (funcs_decl) = DECL_USER_ALIGN (vars_decl) = 1; > + DECL_ALIGN (funcs_decl) = TYPE_ALIGN (funcs_decl_type); > + DECL_ALIGN (vars_decl) = TYPE_ALIGN (vars_decl_type); > + DECL_INITIAL (funcs_decl) = ctor_f; > + DECL_INITIAL (vars_decl) = ctor_v; > + set_decl_section_name (funcs_decl, OFFLOAD_FUNC_TABLE_SECTION_NAME); > + set_decl_section_name (vars_decl, OFFLOAD_VAR_TABLE_SECTION_NAME); > + > + varpool_node::finalize_decl (vars_decl); > + varpool_node::finalize_decl (funcs_decl); > + } > + else > + { > + for (unsigned i = 0; i < num_funcs; i++) > + { > + tree it = (*offload_funcs)[i]; > + targetm.record_offload_symbol (it); > + } > + for (unsigned i = 0; i < num_vars; i++) > + { > + tree it = (*offload_vars)[i]; > + targetm.record_offload_symbol (it); > + } > + } > +} > + > #include "gt-omp-low.h" > diff --git a/gcc/omp-low.h b/gcc/omp-low.h > index d80c2d6..ac587d0 100644 > --- a/gcc/omp-low.h > +++ b/gcc/omp-low.h > @@ -27,5 +27,9 @@ extern void omp_expand_local (basic_block); > extern void free_omp_regions (void); > extern tree omp_reduction_init (tree, tree); > extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *); > +extern void omp_finish_file (void); > + > +extern GTY(()) vec *offload_funcs; > +extern GTY(()) vec *offload_vars; > > #endif /* GCC_OMP_LOW_H */ > diff --git a/gcc/target.def b/gcc/target.def > index 4d90fc2..290a466 100644 > --- a/gcc/target.def > +++ b/gcc/target.def > @@ -1760,6 +1760,14 @@ HOOK_VECTOR_END (vectorize) > #undef HOOK_PREFIX > #define HOOK_PREFIX "TARGET_" > > +DEFHOOK > +(record_offload_symbol, > + "Used when offloaded functions are seen in the compilation unit and no named\n\ > +sections are available. It is called once for each symbol that must be\n\ > +recorded in the offload function and variable table.", > + void, (tree), > + hook_void_tree) > + > /* Allow target specific overriding of option settings after options have > been changed by an attribute or pragma or when it is reset at the > end of the code affected by an attribute or pragma. */ > diff --git a/gcc/toplev.c b/gcc/toplev.c > index 7d8cefa..d9d325b 100644 > --- a/gcc/toplev.c > +++ b/gcc/toplev.c > @@ -91,6 +91,7 @@ along with GCC; see the file COPYING3. If not see > #include "ipa-prop.h" > #include "gcse.h" > #include "optabs.h" > +#include "omp-low.h" > > #if defined(DBX_DEBUGGING_INFO) || defined(XCOFF_DEBUGGING_INFO) > #include "dbxout.h" > @@ -591,6 +592,8 @@ compile_file (void) > if (flag_sanitize & SANITIZE_THREAD) > tsan_finish_file (); > > + omp_finish_file (); > + > output_shared_constant_pool (); > output_object_blocks (); > finish_tm_clone_pairs (); > diff --git a/gcc/varpool.c b/gcc/varpool.c > index c508bf9..cd3710d 100644 > --- a/gcc/varpool.c > +++ b/gcc/varpool.c > @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3. If not see > #include "lto-streamer.h" > #include "hash-set.h" > #include "context.h" > +#include "omp-low.h" > > const char * const tls_model_names[]={"none", "tls-emulated", "tls-real", > "tls-global-dynamic", "tls-local-dynamic", > @@ -162,6 +163,8 @@ varpool_node::get_create (tree decl) > { > node->offloadable = 1; > g->have_offload = true; > + if (!in_lto_p) > + vec_safe_push (offload_vars, decl); > } > > node->register_symbol (); > -- Richard Biener SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284 (AG Nuernberg) Maxfeldstrasse 5, 90409 Nuernberg, Germany ---1609908220-1648939029-1415784046=:374--