Hi!

On Wed, 11 Feb 2015 15:44:26 +0100, I wrote:
> If -freorder-blocks-and-partition is active, this results in PTX code
> such as:
>
>     // BEGIN PREAMBLE
>         .version 3.1
>         .target sm_30
>         .address_size 64
>     // END PREAMBLE
>
>     $LCOLDB0:
>     $LHOTB0:
>     // BEGIN FUNCTION DECL: vec_mult$_omp_fn$1
>     .entry vec_mult$_omp_fn$1(.param.u64 %in_ar1);
>     // BEGIN FUNCTION DEF: vec_mult$_omp_fn$1
>     .entry vec_mult$_omp_fn$1(.param.u64 %in_ar1)
>     {
>         .reg.u64 %ar1;
>     [...]
>
> Note the global cold/hot labels.

Such partitioning might not make a lot of sense for the virtual ISA that
PTX is, but disabling it in nvptx.c:nvptx_option_override does not work.
(Because that is not invoked in the offloading code path?)  I see x86 has
an ix86_option_override_internal (but I don't know how that options
processing works) -- is something like that needed for nvptx, too, and
how to interconnect that with the offloading code path?  Sounds a bit
like what Jakub suggests in ?

Maybe -freorder-functions (of no use?) should also be disabled?

Here is a WIP patch for -freorder-blocks-and-partition (missing handling
of the offloading code path) -- does something like that make sense?

--- gcc/config/nvptx/nvptx.c
+++ gcc/config/nvptx/nvptx.c
@@ -93,6 +93,18 @@ nvptx_option_override (void)
   init_machine_status = nvptx_init_machine_status;
   /* Gives us a predictable order, which we need especially for variables.  */
   flag_toplevel_reorder = 1;
+  /* If enabled, global cold/hot labels will be emitted, which our mkoffload
+     currently doesn't cope with.  Also, it's not clear whether such
+     partitioning actually has any positive effect on the virtual ISA that PTX
+     is.  */
+  if (flag_reorder_blocks_and_partition)
+    {
+      inform (input_location,
+              "-freorder-blocks-and-partition not supported on this "
+              "architecture");
+      flag_reorder_blocks_and_partition = 0;
+      flag_reorder_blocks = 1;
+    }
   /* Assumes that it will see only hard registers.  */
   flag_var_tracking = 0;
   write_symbols = NO_DEBUG;


Regards,
 Thomas