public inbox for gcc-patches@gcc.gnu.org
* Group static constructors and destructors in specific subsections
@ 2010-10-14 21:09 Jan Hubicka
  2010-10-14 21:10 ` Joseph S. Myers
                   ` (6 more replies)
  0 siblings, 7 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-10-14 21:09 UTC (permalink / raw)
  To: gcc-patches, rguenther, iant, iains, tglek

[-- Attachment #1: Type: text/plain, Size: 15771 bytes --]

Hi,
this patch puts static constructors, functions called only from static
constructors, and the function main() into the .text.startup subsection.
Similarly, static destructors and functions called only from static
destructors go into the .text.exit subsection.
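
As an illustration only, the "called only from static constructors" rule can be sketched as a fixed-point propagation over a toy call graph. Everything named toy_* below is hypothetical and is not GCC's cgraph API; the sketch assumes a node becomes startup-only when it has at least one caller other than itself and every such caller is startup-only and is not main().

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_CALLERS 4

/* Hypothetical toy call-graph node; not GCC's struct cgraph_node.  */
struct toy_node
{
  const char *name;
  bool is_main;                 /* main() is startup code but does not pass the flag on */
  bool only_called_at_startup;  /* preset for constructors and main() */
  int n_callers;
  int callers[TOY_MAX_CALLERS]; /* indices into the node array */
};

/* Try to promote node I.  Return true when its flag changed.  */
static bool
toy_propagate (struct toy_node *nodes, int i)
{
  struct toy_node *node = &nodes[i];
  int seen = 0;

  assert (node->n_callers <= TOY_MAX_CALLERS);
  if (node->only_called_at_startup)
    return false;
  for (int k = 0; k < node->n_callers; k++)
    {
      const struct toy_node *caller = &nodes[node->callers[k]];

      if (caller == node)
        continue;               /* self-recursion tells us nothing */
      seen++;
      if (!caller->only_called_at_startup || caller->is_main)
        return false;           /* reachable from ordinary code */
    }
  if (seen == 0)
    return false;               /* no evidence either way */
  node->only_called_at_startup = true;
  return true;
}

/* Repeat the propagation until no flag changes any more.  */
void
toy_propagate_all (struct toy_node *nodes, int n)
{
  bool changed = true;

  while (changed)
    {
      changed = false;
      for (int i = 0; i < n; i++)
        if (toy_propagate (nodes, i))
          changed = true;
    }
}
```

With a constructor that calls init_tables, which in turn calls helper, both end up flagged regardless of the order of the nodes in the array; a function also called from main() stays unflagged.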

Together with the linker script update (attached, binutils side), this causes
a significant improvement in startup time (40% of Mozilla's binary startup
is paging in hundreds of static constructors evenly spread across the binary).

I am not sure how to update gold.  I basically copied the existing code in
binutils for the .text.unlikely group in the GNU ld linker script, but I think
gold makes independent decisions somewhere.

Bootstrapped/regtested on x86_64 linux.  I am not sure if the darwin.h update is
correct; I am just guessing based on the unlikely section definition.  OK for
the target bits?

Honza

	* doc/tm.texi.in (STARTUP_EXECUTED_TEXT_SECTION_NAME,
	EXIT_EXECUTED_TEXT_SECTION_NAME): Document.
	* config/i386/darwin.h (STARTUP_TEXT_SECTION_NAME,
	EXIT_TEXT_SECTION_NAME): Define.
	* config/rs6000/darwin.h (STARTUP_TEXT_SECTION_NAME,
	EXIT_TEXT_SECTION_NAME): Define.
	* config/ia64/hpux.h (STARTUP_TEXT_SECTION_NAME,
	EXIT_TEXT_SECTION_NAME): Define.
	* cgraph.c (dump_cgraph_node): Dump SAME_COMDAT_GROUP,
	ONLY_CALLED_AT_STARTUP and ONLY_CALLED_AT_EXIT.
	(cgraph_propagate_frequency): Compute only_called_at_startup
	and only_called_at_exit.
	* cgraph.h (struct cgraph_node): Add only_called_at_startup and
	only_called_at_exit fields.
	* defaults.h (STARTUP_TEXT_SECTION_NAME, EXIT_TEXT_SECTION_NAME): New.
	* lto-cgraph.c (lto_output_node): Add only_called_at_startup and
	only_called_at_exit.
	(input_overwrite_node): Likewise.
	* predict.c (compute_function_frequency): Set
	only_called_at_startup and only_called_at_exit.
	(choose_function_section): Set function sections.
Index: doc/tm.texi
===================================================================
*** doc/tm.texi	(revision 165478)
--- doc/tm.texi	(working copy)
*************** If defined, a C string constant for the
*** 6873,6878 ****
--- 6873,6888 ----
  executed functions in the program.
  @end defmac
  
+ @defmac STARTUP_EXECUTED_TEXT_SECTION_NAME
+ If defined, a C string constant for the name of the section containing 
+ functions executed at startup of the program.
+ @end defmac
+ 
+ @defmac EXIT_EXECUTED_TEXT_SECTION_NAME
+ If defined, a C string constant for the name of the section containing
+ functions executed only when the program exits.
+ @end defmac
+ 
  @defmac DATA_SECTION_ASM_OP
  A C expression whose value is a string, including spacing, containing the
  assembler operation to identify the following data as writable initialized
Index: doc/tm.texi.in
===================================================================
*** doc/tm.texi.in	(revision 165478)
--- doc/tm.texi.in	(working copy)
*************** If defined, a C string constant for the
*** 6869,6874 ****
--- 6869,6884 ----
  executed functions in the program.
  @end defmac
  
+ @defmac STARTUP_EXECUTED_TEXT_SECTION_NAME
+ If defined, a C string constant for the name of the section containing 
+ functions executed at startup of the program.
+ @end defmac
+ 
+ @defmac EXIT_EXECUTED_TEXT_SECTION_NAME
+ If defined, a C string constant for the name of the section containing
+ functions executed only when the program exits.
+ @end defmac
+ 
  @defmac DATA_SECTION_ASM_OP
  A C expression whose value is a string, including spacing, containing the
  assembler operation to identify the following data as writable initialized
Index: cgraph.c
===================================================================
*** cgraph.c	(revision 165478)
--- cgraph.c	(working copy)
*************** dump_cgraph_node (FILE *f, struct cgraph
*** 1807,1812 ****
--- 1807,1816 ----
      fprintf (f, " (inline copy in %s/%i)",
  	     cgraph_node_name (node->global.inlined_to),
  	     node->global.inlined_to->uid);
+   if (node->same_comdat_group)
+     fprintf (f, " (same comdat group as %s/%i)",
+ 	     cgraph_node_name (node->same_comdat_group),
+ 	     node->same_comdat_group->uid);
    if (node->clone_of)
      fprintf (f, " (clone of %s/%i)",
  	     cgraph_node_name (node->clone_of),
*************** dump_cgraph_node (FILE *f, struct cgraph
*** 1867,1872 ****
--- 1871,1880 ----
      fprintf (f, " redefined_extern_inline");
    if (TREE_ASM_WRITTEN (node->decl))
      fprintf (f, " asm_written");
+   if (node->only_called_at_startup)
+     fprintf (f, " only_called_at_startup");
+   if (node->only_called_at_exit)
+     fprintf (f, " only_called_at_exit");
  
    fprintf (f, "\n  called by: ");
    for (edge = node->callers; edge; edge = edge->next_caller)
*************** bool
*** 2618,2637 ****
  cgraph_propagate_frequency (struct cgraph_node *node)
  {
    bool maybe_unlikely_executed = true, maybe_executed_once = true;
    struct cgraph_edge *edge;
    if (!node->local.local)
      return false;
    gcc_assert (node->analyzed);
-   if (node->frequency == NODE_FREQUENCY_HOT)
-     return false;
-   if (node->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
-     return false;
    if (dump_file && (dump_flags & TDF_DETAILS))
      fprintf (dump_file, "Processing frequency %s\n", cgraph_node_name (node));
    for (edge = node->callers;
!        edge && (maybe_unlikely_executed || maybe_executed_once);
         edge = edge->next_caller)
      {
        if (!edge->frequency)
  	continue;
        switch (edge->caller->frequency)
--- 2626,2657 ----
  cgraph_propagate_frequency (struct cgraph_node *node)
  {
    bool maybe_unlikely_executed = true, maybe_executed_once = true;
+   bool only_called_at_startup = true;
+   bool only_called_at_exit = true;
+   bool changed = false;
    struct cgraph_edge *edge;
+ 
    if (!node->local.local)
      return false;
    gcc_assert (node->analyzed);
    if (dump_file && (dump_flags & TDF_DETAILS))
      fprintf (dump_file, "Processing frequency %s\n", cgraph_node_name (node));
+ 
    for (edge = node->callers;
!        edge && (maybe_unlikely_executed || maybe_executed_once
! 	        || only_called_at_startup || only_called_at_exit);
         edge = edge->next_caller)
      {
+       if (edge->caller != node)
+ 	{
+           only_called_at_startup &= edge->caller->only_called_at_startup;
+ 	  /* It makes sense to put main() together with the static constructors.
+ 	     It will be executed for sure, but the rest of the functions called
+ 	     from main are definitely not startup-only.  */
+ 	  if (MAIN_NAME_P (DECL_NAME (edge->caller->decl)))
+ 	    only_called_at_startup = 0;
+           only_called_at_exit &= edge->caller->only_called_at_exit;
+ 	}
        if (!edge->frequency)
  	continue;
        switch (edge->caller->frequency)
*************** cgraph_propagate_frequency (struct cgrap
*** 2640,2646 ****
  	  break;
  	case NODE_FREQUENCY_EXECUTED_ONCE:
  	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    fprintf (dump_file, "  Called by %s that is executed once\n", cgraph_node_name (node));
  	  maybe_unlikely_executed = false;
  	  if (edge->loop_nest)
  	    {
--- 2660,2667 ----
  	  break;
  	case NODE_FREQUENCY_EXECUTED_ONCE:
  	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    fprintf (dump_file, "  Called by %s that is executed once\n",
! 		     cgraph_node_name (edge->caller));
  	  maybe_unlikely_executed = false;
  	  if (edge->loop_nest)
  	    {
*************** cgraph_propagate_frequency (struct cgrap
*** 2652,2678 ****
  	case NODE_FREQUENCY_HOT:
  	case NODE_FREQUENCY_NORMAL:
  	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    fprintf (dump_file, "  Called by %s that is normal or hot\n", cgraph_node_name (node));
  	  maybe_unlikely_executed = false;
  	  maybe_executed_once = false;
  	  break;
  	}
      }
!    if (maybe_unlikely_executed)
!      {
!        node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED;
         if (dump_file)
!          fprintf (dump_file, "Node %s promoted to unlikely executed.\n", cgraph_node_name (node));
!        return true;
!      }
!    if (maybe_executed_once && node->frequency != NODE_FREQUENCY_EXECUTED_ONCE)
!      {
!        node->frequency = NODE_FREQUENCY_EXECUTED_ONCE;
         if (dump_file)
!          fprintf (dump_file, "Node %s promoted to executed once.\n", cgraph_node_name (node));
!        return true;
!      }
!    return false;
  }
  
  /* Return true when NODE can not return or throw and thus
--- 2673,2724 ----
  	case NODE_FREQUENCY_HOT:
  	case NODE_FREQUENCY_NORMAL:
  	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    fprintf (dump_file, "  Called by %s that is normal or hot\n",
! 		     cgraph_node_name (edge->caller));
  	  maybe_unlikely_executed = false;
  	  maybe_executed_once = false;
  	  break;
  	}
      }
!   if ((only_called_at_startup && !only_called_at_exit)
!       && !node->only_called_at_startup)
!     {
!        node->only_called_at_startup = true;
         if (dump_file)
!          fprintf (dump_file, "Node %s promoted to only called at startup.\n",
! 		  cgraph_node_name (node));
!        changed = true;
!     }
!   if ((only_called_at_exit && !only_called_at_startup)
!       && !node->only_called_at_exit)
!     {
!        node->only_called_at_exit = true;
         if (dump_file)
!          fprintf (dump_file, "Node %s promoted to only called at exit.\n",
! 		  cgraph_node_name (node));
!        changed = true;
!     }
!   /* These come either from profile or user hints; never update them.  */
!   if (node->frequency == NODE_FREQUENCY_HOT
!       || node->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
!     return changed;
!   if (maybe_unlikely_executed)
!     {
!       node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED;
!       if (dump_file)
! 	fprintf (dump_file, "Node %s promoted to unlikely executed.\n",
! 		 cgraph_node_name (node));
!       changed = true;
!     }
!   if (maybe_executed_once && node->frequency != NODE_FREQUENCY_EXECUTED_ONCE)
!     {
!       node->frequency = NODE_FREQUENCY_EXECUTED_ONCE;
!       if (dump_file)
! 	fprintf (dump_file, "Node %s promoted to executed once.\n",
! 		 cgraph_node_name (node));
!       changed = true;
!     }
!   return changed;
  }
  
  /* Return true when NODE can not return or throw and thus
Index: cgraph.h
===================================================================
*** cgraph.h	(revision 165478)
--- cgraph.h	(working copy)
*************** struct GTY((chain_next ("%h.next"), chai
*** 301,306 ****
--- 301,310 ----
    /* How commonly executed the node is.  Initialized during branch
       probabilities pass.  */
    ENUM_BITFIELD (node_frequency) frequency : 2;
+   /* True when function can only be called at startup (from static ctor).  */
+   unsigned only_called_at_startup : 1;
+   /* True when function can only be called at exit (from static dtor).  */
+   unsigned only_called_at_exit : 1;
  };
  
  typedef struct cgraph_node *cgraph_node_ptr;
Index: defaults.h
===================================================================
*** defaults.h	(revision 165478)
--- defaults.h	(working copy)
*************** see the files COPYING3 and COPYING.RUNTI
*** 939,944 ****
--- 939,952 ----
  #define TARGET_DEC_EVAL_METHOD 2
  #endif
  
+ #ifndef STARTUP_TEXT_SECTION_NAME
+ #define STARTUP_TEXT_SECTION_NAME ".text.startup"
+ #endif
+ 
+ #ifndef EXIT_TEXT_SECTION_NAME
+ #define EXIT_TEXT_SECTION_NAME ".text.exit"
+ #endif
+ 
  #ifndef HOT_TEXT_SECTION_NAME
  #define HOT_TEXT_SECTION_NAME ".text.hot"
  #endif
Index: lto-cgraph.c
===================================================================
*** lto-cgraph.c	(revision 165478)
--- lto-cgraph.c	(working copy)
*************** lto_output_node (struct lto_simple_outpu
*** 518,523 ****
--- 518,525 ----
    bp_pack_value (&bp, node->alias, 1);
    bp_pack_value (&bp, node->finalized_by_frontend, 1);
    bp_pack_value (&bp, node->frequency, 2);
+   bp_pack_value (&bp, node->only_called_at_startup, 1);
+   bp_pack_value (&bp, node->only_called_at_exit, 1);
    lto_output_bitpack (&bp);
    lto_output_uleb128_stream (ob->main_stream, node->resolution);
  
*************** input_overwrite_node (struct lto_file_de
*** 978,983 ****
--- 980,987 ----
    node->alias = bp_unpack_value (bp, 1);
    node->finalized_by_frontend = bp_unpack_value (bp, 1);
    node->frequency = (enum node_frequency)bp_unpack_value (bp, 2);
+   node->only_called_at_startup = bp_unpack_value (bp, 1);
+   node->only_called_at_exit = bp_unpack_value (bp, 1);
    node->resolution = resolution;
  }
  
Index: predict.c
===================================================================
*** predict.c	(revision 165478)
--- predict.c	(working copy)
*************** compute_function_frequency (void)
*** 2187,2192 ****
--- 2187,2197 ----
  {
    basic_block bb;
    struct cgraph_node *node = cgraph_node (current_function_decl);
+   if (DECL_STATIC_CONSTRUCTOR (current_function_decl)
+       || MAIN_NAME_P (DECL_NAME (current_function_decl)))
+     node->only_called_at_startup = true;
+   if (DECL_STATIC_DESTRUCTOR (current_function_decl))
+     node->only_called_at_exit = true;
  
    if (!profile_info || !flag_branch_probabilities)
      {
*************** choose_function_section (void)
*** 2233,2238 ****
--- 2238,2250 ----
        || DECL_ONE_ONLY (current_function_decl))
      return;
  
+   if (node->only_called_at_startup)
+     DECL_SECTION_NAME (current_function_decl) =
+       build_string (strlen (STARTUP_TEXT_SECTION_NAME), STARTUP_TEXT_SECTION_NAME);
+   else if (node->only_called_at_exit)
+     DECL_SECTION_NAME (current_function_decl) =
+       build_string (strlen (EXIT_TEXT_SECTION_NAME), EXIT_TEXT_SECTION_NAME);
+ 
    /* If we are doing the partitioning optimization, let the optimization
       choose the correct section into which to put things.  */
  
Index: config/i386/darwin.h
===================================================================
*** config/i386/darwin.h	(revision 165478)
--- config/i386/darwin.h	(working copy)
*************** extern int darwin_emit_branch_islands;
*** 198,203 ****
--- 198,207 ----
  #define HOT_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions"
  #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
                                "__TEXT,__unlikely,regular,pure_instructions"
+ #define STARTUP_TEXT_SECTION_NAME \
+                               "__TEXT,__startup,regular,pure_instructions"
+ #define EXIT_TEXT_SECTION_NAME \
+                               "__TEXT,__exit,regular,pure_instructions"
  
  /* Assembler pseudos to introduce constants of various size.  */
  
Index: config/ia64/hpux.h
===================================================================
*** config/ia64/hpux.h	(revision 165478)
--- config/ia64/hpux.h	(working copy)
*************** do {								\
*** 226,228 ****
--- 226,234 ----
  
  #undef HOT_TEXT_SECTION_NAME
  #define HOT_TEXT_SECTION_NAME ".text"
+ 
+ #undef STARTUP_TEXT_SECTION_NAME
+ #define STARTUP_TEXT_SECTION_NAME ".text"
+ 
+ #undef EXIT_TEXT_SECTION_NAME
+ #define EXIT_TEXT_SECTION_NAME ".text"
Index: config/rs6000/darwin.h
===================================================================
*** config/rs6000/darwin.h	(revision 165478)
--- config/rs6000/darwin.h	(working copy)
*************** extern int darwin_emit_branch_islands;
*** 188,193 ****
--- 188,195 ----
  #define HOT_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions"
  #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
                                "__TEXT,__unlikely,regular,pure_instructions"
+ #define STARTUP_TEXT_SECTION_NAME "__TEXT,__startup,regular,pure_instructions"
+ #define EXIT_TEXT_SECTION_NAME "__TEXT,__exit,regular,pure_instructions"
  
  /* Define cutoff for using external functions to save floating point.
     Currently on Darwin, always use inline stores.  */

[-- Attachment #2: unlikely --]
[-- Type: text/plain, Size: 835 bytes --]

	* scripttempl/elf.sc: Group exit, startup and hot sections, too.
Index: ld/scripttempl/elf.sc
===================================================================
RCS file: /cvs/src/src/ld/scripttempl/elf.sc,v
retrieving revision 1.102
diff -c -p -r1.102 elf.sc
*** ld/scripttempl/elf.sc	23 Mar 2010 16:02:02 -0000	1.102
--- ld/scripttempl/elf.sc	14 Oct 2010 20:09:46 -0000
*************** cat <<EOF
*** 433,438 ****
--- 433,441 ----
    {
      ${RELOCATING+${TEXT_START_SYMBOLS}}
      ${RELOCATING+*(.text.unlikely .text.*_unlikely)}
+     ${RELOCATING+*(.text.exit .text.*_exit)}
+     ${RELOCATING+*(.text.startup .text.*_startup)}
+     ${RELOCATING+*(.text.hot .text.*_hot)}
      *(.text .stub${RELOCATING+ .text.* .gnu.linkonce.t.*})
      /* .gnu.warning sections are handled specially by elf32.em.  */
      *(.gnu.warning)

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:09 Group static constructors and destructors in specific subsections Jan Hubicka
@ 2010-10-14 21:10 ` Joseph S. Myers
  2010-10-14 21:12   ` Richard Henderson
  2010-10-14 21:17   ` Group static constructors and destructors in specific subsections Jan Hubicka
  2010-10-14 23:53 ` Mike Stump
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 70+ messages in thread
From: Joseph S. Myers @ 2010-10-14 21:10 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches, rguenther, iant, iains, tglek

On Thu, 14 Oct 2010, Jan Hubicka wrote:

> 	* doc/tm.texi.in (STARTUP_EXECUTED_TEXT_SECTION_NAME,
> 	EXIT_EXECUTED_TEXT_SECTION_NAME): Document

Unless you really need target macros, could you please create hooks 
instead?  There's at least one hook giving a section name already 
(TARGET_ASM_RECORD_GCC_SWITCHES_SECTION).

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:10 ` Joseph S. Myers
@ 2010-10-14 21:12   ` Richard Henderson
  2010-10-14 21:18     ` Jan Hubicka
                       ` (3 more replies)
  2010-10-14 21:17   ` Group static constructors and destructors in specific subsections Jan Hubicka
  1 sibling, 4 replies; 70+ messages in thread
From: Richard Henderson @ 2010-10-14 21:12 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Jan Hubicka, gcc-patches, rguenther, iant, iains, tglek

On 10/14/2010 01:47 PM, Joseph S. Myers wrote:
> On Thu, 14 Oct 2010, Jan Hubicka wrote:
> 
>> 	* doc/tm.texi.in (STARTUP_EXECUTED_TEXT_SECTION_NAME,
>> 	EXIT_EXECUTED_TEXT_SECTION_NAME): Document
> 
> Unless you really need target macros, could you please create hooks 
> instead?  There's at least one hook giving a section name already 
> (TARGET_ASM_RECORD_GCC_SWITCHES_SECTION).
> 
Should we be returning section NAMES at all?

We have a nicely modular section object.  It would seem
we should make more use of that.


r~

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:10 ` Joseph S. Myers
  2010-10-14 21:12   ` Richard Henderson
@ 2010-10-14 21:17   ` Jan Hubicka
  1 sibling, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-10-14 21:17 UTC (permalink / raw)
  To: Joseph S. Myers; +Cc: Jan Hubicka, gcc-patches, rguenther, iant, iains, tglek

> On Thu, 14 Oct 2010, Jan Hubicka wrote:
> 
> > 	* doc/tm.texi.in (STARTUP_EXECUTED_TEXT_SECTION_NAME,
> > 	EXIT_EXECUTED_TEXT_SECTION_NAME): Document
> 
> Unless you really need target macros, could you please create hooks 
> instead?  There's at least one hook giving a section name already 
> (TARGET_ASM_RECORD_GCC_SWITCHES_SECTION).

I've just copied what the existing HOT/UNLIKELY_EXECUTED macros do.  I can
switch them all to hooks at once, I guess.

Honza
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.com

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:12   ` Richard Henderson
@ 2010-10-14 21:18     ` Jan Hubicka
  2010-10-14 21:21       ` Richard Henderson
  2010-10-14 21:48     ` Joseph S. Myers
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 70+ messages in thread
From: Jan Hubicka @ 2010-10-14 21:18 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Joseph S. Myers, Jan Hubicka, gcc-patches, rguenther, iant, iains, tglek

> On 10/14/2010 01:47 PM, Joseph S. Myers wrote:
> > On Thu, 14 Oct 2010, Jan Hubicka wrote:
> > 
> >> 	* doc/tm.texi.in (STARTUP_EXECUTED_TEXT_SECTION_NAME,
> >> 	EXIT_EXECUTED_TEXT_SECTION_NAME): Document
> > 
> > Unless you really need target macros, could you please create hooks 
> > instead?  There's at least one hook giving a section name already 
> > (TARGET_ASM_RECORD_GCC_SWITCHES_SECTION).
> > 
> Should we be returning section NAMES at all?
> 
> We have a nicely modular section object.  It would seem
> we should make more use of that.

Hmm, you got me ;) I am not quite sure what you are suggesting.
The hooks exist mostly so darwin can redefine them and hp-ux can disable them
to avoid a linker bug.  I guess HPUX could go with just disabling the function
reordering flag; what should darwin do?

Honza
> 
> 
> r~

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:18     ` Jan Hubicka
@ 2010-10-14 21:21       ` Richard Henderson
  2010-10-14 22:02         ` Jan Hubicka
  0 siblings, 1 reply; 70+ messages in thread
From: Richard Henderson @ 2010-10-14 21:21 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Joseph S. Myers, gcc-patches, rguenther, iant, iains, tglek

On 10/14/2010 02:11 PM, Jan Hubicka wrote:
>> We have a nicely modular section object.  It would seem
>> we should make more use of that.
> 
> Hmm, you got me ;) I am not quite sure what you are suggesting.
> The hooks exist mostly so darwin can redefine them and hp-ux can disable them
> to avoid a linker bug.  I guess HPUX could go with just disabling the function
> reordering flag; what should darwin do?

Default would do something very much like the
unlikely_text_section function.

Darwin likewise, except with their funny names.

Hpux would simply return text_section.
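
A minimal sketch of that shape, for illustration only: per-target functions that hand back a section object rather than a name. The toy_* types and functions are hypothetical stand-ins, not GCC's `section' type or an existing hook; the Darwin strings are the ones from the patch, and the HP-UX variant is simplified to return plain text for every kind.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a section object.  */
struct toy_section
{
  const char *name;
};

enum toy_text_kind
{
  TOY_TEXT_NORMAL,
  TOY_TEXT_UNLIKELY,
  TOY_TEXT_STARTUP,
  TOY_TEXT_EXIT,
  TOY_TEXT_KINDS
};

/* Default target: ELF-style subsection names.  */
const struct toy_section *
toy_default_text_section (enum toy_text_kind kind)
{
  static const struct toy_section table[TOY_TEXT_KINDS] = {
    { ".text" }, { ".text.unlikely" }, { ".text.startup" }, { ".text.exit" }
  };

  assert ((int) kind >= 0 && (int) kind < TOY_TEXT_KINDS);
  return &table[kind];
}

/* Darwin: same structure, Mach-O segment,section names.  */
const struct toy_section *
toy_darwin_text_section (enum toy_text_kind kind)
{
  static const struct toy_section table[TOY_TEXT_KINDS] = {
    { "__TEXT,__text,regular,pure_instructions" },
    { "__TEXT,__unlikely,regular,pure_instructions" },
    { "__TEXT,__startup,regular,pure_instructions" },
    { "__TEXT,__exit,regular,pure_instructions" }
  };

  assert ((int) kind >= 0 && (int) kind < TOY_TEXT_KINDS);
  return &table[kind];
}

/* HP-UX: always the plain text section (simplified for this sketch).  */
const struct toy_section *
toy_hpux_text_section (enum toy_text_kind kind)
{
  (void) kind;
  return toy_default_text_section (TOY_TEXT_NORMAL);
}

/* Helper for callers that want to check a section by name.  */
int
toy_section_named (const struct toy_section *s, const char *name)
{
  return strcmp (s->name, name) == 0;
}
```

Callers would then compare or emit section objects directly, and only the target-specific tables would know the spelling of the names.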


r~

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:12   ` Richard Henderson
  2010-10-14 21:18     ` Jan Hubicka
@ 2010-10-14 21:48     ` Joseph S. Myers
  2010-10-23 16:01     ` Jan Hubicka
  2010-11-10  4:59     ` Group static constructors and destructors in specific subsections, take 2 Jan Hubicka
  3 siblings, 0 replies; 70+ messages in thread
From: Joseph S. Myers @ 2010-10-14 21:48 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Jan Hubicka, gcc-patches, rguenther, iant, iains, tglek

On Thu, 14 Oct 2010, Richard Henderson wrote:

> On 10/14/2010 01:47 PM, Joseph S. Myers wrote:
> > On Thu, 14 Oct 2010, Jan Hubicka wrote:
> > 
> >> 	* doc/tm.texi.in (STARTUP_EXECUTED_TEXT_SECTION_NAME,
> >> 	EXIT_EXECUTED_TEXT_SECTION_NAME): Document
> > 
> > Unless you really need target macros, could you please create hooks 
> > instead?  There's at least one hook giving a section name already 
> > (TARGET_ASM_RECORD_GCC_SWITCHES_SECTION).
> > 
> Should we be returning section NAMES at all?
> 
> We have a nicely modular section object.  It would seem
> we should make more use of that.

Sure, there are some hooks returning section objects as well; it might be 
better to create some of those.

-- 
Joseph S. Myers
joseph@codesourcery.com

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:21       ` Richard Henderson
@ 2010-10-14 22:02         ` Jan Hubicka
  2010-10-14 22:39           ` Richard Henderson
  0 siblings, 1 reply; 70+ messages in thread
From: Jan Hubicka @ 2010-10-14 22:02 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jan Hubicka, Joseph S. Myers, gcc-patches, rguenther, iant, iains, tglek

> On 10/14/2010 02:11 PM, Jan Hubicka wrote:
> >> We have a nicely modular section object.  It would seem
> >> we should make more use of that.
> > 
> > Hmm, you got me ;) I am not quite sure what you are suggesting.
> > The hooks exist mostly so darwin can redefine them and hp-ux can disable them
> > to avoid a linker bug.  I guess HPUX could go with just disabling the function
> > reordering flag; what should darwin do?
> 
> Default would do something very much like the
> unlikely_text_section function.
> 
> Darwin likewise, except with their funny names.
> 
> Hpux would simply return text_section.

OK, so we will have target hooks that return function pointers (instead of the
current section names) for functions returning a section object.  We will
default them to return the existing unlikely_text_section function pointer for
UNLIKELY_EXECUTED.  For HOT, STARTUP and EXIT we will have newly born functions
hot_text_section, startup_text_section and exit_text_section?

Where will the actual names come from?  Will every target (i.e. rs6000 and i386
darwin ;) that requires different section names need to define 4 equivalents
of unlikely_text_section?

It always seemed to me that the complexity of unlikely_text_section comes from
the fact that it attempts to handle functions with named sections (especially
ones supplied by -ffunction-sections) so function splitting works.  We don't do
function splitting into the hot, startup or exit subsections.  Either the whole
function goes there or nothing does, so for functions with named sections we
don't want to do this optimization at all.

Also choose_function_section will need a reorg to somehow operate on section
structures rather than section names.  What should the interface be there?

I must say it escapes me a bit how this helps to make things more modular,
but I will look a bit more into the code.

Honza

> 
> 
> r~

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 22:02         ` Jan Hubicka
@ 2010-10-14 22:39           ` Richard Henderson
  0 siblings, 0 replies; 70+ messages in thread
From: Richard Henderson @ 2010-10-14 22:39 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Joseph S. Myers, gcc-patches, rguenther, iant, iains, tglek

On 10/14/2010 02:48 PM, Jan Hubicka wrote:
> OK, so we will have target hooks that return function pointers (instead of the
> current section names) for functions returning a section object.  We will
> default them to return the existing unlikely_text_section function pointer for
> UNLIKELY_EXECUTED.  For HOT, STARTUP and EXIT we will have newly born functions
> hot_text_section, startup_text_section and exit_text_section?

Um, no.  Do not return function pointers.  Only return section objects.

> Also choose_function_section will need a reorg to somehow operate on section
> structures rather than section names.  What should the interface be there?

Probably to delete that function entirely and instead test frequency in
function_section.  I don't see why you need to set DECL_SECTION_NAME
there at all.
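
One way to picture this, as a hedged sketch rather than GCC code: a single chooser that inspects per-function flags at the point where the section is needed, so nothing has to write a section name back onto the declaration. The toy_fn structure and both functions are hypothetical; the sketch tests the startup/exit flags from the patch rather than the frequency enum, and it assumes that functions with an explicit section or one-only linkage are left alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of what the chooser needs to know about a
   function; not GCC's cgraph_node or tree decl.  */
struct toy_fn
{
  const char *user_section;     /* explicit section name, or NULL */
  bool one_only;                /* one-only (COMDAT-like) function */
  bool only_called_at_startup;
  bool only_called_at_exit;
};

/* Pick the section for FN when it is about to be emitted.  */
const char *
toy_function_section (const struct toy_fn *fn)
{
  assert (fn != NULL);
  if (fn->user_section != NULL)
    return fn->user_section;    /* explicit placement always wins */
  if (fn->one_only)
    return ".text";             /* not regrouped in this sketch */
  if (fn->only_called_at_startup)
    return ".text.startup";
  if (fn->only_called_at_exit)
    return ".text.exit";
  return ".text";
}

/* True when FN would be emitted into the section called NAME.  */
bool
toy_in_section (const struct toy_fn *fn, const char *name)
{
  return strcmp (toy_function_section (fn), name) == 0;
}
```

Startup is tested before exit here, matching the order of the two branches in the patch.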


r~

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:09 Group static constructors and destructors in specific subsections Jan Hubicka
  2010-10-14 21:10 ` Joseph S. Myers
@ 2010-10-14 23:53 ` Mike Stump
  2010-10-15  0:05 ` Mark Mitchell
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 70+ messages in thread
From: Mike Stump @ 2010-10-14 23:53 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches, rguenther, iant, iains, tglek

On Oct 14, 2010, at 1:38 PM, Jan Hubicka wrote:
> this patch puts static constructors, functions called only from static
> constructors, and the function main() into the .text.startup subsection.

> Bootstrapped/regtested on x86_64 linux.  I am not sure if the darwin.h update is correct

Yes, it looks fine.

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:09 Group static constructors and destructors in specific subsections Jan Hubicka
  2010-10-14 21:10 ` Joseph S. Myers
  2010-10-14 23:53 ` Mike Stump
@ 2010-10-15  0:05 ` Mark Mitchell
  2010-10-15  0:33   ` Jan Hubicka
  2010-10-15  7:24 ` Mike Stump
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 70+ messages in thread
From: Mark Mitchell @ 2010-10-15  0:05 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches, rguenther, iant, iains, tglek

On 10/14/2010 1:38 PM, Jan Hubicka wrote:

> Together with the linker script update (attached, binutils side), this causes
> a significant improvement in startup time (40% of Mozilla's binary startup
> is paging in hundreds of static constructors evenly spread across the binary).

Nice!

I wonder if we could avoid requiring the linker-script updates by using
the existing .ctors section stuff?  The linker script already keeps such
sections together.  That would be nice because it would avoid a
dependency on having a GNU ld update.

-- 
Mark Mitchell
CodeSourcery
mark@codesourcery.com
(650) 331-3385 x713

* Re: Group static constructors and destructors in specific subsections
  2010-10-15  0:05 ` Mark Mitchell
@ 2010-10-15  0:33   ` Jan Hubicka
  0 siblings, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-10-15  0:33 UTC (permalink / raw)
  To: Mark Mitchell; +Cc: Jan Hubicka, gcc-patches, rguenther, iant, iains, tglek

> On 10/14/2010 1:38 PM, Jan Hubicka wrote:
> 
> > Together with the linker script update (attached, binutils side), this causes
> > a significant improvement in startup time (40% of Mozilla's binary startup
> > is paging in hundreds of static constructors evenly spread across the binary).
> 
> Nice!
> 
> I wonder if we could avoid requiring the linker-script updates by using
> the existing .ctors section stuff?  The linker script already keeps such
> sections together.  That would be nice because it would avoid a
> dependency on having a GNU ld update.

.ctors grouping is done via an extra data section.  Here I need to group a .text
subsection, so I am not sure how it would help me, but suggestions are welcome ;)
I am using the same machinery we have used for .text.unlikely/.text.hot for some
years already, so it is not that non-standard either.

Honza
> 
> -- 
> Mark Mitchell
> CodeSourcery
> mark@codesourcery.com
> (650) 331-3385 x713

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:09 Group static constructors and destructors in specific subsections Jan Hubicka
                   ` (2 preceding siblings ...)
  2010-10-15  0:05 ` Mark Mitchell
@ 2010-10-15  7:24 ` Mike Stump
  2010-10-15 12:14   ` Jan Hubicka
  2010-10-15 14:05 ` Paolo Carlini
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 70+ messages in thread
From: Mike Stump @ 2010-10-15  7:24 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches, rguenther, iant, iains, tglek

On Oct 14, 2010, at 1:38 PM, Jan Hubicka wrote:
> Together with the linker script update (attached, binutils side), this causes
> a significant improvement in startup time (40% of Mozilla's binary startup
> is paging in hundreds of static constructors evenly spread across the binary).

I'm curious, I'd expect the same improvement with no binutils change?  How do the numbers differ without it?


* Re: Group static constructors and destructors in specific subsections
  2010-10-15  7:24 ` Mike Stump
@ 2010-10-15 12:14   ` Jan Hubicka
  0 siblings, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-10-15 12:14 UTC (permalink / raw)
  To: Mike Stump; +Cc: Jan Hubicka, gcc-patches, rguenther, iant, iains, tglek

> On Oct 14, 2010, at 1:38 PM, Jan Hubicka wrote:
> > Together with linker script update (attached at binutils side) this cause
> > significant improvements in the startup time (40% of Mozilla's binary startup
> > is paging in hundreds of static constructors evenly spred across the binary).
> 
> I'm curious, I'd expect the same improvement with no binutils change?  How do the numbers differ without it?

Without the binutils change the subsections are laid out sequentially.  We
basically get the constructors at the end of each original .o file instead of in
the middle of it, so they stay scattered across the binary and it makes no
difference.
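
To put rough numbers on this, here is a toy model of the layout.  It is a
sketch with made-up sizes, not a measurement: pages_touched and all its
parameters are hypothetical, and it only counts how many distinct pages hold
at least one byte of constructor code when N equally sized pieces of code are
placed STRIDE bytes apart.

```c
#include <assert.h>
#include <stddef.h>

/* Count the distinct pages covered by N code ranges of LEN bytes each,
   where range I starts at byte FIRST + I * STRIDE.  STRIDE >= LEN, so
   the ranges are disjoint and sorted by address.  */
size_t
pages_touched (size_t n, size_t stride, size_t first, size_t len,
               size_t page)
{
  size_t pages = 0;
  size_t next_uncounted = 0;   /* lowest page number not yet counted */

  assert (page > 0 && len > 0 && stride >= len);
  for (size_t i = 0; i < n; i++)
    {
      size_t lo = (first + i * stride) / page;
      size_t hi = (first + i * stride + len - 1) / page;

      if (lo < next_uncounted)
        lo = next_uncounted;   /* page shared with the previous range */
      if (hi >= lo)
        {
          pages += hi - lo + 1;
          next_uncounted = hi + 1;
        }
    }
  return pages;
}
```

With 1300 object files of 16 KB each, one 64-byte constructor per file and
4 KB pages (all hypothetical), the model touches 1300 pages whether the
constructor sits in the middle or at the end of each file, and 21 pages once
the constructors are packed together.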

Honza


* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:09 Group static constructors and destructors in specific subsections Jan Hubicka
                   ` (3 preceding siblings ...)
  2010-10-15  7:24 ` Mike Stump
@ 2010-10-15 14:05 ` Paolo Carlini
  2010-10-15 16:54   ` Jan Hubicka
  2010-10-15 14:36 ` Jack Howarth
  2010-10-21 18:02 ` Cary Coutant
  6 siblings, 1 reply; 70+ messages in thread
From: Paolo Carlini @ 2010-10-15 14:05 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches, rguenther, iant, iains, tglek

On 10/14/2010 10:38 PM, Jan Hubicka wrote:
> Together with linker script update (attached at binutils side) this cause
> significant improvements in the startup time (40% of Mozilla's binary startup
> is paging in hundreds of static constructors evenly spred across the binary).
>   
Thanks for the great work. Then I understand the libstdc++ cases are
automatically dealt with correctly and we don't have to do anything
special at the library level, right? I mean with reference to the PR
which you opened some time ago...

Thanks,
Paolo.


* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:09 Group static constructors and destructors in specific subsections Jan Hubicka
                   ` (4 preceding siblings ...)
  2010-10-15 14:05 ` Paolo Carlini
@ 2010-10-15 14:36 ` Jack Howarth
  2010-10-15 16:14   ` Jan Hubicka
  2010-10-21 18:02 ` Cary Coutant
  6 siblings, 1 reply; 70+ messages in thread
From: Jack Howarth @ 2010-10-15 14:36 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches, rguenther, iant, iains, tglek

On Thu, Oct 14, 2010 at 10:38:17PM +0200, Jan Hubicka wrote:
> Hi,
> this patch puts static constructors, functions called only from the static
> constructors and the function main() into .text.startup subsection.  Similarly
> the static desturctors and functions called only from the static destructors
> into the .text.exit subsection.
> 
> Together with linker script update (attached at binutils side) this cause
> significant improvements in the startup time (40% of Mozilla's binary startup
> is paging in hundreds of static constructors evenly spred across the binary).
> 
> I am not sure how to update gold - I basically copied existing code in binutils
> for .text.unlikely group in GNU LD linker script, but I think gold is doing
> independent decisions somewhere. 
> 
> Boostrapped/regtested x86_64 linux.  I am not sure if darwin.h update is correct,
> I am just ugessing based on unlikely section definition. OK for the target bits?

Honza,
  These changes produce no regressions on x86_64-apple-darwin10.

http://gcc.gnu.org/ml/gcc-testresults/2010-10/msg01167.html

      Jack
> 
> Honza
> 
> 	* doc/tm.texi.in (STARTUP_EXECUTED_TEXT_SECTION_NAME,
> 	EXIT_EXECUTED_TEXT_SECTION_NAME): Document
> 	* config/i386/darwin.h (STARTUP_TEXT_SECTION_NAME,
> 	* EXIT_TEXT_SECTION_NAME): Define.
> 	* config/rs6000/darwin.h (STARTUP_TEXT_SECTION_NAME,
> 	* EXIT_TEXT_SECTION_NAME): Define.
> 	* config/ia64/hpux.h (STARTUP_TEXT_SECTION_NAME,
> 	EXIT_TEXT_SECTION_NAME): Define.
> 	* cgraph.c (dump_cgraph_node): Dump SAME_COMDAT_GROUP,
> 	ONLY_CALLED_AT_STARTUP and ONLY_CALLED_AT_EXIT.
> 	(cgraph_propagate_frequency): Compute only_called_at_startup
> 	and only_called_at_exit.
> 	* cgraph.h (struct cgraph_node): Add  only_called_at_startup and
> 	only_called_at_exit fields.
> 	* defaults.h (STARTUP_TEXT_SECTION_NAME, EXIT_TEXT_SECTION_NAME): New.
> 	* lto-cgraph.c (lto_output_node): Add only_called_at_startup and
> 	only_called_at_exit.
> 	(input_overwrite_node): Likewise.
> 	* predict.c (compute_function_frequency): Set
> 	only_called_at_startup and only_called_at_exit.
> 	(choose_function_section): Set function sections.
> Index: doc/tm.texi
> ===================================================================
> *** doc/tm.texi	(revision 165478)
> --- doc/tm.texi	(working copy)
> *************** If defined, a C string constant for the
> *** 6873,6878 ****
> --- 6873,6888 ----
>   executed functions in the program.
>   @end defmac
>   
> + @defmac STARTUP_EXECUTED_TEXT_SECTION_NAME
> + If defined, a C string constant for the name of the section containing 
> + functions executed at startup of the program.
> + @end defmac
> + 
> + @defmac EXIT_EXECUTED_TEXT_SECTION_NAME
> + If defined, a C string constant for the name of the section containing
> + functions executed only when the program is doing exit.
> + @end defmac
> + 
>   @defmac DATA_SECTION_ASM_OP
>   A C expression whose value is a string, including spacing, containing the
>   assembler operation to identify the following data as writable initialized
> Index: doc/tm.texi.in
> ===================================================================
> *** doc/tm.texi.in	(revision 165478)
> --- doc/tm.texi.in	(working copy)
> *************** If defined, a C string constant for the
> *** 6869,6874 ****
> --- 6869,6884 ----
>   executed functions in the program.
>   @end defmac
>   
> + @defmac STARTUP_EXECUTED_TEXT_SECTION_NAME
> + If defined, a C string constant for the name of the section containing 
> + functions executed at startup of the program.
> + @end defmac
> + 
> + @defmac EXIT_EXECUTED_TEXT_SECTION_NAME
> + If defined, a C string constant for the name of the section containing
> + functions executed only when the program is doing exit.
> + @end defmac
> + 
>   @defmac DATA_SECTION_ASM_OP
>   A C expression whose value is a string, including spacing, containing the
>   assembler operation to identify the following data as writable initialized
> Index: cgraph.c
> ===================================================================
> *** cgraph.c	(revision 165478)
> --- cgraph.c	(working copy)
> *************** dump_cgraph_node (FILE *f, struct cgraph
> *** 1807,1812 ****
> --- 1807,1816 ----
>       fprintf (f, " (inline copy in %s/%i)",
>   	     cgraph_node_name (node->global.inlined_to),
>   	     node->global.inlined_to->uid);
> +   if (node->same_comdat_group)
> +     fprintf (f, " (same comdat group as %s/%i)",
> + 	     cgraph_node_name (node->same_comdat_group),
> + 	     node->same_comdat_group->uid);
>     if (node->clone_of)
>       fprintf (f, " (clone of %s/%i)",
>   	     cgraph_node_name (node->clone_of),
> *************** dump_cgraph_node (FILE *f, struct cgraph
> *** 1867,1872 ****
> --- 1871,1880 ----
>       fprintf (f, " redefined_extern_inline");
>     if (TREE_ASM_WRITTEN (node->decl))
>       fprintf (f, " asm_written");
> +   if (node->only_called_at_startup)
> +     fprintf (f, " only_called_at_startup");
> +   if (node->only_called_at_exit)
> +     fprintf (f, " only_called_at_exit");
>   
>     fprintf (f, "\n  called by: ");
>     for (edge = node->callers; edge; edge = edge->next_caller)
> *************** bool
> *** 2618,2637 ****
>   cgraph_propagate_frequency (struct cgraph_node *node)
>   {
>     bool maybe_unlikely_executed = true, maybe_executed_once = true;
>     struct cgraph_edge *edge;
>     if (!node->local.local)
>       return false;
>     gcc_assert (node->analyzed);
> -   if (node->frequency == NODE_FREQUENCY_HOT)
> -     return false;
> -   if (node->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
> -     return false;
>     if (dump_file && (dump_flags & TDF_DETAILS))
>       fprintf (dump_file, "Processing frequency %s\n", cgraph_node_name (node));
>     for (edge = node->callers;
> !        edge && (maybe_unlikely_executed || maybe_executed_once);
>          edge = edge->next_caller)
>       {
>         if (!edge->frequency)
>   	continue;
>         switch (edge->caller->frequency)
> --- 2626,2657 ----
>   cgraph_propagate_frequency (struct cgraph_node *node)
>   {
>     bool maybe_unlikely_executed = true, maybe_executed_once = true;
> +   bool only_called_at_startup = true;
> +   bool only_called_at_exit = true;
> +   bool changed = false;
>     struct cgraph_edge *edge;
> + 
>     if (!node->local.local)
>       return false;
>     gcc_assert (node->analyzed);
>     if (dump_file && (dump_flags & TDF_DETAILS))
>       fprintf (dump_file, "Processing frequency %s\n", cgraph_node_name (node));
> + 
>     for (edge = node->callers;
> !        edge && (maybe_unlikely_executed || maybe_executed_once
> ! 	        || only_called_at_startup || only_called_at_exit);
>          edge = edge->next_caller)
>       {
> +       if (edge->caller != node)
> + 	{
> +           only_called_at_startup &= edge->caller->only_called_at_startup;
> + 	  /* It makes snese to put main() together with the static constructors.
> + 	     It will be executed for sure, but rest of functions called from
> + 	     main are definitly not at startup only.  */
> + 	  if (MAIN_NAME_P (DECL_NAME (edge->caller->decl)))
> + 	    only_called_at_startup = 0;
> +           only_called_at_exit &= edge->caller->only_called_at_exit;
> + 	}
>         if (!edge->frequency)
>   	continue;
>         switch (edge->caller->frequency)
> *************** cgraph_propagate_frequency (struct cgrap
> *** 2640,2646 ****
>   	  break;
>   	case NODE_FREQUENCY_EXECUTED_ONCE:
>   	  if (dump_file && (dump_flags & TDF_DETAILS))
> ! 	    fprintf (dump_file, "  Called by %s that is executed once\n", cgraph_node_name (node));
>   	  maybe_unlikely_executed = false;
>   	  if (edge->loop_nest)
>   	    {
> --- 2660,2667 ----
>   	  break;
>   	case NODE_FREQUENCY_EXECUTED_ONCE:
>   	  if (dump_file && (dump_flags & TDF_DETAILS))
> ! 	    fprintf (dump_file, "  Called by %s that is executed once\n",
> ! 		     cgraph_node_name (node));
>   	  maybe_unlikely_executed = false;
>   	  if (edge->loop_nest)
>   	    {
> *************** cgraph_propagate_frequency (struct cgrap
> *** 2652,2678 ****
>   	case NODE_FREQUENCY_HOT:
>   	case NODE_FREQUENCY_NORMAL:
>   	  if (dump_file && (dump_flags & TDF_DETAILS))
> ! 	    fprintf (dump_file, "  Called by %s that is normal or hot\n", cgraph_node_name (node));
>   	  maybe_unlikely_executed = false;
>   	  maybe_executed_once = false;
>   	  break;
>   	}
>       }
> !    if (maybe_unlikely_executed)
> !      {
> !        node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED;
>          if (dump_file)
> !          fprintf (dump_file, "Node %s promoted to unlikely executed.\n", cgraph_node_name (node));
> !        return true;
> !      }
> !    if (maybe_executed_once && node->frequency != NODE_FREQUENCY_EXECUTED_ONCE)
> !      {
> !        node->frequency = NODE_FREQUENCY_EXECUTED_ONCE;
>          if (dump_file)
> !          fprintf (dump_file, "Node %s promoted to executed once.\n", cgraph_node_name (node));
> !        return true;
> !      }
> !    return false;
>   }
>   
>   /* Return true when NODE can not return or throw and thus
> --- 2673,2724 ----
>   	case NODE_FREQUENCY_HOT:
>   	case NODE_FREQUENCY_NORMAL:
>   	  if (dump_file && (dump_flags & TDF_DETAILS))
> ! 	    fprintf (dump_file, "  Called by %s that is normal or hot\n",
> ! 		     cgraph_node_name (node));
>   	  maybe_unlikely_executed = false;
>   	  maybe_executed_once = false;
>   	  break;
>   	}
>       }
> !   if ((only_called_at_startup && !only_called_at_exit)
> !       && !node->only_called_at_startup)
> !     {
> !        node->only_called_at_startup = true;
>          if (dump_file)
> !          fprintf (dump_file, "Node %s promoted to only called at startup.\n",
> ! 		  cgraph_node_name (node));
> !        changed = true;
> !     }
> !   if ((only_called_at_exit && !only_called_at_startup)
> !       && !node->only_called_at_exit)
> !     {
> !        node->only_called_at_exit = true;
>          if (dump_file)
> !          fprintf (dump_file, "Node %s promoted to only called at exit.\n",
> ! 		  cgraph_node_name (node));
> !        changed = true;
> !     }
> !   /* These come either from profile or user hints; never update them.  */
> !   if (node->frequency == NODE_FREQUENCY_HOT
> !       || node->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
> !     return changed;
> !   if (maybe_unlikely_executed)
> !     {
> !       node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED;
> !       if (dump_file)
> ! 	fprintf (dump_file, "Node %s promoted to unlikely executed.\n",
> ! 		 cgraph_node_name (node));
> !       changed = true;
> !     }
> !   if (maybe_executed_once && node->frequency != NODE_FREQUENCY_EXECUTED_ONCE)
> !     {
> !       node->frequency = NODE_FREQUENCY_EXECUTED_ONCE;
> !       if (dump_file)
> ! 	fprintf (dump_file, "Node %s promoted to executed once.\n",
> ! 		 cgraph_node_name (node));
> !       changed = true;
> !     }
> !   return changed;
>   }
>   
>   /* Return true when NODE can not return or throw and thus
> Index: cgraph.h
> ===================================================================
> *** cgraph.h	(revision 165478)
> --- cgraph.h	(working copy)
> *************** struct GTY((chain_next ("%h.next"), chai
> *** 301,306 ****
> --- 301,310 ----
>     /* How commonly executed the node is.  Initialized during branch
>        probabilities pass.  */
>     ENUM_BITFIELD (node_frequency) frequency : 2;
> +   /* True when function can only be called at startup (from static ctor).  */
> +   unsigned only_called_at_startup : 1;
> +   /* True when function can only be called at exit (from static dtor).  */
> +   unsigned only_called_at_exit : 1;
>   };
>   
>   typedef struct cgraph_node *cgraph_node_ptr;
> Index: defaults.h
> ===================================================================
> *** defaults.h	(revision 165478)
> --- defaults.h	(working copy)
> *************** see the files COPYING3 and COPYING.RUNTI
> *** 939,944 ****
> --- 939,952 ----
>   #define TARGET_DEC_EVAL_METHOD 2
>   #endif
>   
> + #ifndef STARTUP_TEXT_SECTION_NAME
> + #define STARTUP_TEXT_SECTION_NAME ".text.startup"
> + #endif
> + 
> + #ifndef EXIT_TEXT_SECTION_NAME
> + #define EXIT_TEXT_SECTION_NAME ".text.exit"
> + #endif
> + 
>   #ifndef HOT_TEXT_SECTION_NAME
>   #define HOT_TEXT_SECTION_NAME ".text.hot"
>   #endif
> Index: lto-cgraph.c
> ===================================================================
> *** lto-cgraph.c	(revision 165478)
> --- lto-cgraph.c	(working copy)
> *************** lto_output_node (struct lto_simple_outpu
> *** 518,523 ****
> --- 518,525 ----
>     bp_pack_value (&bp, node->alias, 1);
>     bp_pack_value (&bp, node->finalized_by_frontend, 1);
>     bp_pack_value (&bp, node->frequency, 2);
> +   bp_pack_value (&bp, node->only_called_at_startup, 1);
> +   bp_pack_value (&bp, node->only_called_at_exit, 1);
>     lto_output_bitpack (&bp);
>     lto_output_uleb128_stream (ob->main_stream, node->resolution);
>   
> *************** input_overwrite_node (struct lto_file_de
> *** 978,983 ****
> --- 980,987 ----
>     node->alias = bp_unpack_value (bp, 1);
>     node->finalized_by_frontend = bp_unpack_value (bp, 1);
>     node->frequency = (enum node_frequency)bp_unpack_value (bp, 2);
> +   node->only_called_at_startup = bp_unpack_value (bp, 1);
> +   node->only_called_at_exit = bp_unpack_value (bp, 1);
>     node->resolution = resolution;
>   }
>   
> Index: predict.c
> ===================================================================
> *** predict.c	(revision 165478)
> --- predict.c	(working copy)
> *************** compute_function_frequency (void)
> *** 2187,2192 ****
> --- 2187,2197 ----
>   {
>     basic_block bb;
>     struct cgraph_node *node = cgraph_node (current_function_decl);
> +   if (DECL_STATIC_CONSTRUCTOR (current_function_decl)
> +       || MAIN_NAME_P (DECL_NAME (current_function_decl)))
> +     node->only_called_at_startup = true;
> +   if (DECL_STATIC_DESTRUCTOR (current_function_decl))
> +     node->only_called_at_exit = true;
>   
>     if (!profile_info || !flag_branch_probabilities)
>       {
> *************** choose_function_section (void)
> *** 2233,2238 ****
> --- 2238,2250 ----
>         || DECL_ONE_ONLY (current_function_decl))
>       return;
>   
> +   if (node->only_called_at_startup)
> +     DECL_SECTION_NAME (current_function_decl) =
> +       build_string (strlen (STARTUP_TEXT_SECTION_NAME), STARTUP_TEXT_SECTION_NAME);
> +   else if (node->only_called_at_exit)
> +     DECL_SECTION_NAME (current_function_decl) =
> +       build_string (strlen (EXIT_TEXT_SECTION_NAME), EXIT_TEXT_SECTION_NAME);
> + 
>     /* If we are doing the partitioning optimization, let the optimization
>        choose the correct section into which to put things.  */
>   
> Index: config/i386/darwin.h
> ===================================================================
> *** config/i386/darwin.h	(revision 165478)
> --- config/i386/darwin.h	(working copy)
> *************** extern int darwin_emit_branch_islands;
> *** 198,203 ****
> --- 198,207 ----
>   #define HOT_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions"
>   #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
>                                 "__TEXT,__unlikely,regular,pure_instructions"
> + #define STARTUP_TEXT_SECTION_NAME \
> +                               "__TEXT,__startup,regular,pure_instructions"
> + #define EXIT_TEXT_SECTION_NAME \
> +                               "__TEXT,__exit,regular,pure_instructions"
>   
>   /* Assembler pseudos to introduce constants of various size.  */
>   
> Index: config/ia64/hpux.h
> ===================================================================
> *** config/ia64/hpux.h	(revision 165478)
> --- config/ia64/hpux.h	(working copy)
> *************** do {								\
> *** 226,228 ****
> --- 226,234 ----
>   
>   #undef HOT_TEXT_SECTION_NAME
>   #define HOT_TEXT_SECTION_NAME ".text"
> + 
> + #undef STARTUP_TEXT_SECTION_NAME
> + #define STARTUP_TEXT_SECTION_NAME ".text"
> + 
> + #undef EXIT_TEXT_SECTION_NAME
> + #define EXIT_TEXT_SECTION_NAME ".text"
> Index: config/rs6000/darwin.h
> ===================================================================
> *** config/rs6000/darwin.h	(revision 165478)
> --- config/rs6000/darwin.h	(working copy)
> *************** extern int darwin_emit_branch_islands;
> *** 188,193 ****
> --- 188,195 ----
>   #define HOT_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions"
>   #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
>                                 "__TEXT,__unlikely,regular,pure_instructions"
> + #define STARTUP_TEXT_SECTION_NAME "__TEXT,__startup,regular,pure_instructions"
> + #define EXIT_TEXT_SECTION_NAME "__TEXT,__exit,regular,pure_instructions"
>   
>   /* Define cutoff for using external functions to save floating point.
>      Currently on Darwin, always use inline stores.  */

> 	* scripttempl/elf.sc: Group exit, startup and hot sections, too.
> Index: ld/scripttempl/elf.sc
> ===================================================================
> RCS file: /cvs/src/src/ld/scripttempl/elf.sc,v
> retrieving revision 1.102
> diff -c -p -r1.102 elf.sc
> *** ld/scripttempl/elf.sc	23 Mar 2010 16:02:02 -0000	1.102
> --- ld/scripttempl/elf.sc	14 Oct 2010 20:09:46 -0000
> *************** cat <<EOF
> *** 433,438 ****
> --- 433,441 ----
>     {
>       ${RELOCATING+${TEXT_START_SYMBOLS}}
>       ${RELOCATING+*(.text.unlikely .text.*_unlikely)}
> +     ${RELOCATING+*(.text.exit .text.*_exit)}
> +     ${RELOCATING+*(.text.startup .text.*_startup)}
> +     ${RELOCATING+*(.text.hot .text.*_hot)}
>       *(.text .stub${RELOCATING+ .text.* .gnu.linkonce.t.*})
>       /* .gnu.warning sections are handled specially by elf32.em.  */
>       *(.gnu.warning)


* Re: Group static constructors and destructors in specific subsections
  2010-10-15 14:36 ` Jack Howarth
@ 2010-10-15 16:14   ` Jan Hubicka
  0 siblings, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-10-15 16:14 UTC (permalink / raw)
  To: Jack Howarth; +Cc: Jan Hubicka, gcc-patches, rguenther, iant, iains, tglek

> On Thu, Oct 14, 2010 at 10:38:17PM +0200, Jan Hubicka wrote:
> > Hi,
> > this patch puts static constructors, functions called only from the static
> > constructors and the function main() into .text.startup subsection.  Similarly
> > the static desturctors and functions called only from the static destructors
> > into the .text.exit subsection.
> > 
> > Together with linker script update (attached at binutils side) this cause
> > significant improvements in the startup time (40% of Mozilla's binary startup
> > is paging in hundreds of static constructors evenly spred across the binary).
> > 
> > I am not sure how to update gold - I basically copied existing code in binutils
> > for .text.unlikely group in GNU LD linker script, but I think gold is doing
> > independent decisions somewhere. 
> > 
> > Boostrapped/regtested x86_64 linux.  I am not sure if darwin.h update is correct,
> > I am just ugessing based on unlikely section definition. OK for the target bits?
> 
> Honza,
>   These changes produce no regressions on x86_64-apple-darwin10.
> 
> http://gcc.gnu.org/ml/gcc-testresults/2010-10/msg01167.html

Thanks,
I am glad to know that.  After discussion with Rth I decided to break up the patch
into the cgraph changes and the changes introducing target hooks.  So I will regtest and
commit the first part shortly and hope to have an updated patch soon.

Honza


* Re: Group static constructors and destructors in specific subsections
  2010-10-15 14:05 ` Paolo Carlini
@ 2010-10-15 16:54   ` Jan Hubicka
  2010-10-15 17:15     ` Paolo Carlini
  0 siblings, 1 reply; 70+ messages in thread
From: Jan Hubicka @ 2010-10-15 16:54 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Jan Hubicka, gcc-patches, rguenther, iant, iains, tglek

> On 10/14/2010 10:38 PM, Jan Hubicka wrote:
> > Together with linker script update (attached at binutils side) this cause
> > significant improvements in the startup time (40% of Mozilla's binary startup
> > is paging in hundreds of static constructors evenly spred across the binary).
> >   
> Thanks for the great work. Then I understand the libstdc++ cases are
> automatically dealt with correctly and we don't have to do anything
> special at the library level, right? I mean with reference to the PR
> which you opened some time ago...

Hi,
to get the constructors grouped, there should be no need for any source file
updates.  Libstdc++ has only about 8 static constructors (guessing from the ctor
section), so the gains on libstdc++ startup will be limited, however.

We still have the issue that including iostream causes a static constructor, which
is a bit careless.  Now, however, the constructors will all sit in the same section,
which will be paged in quite effectively at startup.  Chrome is said to have about
1300 such constructors, so it will account for a few extra KB at startup, read in
sequence but backward.  It should not be disastrous, but something more effective
would still be nice.

Honza
> 
> Thanks,
> Paolo.


* Re: Group static constructors and destructors in specific subsections
  2010-10-15 16:54   ` Jan Hubicka
@ 2010-10-15 17:15     ` Paolo Carlini
  2010-10-16 18:39       ` Jan Hubicka
  0 siblings, 1 reply; 70+ messages in thread
From: Paolo Carlini @ 2010-10-15 17:15 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches, rguenther, iant, iains, tglek

On 10/15/2010 06:16 PM, Jan Hubicka wrote:
> We still have isue with including iostream causing static constructor that is bit careless.  Now hever the constructors will all sit in the same section that will be paged in quite effecitvely at startup.  Chrome is told to have about 1300 of such constructors, so it will account from few extra KB at startup read in sequence, but backward.  Should not be disasterous, still something more effective would be nice.
>   
Yes, I meant exactly the <iostream> case. I seem to remember that we
came to the conclusion that it's hard to do it very differently at the
library level, because the letter of the standard is pretty close to
what we have been doing. Nice to see that we are making progress thanks
to generic improvements.

Paolo.


* Re: Group static constructors and destructors in specific subsections
  2010-10-15 17:15     ` Paolo Carlini
@ 2010-10-16 18:39       ` Jan Hubicka
  0 siblings, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-10-16 18:39 UTC (permalink / raw)
  To: Paolo Carlini; +Cc: Jan Hubicka, gcc-patches, rguenther, iant, iains, tglek

> On 10/15/2010 06:16 PM, Jan Hubicka wrote:
> > We still have isue with including iostream causing static constructor that is bit careless.  Now hever the constructors will all sit in the same section that will be paged in quite effecitvely at startup.  Chrome is told to have about 1300 of such constructors, so it will account from few extra KB at startup read in sequence, but backward.  Should not be disasterous, still something more effective would be nice.
> >   
> Yes, I meant exactly the <iostream> case. I seem to remember that we
> came to the conclusion that it's hard to do it very differently at the
> library level, because the letter of the standard is pretty close to
> what we have been doing. Nice to see that we are making progress thanks
> to generic improvements.

We can still do better here.  First, with LTO we now inline all the constructors
together, so we should be able to optimize out the redundant initialization
calls.  I can think of a function attribute saying that the function needs to
be called just once during execution of the program, but it is a bit hard to implement.
Perhaps there are cleaner ways?

Also, if a constructor compiles into a single function call, we can probably implement
an optimization that replaces a constructor function that only calls another function
with a ctors section entry for the callee alone.
That way we would get a lot of duplicated entries in the ctors section instead of
a lot of duplicated functions in the program, which would definitely save some code size.
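
A rough model of that idea, as a sketch only: the tables below are ordinary
arrays standing in for the ctors section, and shared_init, unit_a_ctor,
unit_b_ctor and run_table are hypothetical names.  Nothing here is existing
GCC behavior; it just shows that listing the callee twice does the same work
as listing two one-line wrappers.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*ctor_fn) (void);

int init_calls;   /* how often the initializer was entered */
int init_runs;    /* how often it actually did any work */

/* Stand-in for an initializer that every translation unit requests
   but that only needs to do real work on the first call.  */
void
shared_init (void)
{
  static int done;

  init_calls++;
  if (!done)
    {
      done = 1;
      init_runs++;
    }
}

/* Today: one tiny wrapper function per translation unit.  */
void unit_a_ctor (void) { shared_init (); }
void unit_b_ctor (void) { shared_init (); }

/* Model of the ctors section with the wrappers in it ...  */
const ctor_fn table_with_wrappers[2] = { unit_a_ctor, unit_b_ctor };
/* ... and with the wrappers replaced by duplicated direct entries.  */
const ctor_fn table_direct[2] = { shared_init, shared_init };

/* Call every entry of TABLE in order.  */
void
run_table (const ctor_fn *table, size_t n)
{
  assert (table != NULL);
  for (size_t i = 0; i < n; i++)
    table[i] ();
}
```

Both tables enter shared_init the same number of times; the second one simply
needs no wrapper bodies in the text section.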

Anyway, to move forward, this is the patch I committed; it implements the infrastructure
to discover functions used only at startup/exit.  I will send the patch
adding target hooks for section selection as a followup.
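
For readers who do not want to dig through the diff, the propagation rule can
be sketched on a toy call graph.  This is a simplified stand-in, not the GCC
code: struct toy_node and toy_propagate are hypothetical, the frequency
handling is left out, and callers are kept in a small fixed array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_node
{
  const char *name;
  bool is_main;                 /* this node is main ()  */
  bool local;                   /* all callers are known  */
  bool only_called_at_startup;
  bool only_called_at_exit;
  const struct toy_node *callers[4];
  size_t n_callers;
};

/* One propagation step for NODE; returns true if a flag changed.
   A node is startup-only when every caller other than itself is
   startup-only and none of them is main (); likewise for exit.  */
bool
toy_propagate (struct toy_node *node)
{
  bool at_startup = true, at_exit = true, changed = false;

  assert (node != NULL);
  if (!node->local)
    return false;

  for (size_t i = 0; i < node->n_callers; i++)
    {
      const struct toy_node *caller = node->callers[i];

      if (caller == node)
        continue;               /* self-recursion proves nothing  */
      at_startup = at_startup && caller->only_called_at_startup;
      /* main () sits with the constructors, but what it calls is
         not startup-only.  */
      if (caller->is_main)
        at_startup = false;
      at_exit = at_exit && caller->only_called_at_exit;
    }

  if (at_startup && !at_exit && !node->only_called_at_startup)
    {
      node->only_called_at_startup = true;
      changed = true;
    }
  if (at_exit && !at_startup && !node->only_called_at_exit)
    {
      node->only_called_at_exit = true;
      changed = true;
    }
  return changed;
}
```

Calling this repeatedly over all nodes until nothing changes gives the
fixpoint; a node with no known callers keeps both flags clear, because both
candidates stay true and cancel each other out.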

Index: ChangeLog
===================================================================
*** ChangeLog	(revision 165559)
--- ChangeLog	(working copy)
***************
*** 1,3 ****
--- 1,15 ----
+ 2010-10-16  Jan Hubicka  <jh@suse.cz>
+ 
+ 	* cgraph.c (dump_cgraph_node): Dump same_comdat_group,
+ 	only_called_at_startup and only_called_at_exit.
+ 	(cgraph_propagate_frequency): Compute only_called_at_startup and 
+ 	only_called_at_exit.
+ 	* cgraph.h (struct cgraph_node): New fields only_called_at_startup and
+ 	only_called_at_exit.
+ 	* lto-cgraph.c (lto_output_node, input_overwrite_node): Stream the new
+ 	flags.
+ 	* predict.c (compute_function_frequency): Initialize the new flags.
+ 
  2010-10-16  Eric Botcazou  <ebotcazou@adacore.com>
  
  	* gimplify.c (gimplify_type_sizes) <ARRAY_TYPE>: If the type is to be
Index: cgraph.c
===================================================================
*** cgraph.c	(revision 165559)
--- cgraph.c	(working copy)
*************** dump_cgraph_node (FILE *f, struct cgraph
*** 1816,1821 ****
--- 1816,1825 ----
      fprintf (f, " (inline copy in %s/%i)",
  	     cgraph_node_name (node->global.inlined_to),
  	     node->global.inlined_to->uid);
+   if (node->same_comdat_group)
+     fprintf (f, " (same comdat group as %s/%i)",
+ 	     cgraph_node_name (node->same_comdat_group),
+ 	     node->same_comdat_group->uid);
    if (node->clone_of)
      fprintf (f, " (clone of %s/%i)",
  	     cgraph_node_name (node->clone_of),
*************** dump_cgraph_node (FILE *f, struct cgraph
*** 1876,1881 ****
--- 1880,1889 ----
      fprintf (f, " redefined_extern_inline");
    if (TREE_ASM_WRITTEN (node->decl))
      fprintf (f, " asm_written");
+   if (node->only_called_at_startup)
+     fprintf (f, " only_called_at_startup");
+   if (node->only_called_at_exit)
+     fprintf (f, " only_called_at_exit");
  
    fprintf (f, "\n  called by: ");
    for (edge = node->callers; edge; edge = edge->next_caller)
*************** bool
*** 2627,2646 ****
  cgraph_propagate_frequency (struct cgraph_node *node)
  {
    bool maybe_unlikely_executed = true, maybe_executed_once = true;
    struct cgraph_edge *edge;
    if (!node->local.local)
      return false;
    gcc_assert (node->analyzed);
-   if (node->frequency == NODE_FREQUENCY_HOT)
-     return false;
-   if (node->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
-     return false;
    if (dump_file && (dump_flags & TDF_DETAILS))
      fprintf (dump_file, "Processing frequency %s\n", cgraph_node_name (node));
    for (edge = node->callers;
!        edge && (maybe_unlikely_executed || maybe_executed_once);
         edge = edge->next_caller)
      {
        if (!edge->frequency)
  	continue;
        switch (edge->caller->frequency)
--- 2635,2666 ----
  cgraph_propagate_frequency (struct cgraph_node *node)
  {
    bool maybe_unlikely_executed = true, maybe_executed_once = true;
+   bool only_called_at_startup = true;
+   bool only_called_at_exit = true;
+   bool changed = false;
    struct cgraph_edge *edge;
+ 
    if (!node->local.local)
      return false;
    gcc_assert (node->analyzed);
    if (dump_file && (dump_flags & TDF_DETAILS))
      fprintf (dump_file, "Processing frequency %s\n", cgraph_node_name (node));
+ 
    for (edge = node->callers;
!        edge && (maybe_unlikely_executed || maybe_executed_once
! 	        || only_called_at_startup || only_called_at_exit);
         edge = edge->next_caller)
      {
+       if (edge->caller != node)
+ 	{
+           only_called_at_startup &= edge->caller->only_called_at_startup;
+ 	  /* It makes sense to put main() together with the static constructors.
+ 	     It will be executed for sure, but rest of functions called from
+ 	     main are definitely not at startup only.  */
+ 	  if (MAIN_NAME_P (DECL_NAME (edge->caller->decl)))
+ 	    only_called_at_startup = 0;
+           only_called_at_exit &= edge->caller->only_called_at_exit;
+ 	}
        if (!edge->frequency)
  	continue;
        switch (edge->caller->frequency)
*************** cgraph_propagate_frequency (struct cgrap
*** 2649,2655 ****
  	  break;
  	case NODE_FREQUENCY_EXECUTED_ONCE:
  	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    fprintf (dump_file, "  Called by %s that is executed once\n", cgraph_node_name (node));
  	  maybe_unlikely_executed = false;
  	  if (edge->loop_nest)
  	    {
--- 2669,2676 ----
  	  break;
  	case NODE_FREQUENCY_EXECUTED_ONCE:
  	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    fprintf (dump_file, "  Called by %s that is executed once\n",
! 		     cgraph_node_name (node));
  	  maybe_unlikely_executed = false;
  	  if (edge->loop_nest)
  	    {
*************** cgraph_propagate_frequency (struct cgrap
*** 2661,2687 ****
  	case NODE_FREQUENCY_HOT:
  	case NODE_FREQUENCY_NORMAL:
  	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    fprintf (dump_file, "  Called by %s that is normal or hot\n", cgraph_node_name (node));
  	  maybe_unlikely_executed = false;
  	  maybe_executed_once = false;
  	  break;
  	}
      }
!    if (maybe_unlikely_executed)
!      {
!        node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED;
         if (dump_file)
!          fprintf (dump_file, "Node %s promoted to unlikely executed.\n", cgraph_node_name (node));
!        return true;
!      }
!    if (maybe_executed_once && node->frequency != NODE_FREQUENCY_EXECUTED_ONCE)
!      {
!        node->frequency = NODE_FREQUENCY_EXECUTED_ONCE;
         if (dump_file)
!          fprintf (dump_file, "Node %s promoted to executed once.\n", cgraph_node_name (node));
!        return true;
!      }
!    return false;
  }
  
  /* Return true when NODE can not return or throw and thus
--- 2682,2733 ----
  	case NODE_FREQUENCY_HOT:
  	case NODE_FREQUENCY_NORMAL:
  	  if (dump_file && (dump_flags & TDF_DETAILS))
! 	    fprintf (dump_file, "  Called by %s that is normal or hot\n",
! 		     cgraph_node_name (node));
  	  maybe_unlikely_executed = false;
  	  maybe_executed_once = false;
  	  break;
  	}
      }
!   if ((only_called_at_startup && !only_called_at_exit)
!       && !node->only_called_at_startup)
!     {
!        node->only_called_at_startup = true;
         if (dump_file)
!          fprintf (dump_file, "Node %s promoted to only called at startup.\n",
! 		  cgraph_node_name (node));
!        changed = true;
!     }
!   if ((only_called_at_exit && !only_called_at_startup)
!       && !node->only_called_at_exit)
!     {
!        node->only_called_at_exit = true;
         if (dump_file)
!          fprintf (dump_file, "Node %s promoted to only called at exit.\n",
! 		  cgraph_node_name (node));
!        changed = true;
!     }
!   /* These come either from profile or user hints; never update them.  */
!   if (node->frequency == NODE_FREQUENCY_HOT
!       || node->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
!     return changed;
!   if (maybe_unlikely_executed)
!     {
!       node->frequency = NODE_FREQUENCY_UNLIKELY_EXECUTED;
!       if (dump_file)
! 	fprintf (dump_file, "Node %s promoted to unlikely executed.\n",
! 		 cgraph_node_name (node));
!       changed = true;
!     }
!   if (maybe_executed_once && node->frequency != NODE_FREQUENCY_EXECUTED_ONCE)
!     {
!       node->frequency = NODE_FREQUENCY_EXECUTED_ONCE;
!       if (dump_file)
! 	fprintf (dump_file, "Node %s promoted to executed once.\n",
! 		 cgraph_node_name (node));
!       changed = true;
!     }
!   return changed;
  }
  
  /* Return true when NODE can not return or throw and thus
Index: cgraph.h
===================================================================
*** cgraph.h	(revision 165559)
--- cgraph.h	(working copy)
*************** struct GTY((chain_next ("%h.next"), chai
*** 298,303 ****
--- 298,307 ----
    /* How commonly executed the node is.  Initialized during branch
       probabilities pass.  */
    ENUM_BITFIELD (node_frequency) frequency : 2;
+   /* True when function can only be called at startup (from static ctor).  */
+   unsigned only_called_at_startup : 1;
+   /* True when function can only be called at exit (from static dtor).  */
+   unsigned only_called_at_exit : 1;
  };
  
  typedef struct cgraph_node *cgraph_node_ptr;
Index: lto-cgraph.c
===================================================================
*** lto-cgraph.c	(revision 165559)
--- lto-cgraph.c	(working copy)
*************** lto_output_node (struct lto_simple_outpu
*** 518,523 ****
--- 518,525 ----
    bp_pack_value (&bp, node->alias, 1);
    bp_pack_value (&bp, node->finalized_by_frontend, 1);
    bp_pack_value (&bp, node->frequency, 2);
+   bp_pack_value (&bp, node->only_called_at_startup, 1);
+   bp_pack_value (&bp, node->only_called_at_exit, 1);
    lto_output_bitpack (&bp);
    lto_output_uleb128_stream (ob->main_stream, node->resolution);
  
*************** input_overwrite_node (struct lto_file_de
*** 978,983 ****
--- 980,987 ----
    node->alias = bp_unpack_value (bp, 1);
    node->finalized_by_frontend = bp_unpack_value (bp, 1);
    node->frequency = (enum node_frequency)bp_unpack_value (bp, 2);
+   node->only_called_at_startup = bp_unpack_value (bp, 1);
+   node->only_called_at_exit = bp_unpack_value (bp, 1);
    node->resolution = resolution;
  }
  
Index: predict.c
===================================================================
*** predict.c	(revision 165559)
--- predict.c	(working copy)
*************** compute_function_frequency (void)
*** 2187,2192 ****
--- 2187,2197 ----
  {
    basic_block bb;
    struct cgraph_node *node = cgraph_node (current_function_decl);
+   if (DECL_STATIC_CONSTRUCTOR (current_function_decl)
+       || MAIN_NAME_P (DECL_NAME (current_function_decl)))
+     node->only_called_at_startup = true;
+   if (DECL_STATIC_DESTRUCTOR (current_function_decl))
+     node->only_called_at_exit = true;
  
    if (!profile_info || !flag_branch_probabilities)
      {

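For readers following the cgraph.c hunk above, the caller-driven propagation of the two new flags can be illustrated in isolation. This is only a minimal sketch and not GCC code: `struct toy_node`, `struct toy_edge` and `toy_propagate` are hypothetical stand-ins for `cgraph_node`, `cgraph_edge` and `cgraph_propagate_frequency`, the frequency handling and the `local.local` check are left out, and `is_main` replaces the `MAIN_NAME_P` test.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for cgraph_edge and cgraph_node.  */
struct toy_node;

struct toy_edge
{
  struct toy_node *caller;
  struct toy_edge *next_caller;
};

struct toy_node
{
  bool is_main;                 /* assumed replacement for MAIN_NAME_P */
  bool only_called_at_startup;
  bool only_called_at_exit;
  struct toy_edge *callers;
};

/* Recompute the startup/exit flags of NODE from its callers.
   Return true if a flag was newly set, so a caller can iterate
   until nothing changes.  */
static bool
toy_propagate (struct toy_node *node)
{
  bool startup = true, at_exit = true, changed = false;
  struct toy_edge *e;

  for (e = node->callers; e && (startup || at_exit); e = e->next_caller)
    {
      if (e->caller == node)
        continue;               /* self-recursion tells us nothing */
      startup &= e->caller->only_called_at_startup;
      /* main itself is startup code, but its callees are not.  */
      if (e->caller->is_main)
        startup = false;
      at_exit &= e->caller->only_called_at_exit;
    }

  if (startup && !at_exit && !node->only_called_at_startup)
    {
      node->only_called_at_startup = true;
      changed = true;
    }
  if (at_exit && !startup && !node->only_called_at_exit)
    {
      node->only_called_at_exit = true;
      changed = true;
    }
  return changed;
}
```

In this sketch a node with no callers at all keeps both flags clear, because both local variables stay true and neither promotion condition holds.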

* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:09 Group static constructors and destructors in specific subsections Jan Hubicka
                   ` (5 preceding siblings ...)
  2010-10-15 14:36 ` Jack Howarth
@ 2010-10-21 18:02 ` Cary Coutant
  2010-10-21 19:43   ` Ian Lance Taylor
  2013-07-15 17:46   ` Cary Coutant
  6 siblings, 2 replies; 70+ messages in thread
From: Cary Coutant @ 2010-10-21 18:02 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches, rguenther, iant, iains, tglek

> I am not sure how to update gold - I basically copied existing code in binutils
> for .text.unlikely group in GNU LD linker script, but I think gold is doing
> independent decisions somewhere.

Gold doesn't have a default linker script. The code that handles the
default section name mappings is Layout::output_section_name() in
gold/layout.cc. If you add a couple of entries to the mapping array
Layout::section_name_mapping[], you can easily change gold to group
.text.startup and .text.exit sections together without mapping them
all into .text. You can tweak the ordering of the output sections, if
you want to, in Layout::default_section_order().
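
To make the idea of a prefix mapping table concrete, here is a rough sketch in C. It is not gold's code (gold is written in C++, and the real table is Layout::section_name_mapping[]); `toy_mapping`, `toy_prefix_matches` and `toy_output_section_name` are hypothetical names, and the sketch assumes that more specific prefixes are listed before `.text` so that they win.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping entry: input sections whose name is FROM, or
   starts with FROM followed by '.', go to output section TO.  */
struct name_mapping
{
  const char *from;
  const char *to;
};

/* More specific prefixes must come before the generic ".text".  */
static const struct name_mapping toy_mapping[] = {
  { ".text.unlikely", ".text.unlikely" },
  { ".text.startup", ".text.startup" },
  { ".text.exit", ".text.exit" },
  { ".text", ".text" },
  { ".data", ".data" },
};

/* Return nonzero if NAME equals PREFIX or continues it with ".xyz".  */
static int
toy_prefix_matches (const char *name, const char *prefix)
{
  size_t len = strlen (prefix);
  return strncmp (name, prefix, len) == 0
         && (name[len] == '\0' || name[len] == '.');
}

/* Return the output section name for input section NAME, or NAME
   itself when no table entry applies.  */
static const char *
toy_output_section_name (const char *name)
{
  size_t i;
  for (i = 0; i < sizeof toy_mapping / sizeof toy_mapping[0]; i++)
    if (toy_prefix_matches (name, toy_mapping[i].from))
      return toy_mapping[i].to;
  return name;
}
```

With such a table, `.text.startup.foo` stays in a `.text.startup` group while `.text.foo` still collapses into `.text`.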

-cary


* Re: Group static constructors and destructors in specific subsections
  2010-10-21 18:02 ` Cary Coutant
@ 2010-10-21 19:43   ` Ian Lance Taylor
  2010-10-21 21:00     ` Mike Stump
  2010-10-21 22:14     ` Jan Hubicka
  2013-07-15 17:46   ` Cary Coutant
  1 sibling, 2 replies; 70+ messages in thread
From: Ian Lance Taylor @ 2010-10-21 19:43 UTC (permalink / raw)
  To: Cary Coutant; +Cc: Jan Hubicka, gcc-patches, rguenther, iains, tglek

Cary Coutant <ccoutant@google.com> writes:

>> I am not sure how to update gold - I basically copied existing code in binutils
>> for .text.unlikely group in GNU LD linker script, but I think gold is doing
>> independent decisions somewhere.
>
> Gold doesn't have a default linker script. The code that handles the
> default section name mappings is Layout::output_section_name() in
> gold/layout.cc. If you add a couple of entries to the mapping array
> Layout::section_name_mapping[], you can easily change gold to group
> .text.startup and .text.exit sections together without mapping them
> all into .text. You can tweak the ordering of the output sections, if
> you want to, in Layout::default_section_order().

I wonder if we should adjust gold's layout code so that all sections
named .FOO.BAR are grouped together within the larger section .FOO.

Ian


* Re: Group static constructors and destructors in specific subsections
  2010-10-21 19:43   ` Ian Lance Taylor
@ 2010-10-21 21:00     ` Mike Stump
  2010-10-21 22:32       ` Jan Hubicka
  2010-10-21 22:14     ` Jan Hubicka
  1 sibling, 1 reply; 70+ messages in thread
From: Mike Stump @ 2010-10-21 21:00 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Cary Coutant, Jan Hubicka, gcc-patches, rguenther, iains, tglek

On Oct 21, 2010, at 10:38 AM, Ian Lance Taylor wrote:
> I wonder if we should adjust gold's layout code so that all sections
> named .FOO.BAR are grouped together within the larger section .FOO.

Yes, I think conceptually we want to sort them.  Then, for example, people could generate text.%09d, where %09d is the log2 of the gcov count, and logically we get the hottest stuff first, and so on down the line.  The initialization code can use %05d, where that is the priority field, and have all the same-priority init routines next to each other.  Or maybe we want the compiler to choose the layout, so that when code is executed sequentially we incur fewer page faults and tend to have a more linear faulting pattern, based upon -fwhole-program reasoning.

On the other hand, maybe I'm just a heretic.
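
As an illustration of why zero-padded suffixes matter here: a plain string sort of such names gives the numeric order, which is all a linker would need to do. The sketch below is hypothetical (`toy_section_name`, `toy_compare_names` and `toy_sort_sections` are made-up names, not linker code), and it assumes the compiler encodes a rank where smaller means "place earlier"; how a gcov count is turned into that rank is left open.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Format a hypothetical subsection name with a zero-padded rank.  */
static void
toy_section_name (char *buf, size_t size, unsigned rank)
{
  snprintf (buf, size, ".text.%09u", rank);
}

/* qsort callback comparing two section names as plain strings.  */
static int
toy_compare_names (const void *a, const void *b)
{
  return strcmp (*(const char *const *) a, *(const char *const *) b);
}

/* Sort N section names lexicographically.  Because the rank is
   zero-padded to a fixed width, this is also the numeric order.  */
static void
toy_sort_sections (const char **names, size_t n)
{
  qsort (names, n, sizeof names[0], toy_compare_names);
}
```

Without the padding, `.text.100` would sort before `.text.12`, so the fixed width is what makes the scheme work with a name-only sort.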


* Re: Group static constructors and destructors in specific subsections
  2010-10-21 19:43   ` Ian Lance Taylor
  2010-10-21 21:00     ` Mike Stump
@ 2010-10-21 22:14     ` Jan Hubicka
  1 sibling, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-10-21 22:14 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Cary Coutant, Jan Hubicka, gcc-patches, rguenther, iains, tglek

> Cary Coutant <ccoutant@google.com> writes:
> 
> >> I am not sure how to update gold - I basically copied existing code in binutils
> >> for .text.unlikely group in GNU LD linker script, but I think gold is doing
> >> independent decisions somewhere.
> >
> > Gold doesn't have a default linker script. The code that handles the
> > default section name mappings is Layout::output_section_name() in
> > gold/layout.cc. If you add a couple of entries to the mapping array
> > Layout::section_name_mapping[], you can easily change gold to group
> > .text.startup and .text.exit sections together without mapping them
> > all into .text. You can tweak the ordering of the output sections, if
> > you want to, in Layout::default_section_order().
> 
> I wonder if we should adjust gold's layout code so that all sections
> named .FOO.BAR are grouped together within the larger section .FOO.

That would be handy.  In fact I believed GNU LD behaved this way until I was
proven otherwise.
One can play with the order of these sections, i.e. one probably wants to put
startup, hot, normal, exit, unlikely in this order so the binary has more
chance to load sequentially.
I doubt however that this order is any better than any other order.

Honza
> 
> Ian


* Re: Group static constructors and destructors in specific subsections
  2010-10-21 21:00     ` Mike Stump
@ 2010-10-21 22:32       ` Jan Hubicka
  0 siblings, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-10-21 22:32 UTC (permalink / raw)
  To: Mike Stump
  Cc: Ian Lance Taylor, Cary Coutant, Jan Hubicka, gcc-patches,
	rguenther, iains, tglek

> On Oct 21, 2010, at 10:38 AM, Ian Lance Taylor wrote:
> > I wonder if we should adjust gold's layout code so that all sections
> > named .FOO.BAR are grouped together within the larger section .FOO.
> 
> Yes, I think conceptually we want to sort them.  Then, for example, people could generate text.%09d, where %09d is the log2 of the gcov count, and logically we get the hottest stuff first, and so on down the line.  The initialization code can use %05d, where that is the priority field, and have all the same-priority init routines next to each other.  Or maybe we want the compiler to choose the layout, so that when code is executed sequentially we incur fewer page faults and tend to have a more linear faulting pattern, based upon -fwhole-program reasoning.
> 
> On the other hand, maybe I'm just a heretic.

I have a function reordering pass in the queue.
The way I implemented it for now is to (at LTO time) fill in the node->order fields
and then output the functions the same way -fno-toplevel-reorder does.  This has some
negative code quality impact, as late optimizers pass down some info (like stack
alignment), so it is a good idea to order functions topologically, while this pass
tends to produce reverse ordering (caller first, callees later).  I am hoping
to use gas subsections to enforce the final order with some extra output machinery.
I am not sure if named sections for every individual function are the best idea.

However, for -fprofile-generate I plan to add a counter for every function entry measuring
the time it is first called during the program execution (and averaging it across the run).
Then the program layout problem is just sorting according to the execution times,
and it should have great impact on startup times of firefox and such.
I planned to implement this via my function reordering infrastructure, which would make
sense only with LTO, but having those implicitly sorted text subsections, like
text.gcov_%8x, would indeed save my day and make this pass useful w/o LTO.
I didn't think of that at all ;)
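
To spell out the "just sorting" step, here is a rough sketch of ordering functions by the time of their first call. Nothing in it is from GCC: `struct toy_func`, `toy_compare_first_call` and `toy_layout` are hypothetical, the time unit is arbitrary, and the sketch assumes that a recorded time of 0 means the function was never executed in the training run and should therefore go last.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical profile record for one function.  */
struct toy_func
{
  const char *name;
  unsigned long first_call_time;   /* assumed: 0 means never called */
};

/* Order by first call time; never-called functions go last and
   ties are broken by name so the result is deterministic.  */
static int
toy_compare_first_call (const void *a, const void *b)
{
  const struct toy_func *fa = (const struct toy_func *) a;
  const struct toy_func *fb = (const struct toy_func *) b;
  int never_a = fa->first_call_time == 0;
  int never_b = fb->first_call_time == 0;

  if (never_a != never_b)
    return never_a - never_b;
  if (fa->first_call_time != fb->first_call_time)
    return fa->first_call_time < fb->first_call_time ? -1 : 1;
  return strcmp (fa->name, fb->name);
}

/* Sort FUNCS in place into the proposed layout order.  */
static void
toy_layout (struct toy_func *funcs, size_t n)
{
  qsort (funcs, n, sizeof funcs[0], toy_compare_first_call);
}
```

The resulting order could then be encoded into subsection names of the text.gcov_%8x kind, so that a linker sorting by name reproduces it.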

Is there a chance of having both gold and GNU LD behave this way?

Honza


* Re: Group static constructors and destructors in specific subsections
  2010-10-14 21:12   ` Richard Henderson
  2010-10-14 21:18     ` Jan Hubicka
  2010-10-14 21:48     ` Joseph S. Myers
@ 2010-10-23 16:01     ` Jan Hubicka
  2010-11-10  4:59     ` Group static constructors and destructors in specific subsections, take 2 Jan Hubicka
  3 siblings, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-10-23 16:01 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Joseph S. Myers, Jan Hubicka, gcc-patches, rguenther, iant, iains, tglek

> On 10/14/2010 01:47 PM, Joseph S. Myers wrote:
> > On Thu, 14 Oct 2010, Jan Hubicka wrote:
> > 
> >> 	* doc/tm.texi.in (STARTUP_EXECUTED_TEXT_SECTION_NAME,
> >> 	EXIT_EXECUTED_TEXT_SECTION_NAME): Document
> > 
> > Unless you really need target macros, could you please create hooks 
> > instead?  There's at least one hook giving a section name already 
> > (TARGET_ASM_RECORD_GCC_SWITCHES_SECTION).
> > 
> Should we be returning section NAMES at all?
> 
> We have a nicely modular section object.  It would seem
> we should make more use of that.
> 
Hi,
I finally had a chance to return to this.  This is the patch I ended up with; it needs
some polishing and an update of the darwin target, but it would be great if you checked it
for overall sanity.  The function partitioning is a bit of a mess.

The idea is to have only one hook deciding on the function section based on the profile,
so it implies a bit less cut&pasting on targets that need to change it completely.
I also tried to unify the infrastructure for function splitting and the infrastructure
for function subsections so everything is handled via common channels
(named_text_subsection).
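
The decision the single hook has to make can be shown in isolation. The following is a simplified sketch rather than the hook itself: `enum toy_frequency` and `toy_function_section_name` are hypothetical, the function returns a section name string instead of a section object, and the handling of functions that already carry a DECL_SECTION_NAME is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of enum node_frequency.  */
enum toy_frequency
{
  TOY_UNLIKELY_EXECUTED,
  TOY_EXECUTED_ONCE,
  TOY_NORMAL,
  TOY_HOT
};

/* Return the preferred text subsection name, or NULL when the
   function should go to the default text section.  */
static const char *
toy_function_section_name (enum toy_frequency freq, bool startup,
                           bool at_exit)
{
  /* Startup-only and exit-only code is grouped together unless it
     is unlikely to run at all.  */
  if (startup && freq != TOY_UNLIKELY_EXECUTED)
    return ".text.startup";
  if (at_exit && freq != TOY_UNLIKELY_EXECUTED)
    return ".text.exit";

  /* Otherwise only cold and hot functions get their own group.  */
  switch (freq)
    {
    case TOY_UNLIKELY_EXECUTED:
      return ".text.unlikely";
    case TOY_HOT:
      return ".text.hot";
    default:
      return NULL;
    }
}
```

Note that in this ordering of the tests the unlikely-executed check wins over the startup and exit flags, so the cold part split away from a static constructor would land in `.text.unlikely`.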

I am a bit confused about the need for unlikely_text_section_p, which is used only by
the rs6000 target, but it should preserve its function in the new implementation.
Also default_section_type_flags somewhat confuses me.

Seems sane?

Honza

Index: targhooks.h
===================================================================
*** targhooks.h	(revision 165800)
--- targhooks.h	(working copy)
*************** extern int default_label_align_after_bar
*** 167,169 ****
--- 167,171 ----
  extern int default_loop_align_max_skip (rtx);
  extern int default_label_align_max_skip (rtx);
  extern int default_jump_align_max_skip (rtx);
+ extern section * default_function_section(tree decl, enum node_frequency freq,
+ 					  bool startup, bool exit);
Index: target.def
===================================================================
*** target.def	(revision 165800)
--- target.def	(working copy)
*************** DEFHOOK
*** 273,278 ****
--- 273,291 ----
   void, (const char *name, unsigned int flags, tree decl),
   default_no_named_section)
  
+ /* Return preferred text (sub)section for function DECL.
+    Main purpose of this function is to separate cold, normal and hot
+    functions. STARTUP is true when function is known to be used only 
+    at startup (from static constructors or it is main()).
+    EXIT is true when function is known to be used only at exit
+    (from static destructors).
+    Return NULL if function should go to default text section.  */
+ DEFHOOK
+ (function_section,
+  "",
+  section *, (tree decl, enum node_frequency freq, bool startup, bool exit),
+  default_function_section)
+ 
  /* Return a mask describing how relocations should be treated when
     selecting sections.  Bit 1 should be set if global relocations
     should be placed in a read-write section; bit 0 should be set if
Index: cgraph.h
===================================================================
*** cgraph.h	(revision 165800)
--- cgraph.h	(working copy)
*************** struct GTY(()) cgraph_clone_info
*** 179,198 ****
    bitmap combined_args_to_skip;
  };
  
- enum node_frequency {
-   /* This function most likely won't be executed at all.
-      (set only when profile feedback is available or via function attribute). */
-   NODE_FREQUENCY_UNLIKELY_EXECUTED,
-   /* For functions that are known to be executed once (i.e. constructors, destructors
-      and main function.  */
-   NODE_FREQUENCY_EXECUTED_ONCE,
-   /* The default value.  */
-   NODE_FREQUENCY_NORMAL,
-   /* Optimize this function hard
-      (set only when profile feedback is available or via function attribute). */
-   NODE_FREQUENCY_HOT
- };
- 
  
  /* The cgraph data structure.
     Each function decl has assigned cgraph_node listing callees and callers.  */
--- 179,184 ----
Index: defaults.h
===================================================================
*** defaults.h	(revision 165800)
--- defaults.h	(working copy)
*************** see the files COPYING3 and COPYING.RUNTI
*** 939,952 ****
  #define TARGET_DEC_EVAL_METHOD 2
  #endif
  
- #ifndef HOT_TEXT_SECTION_NAME
- #define HOT_TEXT_SECTION_NAME ".text.hot"
- #endif
- 
- #ifndef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
- #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text.unlikely"
- #endif
- 
  #ifndef HAS_LONG_COND_BRANCH
  #define HAS_LONG_COND_BRANCH 0
  #endif
--- 939,944 ----
Index: predict.c
===================================================================
*** predict.c	(revision 165800)
--- predict.c	(working copy)
*************** static sreal real_zero, real_one, real_a
*** 78,84 ****
  static void combine_predictions_for_insn (rtx, basic_block);
  static void dump_prediction (FILE *, enum br_predictor, int, basic_block, int);
  static void predict_paths_leading_to (basic_block, enum br_predictor, enum prediction);
- static void choose_function_section (void);
  static bool can_predict_insn_p (const_rtx);
  
  /* Information we hold about each branch predictor.
--- 78,83 ----
*************** estimate_bb_frequencies (void)
*** 2177,2184 ****
        free_aux_for_edges ();
      }
    compute_function_frequency ();
-   if (flag_reorder_functions)
-     choose_function_section ();
  }
  
  /* Decide whether function is hot, cold or unlikely executed.  */
--- 2176,2181 ----
*************** compute_function_frequency (void)
*** 2224,2258 ****
      }
  }
  
- /* Choose appropriate section for the function.  */
- static void
- choose_function_section (void)
- {
-   struct cgraph_node *node = cgraph_node (current_function_decl);
-   if (DECL_SECTION_NAME (current_function_decl)
-       || !targetm.have_named_sections
-       /* Theoretically we can split the gnu.linkonce text section too,
- 	 but this requires more work as the frequency needs to match
- 	 for all generated objects so we need to merge the frequency
- 	 of all instances.  For now just never set frequency for these.  */
-       || DECL_ONE_ONLY (current_function_decl))
-     return;
- 
-   /* If we are doing the partitioning optimization, let the optimization
-      choose the correct section into which to put things.  */
- 
-   if (flag_reorder_blocks_and_partition)
-     return;
- 
-   if (node->frequency == NODE_FREQUENCY_HOT)
-     DECL_SECTION_NAME (current_function_decl) =
-       build_string (strlen (HOT_TEXT_SECTION_NAME), HOT_TEXT_SECTION_NAME);
-   if (node->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
-     DECL_SECTION_NAME (current_function_decl) =
-       build_string (strlen (UNLIKELY_EXECUTED_TEXT_SECTION_NAME),
- 		    UNLIKELY_EXECUTED_TEXT_SECTION_NAME);
- }
- 
  static bool
  gate_estimate_probability (void)
  {
--- 2221,2226 ----
Index: coretypes.h
===================================================================
*** coretypes.h	(revision 165800)
--- coretypes.h	(working copy)
*************** enum unwind_info_type
*** 121,126 ****
--- 121,142 ----
    UI_TARGET
  };
  
+ /* Callgraph node profile representation.  */
+ enum node_frequency {
+   /* This function most likely won't be executed at all.
+      (set only when profile feedback is available or via function attribute). */
+   NODE_FREQUENCY_UNLIKELY_EXECUTED,
+   /* For functions that are known to be executed once (i.e. constructors, destructors
+      and main function.  */
+   NODE_FREQUENCY_EXECUTED_ONCE,
+   /* The default value.  */
+   NODE_FREQUENCY_NORMAL,
+   /* Optimize this function hard
+      (set only when profile feedback is available or via function attribute). */
+   NODE_FREQUENCY_HOT
+ };
+ 
+ 
  struct edge_def;
  typedef struct edge_def *edge;
  typedef const struct edge_def *const_edge;
Index: varasm.c
===================================================================
*** varasm.c	(revision 165800)
--- varasm.c	(working copy)
*************** create_block_symbol (const char *label,
*** 367,435 ****
    return symbol;
  }
  
- static void
- initialize_cold_section_name (void)
- {
-   const char *stripped_name;
-   char *name, *buffer;
-   tree dsn;
- 
-   gcc_assert (cfun && current_function_decl);
-   if (crtl->subsections.unlikely_text_section_name)
-     return;
- 
-   dsn = DECL_SECTION_NAME (current_function_decl);
-   if (flag_function_sections && dsn)
-     {
-       name = (char *) alloca (TREE_STRING_LENGTH (dsn) + 1);
-       memcpy (name, TREE_STRING_POINTER (dsn), TREE_STRING_LENGTH (dsn) + 1);
- 
-       stripped_name = targetm.strip_name_encoding (name);
- 
-       buffer = ACONCAT ((stripped_name, "_unlikely", NULL));
-       crtl->subsections.unlikely_text_section_name = ggc_strdup (buffer);
-     }
-   else
-     crtl->subsections.unlikely_text_section_name =  UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
- }
- 
- /* Tell assembler to switch to unlikely-to-be-executed text section.  */
- 
- section *
- unlikely_text_section (void)
- {
-   if (cfun)
-     {
-       if (!crtl->subsections.unlikely_text_section_name)
- 	initialize_cold_section_name ();
- 
-       return get_named_section (NULL, crtl->subsections.unlikely_text_section_name, 0);
-     }
-   else
-     return get_named_section (NULL, UNLIKELY_EXECUTED_TEXT_SECTION_NAME, 0);
- }
- 
- /* When called within a function context, return true if the function
-    has been assigned a cold text section and if SECT is that section.
-    When called outside a function context, return true if SECT is the
-    default cold section.  */
- 
- bool
- unlikely_text_section_p (section *sect)
- {
-   const char *name;
- 
-   if (cfun)
-     name = crtl->subsections.unlikely_text_section_name;
-   else
-     name = UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
- 
-   return (name
- 	  && sect
- 	  && SECTION_STYLE (sect) == SECTION_NAMED
- 	  && strcmp (name, sect->named.name) == 0);
- }
- 
  /* Return a section with a particular name and with whatever SECTION_*
     flags section_type_flags deems appropriate.  The name of the section
     is taken from NAME if nonnull, otherwise it is taken from DECL's
--- 367,372 ----
*************** hot_function_section (tree decl)
*** 538,586 ****
  }
  #endif
  
  /* Return the section for function DECL.
  
     If DECL is NULL_TREE, return the text section.  We can be passed
!    NULL_TREE under some circumstances by dbxout.c at least.  */
  
! section *
! function_section (tree decl)
  {
!   int reloc = 0;
  
!   if (first_function_block_is_cold)
!     reloc = 1;
  
  #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
    if (decl != NULL_TREE
        && DECL_SECTION_NAME (decl) != NULL_TREE)
!     return reloc ? unlikely_text_section ()
! 		 : get_named_section (decl, NULL, 0);
    else
!     return targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
  #else
!   return reloc ? unlikely_text_section () : hot_function_section (decl);
  #endif
  }
  
  section *
  current_function_section (void)
  {
! #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
!   if (current_function_decl != NULL_TREE
!       && DECL_SECTION_NAME (current_function_decl) != NULL_TREE)
!     return in_cold_section_p ? unlikely_text_section ()
! 			     : get_named_section (current_function_decl,
! 						  NULL, 0);
!   else
!     return targetm.asm_out.select_section (current_function_decl,
! 					   in_cold_section_p,
! 					   DECL_ALIGN (current_function_decl));
! #else
!   return (in_cold_section_p
! 	  ? unlikely_text_section ()
! 	  : hot_function_section (current_function_decl));
! #endif
  }
  
  /* Return the read-only data section associated with function DECL.  */
--- 475,676 ----
  }
  #endif
  
+ /* Hash table used to cache names of function subsections.  */
+ struct GTY(()) named_subsection_entry
+ {
+   section *sect;
+   tree section_name;
+   const char *suffix;
+ };
+ 
+ static GTY((param_is (struct named_subsection_entry))) htab_t subsection_hash;
+ 
+ /* Returns nonzero if P1 and P2 are equal.  */
+ 
+ static int
+ eq_subsections (const void *p1, const void *p2)
+ {
+   const struct named_subsection_entry *n1 = (const struct named_subsection_entry *) p1;
+   const struct named_subsection_entry *n2 = (const struct named_subsection_entry *) p2;
+   return n1->section_name == n2->section_name && n1->suffix == n2->suffix;
+ }
+ 
+ /* Compute hash value of subsection.  */
+ static unsigned int
+ hash_subsection (const void *p)
+ {
+   const struct named_subsection_entry *n = (const struct named_subsection_entry *)p;
+   return htab_hash_pointer (n->section_name) ^ htab_hash_pointer (n->suffix);
+ }
+ 
+ /* Return section for TEXT_SECTION_NAME if DECL or DECL_SECTION_NAME (DECL)
+    is NULL.
+ 
+    If it is non-NULL, return subsection of DECL_SECTION_NAME (DECL) created
+    by appending NAMED_SECTION_SUFFIX.  */
+ 
+ static section *
+ named_text_subsection (tree decl,
+ 		       const char *text_section_name,
+ 		       const char *named_section_suffix)
+ {
+   if (decl && DECL_SECTION_NAME (decl))
+     {
+       tree dsn = DECL_SECTION_NAME (decl);
+       struct named_subsection_entry key, *entry, **slot;
+ 
+       key.suffix = named_section_suffix;
+       key.section_name = dsn;
+       if (!subsection_hash)
+ 	subsection_hash =
+ 	  htab_create_ggc (10, hash_subsection, eq_subsections, NULL);
+       slot =
+ 	(struct named_subsection_entry **) htab_find_slot (subsection_hash,
+ 							   &key, INSERT);
+       if (!*slot)
+ 	{
+ 	  const char *stripped_name;
+ 	  char *name, *buffer;
+ 	  name = (char *) alloca (TREE_STRING_LENGTH (dsn) + 1);
+ 	  memcpy (name, TREE_STRING_POINTER (dsn),
+ 		  TREE_STRING_LENGTH (dsn) + 1);
+ 
+ 	  stripped_name = targetm.strip_name_encoding (name);
+ 
+ 	  buffer = ACONCAT ((stripped_name, named_section_suffix, NULL));
+ 	  entry = ggc_alloc_named_subsection_entry ();
+ 	  entry->section_name = dsn;
+ 	  entry->suffix = named_section_suffix;
+ 	  entry->sect = get_named_section (NULL, buffer, 0);
+ 	  *slot = entry;
+ 	}
+       return (*slot)->sect;
+     }
+   return get_named_section (NULL, text_section_name, 0);
+ }
+ 
+ /* Choose named function section based on its frequency.  */
+ 
+ section *
+ default_function_section (tree decl, enum node_frequency freq,
+ 			  bool startup, bool exit)
+ {
+   /* Startup code should go to startup subsection unless it is
+      unlikely executed (this happens especially with function splitting
+      where we can split away unnecessary parts of static constructors).  */
+   if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
+     return named_text_subsection (decl, ".text.startup", "_startup");
+ 
+   /* Similarly for exit.  */
+   if (exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
+     return named_text_subsection (decl, ".text.exit", "_exit");
+ 
+   /* Group cold functions together, similarly for hot code.  */
+   switch (freq)
+     {
+       case NODE_FREQUENCY_UNLIKELY_EXECUTED:
+ 	return named_text_subsection (decl, ".text.unlikely", "_unlikely");
+       case NODE_FREQUENCY_HOT:
+ 	return named_text_subsection (decl, ".text.hot", "_hot");
+       default:
+ 	return NULL;
+     }
+ }
+ 
  /* Return the section for function DECL.
  
     If DECL is NULL_TREE, return the text section.  We can be passed
!    NULL_TREE under some circumstances by dbxout.c at least. 
  
!    If FORCE_COLD is true, return cold function section ignoring
!    the frequency info of cgraph_node.  */
! 
! static section *
! function_section_1 (tree decl, bool force_cold)
  {
!   section *section = NULL;
!   enum node_frequency freq = NODE_FREQUENCY_NORMAL;
!   bool startup = false, exit = false;
  
!   if (decl)
!     {
!       struct cgraph_node *node = cgraph_node (decl);
! 
!       freq = node->frequency;
!       startup = node->only_called_at_startup;
!       exit = node->only_called_at_exit;
!     }
!   if (force_cold)
!     freq = NODE_FREQUENCY_UNLIKELY_EXECUTED;
  
  #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
    if (decl != NULL_TREE
        && DECL_SECTION_NAME (decl) != NULL_TREE)
!     {
!       if (targetm.asm_out.function_section)
! 	section = targetm.asm_out.function_section (decl, freq,
! 						    startup, exit);
!       if (section)
! 	return section;
!       return get_named_section (decl, NULL, 0);
!     }
    else
!     return targetm.asm_out.select_section
! 	    (decl, freq == NODE_FREQUENCY_UNLIKELY_EXECUTED,
! 	     DECL_ALIGN (decl));
  #else
!   if (targetm.asm_out.function_section)
!     section = targetm.asm_out.function_section (decl, freq, startup, exit);
!   if (section)
!     return section;
!   return hot_function_section (decl);
  #endif
  }
  
+ /* Return the section for function DECL.
+ 
+    If DECL is NULL_TREE, return the text section.  We can be passed
+    NULL_TREE under some circumstances by dbxout.c at least.  */
+ 
+ section *
+ function_section (tree decl)
+ {
+   /* Handle cases where function splitting code decides
+      to put function entry point into unlikely executed section
+      despite the fact that the function itself is not cold
+      (i.e. it is called rarely but contains a hot loop that is
+      better to live in hot subsection for the code locality).  */
+   return function_section_1 (decl,
+ 			     first_function_block_is_cold
+ 		             && flag_function_sections);
+ }
+ 
+ /* Return the section for the current function, take IN_COLD_SECTION_P
+    into account.  */
+ 
  section *
  current_function_section (void)
  {
!   return function_section_1 (current_function_decl, in_cold_section_p);
! }
! 
! /* Tell assembler to switch to unlikely-to-be-executed text section.  */
! 
! section *
! unlikely_text_section (void)
! {
!   return function_section_1 (current_function_decl, true);
! }
! 
! /* When called within a function context, return true if the function
!    has been assigned a cold text section and if SECT is that section.
!    When called outside a function context, return true if SECT is the
!    default cold section.  */
! 
! bool
! unlikely_text_section_p (section *sect)
! {
!   return sect == function_section_1 (current_function_decl, true);
  }
  
  /* Return the read-only data section associated with function DECL.  */
*************** assemble_start_function (tree decl, cons
*** 1482,1497 ****
    else if (DECL_SECTION_NAME (decl))
      {
        /* Calls to function_section rely on first_function_block_is_cold
! 	 being accurate.  The first block may be cold even if we aren't
! 	 doing partitioning, if the entire function was decided by
! 	 choose_function_section (predict.c) to be cold.  */
! 
!       initialize_cold_section_name ();
! 
!       if (crtl->subsections.unlikely_text_section_name
! 	  && strcmp (TREE_STRING_POINTER (DECL_SECTION_NAME (decl)),
! 		     crtl->subsections.unlikely_text_section_name) == 0)
! 	first_function_block_is_cold = true;
      }
  
    in_cold_section_p = first_function_block_is_cold;
--- 1572,1581 ----
    else if (DECL_SECTION_NAME (decl))
      {
        /* Calls to function_section rely on first_function_block_is_cold
! 	 being accurate.  */
!       first_function_block_is_cold
! 	 = (cgraph_node (current_function_decl)->frequency
! 	    == NODE_FREQUENCY_UNLIKELY_EXECUTED);
      }
  
    in_cold_section_p = first_function_block_is_cold;
*************** default_section_type_flags (tree decl, c
*** 5783,5797 ****
      flags = SECTION_CODE;
    else if (decl && decl_readonly_section (decl, reloc))
      flags = 0;
-   else if (current_function_decl
- 	   && cfun
- 	   && crtl->subsections.unlikely_text_section_name
- 	   && strcmp (name, crtl->subsections.unlikely_text_section_name) == 0)
-     flags = SECTION_CODE;
-   else if (!decl
- 	   && (!current_function_decl || !cfun)
- 	   && strcmp (name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
-     flags = SECTION_CODE;
    else
      flags = SECTION_WRITE;
  
--- 5867,5872 ----
*************** switch_to_section (section *new_section)
*** 6708,6719 ****
    switch (SECTION_STYLE (new_section))
      {
      case SECTION_NAMED:
-       if (cfun
- 	  && !crtl->subsections.unlikely_text_section_name
- 	  && strcmp (new_section->named.name,
- 		     UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
- 	crtl->subsections.unlikely_text_section_name = UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
- 
        targetm.asm_out.named_section (new_section->named.name,
  				     new_section->named.common.flags,
  				     new_section->named.decl);
--- 6783,6788 ----

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-10-14 21:12   ` Richard Henderson
                       ` (2 preceding siblings ...)
  2010-10-23 16:01     ` Jan Hubicka
@ 2010-11-10  4:59     ` Jan Hubicka
  2010-11-10 14:16       ` Jack Howarth
  2010-11-10 18:59       ` Richard Henderson
  3 siblings, 2 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-11-10  4:59 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Joseph S. Myers, Jan Hubicka, gcc-patches, rguenther, iant, iains, tglek

Hi,
this is the updated patch that deals with sections instead of section names, as
Richard requested, and also uses a hook instead of target macros.
The hot/cold function partitioning made things somewhat harder, since we have
special code for cold sections all around.

The basic idea is to have named_text_subsection produce subsections for
functions in a generic way, and to have the function_section hook decide the
proper subsection based on the profile.
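
To make the selection rule above concrete, here is a minimal standalone sketch
of the decision that the function_section hook and named_text_subsection make
between them. It is a model for illustration only, not the GCC code: the
model_* names, the string-based interface and the caller-supplied buffer are
all assumptions, and the dotted names ".text.unlikely" and ".text.hot" are
assumed to match the macros being removed from defaults.h.

```c
/* Standalone model of the section choice; NOT GCC code.  All model_*
   names are hypothetical.  Only the enum values and the section-name
   strings mirror the patch.  */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum node_frequency {
  NODE_FREQUENCY_UNLIKELY_EXECUTED,
  NODE_FREQUENCY_EXECUTED_ONCE,
  NODE_FREQUENCY_NORMAL,
  NODE_FREQUENCY_HOT
};

/* Model of named_text_subsection: if the function already carries an
   explicit section name, append SUFFIX to it (written into BUF);
   otherwise fall back to DEFAULT_NAME.  */
const char *
model_named_text_subsection (const char *decl_section_name,
                             const char *default_name,
                             const char *suffix,
                             char *buf, size_t buflen)
{
  if (decl_section_name == NULL)
    return default_name;
  snprintf (buf, buflen, "%s%s", decl_section_name, suffix);
  return buf;
}

/* Model of the default hook: startup and exit code get their own
   subsections unless the code is unlikely executed; otherwise cold and
   hot code are grouped.  NULL means "use the default text section".  */
const char *
model_function_section (const char *decl_section_name,
                        enum node_frequency freq, int startup, int at_exit,
                        char *buf, size_t buflen)
{
  if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
    return model_named_text_subsection (decl_section_name, ".text.startup",
                                        "_startup", buf, buflen);
  if (at_exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
    return model_named_text_subsection (decl_section_name, ".text.exit",
                                        "_exit", buf, buflen);
  switch (freq)
    {
    case NODE_FREQUENCY_UNLIKELY_EXECUTED:
      return model_named_text_subsection (decl_section_name, ".text.unlikely",
                                          "_unlikely", buf, buflen);
    case NODE_FREQUENCY_HOT:
      return model_named_text_subsection (decl_section_name, ".text.hot",
                                          "_hot", buf, buflen);
    default:
      return NULL;
    }
}

/* A few spot checks of the rule.  */
void
model_example (void)
{
  char buf[64];
  /* A constructor-only function lands in the startup subsection.  */
  assert (strcmp (model_function_section (NULL, NODE_FREQUENCY_EXECUTED_ONCE,
                                          1, 0, buf, sizeof buf),
                  ".text.startup") == 0);
  /* A cold part split out of a constructor is still treated as cold.  */
  assert (strcmp (model_function_section (NULL,
                                          NODE_FREQUENCY_UNLIKELY_EXECUTED,
                                          1, 0, buf, sizeof buf),
                  ".text.unlikely") == 0);
}
```

Note the ordering: the startup and exit checks come before the frequency
switch, but both are skipped for unlikely-executed code, so cold code split
out of a static constructor does not end up in the startup subsection.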

I've bootstrapped/regtested the patch on x86_64-linux; I also checked that the
HP-UX and dwarf output seems sane at a glance.

Does the patch seem reasonable?

Honza
	* targhooks.h (default_function_section): Declare.
	* target.def (function_section): New hook.
	* defaults.h (HOT_TEXT_SECTION_NAME,
	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
	* predict.c (choose_function_section): Remove.
	(estimate_bb_frequencies): Do not use choose_function_section.
	* coretypes.h (enum node_frequency): Move here from cgraph.h
	* cgraph.h (enum node_frequency): Remove.
	* varasm.c (initialize_cold_section_name, unlikely_text_section,
	unlikely_text_section_p): Remove.
	(named_subsection_entry): New structure.
	(subsection_hash): New static var.
	(eq_subsections): New function.
	(hash_subsection): New function.
	(named_text_subsection): New function.
	(default_function_section): New function.
	(function_section_1): Break out from ...; handle profile info.
	(function_section): ... here.
	(unlikely_text_section): Remove.
	(unlikely_text_section_p): Use function_section_1.
	(assemble_start_function): Do not initialize cold section.
	(default_section_type_flags): Do not special case cold subsection.
	(switch_to_section): Likewise.
	* output.h (named_text_subsection): Define.
	* config/i386/winnt.c: Do not special case cold section.
	* config/darwin-protos.h (darwin_function_section): Declare.
	* config/microblaze/microblaze.h (HOT_TEXT_SECTION_NAME,
	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
	* config/ia64/hpux.h (HOT_TEXT_SECTION_NAME,
	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
	(TARGET_ASM_FUNCTION_SECTION): Define to NULL.
	* config/darwin.c (machopic_select_section): Use
	darwin_function_section.
	(darwin_function_section): New function.
	* config/darwin.h (HOT_TEXT_SECTION_NAME,
	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
	(TARGET_ASM_FUNCTION_SECTION): Define.
Index: doc/tm.texi
===================================================================
--- doc/tm.texi	(revision 166490)
+++ doc/tm.texi	(working copy)
@@ -7335,6 +7335,8 @@ macro is not defined, nothing is output 
 @end defmac
 
 @deftypefn {Target Hook} void TARGET_ASM_NAMED_SECTION (const char *@var{name}, unsigned int @var{flags}, tree @var{decl})
+
+@deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_SECTION (tree @var{decl}, enum node_frequency @var{freq}, bool @var{startup}, bool @var{exit})
 Output assembly directives to switch to section @var{name}.  The section
 should have attributes as specified by @var{flags}, which is a bit mask
 of the @code{SECTION_*} flags defined in @file{output.h}.  If @var{decl}
Index: targhooks.h
===================================================================
--- targhooks.h	(revision 166490)
+++ targhooks.h	(working copy)
@@ -169,7 +169,8 @@ extern int default_label_align_after_bar
 extern int default_loop_align_max_skip (rtx);
 extern int default_label_align_max_skip (rtx);
 extern int default_jump_align_max_skip (rtx);
-
+extern section * default_function_section(tree decl, enum node_frequency freq,
+					  bool startup, bool exit);
 extern enum machine_mode default_get_reg_raw_mode(int);
 
 extern const struct default_options empty_optimization_table[];
Index: target.def
===================================================================
--- target.def	(revision 166490)
+++ target.def	(working copy)
@@ -281,6 +281,19 @@ DEFHOOK
  void, (const char *name, unsigned int flags, tree decl),
  default_no_named_section)
 
+/* Return preferred text (sub)section for function DECL.
+   Main purpose of this function is to separate cold, normal and hot
+   functions. STARTUP is true when function is known to be used only 
+   at startup (from static constructors or it is main()).
+   EXIT is true when function is known to be used only at exit
+   (from static destructors).
+   Return NULL if function should go to default text section.  */
+DEFHOOK
+(function_section,
+ "",
+ section *, (tree decl, enum node_frequency freq, bool startup, bool exit),
+ default_function_section)
+
 /* Return a mask describing how relocations should be treated when
    selecting sections.  Bit 1 should be set if global relocations
    should be placed in a read-write section; bit 0 should be set if
Index: cgraph.h
===================================================================
--- cgraph.h	(revision 166490)
+++ cgraph.h	(working copy)
@@ -179,20 +179,6 @@ struct GTY(()) cgraph_clone_info
   bitmap combined_args_to_skip;
 };
 
-enum node_frequency {
-  /* This function most likely won't be executed at all.
-     (set only when profile feedback is available or via function attribute). */
-  NODE_FREQUENCY_UNLIKELY_EXECUTED,
-  /* For functions that are known to be executed once (i.e. constructors, destructors
-     and main function.  */
-  NODE_FREQUENCY_EXECUTED_ONCE,
-  /* The default value.  */
-  NODE_FREQUENCY_NORMAL,
-  /* Optimize this function hard
-     (set only when profile feedback is available or via function attribute). */
-  NODE_FREQUENCY_HOT
-};
-
 
 /* The cgraph data structure.
    Each function decl has assigned cgraph_node listing callees and callers.  */
Index: defaults.h
===================================================================
--- defaults.h	(revision 166490)
+++ defaults.h	(working copy)
@@ -897,14 +897,6 @@ see the files COPYING3 and COPYING.RUNTI
 #define TARGET_DEC_EVAL_METHOD 2
 #endif
 
-#ifndef HOT_TEXT_SECTION_NAME
-#define HOT_TEXT_SECTION_NAME ".text.hot"
-#endif
-
-#ifndef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
-#define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text.unlikely"
-#endif
-
 #ifndef HAS_LONG_COND_BRANCH
 #define HAS_LONG_COND_BRANCH 0
 #endif
Index: predict.c
===================================================================
--- predict.c	(revision 166490)
+++ predict.c	(working copy)
@@ -78,7 +78,6 @@ static sreal real_zero, real_one, real_a
 static void combine_predictions_for_insn (rtx, basic_block);
 static void dump_prediction (FILE *, enum br_predictor, int, basic_block, int);
 static void predict_paths_leading_to (basic_block, enum br_predictor, enum prediction);
-static void choose_function_section (void);
 static bool can_predict_insn_p (const_rtx);
 
 /* Information we hold about each branch predictor.
@@ -2185,8 +2184,6 @@ estimate_bb_frequencies (void)
       free_aux_for_edges ();
     }
   compute_function_frequency ();
-  if (flag_reorder_functions)
-    choose_function_section ();
 }
 
 /* Decide whether function is hot, cold or unlikely executed.  */
@@ -2232,35 +2229,6 @@ compute_function_frequency (void)
     }
 }
 
-/* Choose appropriate section for the function.  */
-static void
-choose_function_section (void)
-{
-  struct cgraph_node *node = cgraph_node (current_function_decl);
-  if (DECL_SECTION_NAME (current_function_decl)
-      || !targetm.have_named_sections
-      /* Theoretically we can split the gnu.linkonce text section too,
-	 but this requires more work as the frequency needs to match
-	 for all generated objects so we need to merge the frequency
-	 of all instances.  For now just never set frequency for these.  */
-      || DECL_ONE_ONLY (current_function_decl))
-    return;
-
-  /* If we are doing the partitioning optimization, let the optimization
-     choose the correct section into which to put things.  */
-
-  if (flag_reorder_blocks_and_partition)
-    return;
-
-  if (node->frequency == NODE_FREQUENCY_HOT)
-    DECL_SECTION_NAME (current_function_decl) =
-      build_string (strlen (HOT_TEXT_SECTION_NAME), HOT_TEXT_SECTION_NAME);
-  if (node->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
-    DECL_SECTION_NAME (current_function_decl) =
-      build_string (strlen (UNLIKELY_EXECUTED_TEXT_SECTION_NAME),
-		    UNLIKELY_EXECUTED_TEXT_SECTION_NAME);
-}
-
 static bool
 gate_estimate_probability (void)
 {
Index: function.h
===================================================================
--- function.h	(revision 166490)
+++ function.h	(working copy)
@@ -226,11 +226,6 @@ struct GTY(()) function_subsections {
   const char *cold_section_label;
   const char *hot_section_end_label;
   const char *cold_section_end_label;
-
-  /* String to be used for name of cold text sections, via
-     targetm.asm_out.named_section.  */
-
-  const char *unlikely_text_section_name;
 };
 
 /* Describe an empty area of space in the stack frame.  These can be chained
Index: coretypes.h
===================================================================
--- coretypes.h	(revision 166490)
+++ coretypes.h	(working copy)
@@ -121,6 +121,22 @@ enum unwind_info_type
   UI_TARGET
 };
 
+/* Callgraph node profile representation.  */
+enum node_frequency {
+  /* This function most likely won't be executed at all.
+     (set only when profile feedback is available or via function attribute). */
+  NODE_FREQUENCY_UNLIKELY_EXECUTED,
+  /* For functions that are known to be executed once (i.e. constructors, destructors
+     and main function.  */
+  NODE_FREQUENCY_EXECUTED_ONCE,
+  /* The default value.  */
+  NODE_FREQUENCY_NORMAL,
+  /* Optimize this function hard
+     (set only when profile feedback is available or via function attribute). */
+  NODE_FREQUENCY_HOT
+};
+
+
 struct edge_def;
 typedef struct edge_def *edge;
 typedef const struct edge_def *const_edge;
Index: varasm.c
===================================================================
--- varasm.c	(revision 166490)
+++ varasm.c	(working copy)
@@ -368,69 +368,6 @@ create_block_symbol (const char *label, 
   return symbol;
 }
 
-static void
-initialize_cold_section_name (void)
-{
-  const char *stripped_name;
-  char *name, *buffer;
-  tree dsn;
-
-  gcc_assert (cfun && current_function_decl);
-  if (crtl->subsections.unlikely_text_section_name)
-    return;
-
-  dsn = DECL_SECTION_NAME (current_function_decl);
-  if (flag_function_sections && dsn)
-    {
-      name = (char *) alloca (TREE_STRING_LENGTH (dsn) + 1);
-      memcpy (name, TREE_STRING_POINTER (dsn), TREE_STRING_LENGTH (dsn) + 1);
-
-      stripped_name = targetm.strip_name_encoding (name);
-
-      buffer = ACONCAT ((stripped_name, "_unlikely", NULL));
-      crtl->subsections.unlikely_text_section_name = ggc_strdup (buffer);
-    }
-  else
-    crtl->subsections.unlikely_text_section_name =  UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
-}
-
-/* Tell assembler to switch to unlikely-to-be-executed text section.  */
-
-section *
-unlikely_text_section (void)
-{
-  if (cfun)
-    {
-      if (!crtl->subsections.unlikely_text_section_name)
-	initialize_cold_section_name ();
-
-      return get_named_section (NULL, crtl->subsections.unlikely_text_section_name, 0);
-    }
-  else
-    return get_named_section (NULL, UNLIKELY_EXECUTED_TEXT_SECTION_NAME, 0);
-}
-
-/* When called within a function context, return true if the function
-   has been assigned a cold text section and if SECT is that section.
-   When called outside a function context, return true if SECT is the
-   default cold section.  */
-
-bool
-unlikely_text_section_p (section *sect)
-{
-  const char *name;
-
-  if (cfun)
-    name = crtl->subsections.unlikely_text_section_name;
-  else
-    name = UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
-
-  return (name
-	  && sect
-	  && SECTION_STYLE (sect) == SECTION_NAMED
-	  && strcmp (name, sect->named.name) == 0);
-}
-
 /* Return a section with a particular name and with whatever SECTION_*
    flags section_type_flags deems appropriate.  The name of the section
    is taken from NAME if nonnull, otherwise it is taken from DECL's
@@ -539,49 +476,202 @@ hot_function_section (tree decl)
 }
 #endif
 
+/* Hash table used to cache names of function subsections.  */
+struct GTY(()) named_subsection_entry
+{
+  section *sect;
+  tree section_name;
+  const char *suffix;
+};
+
+static GTY((param_is (struct named_subsection_entry))) htab_t subsection_hash;
+
+/* Returns nonzero if P1 and P2 are equal.  */
+
+static int
+eq_subsections (const void *p1, const void *p2)
+{
+  const struct named_subsection_entry *n1 = (const struct named_subsection_entry *) p1;
+  const struct named_subsection_entry *n2 = (const struct named_subsection_entry *) p2;
+  return n1->section_name == n2->section_name && n1->suffix == n2->suffix;
+}
+
+/* Compute hash value of subsection.  */
+static unsigned int
+hash_subsection (const void *p)
+{
+  const struct named_subsection_entry *n = (const struct named_subsection_entry *)p;
+  return htab_hash_pointer (n->section_name) ^ htab_hash_pointer (n->suffix);
+}
+
+/* Return section for TEXT_SECTION_NAME if DECL or DECL_SECTION_NAME (DECL)
+   is NULL.
+
+   If it is non-NULL, return subsection of DECL_SECTION_NAME (DECL) created
+   by appending NAMED_SECTION_SUFFIX.  */
+
+section *
+named_text_subsection (tree decl,
+		       const char *text_section_name,
+		       const char *named_section_suffix)
+{
+  if (decl && DECL_SECTION_NAME (decl))
+    {
+      tree dsn = DECL_SECTION_NAME (decl);
+      struct named_subsection_entry key, *entry, **slot;
+
+      key.suffix = named_section_suffix;
+      key.section_name = dsn;
+      if (!subsection_hash)
+	subsection_hash =
+	  htab_create_ggc (10, hash_subsection, eq_subsections, NULL);
+      slot =
+	(struct named_subsection_entry **) htab_find_slot (subsection_hash,
+							   &key, INSERT);
+      if (!*slot)
+	{
+	  const char *stripped_name;
+	  char *name, *buffer;
+	  name = (char *) alloca (TREE_STRING_LENGTH (dsn) + 1);
+	  memcpy (name, TREE_STRING_POINTER (dsn),
+		  TREE_STRING_LENGTH (dsn) + 1);
+
+	  stripped_name = targetm.strip_name_encoding (name);
+
+	  buffer = ACONCAT ((stripped_name, named_section_suffix, NULL));
+	  entry = ggc_alloc_named_subsection_entry ();
+	  entry->section_name = dsn;
+	  entry->suffix = named_section_suffix;
+	  entry->sect = get_named_section (decl, buffer, 0);
+	  *slot = entry;
+	}
+      return (*slot)->sect;
+    }
+  return get_named_section (decl, text_section_name, 0);
+}
+
+/* Choose named function section based on its frequency.  */
+
+section *
+default_function_section (tree decl, enum node_frequency freq,
+			  bool startup, bool exit)
+{
+  /* Startup code should go to startup subsection unless it is
+     unlikely executed (this happens especially with function splitting
+     where we can split away unnecessary parts of static constructors).  */
+  if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
+    return named_text_subsection (decl, ".text.startup", "_startup");
+
+  /* Similarly for exit.  */
+  if (exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
+    return named_text_subsection (decl, ".text.exit", "_exit");
+
+  /* Group cold functions together, similarly for hot code.  */
+  switch (freq)
+    {
+      case NODE_FREQUENCY_UNLIKELY_EXECUTED:
+	return named_text_subsection (decl, ".text.unlikely", "_unlikely");
+      case NODE_FREQUENCY_HOT:
+	return named_text_subsection (decl, ".text.hot", "_hot");
+      default:
+	return NULL;
+    }
+}
+
 /* Return the section for function DECL.
 
    If DECL is NULL_TREE, return the text section.  We can be passed
-   NULL_TREE under some circumstances by dbxout.c at least.  */
+   NULL_TREE under some circumstances by dbxout.c at least. 
 
-section *
-function_section (tree decl)
+   If FORCE_COLD is true, return cold function section ignoring
+   the frequency info of cgraph_node.  */
+
+static section *
+function_section_1 (tree decl, bool force_cold)
 {
-  int reloc = 0;
+  section *section = NULL;
+  enum node_frequency freq = NODE_FREQUENCY_NORMAL;
+  bool startup = false, exit = false;
 
-  if (first_function_block_is_cold)
-    reloc = 1;
+  if (decl)
+    {
+      struct cgraph_node *node = cgraph_node (decl);
+
+      freq = node->frequency;
+      startup = node->only_called_at_startup;
+      exit = node->only_called_at_exit;
+    }
+  if (force_cold)
+    freq = NODE_FREQUENCY_UNLIKELY_EXECUTED;
 
 #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
   if (decl != NULL_TREE
       && DECL_SECTION_NAME (decl) != NULL_TREE)
-    return reloc ? unlikely_text_section ()
-		 : get_named_section (decl, NULL, 0);
+    {
+      if (targetm.asm_out.function_section)
+	section = targetm.asm_out.function_section (decl, freq,
+						    startup, exit);
+      if (section)
+	return section;
+      return get_named_section (decl, NULL, 0);
+    }
   else
-    return targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
+    return targetm.asm_out.select_section
+	    (decl, freq == NODE_FREQUENCY_UNLIKELY_EXECUTED,
+	     DECL_ALIGN (decl));
 #else
-  return reloc ? unlikely_text_section () : hot_function_section (decl);
+  if (targetm.asm_out.function_section)
+    section = targetm.asm_out.function_section (decl, freq, startup, exit);
+  if (section)
+    return section;
+  return hot_function_section (decl);
 #endif
 }
 
+/* Return the section for function DECL.
+
+   If DECL is NULL_TREE, return the text section.  We can be passed
+   NULL_TREE under some circumstances by dbxout.c at least.  */
+
+section *
+function_section (tree decl)
+{
+  /* Handle cases where function splitting code decides
+     to put function entry point into unlikely executed section
+     despite the fact that the function itself is not cold
+     (i.e. it is called rarely but contains a hot loop that is
+     better to live in hot subsection for the code locality).  */
+  return function_section_1 (decl,
+			     first_function_block_is_cold
+		             && flag_function_sections);
+}
+
+/* Return the section for the current function, take IN_COLD_SECTION_P
+   into account.  */
+
 section *
 current_function_section (void)
 {
-#ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
-  if (current_function_decl != NULL_TREE
-      && DECL_SECTION_NAME (current_function_decl) != NULL_TREE)
-    return in_cold_section_p ? unlikely_text_section ()
-			     : get_named_section (current_function_decl,
-						  NULL, 0);
-  else
-    return targetm.asm_out.select_section (current_function_decl,
-					   in_cold_section_p,
-					   DECL_ALIGN (current_function_decl));
-#else
-  return (in_cold_section_p
-	  ? unlikely_text_section ()
-	  : hot_function_section (current_function_decl));
-#endif
+  return function_section_1 (current_function_decl, in_cold_section_p);
+}
+
+/* Tell assembler to switch to unlikely-to-be-executed text section.  */
+
+section *
+unlikely_text_section (void)
+{
+  return function_section_1 (current_function_decl, true);
+}
+
+/* When called within a function context, return true if the function
+   has been assigned a cold text section and if SECT is that section.
+   When called outside a function context, return true if SECT is the
+   default cold section.  */
+
+bool
+unlikely_text_section_p (section *sect)
+{
+  return sect == function_section_1 (current_function_decl, true);
 }
 
 /* Return the read-only data section associated with function DECL.  */
@@ -1483,16 +1573,10 @@ assemble_start_function (tree decl, cons
   else if (DECL_SECTION_NAME (decl))
     {
       /* Calls to function_section rely on first_function_block_is_cold
-	 being accurate.  The first block may be cold even if we aren't
-	 doing partitioning, if the entire function was decided by
-	 choose_function_section (predict.c) to be cold.  */
-
-      initialize_cold_section_name ();
-
-      if (crtl->subsections.unlikely_text_section_name
-	  && strcmp (TREE_STRING_POINTER (DECL_SECTION_NAME (decl)),
-		     crtl->subsections.unlikely_text_section_name) == 0)
-	first_function_block_is_cold = true;
+	 being accurate.  */
+      first_function_block_is_cold
+	 = (cgraph_node (current_function_decl)->frequency
+	    == NODE_FREQUENCY_UNLIKELY_EXECUTED);
     }
 
   in_cold_section_p = first_function_block_is_cold;
@@ -5841,15 +5925,6 @@ default_section_type_flags (tree decl, c
     flags = SECTION_CODE;
   else if (decl && decl_readonly_section (decl, reloc))
     flags = 0;
-  else if (current_function_decl
-	   && cfun
-	   && crtl->subsections.unlikely_text_section_name
-	   && strcmp (name, crtl->subsections.unlikely_text_section_name) == 0)
-    flags = SECTION_CODE;
-  else if (!decl
-	   && (!current_function_decl || !cfun)
-	   && strcmp (name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
-    flags = SECTION_CODE;
   else
     flags = SECTION_WRITE;
 
@@ -6776,12 +6851,6 @@ switch_to_section (section *new_section)
   switch (SECTION_STYLE (new_section))
     {
     case SECTION_NAMED:
-      if (cfun
-	  && !crtl->subsections.unlikely_text_section_name
-	  && strcmp (new_section->named.name,
-		     UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
-	crtl->subsections.unlikely_text_section_name = UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
-
       targetm.asm_out.named_section (new_section->named.name,
 				     new_section->named.common.flags,
 				     new_section->named.decl);
Index: output.h
===================================================================
--- output.h	(revision 166490)
+++ output.h	(working copy)
@@ -264,6 +264,10 @@ extern bool default_assemble_integer (rt
    be outputable. */
 extern bool assemble_integer (rtx, unsigned, unsigned, int);
 
+/* Return section for TEXT_SECTION_NAME if DECL or DECL_SECTION_NAME (DECL)
+   is NULL.  */
+extern section *named_text_subsection (tree, const char *, const char *);
+
 /* An interface to assemble_integer for the common case in which a value is
    fully aligned and must be printed.  VALUE is the value of the integer
    object and SIZE is the number of bytes it contains.  */
Index: config/i386/winnt.c
===================================================================
--- config/i386/winnt.c	(revision 166490)
+++ config/i386/winnt.c	(working copy)
@@ -414,15 +414,6 @@ i386_pe_section_type_flags (tree decl, c
     flags = SECTION_CODE;
   else if (decl && decl_readonly_section (decl, reloc))
     flags = 0;
-  else if (current_function_decl
-	   && cfun
-	   && crtl->subsections.unlikely_text_section_name
-	   && strcmp (name, crtl->subsections.unlikely_text_section_name) == 0)
-    flags = SECTION_CODE;
-  else if (!decl
-	   && (!current_function_decl || !cfun)
-	   && strcmp (name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
-    flags = SECTION_CODE;
   else
     {
       flags = SECTION_WRITE;
Index: config/darwin-protos.h
===================================================================
--- config/darwin-protos.h	(revision 166490)
+++ config/darwin-protos.h	(working copy)
@@ -57,6 +57,7 @@ extern section *machopic_select_section 
 extern section *machopic_select_rtx_section (enum machine_mode, rtx,
 					     unsigned HOST_WIDE_INT);
 
+extern section *darwin_function_section (tree, enum node_frequency, bool, bool); 
 extern void darwin_unique_section (tree decl, int reloc);
 extern void darwin_asm_named_section (const char *, unsigned int, tree);
 extern void darwin_non_lazy_pcrel (FILE *, rtx);
Index: config/microblaze/microblaze.h
===================================================================
--- config/microblaze/microblaze.h	(revision 166490)
+++ config/microblaze/microblaze.h	(working copy)
@@ -880,10 +880,6 @@ do {									 \
 #define SBSS_SECTION_ASM_OP     "\t.sbss"	/* Small RW uninitialized data */
 #define SBSS2_SECTION_ASM_OP    "\t.sbss2"	/* Small RO uninitialized data */
 
-#define HOT_TEXT_SECTION_NAME   ".text.hot"
-#define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
-                                ".text.unlikely"
-
 /* We do this to save a few 10s of code space that would be taken up
    by the call_FUNC () wrappers, used by the generic CRT_CALL_STATIC_FUNCTION
    definition in crtstuff.c.  */
Index: config/ia64/hpux.h
===================================================================
--- config/ia64/hpux.h	(revision 166490)
+++ config/ia64/hpux.h	(working copy)
@@ -221,8 +221,4 @@ do {								\
    it is fixed, prevent code from being put into .text.unlikely or
    .text.hot.  */
 
-#undef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
-#define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text"
-
-#undef HOT_TEXT_SECTION_NAME
-#define HOT_TEXT_SECTION_NAME ".text"
+#define TARGET_ASM_FUNCTION_SECTION NULL
Index: config/darwin.c
===================================================================
--- config/darwin.c	(revision 166490)
+++ config/darwin.c	(working copy)
@@ -1233,12 +1233,22 @@ machopic_select_section (tree decl,
 	       && DECL_WEAK (decl)
 	       && !lookup_attribute ("weak_import",
 				     DECL_ATTRIBUTES (decl)));
-  section *base_section;
+  section *base_section = NULL;
 
   switch (categorize_decl_for_section (decl, reloc))
     {
     case SECCAT_TEXT:
-      base_section = darwin_text_section (reloc, weak);
+      {
+	struct cgraph_node *node;
+	if (decl && TREE_CODE (decl) == FUNCTION_DECL
+	    && (node = cgraph_get_node (decl)) != NULL)
+	  base_section = darwin_function_section (decl,
+						  node->frequency,
+						  node->only_called_at_startup,
+						  node->only_called_at_exit);
+	if (!base_section)
+          base_section = darwin_text_section (reloc, weak);
+      }
       break;
 
     case SECCAT_RODATA:
@@ -2362,4 +2372,38 @@ darwin_enter_string_into_cfstring_table 
     }
 }
 
+/* Choose named function section based on its frequency.  */
+
+section *
+darwin_function_section (tree decl, enum node_frequency freq,
+			  bool startup, bool exit)
+{
+  /* Startup code should go to startup subsection unless it is
+     unlikely executed (this happens especially with function splitting
+     where we can split away unnecessary parts of static constructors).  */
+  if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
+    return named_text_subsection
+	     (decl, "__TEXT,__startup,regular,pure_instructions", "_startup");
+
+  /* Similarly for exit.  */
+  if (exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
+    return named_text_subsection (decl,
+				  "__TEXT,__exit,regular,pure_instructions",
+				  "_exit");
+
+  /* Group cold functions together, similarly for hot code.  */
+  switch (freq)
+    {
+      case NODE_FREQUENCY_UNLIKELY_EXECUTED:
+	return named_text_subsection
+		 (decl,
+	          "__TEXT,__unlikely,regular,pure_instructions", "_unlikely");
+      case NODE_FREQUENCY_HOT:
+	return named_text_subsection
+		 (decl, "__TEXT,__hot,regular,pure_instructions", "_hot");
+      default:
+	return NULL;
+    }
+}
+
 #include "gt-darwin.h"
Index: config/darwin.h
===================================================================
--- config/darwin.h	(revision 166490)
+++ config/darwin.h	(working copy)
@@ -710,11 +710,6 @@ int darwin_label_is_anonymous_local_objc
 /* The generic version, archs should over-ride where required.  */
 #define MACHOPIC_NL_SYMBOL_PTR_SECTION ".non_lazy_symbol_pointer"
 
-/* These are used by -fbranch-probabilities */
-#define HOT_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions"
-#define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
-                              "__TEXT,__unlikely,regular,pure_instructions"
-
 /* Declare the section variables.  */
 #ifndef USED_FOR_TARGET
 enum darwin_section_enum {
@@ -729,6 +724,8 @@ extern GTY(()) section * darwin_sections
 #undef	TARGET_ASM_SELECT_SECTION
 #define TARGET_ASM_SELECT_SECTION machopic_select_section
 #define USE_SELECT_SECTION_FOR_FUNCTIONS
+#undef	TARGET_ASM_FUNCTION_SECTION
+#define TARGET_ASM_FUNCTION_SECTION darwin_function_section
 
 #undef	TARGET_ASM_SELECT_RTX_SECTION
 #define TARGET_ASM_SELECT_RTX_SECTION machopic_select_rtx_section

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10  4:59     ` Group static constructors and destructors in specific subsections, take 2 Jan Hubicka
@ 2010-11-10 14:16       ` Jack Howarth
  2010-11-10 14:56         ` Ian Lance Taylor
  2010-11-10 18:59       ` Richard Henderson
  1 sibling, 1 reply; 70+ messages in thread
From: Jack Howarth @ 2010-11-10 14:16 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Richard Henderson, Joseph S. Myers, gcc-patches, rguenther, iant,
	iains, tglek

On Wed, Nov 10, 2010 at 05:24:52AM +0100, Jan Hubicka wrote:
> Hi,
> this is the updated patch that deals with sections instead of section names, as
> Richard requested, and also uses a hook instead of target macros.
> The hot/cold function partitioning made things somewhat harder, since we have
> special code for cold sections all around.
> 
> The basic idea is to have named_text_subsection produce subsections for
> functions in a generic way, and to have the function_section hook decide the
> proper subsection based on the profile.
> 
> I've bootstrapped/regtested the patch on x86_64-linux; I also checked that the
> HP-UX and dwarf output seems sane at a glance.

Honza,
   This patch is causing a bootstrap failure on x86_64-apple-darwin10 with the error...

makeinfo: Removing output file `doc/gccint.info' due to errors; use --force to preserve.
gcc -c   -g -fkeep-inline-functions -DIN_GCC   -W -Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wc++-compat -fno-common  -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild -I../../gcc-4.6-20101110/gcc -I../../gcc-4.6-20101110/gcc/build -I../../gcc-4.6-20101110/gcc/../include -I../../gcc-4.6-20101110/gcc/../libcpp/include -I/sw/include -I/sw/include  -I../../gcc-4.6-20101110/gcc/../libdecnumber -I../../gcc-4.6-20101110/gcc/../libdecnumber/dpd -I../libdecnumber -I/sw/include  -I/sw/include -DCLOOG_PPL_BACKEND  -I/sw/include \
		-o build/gencodes.o ../../gcc-4.6-20101110/gcc/gencodes.c
make[3]: *** [doc/gccint.info] Error 1
make[3]: *** Waiting for unfinished jobs....

           Jack
> 
> Does the patch seem reasonable?
> 
> Honza
> 	* targhooks.h (default_function_section): Declare.
> 	* target.def (function_section): New hook.
> 	* defaults.h (HOT_TEXT_SECTION_NAME,
> 	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
> 	* predict.c (choose_function_section): Remove.
> 	(estimate_bb_frequencies): Do not use choose_function_section.
> 	* coretypes.h (enum node_frequency): Move here from cgraph.h
> 	* cgraph.h (enum node_frequency): Remove.
> 	* varasm.c (initialize_cold_section_name, unlikely_text_section,
> 	unlikely_text_section_p): Remove.
> 	(named_subsection_entry): New structure.
> 	(subsection_hash): New static var.
> 	(eq_subsections): New function.
> 	(hash_subsection): New function.
> 	(named_text_subsection): New function.
> 	(default_function_section): New function.
> 	(function_section_1): Break out from ...; handle profile info.
> 	(function_section): ... here.
> 	(unlikely_text_section): Remove.
> 	(unlikely_text_section_p): Use function_section_1.
> 	(assemble_start_function): Do not initialize cold section.
> 	(default_section_type_flags): Do not special case cold subsection.
> 	(switch_to_section): Likewise.
> 	* output.h (named_text_subsection): Define.
> 	* config/i386/winnt.c: Do not special case cold section.
> 	* config/darwin-protos.h (darwin_function_section): Declare.
> 	* config/microblaze/microblaze.h (HOT_TEXT_SECTION_NAME,
> 	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
> 	* config/ia64/hpux.h (HOT_TEXT_SECTION_NAME,
> 	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
> 	(TARGET_ASM_FUNCTION_SECTION): Define to NULL.
> 	* config/darwin.c (machopic_select_section): Use
> 	darwin_function_section.
> 	(darwin_function_section): New function.
> 	* config/darwin.h (HOT_TEXT_SECTION_NAME,
> 	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
> 	(TARGET_ASM_FUNCTION_SECTION): Define.
> Index: doc/tm.texi
> ===================================================================
> --- doc/tm.texi	(revision 166490)
> +++ doc/tm.texi	(working copy)
> @@ -7335,6 +7335,8 @@ macro is not defined, nothing is output 
>  @end defmac
>  
>  @deftypefn {Target Hook} void TARGET_ASM_NAMED_SECTION (const char *@var{name}, unsigned int @var{flags}, tree @var{decl})
> +
> +@deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_SECTION (tree @var{decl}, enum node_frequency @var{freq}, bool @var{startup}, bool @var{exit})
>  Output assembly directives to switch to section @var{name}.  The section
>  should have attributes as specified by @var{flags}, which is a bit mask
>  of the @code{SECTION_*} flags defined in @file{output.h}.  If @var{decl}
> Index: targhooks.h
> ===================================================================
> --- targhooks.h	(revision 166490)
> +++ targhooks.h	(working copy)
> @@ -169,7 +169,8 @@ extern int default_label_align_after_bar
>  extern int default_loop_align_max_skip (rtx);
>  extern int default_label_align_max_skip (rtx);
>  extern int default_jump_align_max_skip (rtx);
> -
> +extern section * default_function_section(tree decl, enum node_frequency freq,
> +					  bool startup, bool exit);
>  extern enum machine_mode default_get_reg_raw_mode(int);
>  
>  extern const struct default_options empty_optimization_table[];
> Index: target.def
> ===================================================================
> --- target.def	(revision 166490)
> +++ target.def	(working copy)
> @@ -281,6 +281,19 @@ DEFHOOK
>   void, (const char *name, unsigned int flags, tree decl),
>   default_no_named_section)
>  
> +/* Return preferred text (sub)section for function DECL.
> +   Main purpose of this function is to separate cold, normal and hot
> +   functions. STARTUP is true when function is known to be used only 
> +   at startup (from static constructors or it is main()).
> +   EXIT is true when function is known to be used only at exit
> +   (from static destructors).
> +   Return NULL if function should go to default text section.  */
> +DEFHOOK
> +(function_section,
> + "",
> + section *, (tree decl, enum node_frequency freq, bool startup, bool exit),
> + default_function_section)
> +
>  /* Return a mask describing how relocations should be treated when
>     selecting sections.  Bit 1 should be set if global relocations
>     should be placed in a read-write section; bit 0 should be set if
> Index: cgraph.h
> ===================================================================
> --- cgraph.h	(revision 166490)
> +++ cgraph.h	(working copy)
> @@ -179,20 +179,6 @@ struct GTY(()) cgraph_clone_info
>    bitmap combined_args_to_skip;
>  };
>  
> -enum node_frequency {
> -  /* This function most likely won't be executed at all.
> -     (set only when profile feedback is available or via function attribute). */
> -  NODE_FREQUENCY_UNLIKELY_EXECUTED,
> -  /* For functions that are known to be executed once (i.e. constructors, destructors
> -     and main function.  */
> -  NODE_FREQUENCY_EXECUTED_ONCE,
> -  /* The default value.  */
> -  NODE_FREQUENCY_NORMAL,
> -  /* Optimize this function hard
> -     (set only when profile feedback is available or via function attribute). */
> -  NODE_FREQUENCY_HOT
> -};
> -
>  
>  /* The cgraph data structure.
>     Each function decl has assigned cgraph_node listing callees and callers.  */
> Index: defaults.h
> ===================================================================
> --- defaults.h	(revision 166490)
> +++ defaults.h	(working copy)
> @@ -897,14 +897,6 @@ see the files COPYING3 and COPYING.RUNTI
>  #define TARGET_DEC_EVAL_METHOD 2
>  #endif
>  
> -#ifndef HOT_TEXT_SECTION_NAME
> -#define HOT_TEXT_SECTION_NAME ".text.hot"
> -#endif
> -
> -#ifndef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
> -#define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text.unlikely"
> -#endif
> -
>  #ifndef HAS_LONG_COND_BRANCH
>  #define HAS_LONG_COND_BRANCH 0
>  #endif
> Index: predict.c
> ===================================================================
> --- predict.c	(revision 166490)
> +++ predict.c	(working copy)
> @@ -78,7 +78,6 @@ static sreal real_zero, real_one, real_a
>  static void combine_predictions_for_insn (rtx, basic_block);
>  static void dump_prediction (FILE *, enum br_predictor, int, basic_block, int);
>  static void predict_paths_leading_to (basic_block, enum br_predictor, enum prediction);
> -static void choose_function_section (void);
>  static bool can_predict_insn_p (const_rtx);
>  
>  /* Information we hold about each branch predictor.
> @@ -2185,8 +2184,6 @@ estimate_bb_frequencies (void)
>        free_aux_for_edges ();
>      }
>    compute_function_frequency ();
> -  if (flag_reorder_functions)
> -    choose_function_section ();
>  }
>  
>  /* Decide whether function is hot, cold or unlikely executed.  */
> @@ -2232,35 +2229,6 @@ compute_function_frequency (void)
>      }
>  }
>  
> -/* Choose appropriate section for the function.  */
> -static void
> -choose_function_section (void)
> -{
> -  struct cgraph_node *node = cgraph_node (current_function_decl);
> -  if (DECL_SECTION_NAME (current_function_decl)
> -      || !targetm.have_named_sections
> -      /* Theoretically we can split the gnu.linkonce text section too,
> -	 but this requires more work as the frequency needs to match
> -	 for all generated objects so we need to merge the frequency
> -	 of all instances.  For now just never set frequency for these.  */
> -      || DECL_ONE_ONLY (current_function_decl))
> -    return;
> -
> -  /* If we are doing the partitioning optimization, let the optimization
> -     choose the correct section into which to put things.  */
> -
> -  if (flag_reorder_blocks_and_partition)
> -    return;
> -
> -  if (node->frequency == NODE_FREQUENCY_HOT)
> -    DECL_SECTION_NAME (current_function_decl) =
> -      build_string (strlen (HOT_TEXT_SECTION_NAME), HOT_TEXT_SECTION_NAME);
> -  if (node->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
> -    DECL_SECTION_NAME (current_function_decl) =
> -      build_string (strlen (UNLIKELY_EXECUTED_TEXT_SECTION_NAME),
> -		    UNLIKELY_EXECUTED_TEXT_SECTION_NAME);
> -}
> -
>  static bool
>  gate_estimate_probability (void)
>  {
> Index: function.h
> ===================================================================
> --- function.h	(revision 166490)
> +++ function.h	(working copy)
> @@ -226,11 +226,6 @@ struct GTY(()) function_subsections {
>    const char *cold_section_label;
>    const char *hot_section_end_label;
>    const char *cold_section_end_label;
> -
> -  /* String to be used for name of cold text sections, via
> -     targetm.asm_out.named_section.  */
> -
> -  const char *unlikely_text_section_name;
>  };
>  
>  /* Describe an empty area of space in the stack frame.  These can be chained
> Index: coretypes.h
> ===================================================================
> --- coretypes.h	(revision 166490)
> +++ coretypes.h	(working copy)
> @@ -121,6 +121,22 @@ enum unwind_info_type
>    UI_TARGET
>  };
>  
> +/* Callgraph node profile representation.  */
> +enum node_frequency {
> +  /* This function most likely won't be executed at all.
> +     (set only when profile feedback is available or via function attribute). */
> +  NODE_FREQUENCY_UNLIKELY_EXECUTED,
> +  /* For functions that are known to be executed once (i.e. constructors, destructors
> +     and main function.  */
> +  NODE_FREQUENCY_EXECUTED_ONCE,
> +  /* The default value.  */
> +  NODE_FREQUENCY_NORMAL,
> +  /* Optimize this function hard
> +     (set only when profile feedback is available or via function attribute). */
> +  NODE_FREQUENCY_HOT
> +};
> +
> +
>  struct edge_def;
>  typedef struct edge_def *edge;
>  typedef const struct edge_def *const_edge;
> Index: varasm.c
> ===================================================================
> --- varasm.c	(revision 166490)
> +++ varasm.c	(working copy)
> @@ -368,69 +368,6 @@ create_block_symbol (const char *label, 
>    return symbol;
>  }
>  
> -static void
> -initialize_cold_section_name (void)
> -{
> -  const char *stripped_name;
> -  char *name, *buffer;
> -  tree dsn;
> -
> -  gcc_assert (cfun && current_function_decl);
> -  if (crtl->subsections.unlikely_text_section_name)
> -    return;
> -
> -  dsn = DECL_SECTION_NAME (current_function_decl);
> -  if (flag_function_sections && dsn)
> -    {
> -      name = (char *) alloca (TREE_STRING_LENGTH (dsn) + 1);
> -      memcpy (name, TREE_STRING_POINTER (dsn), TREE_STRING_LENGTH (dsn) + 1);
> -
> -      stripped_name = targetm.strip_name_encoding (name);
> -
> -      buffer = ACONCAT ((stripped_name, "_unlikely", NULL));
> -      crtl->subsections.unlikely_text_section_name = ggc_strdup (buffer);
> -    }
> -  else
> -    crtl->subsections.unlikely_text_section_name =  UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
> -}
> -
> -/* Tell assembler to switch to unlikely-to-be-executed text section.  */
> -
> -section *
> -unlikely_text_section (void)
> -{
> -  if (cfun)
> -    {
> -      if (!crtl->subsections.unlikely_text_section_name)
> -	initialize_cold_section_name ();
> -
> -      return get_named_section (NULL, crtl->subsections.unlikely_text_section_name, 0);
> -    }
> -  else
> -    return get_named_section (NULL, UNLIKELY_EXECUTED_TEXT_SECTION_NAME, 0);
> -}
> -
> -/* When called within a function context, return true if the function
> -   has been assigned a cold text section and if SECT is that section.
> -   When called outside a function context, return true if SECT is the
> -   default cold section.  */
> -
> -bool
> -unlikely_text_section_p (section *sect)
> -{
> -  const char *name;
> -
> -  if (cfun)
> -    name = crtl->subsections.unlikely_text_section_name;
> -  else
> -    name = UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
> -
> -  return (name
> -	  && sect
> -	  && SECTION_STYLE (sect) == SECTION_NAMED
> -	  && strcmp (name, sect->named.name) == 0);
> -}
> -
>  /* Return a section with a particular name and with whatever SECTION_*
>     flags section_type_flags deems appropriate.  The name of the section
>     is taken from NAME if nonnull, otherwise it is taken from DECL's
> @@ -539,49 +476,202 @@ hot_function_section (tree decl)
>  }
>  #endif
>  
> +/* Hash table used to cache names of function subsections.  */
> +struct GTY(()) named_subsection_entry
> +{
> +  section *sect;
> +  tree section_name;
> +  const char *suffix;
> +};
> +
> +static GTY((param_is (struct named_subsection_entry))) htab_t subsection_hash;
> +
> +/* Returns nonzero if P1 and P2 are equal.  */
> +
> +static int
> +eq_subsections (const void *p1, const void *p2)
> +{
> +  const struct named_subsection_entry *n1 = (const struct named_subsection_entry *) p1;
> +  const struct named_subsection_entry *n2 = (const struct named_subsection_entry *) p2;
> +  return n1->section_name == n2->section_name && n1->suffix == n2->suffix;
> +}
> +
> +/* Compute hash value of subsection.  */
> +static unsigned int
> +hash_subsection (const void *p)
> +{
> +  const struct named_subsection_entry *n = (const struct named_subsection_entry *)p;
> +  return htab_hash_pointer (n->section_name) ^ htab_hash_pointer (n->suffix);
> +}
> +
> +/* Return section for TEXT_SECTION_NAME if DECL or DECL_SECTION_NAME (DECL)
> +   is NULL.
> +
> +   If it is non-NULL, return subsection of DECL_SECTION_NAME (DECL) created
> +   by appending NAMED_SECTION_SUFFIX.  */
> +
> +section *
> +named_text_subsection (tree decl,
> +		       const char *text_section_name,
> +		       const char *named_section_suffix)
> +{
> +  if (decl && DECL_SECTION_NAME (decl))
> +    {
> +      tree dsn = DECL_SECTION_NAME (decl);
> +      struct named_subsection_entry key, *entry, **slot;
> +
> +      key.suffix = named_section_suffix;
> +      key.section_name = dsn;
> +      if (!subsection_hash)
> +	subsection_hash =
> +	  htab_create_ggc (10, hash_subsection, eq_subsections, NULL);
> +      slot =
> +	(struct named_subsection_entry **) htab_find_slot (subsection_hash,
> +							   &key, INSERT);
> +      if (!*slot)
> +	{
> +	  const char *stripped_name;
> +	  char *name, *buffer;
> +	  name = (char *) alloca (TREE_STRING_LENGTH (dsn) + 1);
> +	  memcpy (name, TREE_STRING_POINTER (dsn),
> +		  TREE_STRING_LENGTH (dsn) + 1);
> +
> +	  stripped_name = targetm.strip_name_encoding (name);
> +
> +	  buffer = ACONCAT ((stripped_name, named_section_suffix, NULL));
> +	  entry = ggc_alloc_named_subsection_entry ();
> +	  entry->section_name = dsn;
> +	  entry->suffix = named_section_suffix;
> +	  entry->sect = get_named_section (decl, buffer, 0);
> +	  *slot = entry;
> +	}
> +      return (*slot)->sect;
> +    }
> +  return get_named_section (decl, text_section_name, 0);
> +}
> +
> +/* Choose named function section based on its frequency.  */
> +
> +section *
> +default_function_section (tree decl, enum node_frequency freq,
> +			  bool startup, bool exit)
> +{
> +  /* Startup code should go to startup subsection unless it is
> +     unlikely executed (this happens especially with function splitting
> +     where we can split away unnecessary parts of static constructors).  */
> +  if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
> +    return named_text_subsection (decl, ".text.startup", "_startup");
> +
> +  /* Similarly for exit.  */
> +  if (exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
> +    return named_text_subsection (decl, ".text.exit", "_exit");
> +
> +  /* Group cold functions together, similarly for hot code.  */
> +  switch (freq)
> +    {
> +      case NODE_FREQUENCY_UNLIKELY_EXECUTED:
> +	return named_text_subsection (decl, ".text.unlikely", "_unlikely");
> +      case NODE_FREQUENCY_HOT:
> +	return named_text_subsection (decl, ".text.hot", "_hot");
> +      default:
> +	return NULL;
> +    }
> +}
> +
>  /* Return the section for function DECL.
>  
>     If DECL is NULL_TREE, return the text section.  We can be passed
> -   NULL_TREE under some circumstances by dbxout.c at least.  */
> +   NULL_TREE under some circumstances by dbxout.c at least. 
>  
> -section *
> -function_section (tree decl)
> +   If FORCE_COLD is true, return cold function section ignoring
> +   the frequency info of cgraph_node.  */
> +
> +static section *
> +function_section_1 (tree decl, bool force_cold)
>  {
> -  int reloc = 0;
> +  section *section = NULL;
> +  enum node_frequency freq = NODE_FREQUENCY_NORMAL;
> +  bool startup = false, exit = false;
>  
> -  if (first_function_block_is_cold)
> -    reloc = 1;
> +  if (decl)
> +    {
> +      struct cgraph_node *node = cgraph_node (decl);
> +
> +      freq = node->frequency;
> +      startup = node->only_called_at_startup;
> +      exit = node->only_called_at_exit;
> +    }
> +  if (force_cold)
> +    freq = NODE_FREQUENCY_UNLIKELY_EXECUTED;
>  
>  #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
>    if (decl != NULL_TREE
>        && DECL_SECTION_NAME (decl) != NULL_TREE)
> -    return reloc ? unlikely_text_section ()
> -		 : get_named_section (decl, NULL, 0);
> +    {
> +      if (targetm.asm_out.function_section)
> +	section = targetm.asm_out.function_section (decl, freq,
> +						    startup, exit);
> +      if (section)
> +	return section;
> +      return get_named_section (decl, NULL, 0);
> +    }
>    else
> -    return targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
> +    return targetm.asm_out.select_section
> +	    (decl, freq == NODE_FREQUENCY_UNLIKELY_EXECUTED,
> +	     DECL_ALIGN (decl));
>  #else
> -  return reloc ? unlikely_text_section () : hot_function_section (decl);
> +  if (targetm.asm_out.function_section)
> +    section = targetm.asm_out.function_section (decl, freq, startup, exit);
> +  if (section)
> +    return section;
> +  return hot_function_section (decl);
>  #endif
>  }
>  
> +/* Return the section for function DECL.
> +
> +   If DECL is NULL_TREE, return the text section.  We can be passed
> +   NULL_TREE under some circumstances by dbxout.c at least.  */
> +
> +section *
> +function_section (tree decl)
> +{
> +  /* Handle cases where function splitting code decides
> +     to put function entry point into unlikely executed section
> +     despite the fact that the function itself is not cold
> +     (i.e. it is called rarely but contains a hot loop that is
> +     better to live in hot subsection for the code locality).  */
> +  return function_section_1 (decl,
> +			     first_function_block_is_cold
> +		             && flag_function_sections);
> +}
> +
> +/* Return the section for the current function, take IN_COLD_SECTION_P
> +   into account.  */
> +
>  section *
>  current_function_section (void)
>  {
> -#ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
> -  if (current_function_decl != NULL_TREE
> -      && DECL_SECTION_NAME (current_function_decl) != NULL_TREE)
> -    return in_cold_section_p ? unlikely_text_section ()
> -			     : get_named_section (current_function_decl,
> -						  NULL, 0);
> -  else
> -    return targetm.asm_out.select_section (current_function_decl,
> -					   in_cold_section_p,
> -					   DECL_ALIGN (current_function_decl));
> -#else
> -  return (in_cold_section_p
> -	  ? unlikely_text_section ()
> -	  : hot_function_section (current_function_decl));
> -#endif
> +  return function_section_1 (current_function_decl, in_cold_section_p);
> +}
> +
> +/* Tell assembler to switch to unlikely-to-be-executed text section.  */
> +
> +section *
> +unlikely_text_section (void)
> +{
> +  return function_section_1 (current_function_decl, true);
> +}
> +
> +/* When called within a function context, return true if the function
> +   has been assigned a cold text section and if SECT is that section.
> +   When called outside a function context, return true if SECT is the
> +   default cold section.  */
> +
> +bool
> +unlikely_text_section_p (section *sect)
> +{
> +  return sect == function_section_1 (current_function_decl, true);
>  }
>  
>  /* Return the read-only data section associated with function DECL.  */
> @@ -1483,16 +1573,10 @@ assemble_start_function (tree decl, cons
>    else if (DECL_SECTION_NAME (decl))
>      {
>        /* Calls to function_section rely on first_function_block_is_cold
> -	 being accurate.  The first block may be cold even if we aren't
> -	 doing partitioning, if the entire function was decided by
> -	 choose_function_section (predict.c) to be cold.  */
> -
> -      initialize_cold_section_name ();
> -
> -      if (crtl->subsections.unlikely_text_section_name
> -	  && strcmp (TREE_STRING_POINTER (DECL_SECTION_NAME (decl)),
> -		     crtl->subsections.unlikely_text_section_name) == 0)
> -	first_function_block_is_cold = true;
> +	 being accurate.  */
> +      first_function_block_is_cold
> +	 = (cgraph_node (current_function_decl)->frequency
> +	    == NODE_FREQUENCY_UNLIKELY_EXECUTED);
>      }
>  
>    in_cold_section_p = first_function_block_is_cold;
> @@ -5841,15 +5925,6 @@ default_section_type_flags (tree decl, c
>      flags = SECTION_CODE;
>    else if (decl && decl_readonly_section (decl, reloc))
>      flags = 0;
> -  else if (current_function_decl
> -	   && cfun
> -	   && crtl->subsections.unlikely_text_section_name
> -	   && strcmp (name, crtl->subsections.unlikely_text_section_name) == 0)
> -    flags = SECTION_CODE;
> -  else if (!decl
> -	   && (!current_function_decl || !cfun)
> -	   && strcmp (name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
> -    flags = SECTION_CODE;
>    else
>      flags = SECTION_WRITE;
>  
> @@ -6776,12 +6851,6 @@ switch_to_section (section *new_section)
>    switch (SECTION_STYLE (new_section))
>      {
>      case SECTION_NAMED:
> -      if (cfun
> -	  && !crtl->subsections.unlikely_text_section_name
> -	  && strcmp (new_section->named.name,
> -		     UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
> -	crtl->subsections.unlikely_text_section_name = UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
> -
>        targetm.asm_out.named_section (new_section->named.name,
>  				     new_section->named.common.flags,
>  				     new_section->named.decl);
> Index: output.h
> ===================================================================
> --- output.h	(revision 166490)
> +++ output.h	(working copy)
> @@ -264,6 +264,10 @@ extern bool default_assemble_integer (rt
>     be outputable. */
>  extern bool assemble_integer (rtx, unsigned, unsigned, int);
>  
> +/* Return section for TEXT_SECTION_NAME if DECL or DECL_SECTION_NAME (DECL)
> +   is NULL.  */
> +extern section *named_text_subsection (tree, const char *, const char *);
> +
>  /* An interface to assemble_integer for the common case in which a value is
>     fully aligned and must be printed.  VALUE is the value of the integer
>     object and SIZE is the number of bytes it contains.  */
> Index: config/i386/winnt.c
> ===================================================================
> --- config/i386/winnt.c	(revision 166490)
> +++ config/i386/winnt.c	(working copy)
> @@ -414,15 +414,6 @@ i386_pe_section_type_flags (tree decl, c
>      flags = SECTION_CODE;
>    else if (decl && decl_readonly_section (decl, reloc))
>      flags = 0;
> -  else if (current_function_decl
> -	   && cfun
> -	   && crtl->subsections.unlikely_text_section_name
> -	   && strcmp (name, crtl->subsections.unlikely_text_section_name) == 0)
> -    flags = SECTION_CODE;
> -  else if (!decl
> -	   && (!current_function_decl || !cfun)
> -	   && strcmp (name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
> -    flags = SECTION_CODE;
>    else
>      {
>        flags = SECTION_WRITE;
> Index: config/darwin-protos.h
> ===================================================================
> --- config/darwin-protos.h	(revision 166490)
> +++ config/darwin-protos.h	(working copy)
> @@ -57,6 +57,7 @@ extern section *machopic_select_section 
>  extern section *machopic_select_rtx_section (enum machine_mode, rtx,
>  					     unsigned HOST_WIDE_INT);
>  
> +extern section *darwin_function_section (tree, enum node_frequency, bool, bool); 
>  extern void darwin_unique_section (tree decl, int reloc);
>  extern void darwin_asm_named_section (const char *, unsigned int, tree);
>  extern void darwin_non_lazy_pcrel (FILE *, rtx);
> Index: config/microblaze/microblaze.h
> ===================================================================
> --- config/microblaze/microblaze.h	(revision 166490)
> +++ config/microblaze/microblaze.h	(working copy)
> @@ -880,10 +880,6 @@ do {									 \
>  #define SBSS_SECTION_ASM_OP     "\t.sbss"	/* Small RW uninitialized data */
>  #define SBSS2_SECTION_ASM_OP    "\t.sbss2"	/* Small RO uninitialized data */
>  
> -#define HOT_TEXT_SECTION_NAME   ".text.hot"
> -#define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
> -                                ".text.unlikely"
> -
>  /* We do this to save a few 10s of code space that would be taken up
>     by the call_FUNC () wrappers, used by the generic CRT_CALL_STATIC_FUNCTION
>     definition in crtstuff.c.  */
> Index: config/ia64/hpux.h
> ===================================================================
> --- config/ia64/hpux.h	(revision 166490)
> +++ config/ia64/hpux.h	(working copy)
> @@ -221,8 +221,4 @@ do {								\
>     it is fixed, prevent code from being put into .text.unlikely or
>     .text.hot.  */
>  
> -#undef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
> -#define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text"
> -
> -#undef HOT_TEXT_SECTION_NAME
> -#define HOT_TEXT_SECTION_NAME ".text"
> +#define TARGET_ASM_FUNCTION_SECTION NULL
> Index: config/darwin.c
> ===================================================================
> --- config/darwin.c	(revision 166490)
> +++ config/darwin.c	(working copy)
> @@ -1233,12 +1233,22 @@ machopic_select_section (tree decl,
>  	       && DECL_WEAK (decl)
>  	       && !lookup_attribute ("weak_import",
>  				     DECL_ATTRIBUTES (decl)));
> -  section *base_section;
> +  section *base_section = NULL;
>  
>    switch (categorize_decl_for_section (decl, reloc))
>      {
>      case SECCAT_TEXT:
> -      base_section = darwin_text_section (reloc, weak);
> +      {
> +	struct cgraph_node *node;
> +	if (decl && TREE_CODE (decl) == FUNCTION_DECL
> +	    && (node = cgraph_get_node (decl)) != NULL)
> +	  base_section = darwin_function_section (decl,
> +						  node->frequency,
> +						  node->only_called_at_startup,
> +						  node->only_called_at_exit);
> +	if (!base_section)
> +          base_section = darwin_text_section (reloc, weak);
> +      }
>        break;
>  
>      case SECCAT_RODATA:
> @@ -2362,4 +2372,38 @@ darwin_enter_string_into_cfstring_table 
>      }
>  }
>  
> +/* Choose named function section based on its frequency.  */
> +
> +section *
> +darwin_function_section (tree decl, enum node_frequency freq,
> +			  bool startup, bool exit)
> +{
> +  /* Startup code should go to startup subsection unless it is
> +     unlikely executed (this happens especially with function splitting
> +     where we can split away unnecessary parts of static constructors).  */
> +  if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
> +    return named_text_subsection
> +	     (decl, "__TEXT,__startup,regular,pure_instructions", "_startup");
> +
> +  /* Similarly for exit.  */
> +  if (exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
> +    return named_text_subsection (decl,
> +				  "__TEXT,__exit,regular,pure_instructions",
> +				  "_exit");
> +
> +  /* Group cold functions together, similarly for hot code.  */
> +  switch (freq)
> +    {
> +      case NODE_FREQUENCY_UNLIKELY_EXECUTED:
> +	return named_text_subsection
> +		 (decl,
> +	          "__TEXT,__unlikely,regular,pure_instructions", "_unlikely");
> +      case NODE_FREQUENCY_HOT:
> +	return named_text_subsection
> +		 (decl, "__TEXT,__hot,regular,pure_instructions", "_hot");
> +      default:
> +	return NULL;
> +    }
> +}
> +
>  #include "gt-darwin.h"
> Index: config/darwin.h
> ===================================================================
> --- config/darwin.h	(revision 166490)
> +++ config/darwin.h	(working copy)
> @@ -710,11 +710,6 @@ int darwin_label_is_anonymous_local_objc
>  /* The generic version, archs should over-ride where required.  */
>  #define MACHOPIC_NL_SYMBOL_PTR_SECTION ".non_lazy_symbol_pointer"
>  
> -/* These are used by -fbranch-probabilities */
> -#define HOT_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions"
> -#define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
> -                              "__TEXT,__unlikely,regular,pure_instructions"
> -
>  /* Declare the section variables.  */
>  #ifndef USED_FOR_TARGET
>  enum darwin_section_enum {
> @@ -729,6 +724,8 @@ extern GTY(()) section * darwin_sections
>  #undef	TARGET_ASM_SELECT_SECTION
>  #define TARGET_ASM_SELECT_SECTION machopic_select_section
>  #define USE_SELECT_SECTION_FOR_FUNCTIONS
> +#undef	TARGET_ASM_FUNCTION_SECTION
> +#define TARGET_ASM_FUNCTION_SECTION darwin_function_section
>  
>  #undef	TARGET_ASM_SELECT_RTX_SECTION
>  #define TARGET_ASM_SELECT_RTX_SECTION machopic_select_rtx_section
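
For readers following the thread, the selection policy of default_function_section
in the quoted patch can be restated as a small standalone sketch.  This is not GCC
code.  sketch_function_section_name is a hypothetical helper that returns a section
name string rather than a `section *`, the parameter at_exit stands in for the
patch's `exit` argument, and the dotted names ".text.unlikely" and ".text.hot" are
an assumption taken from the defaults.h macros that the patch removes.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same enumerators as enum node_frequency in the quoted coretypes.h hunk.  */
enum node_frequency {
  NODE_FREQUENCY_UNLIKELY_EXECUTED,
  NODE_FREQUENCY_EXECUTED_ONCE,
  NODE_FREQUENCY_NORMAL,
  NODE_FREQUENCY_HOT
};

/* Hypothetical stand-in for default_function_section.  Returns the name of
   the preferred text subsection, or NULL when the function should stay in
   the default text section.  */
const char *
sketch_function_section_name (enum node_frequency freq,
			      bool startup, bool at_exit)
{
  /* Startup-only code goes to the startup subsection unless it is cold.  */
  if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
    return ".text.startup";

  /* Likewise for code that runs only at exit.  */
  if (at_exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
    return ".text.exit";

  /* Otherwise group by frequency: cold and hot get their own subsections.  */
  switch (freq)
    {
    case NODE_FREQUENCY_UNLIKELY_EXECUTED:
      return ".text.unlikely";
    case NODE_FREQUENCY_HOT:
      return ".text.hot";
    default:
      return NULL;
    }
}
```

The ordering matters: the startup and exit checks come before the frequency
switch, so a cold constructor still lands in the unlikely subsection while a
normal or hot one is grouped with the other startup code.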

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 14:16       ` Jack Howarth
@ 2010-11-10 14:56         ` Ian Lance Taylor
  2010-11-10 15:04           ` Jack Howarth
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Lance Taylor @ 2010-11-10 14:56 UTC (permalink / raw)
  To: Jack Howarth
  Cc: Jan Hubicka, Richard Henderson, Joseph S. Myers, gcc-patches,
	rguenther, iains, tglek

Jack Howarth <howarth@bromo.med.uc.edu> writes:

>    This patch is causing a bootstrap failure on x86_64-apple-darwin10 with the error...
>
> makeinfo: Removing output file `doc/gccint.info' due to errors; use --force to preserve.
> gcc -c   -g -fkeep-inline-functions -DIN_GCC   -W -Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wc++-compat -fno-common  -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild -I../../gcc-4.6-20101110/gcc -I../../gcc-4.6-20101110/gcc/build -I../../gcc-4.6-20101110/gcc/../include -I../../gcc-4.6-20101110/gcc/../libcpp/include -I/sw/include -I/sw/include  -I../../gcc-4.6-20101110/gcc/../libdecnumber -I../../gcc-4.6-20101110/gcc/../libdecnumber/dpd -I../libdecnumber -I/sw/include  -I/sw/include -DCLOOG_PPL_BACKEND  -I/sw/include \
> 		-o build/gencodes.o ../../gcc-4.6-20101110/gcc/gencodes.c
> make[3]: *** [doc/gccint.info] Error 1
> make[3]: *** Waiting for unfinished jobs....

That's not the error, that's just the part which says that there were
errors.  There should be more output from makeinfo earlier in the log
which shows the real error.

Ian

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 14:56         ` Ian Lance Taylor
@ 2010-11-10 15:04           ` Jack Howarth
  2010-11-10 15:29             ` Jan Hubicka
  0 siblings, 1 reply; 70+ messages in thread
From: Jack Howarth @ 2010-11-10 15:04 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Jan Hubicka, Richard Henderson, Joseph S. Myers, gcc-patches,
	rguenther, iains, tglek

On Wed, Nov 10, 2010 at 06:54:50AM -0800, Ian Lance Taylor wrote:
> Jack Howarth <howarth@bromo.med.uc.edu> writes:
> 
> >    This patch is causing a bootstrap failure on x86_64-apple-darwin10 with the error...
> >
> > makeinfo: Removing output file `doc/gccint.info' due to errors; use --force to preserve.
> > gcc -c   -g -fkeep-inline-functions -DIN_GCC   -W -Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wc++-compat -fno-common  -DHAVE_CONFIG_H -DGENERATOR_FILE -I. -Ibuild -I../../gcc-4.6-20101110/gcc -I../../gcc-4.6-20101110/gcc/build -I../../gcc-4.6-20101110/gcc/../include -I../../gcc-4.6-20101110/gcc/../libcpp/include -I/sw/include -I/sw/include  -I../../gcc-4.6-20101110/gcc/../libdecnumber -I../../gcc-4.6-20101110/gcc/../libdecnumber/dpd -I../libdecnumber -I/sw/include  -I/sw/include -DCLOOG_PPL_BACKEND  -I/sw/include \
> > 		-o build/gencodes.o ../../gcc-4.6-20101110/gcc/gencodes.c
> > make[3]: *** [doc/gccint.info] Error 1
> > make[3]: *** Waiting for unfinished jobs....
> 
> That's not the error, that's just the part which says that there were
> errors.  There should be more output from makeinfo earlier in the log
> which shows the real error.
> 
> Ian

Ian,
   Oops. Looks like the real error may be...

# make
[ -f stage_final ] || echo stage3 > stage_final
rm -f stage_current
make[4]: Nothing to be done for `all'.
make[3]: Nothing to be done for `all'.
make[4]: Nothing to be done for `all'.
make[3]: Nothing to be done for `all'.
make[3]: Nothing to be done for `all'.
make[3]: Nothing to be done for `all'.
gcc -c   -g -fkeep-inline-functions -DIN_GCC   -W -Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wc++-compat -fno-common  -DHAVE_CONFIG_H -I. -I. -I../../gcc-4.6-20101110/gcc -I../../gcc-4.6-20101110/gcc/. -I../../gcc-4.6-20101110/gcc/../include -I../../gcc-4.6-20101110/gcc/../libcpp/include -I/sw/include -I/sw/include  -I../../gcc-4.6-20101110/gcc/../libdecnumber -I../../gcc-4.6-20101110/gcc/../libdecnumber/dpd -I../libdecnumber -I/sw/include  -I/sw/include -DCLOOG_PPL_BACKEND  -I/sw/include ../../gcc-4.6-20101110/gcc/varasm.c -o varasm.o
../../gcc-4.6-20101110/gcc/varasm.c: In function 'assemble_start_function':
../../gcc-4.6-20101110/gcc/varasm.c:1517: error: 'struct function_subsections' has no member named 'unlikely_text_section_name'
../../gcc-4.6-20101110/gcc/varasm.c: In function 'output_constructor_bitfield':
../../gcc-4.6-20101110/gcc/varasm.c:4832: warning: unknown conversion type character 'E' in format
../../gcc-4.6-20101110/gcc/varasm.c:4832: warning: too many arguments for format
../../gcc-4.6-20101110/gcc/varasm.c: In function 'finish_aliases_1':
../../gcc-4.6-20101110/gcc/varasm.c:5610: warning: unknown conversion type character 'E' in format
../../gcc-4.6-20101110/gcc/varasm.c:5610: warning: too many arguments for format
../../gcc-4.6-20101110/gcc/varasm.c:5624: warning: unknown conversion type character 'E' in format
../../gcc-4.6-20101110/gcc/varasm.c:5624: warning: too many arguments for format
make[3]: *** [varasm.o] Error 1
make[2]: *** [all-stage1-gcc] Error 2
make[1]: *** [stage1-bubble] Error 2
make: *** [all] Error 2

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 15:04           ` Jack Howarth
@ 2010-11-10 15:29             ` Jan Hubicka
  2010-11-10 16:54               ` Jack Howarth
  0 siblings, 1 reply; 70+ messages in thread
From: Jan Hubicka @ 2010-11-10 15:29 UTC (permalink / raw)
  To: Jack Howarth
  Cc: Ian Lance Taylor, Jan Hubicka, Richard Henderson,
	Joseph S. Myers, gcc-patches, rguenther, iains, tglek

> Ian,
>    Oops. Looks like the real error may be...
> 
> # make
> [ -f stage_final ] || echo stage3 > stage_final
> rm -f stage_current
> make[4]: Nothing to be done for `all'.
> make[3]: Nothing to be done for `all'.
> make[4]: Nothing to be done for `all'.
> make[3]: Nothing to be done for `all'.
> make[3]: Nothing to be done for `all'.
> make[3]: Nothing to be done for `all'.
> gcc -c   -g -fkeep-inline-functions -DIN_GCC   -W -Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wc++-compat -fno-common  -DHAVE_CONFIG_H -I. -I. -I../../gcc-4.6-20101110/gcc -I../../gcc-4.6-20101110/gcc/. -I../../gcc-4.6-20101110/gcc/../include -I../../gcc-4.6-20101110/gcc/../libcpp/include -I/sw/include -I/sw/include  -I../../gcc-4.6-20101110/gcc/../libdecnumber -I../../gcc-4.6-20101110/gcc/../libdecnumber/dpd -I../libdecnumber -I/sw/include  -I/sw/include -DCLOOG_PPL_BACKEND  -I/sw/include ../../gcc-4.6-20101110/gcc/varasm.c -o varasm.o
> ../../gcc-4.6-20101110/gcc/varasm.c: In function 'assemble_start_function':
> ../../gcc-4.6-20101110/gcc/varasm.c:1517: error: 'struct function_subsections' has no member named 'unlikely_text_section_name'

Oops, can you please try again with the statement at line 1517:
 crtl->subsections.unlikely_text_section_name = NULL
removed?
I checked that this is the last occurrence of it and updated my local patch.

Thanks for testing it!

Honza

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 15:29             ` Jan Hubicka
@ 2010-11-10 16:54               ` Jack Howarth
  2010-11-10 17:10                 ` Jan Hubicka
  0 siblings, 1 reply; 70+ messages in thread
From: Jack Howarth @ 2010-11-10 16:54 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Ian Lance Taylor, Richard Henderson, Joseph S. Myers,
	gcc-patches, rguenther, iains, tglek

On Wed, Nov 10, 2010 at 04:27:17PM +0100, Jan Hubicka wrote:
> 
> Oops, can you, please, try again with the statement at line 1517:
>  crtl->subsections.unlikely_text_section_name = NULL
> removed?
> I checked that this is the last occurrence of it and updated my local patch.
> 
> Thanks for testing it!
> 
> Honza

Honza,
   With the line containing...

crtl->subsections.unlikely_text_section_name = NULL

in gcc/varasm.c removed, the failure has now moved to...

if [ xinfo = xinfo ]; then \
		makeinfo --split-size=5000000 --split-size=5000000 --split-size=5000000 --no-split -I . -I ../../gcc-4.6-20101110/gcc/doc \
			-I ../../gcc-4.6-20101110/gcc/doc/include -o doc/gccint.info ../../gcc-4.6-20101110/gcc/doc/gccint.texi; \
	fi
../../gcc-4.6-20101110/gcc/doc//tm.texi:7318: No matching `@end deftypefn'.
makeinfo: Removing output file `doc/gccint.info' due to errors; use --force to preserve.
make[3]: *** [doc/gccint.info] Error 1
make[2]: *** [all-stage1-gcc] Error 2
make[1]: *** [stage1-bubble] Error 2
make: *** [all] Error 2

            Jack

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 16:54               ` Jack Howarth
@ 2010-11-10 17:10                 ` Jan Hubicka
  2010-11-10 17:16                   ` Jan Hubicka
  2010-11-10 17:49                   ` Jack Howarth
  0 siblings, 2 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-11-10 17:10 UTC (permalink / raw)
  To: Jack Howarth
  Cc: Jan Hubicka, Ian Lance Taylor, Richard Henderson,
	Joseph S. Myers, gcc-patches, rguenther, iains, tglek

> Honza,
>    With the line containing...
> 
> crtl->subsections.unlikely_text_section_name = NULL
> 
> in gcc/varasm.c removed, the failure has now moved to...
> 
> if [ xinfo = xinfo ]; then \
> 		makeinfo --split-size=5000000 --split-size=5000000 --split-size=5000000 --no-split -I . -I ../../gcc-4.6-20101110/gcc/doc \
> 			-I ../../gcc-4.6-20101110/gcc/doc/include -o doc/gccint.info ../../gcc-4.6-20101110/gcc/doc/gccint.texi; \
> 	fi
> ../../gcc-4.6-20101110/gcc/doc//tm.texi:7318: No matching `@end deftypefn'.
> makeinfo: Removing output file `doc/gccint.info' due to errors; use --force to preserve.

The Makefile now prints this funny message about copying tm.texi.  It seems that you copied
doc/tm.texi.in into doc/tm.texi, while you need to copy the generated tm.texi from your build directory.

Honza

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 17:10                 ` Jan Hubicka
@ 2010-11-10 17:16                   ` Jan Hubicka
  2010-11-10 22:26                     ` Jack Howarth
  2010-11-10 17:49                   ` Jack Howarth
  1 sibling, 1 reply; 70+ messages in thread
From: Jan Hubicka @ 2010-11-10 17:16 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Jack Howarth, Ian Lance Taylor, Richard Henderson,
	Joseph S. Myers, gcc-patches, rguenther, iains, tglek

> Makefile now has this funny message of copying tm.texi.in into tm.texi.  It seems that you copied
> doc/tm.texi.in into doc/tm.texi while you need to copy tm.texi.in from your build directory.

You can also just revert the doc directory change.  It needs no testing on multiple targets.

Thanks for doing the Darwin run BTW :)
Honza

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 17:10                 ` Jan Hubicka
  2010-11-10 17:16                   ` Jan Hubicka
@ 2010-11-10 17:49                   ` Jack Howarth
  2010-11-11 18:57                     ` Ralf Wildenhues
  1 sibling, 1 reply; 70+ messages in thread
From: Jack Howarth @ 2010-11-10 17:49 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Ian Lance Taylor, Richard Henderson, Joseph S. Myers,
	gcc-patches, rguenther, iains, tglek

On Wed, Nov 10, 2010 at 06:04:12PM +0100, Jan Hubicka wrote:
> 
> Makefile now has this funny message of copying tm.texi.in into tm.texi.  It seems that you copied
> doc/tm.texi.in into doc/tm.texi while you need to copy tm.texi.in from your build directory.
> 
> Honza

Honza,
   Isn't part of the problem due to...

Index: doc/tm.texi
===================================================================
--- doc/tm.texi	(revision 166490)
+++ doc/tm.texi	(working copy)
@@ -7335,6 +7335,8 @@ macro is not defined, nothing is output 
 @end defmac
 
 @deftypefn {Target Hook} void TARGET_ASM_NAMED_SECTION (const char *@var{name}, unsigned int @var{flags}, tree @var{decl})
+
+@deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_SECTION (tree @var{decl}, enum node_frequency @var{freq}, bool @var{startup}, bool @var{exit})
 Output assembly directives to switch to section @var{name}.  The section
 should have attributes as specified by @var{flags}, which is a bit mask
 of the @code{SECTION_*} flags defined in @file{output.h}.  If @var{decl}

This change adds a new @deftypefn without adding any text or the closing
@end deftypefn? There could be other issues as well but that doesn't seem
correct.
                  Jack

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10  4:59     ` Group static constructors and destructors in specific subsections, take 2 Jan Hubicka
  2010-11-10 14:16       ` Jack Howarth
@ 2010-11-10 18:59       ` Richard Henderson
  2010-11-10 19:44         ` Jan Hubicka
  2010-11-21  1:13         ` Jan Hubicka
  1 sibling, 2 replies; 70+ messages in thread
From: Richard Henderson @ 2010-11-10 18:59 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Joseph S. Myers, gcc-patches, rguenther, iant, iains, tglek

On 11/09/2010 08:24 PM, Jan Hubicka wrote:
> Index: config/ia64/hpux.h
> ===================================================================
> --- config/ia64/hpux.h	(revision 166490)
> +++ config/ia64/hpux.h	(working copy)
> @@ -221,8 +221,4 @@ do {								\
>     it is fixed, prevent code from being put into .text.unlikely or
>     .text.hot.  */
>  
> -#undef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
> -#define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text"
> -
> -#undef HOT_TEXT_SECTION_NAME
> -#define HOT_TEXT_SECTION_NAME ".text"
> +#define TARGET_ASM_FUNCTION_SECTOIN NULL

Typo.  But I'd rather avoid a check for NULL that affects only one target.
Better to define a function that does what you want.

>  #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
>    if (decl != NULL_TREE
>        && DECL_SECTION_NAME (decl) != NULL_TREE)
> -    return reloc ? unlikely_text_section ()
> -		 : get_named_section (decl, NULL, 0);
> +    {
> +      if (targetm.asm_out.function_section)
> +	section = targetm.asm_out.function_section (decl, freq,
> +						    startup, exit);
> +      if (section)
> +	return section;
> +      return get_named_section (decl, NULL, 0);
> +    }
>    else
> -    return targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
> +    return targetm.asm_out.select_section
> +	    (decl, freq == NODE_FREQUENCY_UNLIKELY_EXECUTED,
> +	     DECL_ALIGN (decl));
>  #else
> -  return reloc ? unlikely_text_section () : hot_function_section (decl);
> +  if (targetm.asm_out.function_section)
> +    section = targetm.asm_out.function_section (decl, freq, startup, exit);
> +  if (section)
> +    return section;
> +  return hot_function_section (decl);
>  #endif

Honestly, adding the function_section hook is an excellent opportunity
to get rid of the USE_SELECT_SECTION_FOR_FUNCTIONS macro, used only by
darwin and the MEP port.

I think this whole section should boil down to 

  return targetm.asm_out.function_section (decl, freq, startup, exit);

with the associated ugliness above pushed into various versions of the
function_section hook.

Also, I think hot_function_section should be merged here, so that we're
properly checking for DECL_SECTION_NAME before deferring to the new
function_section hook.  Which means that the hook will either gen up
a new section name based on (freq,startup,exit), or return text_section.

> +section *
> +named_text_subsection (tree decl,
> +		       const char *text_section_name,
> +		       const char *named_section_suffix)

I don't understand why you're bothering to hash the section here, when
that would also be done inside get_section.  Seems like it'd be just as
easy to concat the section name you want and defer the rest to the 
existing functions.
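
To illustrate the concatenation approach, here is a rough standalone sketch.
It is not GCC code: concat_section_name is a hypothetical helper, and in GCC
the resulting string would presumably be handed to the existing named-section
functions, which already take care of the hashing.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of GCC): build "<base>.<suffix>" in a
   freshly allocated buffer.  The caller frees the result.  Returns NULL
   if the allocation fails.  */
static char *
concat_section_name (const char *base, const char *suffix)
{
  /* base + '.' + suffix + terminating NUL.  */
  size_t len = strlen (base) + 1 + strlen (suffix) + 1;
  char *name = malloc (len);

  if (name != NULL)
    snprintf (name, len, "%s.%s", base, suffix);
  return name;
}
```

For instance, concat_section_name (".text", "startup") yields ".text.startup".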

Also, I think get_named_text_section is a better name, in line with the
related functions.

Also, I wonder about the naming of these sections.  Why would we prefer
".text.foo_unlikely" rather than ".text.unlikely.foo"?  It seems to me that 
we have the possibility of a conflict between a function foo (which is unlikely)
and a function foo_unlikely (which happens not to be unlikely and so does not
receive a suffix).
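
The conflict can be shown with a small standalone sketch.  The two helpers
are hypothetical and only model the two naming schemes for per-function
sections; they are not taken from varasm.c.

```c
#include <stdio.h>
#include <string.h>

/* Suffix scheme: a cold function gets "_unlikely" appended to its
   per-function section name.  */
static void
suffix_scheme (char *buf, size_t len, const char *fn, int cold)
{
  if (cold)
    snprintf (buf, len, ".text.%s_unlikely", fn);
  else
    snprintf (buf, len, ".text.%s", fn);
}

/* Prefix scheme: a cold function is placed under ".text.unlikely.".  */
static void
prefix_scheme (char *buf, size_t len, const char *fn, int cold)
{
  if (cold)
    snprintf (buf, len, ".text.unlikely.%s", fn);
  else
    snprintf (buf, len, ".text.%s", fn);
}
```

Under the suffix scheme, a cold foo and a non-cold foo_unlikely both map to
".text.foo_unlikely".  Under the prefix scheme they map to ".text.unlikely.foo"
and ".text.foo_unlikely", which cannot clash for ordinary C identifiers.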

Also, it appears we're doing some amount of section switching even for
functions that have had their DECL_SECTION_NAME set by the user.  We don't
seem to be distinguishing between user settings and settings that came
from resolve_unique_section.  It seems to me that we should *never* change
the section when the user has specified one.
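
For reference, this is the kind of user setting I mean.  A minimal sketch,
assuming GCC (or a compatible compiler) on an ELF target; the function and
section names are made up for the example.

```c
/* Explicitly placed by the user through the section attribute.  The
   section name is only an example.  */
__attribute__ ((section (".text.user_init")))
int
user_placed (void)
{
  return 1;
}

/* No attribute: the compiler is free to choose the section.  */
int
compiler_placed (void)
{
  return 2;
}
```

Whatever hot/cold/startup/exit grouping we do, user_placed should still end
up in the section the user named.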

I realize that these last two are pre-existing conditions, but previously
they would only be encountered with flag_reorder_blocks_and_partition.  You're
now planning changes that would take effect by default, which seems likely
to break kernels.

Which leads me to wonder if a new hash *is* called for, but would simply
be a map from decl to hot+cold section objects.  Perhaps

static section *
function_section_1 (tree decl, bool is_cold)
{
  section *sec = NULL;

  slot = decl in hash;
  if (slot)
    {
      sec = is_cold ? slot->cold : slot->hot;
      if (sec)
	return sec;
    }
  else
    slot = new hash entry;

  if (targetm.have_named_sections)
    {
      if (DECL_SECTION_NAME (decl))
        {
          // The user has set up a section for the function.  Honor it.
          sec = get_named_section (decl, NULL, 0);
        }
      else if (DECL_ONE_ONLY (decl))
        {
          // The function must be placed in a comdat section.  Lest we
          // fail to merge duplicates, the section name is part of the
          // ABI of the function.  Don't do hot/cold splitting or other
          // sorts of renames.
	  // ??? Note that this still sets DECL_SECTION_NAME; ideally
	  // we'd revise the interface to return a section, but it's
	  // not critical.
          targetm.asm_out.unique_section (decl, false);
          sec = get_named_section (decl, NULL, 0);
	}
    }
  if (!sec)
    {
      // The hook handles all of the details of hot/cold/startup/exit.
      // It must *not* set DECL_SECTION_NAME, so that when we return
      // for the !is_cold section we arrive here and don't get confused
      // by the cases above.
      sec = targetm.asm_out.function_section (...);
    }

  if (is_cold)
    slot->cold = sec;
  else
    slot->hot = sec;

  return sec;
}
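
The caching part of the idea can be modelled in a few lines of standalone C.
This is only a toy: the toy_* names, the fixed-size arrays and the integer
decl ids are all assumptions made for the sketch, and the user-section and
comdat branches from the outline above are left out.

```c
#include <stddef.h>
#include <stdio.h>

#define MAX_DECLS 16

/* Stand-in for a section object; here it is just a name.  */
struct toy_section
{
  char name[64];
};

/* One cache slot per function: its hot and its cold section.  */
struct toy_slot
{
  struct toy_section *hot;
  struct toy_section *cold;
};

static struct toy_slot slots[MAX_DECLS];
static struct toy_section pool[2 * MAX_DECLS];
static size_t pool_used;
static int hook_calls;

/* Stand-in for the target hook: picks a section name for FN.  */
static struct toy_section *
toy_hook (const char *fn, int is_cold)
{
  struct toy_section *sec = &pool[pool_used++];

  hook_calls++;
  if (is_cold)
    snprintf (sec->name, sizeof sec->name, ".text.unlikely.%s", fn);
  else
    snprintf (sec->name, sizeof sec->name, ".text.%s", fn);
  return sec;
}

/* Return the cached section for DECL_ID (must be < MAX_DECLS), asking
   the hook only the first time each (decl, is_cold) pair is requested.  */
static struct toy_section *
toy_function_section (int decl_id, const char *fn, int is_cold)
{
  struct toy_slot *slot = &slots[decl_id];
  struct toy_section **cached = is_cold ? &slot->cold : &slot->hot;

  if (*cached == NULL)
    *cached = toy_hook (fn, is_cold);
  return *cached;
}
```

Asking twice for the hot section of the same decl returns the same object and
calls the hook once; the cold section is computed separately on first use.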



r~

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 18:59       ` Richard Henderson
@ 2010-11-10 19:44         ` Jan Hubicka
  2010-11-21  1:13         ` Jan Hubicka
  1 sibling, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-11-10 19:44 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jan Hubicka, Joseph S. Myers, gcc-patches, rguenther, iant, iains, tglek

> > -#undef HOT_TEXT_SECTION_NAME
> > -#define HOT_TEXT_SECTION_NAME ".text"
> > +#define TARGET_ASM_FUNCTION_SECTOIN NULL
> 
> Typo.  But I'd rather avoid a check for NULL that affects only one target.
> Better to define a function that does what you want.
OK, will do.  Basically it should return NULL.
> 
> >  #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
> >    if (decl != NULL_TREE
> >        && DECL_SECTION_NAME (decl) != NULL_TREE)
> > -    return reloc ? unlikely_text_section ()
> > -		 : get_named_section (decl, NULL, 0);
> > +    {
> > +      if (targetm.asm_out.function_section)
> > +	section = targetm.asm_out.function_section (decl, freq,
> > +						    startup, exit);
> > +      if (section)
> > +	return section;
> > +      return get_named_section (decl, NULL, 0);
> > +    }
> >    else
> > -    return targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
> > +    return targetm.asm_out.select_section
> > +	    (decl, freq == NODE_FREQUENCY_UNLIKELY_EXECUTED,
> > +	     DECL_ALIGN (decl));
> >  #else
> > -  return reloc ? unlikely_text_section () : hot_function_section (decl);
> > +  if (targetm.asm_out.function_section)
> > +    section = targetm.asm_out.function_section (decl, freq, startup, exit);
> > +  if (section)
> > +    return section;
> > +  return hot_function_section (decl);
> >  #endif
> 
> Honestly, adding the function_section hook is an excellent opportunity
> to get rid of the USE_SELECT_SECTION_FOR_FUNCTIONS macro, used only by
> darwin and the MEP port.

Hmm, yep, I was looking into that too, but it seems better to do this incrementally
rather than keep snowballing this.
> 
> I think this whole section should boil down to 
> 
>   return targetm.asm_out.function_section (decl, freq, startup, exit);
> 
> with the associated ugliness above pushed into various versions of the
> function_section hook.
> 
> Also, I think hot_function_section should be merged here, so that we're
> properly checking for DECL_SECTION_NAME before deferring to the new
> function_section hook.  Which means that the hook will either gen up
> a new section name based on (freq,startup,exit), or return text_section.
> 
> > +section *
> > +named_text_subsection (tree decl,
> > +		       const char *text_section_name,
> > +		       const char *named_section_suffix)
> 
> I don't understand why you're bothering to hash the section here, when
> that would also be done inside get_section.  Seems like it'd be just as
> easy to concat the section name you want and defer the rest to the 
> existing functions.

Well, I wanted to avoid the string operations because the old code did that
too.  We can switch sections quite a few times when doing BB splitting, but I
guess it is not critical whether we always concatenate or cache in a hash.
I can definitely get rid of it.
> 
> Also, I think get_named_text_section is a better name, in line with the
> related functions.
OK.  I use subsection because...
> 
> Also, I wonder about the naming of these sections.  Why would we prefer
> ".text.foo_unlikely" rather than ".text.unlikely.foo"?  It seems to me that 
> we have the possibility of a conflict between a function foo (which is unlikely),
> and a function foo_unlikely (which happens to not be unlikely and so not
> receive a suffix).

Hmm, this was again copied from the old code.
> 
> Also, it appears we're doing some amount of section switching even for
> functions that have had their DECL_SECTION_NAME set by the user.  We don't
> seem to be distinguishing between user settings and settings that came
> from resolve_unique_section.  It seems to me that we should *never* change
> the section when the user has specified one.

... while I was glancing over this I believed that it is supposed to work
in such a way that when the user specifies a named section "foo", we create
foo._unlikely/foo._hot etc. subsections, so that if there are multiple functions
in the same section they will be properly grouped within that section.
> 
> I realize that these last two are pre-existing conditions, but previously
> would only be encountered with flag_reorder_blocks_and_partition.  You're
> now planning changes that would take effect by default, which seems likely
> to break kernels.

We have been producing cold functions automatically since ipa-profile got in.
> 
> Which leads me to wonder if a new hash *is* called for, but would simply
> be a map from decl to hot+cold section objects.  Perhaps

I was trying to avoid too many assumptions that we split into hot/cold
only, since it makes sense to do more partitioning and there is no need
to spread this across the code.  But this would work for me, too.

Honza

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 17:16                   ` Jan Hubicka
@ 2010-11-10 22:26                     ` Jack Howarth
  2010-11-10 22:57                       ` Jan Hubicka
  0 siblings, 1 reply; 70+ messages in thread
From: Jack Howarth @ 2010-11-10 22:26 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Ian Lance Taylor, Richard Henderson, Joseph S. Myers,
	gcc-patches, rguenther, iains, tglek

On Wed, Nov 10, 2010 at 06:13:17PM +0100, Jan Hubicka wrote:
> > Makefile now has this funny message of copying tm.texi.in into tm.texi.  It seems that you copied
> > doc/tm.texi.in into doc/tm.texi while you need to copy tm.texi.in from your build directory.
> 
> You can also just revert the doc directory change.  It needs no testing on multiple targets.
> 
> Thanks for doing the Darwin run BTW :)
> Honza
> > 
> > Honza

Honza,
   I can't seem to get past the error...

gcc   -g -O2 -DIN_GCC   -W -Wall -Wwrite-strings -Wcast-qual -Wstrict-prototypes -Wmissing-prototypes -Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Wold-style-definition -Wc++-compat -fno-common  -DHAVE_CONFIG_H -DGENERATOR_FILE  -o build/gengenrtl \
	    build/gengenrtl.o build/errors.o ../build-x86_64-apple-darwin10.5.0/libiberty/libiberty.a
/bin/sh ../../gcc/gcc/../move-if-change tmp-tm.texi tm.texi

Verify that you have permission to grant a GFDL license for all
new text in tm.texi, then copy it to ../../gcc/gcc/doc/tm.texi.
make[2]: *** [s-tm-texi] Error 1
make[2]: *** Waiting for unfinished jobs....
rm gfdl.pod cpp.pod gcov.pod fsf-funding.pod gcc.pod
make[1]: *** [all-gcc] Error 2
make: *** [all] Error 2

even with the simple build...

../gcc/configure --prefix=/Users/howarth/dist --with-gmp=/sw --with-mpc=/sw --with-libiconv-prefix=/sw --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib --enable-languages=c --disable-bootstrap

...using your patch except for the section...

Index: doc/tm.texi
===================================================================
--- doc/tm.texi	(revision 166490)
+++ doc/tm.texi	(working copy)
@@ -7335,6 +7335,8 @@ macro is not defined, nothing is output 
 @end defmac
 
 @deftypefn {Target Hook} void TARGET_ASM_NAMED_SECTION (const char *@var{name}, unsigned int @var{flags}, tree @var{decl})
+
+@deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_SECTION (tree @var{decl}, enum node_frequency @var{freq}, bool @var{startup}, bool @var{exit})
 Output assembly directives to switch to section @var{name}.  The section
 should have attributes as specified by @var{flags}, which is a bit mask
 of the @code{SECTION_*} flags defined in @file{output.h}.  If @var{decl}

that I pruned off of it. If I revert your patch and repeat the build, the error disappears. It is unclear
to me what section in the remainder of the patch could be causing this problem.
         Jack

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 22:26                     ` Jack Howarth
@ 2010-11-10 22:57                       ` Jan Hubicka
  2010-11-11  1:55                         ` Jack Howarth
  2010-11-11 12:52                         ` Jack Howarth
  0 siblings, 2 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-11-10 22:57 UTC (permalink / raw)
  To: Jack Howarth
  Cc: Jan Hubicka, Ian Lance Taylor, Richard Henderson,
	Joseph S. Myers, gcc-patches, rguenther, iains, tglek

> 
> that I pruned off of it. If I revert your patch and repeat the build, the error disappears. It is unclear
> to me what section in the remainder of the patch could be causing this problem.

It is the tm.texi being built from tm.texi.in;
just revert both doc directory changes and you should be safe.

Thanks,
Honza
>          Jack

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 22:57                       ` Jan Hubicka
@ 2010-11-11  1:55                         ` Jack Howarth
  2010-11-11  2:53                           ` Dave Korn
  2010-11-11 12:52                         ` Jack Howarth
  1 sibling, 1 reply; 70+ messages in thread
From: Jack Howarth @ 2010-11-11  1:55 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Ian Lance Taylor, Richard Henderson, Joseph S. Myers,
	gcc-patches, rguenther, iains, tglek

On Wed, Nov 10, 2010 at 11:56:18PM +0100, Jan Hubicka wrote:
> > 
> > that I pruned off of it. If I revert your patch and repeat the build, the error disappears. It is unclear
> > to me what section in the remainder of the patch could be causing this problem.
> 
> It is the tm.texi being built from tm.texi.in
> just revert both doc directory changes and you should be safe.
> 
> Thanks,
> Honza

Honza,
    The problem is that you can't avoid making the texi changes. Without the doc/tm.texi section of your
patch applied, the build actually ends up recreating it in the build directory and then demands that the
change be moved over into the source tree's gcc/doc/tm.texi. If you do that (i.e., restore the original patch),
the error changes to a complaint that the gcc/doc/tm.texi is missing the matching `@end deftypefn'
for the new entry. If you correct that, the error changes yet again to a complaint that the changes should
be in gcc/doc/tm.texi.in rather than gcc/doc/tm.texi. Once you correct that the build can complete. So
your patch seems to require properly formatted changes to gcc/doc/tm.texi.in which you currently lack.
Pretty weird texinfo bug.
           Jack

> >          Jack

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-11  1:55                         ` Jack Howarth
@ 2010-11-11  2:53                           ` Dave Korn
  2010-11-11  5:24                             ` Jack Howarth
  0 siblings, 1 reply; 70+ messages in thread
From: Dave Korn @ 2010-11-11  2:53 UTC (permalink / raw)
  To: Jack Howarth
  Cc: Jan Hubicka, Ian Lance Taylor, Richard Henderson,
	Joseph S. Myers, gcc-patches, rguenther, iains, tglek

On 11/11/2010 01:15, Jack Howarth wrote:
> On Wed, Nov 10, 2010 at 11:56:18PM +0100, Jan Hubicka wrote:
>>> that I pruned off of it. If I revert your patch and repeat the build,
>>> the error disappears. It is unclear to me what section in the remainder
>>> of the patch could be causing this problem.
>> It is the tm.texi being built from tm.texi.in; just revert both doc
>> directory changes and you should be safe.
>> 
>> Thanks, Honza
> 
> Honza, The problem is that you can't avoid making the texi changes.

  Yes you can.  Just start with a fresh build dir, a clean source tree, and
discard the entire tm.texi hunk from Honza's patch, which was most likely
inadvertent since it's not mentioned in the changelog.

> Without
> the doc/tm.texi section of your patch applied, the build actually ends up
> recreating it in the build directory and then demands that the change be
> moved over into the source tree's gcc/doc/tm.texi.

  Not if $srcdir/gcc/doc/tm.texi.in is clean.

    cheers,
      DaveK

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-11  2:53                           ` Dave Korn
@ 2010-11-11  5:24                             ` Jack Howarth
  0 siblings, 0 replies; 70+ messages in thread
From: Jack Howarth @ 2010-11-11  5:24 UTC (permalink / raw)
  To: Dave Korn
  Cc: Jan Hubicka, Ian Lance Taylor, Richard Henderson,
	Joseph S. Myers, gcc-patches, rguenther, iains, tglek

On Thu, Nov 11, 2010 at 02:57:02AM +0000, Dave Korn wrote:
> On 11/11/2010 01:15, Jack Howarth wrote:
> > On Wed, Nov 10, 2010 at 11:56:18PM +0100, Jan Hubicka wrote:
> >>> that I pruned off of it. If I revert your patch and repeat the build,
> >>> the error disappears. It is unclear to me what section in the remainder
> >>> of the patch could be causing this problem.
> >> It is the tm.texi being built from tm.texi.in; just revert both doc
> >> directory changes and you should be safe.
> >> 
> >> Thanks, Honza
> > 
> > Honza, The problem is that you can't avoid making the texi changes.
> 
>   Yes you can.  Just start with a fresh build dir, a clean source tree, and
> discard the entire tm.texi hunk from Honza's patch, which was most likely
> inadvertent since it's not mentioned in the changelog.

Dave,
   No, I did that. Starting from a fresh source tree and using all of the
patch except for the first hunk gcc/doc/tm.texi (which I deleted manually
from the patch before applying), the failure still occurs. Something in the
build triggers the generation of the same exact lines which were omitted
manually from the patch. Believe me I tried this four or five times to
verify the behavior exists.
                 Jack

> 
> > Without
> > the doc/tm.texi section of your patch applied, the build actually ends up
> > recreating it in the build directory and then demands that the change be
> > moved over into the source tree's gcc/doc/tm.texi.
> 
>   Not if $srcdir/gcc/doc/tm.texi.in is clean.
> 
>     cheers,
>       DaveK

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 22:57                       ` Jan Hubicka
  2010-11-11  1:55                         ` Jack Howarth
@ 2010-11-11 12:52                         ` Jack Howarth
  2010-11-11 16:28                           ` Jan Hubicka
  1 sibling, 1 reply; 70+ messages in thread
From: Jack Howarth @ 2010-11-11 12:52 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Ian Lance Taylor, Richard Henderson, Joseph S. Myers,
	gcc-patches, rguenther, iains, tglek

On Wed, Nov 10, 2010 at 11:56:18PM +0100, Jan Hubicka wrote:
> > 
> > that I pruned off of it. If I revert your patch and repeat the build, the error disappears. It is unclear
> > to me what section in the remainder of the patch could be causing this problem.
> 
> It is the tm.texi being built from tm.texi.in
> just revert both doc directory changes and you should be safe.
> 
> Thanks,
> Honza
> >          Jack

Honza,
   The proposed patch from http://gcc.gnu.org/ml/gcc-patches/2010-11/msg00983.html
with the fix from http://gcc.gnu.org/ml/gcc-patches/2010-11/msg01035.html shows
no regressions on x86_64-apple-darwin10...

http://gcc.gnu.org/ml/gcc-testresults/2010-11/msg00874.html

          Jack

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-11 12:52                         ` Jack Howarth
@ 2010-11-11 16:28                           ` Jan Hubicka
  0 siblings, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-11-11 16:28 UTC (permalink / raw)
  To: Jack Howarth
  Cc: Jan Hubicka, Ian Lance Taylor, Richard Henderson,
	Joseph S. Myers, gcc-patches, rguenther, iains, tglek

> Honza,
>    The proposed patch from http://gcc.gnu.org/ml/gcc-patches/2010-11/msg00983.html
> with the fix from http://gcc.gnu.org/ml/gcc-patches/2010-11/msg01035.html shows
> no regressions on x86_64-apple-darwin10...
> 
> http://gcc.gnu.org/ml/gcc-testresults/2010-11/msg00874.html

Thanks.  I will make at least one extra revision to incorporate Richard's comments once
we are settled, but it is good to know that it generally works.  I was afraid of getting
the section flags wrong by removing all the special cases on cold sections.

Honza
> 
>           Jack

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 17:49                   ` Jack Howarth
@ 2010-11-11 18:57                     ` Ralf Wildenhues
  2010-11-11 21:41                       ` Dave Korn
  0 siblings, 1 reply; 70+ messages in thread
From: Ralf Wildenhues @ 2010-11-11 18:57 UTC (permalink / raw)
  To: Jack Howarth
  Cc: Jan Hubicka, Ian Lance Taylor, Richard Henderson,
	Joseph S. Myers, gcc-patches, rguenther, iains, tglek

* Jack Howarth wrote on Wed, Nov 10, 2010 at 06:44:42PM CET:
> Honza,
>    Isn't part of the problem due to...
> 

> --- doc/tm.texi	(revision 166490)
> +++ doc/tm.texi	(working copy)
> @@ -7335,6 +7335,8 @@ macro is not defined, nothing is output 
>  @end defmac
>  
>  @deftypefn {Target Hook} void TARGET_ASM_NAMED_SECTION (const char *@var{name}, unsigned int @var{flags}, tree @var{decl})
> +
> +@deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_SECTION (tree @var{decl}, enum node_frequency @var{freq}, bool @var{startup}, bool @var{exit})
>  Output assembly directives to switch to section @var{name}.  The section
>  should have attributes as specified by @var{flags}, which is a bit mask
>  of the @code{SECTION_*} flags defined in @file{output.h}.  If @var{decl}
> 
> This change adds a new @deftypefn without adding any text or the closing
> @end deftypefn? There could be other issues as well but that doesn't seem
> correct.

If this entry should be there, then it should be a @deftypefnx one,
and without a preceding empty line.

You can wrap long @def* lines by ending the line with an @,
and optional indenting spaces on the following line.

Cheers,
Ralf

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-11 18:57                     ` Ralf Wildenhues
@ 2010-11-11 21:41                       ` Dave Korn
  2010-11-11 23:04                         ` Dave Korn
  0 siblings, 1 reply; 70+ messages in thread
From: Dave Korn @ 2010-11-11 21:41 UTC (permalink / raw)
  To: Ralf Wildenhues, Jack Howarth, Jan Hubicka, Ian Lance Taylor,
	Richard Henderson, Joseph S. Myers, gcc-patches, rguenther,
	iains, tglek

On 11/11/2010 18:55, Ralf Wildenhues wrote:
> * Jack Howarth wrote on Wed, Nov 10, 2010 at 06:44:42PM CET:
>> Honza,
>>    Isn't part of the problem due to...
>>
> 
>> --- doc/tm.texi	(revision 166490)
>> +++ doc/tm.texi	(working copy)
>> @@ -7335,6 +7335,8 @@ macro is not defined, nothing is output 
>>  @end defmac
>>  
>>  @deftypefn {Target Hook} void TARGET_ASM_NAMED_SECTION (const char *@var{name}, unsigned int @var{flags}, tree @var{decl})
>> +
>> +@deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_SECTION (tree @var{decl}, enum node_frequency @var{freq}, bool @var{startup}, bool @var{exit})
>>  Output assembly directives to switch to section @var{name}.  The section
>>  should have attributes as specified by @var{flags}, which is a bit mask
>>  of the @code{SECTION_*} flags defined in @file{output.h}.  If @var{decl}
>>
>> This change adds a new @deftypefn without adding any text or the closing
>> @end deftypefn? There could be other issues as well but that doesn't seem
>> correct.
> 
> If this entry should be there, then it should be a @deftypefnx one,
> and without a preceding empty line.
> 
> You can wrap long @def* lines by ending the line with an @,
> and optional indenting spaces on the following line.

  The change is coming from genhooks, autogenerated based on the addition to
target.def.  It appears that if you add a new hook in target.def, but don't
add an @hook documentation paragraph in tm.texi.in to match, genhooks.c emits
the @deftype{fn,var} start marker along with the function name and parameters,
but then fails to enter this if clause in emit_documentation():

	      if (hook_array[i].doc[0])
		{
		  printf ("\n");
		  /* Print each documentation paragraph in turn.  */
		  for (doc = hook_array[i].doc; *doc; doc = p_end)
		    {
		      /* Find paragraph end.  */
		      p_end = strstr (doc, "\n\n");
		      p_end = (p_end ? p_end + 2 : doc + strlen (doc));
		      /* Print paragraph, emitting @Fcode as @code.  */
		      for (; (fcode = strstr (doc, "@Fcode{")) && fcode < p_end;
			   doc = fcode + 2)
			printf ("%.*s@", (int) (fcode - doc), doc);
		      printf ("%.*s", (int) (p_end - doc), doc);
		      /* Emit function indices for next paragraph.  */
		      emit_findices (p_end, name);
		    }
		  printf ("\n@end %s", deftype);
		}

... which is where the end directive gets emitted.  Genhooks should probably
fail more gracefully, but I imagine it is an error to add a new hook to
target.def without documenting it in tm.texi.in.  (If this isn't attempting to
enforce policy, then moving the final printf() outside the if condition should
do the trick.)


    cheers,
      DaveK

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-11 21:41                       ` Dave Korn
@ 2010-11-11 23:04                         ` Dave Korn
  2010-11-12 19:27                           ` Ralf Wildenhues
  0 siblings, 1 reply; 70+ messages in thread
From: Dave Korn @ 2010-11-11 23:04 UTC (permalink / raw)
  To: Ralf Wildenhues, Jack Howarth, Jan Hubicka, Ian Lance Taylor,
	Richard Henderson, Joseph S. Myers, gcc-patches, rguenther,
	iains, tglek

On 11/11/2010 22:04, Dave Korn wrote:

>> If this entry should be there, then it should be a @deftypefnx one, and
>> without a preceding empty line.

> The change is coming from genhooks, autogenerated based on the addition to 
> target.def.  It appears that if you add a new hook in target.def, but don't
>  add an @hook documentation paragraph in tm.texi.in to match, genhooks.c
> emits the @deftype{fn,var} start marker along with the function name and
> parameters, but then fails to enter this if clause in emit_documentation():

> ... which is where the end directive gets emitted.  Genhooks should
> probably fail more gracefully, but I imagine it is an error to add a new
> hook to target.def without documenting it in tm.texi.in.  (If this isn't
> attempting to enforce policy, then moving the final printf() outside the if
> condition should do the trick.)

  Actually, it's deliberate:

>    When the doc field is "", @deftypefn/@deftypevr and the hook signature
>    is emitted, but not the matching @end.  This allows all the free-form
>    documentation to be placed in IN_FNAME, to work around GPL/GFDL
>    licensing incompatibility issues.  */

  If what is desired is no docs at all,

>    In particular, a doc field of "*" means not to emit any documentation for
>    this target.def / hook_array entry at all (there might be documentation
>    for this hook in the file named IN_FNAME, though).

is the way to go, but using "" deliberately indicates "free-form documentation
[...] in " ... tm.texi.in.  (I expect in this case that adding docs to
tm.texi.in rather than using "*" to create an undocumented target hook is the
desired fix.)
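
  For illustration only, the three doc-field cases quoted above can be
modelled roughly as below.  This is a hypothetical sketch, not the genhooks.c
code: the name sketch_emit_hook, its parameters and the buffer-based output
are all assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified model of the doc-field handling described in
   the quoted comments.  NOT the real genhooks.c: the name, the signature
   and the use of a caller-supplied buffer are assumptions.  Returns the
   number of characters written (as snprintf does), or 0 when nothing is
   emitted.  */
int
sketch_emit_hook (char *buf, size_t size, const char *deftype,
                  const char *signature, const char *doc)
{
  assert (buf != NULL && size > 0);
  assert (deftype != NULL && signature != NULL && doc != NULL);

  /* A doc field of "*": emit no documentation at all for this hook.  */
  if (strcmp (doc, "*") == 0)
    {
      buf[0] = '\0';
      return 0;
    }

  /* A doc field of "": emit only the opening line.  The free-form text
     and the matching @end are expected to come from tm.texi.in.  */
  if (doc[0] == '\0')
    return snprintf (buf, size, "@%s %s\n", deftype, signature);

  /* Otherwise: emit a complete block, including the closing @end.  */
  return snprintf (buf, size, "@%s %s\n%s\n@end %s\n",
                   deftype, signature, doc, deftype);
}
```

  With an empty doc string the output stops after the opening line, which is
why a missing @hook paragraph in tm.texi.in leaves the block unterminated.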

    cheers,
      DaveK

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-11 23:04                         ` Dave Korn
@ 2010-11-12 19:27                           ` Ralf Wildenhues
  2010-11-12 20:40                             ` Dave Korn
  0 siblings, 1 reply; 70+ messages in thread
From: Ralf Wildenhues @ 2010-11-12 19:27 UTC (permalink / raw)
  To: Dave Korn
  Cc: Jack Howarth, Jan Hubicka, Ian Lance Taylor, Richard Henderson,
	Joseph S. Myers, gcc-patches, rguenther, iains, tglek

* Dave Korn wrote on Fri, Nov 12, 2010 at 12:23:25AM CET:
> On 11/11/2010 22:04, Dave Korn wrote:
> 
> >> If this entry should be there, then it should be a @deftypefnx one, and
> >> without a preceding empty line.
> 
> > The change is coming from genhooks, autogenerated based on the addition to 
> > target.def.  It appears that if you add a new hook in target.def, but don't
> >  add an @hook documentation paragraph in tm.texi.in to match, genhooks.c
> > emits the @deftype{fn,var} start marker along with the function name and
> > parameters, but then fails to enter this if clause in emit_documentation():
> 
> > ... which is where the end directive gets emitted.  Genhooks should
> > probably fail more gracefully, but I imagine it is an error to add a new
> > hook to target.def without documenting it in tm.texi.in.  (If this isn't
> > attempting to enforce policy, then moving the final printf() outside the if
> > condition should do the trick.)
> 
>   Actually, it's deliberate:

It doesn't matter if it's deliberate or not.  A texinfo file must be
well-formed.  That means, each @deftypefn must end with @end deftypefn,
without intervening @deftypefn lines.  Between these two, there may be
zero or more @deftypefnx lines.
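
The rule above can be checked mechanically.  The following is only a minimal
sketch under stated assumptions: sketch_check_deftypefn is a hypothetical
helper, it looks only at commands that start a line, and it knows nothing
about the other Texinfo definition commands.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical checker for the pairing rule stated above; it is not part
   of Texinfo or GCC.  Returns 0 if every @deftypefn in TEXT is closed by
   an @end deftypefn with no intervening @deftypefn, otherwise the 1-based
   line number of the first offending line (for an unclosed block, the
   line that opened it).  */
int
sketch_check_deftypefn (const char *text)
{
  int open_line = 0;	/* Line of the currently open @deftypefn, or 0.  */
  int line = 1;
  const char *p = text;

  assert (text != NULL);
  while (*p)
    {
      /* Test the longer command first: "@deftypefn" is a prefix of
	 "@deftypefnx".  */
      if (strncmp (p, "@deftypefnx", 11) == 0)
	{
	  if (!open_line)
	    return line;	/* @deftypefnx outside any block.  */
	}
      else if (strncmp (p, "@deftypefn", 10) == 0)
	{
	  if (open_line)
	    return line;	/* New block before the previous @end.  */
	  open_line = line;
	}
      else if (strncmp (p, "@end deftypefn", 14) == 0)
	{
	  if (!open_line)
	    return line;	/* @end without a matching opener.  */
	  open_line = 0;
	}

      /* Advance to the start of the next line.  */
      p += strcspn (p, "\n");
      if (*p == '\n')
	{
	  p++;
	  line++;
	}
    }
  return open_line;
}
```

Run on the hunk quoted earlier in this thread, such a check would flag the
second @deftypefn line, since the first one is still open at that point.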

Cheers,
Ralf

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-12 19:27                           ` Ralf Wildenhues
@ 2010-11-12 20:40                             ` Dave Korn
  0 siblings, 0 replies; 70+ messages in thread
From: Dave Korn @ 2010-11-12 20:40 UTC (permalink / raw)
  To: Ralf Wildenhues, Dave Korn, Jack Howarth, Jan Hubicka,
	Ian Lance Taylor, Richard Henderson, Joseph S. Myers,
	gcc-patches, rguenther, iains, tglek

On 12/11/2010 18:52, Ralf Wildenhues wrote:
> * Dave Korn wrote on Fri, Nov 12, 2010 at 12:23:25AM CET:
>> On 11/11/2010 22:04, Dave Korn wrote:
>>
>>>> If this entry should be there, then it should be a @deftypefnx one, and
>>>> without a preceding empty line.
>>> The change is coming from genhooks, autogenerated based on the addition to 
>>> target.def.  It appears that if you add a new hook in target.def, but don't
>>>  add an @hook documentation paragraph in tm.texi.in to match, genhooks.c
>>> emits the @deftype{fn,var} start marker along with the function name and
>>> parameters, but then fails to enter this if clause in emit_documentation():
>>> ... which is where the end directive gets emitted.  Genhooks should
>>> probably fail more gracefully, but I imagine it is an error to add a new
>>> hook to target.def without documenting it in tm.texi.in.  (If this isn't
>>> attempting to enforce policy, then moving the final printf() outside the if
>>> condition should do the trick.)
>>   Actually, it's deliberate:
> 
> It doesn't matter if it's deliberate or not.  A texinfo file must be
> well-formed.  That means, each @deftypefn must end with @end deftypefn,
> without intervening @deftypefn lines.  Between these two, there may be
> zero or more @deftypefnx lines.

  Yes, and if the @hook hunk had not been missing from tm.texi.in, that is
exactly what genhooks would have output.  But the @hook hunk was missing,
which is not allowed when the doc string is "".  This is a simple case of
GIGO: the input to genhooks was invalid, and it ended up generating invalid
texinfo output as a result.


    cheers,
      DaveK

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-10 18:59       ` Richard Henderson
  2010-11-10 19:44         ` Jan Hubicka
@ 2010-11-21  1:13         ` Jan Hubicka
  2010-11-21  3:31           ` Jan Hubicka
  1 sibling, 1 reply; 70+ messages in thread
From: Jan Hubicka @ 2010-11-21  1:13 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jan Hubicka, Joseph S. Myers, gcc-patches, rguenther, iant, iains, tglek

> 
> Honestly, adding the function_section hook is an excellent opportunity
> to get rid of the USE_SELECT_SECTION_FOR_FUNCTIONS macro, used only by
> darwin and the MEP port.

I am trying to look into this.  I wonder what darwin is using it for after all?
machopic_select_section seems to just return darwin_text_section for all
FUNCTION_DECLs....

Well, this is all bit-rotten.  I will try to prepare a more up-to-date patch now so
we can discuss things incrementally.

Honza

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-21  1:13         ` Jan Hubicka
@ 2010-11-21  3:31           ` Jan Hubicka
  2010-11-21 18:49             ` Jack Howarth
                               ` (3 more replies)
  0 siblings, 4 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-11-21  3:31 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Richard Henderson, Joseph S. Myers, gcc-patches, rguenther, iant,
	iains, tglek


Hi,
this is an updated patch.  I have addressed, I hope, all the comments except for
removing USE_SELECT_SECTION_FOR_FUNCTIONS.  I am somewhat confused by darwin and
thus would like to handle it incrementally.

I've removed the subsection hash; the string concatenations and the section hash
should be fast enough given that we switch at most twice per function.  I also
added a flag to tree_decl_with_vis specifying whether the section name is implied
by -ffunction-sections or given by the user, and made default_function_section
produce text.unlikely.function_name by default.  I kept appending the _unlikely
suffix only for darwin.  It was developed there and I suppose the funny suffixes
do have a meaning for the darwin linker.
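
To make the intended behaviour concrete, here is a rough stand-alone sketch of
the kind of decision the new hook makes.  It is not the code from the patch:
sketch_text_section_name and sketch_function_subsection are hypothetical
helpers, the precedence between the frequency and the startup/exit flags is an
assumption, and real code would return a section * rather than a string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same enumerators as the enum this patch moves to coretypes.h.  */
enum node_frequency
{
  NODE_FREQUENCY_UNLIKELY_EXECUTED,
  NODE_FREQUENCY_EXECUTED_ONCE,
  NODE_FREQUENCY_NORMAL,
  NODE_FREQUENCY_HOT
};

/* Hypothetical helper: pick a text subsection name, or NULL for the
   default text section.  The order of the tests is an assumption.  */
const char *
sketch_text_section_name (enum node_frequency freq, bool startup,
			  bool exit_only)
{
  if (freq == NODE_FREQUENCY_UNLIKELY_EXECUTED)
    return ".text.unlikely";
  if (startup)
    return ".text.startup";
  if (exit_only)
    return ".text.exit";
  if (freq == NODE_FREQUENCY_HOT)
    return ".text.hot";
  return NULL;
}

/* Hypothetical helper: build "<prefix>.<function name>", as one might
   for -ffunction-sections.  Returns the length written, or -1 if BUF
   is too small.  */
int
sketch_function_subsection (char *buf, size_t size, const char *prefix,
			    const char *fn_name)
{
  assert (buf != NULL && prefix != NULL && fn_name != NULL);
  /* +2: one for the '.' separator, one for the terminating NUL.  */
  if (strlen (prefix) + strlen (fn_name) + 2 > size)
    return -1;
  return snprintf (buf, size, "%s.%s", prefix, fn_name);
}
```

Returning NULL for the common case mirrors the hook documentation in the
patch, which says NULL means the function goes to the default text section.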

Bootstrapped/regtested x86_64-linux, OK (after testing darwin and ia64)?

Honza

	* tree.h (DECL_HAS_IMPLICIT_SECTION_NAME_P): New macro.
	(tree_decl_with_vis): Add implicit_section_name_p.
	* targhooks.h (default_function_section): Declare.
	* target.def (function_section): New hook.
	* defaults.h (HOT_TEXT_SECTION_NAME,
	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
	* predict.c (choose_function_section): Remove.
	(estimate_bb_frequencies): Do not use choose_function_section.
	* coretypes.h (enum node_frequency): Move here from cgraph.h
	* cgraph.h (enum node_frequency): Remove.
	* varasm.c (initialize_cold_section_name, unlikely_text_section,
	unlikely_text_section_p): Remove.
	(named_subsection_entry): New structure.
	(get_text_section): New function.
	(default_function_section): New function.
	(function_section_1): Break out from ...; handle profile info.
	(function_section): ... here.
	(unlikely_text_section): Remove.
	(unlikely_text_section_p): Use function_section_1.
	(assemble_start_function): Do not initialize cold section.
	(default_section_type_flags): Do not special case cold subsection.
	(switch_to_section): Likewise.
	* output.h (get_text_section): Define.
	* config/i386/winnt.c: Do not special case cold section.
	* config/darwin-protos.h (darwin_function_section): Declare.
	* config/microblaze/microblaze.h (HOT_TEXT_SECTION_NAME,
	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
	* config/ia64/hpux.h (HOT_TEXT_SECTION_NAME,
	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
	(TARGET_ASM_FUNCTION_SECTION): Define to ia64_hpux_function_section.
	* config/ia64/ia64.c (ia64_hpux_function_section): New function.
	* config/ia64/ia64-protos.h (ia64_hpux_function_section): Define.
	* config/darwin.c (machopic_select_section): Use
	darwin_function_section.
	(darwin_function_section): New function.
	* config/darwin.h (HOT_TEXT_SECTION_NAME,
	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
	(TARGET_ASM_FUNCTION_SECTION): Define.
Index: doc/tm.texi
===================================================================
*** doc/tm.texi	(revision 166985)
--- doc/tm.texi	(working copy)
*************** is non-NULL, it is the @code{VAR_DECL} o
*** 7330,7335 ****
--- 7330,7345 ----
  this section is associated.
  @end deftypefn
  
+ @deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_SECTION (tree @var{decl}, enum node_frequency @var{freq}, bool @var{startup}, bool @var{exit})
+ Return preferred text (sub)section for function @var{decl}.
+ Main purpose of this function is to separate cold, normal and hot
+ functions. @var{startup} is true when function is known to be used only 
+ at startup (from static constructors or it is @code{main()}).
+ @var{exit} is true when function is known to be used only at exit
+ (from static destructors).
+ Return NULL if function should go to default text section.
+ @end deftypefn
+ 
  @deftypevr {Target Hook} bool TARGET_HAVE_NAMED_SECTIONS
  This flag is true if the target supports @code{TARGET_ASM_NAMED_SECTION}.
  @end deftypevr
Index: doc/tm.texi.in
===================================================================
*** doc/tm.texi.in	(revision 166985)
--- doc/tm.texi.in	(working copy)
*************** is non-NULL, it is the @code{VAR_DECL} o
*** 7305,7310 ****
--- 7305,7320 ----
  this section is associated.
  @end deftypefn
  
+ @hook TARGET_ASM_FUNCTION_SECTION
+ Return preferred text (sub)section for function @var{decl}.
+ Main purpose of this function is to separate cold, normal and hot
+ functions. @var{startup} is true when function is known to be used only 
+ at startup (from static constructors or it is @code{main()}).
+ @var{exit} is true when function is known to be used only at exit
+ (from static destructors).
+ Return NULL if function should go to default text section.
+ @end deftypefn
+ 
  @hook TARGET_HAVE_NAMED_SECTIONS
  This flag is true if the target supports @code{TARGET_ASM_NAMED_SECTION}.
  @end deftypevr
Index: targhooks.h
===================================================================
*** targhooks.h	(revision 166985)
--- targhooks.h	(working copy)
*************** extern int default_label_align_after_bar
*** 171,177 ****
  extern int default_loop_align_max_skip (rtx);
  extern int default_label_align_max_skip (rtx);
  extern int default_jump_align_max_skip (rtx);
! 
  extern enum machine_mode default_get_reg_raw_mode(int);
  
  extern const struct default_options empty_optimization_table[];
--- 171,178 ----
  extern int default_loop_align_max_skip (rtx);
  extern int default_label_align_max_skip (rtx);
  extern int default_jump_align_max_skip (rtx);
! extern section * default_function_section(tree decl, enum node_frequency freq,
! 					  bool startup, bool exit);
  extern enum machine_mode default_get_reg_raw_mode(int);
  
  extern const struct default_options empty_optimization_table[];
Index: target.def
===================================================================
*** target.def	(revision 166985)
--- target.def	(working copy)
*************** DEFHOOK
*** 281,286 ****
--- 281,299 ----
   void, (const char *name, unsigned int flags, tree decl),
   default_no_named_section)
  
+ /* Return preferred text (sub)section for function DECL.
+    Main purpose of this function is to separate cold, normal and hot
+    functions. STARTUP is true when function is known to be used only 
+    at startup (from static constructors or it is main()).
+    EXIT is true when function is known to be used only at exit
+    (from static destructors).
+    Return NULL if function should go to default text section.  */
+ DEFHOOK
+ (function_section,
+  "",
+  section *, (tree decl, enum node_frequency freq, bool startup, bool exit),
+  default_function_section)
+ 
  /* Return a mask describing how relocations should be treated when
     selecting sections.  Bit 1 should be set if global relocations
     should be placed in a read-write section; bit 0 should be set if
Index: cgraph.h
===================================================================
*** cgraph.h	(revision 166985)
--- cgraph.h	(working copy)
*************** struct GTY(()) cgraph_clone_info
*** 180,199 ****
    bitmap combined_args_to_skip;
  };
  
- enum node_frequency {
-   /* This function most likely won't be executed at all.
-      (set only when profile feedback is available or via function attribute). */
-   NODE_FREQUENCY_UNLIKELY_EXECUTED,
-   /* For functions that are known to be executed once (i.e. constructors, destructors
-      and main function.  */
-   NODE_FREQUENCY_EXECUTED_ONCE,
-   /* The default value.  */
-   NODE_FREQUENCY_NORMAL,
-   /* Optimize this function hard
-      (set only when profile feedback is available or via function attribute). */
-   NODE_FREQUENCY_HOT
- };
- 
  
  /* The cgraph data structure.
     Each function decl has assigned cgraph_node listing callees and callers.  */
--- 180,185 ----
Index: defaults.h
===================================================================
*** defaults.h	(revision 166985)
--- defaults.h	(working copy)
*************** see the files COPYING3 and COPYING.RUNTI
*** 897,910 ****
  #define TARGET_DEC_EVAL_METHOD 2
  #endif
  
- #ifndef HOT_TEXT_SECTION_NAME
- #define HOT_TEXT_SECTION_NAME ".text.hot"
- #endif
- 
- #ifndef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
- #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text.unlikely"
- #endif
- 
  #ifndef HAS_LONG_COND_BRANCH
  #define HAS_LONG_COND_BRANCH 0
  #endif
--- 897,902 ----
Index: tree.h
===================================================================
*** tree.h	(revision 166985)
--- tree.h	(working copy)
*************** struct GTY(()) tree_parm_decl {
*** 3107,3112 ****
--- 3107,3117 ----
  #define DECL_HAS_INIT_PRIORITY_P(NODE) \
    (VAR_DECL_CHECK (NODE)->decl_with_vis.init_priority_p)
  
+ /* Specify whether the section name was set by user or by
+    compiler via -ffunction-sections.  */
+ #define DECL_HAS_IMPLICIT_SECTION_NAME_P(NODE) \
+   (DECL_WITH_VIS_CHECK (NODE)->decl_with_vis.implicit_section_name_p)
+ 
  struct GTY(()) tree_decl_with_vis {
   struct tree_decl_with_rtl common;
   tree assembler_name;
*************** struct GTY(()) tree_decl_with_vis {
*** 3135,3141 ****
   unsigned init_priority_p : 1;
   /* Used by C++ only.  Might become a generic decl flag.  */
   unsigned shadowed_for_var_p : 1;
!  /* 14 unused bits. */
  };
  
  extern tree decl_debug_expr_lookup (tree);
--- 3140,3148 ----
   unsigned init_priority_p : 1;
   /* Used by C++ only.  Might become a generic decl flag.  */
   unsigned shadowed_for_var_p : 1;
!  /* When SECTION_NAME is implied by -ffunction-sections.  */
!  unsigned implicit_section_name_p : 1;
!  /* 13 unused bits. */
  };
  
  extern tree decl_debug_expr_lookup (tree);
Index: predict.c
===================================================================
*** predict.c	(revision 166985)
--- predict.c	(working copy)
*************** static sreal real_zero, real_one, real_a
*** 78,84 ****
  static void combine_predictions_for_insn (rtx, basic_block);
  static void dump_prediction (FILE *, enum br_predictor, int, basic_block, int);
  static void predict_paths_leading_to (basic_block, enum br_predictor, enum prediction);
- static void choose_function_section (void);
  static bool can_predict_insn_p (const_rtx);
  
  /* Information we hold about each branch predictor.
--- 78,83 ----
*************** estimate_bb_frequencies (void)
*** 2185,2192 ****
        free_aux_for_edges ();
      }
    compute_function_frequency ();
-   if (flag_reorder_functions)
-     choose_function_section ();
  }
  
  /* Decide whether function is hot, cold or unlikely executed.  */
--- 2184,2189 ----
*************** compute_function_frequency (void)
*** 2232,2266 ****
      }
  }
  
- /* Choose appropriate section for the function.  */
- static void
- choose_function_section (void)
- {
-   struct cgraph_node *node = cgraph_node (current_function_decl);
-   if (DECL_SECTION_NAME (current_function_decl)
-       || !targetm.have_named_sections
-       /* Theoretically we can split the gnu.linkonce text section too,
- 	 but this requires more work as the frequency needs to match
- 	 for all generated objects so we need to merge the frequency
- 	 of all instances.  For now just never set frequency for these.  */
-       || DECL_ONE_ONLY (current_function_decl))
-     return;
- 
-   /* If we are doing the partitioning optimization, let the optimization
-      choose the correct section into which to put things.  */
- 
-   if (flag_reorder_blocks_and_partition)
-     return;
- 
-   if (node->frequency == NODE_FREQUENCY_HOT)
-     DECL_SECTION_NAME (current_function_decl) =
-       build_string (strlen (HOT_TEXT_SECTION_NAME), HOT_TEXT_SECTION_NAME);
-   if (node->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
-     DECL_SECTION_NAME (current_function_decl) =
-       build_string (strlen (UNLIKELY_EXECUTED_TEXT_SECTION_NAME),
- 		    UNLIKELY_EXECUTED_TEXT_SECTION_NAME);
- }
- 
  static bool
  gate_estimate_probability (void)
  {
--- 2229,2234 ----
Index: function.h
===================================================================
*** function.h	(revision 166985)
--- function.h	(working copy)
*************** struct GTY(()) function_subsections {
*** 226,236 ****
    const char *cold_section_label;
    const char *hot_section_end_label;
    const char *cold_section_end_label;
- 
-   /* String to be used for name of cold text sections, via
-      targetm.asm_out.named_section.  */
- 
-   const char *unlikely_text_section_name;
  };
  
  /* Describe an empty area of space in the stack frame.  These can be chained
--- 226,231 ----
Index: coretypes.h
===================================================================
*** coretypes.h	(revision 166985)
--- coretypes.h	(working copy)
*************** enum unwind_info_type
*** 121,126 ****
--- 121,142 ----
    UI_TARGET
  };
  
+ /* Callgraph node profile representation.  */
+ enum node_frequency {
+   /* This function most likely won't be executed at all.
+      (set only when profile feedback is available or via function attribute). */
+   NODE_FREQUENCY_UNLIKELY_EXECUTED,
+   /* For functions that are known to be executed once (i.e. constructors, destructors
+      and main function).  */
+   NODE_FREQUENCY_EXECUTED_ONCE,
+   /* The default value.  */
+   NODE_FREQUENCY_NORMAL,
+   /* Optimize this function hard
+      (set only when profile feedback is available or via function attribute). */
+   NODE_FREQUENCY_HOT
+ };
+ 
+ 
  struct edge_def;
  typedef struct edge_def *edge;
  typedef const struct edge_def *const_edge;
Index: varasm.c
===================================================================
*** varasm.c	(revision 166985)
--- varasm.c	(working copy)
*************** create_block_symbol (const char *label, 
*** 368,436 ****
    return symbol;
  }
  
- static void
- initialize_cold_section_name (void)
- {
-   const char *stripped_name;
-   char *name, *buffer;
-   tree dsn;
- 
-   gcc_assert (cfun && current_function_decl);
-   if (crtl->subsections.unlikely_text_section_name)
-     return;
- 
-   dsn = DECL_SECTION_NAME (current_function_decl);
-   if (flag_function_sections && dsn)
-     {
-       name = (char *) alloca (TREE_STRING_LENGTH (dsn) + 1);
-       memcpy (name, TREE_STRING_POINTER (dsn), TREE_STRING_LENGTH (dsn) + 1);
- 
-       stripped_name = targetm.strip_name_encoding (name);
- 
-       buffer = ACONCAT ((stripped_name, "_unlikely", NULL));
-       crtl->subsections.unlikely_text_section_name = ggc_strdup (buffer);
-     }
-   else
-     crtl->subsections.unlikely_text_section_name =  UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
- }
- 
- /* Tell assembler to switch to unlikely-to-be-executed text section.  */
- 
- section *
- unlikely_text_section (void)
- {
-   if (cfun)
-     {
-       if (!crtl->subsections.unlikely_text_section_name)
- 	initialize_cold_section_name ();
- 
-       return get_named_section (NULL, crtl->subsections.unlikely_text_section_name, 0);
-     }
-   else
-     return get_named_section (NULL, UNLIKELY_EXECUTED_TEXT_SECTION_NAME, 0);
- }
- 
- /* When called within a function context, return true if the function
-    has been assigned a cold text section and if SECT is that section.
-    When called outside a function context, return true if SECT is the
-    default cold section.  */
- 
- bool
- unlikely_text_section_p (section *sect)
- {
-   const char *name;
- 
-   if (cfun)
-     name = crtl->subsections.unlikely_text_section_name;
-   else
-     name = UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
- 
-   return (name
- 	  && sect
- 	  && SECTION_STYLE (sect) == SECTION_NAMED
- 	  && strcmp (name, sect->named.name) == 0);
- }
- 
  /* Return a section with a particular name and with whatever SECTION_*
     flags section_type_flags deems appropriate.  The name of the section
     is taken from NAME if nonnull, otherwise it is taken from DECL's
--- 368,373 ----
*************** resolve_unique_section (tree decl, int r
*** 462,468 ****
        && targetm.have_named_sections
        && (flag_function_or_data_sections
  	  || DECL_ONE_ONLY (decl)))
!     targetm.asm_out.unique_section (decl, reloc);
  }
  
  #ifdef BSS_SECTION_ASM_OP
--- 399,408 ----
        && targetm.have_named_sections
        && (flag_function_or_data_sections
  	  || DECL_ONE_ONLY (decl)))
!     {
!       targetm.asm_out.unique_section (decl, reloc);
!       DECL_HAS_IMPLICIT_SECTION_NAME_P (decl) = true;
!     }
  }
  
  #ifdef BSS_SECTION_ASM_OP
*************** hot_function_section (tree decl)
*** 539,587 ****
  }
  #endif
  
  /* Return the section for function DECL.
  
     If DECL is NULL_TREE, return the text section.  We can be passed
!    NULL_TREE under some circumstances by dbxout.c at least.  */
  
! section *
! function_section (tree decl)
  {
!   int reloc = 0;
  
!   if (first_function_block_is_cold)
!     reloc = 1;
  
  #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
    if (decl != NULL_TREE
        && DECL_SECTION_NAME (decl) != NULL_TREE)
!     return reloc ? unlikely_text_section ()
! 		 : get_named_section (decl, NULL, 0);
    else
!     return targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
  #else
!   return reloc ? unlikely_text_section () : hot_function_section (decl);
  #endif
  }
  
  section *
  current_function_section (void)
  {
! #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
!   if (current_function_decl != NULL_TREE
!       && DECL_SECTION_NAME (current_function_decl) != NULL_TREE)
!     return in_cold_section_p ? unlikely_text_section ()
! 			     : get_named_section (current_function_decl,
! 						  NULL, 0);
!   else
!     return targetm.asm_out.select_section (current_function_decl,
! 					   in_cold_section_p,
! 					   DECL_ALIGN (current_function_decl));
! #else
!   return (in_cold_section_p
! 	  ? unlikely_text_section ()
! 	  : hot_function_section (current_function_decl));
! #endif
  }
  
  /* Return the read-only data section associated with function DECL.  */
--- 479,654 ----
  }
  #endif
  
+ /* Return section for TEXT_SECTION_NAME if DECL or DECL_SECTION_NAME (DECL)
+    is NULL.
+ 
+    When DECL_SECTION_NAME is non-NULL and NAMED_SECTION_SUFFIX is non-NULL,
+    produce a section named DECL_SECTION_NAME followed by NAMED_SECTION_SUFFIX.
+    Otherwise, if the section name is implicit, produce
+    "TEXT_SECTION_NAME.IMPLICIT_NAME"; if it was set by the user, return NULL.  */
+ 
+ section *
+ get_named_text_section (tree decl,
+ 		        const char *text_section_name,
+ 		        const char *named_section_suffix)
+ {
+   if (decl && DECL_SECTION_NAME (decl))
+     {
+       if (named_section_suffix)
+ 	{
+ 	  tree dsn = DECL_SECTION_NAME (decl);
+ 	  const char *stripped_name;
+ 	  char *name, *buffer;
+ 
+ 	  name = (char *) alloca (TREE_STRING_LENGTH (dsn) + 1);
+ 	  memcpy (name, TREE_STRING_POINTER (dsn),
+ 		  TREE_STRING_LENGTH (dsn) + 1);
+ 
+ 	  stripped_name = targetm.strip_name_encoding (name);
+ 
+ 	  buffer = ACONCAT ((stripped_name, named_section_suffix, NULL));
+ 	  return get_named_section (decl, buffer, 0);
+ 	}
+       else if (DECL_HAS_IMPLICIT_SECTION_NAME_P (decl))
+ 	{
+ 	  const char *name;
+ 
+ 	  /* Do not try to split gnu_linkonce functions.  This gets somewhat
+ 	     slippery.  */
+ 	  if (DECL_ONE_ONLY (decl) && !HAVE_COMDAT_GROUP)
+ 	    return NULL;
+ 	  name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
+ 	  name = targetm.strip_name_encoding (name);
+ 	  return get_named_section (decl, ACONCAT ((text_section_name, ".",
+ 				                   name, NULL)), 0);
+ 	}
+       else
+ 	return NULL;
+     }
+   return get_named_section (decl, text_section_name, 0);
+ }
+ 
+ /* Choose named function section based on its frequency.  */
+ 
+ section *
+ default_function_section (tree decl, enum node_frequency freq,
+ 			  bool startup, bool exit)
+ {
+   /* Startup code should go to startup subsection unless it is
+      unlikely executed (this happens especially with function splitting
+      where we can split away unnecessary parts of static constructors).  */
+   if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
+     return get_named_text_section (decl, ".text.startup", NULL);
+ 
+   /* Similarly for exit.  */
+   if (exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
+     return get_named_text_section (decl, ".text.exit", NULL);
+ 
+   /* Group cold functions together, similarly for hot code.  */
+   switch (freq)
+     {
+       case NODE_FREQUENCY_UNLIKELY_EXECUTED:
+ 	return get_named_text_section (decl, ".text.unlikely", NULL);
+       case NODE_FREQUENCY_HOT:
+ 	return get_named_text_section (decl, ".text.hot", NULL);
+       default:
+ 	return NULL;
+     }
+ }
+ 
  /* Return the section for function DECL.
  
     If DECL is NULL_TREE, return the text section.  We can be passed
!    NULL_TREE under some circumstances by dbxout.c at least. 
  
!    If FORCE_COLD is true, return cold function section ignoring
!    the frequency info of cgraph_node.  */
! 
! static section *
! function_section_1 (tree decl, bool force_cold)
  {
!   section *section = NULL;
!   enum node_frequency freq = NODE_FREQUENCY_NORMAL;
!   bool startup = false, exit = false;
  
!   if (decl)
!     {
!       struct cgraph_node *node = cgraph_node (decl);
! 
!       freq = node->frequency;
!       startup = node->only_called_at_startup;
!       exit = node->only_called_at_exit;
!     }
!   if (force_cold)
!     freq = NODE_FREQUENCY_UNLIKELY_EXECUTED;
  
  #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
    if (decl != NULL_TREE
        && DECL_SECTION_NAME (decl) != NULL_TREE)
!     {
!       if (targetm.asm_out.function_section)
! 	section = targetm.asm_out.function_section (decl, freq,
! 						    startup, exit);
!       if (section)
! 	return section;
!       return get_named_section (decl, NULL, 0);
!     }
    else
!     return targetm.asm_out.select_section
! 	    (decl, freq == NODE_FREQUENCY_UNLIKELY_EXECUTED,
! 	     DECL_ALIGN (decl));
  #else
!   if (targetm.asm_out.function_section)
!     section = targetm.asm_out.function_section (decl, freq, startup, exit);
!   if (section)
!     return section;
!   return hot_function_section (decl);
  #endif
  }
  
+ /* Return the section for function DECL.
+ 
+    If DECL is NULL_TREE, return the text section.  We can be passed
+    NULL_TREE under some circumstances by dbxout.c at least.  */
+ 
+ section *
+ function_section (tree decl)
+ {
+   /* Handle cases where function splitting code decides
+      to put function entry point into unlikely executed section
+      despite the fact that the function itself is not cold
+      (i.e. it is called rarely but contains a hot loop that is
+      better to live in hot subsection for the code locality).  */
+   return function_section_1 (decl,
+ 			     first_function_block_is_cold);
+ }
+ 
+ /* Return the section for the current function, take IN_COLD_SECTION_P
+    into account.  */
+ 
  section *
  current_function_section (void)
  {
!   return function_section_1 (current_function_decl, in_cold_section_p);
! }
! 
! /* Tell assembler to switch to unlikely-to-be-executed text section.  */
! 
! section *
! unlikely_text_section (void)
! {
!   return function_section_1 (current_function_decl, true);
! }
! 
! /* When called within a function context, return true if the function
!    has been assigned a cold text section and if SECT is that section.
!    When called outside a function context, return true if SECT is the
!    default cold section.  */
! 
! bool
! unlikely_text_section_p (section *sect)
! {
!   return sect == function_section_1 (current_function_decl, true);
  }
  
  /* Return the read-only data section associated with function DECL.  */
*************** assemble_start_function (tree decl, cons
*** 1454,1461 ****
    char tmp_label[100];
    bool hot_label_written = false;
  
-   crtl->subsections.unlikely_text_section_name = NULL;
- 
    first_function_block_is_cold = false;
    if (flag_reorder_blocks_and_partition)
      {
--- 1521,1526 ----
*************** assemble_start_function (tree decl, cons
*** 1513,1528 ****
    else if (DECL_SECTION_NAME (decl))
      {
        /* Calls to function_section rely on first_function_block_is_cold
! 	 being accurate.  The first block may be cold even if we aren't
! 	 doing partitioning, if the entire function was decided by
! 	 choose_function_section (predict.c) to be cold.  */
! 
!       initialize_cold_section_name ();
! 
!       if (crtl->subsections.unlikely_text_section_name
! 	  && strcmp (TREE_STRING_POINTER (DECL_SECTION_NAME (decl)),
! 		     crtl->subsections.unlikely_text_section_name) == 0)
! 	first_function_block_is_cold = true;
      }
  
    in_cold_section_p = first_function_block_is_cold;
--- 1578,1587 ----
    else if (DECL_SECTION_NAME (decl))
      {
        /* Calls to function_section rely on first_function_block_is_cold
! 	 being accurate.  */
!       first_function_block_is_cold
! 	 = (cgraph_node (current_function_decl)->frequency
! 	    == NODE_FREQUENCY_UNLIKELY_EXECUTED);
      }
  
    in_cold_section_p = first_function_block_is_cold;
*************** default_section_type_flags (tree decl, c
*** 5871,5885 ****
      flags = SECTION_CODE;
    else if (decl && decl_readonly_section (decl, reloc))
      flags = 0;
-   else if (current_function_decl
- 	   && cfun
- 	   && crtl->subsections.unlikely_text_section_name
- 	   && strcmp (name, crtl->subsections.unlikely_text_section_name) == 0)
-     flags = SECTION_CODE;
-   else if (!decl
- 	   && (!current_function_decl || !cfun)
- 	   && strcmp (name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
-     flags = SECTION_CODE;
    else
      flags = SECTION_WRITE;
  
--- 5930,5935 ----
*************** switch_to_section (section *new_section)
*** 6810,6821 ****
    switch (SECTION_STYLE (new_section))
      {
      case SECTION_NAMED:
-       if (cfun
- 	  && !crtl->subsections.unlikely_text_section_name
- 	  && strcmp (new_section->named.name,
- 		     UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
- 	crtl->subsections.unlikely_text_section_name = UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
- 
        targetm.asm_out.named_section (new_section->named.name,
  				     new_section->named.common.flags,
  				     new_section->named.decl);
--- 6860,6865 ----
Index: output.h
===================================================================
*** output.h	(revision 166985)
--- output.h	(working copy)
*************** extern bool default_assemble_integer (rt
*** 269,274 ****
--- 269,278 ----
     be outputable. */
  extern bool assemble_integer (rtx, unsigned, unsigned, int);
  
+ /* Return section for TEXT_SECTION_NAME if DECL or DECL_SECTION_NAME (DECL)
+    is NULL.  */
+ extern section *get_named_text_section (tree, const char *, const char *);
+ 
  /* An interface to assemble_integer for the common case in which a value is
     fully aligned and must be printed.  VALUE is the value of the integer
     object and SIZE is the number of bytes it contains.  */
Index: config/i386/winnt.c
===================================================================
*** config/i386/winnt.c	(revision 166985)
--- config/i386/winnt.c	(working copy)
*************** i386_pe_section_type_flags (tree decl, c
*** 414,428 ****
      flags = SECTION_CODE;
    else if (decl && decl_readonly_section (decl, reloc))
      flags = 0;
-   else if (current_function_decl
- 	   && cfun
- 	   && crtl->subsections.unlikely_text_section_name
- 	   && strcmp (name, crtl->subsections.unlikely_text_section_name) == 0)
-     flags = SECTION_CODE;
-   else if (!decl
- 	   && (!current_function_decl || !cfun)
- 	   && strcmp (name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
-     flags = SECTION_CODE;
    else
      {
        flags = SECTION_WRITE;
--- 414,419 ----
Index: config/darwin-protos.h
===================================================================
*** config/darwin-protos.h	(revision 166985)
--- config/darwin-protos.h	(working copy)
*************** extern section *machopic_select_section 
*** 57,62 ****
--- 57,63 ----
  extern section *machopic_select_rtx_section (enum machine_mode, rtx,
  					     unsigned HOST_WIDE_INT);
  
+ extern section *darwin_function_section (tree, enum node_frequency, bool, bool); 
  extern void darwin_unique_section (tree decl, int reloc);
  extern void darwin_asm_named_section (const char *, unsigned int, tree);
  extern void darwin_non_lazy_pcrel (FILE *, rtx);
Index: config/microblaze/microblaze.h
===================================================================
*** config/microblaze/microblaze.h	(revision 166985)
--- config/microblaze/microblaze.h	(working copy)
*************** do {									 \
*** 880,889 ****
  #define SBSS_SECTION_ASM_OP     "\t.sbss"	/* Small RW uninitialized data */
  #define SBSS2_SECTION_ASM_OP    "\t.sbss2"	/* Small RO uninitialized data */
  
- #define HOT_TEXT_SECTION_NAME   ".text.hot"
- #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
-                                 ".text.unlikely"
- 
  /* We do this to save a few 10s of code space that would be taken up
     by the call_FUNC () wrappers, used by the generic CRT_CALL_STATIC_FUNCTION
     definition in crtstuff.c.  */
--- 880,885 ----
Index: config/ia64/ia64.c
===================================================================
*** config/ia64/ia64.c	(revision 166985)
--- config/ia64/ia64.c	(working copy)
*************** ia64_get_reg_raw_mode (int regno)
*** 11021,11024 ****
--- 11021,11035 ----
    return default_get_reg_raw_mode(regno);
  }
  
+ /* Always default to .text section until HP-UX linker is fixed.  */
+ 
+ section *
+ ia64_hpux_function_section (tree decl ATTRIBUTE_UNUSED,
+ 			    enum node_frequency freq ATTRIBUTE_UNUSED,
+ 			    bool startup ATTRIBUTE_UNUSED,
+ 			    bool exit ATTRIBUTE_UNUSED)
+ {
+   return NULL;
+ }
+ 
  #include "gt-ia64.h"
Index: config/ia64/hpux.h
===================================================================
*** config/ia64/hpux.h	(revision 166985)
--- config/ia64/hpux.h	(working copy)
*************** do {								\
*** 221,228 ****
     it is fixed, prevent code from being put into .text.unlikely or
     .text.hot.  */
  
! #undef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
! #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text"
! 
! #undef HOT_TEXT_SECTION_NAME
! #define HOT_TEXT_SECTION_NAME ".text"
--- 221,224 ----
     it is fixed, prevent code from being put into .text.unlikely or
     .text.hot.  */
  
! #define TARGET_ASM_FUNCTION_SECTION ia64_hpux_function_section
Index: config/ia64/ia64-protos.h
===================================================================
*** config/ia64/ia64-protos.h	(revision 166985)
--- config/ia64/ia64-protos.h	(working copy)
*************** extern void ia64_init_expanders (void);
*** 97,99 ****
--- 97,100 ----
  
  extern rtx ia64_dconst_0_5 (void);
  extern rtx ia64_dconst_0_375 (void);
+ extern section *ia64_hpux_function_section (tree, enum node_frequency, bool, bool);
Index: config/darwin.c
===================================================================
*** config/darwin.c	(revision 166985)
--- config/darwin.c	(working copy)
*************** machopic_select_section (tree decl,
*** 1233,1244 ****
  	       && DECL_WEAK (decl)
  	       && !lookup_attribute ("weak_import",
  				     DECL_ATTRIBUTES (decl)));
!   section *base_section;
  
    switch (categorize_decl_for_section (decl, reloc))
      {
      case SECCAT_TEXT:
!       base_section = darwin_text_section (reloc, weak);
        break;
  
      case SECCAT_RODATA:
--- 1233,1254 ----
  	       && DECL_WEAK (decl)
  	       && !lookup_attribute ("weak_import",
  				     DECL_ATTRIBUTES (decl)));
!   section *base_section = NULL;
  
    switch (categorize_decl_for_section (decl, reloc))
      {
      case SECCAT_TEXT:
!       {
! 	struct cgraph_node *node;
! 	if (decl && TREE_CODE (decl) == FUNCTION_DECL
! 	    && (node = cgraph_get_node (decl)) != NULL)
! 	  base_section = darwin_function_section (decl,
! 						  node->frequency,
! 						  node->only_called_at_startup,
! 						  node->only_called_at_exit);
! 	if (!base_section)
!           base_section = darwin_text_section (reloc, weak);
!       }
        break;
  
      case SECCAT_RODATA:
*************** darwin_enter_string_into_cfstring_table 
*** 2362,2365 ****
--- 2372,2409 ----
      }
  }
  
+ /* Choose named function section based on its frequency.  */
+ 
+ section *
+ darwin_function_section (tree decl, enum node_frequency freq,
+ 			  bool startup, bool exit)
+ {
+   /* Startup code should go to startup subsection unless it is
+      unlikely executed (this happens especially with function splitting
+      where we can split away unnecessary parts of static constructors).  */
+   if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
+     return get_named_text_section
+ 	     (decl, "__TEXT,__startup,regular,pure_instructions", "_startup");
+ 
+   /* Similarly for exit.  */
+   if (exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
+     return get_named_text_section (decl,
+ 				   "__TEXT,__exit,regular,pure_instructions",
+ 				   "_exit");
+ 
+   /* Group cold functions together, similarly for hot code.  */
+   switch (freq)
+     {
+       case NODE_FREQUENCY_UNLIKELY_EXECUTED:
+ 	return get_named_text_section
+ 		 (decl,
+ 	          "__TEXT,__unlikely,regular,pure_instructions", "_unlikely");
+       case NODE_FREQUENCY_HOT:
+ 	return get_named_text_section
+ 		 (decl, "__TEXT,__hot,regular,pure_instructions", "_hot");
+       default:
+ 	return NULL;
+     }
+ }
+ 
  #include "gt-darwin.h"
Index: config/darwin.h
===================================================================
*** config/darwin.h	(revision 166985)
--- config/darwin.h	(working copy)
*************** int darwin_label_is_anonymous_local_objc
*** 658,668 ****
  /* The generic version, archs should over-ride where required.  */
  #define MACHOPIC_NL_SYMBOL_PTR_SECTION ".non_lazy_symbol_pointer"
  
- /* These are used by -fbranch-probabilities */
- #define HOT_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions"
- #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
-                               "__TEXT,__unlikely,regular,pure_instructions"
- 
  /* Declare the section variables.  */
  #ifndef USED_FOR_TARGET
  enum darwin_section_enum {
--- 658,663 ----
*************** extern GTY(()) section * darwin_sections
*** 677,682 ****
--- 672,679 ----
  #undef	TARGET_ASM_SELECT_SECTION
  #define TARGET_ASM_SELECT_SECTION machopic_select_section
  #define USE_SELECT_SECTION_FOR_FUNCTIONS
+ #undef	TARGET_ASM_FUNCTION_SECTION
+ #define TARGET_ASM_FUNCTION_SECTION darwin_function_section
  
  #undef	TARGET_ASM_SELECT_RTX_SECTION
  #define TARGET_ASM_SELECT_RTX_SECTION machopic_select_rtx_section

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-21  3:31           ` Jan Hubicka
@ 2010-11-21 18:49             ` Jack Howarth
  2010-11-22 18:33             ` Richard Henderson
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 70+ messages in thread
From: Jack Howarth @ 2010-11-21 18:49 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Richard Henderson, Joseph S. Myers, gcc-patches, rguenther, iant,
	iains, tglek

On Sun, Nov 21, 2010 at 02:12:43AM +0100, Jan Hubicka wrote:
> 
> Hi,
> this is the updated patch.  I have addressed, I hope, all the comments except
> for removing USE_SELECT_SECTION_FOR_FUNCTIONS.  I am somewhat confused by
> darwin and thus would like to handle it incrementally.
> 
> I've removed the subsection hash; the string concatenations and section hash
> should be fast enough given that we switch at most twice per function.  I also
> added a flag to tree_decl_with_vis specifying whether the section name is
> implied by -ffunction-sections or given by the user, and made
> default_function_section use text.unlikely.function_name by default.  I kept
> adding the _unlikely suffix only for darwin.  It was developed there and I
> suppose the funny suffixes do have meaning for the darwin linker.
> 
> Bootstrapped/regtested x86_64-linux, OK (after testing darwin and ia64)?

Honza,
   No problems after bootstrapping and regression testing on x86_64-apple-darwin10.

http://gcc.gnu.org/ml/gcc-testresults/2010-11/msg01764.html

Of course I am still having to hack around PR46510.
              Jack
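
For readers following the thread, the section choice described in the quoted
text can be sketched as stand-alone C.  This is only an illustration under
stated assumptions, not the code from the patch: the helpers
sketch_section_prefix and sketch_section_name are hypothetical, a plain
function-name string stands in for the DECL with an implicit
-ffunction-sections name, and the ELF names use the leading-dot ".text.hot" /
".text.unlikely" spelling of the defaults that the patch removes from
defaults.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Mirrors enum node_frequency as moved to coretypes.h by the patch.  */
enum node_frequency {
  NODE_FREQUENCY_UNLIKELY_EXECUTED,
  NODE_FREQUENCY_EXECUTED_ONCE,
  NODE_FREQUENCY_NORMAL,
  NODE_FREQUENCY_HOT
};

/* Hypothetical helper: pick the text subsection prefix, or NULL when the
   function should stay in the default text section.  */
const char *
sketch_section_prefix (enum node_frequency freq, bool startup, bool at_exit)
{
  /* Startup-only and exit-only code is grouped unless it is cold.  */
  if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
    return ".text.startup";
  if (at_exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
    return ".text.exit";

  switch (freq)
    {
    case NODE_FREQUENCY_UNLIKELY_EXECUTED:
      return ".text.unlikely";
    case NODE_FREQUENCY_HOT:
      return ".text.hot";
    default:
      return NULL;
    }
}

/* Hypothetical helper: write the section name into BUF.  A non-NULL
   FN_NAME models an implicit per-function section, giving
   "PREFIX.FN_NAME".  Returns false when the default text section applies
   or the name does not fit in BUF.  */
bool
sketch_section_name (char *buf, size_t size, enum node_frequency freq,
		     bool startup, bool at_exit, const char *fn_name)
{
  const char *prefix = sketch_section_prefix (freq, startup, at_exit);
  size_t needed;

  assert (buf != NULL && size > 0);
  if (prefix == NULL)
    return false;

  needed = strlen (prefix) + (fn_name ? strlen (fn_name) + 1 : 0);
  if (needed >= size)
    return false;

  if (fn_name)
    snprintf (buf, size, "%s.%s", prefix, fn_name);
  else
    snprintf (buf, size, "%s", prefix);
  return true;
}
```

The cold check comes first in both the startup and exit cases, so a cold
piece split away from a static constructor lands in the unlikely group
rather than in the startup group.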

> 
> Honza
> 
> 	* tree.h (DECL_HAS_IMPLICIT_SECTION_NAME_P): New macro.
> 	(tree_decl_with_vis): Add implicit_section_name_p.
> 	* targhooks.h (default_function_section): Declare.
> 	* target.def (function_section): New hook.
> 	* defaults.h (HOT_TEXT_SECTION_NAME,
> 	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
> 	* predict.c (choose_function_section): Remove.
> 	(estimate_bb_frequencies): Do not use choose_function_section.
> 	* coretypes.h (enum node_frequency): Move here from cgraph.h.
> 	* cgraph.h (enum node_frequency): Remove.
> 	* varasm.c (initialize_cold_section_name, unlikely_text_section,
> 	unlikely_text_section_p): Remove.
> 	(named_subsection_entry): New structure.
> 	(get_named_text_section): New function.
> 	(default_function_section): New function.
> 	(function_section_1): Break out from ...; handle profile info.
> 	(function_section): ... here.
> 	(unlikely_text_section): Remove.
> 	(unlikely_text_section_p): Use function_section_1.
> 	(assemble_start_function): Do not initialize cold section.
> 	(default_section_type_flags): Do not special case cold subsection.
> 	(switch_to_section): Likewise.
> 	* output.h (get_named_text_section): Declare.
> 	* config/i386/winnt.c: Do not special case cold section.
> 	* config/darwin-protos.h (darwin_function_section): Declare.
> 	* config/microblaze/microblaze.h (HOT_TEXT_SECTION_NAME,
> 	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
> 	* config/ia64/hpux.h (HOT_TEXT_SECTION_NAME,
> 	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
> 	(TARGET_ASM_FUNCTION_SECTION): Define to ia64_hpux_function_section.
> 	* config/ia64/ia64.c (ia64_hpux_function_section): New function.
> 	* config/ia64/ia64-protos.h (ia64_hpux_function_section): Declare.
> 	* config/darwin.c (machopic_select_section): Use
> 	darwin_function_section.
> 	(darwin_function_section): New function.
> 	* config/darwin.h (HOT_TEXT_SECTION_NAME,
> 	UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
> 	(TARGET_ASM_FUNCTION_SECTION): Define.
> Index: doc/tm.texi
> ===================================================================
> *** doc/tm.texi	(revision 166985)
> --- doc/tm.texi	(working copy)
> *************** is non-NULL, it is the @code{VAR_DECL} o
> *** 7330,7335 ****
> --- 7330,7345 ----
>   this section is associated.
>   @end deftypefn
>   
> + @deftypefn {Target Hook} {section *} TARGET_ASM_FUNCTION_SECTION (tree @var{decl}, enum node_frequency @var{freq}, bool @var{startup}, bool @var{exit})
> + Return the preferred text (sub)section for function @var{decl}.
> + The main purpose of this hook is to separate cold, normal and hot
> + functions.  @var{startup} is true when the function is known to be used only
> + at startup (from static constructors or because it is @code{main()}).
> + @var{exit} is true when the function is known to be used only at exit
> + (from static destructors).
> + Return NULL if the function should go to the default text section.
> + @end deftypefn
> + 
>   @deftypevr {Target Hook} bool TARGET_HAVE_NAMED_SECTIONS
>   This flag is true if the target supports @code{TARGET_ASM_NAMED_SECTION}.
>   @end deftypevr
> Index: doc/tm.texi.in
> ===================================================================
> *** doc/tm.texi.in	(revision 166985)
> --- doc/tm.texi.in	(working copy)
> *************** is non-NULL, it is the @code{VAR_DECL} o
> *** 7305,7310 ****
> --- 7305,7320 ----
>   this section is associated.
>   @end deftypefn
>   
> + @hook TARGET_ASM_FUNCTION_SECTION
> + Return the preferred text (sub)section for function @var{decl}.
> + The main purpose of this hook is to separate cold, normal and hot
> + functions.  @var{startup} is true when the function is known to be used only
> + at startup (from static constructors or because it is @code{main()}).
> + @var{exit} is true when the function is known to be used only at exit
> + (from static destructors).
> + Return NULL if the function should go to the default text section.
> + @end deftypefn
> + 
>   @hook TARGET_HAVE_NAMED_SECTIONS
>   This flag is true if the target supports @code{TARGET_ASM_NAMED_SECTION}.
>   @end deftypevr
> Index: targhooks.h
> ===================================================================
> *** targhooks.h	(revision 166985)
> --- targhooks.h	(working copy)
> *************** extern int default_label_align_after_bar
> *** 171,177 ****
>   extern int default_loop_align_max_skip (rtx);
>   extern int default_label_align_max_skip (rtx);
>   extern int default_jump_align_max_skip (rtx);
> ! 
>   extern enum machine_mode default_get_reg_raw_mode(int);
>   
>   extern const struct default_options empty_optimization_table[];
> --- 171,178 ----
>   extern int default_loop_align_max_skip (rtx);
>   extern int default_label_align_max_skip (rtx);
>   extern int default_jump_align_max_skip (rtx);
> ! extern section * default_function_section(tree decl, enum node_frequency freq,
> ! 					  bool startup, bool exit);
>   extern enum machine_mode default_get_reg_raw_mode(int);
>   
>   extern const struct default_options empty_optimization_table[];
> Index: target.def
> ===================================================================
> *** target.def	(revision 166985)
> --- target.def	(working copy)
> *************** DEFHOOK
> *** 281,286 ****
> --- 281,299 ----
>    void, (const char *name, unsigned int flags, tree decl),
>    default_no_named_section)
>   
> + /* Return the preferred text (sub)section for function DECL.
> +    The main purpose of this hook is to separate cold, normal and hot
> +    functions.  STARTUP is true when the function is known to be used only
> +    at startup (from static constructors or because it is main()).
> +    EXIT is true when the function is known to be used only at exit
> +    (from static destructors).
> +    Return NULL if the function should go to the default text section.  */
> + DEFHOOK
> + (function_section,
> +  "",
> +  section *, (tree decl, enum node_frequency freq, bool startup, bool exit),
> +  default_function_section)
> + 
>   /* Return a mask describing how relocations should be treated when
>      selecting sections.  Bit 1 should be set if global relocations
>      should be placed in a read-write section; bit 0 should be set if
> Index: cgraph.h
> ===================================================================
> *** cgraph.h	(revision 166985)
> --- cgraph.h	(working copy)
> *************** struct GTY(()) cgraph_clone_info
> *** 180,199 ****
>     bitmap combined_args_to_skip;
>   };
>   
> - enum node_frequency {
> -   /* This function most likely won't be executed at all.
> -      (set only when profile feedback is available or via function attribute). */
> -   NODE_FREQUENCY_UNLIKELY_EXECUTED,
> -   /* For functions that are known to be executed once (i.e. constructors, destructors
> -      and main function.  */
> -   NODE_FREQUENCY_EXECUTED_ONCE,
> -   /* The default value.  */
> -   NODE_FREQUENCY_NORMAL,
> -   /* Optimize this function hard
> -      (set only when profile feedback is available or via function attribute). */
> -   NODE_FREQUENCY_HOT
> - };
> - 
>   
>   /* The cgraph data structure.
>      Each function decl has assigned cgraph_node listing callees and callers.  */
> --- 180,185 ----
> Index: defaults.h
> ===================================================================
> *** defaults.h	(revision 166985)
> --- defaults.h	(working copy)
> *************** see the files COPYING3 and COPYING.RUNTI
> *** 897,910 ****
>   #define TARGET_DEC_EVAL_METHOD 2
>   #endif
>   
> - #ifndef HOT_TEXT_SECTION_NAME
> - #define HOT_TEXT_SECTION_NAME ".text.hot"
> - #endif
> - 
> - #ifndef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
> - #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text.unlikely"
> - #endif
> - 
>   #ifndef HAS_LONG_COND_BRANCH
>   #define HAS_LONG_COND_BRANCH 0
>   #endif
> --- 897,902 ----
> Index: tree.h
> ===================================================================
> *** tree.h	(revision 166985)
> --- tree.h	(working copy)
> *************** struct GTY(()) tree_parm_decl {
> *** 3107,3112 ****
> --- 3107,3117 ----
>   #define DECL_HAS_INIT_PRIORITY_P(NODE) \
>     (VAR_DECL_CHECK (NODE)->decl_with_vis.init_priority_p)
>   
> + /* Specify whether the section name was set by user or by
> +    compiler via -ffunction-sections.  */
> + #define DECL_HAS_IMPLICIT_SECTION_NAME_P(NODE) \
> +   (DECL_WITH_VIS_CHECK (NODE)->decl_with_vis.implicit_section_name_p)
> + 
>   struct GTY(()) tree_decl_with_vis {
>    struct tree_decl_with_rtl common;
>    tree assembler_name;
> *************** struct GTY(()) tree_decl_with_vis {
> *** 3135,3141 ****
>    unsigned init_priority_p : 1;
>    /* Used by C++ only.  Might become a generic decl flag.  */
>    unsigned shadowed_for_var_p : 1;
> !  /* 14 unused bits. */
>   };
>   
>   extern tree decl_debug_expr_lookup (tree);
> --- 3140,3148 ----
>    unsigned init_priority_p : 1;
>    /* Used by C++ only.  Might become a generic decl flag.  */
>    unsigned shadowed_for_var_p : 1;
> !  /* When SECTION_NAME is implied by -ffunction-sections.  */
> !  unsigned implicit_section_name_p : 1;
> !  /* 13 unused bits. */
>   };
>   
>   extern tree decl_debug_expr_lookup (tree);
> Index: predict.c
> ===================================================================
> *** predict.c	(revision 166985)
> --- predict.c	(working copy)
> *************** static sreal real_zero, real_one, real_a
> *** 78,84 ****
>   static void combine_predictions_for_insn (rtx, basic_block);
>   static void dump_prediction (FILE *, enum br_predictor, int, basic_block, int);
>   static void predict_paths_leading_to (basic_block, enum br_predictor, enum prediction);
> - static void choose_function_section (void);
>   static bool can_predict_insn_p (const_rtx);
>   
>   /* Information we hold about each branch predictor.
> --- 78,83 ----
> *************** estimate_bb_frequencies (void)
> *** 2185,2192 ****
>         free_aux_for_edges ();
>       }
>     compute_function_frequency ();
> -   if (flag_reorder_functions)
> -     choose_function_section ();
>   }
>   
>   /* Decide whether function is hot, cold or unlikely executed.  */
> --- 2184,2189 ----
> *************** compute_function_frequency (void)
> *** 2232,2266 ****
>       }
>   }
>   
> - /* Choose appropriate section for the function.  */
> - static void
> - choose_function_section (void)
> - {
> -   struct cgraph_node *node = cgraph_node (current_function_decl);
> -   if (DECL_SECTION_NAME (current_function_decl)
> -       || !targetm.have_named_sections
> -       /* Theoretically we can split the gnu.linkonce text section too,
> - 	 but this requires more work as the frequency needs to match
> - 	 for all generated objects so we need to merge the frequency
> - 	 of all instances.  For now just never set frequency for these.  */
> -       || DECL_ONE_ONLY (current_function_decl))
> -     return;
> - 
> -   /* If we are doing the partitioning optimization, let the optimization
> -      choose the correct section into which to put things.  */
> - 
> -   if (flag_reorder_blocks_and_partition)
> -     return;
> - 
> -   if (node->frequency == NODE_FREQUENCY_HOT)
> -     DECL_SECTION_NAME (current_function_decl) =
> -       build_string (strlen (HOT_TEXT_SECTION_NAME), HOT_TEXT_SECTION_NAME);
> -   if (node->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED)
> -     DECL_SECTION_NAME (current_function_decl) =
> -       build_string (strlen (UNLIKELY_EXECUTED_TEXT_SECTION_NAME),
> - 		    UNLIKELY_EXECUTED_TEXT_SECTION_NAME);
> - }
> - 
>   static bool
>   gate_estimate_probability (void)
>   {
> --- 2229,2234 ----
> Index: function.h
> ===================================================================
> *** function.h	(revision 166985)
> --- function.h	(working copy)
> *************** struct GTY(()) function_subsections {
> *** 226,236 ****
>     const char *cold_section_label;
>     const char *hot_section_end_label;
>     const char *cold_section_end_label;
> - 
> -   /* String to be used for name of cold text sections, via
> -      targetm.asm_out.named_section.  */
> - 
> -   const char *unlikely_text_section_name;
>   };
>   
>   /* Describe an empty area of space in the stack frame.  These can be chained
> --- 226,231 ----
> Index: coretypes.h
> ===================================================================
> *** coretypes.h	(revision 166985)
> --- coretypes.h	(working copy)
> *************** enum unwind_info_type
> *** 121,126 ****
> --- 121,142 ----
>     UI_TARGET
>   };
>   
> + /* Callgraph node profile representation.  */
> + enum node_frequency {
> +   /* This function most likely won't be executed at all.
> +      (set only when profile feedback is available or via function attribute). */
> +   NODE_FREQUENCY_UNLIKELY_EXECUTED,
> +   /* For functions that are known to be executed once (i.e. constructors, destructors
> +      and main function.  */
> +   NODE_FREQUENCY_EXECUTED_ONCE,
> +   /* The default value.  */
> +   NODE_FREQUENCY_NORMAL,
> +   /* Optimize this function hard
> +      (set only when profile feedback is available or via function attribute). */
> +   NODE_FREQUENCY_HOT
> + };
> + 
> + 
>   struct edge_def;
>   typedef struct edge_def *edge;
>   typedef const struct edge_def *const_edge;
> Index: varasm.c
> ===================================================================
> *** varasm.c	(revision 166985)
> --- varasm.c	(working copy)
> *************** create_block_symbol (const char *label, 
> *** 368,436 ****
>     return symbol;
>   }
>   
> - static void
> - initialize_cold_section_name (void)
> - {
> -   const char *stripped_name;
> -   char *name, *buffer;
> -   tree dsn;
> - 
> -   gcc_assert (cfun && current_function_decl);
> -   if (crtl->subsections.unlikely_text_section_name)
> -     return;
> - 
> -   dsn = DECL_SECTION_NAME (current_function_decl);
> -   if (flag_function_sections && dsn)
> -     {
> -       name = (char *) alloca (TREE_STRING_LENGTH (dsn) + 1);
> -       memcpy (name, TREE_STRING_POINTER (dsn), TREE_STRING_LENGTH (dsn) + 1);
> - 
> -       stripped_name = targetm.strip_name_encoding (name);
> - 
> -       buffer = ACONCAT ((stripped_name, "_unlikely", NULL));
> -       crtl->subsections.unlikely_text_section_name = ggc_strdup (buffer);
> -     }
> -   else
> -     crtl->subsections.unlikely_text_section_name =  UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
> - }
> - 
> - /* Tell assembler to switch to unlikely-to-be-executed text section.  */
> - 
> - section *
> - unlikely_text_section (void)
> - {
> -   if (cfun)
> -     {
> -       if (!crtl->subsections.unlikely_text_section_name)
> - 	initialize_cold_section_name ();
> - 
> -       return get_named_section (NULL, crtl->subsections.unlikely_text_section_name, 0);
> -     }
> -   else
> -     return get_named_section (NULL, UNLIKELY_EXECUTED_TEXT_SECTION_NAME, 0);
> - }
> - 
> - /* When called within a function context, return true if the function
> -    has been assigned a cold text section and if SECT is that section.
> -    When called outside a function context, return true if SECT is the
> -    default cold section.  */
> - 
> - bool
> - unlikely_text_section_p (section *sect)
> - {
> -   const char *name;
> - 
> -   if (cfun)
> -     name = crtl->subsections.unlikely_text_section_name;
> -   else
> -     name = UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
> - 
> -   return (name
> - 	  && sect
> - 	  && SECTION_STYLE (sect) == SECTION_NAMED
> - 	  && strcmp (name, sect->named.name) == 0);
> - }
> - 
>   /* Return a section with a particular name and with whatever SECTION_*
>      flags section_type_flags deems appropriate.  The name of the section
>      is taken from NAME if nonnull, otherwise it is taken from DECL's
> --- 368,373 ----
> *************** resolve_unique_section (tree decl, int r
> *** 462,468 ****
>         && targetm.have_named_sections
>         && (flag_function_or_data_sections
>   	  || DECL_ONE_ONLY (decl)))
> !     targetm.asm_out.unique_section (decl, reloc);
>   }
>   
>   #ifdef BSS_SECTION_ASM_OP
> --- 399,408 ----
>         && targetm.have_named_sections
>         && (flag_function_or_data_sections
>   	  || DECL_ONE_ONLY (decl)))
> !     {
> !       targetm.asm_out.unique_section (decl, reloc);
> !       DECL_HAS_IMPLICIT_SECTION_NAME_P (decl) = true;
> !     }
>   }
>   
>   #ifdef BSS_SECTION_ASM_OP
> *************** hot_function_section (tree decl)
> *** 539,587 ****
>   }
>   #endif
>   
>   /* Return the section for function DECL.
>   
>      If DECL is NULL_TREE, return the text section.  We can be passed
> !    NULL_TREE under some circumstances by dbxout.c at least.  */
>   
> ! section *
> ! function_section (tree decl)
>   {
> !   int reloc = 0;
>   
> !   if (first_function_block_is_cold)
> !     reloc = 1;
>   
>   #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
>     if (decl != NULL_TREE
>         && DECL_SECTION_NAME (decl) != NULL_TREE)
> !     return reloc ? unlikely_text_section ()
> ! 		 : get_named_section (decl, NULL, 0);
>     else
> !     return targetm.asm_out.select_section (decl, reloc, DECL_ALIGN (decl));
>   #else
> !   return reloc ? unlikely_text_section () : hot_function_section (decl);
>   #endif
>   }
>   
>   section *
>   current_function_section (void)
>   {
> ! #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
> !   if (current_function_decl != NULL_TREE
> !       && DECL_SECTION_NAME (current_function_decl) != NULL_TREE)
> !     return in_cold_section_p ? unlikely_text_section ()
> ! 			     : get_named_section (current_function_decl,
> ! 						  NULL, 0);
> !   else
> !     return targetm.asm_out.select_section (current_function_decl,
> ! 					   in_cold_section_p,
> ! 					   DECL_ALIGN (current_function_decl));
> ! #else
> !   return (in_cold_section_p
> ! 	  ? unlikely_text_section ()
> ! 	  : hot_function_section (current_function_decl));
> ! #endif
>   }
>   
>   /* Return the read-only data section associated with function DECL.  */
> --- 479,654 ----
>   }
>   #endif
>   
> + /* Return section for TEXT_SECTION_NAME if DECL or DECL_SECTION_NAME (DECL)
> +    is NULL.
> + 
> +    When DECL_SECTION_NAME is non-NULL and it is an implicit section and
> +    NAMED_SECTION_SUFFIX is non-NULL, then produce a section whose name is
> +    the section name concatenated with NAMED_SECTION_SUFFIX.
> +    Otherwise produce "TEXT_SECTION_NAME.IMPLICIT_NAME".  */
> + 
> + section *
> + get_named_text_section (tree decl,
> + 		        const char *text_section_name,
> + 		        const char *named_section_suffix)
> + {
> +   if (decl && DECL_SECTION_NAME (decl))
> +     {
> +       if (named_section_suffix)
> + 	{
> + 	  tree dsn = DECL_SECTION_NAME (decl);
> + 	  const char *stripped_name;
> + 	  char *name, *buffer;
> + 
> + 	  name = (char *) alloca (TREE_STRING_LENGTH (dsn) + 1);
> + 	  memcpy (name, TREE_STRING_POINTER (dsn),
> + 		  TREE_STRING_LENGTH (dsn) + 1);
> + 
> + 	  stripped_name = targetm.strip_name_encoding (name);
> + 
> + 	  buffer = ACONCAT ((stripped_name, named_section_suffix, NULL));
> + 	  return get_named_section (decl, buffer, 0);
> + 	}
> +       else if (DECL_HAS_IMPLICIT_SECTION_NAME_P (decl))
> + 	{
> + 	  const char *name;
> + 
> + 	  /* Do not try to split gnu_linkonce functions.  This gets somewhat
> + 	     slippery.  */
> + 	  if (DECL_ONE_ONLY (decl) && !HAVE_COMDAT_GROUP)
> + 	    return NULL;
> + 	  name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
> + 	  name = targetm.strip_name_encoding (name);
> + 	  return get_named_section (decl, ACONCAT ((text_section_name, ".",
> + 				                   name, NULL)), 0);
> + 	}
> +       else
> + 	return NULL;
> +     }
> +   return get_named_section (decl, text_section_name, 0);
> + }
> + 
> + /* Choose named function section based on its frequency.  */
> + 
> + section *
> + default_function_section (tree decl, enum node_frequency freq,
> + 			  bool startup, bool exit)
> + {
> +   /* Startup code should go to startup subsection unless it is
> +      unlikely executed (this happens especially with function splitting
> +      where we can split away unnecessary parts of static constructors).  */
> +   if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
> +     return get_named_text_section (decl, ".text.startup", NULL);
> + 
> +   /* Similarly for exit.  */
> +   if (exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
> +     return get_named_text_section (decl, ".text.exit", NULL);
> + 
> +   /* Group cold functions together, similarly for hot code.  */
> +   switch (freq)
> +     {
> +       case NODE_FREQUENCY_UNLIKELY_EXECUTED:
> + 	return get_named_text_section (decl, "text.unlikely", NULL);
> +       case NODE_FREQUENCY_HOT:
> + 	return get_named_text_section (decl, "text.hot", NULL);
> +       default:
> + 	return NULL;
> +     }
> + }
> + 
>   /* Return the section for function DECL.
>   
>      If DECL is NULL_TREE, return the text section.  We can be passed
> !    NULL_TREE under some circumstances by dbxout.c at least. 
>   
> !    If FORCE_COLD is true, return cold function section ignoring
> !    the frequency info of cgraph_node.  */
> ! 
> ! static section *
> ! function_section_1 (tree decl, bool force_cold)
>   {
> !   section *section = NULL;
> !   enum node_frequency freq = NODE_FREQUENCY_NORMAL;
> !   bool startup = false, exit = false;
>   
> !   if (decl)
> !     {
> !       struct cgraph_node *node = cgraph_node (decl);
> ! 
> !       freq = node->frequency;
> !       startup = node->only_called_at_startup;
> !       exit = node->only_called_at_exit;
> !     }
> !   if (force_cold)
> !     freq = NODE_FREQUENCY_UNLIKELY_EXECUTED;
>   
>   #ifdef USE_SELECT_SECTION_FOR_FUNCTIONS
>     if (decl != NULL_TREE
>         && DECL_SECTION_NAME (decl) != NULL_TREE)
> !     {
> !       if (targetm.asm_out.function_section)
> ! 	section = targetm.asm_out.function_section (decl, freq,
> ! 						    startup, exit);
> !       if (section)
> ! 	return section;
> !       return get_named_section (decl, NULL, 0);
> !     }
>     else
> !     return targetm.asm_out.select_section
> ! 	    (decl, freq == NODE_FREQUENCY_UNLIKELY_EXECUTED,
> ! 	     DECL_ALIGN (decl));
>   #else
> !   if (targetm.asm_out.function_section)
> !     section = targetm.asm_out.function_section (decl, freq, startup, exit);
> !   if (section)
> !     return section;
> !   return hot_function_section (decl);
>   #endif
>   }
>   
> + /* Return the section for function DECL.
> + 
> +    If DECL is NULL_TREE, return the text section.  We can be passed
> +    NULL_TREE under some circumstances by dbxout.c at least.  */
> + 
> + section *
> + function_section (tree decl)
> + {
> +   /* Handle cases where function splitting code decides
> +      to put function entry point into unlikely executed section
> +      despite the fact that the function itself is not cold
> +      (i.e. it is called rarely but contains a hot loop that is
> +      better to live in hot subsection for the code locality).  */
> +   return function_section_1 (decl,
> + 			     first_function_block_is_cold);
> + }
> + 
> + /* Return the section for the current function, taking IN_COLD_SECTION_P
> +    into account.  */
> + 
>   section *
>   current_function_section (void)
>   {
> !   return function_section_1 (current_function_decl, in_cold_section_p);
> ! }
> ! 
> ! /* Tell assembler to switch to unlikely-to-be-executed text section.  */
> ! 
> ! section *
> ! unlikely_text_section (void)
> ! {
> !   return function_section_1 (current_function_decl, true);
> ! }
> ! 
> ! /* When called within a function context, return true if the function
> !    has been assigned a cold text section and if SECT is that section.
> !    When called outside a function context, return true if SECT is the
> !    default cold section.  */
> ! 
> ! bool
> ! unlikely_text_section_p (section *sect)
> ! {
> !   return sect == function_section_1 (current_function_decl, true);
>   }
>   
>   /* Return the read-only data section associated with function DECL.  */
> *************** assemble_start_function (tree decl, cons
> *** 1454,1461 ****
>     char tmp_label[100];
>     bool hot_label_written = false;
>   
> -   crtl->subsections.unlikely_text_section_name = NULL;
> - 
>     first_function_block_is_cold = false;
>     if (flag_reorder_blocks_and_partition)
>       {
> --- 1521,1526 ----
> *************** assemble_start_function (tree decl, cons
> *** 1513,1528 ****
>     else if (DECL_SECTION_NAME (decl))
>       {
>         /* Calls to function_section rely on first_function_block_is_cold
> ! 	 being accurate.  The first block may be cold even if we aren't
> ! 	 doing partitioning, if the entire function was decided by
> ! 	 choose_function_section (predict.c) to be cold.  */
> ! 
> !       initialize_cold_section_name ();
> ! 
> !       if (crtl->subsections.unlikely_text_section_name
> ! 	  && strcmp (TREE_STRING_POINTER (DECL_SECTION_NAME (decl)),
> ! 		     crtl->subsections.unlikely_text_section_name) == 0)
> ! 	first_function_block_is_cold = true;
>       }
>   
>     in_cold_section_p = first_function_block_is_cold;
> --- 1578,1587 ----
>     else if (DECL_SECTION_NAME (decl))
>       {
>         /* Calls to function_section rely on first_function_block_is_cold
> ! 	 being accurate.  */
> !       first_function_block_is_cold
> ! 	 = (cgraph_node (current_function_decl)->frequency
> ! 	    == NODE_FREQUENCY_UNLIKELY_EXECUTED);
>       }
>   
>     in_cold_section_p = first_function_block_is_cold;
> *************** default_section_type_flags (tree decl, c
> *** 5871,5885 ****
>       flags = SECTION_CODE;
>     else if (decl && decl_readonly_section (decl, reloc))
>       flags = 0;
> -   else if (current_function_decl
> - 	   && cfun
> - 	   && crtl->subsections.unlikely_text_section_name
> - 	   && strcmp (name, crtl->subsections.unlikely_text_section_name) == 0)
> -     flags = SECTION_CODE;
> -   else if (!decl
> - 	   && (!current_function_decl || !cfun)
> - 	   && strcmp (name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
> -     flags = SECTION_CODE;
>     else
>       flags = SECTION_WRITE;
>   
> --- 5930,5935 ----
> *************** switch_to_section (section *new_section)
> *** 6810,6821 ****
>     switch (SECTION_STYLE (new_section))
>       {
>       case SECTION_NAMED:
> -       if (cfun
> - 	  && !crtl->subsections.unlikely_text_section_name
> - 	  && strcmp (new_section->named.name,
> - 		     UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
> - 	crtl->subsections.unlikely_text_section_name = UNLIKELY_EXECUTED_TEXT_SECTION_NAME;
> - 
>         targetm.asm_out.named_section (new_section->named.name,
>   				     new_section->named.common.flags,
>   				     new_section->named.decl);
> --- 6860,6865 ----
> Index: output.h
> ===================================================================
> *** output.h	(revision 166985)
> --- output.h	(working copy)
> *************** extern bool default_assemble_integer (rt
> *** 269,274 ****
> --- 269,278 ----
>      be outputable. */
>   extern bool assemble_integer (rtx, unsigned, unsigned, int);
>   
> + /* Return section for TEXT_SECTION_NAME if DECL or DECL_SECTION_NAME (DECL)
> +    is NULL.  */
> + extern section *get_named_text_section (tree, const char *, const char *);
> + 
>   /* An interface to assemble_integer for the common case in which a value is
>      fully aligned and must be printed.  VALUE is the value of the integer
>      object and SIZE is the number of bytes it contains.  */
> Index: config/i386/winnt.c
> ===================================================================
> *** config/i386/winnt.c	(revision 166985)
> --- config/i386/winnt.c	(working copy)
> *************** i386_pe_section_type_flags (tree decl, c
> *** 414,428 ****
>       flags = SECTION_CODE;
>     else if (decl && decl_readonly_section (decl, reloc))
>       flags = 0;
> -   else if (current_function_decl
> - 	   && cfun
> - 	   && crtl->subsections.unlikely_text_section_name
> - 	   && strcmp (name, crtl->subsections.unlikely_text_section_name) == 0)
> -     flags = SECTION_CODE;
> -   else if (!decl
> - 	   && (!current_function_decl || !cfun)
> - 	   && strcmp (name, UNLIKELY_EXECUTED_TEXT_SECTION_NAME) == 0)
> -     flags = SECTION_CODE;
>     else
>       {
>         flags = SECTION_WRITE;
> --- 414,419 ----
> Index: config/darwin-protos.h
> ===================================================================
> *** config/darwin-protos.h	(revision 166985)
> --- config/darwin-protos.h	(working copy)
> *************** extern section *machopic_select_section 
> *** 57,62 ****
> --- 57,63 ----
>   extern section *machopic_select_rtx_section (enum machine_mode, rtx,
>   					     unsigned HOST_WIDE_INT);
>   
> + extern section *darwin_function_section (tree, enum node_frequency, bool, bool); 
>   extern void darwin_unique_section (tree decl, int reloc);
>   extern void darwin_asm_named_section (const char *, unsigned int, tree);
>   extern void darwin_non_lazy_pcrel (FILE *, rtx);
> Index: config/microblaze/microblaze.h
> ===================================================================
> *** config/microblaze/microblaze.h	(revision 166985)
> --- config/microblaze/microblaze.h	(working copy)
> *************** do {									 \
> *** 880,889 ****
>   #define SBSS_SECTION_ASM_OP     "\t.sbss"	/* Small RW uninitialized data */
>   #define SBSS2_SECTION_ASM_OP    "\t.sbss2"	/* Small RO uninitialized data */
>   
> - #define HOT_TEXT_SECTION_NAME   ".text.hot"
> - #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
> -                                 ".text.unlikely"
> - 
>   /* We do this to save a few 10s of code space that would be taken up
>      by the call_FUNC () wrappers, used by the generic CRT_CALL_STATIC_FUNCTION
>      definition in crtstuff.c.  */
> --- 880,885 ----
> Index: config/ia64/ia64.c
> ===================================================================
> *** config/ia64/ia64.c	(revision 166985)
> --- config/ia64/ia64.c	(working copy)
> *************** ia64_get_reg_raw_mode (int regno)
> *** 11021,11024 ****
> --- 11021,11035 ----
>     return default_get_reg_raw_mode(regno);
>   }
>   
> + /* Always default to .text section until HP-UX linker is fixed.  */
> + 
> + section *
> + ia64_hpux_function_section (tree decl ATTRIBUTE_UNUSED,
> + 			    enum node_frequency freq ATTRIBUTE_UNUSED,
> + 			    bool startup ATTRIBUTE_UNUSED,
> + 			    bool exit ATTRIBUTE_UNUSED)
> + {
> +   return NULL;
> + }
> + 
>   #include "gt-ia64.h"
> Index: config/ia64/hpux.h
> ===================================================================
> *** config/ia64/hpux.h	(revision 166985)
> --- config/ia64/hpux.h	(working copy)
> *************** do {								\
> *** 221,228 ****
>      it is fixed, prevent code from being put into .text.unlikely or
>      .text.hot.  */
>   
> ! #undef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
> ! #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text"
> ! 
> ! #undef HOT_TEXT_SECTION_NAME
> ! #define HOT_TEXT_SECTION_NAME ".text"
> --- 221,224 ----
>      it is fixed, prevent code from being put into .text.unlikely or
>      .text.hot.  */
>   
> ! #define TARGET_ASM_FUNCTION_SECTION ia64_hpux_function_section
> Index: config/ia64/ia64-protos.h
> ===================================================================
> *** config/ia64/ia64-protos.h	(revision 166985)
> --- config/ia64/ia64-protos.h	(working copy)
> *************** extern void ia64_init_expanders (void);
> *** 97,99 ****
> --- 97,100 ----
>   
>   extern rtx ia64_dconst_0_5 (void);
>   extern rtx ia64_dconst_0_375 (void);
> + extern section *ia64_hpux_function_section (decl, enum node_frequency, startup, exit);
> Index: config/darwin.c
> ===================================================================
> *** config/darwin.c	(revision 166985)
> --- config/darwin.c	(working copy)
> *************** machopic_select_section (tree decl,
> *** 1233,1244 ****
>   	       && DECL_WEAK (decl)
>   	       && !lookup_attribute ("weak_import",
>   				     DECL_ATTRIBUTES (decl)));
> !   section *base_section;
>   
>     switch (categorize_decl_for_section (decl, reloc))
>       {
>       case SECCAT_TEXT:
> !       base_section = darwin_text_section (reloc, weak);
>         break;
>   
>       case SECCAT_RODATA:
> --- 1233,1254 ----
>   	       && DECL_WEAK (decl)
>   	       && !lookup_attribute ("weak_import",
>   				     DECL_ATTRIBUTES (decl)));
> !   section *base_section = NULL;
>   
>     switch (categorize_decl_for_section (decl, reloc))
>       {
>       case SECCAT_TEXT:
> !       {
> ! 	struct cgraph_node *node;
> ! 	if (decl && TREE_CODE (decl) == FUNCTION_DECL
> ! 	    && (node = cgraph_get_node (decl)) != NULL)
> ! 	  base_section = darwin_function_section (decl,
> ! 						  node->frequency,
> ! 						  node->only_called_at_startup,
> ! 						  node->only_called_at_exit);
> ! 	if (!base_section)
> !           base_section = darwin_text_section (reloc, weak);
> !       }
>         break;
>   
>       case SECCAT_RODATA:
> *************** darwin_enter_string_into_cfstring_table 
> *** 2362,2365 ****
> --- 2372,2409 ----
>       }
>   }
>   
> + /* Choose named function section based on its frequency.  */
> + 
> + section *
> + darwin_function_section (tree decl, enum node_frequency freq,
> + 			  bool startup, bool exit)
> + {
> +   /* Startup code should go to startup subsection unless it is
> +      unlikely executed (this happens especially with function splitting
> +      where we can split away unnecessary parts of static constructors).  */
> +   if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
> +     return get_named_text_section
> + 	     (decl, "__TEXT,__startup,regular,pure_instructions", "_startup");
> + 
> +   /* Similarly for exit.  */
> +   if (exit && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
> +     return get_named_text_section (decl,
> + 				   "__TEXT,__exit,regular,pure_instructions",
> + 				   "_exit");
> + 
> +   /* Group cold functions together, similarly for hot code.  */
> +   switch (freq)
> +     {
> +       case NODE_FREQUENCY_UNLIKELY_EXECUTED:
> + 	return get_named_text_section
> + 		 (decl,
> + 	          "__TEXT,__unlikely,regular,pure_instructions", "_ulinkely");
> +       case NODE_FREQUENCY_HOT:
> + 	return get_named_text_section
> + 		 (decl, "__TEXT,__hot,regular,pure_instructions", "_hot");
> +       default:
> + 	return NULL;
> +     }
> + }
> + 
>   #include "gt-darwin.h"
> Index: config/darwin.h
> ===================================================================
> *** config/darwin.h	(revision 166985)
> --- config/darwin.h	(working copy)
> *************** int darwin_label_is_anonymous_local_objc
> *** 658,668 ****
>   /* The generic version, archs should over-ride where required.  */
>   #define MACHOPIC_NL_SYMBOL_PTR_SECTION ".non_lazy_symbol_pointer"
>   
> - /* These are used by -fbranch-probabilities */
> - #define HOT_TEXT_SECTION_NAME "__TEXT,__text,regular,pure_instructions"
> - #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME \
> -                               "__TEXT,__unlikely,regular,pure_instructions"
> - 
>   /* Declare the section variables.  */
>   #ifndef USED_FOR_TARGET
>   enum darwin_section_enum {
> --- 658,663 ----
> *************** extern GTY(()) section * darwin_sections
> *** 677,682 ****
> --- 672,679 ----
>   #undef	TARGET_ASM_SELECT_SECTION
>   #define TARGET_ASM_SELECT_SECTION machopic_select_section
>   #define USE_SELECT_SECTION_FOR_FUNCTIONS
> + #undef	TARGET_ASM_FUNCTION_SECTION
> + #define TARGET_ASM_FUNCTION_SECTION darwin_function_section
>   
>   #undef	TARGET_ASM_SELECT_RTX_SECTION
>   #define TARGET_ASM_SELECT_RTX_SECTION machopic_select_rtx_section

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-21  3:31           ` Jan Hubicka
  2010-11-21 18:49             ` Jack Howarth
@ 2010-11-22 18:33             ` Richard Henderson
  2010-11-23 16:48               ` Jan Hubicka
  2010-11-23 22:19             ` H.J. Lu
  2011-04-21 23:28             ` Mike Stump
  3 siblings, 1 reply; 70+ messages in thread
From: Richard Henderson @ 2010-11-22 18:33 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Joseph S. Myers, gcc-patches, rguenther, iant, iains, tglek

On 11/20/2010 05:12 PM, Jan Hubicka wrote:
> +       case NODE_FREQUENCY_UNLIKELY_EXECUTED:
> + 	return get_named_text_section (decl, "text.unlikely", NULL);
> +       case NODE_FREQUENCY_HOT:
> + 	return get_named_text_section (decl, "text.hot", NULL);

Missing leading "." for both of these.
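
For illustration only, here is a minimal standalone sketch of the name
selection with the leading dots restored.  The helper
demo_text_subsection_name and its exit_only parameter are hypothetical and
not part of the patch; the enum is copied from the coretypes.h hunk, and
the real code goes through get_named_text_section rather than returning
strings.

```c
#include <stdbool.h>
#include <stddef.h>

/* Copied from the coretypes.h hunk of the patch.  */
enum node_frequency {
  NODE_FREQUENCY_UNLIKELY_EXECUTED,
  NODE_FREQUENCY_EXECUTED_ONCE,
  NODE_FREQUENCY_NORMAL,
  NODE_FREQUENCY_HOT
};

/* Hypothetical helper (assumption, not in the patch): return the ELF
   subsection name chosen for a function, or NULL when the function
   should stay in the default text section.  */
static const char *
demo_text_subsection_name (enum node_frequency freq, bool startup,
                           bool exit_only)
{
  /* Startup-only code goes to .text.startup unless it is cold.  */
  if (startup && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
    return ".text.startup";

  /* Similarly for exit-only code.  */
  if (exit_only && freq != NODE_FREQUENCY_UNLIKELY_EXECUTED)
    return ".text.exit";

  /* Group cold and hot functions; note the leading "." on both names.  */
  switch (freq)
    {
    case NODE_FREQUENCY_UNLIKELY_EXECUTED:
      return ".text.unlikely";
    case NODE_FREQUENCY_HOT:
      return ".text.hot";
    default:
      return NULL;
    }
}
```

A cold function that is also startup-only falls through to
".text.unlikely", matching the order of the checks in the patch.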

> + /* Always default to .text section until HP-UX linker is fixed.  */
> + 
> + section *
> + ia64_hpux_function_section (tree decl ATTRIBUTE_UNUSED,
> + 			    enum node_frequency freq ATTRIBUTE_UNUSED,
> + 			    bool startup ATTRIBUTE_UNUSED,
> + 			    bool exit ATTRIBUTE_UNUSED)
> + {
> +   return NULL;
> + }

I really really think this is much clearer if we do

  return text_section;

here, but I guess this could be a subsequent cleanup.

> + extern section *ia64_hpux_function_section (decl, enum node_frequency, startup, exit);

This doesn't need to be declared here.  The only real use is within ia64.c;
the reference within hpux.h can just as well talk about the static function.
I suppose you do have to either conditionally define the function or mark
it as ATTRIBUTE_UNUSED.
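
Roughly what I have in mind, as a standalone sketch rather than the actual
ia64.c change.  The demo_ names and the stand-in types are hypothetical;
ATTRIBUTE_UNUSED is defined locally here only so the fragment compiles
outside of GCC, where it normally comes from ansidecl.h.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the macro GCC gets from ansidecl.h.  */
#ifndef ATTRIBUTE_UNUSED
#define ATTRIBUTE_UNUSED __attribute__ ((__unused__))
#endif

/* Opaque stand-ins for GCC's section and tree types (assumptions made
   for illustration only).  */
typedef struct demo_section demo_section;
typedef void *demo_tree;

/* Static, so no prototype is needed in a shared header.  Marking the
   function itself unused keeps configurations that never install the
   hook from warning about an unused static function.  */
static demo_section *demo_hpux_function_section (demo_tree, int, bool, bool)
  ATTRIBUTE_UNUSED;

static demo_section *
demo_hpux_function_section (demo_tree decl ATTRIBUTE_UNUSED,
                            int freq ATTRIBUTE_UNUSED,
                            bool startup ATTRIBUTE_UNUSED,
                            bool exit_only ATTRIBUTE_UNUSED)
{
  /* NULL means "use the default text section".  */
  return NULL;
}
```

The other option is to wrap the definition in a target-specific #if, which
avoids the attribute but adds a conditional to ia64.c.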

> + 	          "__TEXT,__unlikely,regular,pure_instructions", "_ulinkely");

Typo in that last string.

> ! #undef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
> ! #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text"
> ! 
> ! #undef HOT_TEXT_SECTION_NAME
> ! #define HOT_TEXT_SECTION_NAME ".text"

Are these now poison-able?

Otherwise ok.


r~

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-22 18:33             ` Richard Henderson
@ 2010-11-23 16:48               ` Jan Hubicka
  0 siblings, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-11-23 16:48 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jan Hubicka, Joseph S. Myers, gcc-patches, rguenther, iant, iains, tglek

> > + /* Always default to .text section until HP-UX linker is fixed.  */
> > + 
> > + section *
> > + ia64_hpux_function_section (tree decl ATTRIBUTE_UNUSED,
> > + 			    enum node_frequency freq ATTRIBUTE_UNUSED,
> > + 			    bool startup ATTRIBUTE_UNUSED,
> > + 			    bool exit ATTRIBUTE_UNUSED)
> > + {
> > +   return NULL;
> > + }
> 
> I really really think this is much clearer if we do
> 
>   return text_section;
> 
> here, but I guess this could be a subsequent cleanup.

Yep, it would be, but for that USE_SELECT_SECTION_FOR_FUNCTIONS needs to go.  I will be happy to do
that for the next stage1.
> > ! #undef UNLIKELY_EXECUTED_TEXT_SECTION_NAME
> > ! #define UNLIKELY_EXECUTED_TEXT_SECTION_NAME ".text"
> > ! 
> > ! #undef HOT_TEXT_SECTION_NAME
> > ! #define HOT_TEXT_SECTION_NAME ".text"
> 
> Are these now poison-able?
Should be. I will add the poison, re-test and commit.  Thanks!

Honza

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-21  3:31           ` Jan Hubicka
  2010-11-21 18:49             ` Jack Howarth
  2010-11-22 18:33             ` Richard Henderson
@ 2010-11-23 22:19             ` H.J. Lu
  2010-11-24  1:00               ` Jan Hubicka
  2010-11-26 18:45               ` H.J. Lu
  2011-04-21 23:28             ` Mike Stump
  3 siblings, 2 replies; 70+ messages in thread
From: H.J. Lu @ 2010-11-23 22:19 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Richard Henderson, Joseph S. Myers, gcc-patches, rguenther, iant,
	iains, tglek

On Sat, Nov 20, 2010 at 5:12 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>
> Hi,
> this is updated patch.  I addressed, I hope, all the comments except for
> removing USE_SELECT_SECTION_FOR_FUNCTIONS. I am somewhat confused by darwin and
> thus would like to handle it incrementally.
>
> I've removed the subsection hash, the string concatenations and section hash
> should be fast enough given that we switch twice per function at most.  I also
> added flag to tree_decl_with_vis specifying whether section name is implied by
> -ffunction-sections or given by user and made default_function_section to do
> text.unlikely.function_name by default.  I kept adding the _unlikely only for
> darwin.  It was developed there and I suppose the funny suffixes do have
> meaning for darwin linker.
>
> Bootstrapped/regtested x86_64-linux, OK (after testing darwin and ia64)?
>
> Honza
>
>        * tree.h (DECL_HAS_IMPLICIT_SECTION_NAME_P): New macro.
>        (tree_decl_with_vis): Add implicit_section_name_p.
>        * targhooks.h (default_function_section): Declare.
>        * target.def (function_section): New hook.
>        * defaults.h (HOT_TEXT_SECTION_NAME,
>        UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
>        * predict.c (choose_function_section): Remove.
>        (estimate_bb_frequencies): Do not use choose_function_section.
>        * coretypes.h (enum node_frequency): Move here from cgraph.h
>        * cgraph.h (enum node_frequency): Remove.
>        * varasm.c (initialize_cold_section_name, unlikely_text_section,
>        unlikely_text_section_p): Remove.
>        (named_subsection_entry): New structure.
>        (get_text_section): New function.
>        (default_function_section): New function.
>        (function_section_1): Break out from ...; handle profile info.
>        (function_section): ... here.
>        (unlikely_text_section): Remove.
>        (unlikely_text_section_p): Use function_section_1.
>        (assemble_start_function): Do not initialize cold section.
>        (default_section_type_flags): Do not special case cold subsection.
>        (switch_to_section): Likewise.
>        * output.h (get_text_section): Define.
>        * config/i386/winnt.c: Do not special case cold section.
>        * config/darwin-protos.h (darwin_function_section): Declare.
>        * config/microblaze/microblaze.h (HOT_TEXT_SECTION_NAME,
>        UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
>        * config/ia64/hpux.h (HOT_TEXT_SECTION_NAME,
>        UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
>        (TARGET_ASM_FUNCTION_SECTION): Define to ia64_hpux_function_section.
>        * config/ia64/ia64.c (ia64_hpux_function_section): New function.
>        * config/ia64/ia64-protos.h (ia64_hpux_function_section): Define.
>        * config/darwin.c (machopic_select_section): Use
>        darwin_function_section.
>        (darwin_function_section): New function.
>        * config/darwin.h (HOT_TEXT_SECTION_NAME,
>        UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
>        (TARGET_ASM_FUNCTION_SECTION): Define.

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46628

H.J.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-23 22:19             ` H.J. Lu
@ 2010-11-24  1:00               ` Jan Hubicka
  2010-11-24 18:18                 ` Jan Hubicka
  2010-11-26 18:45               ` H.J. Lu
  1 sibling, 1 reply; 70+ messages in thread
From: Jan Hubicka @ 2010-11-24  1:00 UTC (permalink / raw)
  To: H.J. Lu
  Cc: Jan Hubicka, Richard Henderson, Joseph S. Myers, gcc-patches,
	rguenther, iant, iains, tglek

> 
> This caused:
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46628

Hi,
this problem is caused by dwarf2out trying to switch to the cold section to
output the starting and ending labels.  This is no longer so easy, since one
needs a function declaration to get into the cold section; otherwise we end up
producing a cold section with data section flags.

This patch makes dwarf2out behave roughly as before by delaying the switch to
the cold section until the first function is output.

I think this is just a latent problem, since multiple cold sections will lead
to bad debug info (with -ffunction-sections -freorder-blocks-and-partition).
How should we handle this case?  Does this patch seem reasonable as a fix for
the regression?

Bootstrapped/regtested x86_64-linux.  My apologies for the breakage.

Honza

	* dwarf2out.c (dwarf2out_begin_function): Set cold_text_section
	and output cold_text_section_label.
	(dwarf2out_init): Don't do that there.
	(dwarf2out_finish): Handle cold section end label only if cold
	section was used.
Index: dwarf2out.c
===================================================================
--- dwarf2out.c	(revision 167086)
+++ dwarf2out.c	(working copy)
@@ -21674,6 +21674,15 @@ dwarf2out_var_location (rtx loc_note)
 static void
 dwarf2out_begin_function (tree fun)
 {
+  if (flag_reorder_blocks_and_partition && !cold_text_section)
+    {
+      gcc_assert (current_function_decl == fun);
+      cold_text_section = unlikely_text_section ();
+      switch_to_section (cold_text_section);
+      ASM_OUTPUT_LABEL (asm_out_file, cold_text_section_label);
+      switch_to_section (current_function_section ());
+    }
+
   if (function_section (fun) != text_section)
     have_multiple_function_sections = true;
 
@@ -21996,13 +22005,6 @@ dwarf2out_init (const char *filename ATT
 
   switch_to_section (text_section);
   ASM_OUTPUT_LABEL (asm_out_file, text_section_label);
-  if (flag_reorder_blocks_and_partition)
-    {
-      cold_text_section = unlikely_text_section ();
-      switch_to_section (cold_text_section);
-      ASM_OUTPUT_LABEL (asm_out_file, cold_text_section_label);
-    }
-
 }
 
 /* Called before cgraph_optimize starts outputtting functions, variables
@@ -23108,9 +23110,9 @@ dwarf2out_finish (const char *filename)
   /* Output a terminator label for the .text section.  */
   switch_to_section (text_section);
   targetm.asm_out.internal_label (asm_out_file, TEXT_END_LABEL, 0);
-  if (flag_reorder_blocks_and_partition)
+  if (cold_text_section)
     {
-      switch_to_section (unlikely_text_section ());
+      switch_to_section (cold_text_section);
       targetm.asm_out.internal_label (asm_out_file, COLD_END_LABEL, 0);
     }
 

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-24  1:00               ` Jan Hubicka
@ 2010-11-24 18:18                 ` Jan Hubicka
  2010-11-26 19:19                   ` Richard Henderson
  0 siblings, 1 reply; 70+ messages in thread
From: Jan Hubicka @ 2010-11-24 18:18 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: H.J. Lu, Richard Henderson, Joseph S. Myers, gcc-patches,
	rguenther, iant, iains, tglek

Hi,
as Jakub, Tromey and Jason patiently explained to me, the cold_text_section
code in dwarf2out is just an optimization.  Functions in separate sections are
handled elsewhere. So I take back my claims about dwarf2out being broken in
partitioning and am attaching the updated patch.  The only difference is in
dwarf2out_begin_function, which waits for the first function in the text
section (which consequently has the standard cold section) and initializes
the variables accordingly.
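
The lazy initialization described above can be sketched in isolation.  This is
a minimal model, not the code from the patch: the struct, both function names
and the label strings are hypothetical stand-ins, and the real logic lives in
dwarf2out_begin_function and dwarf2out_finish in the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the dwarf2out state touched by the patch.  */
struct cold_label_state
{
  bool partitioning;   /* models flag_reorder_blocks_and_partition */
  bool start_emitted;  /* models cold_text_section being initialized */
};

/* Called once per function.  Emits the cold start label the first time a
   function lands in the default text section; returns true if it did.  */
bool
note_function_start (struct cold_label_state *st, bool in_text_section,
                     FILE *out)
{
  assert (st != NULL && out != NULL);
  if (!in_text_section || !st->partitioning || st->start_emitted)
    return false;
  fputs (".Lcold_start:\n", out);   /* placeholder label name */
  st->start_emitted = true;
  return true;
}

/* Called at the end of the unit.  Emits the end label only if the start
   label was ever emitted; returns true if it did.  */
bool
note_finish (const struct cold_label_state *st, FILE *out)
{
  assert (st != NULL && out != NULL);
  if (!st->start_emitted)
    return false;
  fputs (".Lcold_end:\n", out);     /* placeholder label name */
  return true;
}
```

The point of the design is that the start and end labels always come in pairs:
a unit that never outputs a function into the default text section gets
neither.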

Bootstrapped/regtested x86_64-linux, OK?

Honza

	* dwarf2out.c (dwarf2out_begin_function): Initialize cold_text_section
	and emit cold_text_section_label.
	(dwarf2out_init): Don't do that there.
	(dwarf2out_finish): Emit end label only when cold_text_section is
	initialized.
Index: dwarf2out.c
===================================================================
*** dwarf2out.c	(revision 167086)
--- dwarf2out.c	(working copy)
*************** dwarf2out_begin_function (tree fun)
*** 21676,21681 ****
--- 21676,21689 ----
  {
    if (function_section (fun) != text_section)
      have_multiple_function_sections = true;
+   else if (flag_reorder_blocks_and_partition && !cold_text_section)
+     {
+       gcc_assert (current_function_decl == fun);
+       cold_text_section = unlikely_text_section ();
+       switch_to_section (cold_text_section);
+       ASM_OUTPUT_LABEL (asm_out_file, cold_text_section_label);
+       switch_to_section (current_function_section ());
+     }
  
    dwarf2out_note_section_used ();
  }
*************** dwarf2out_init (const char *filename ATT
*** 21996,22008 ****
  
    switch_to_section (text_section);
    ASM_OUTPUT_LABEL (asm_out_file, text_section_label);
-   if (flag_reorder_blocks_and_partition)
-     {
-       cold_text_section = unlikely_text_section ();
-       switch_to_section (cold_text_section);
-       ASM_OUTPUT_LABEL (asm_out_file, cold_text_section_label);
-     }
- 
  }
  
  /* Called before cgraph_optimize starts outputtting functions, variables
--- 22004,22009 ----
*************** dwarf2out_finish (const char *filename)
*** 23108,23116 ****
    /* Output a terminator label for the .text section.  */
    switch_to_section (text_section);
    targetm.asm_out.internal_label (asm_out_file, TEXT_END_LABEL, 0);
!   if (flag_reorder_blocks_and_partition)
      {
!       switch_to_section (unlikely_text_section ());
        targetm.asm_out.internal_label (asm_out_file, COLD_END_LABEL, 0);
      }
  
--- 23109,23117 ----
    /* Output a terminator label for the .text section.  */
    switch_to_section (text_section);
    targetm.asm_out.internal_label (asm_out_file, TEXT_END_LABEL, 0);
!   if (cold_text_section)
      {
!       switch_to_section (cold_text_section);
        targetm.asm_out.internal_label (asm_out_file, COLD_END_LABEL, 0);
      }
  

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-23 22:19             ` H.J. Lu
  2010-11-24  1:00               ` Jan Hubicka
@ 2010-11-26 18:45               ` H.J. Lu
  1 sibling, 0 replies; 70+ messages in thread
From: H.J. Lu @ 2010-11-26 18:45 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: Richard Henderson, Joseph S. Myers, gcc-patches, rguenther, iant,
	iains, tglek

On Tue, Nov 23, 2010 at 12:09 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sat, Nov 20, 2010 at 5:12 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>
>> Hi,
>> this is updated patch.  I addressed, I hope, all the comments except for
>> removing USE_SELECT_SECTION_FOR_FUNCTIONS. I am somewhat confused by darwin and
>> thus would like to handle it incrementally.
>>
>> I've removed the subsection hash, the string concatenations and section hash
>> should be fast enough given that we switch twice per function at most.  I also
>> added flag to tree_decl_with_vis specifying whether section name is implied by
>> -ffunction-sections or given by user and made default_function_section to do
>> text.unlikely.function_name by default.  I kept adding the _unlikely only for
>> darwin.  It was developed there and I suppose the funny suffixes do have
>> meaning for darwin linker.
>>
>> Bootstrapped/regtested x86_64-linux, OK (after testing darwin and ia64)?
>>
>> Honza
>>
>>        * tree.h (DECL_HAS_IMPLICIT_SECTION_NAME_P): New macro.
>>        (tree_decl_with_vis): Add implicit_section_name_p.
>>        * targhooks.h (default_function_section): Declare.
>>        * target.def (function_section): New hook.
>>        * defaults.h (HOT_TEXT_SECTION_NAME,
>>        UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
>>        * predict.c (choose_function_section): Remove.
>>        (estimate_bb_frequencies): Do not use choose_function_section.
>>        * coretypes.h (enum node_frequency): Move here from cgraph.h
>>        * cgraph.h (enum node_frequency): Remove.
>>        * varasm.c (initialize_cold_section_name, unlikely_text_section,
>>        unlikely_text_section_p): Remove.
>>        (named_subsection_entry): New structure.
>>        (get_text_section): New function.
>>        (default_function_section): New function.
>>        (function_section_1): Break out from ...; handle profile info.
>>        (function_section): ... here.
>>        (unlikely_text_section): Remove.
>>        (unlikely_text_section_p): Use function_section_1.
>>        (assemble_start_function): Do not initialize cold section.
>>        (default_section_type_flags): Do not special case cold subsection.
>>        (switch_to_section): Likewise.
>>        * output.h (get_text_section): Define.
>>        * config/i386/winnt.c: Do not special case cold section.
>>        * config/darwin-protos.h (darwin_function_section): Declare.
>>        * config/microblaze/microblaze.h (HOT_TEXT_SECTION_NAME,
>>        UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
>>        * config/ia64/hpux.h (HOT_TEXT_SECTION_NAME,
>>        UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
>>        (TARGET_ASM_FUNCTION_SECTION): Define to ia64_hpux_function_section.
>>        * config/ia64/ia64.c (ia64_hpux_function_section): New function.
>>        * config/ia64/ia64-protos.h (ia64_hpux_function_section): Define.
>>        * config/darwin.c (machopic_select_section): Use
>>        darwin_function_section.
>>        (darwin_function_section): New function.
>>        * config/darwin.h (HOT_TEXT_SECTION_NAME,
>>        UNLIKELY_EXECUTED_TEXT_SECTION_NAME): Remove.
>>        (TARGET_ASM_FUNCTION_SECTION): Define.
>
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46628
>

This also caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46667


-- 
H.J.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-24 18:18                 ` Jan Hubicka
@ 2010-11-26 19:19                   ` Richard Henderson
  2010-11-27 23:17                     ` Jan Hubicka
  0 siblings, 1 reply; 70+ messages in thread
From: Richard Henderson @ 2010-11-26 19:19 UTC (permalink / raw)
  To: Jan Hubicka
  Cc: H.J. Lu, Joseph S. Myers, gcc-patches, rguenther, iant, iains, tglek

On 11/24/2010 08:09 AM, Jan Hubicka wrote:
> Hi,
> as Jakub, Tromey and Jason patiently explained to me, the cold_text_section
> code in dwarf2out is just an optimization.  Functions in separate sections are
> handled elsewhere. So I take back my claims about dwarf2out being broken in
> partitioning and am attaching the updated patch.  The only difference is in
> dwarf2out_begin_function, which waits for the first function in the text
> section (which consequently has the standard cold section) and initializes
> the variables accordingly.
> 
> Bootstrapped/regtested x86_64-linux, OK?
> 
> Honza
> 
> 	* dwarf2out.c (dwarf2out_begin_function): Initialize cold_text_section
> 	and emit cold_text_section_label.
> 	(dwarf2out_init): Don't do that there.
> 	(dwarf2out_finish): Emit end label only when cold_text_section is
> 	initialized.
> Index: dwarf2out.c
> ===================================================================
> *** dwarf2out.c	(revision 167086)
> --- dwarf2out.c	(working copy)
> *************** dwarf2out_begin_function (tree fun)
> *** 21676,21681 ****
> --- 21676,21689 ----
>   {
>     if (function_section (fun) != text_section)
>       have_multiple_function_sections = true;
> +   else if (flag_reorder_blocks_and_partition && !cold_text_section)
> +     {
> +       gcc_assert (current_function_decl == fun);
> +       cold_text_section = unlikely_text_section ();
> +       switch_to_section (cold_text_section);
> +       ASM_OUTPUT_LABEL (asm_out_file, cold_text_section_label);
> +       switch_to_section (current_function_section ());
> +     }

Is there a quick way to determine if the function will ever go into the
cold_text_section?  It would be nice if a given translation unit is
entirely non-cold that we never switch into the cold_text_section to
create that label.

If there is no way for that atm, then the patch is ok as-is.  It's
clearly an improvement over the current state of affairs.


r~

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-26 19:19                   ` Richard Henderson
@ 2010-11-27 23:17                     ` Jan Hubicka
  0 siblings, 0 replies; 70+ messages in thread
From: Jan Hubicka @ 2010-11-27 23:17 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Jan Hubicka, H.J. Lu, Joseph S. Myers, gcc-patches, rguenther,
	iant, iains, tglek

> 
> Is there a quick way to determine if the function will ever go into the
> cold_text_section?  It would be nice if a given translation unit is
> entirely non-cold that we never switch into the cold_text_section to
> create that label.
> 
> If there is no way for that atm, then the patch is ok as-is.  It's
> clearly an improvement over the current state of affairs.

I didn't notice any, except for walking the CFG and looking for COLD_PARTITION
(or perhaps one needs to check only the first and last BB, as there should
be at most one partition change).
It is trivial to add a flag passed down from bb-reorder if the spurious label
seems like a problem.
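
For illustration, the two checks mentioned above could look roughly like this.
It is a sketch over a plain array of per-block partition tags; the enum and
both function names are hypothetical and do not correspond to GCC's real basic
block data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-block tag, in final layout order.  */
enum bb_partition { BB_PART_HOT, BB_PART_COLD };

/* Full walk: true if any block is in the cold partition.  */
bool
has_cold_block_scan (const enum bb_partition *bbs, size_t n)
{
  assert (bbs != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (bbs[i] == BB_PART_COLD)
      return true;
  return false;
}

/* Shortcut: looks only at the first and last block.  This is valid only
   under the assumption that the layout changes partition at most once.  */
bool
has_cold_block_ends (const enum bb_partition *bbs, size_t n)
{
  assert (bbs != NULL || n == 0);
  if (n == 0)
    return false;
  return bbs[0] == BB_PART_COLD || bbs[n - 1] == BB_PART_COLD;
}
```

Both functions agree whenever the "at most one partition change" assumption
holds; the second one is constant time.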

I've committed the patch. Thanks!
Honza
> 
> 
> r~

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections, take 2
  2010-11-21  3:31           ` Jan Hubicka
                               ` (2 preceding siblings ...)
  2010-11-23 22:19             ` H.J. Lu
@ 2011-04-21 23:28             ` Mike Stump
  3 siblings, 0 replies; 70+ messages in thread
From: Mike Stump @ 2011-04-21 23:28 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches@gcc.gnu.org Patches

On Nov 20, 2010, at 5:12 PM, Jan Hubicka wrote:
> this is updated patch.  I addressed, I hope, all the comments except for
> removing USE_SELECT_SECTION_FOR_FUNCTIONS. I am somewhat confused by darwin and
> thus would like to handle it incrementally.
> 
> I've removed the subsection hash, the string concatenations and section hash
> should be fast enough given that we switch twice per function at most.  I also
> added flag to tree_decl_with_vis specifying whether section name is implied by
> -ffunction-sections or given by user and made default_function_section to do
> text.unlikely.function_name by default.  I kept adding the _unlikely only for
> darwin.  It was developed there and I suppose the funny suffixes do have
> meaning for darwin linker.

So, I didn't see the patch that poisoned them in system.h.  That patch should have removed the documentation for HOT_TEXT_SECTION_NAME and UNLIKELY_EXECUTED_TEXT_SECTION_NAME?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections
  2010-10-21 18:02 ` Cary Coutant
  2010-10-21 19:43   ` Ian Lance Taylor
@ 2013-07-15 17:46   ` Cary Coutant
  2013-07-15 18:08     ` Jan Hubicka
  1 sibling, 1 reply; 70+ messages in thread
From: Cary Coutant @ 2013-07-15 17:46 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: gcc-patches, Richard Biener, Ian Lance Taylor, iains, tglek

>> I am not sure how to update gold - I basically copied existing code in binutils
>> for .text.unlikely group in GNU LD linker script, but I think gold is doing
>> independent decisions somewhere.

Ian committed this patch a few months ago, after a lengthy discussion
around a patch originally submitted by Sriraman:

   http://sourceware.org/ml/binutils/2012-12/msg00227.html

That should have gold handling .text.startup, .text.exit, and
.text.hot the same as GNU ld.

-cary

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections
  2013-07-15 17:46   ` Cary Coutant
@ 2013-07-15 18:08     ` Jan Hubicka
  2013-07-15 18:18       ` Ian Lance Taylor
  0 siblings, 1 reply; 70+ messages in thread
From: Jan Hubicka @ 2013-07-15 18:08 UTC (permalink / raw)
  To: Cary Coutant
  Cc: Jan Hubicka, gcc-patches, Richard Biener, Ian Lance Taylor, iains, tglek

> >> I am not sure how to update gold - I basically copied existing code in binutils
> >> for .text.unlikely group in GNU LD linker script, but I think gold is doing
> >> independent decisions somewhere.
> 
> Ian committed this patch a few months ago, after a lengthy discussion
> around a patch originally submitted by Sriraman:
> 
>    http://sourceware.org/ml/binutils/2012-12/msg00227.html
> 
> That should have gold handling .text.startup, .text.exit, and
> .text.hot the same as GNU ld.

Thanks a lot! It seems that this version of gold has not hit our distro yet.
I will update my local installation.
The next thing is how to tell GNU LD/Gold the relative order of functions.
I.e. my_function_section.order.125 or something like that?

Honza
> 
> -cary

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections
  2013-07-15 18:08     ` Jan Hubicka
@ 2013-07-15 18:18       ` Ian Lance Taylor
  2013-07-15 18:26         ` Jan Hubicka
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Lance Taylor @ 2013-07-15 18:18 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Cary Coutant, gcc-patches, Richard Biener, iains, tglek

On Mon, Jul 15, 2013 at 11:05 AM, Jan Hubicka <hubicka@ucw.cz> wrote:

> The next thing is how to tell GNU LD/Gold the relative order of functions.
> I.e. my_function_section.order.125 or something like that?

Gold has a --section-ordering-file option that lets you specify the
order in which sections should appear in the executable.  Sections not
listed there follow the default rules.

Ian

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections
  2013-07-15 18:18       ` Ian Lance Taylor
@ 2013-07-15 18:26         ` Jan Hubicka
  2013-07-15 22:45           ` Ian Lance Taylor
  0 siblings, 1 reply; 70+ messages in thread
From: Jan Hubicka @ 2013-07-15 18:26 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Jan Hubicka, Cary Coutant, gcc-patches, Richard Biener, iains, tglek

> On Mon, Jul 15, 2013 at 11:05 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> 
> > The next thing is how to tell GNU LD/Gold the relative order of functions.
> > I.e. my_function_section.order.125 or something like that?
> 
> Gold has a --section-ordering-file option that lets you specify the
> order in which sections should appear in the executable.  Sections not
> listed there follow the default rules.

Yep, the problem is where to produce the section ordering file.
The scheme is as follows:
  - With -fprofile-generate, instrument every function entry point and record
    the time of the first and last invocation of the function.
  - At compile time we take the functions that are executed during startup
    and we want to order them in increasing order of the first invocation
    time measured at FDO time. So we know the relative position of a given
    function in the program, but not the complete function order.
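
The per-object half of this scheme is just a sort by first invocation time.  A
minimal sketch, assuming a hypothetical profile record (the struct and function
names below are made up for illustration and are not GCC's profile format):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record of what the instrumentation measured for one
   function.  */
struct func_profile
{
  const char *name;
  uint64_t first_call;  /* time of first invocation */
  uint64_t last_call;   /* time of last invocation */
};

/* qsort comparator: increasing first invocation time, overflow-safe.  */
static int
by_first_call (const void *a, const void *b)
{
  const struct func_profile *x = a;
  const struct func_profile *y = b;
  return (x->first_call > y->first_call) - (x->first_call < y->first_call);
}

/* Order the startup functions of one object file in place.  */
void
order_startup_functions (struct func_profile *funcs, size_t n)
{
  assert (funcs != NULL || n == 0);
  if (n > 1)
    qsort (funcs, n, sizeof funcs[0], by_first_call);
}
```

This only yields the order within one object file, which is exactly the
limitation described above.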

Honza
> 
> Ian

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections
  2013-07-15 18:26         ` Jan Hubicka
@ 2013-07-15 22:45           ` Ian Lance Taylor
  2013-07-17  5:27             ` Jan Hubicka
  0 siblings, 1 reply; 70+ messages in thread
From: Ian Lance Taylor @ 2013-07-15 22:45 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Cary Coutant, gcc-patches, Richard Biener, iains, tglek

On Mon, Jul 15, 2013 at 11:18 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On Mon, Jul 15, 2013 at 11:05 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>
>> > The next thing is how to tell GNU LD/Gold the relative order of functions.
>> > I.e. my_function_section.order.125 or something like that?
>>
>> Gold has a --section-ordering-file option that lets you specify the
>> order in which sections should appear in the executable.  Sections not
>> listed there follow the default rules.
>
> Yep, the problem is where to produce the section ordering file.
> The scheme is as follows:
>   - with -fprofile-generate instrument every function entry point and record
>     time of first and last invocation of the function
>   - At compile time we take functions that are executed during the startup
>     and we want to order them in the increasing order of the first invocation
>     time measured at FDO time. So we know the relative position of given function
>     in the program, but not the complete function order.

Perhaps I misunderstand, but you can use --section-ordering-file
without knowing the complete function ordering.  Just specify the
functions you care about.

Ian

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections
  2013-07-15 22:45           ` Ian Lance Taylor
@ 2013-07-17  5:27             ` Jan Hubicka
  2013-07-17 19:56               ` Cary Coutant
  0 siblings, 1 reply; 70+ messages in thread
From: Jan Hubicka @ 2013-07-17  5:27 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Jan Hubicka, Cary Coutant, gcc-patches, Richard Biener, iains, tglek

> On Mon, Jul 15, 2013 at 11:18 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> On Mon, Jul 15, 2013 at 11:05 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >>
> >> > The next thing is how to tell GNU LD/Gold the relative order of functions.
> >> > I.e. my_function_section.order.125 or something like that?
> >>
> >> Gold has a --section-ordering-file option that lets you specify the
> >> order in which sections should appear in the executable.  Sections not
> >> listed there follow the default rules.
> >
> > Yep, the problem is where to produce the section ordering file.
> > The scheme is as follows:
> >   - with -fprofile-generate instrument every function entry point and record
> > >     time of first and last invocation of the function
> >   - At compile time we take functions that are executed during the startup
> >     and we want to order them in the increasing order of the first invocation
> >     time measured at FDO time. So we know the relative position of given function
> > >     in the program, but not the complete function order.
> 
> Perhaps I misunderstand, but you can use --section-ordering-file
> without knowing the complete function ordering.  Just specify the
> functions you care about.

The thing is that when compiling a given object file, you know only the
functions in that object file, so you cannot produce the full
--section-ordering-file.  We would need a tool collecting the partial orders
from all objects into a single file, which I think could just be done in the
linker.
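
As a sketch of what such a collecting step would have to do: merge the
per-object lists, each already sorted by first invocation time, into one global
list.  The struct and function names are hypothetical, and the sketch assumes
that all objects record time on the same scale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry of a per-object partial order.  */
struct order_entry
{
  const char *section;  /* e.g. ".text.foo" */
  uint64_t first_call;  /* time of first invocation */
};

/* Merge two lists sorted by first_call into OUT, which must have room
   for NA + NB entries.  Ties keep entries of A first.  Returns the
   number of entries written.  */
size_t
merge_partial_orders (const struct order_entry *a, size_t na,
                      const struct order_entry *b, size_t nb,
                      struct order_entry *out)
{
  assert (out != NULL);
  size_t i = 0, j = 0, k = 0;
  while (i < na && j < nb)
    out[k++] = (b[j].first_call < a[i].first_call) ? b[j++] : a[i++];
  while (i < na)
    out[k++] = a[i++];
  while (j < nb)
    out[k++] = b[j++];
  return k;
}
```

Repeating this merge over all objects gives the complete order that a single
ordering file would need.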

Honza
> 
> Ian

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: Group static constructors and destructors in specific subsections
  2013-07-17  5:27             ` Jan Hubicka
@ 2013-07-17 19:56               ` Cary Coutant
  2013-07-20 13:30                 ` Martin Liška
  0 siblings, 1 reply; 70+ messages in thread
From: Cary Coutant @ 2013-07-17 19:56 UTC (permalink / raw)
  To: Jan Hubicka; +Cc: Ian Lance Taylor, gcc-patches, Richard Biener, iains, tglek

>> > Yep, the problem is where to produce the section ordering file.
>> > The scheme is as follows:
>> >   - with -fprofile-generate instrument every function entry point and record
>> >     time of first and last invocation of the function
>> >   - At compile time we take functions that are executed during the startup
>> >     and we want to order them in the increasing order of the first invocation
>> >     time measured at FDO time. So we know the relative position of given function
>> >     in the program, but not the complete function order.
>>
>> Perhaps I misunderstand, but you can use --section-ordering-file
>> without knowing the complete function ordering.  Just specify the
>> functions you care about.
>
> The thing is that when compiling given object file, you know only functions in
> that object file, so you can not produce full --section-ordering-file.  We
> would need a tool collecting the partial orders from all objects into a single file
> that I think may be just done in linker.

How granular a solution do you need? If you need something fine-grain,
like microseconds since startup, we'd also need some way of ensuring
that all compilation units are using the same scale. What if someone
else wants to order by execution count instead? We could do something
coarse-grain by adding a few more "buckets" after "unlikely", "exit",
"startup", and "hot", but you probably would need to see the whole
program before you could translate something like time-since-startup
into a bucket.

In another old thread, I suggested modifying the section naming
convention to remove the ambiguity between a function named "unlikely"
compiled with -ffunction-sections, and an arbitrary function placed
into the "unlikely" bucket. Namely, instead of using
".text.function_name" and ".text.bucket", we combine these into
".text.bucket.function_name". Without -ffunction-sections, we'd just
have ".text.bucket" like we do today, but with -ffunction-sections,
we'd have ".text..function_name" in the case where there is no bucket.
In order to distinguish between old and new conventions, I'd amend
that suggestion to use a different set of delimiters -- perhaps
".text[bucket](function_name)". That at least makes it more obvious
that the input section goes into an output section named ".text", and
we can have a general rule rather than the collection of special cases
we have now.
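
A small sketch of the ".text.bucket.function_name" convention, to show how the
cases fall out.  The helper name is hypothetical, and passing NULL for a
missing bucket or function is a choice made for this example; the bracketed
variant would only change the format strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build a section name from an optional bucket and an optional function
   name (NULL function models compiling without -ffunction-sections).
   Returns what snprintf returns: the length of the full name.  */
int
format_text_section (char *buf, size_t size, const char *bucket,
                     const char *function)
{
  assert (buf != NULL && size > 0);
  if (function == NULL)
    {
      if (bucket == NULL)
        return snprintf (buf, size, ".text");
      return snprintf (buf, size, ".text.%s", bucket);
    }
  /* With a function name the bucket field is always present, possibly
     empty, so ".text..foo" cannot be confused with a bucket named "foo".  */
  return snprintf (buf, size, ".text.%s.%s", bucket ? bucket : "", function);
}
```

With this shape, a function that happens to be called "unlikely" becomes
".text..unlikely", while the unlikely bucket without -ffunction-sections stays
".text.unlikely".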

To support your use case, we could allow, in addition to the four
buckets we already have defined, numeric buckets ranging from, say, 0
to 99999. You could map whatever ordering criterion you want to use
into that range, and the linker would order the text sections by
bucket, placing the numbered buckets after "hot" and before all the
unbucketed sections. I might further suggest moving "unlikely" to the
end, after all the unbucketed sections.
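
To make the numeric bucket idea concrete, here is a rough sketch of the two
pieces involved: mapping a measured time into the 0..99999 range, and ranking
buckets in the order suggested above.  All names are hypothetical, linear
scaling is just one possible mapping, and the "startup" and "exit" buckets are
left out because their position relative to the numbered buckets is not
specified here.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NUMERIC_BUCKET 99999u

/* Scale a first-invocation time linearly into 0..MAX_NUMERIC_BUCKET,
   given the latest time seen in the whole program.  */
unsigned
time_to_bucket (uint64_t first_call, uint64_t latest_call)
{
  assert (latest_call > 0 && first_call <= latest_call);
  return (unsigned) ((double) first_call / (double) latest_call
                     * MAX_NUMERIC_BUCKET);
}

enum bucket_kind { BUCKET_HOT, BUCKET_NUMERIC, BUCKET_NONE, BUCKET_UNLIKELY };

/* Sort key for a text section: "hot" first, then numbered buckets in
   increasing order, then unbucketed sections, then "unlikely" last.  */
unsigned long
bucket_rank (enum bucket_kind kind, unsigned number)
{
  switch (kind)
    {
    case BUCKET_HOT:
      return 0;
    case BUCKET_NUMERIC:
      assert (number <= MAX_NUMERIC_BUCKET);
      return 1ul + number;
    case BUCKET_NONE:
      return MAX_NUMERIC_BUCKET + 2ul;
    case BUCKET_UNLIKELY:
      return MAX_NUMERIC_BUCKET + 3ul;
    }
  return MAX_NUMERIC_BUCKET + 2ul;  /* not reached for valid kinds */
}
```

Note that time_to_bucket needs the latest time across the whole program, which
is the "same scale" problem raised earlier: a single compilation unit cannot
compute it on its own.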

(I can't believe I'm suggesting this -- I don't like the increasing
effect that section names have on the linker behavior, but I don't
think we really have any better options in ELF.)

As an aside, is there any reason why the function name must appear in
the section name when we use -ffunction-sections? ELF doesn't require
sections to have unique names, so they could all be named ".text". We
could do section reordering based on the symbol names rather than the
section names, so it's not necessary for reordering. As far as I can
tell, it's just an assembler limitation, which we could fix by
modifying the syntax of the .section directive to allow both an
assembler name and a linker name. All those function names just bloat
the section string table for no good reason.

-cary


* Re: Group static constructors and destructors in specific subsections
  2013-07-17 19:56               ` Cary Coutant
@ 2013-07-20 13:30                 ` Martin Liška
  2013-11-18  9:20                   ` Martin Liška
  0 siblings, 1 reply; 70+ messages in thread
From: Martin Liška @ 2013-07-20 13:30 UTC (permalink / raw)
  To: Cary Coutant
  Cc: Jan Hubicka, Ian Lance Taylor, gcc-patches, Richard Biener,
	iains, Taras Glek

On 17 July 2013 20:22, Cary Coutant <ccoutant@google.com> wrote:
>>> > Yep, the problem is where to produce the section ordering file.
>>> > The scheme is as follows:
>>> >   - with -fprofile-generate instrument every function entry point and record
>>> >     time of first and last invocation of the function
>>> >   - At compile time we take functions that are executed during the startup
>>> >     and we want to order them in the increasing order of the first invocation
>>> >     time measured at FDO time. So we know the relative position of given function
>>> >     in the program, but not the complete function order.
>>>
>>> Perhaps I misunderstand, but you can use --section-ordering-file
>>> without knowing the complete function ordering.  Just specify the
>>> functions you care about.
>>
>> The thing is that when compiling a given object file, you know only the functions in
>> that object file, so you cannot produce a full --section-ordering-file.  We
>> would need a tool collecting the partial orders from all objects into a single file,
>> which I think may just be done in the linker.
>
> How granular a solution do you need? If you need something fine-grain,
> like microseconds since startup, we'd also need some way of ensuring
> that all compilation units are using the same scale. What if someone
> else wants to order by execution count instead? We could do something
> coarse-grain by adding a few more "buckets" after "unlikely", "exit",
> "startup", and "hot", but you probably would need to see the whole
> program before you could translate something like time-since-startup
> into a bucket.

I am working with Jan on function reordering; our primary motivation is
to reorder all functions called during startup. As a small data point,
Inkscape has about 14K functions, of which 2.5K (~20%) are called
during startup. Beyond that, we would also like to reorder the
remaining functions that are not in that first set (e.g. according to
the call graph edge profile). So it would be nice if we could, in
effect, set an order for all functions.

> In another old thread, I suggested modifying the section naming
> convention to remove the ambiguity between a function named "unlikely"
> compiled with -ffunction-sections, and an arbitrary function placed
> into the "unlikely" bucket. Namely, instead of using
> ".text.function_name" and ".text.bucket", we combine these into
> ".text.bucket.function_name". Without -ffunction-sections, we'd just
> have ".text.bucket" like we do today, but with -ffunction-sections,
> we'd have ".text..function_name" in the case where there is no bucket.
> In order to distinguish between old and new conventions, I'd amend
> that suggestion to use a different set of delimiters -- perhaps
> ".text[bucket](function_name)". That at least makes it more obvious
> that the input section goes into an output section named ".text", and
> we can have a general rule rather than the collection of special cases
> we have now.
>
> To support your use case, we could allow, in addition to the four
> buckets we already have defined, numeric buckets ranging from, say, 0
> to 99999. You could map whatever ordering criterion you want to use
> into that range, and the linker would order the text sections by
> bucket, placing the numbered buckets after "hot" and before all the
> unbucketed sections. I might further suggest moving "unlikely" to the
> end, after all the unbucketed sections.

I would increase the range of these numeric buckets (e.g. Firefox
has more than 10^5 functions).
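
As a rough sketch of how a first-invocation time could be mapped into a
wider bucket range and turned into a section name: the names
time_to_bucket and format_section_name are hypothetical, and the code
assumes every compilation unit agrees on the same time scale (t_max)
and bucket count, which is exactly the open problem Cary points out.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mapping of a first-invocation time T, measured on a
   scale of 0..T_MAX shared by all compilation units, onto one of
   N_BUCKETS numeric buckets.  The mapping is monotonic, so earlier
   functions never get a later bucket.  */
static uint32_t
time_to_bucket (uint32_t t, uint32_t t_max, uint32_t n_buckets)
{
  if (t > t_max)
    t = t_max;
  /* Widen to 64 bits so the product cannot overflow.  */
  return (uint32_t) ((uint64_t) t * n_buckets / ((uint64_t) t_max + 1));
}

/* Hypothetical formatter for the proposed ".text[bucket](function)"
   syntax.  Returns 1 if the whole name fit into BUF, 0 otherwise.  */
static int
format_section_name (char *buf, size_t size, uint32_t bucket,
                     const char *func)
{
  int n = snprintf (buf, size, ".text[%lu](%s)",
                    (unsigned long) bucket, func);

  return n >= 0 && (size_t) n < size;
}
```

With n_buckets set to 1000000, for example, a program with more than
10^5 startup functions could still give most of them distinct buckets,
provided the time scale spreads them out.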

> (I can't believe I'm suggesting this -- I don't like the increasing
> effect that section names have on the linker behavior, but I don't
> think we really have any better options in ELF.)
>
> As an aside, is there any reason why the function name must appear in
> the section name when we use -ffunction-sections? ELF doesn't require
> sections to have unique names, so they could all be named ".text". We
> could do section reordering based on the symbol names rather than the
> section names, so it's not necessary for reordering. As far as I can
> tell, it's just an assembler limitation, which we could fix by
> modifying the syntax of the .section directive to allow both an
> assembler name and a linker name. All those function names just bloat
> the section string table for no good reason.

It's not necessary to produce function sections, but that is a question
for Jan and the other compiler developers. How could symbol renaming
help with function ordering?

Martin

> -cary


* Re: Group static constructors and destructors in specific subsections
  2013-07-20 13:30                 ` Martin Liška
@ 2013-11-18  9:20                   ` Martin Liška
  0 siblings, 0 replies; 70+ messages in thread
From: Martin Liška @ 2013-11-18  9:20 UTC (permalink / raw)
  To: Cary Coutant
  Cc: Jan Hubicka, Ian Lance Taylor, gcc-patches, Richard Biener,
	iains, Taras Glek

Dear Cary,
   I've been merging my patches into GCC mainline, and I would really
appreciate having the new section naming convention you suggested in
the previous post. Has there been any progress on the implementation?
Should I help with this change and write a patch that introduces the
new section model?

Thank you,
Martin

Thread overview: 70+ messages
2010-10-14 21:09 Group static constructors and destructors in specific subsections Jan Hubicka
2010-10-14 21:10 ` Joseph S. Myers
2010-10-14 21:12   ` Richard Henderson
2010-10-14 21:18     ` Jan Hubicka
2010-10-14 21:21       ` Richard Henderson
2010-10-14 22:02         ` Jan Hubicka
2010-10-14 22:39           ` Richard Henderson
2010-10-14 21:48     ` Joseph S. Myers
2010-10-23 16:01     ` Jan Hubicka
2010-11-10  4:59     ` Group static constructors and destructors in specific subsections, take 2 Jan Hubicka
2010-11-10 14:16       ` Jack Howarth
2010-11-10 14:56         ` Ian Lance Taylor
2010-11-10 15:04           ` Jack Howarth
2010-11-10 15:29             ` Jan Hubicka
2010-11-10 16:54               ` Jack Howarth
2010-11-10 17:10                 ` Jan Hubicka
2010-11-10 17:16                   ` Jan Hubicka
2010-11-10 22:26                     ` Jack Howarth
2010-11-10 22:57                       ` Jan Hubicka
2010-11-11  1:55                         ` Jack Howarth
2010-11-11  2:53                           ` Dave Korn
2010-11-11  5:24                             ` Jack Howarth
2010-11-11 12:52                         ` Jack Howarth
2010-11-11 16:28                           ` Jan Hubicka
2010-11-10 17:49                   ` Jack Howarth
2010-11-11 18:57                     ` Ralf Wildenhues
2010-11-11 21:41                       ` Dave Korn
2010-11-11 23:04                         ` Dave Korn
2010-11-12 19:27                           ` Ralf Wildenhues
2010-11-12 20:40                             ` Dave Korn
2010-11-10 18:59       ` Richard Henderson
2010-11-10 19:44         ` Jan Hubicka
2010-11-21  1:13         ` Jan Hubicka
2010-11-21  3:31           ` Jan Hubicka
2010-11-21 18:49             ` Jack Howarth
2010-11-22 18:33             ` Richard Henderson
2010-11-23 16:48               ` Jan Hubicka
2010-11-23 22:19             ` H.J. Lu
2010-11-24  1:00               ` Jan Hubicka
2010-11-24 18:18                 ` Jan Hubicka
2010-11-26 19:19                   ` Richard Henderson
2010-11-27 23:17                     ` Jan Hubicka
2010-11-26 18:45               ` H.J. Lu
2011-04-21 23:28             ` Mike Stump
2010-10-14 21:17   ` Group static constructors and destructors in specific subsections Jan Hubicka
2010-10-14 23:53 ` Mike Stump
2010-10-15  0:05 ` Mark Mitchell
2010-10-15  0:33   ` Jan Hubicka
2010-10-15  7:24 ` Mike Stump
2010-10-15 12:14   ` Jan Hubicka
2010-10-15 14:05 ` Paolo Carlini
2010-10-15 16:54   ` Jan Hubicka
2010-10-15 17:15     ` Paolo Carlini
2010-10-16 18:39       ` Jan Hubicka
2010-10-15 14:36 ` Jack Howarth
2010-10-15 16:14   ` Jan Hubicka
2010-10-21 18:02 ` Cary Coutant
2010-10-21 19:43   ` Ian Lance Taylor
2010-10-21 21:00     ` Mike Stump
2010-10-21 22:32       ` Jan Hubicka
2010-10-21 22:14     ` Jan Hubicka
2013-07-15 17:46   ` Cary Coutant
2013-07-15 18:08     ` Jan Hubicka
2013-07-15 18:18       ` Ian Lance Taylor
2013-07-15 18:26         ` Jan Hubicka
2013-07-15 22:45           ` Ian Lance Taylor
2013-07-17  5:27             ` Jan Hubicka
2013-07-17 19:56               ` Cary Coutant
2013-07-20 13:30                 ` Martin Liška
2013-11-18  9:20                   ` Martin Liška