public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] Fix PR30358, performance with --sort-section
@ 2023-04-26 14:16 Michael Matz
  2023-04-26 23:43 ` Alan Modra
  0 siblings, 1 reply; 2+ messages in thread
From: Michael Matz @ 2023-04-26 14:16 UTC (permalink / raw)
  To: binutils

since af31506c we only use the binary tree when section sorting is
required.  While its unbalanced and hence can degrade to a linear list
it should otherwise have been equivalent to the old code relying on
insertion sort.  Unfortunately it was not.  The old code directly used
lang_add_section to populate the sorted list, the new code first
populates the tree and only then does lang_add_section on the sorted
result.

In the testcase we have very many linkonce section groups, and hence
lang_add_section won't actually insert anything for most of them.  That
limited the to-be-sorted list length previously.  The tree-sorting code
OTOH first created a tree of all candidates sections, including those
that wouldn't be inserted by lang_add_section, hence increasing the size
of the sorting problem.  In the testcase the chain length went from
about 1500 to 106000, and in the degenerated case (as in the testcase)
that goes in quadratically.

This splits out most of the early-out code from lang_add_section to its
own function and uses the latter to avoid inserting into the tree.  This
refactoring slightly changes the order of early-out tests (the ones
based on section flags is now done last, and only in lang_add_section).
The new function is not a pure predicate: it can give warnings and it
might change output_section, like the old early-out code did.  I have
also added a skip-warning case in the first discard case, whose
non-existence seemed to have been an oversight.

	PR 30358
	* ldlang.c (wont_add_section_p): Split out from ...
	(lang_add_section): ... here.
	(output_section_callback_sort): Use wont_add_section_p to not
	always add sections to the sort tree.
---

Of course it would also be nice if the sort tree would be balanced (or 
some other non-O(n^2) sorting structure be used).  But even then it's 
better to not even have to sort stuff that's not going to be included in
the link.  Either way, until such time this patch restores performance in
the degenerate cases.  Tested on 158 targets without regressions.  Okay
for master?

---
 ld/ldlang.c | 92 +++++++++++++++++++++++++++++++++++------------------
 1 file changed, 61 insertions(+), 31 deletions(-)

diff --git a/ld/ldlang.c b/ld/ldlang.c
index 9ad405aa06c..aa01c81e661 100644
--- a/ld/ldlang.c
+++ b/ld/ldlang.c
@@ -80,6 +80,8 @@ static unsigned int opb_shift = 0;
 /* Forward declarations.  */
 static void exp_init_os (etree_type *);
 static lang_input_statement_type *lookup_name (const char *);
+static bool wont_add_section_p (asection *,
+				lang_output_section_statement_type *);
 static void insert_undefined (const char *);
 static bool sort_def_symbol (struct bfd_link_hash_entry *, void *);
 static lang_statement_union_type *new_statement (enum statement_enum type,
@@ -687,6 +689,11 @@ output_section_callback_sort (lang_wild_statement_type *ptr,
   if (unique_section_p (section, os))
     return;
 
+  /* Don't add sections to the tree when we already know that
+     lang_add_section won't do anything with it.  */
+  if (wont_add_section_p (section, os))
+    return;
+
   node = (lang_section_bst_type *) xmalloc (sizeof (lang_section_bst_type));
   node->left = 0;
   node->right = 0;
@@ -2514,27 +2521,20 @@ lang_discard_section_p (asection *section)
   return discard;
 }
 
-/* The wild routines.
-
-   These expand statements like *(.text) and foo.o to a list of
-   explicit actions, like foo.o(.text), bar.o(.text) and
-   foo.o(.text, .data).  */
+/* Return TRUE if SECTION is never going to be added to output statement
+   OUTPUT.  lang_add_section() definitely won't do anything with SECTION
+   if this returns TRUE.  It may do something (or not) if this returns FALSE.
 
-/* Add SECTION to the output section OUTPUT.  Do this by creating a
-   lang_input_section statement which is placed at PTR.  */
+   Can be used as early-out to filter matches.  This may set
+   output_section of SECTION, if it was unset, to the abs section in case
+   we discover SECTION to be always discarded.  This may also give
+   warning messages.  */
 
-void
-lang_add_section (lang_statement_list_type *ptr,
-		  asection *section,
-		  struct wildcard_list *pattern,
-		  struct flag_info *sflag_info,
-		  lang_output_section_statement_type *output)
+static bool
+wont_add_section_p (asection *section,
+		    lang_output_section_statement_type *output)
 {
-  flagword flags = section->flags;
-
   bool discard;
-  lang_input_section_type *new_section;
-  bfd *abfd = link_info.output_bfd;
 
   /* Is this section one we know should be discarded?  */
   discard = lang_discard_section_p (section);
@@ -2548,41 +2548,35 @@ lang_add_section (lang_statement_list_type *ptr,
     {
       if (section->output_section == NULL)
 	{
-	  /* This prevents future calls from assigning this section.  */
+	  /* This prevents future calls from assigning this section or
+	     warning about it again.  */
 	  section->output_section = bfd_abs_section_ptr;
 	}
+      else if (bfd_is_abs_section (section->output_section))
+	;
       else if (link_info.non_contiguous_regions_warnings)
 	einfo (_("%P:%pS: warning: --enable-non-contiguous-regions makes "
 		 "section `%pA' from `%pB' match /DISCARD/ clause.\n"),
 	       NULL, section, section->owner);
 
-      return;
-    }
-
-  if (sflag_info)
-    {
-      bool keep;
-
-      keep = bfd_lookup_section_flags (&link_info, sflag_info, section);
-      if (!keep)
-	return;
+      return true;
     }
 
   if (section->output_section != NULL)
     {
       if (!link_info.non_contiguous_regions)
-	return;
+	return true;
 
       /* SECTION has already been handled in a special way
 	 (eg. LINK_ONCE): skip it.  */
       if (bfd_is_abs_section (section->output_section))
-	return;
+	return true;
 
       /* Already assigned to the same output section, do not process
 	 it again, to avoid creating loops between duplicate sections
 	 later.  */
       if (section->output_section == output->bfd_section)
-	return;
+	return true;
 
       if (link_info.non_contiguous_regions_warnings && output->bfd_section)
 	einfo (_("%P:%pS: warning: --enable-non-contiguous-regions may "
@@ -2597,6 +2591,42 @@ lang_add_section (lang_statement_list_type *ptr,
 	 size_input_section as appropriate.  */
     }
 
+  return false;
+}
+
+/* The wild routines.
+
+   These expand statements like *(.text) and foo.o to a list of
+   explicit actions, like foo.o(.text), bar.o(.text) and
+   foo.o(.text, .data).  */
+
+/* Add SECTION to the output section OUTPUT.  Do this by creating a
+   lang_input_section statement which is placed at PTR.  */
+
+void
+lang_add_section (lang_statement_list_type *ptr,
+		  asection *section,
+		  struct wildcard_list *pattern,
+		  struct flag_info *sflag_info,
+		  lang_output_section_statement_type *output)
+{
+  flagword flags = section->flags;
+
+  lang_input_section_type *new_section;
+  bfd *abfd = link_info.output_bfd;
+
+  if (wont_add_section_p (section, output))
+    return;
+
+  if (sflag_info)
+    {
+      bool keep;
+
+      keep = bfd_lookup_section_flags (&link_info, sflag_info, section);
+      if (!keep)
+	return;
+    }
+
   /* We don't copy the SEC_NEVER_LOAD flag from an input section
      to an output section, because we want to be able to include a
      SEC_NEVER_LOAD section in the middle of an otherwise loaded
-- 
2.39.1

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: [PATCH] Fix PR30358, performance with --sort-section
  2023-04-26 14:16 [PATCH] Fix PR30358, performance with --sort-section Michael Matz
@ 2023-04-26 23:43 ` Alan Modra
  0 siblings, 0 replies; 2+ messages in thread
From: Alan Modra @ 2023-04-26 23:43 UTC (permalink / raw)
  To: Michael Matz; +Cc: binutils

On Wed, Apr 26, 2023 at 02:16:55PM +0000, Michael Matz via Binutils wrote:
> 	PR 30358
> 	* ldlang.c (wont_add_section_p): Split out from ...
> 	(lang_add_section): ... here.
> 	(output_section_callback_sort): Use wont_add_section_p to not
> 	always add sections to the sort tree.

OK, thanks!

-- 
Alan Modra
Australia Development Lab, IBM

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2023-04-26 23:43 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-26 14:16 [PATCH] Fix PR30358, performance with --sort-section Michael Matz
2023-04-26 23:43 ` Alan Modra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).