public inbox for binutils@sourceware.org
 help / color / mirror / Atom feed
* [RFC] Pointer alignment issues in ld's map_input_to_output_sections
@ 2006-06-08 23:54 Roger Sayle
  2006-06-09  0:04 ` Alan Modra
  0 siblings, 1 reply; 3+ messages in thread
From: Roger Sayle @ 2006-06-08 23:54 UTC (permalink / raw)
  To: binutils


GNU ld, both 2.16.1 and mainline CVS, currently dump core immediately
at start up on mips-sgi-irix6.5 (when compiled with gcc 3.4 and gcc 4.1).
The point of bus error is the map_input_to_output_sections:1312

  s->output_section_statement.all_input_readonly = TRUE;

After much investigation, the problem appears to be that "s" which is
of type lang_statement_union_type* is assumed to be 64-bit aligned
(as the union lang_statement_union_type requires 64-bit alignment),
even though in this instance it's pointing to a
lang_output_section_statement_type which is only 32-bit aligned.
The resulting misaligned double-word load in the bitfield assignment
causes MIPS to fault (in GCC terms a STRICT_ALIGNMENT target).

The reason that "s" is misaligned is that the source of this pointer
is the function output_statement_newfunc, which allocates a new
"struct output_statement_hash_entry" containing a bfd_hash_entry field
(root) followed by a lang_output_section_statement_type (os).  It is
a pointer to this "os" field that gets returned and ultimately
assigned to a pointer of lang_statement_union_type.


After much discussion on the #gcc IRC, the consensus seems to be that
the code in ldlang.c is badly broken, if not undefined, on all platforms,
but unfortunately no-one has any clever suggestions on how to fix things.

One suggestion was to change the output_statement_hash_entry struct
to make "os" be a lang_statement_union_type instead of a
lang_output_section_statement_type, but unfortunately this would
mean increasing the size of each hash table entry to the largest
member of that union.

Another suggestion was to swap the order of fields in
output_statement_hash_entry so that the 96 byte "os" comes before
the 4-byte "struct bfd_hash_entry root", but it was pointed out
pointers to this type are also cast to bfd_hash_entry.  So the
elements of output_statement_hash_entry are "derived" from two
different types (multiple inheritance), with each field having
differing alignment constraints.

I've no clue how to sort this out.  If portability wasn't an issue
one might require the use of GCC's alignment directives to overcome
the undefinedness of the code as written.  However, one "workaround"
that appears to hide the problem, is implemented by the patch
attached below, which casts the "s" which is assumed to be 64-bit
aligned, to a "lang_output_section_statement_type*" which is
sufficient to confuse GCC's optimizers into treating the problematic
structure as only 32-bit aligned.  It's unclear if this is a GCC
deficiency that may be addressed in future.


I just thought I'd at least post this bug report and workaround
to see what the bright binutils folks make of the problem.

Thoughts?



2006-06-08  Roger Sayle  <roger@eyesopen.com>

	* ldlang.c (map_input_to_output_sections): Refer to the
	output_section_statement field of "s" via a pointer of the
	appropriate type to avoid potential alignment problems.


Index: ldlang.c
===================================================================
RCS file: /cvs/src/src/ld/ldlang.c,v
retrieving revision 1.222
diff -c -3 -p -r1.222 ldlang.c
*** ldlang.c	7 Jun 2006 04:55:11 -0000	1.222
--- ldlang.c	8 Jun 2006 20:56:35 -0000
*************** map_input_to_output_sections
*** 3091,3096 ****
--- 3091,3098 ----
    (lang_statement_union_type *s, const char *target,
     lang_output_section_statement_type *os)
  {
+   lang_output_section_statement_type *sos;
+
    for (; s != NULL; s = s->header.next)
      {
        switch (s->header.type)
*************** map_input_to_output_sections
*** 3104,3130 ****
  					os);
  	  break;
  	case lang_output_section_statement_enum:
! 	  if (s->output_section_statement.constraint)
  	    {
! 	      if (s->output_section_statement.constraint != ONLY_IF_RW
! 		  && s->output_section_statement.constraint != ONLY_IF_RO)
  		break;
! 	      s->output_section_statement.all_input_readonly = TRUE;
! 	      check_input_sections (s->output_section_statement.children.head,
! 				    &s->output_section_statement);
! 	      if ((s->output_section_statement.all_input_readonly
! 		   && s->output_section_statement.constraint == ONLY_IF_RW)
! 		  || (!s->output_section_statement.all_input_readonly
! 		      && s->output_section_statement.constraint == ONLY_IF_RO))
  		{
! 		  s->output_section_statement.constraint = -1;
  		  break;
  		}
  	    }

! 	  map_input_to_output_sections (s->output_section_statement.children.head,
! 					target,
! 					&s->output_section_statement);
  	  break;
  	case lang_output_statement_enum:
  	  break;
--- 3106,3133 ----
  					os);
  	  break;
  	case lang_output_section_statement_enum:
! 	  /* The lang_statement_union_type *S may have stricter alignment
! 	     constraints than the lang_output_section_statement_type that
! 	     S actually points to.  */
! 	  sos = &s->output_section_statement;
! 	  if (sos->constraint)
  	    {
! 	      if (sos->constraint != ONLY_IF_RW
! 		  && sos->constraint != ONLY_IF_RO)
  		break;
! 	      sos->all_input_readonly = TRUE;
! 	      check_input_sections (sos->children.head, sos);
! 	      if ((sos->all_input_readonly
! 		   && sos->constraint == ONLY_IF_RW)
! 		  || (!sos->all_input_readonly
! 		      && sos->constraint == ONLY_IF_RO))
  		{
! 		  sos->constraint = -1;
  		  break;
  		}
  	    }

! 	  map_input_to_output_sections (sos->children.head, target, sos);
  	  break;
  	case lang_output_statement_enum:
  	  break;


Roger
--
Roger Sayle,                         E-mail: roger@eyesopen.com
OpenEye Scientific Software,         WWW: http://www.eyesopen.com/
Suite 1107, 3600 Cerrillos Road,     Tel: (+1) 505-473-7385
Santa Fe, New Mexico, 87507.         Fax: (+1) 505-473-0833

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RFC] Pointer alignment issues in ld's map_input_to_output_sections
  2006-06-08 23:54 [RFC] Pointer alignment issues in ld's map_input_to_output_sections Roger Sayle
@ 2006-06-09  0:04 ` Alan Modra
  2006-06-09  3:40   ` Alan Modra
  0 siblings, 1 reply; 3+ messages in thread
From: Alan Modra @ 2006-06-09  0:04 UTC (permalink / raw)
  To: Roger Sayle; +Cc: binutils

On Thu, Jun 08, 2006 at 05:16:37PM -0600, Roger Sayle wrote:
> One suggestion was to change the output_statement_hash_entry struct
> to make "os" be a lang_statement_union_type instead of a
> lang_output_section_statement_type, but unfortunately this would
> mean increasing the size of each hash table entry to the largest
> member of that union.

This is the right solution.  lang_output_section_statement_type is one
of the largest elements of lang_statement_union_type.  It might even be
the largest.

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [RFC] Pointer alignment issues in ld's map_input_to_output_sections
  2006-06-09  0:04 ` Alan Modra
@ 2006-06-09  3:40   ` Alan Modra
  0 siblings, 0 replies; 3+ messages in thread
From: Alan Modra @ 2006-06-09  3:40 UTC (permalink / raw)
  To: Roger Sayle, binutils

On Fri, Jun 09, 2006 at 09:31:15AM +0930, Alan Modra wrote:
> On Thu, Jun 08, 2006 at 05:16:37PM -0600, Roger Sayle wrote:
> > One suggestion was to change the output_statement_hash_entry struct
> > to make "os" be a lang_statement_union_type instead of a
> > lang_output_section_statement_type, but unfortunately this would
> > mean increasing the size of each hash table entry to the largest
> > member of that union.
> 
> This is the right solution.  lang_output_section_statement_type is one
> of the largest elements of lang_statement_union_type.  It might even be
> the largest.

This does some renaming too, because we have lang_output_statement_type
as well as lang_output_section_statement_type.  So using
output_statement_* in function names and structs that actually deal with
lang_output_section_statement_type's is misleading.  Oh, and packing
lang_input_statement_type makes it smaller than
lang_output_section_statement_type.

Applied mainline and 2.17.

	* ldlang.h (lang_input_statement_type): Use bitfields for booleans.
	* ldlang.c (struct out_section_hash_entry): Rename from
	output_statement_hash_entry.  Delete output_section_statement_type
	entry.  Add statement_union_type entry.  Adjust all users.
	(output_section_statement_table): Rename from output_statement_table.
	Adjust all users.
	(output_section_statement_newfunc): Rename from
	output_statement_newfunc.  Adjust all users.
	(output_section_statement_table_init): Rename from
	output_statement_table_init.  Adjust all users.
	(output_section_statement_table_free): Rename from
	output_statement_table_free.  Adjust all users.

Index: ld/ldlang.c
===================================================================
RCS file: /cvs/src/src/ld/ldlang.c,v
retrieving revision 1.222
diff -u -p -r1.222 ldlang.c
--- ld/ldlang.c	7 Jun 2006 04:55:11 -0000	1.222
+++ ld/ldlang.c	9 Jun 2006 00:57:50 -0000
@@ -869,26 +869,26 @@ lang_add_input_file (const char *name,
   return new_afile (name, file_type, target, TRUE);
 }
 
-struct output_statement_hash_entry
+struct out_section_hash_entry
 {
   struct bfd_hash_entry root;
-  lang_output_section_statement_type os;
+  lang_statement_union_type s;
 };
 
 /* The hash table.  */
 
-static struct bfd_hash_table output_statement_table;
+static struct bfd_hash_table output_section_statement_table;
 
 /* Support routines for the hash table used by lang_output_section_find,
    initialize the table, fill in an entry and remove the table.  */
 
 static struct bfd_hash_entry *
-output_statement_newfunc (struct bfd_hash_entry *entry, 
-			  struct bfd_hash_table *table,
-			  const char *string)
+output_section_statement_newfunc (struct bfd_hash_entry *entry, 
+				  struct bfd_hash_table *table,
+				  const char *string)
 {
   lang_output_section_statement_type **nextp;
-  struct output_statement_hash_entry *ret;
+  struct out_section_hash_entry *ret;
 
   if (entry == NULL)
     {
@@ -901,49 +901,48 @@ output_statement_newfunc (struct bfd_has
   if (entry == NULL)
     return entry;
 
-  ret = (struct output_statement_hash_entry *) entry;
-  memset (&ret->os, 0, sizeof (ret->os));
-  ret->os.header.type = lang_output_section_statement_enum;
-  ret->os.subsection_alignment = -1;
-  ret->os.section_alignment = -1;
-  ret->os.block_value = 1;
-  lang_list_init (&ret->os.children);
-  lang_statement_append (stat_ptr,
-			 (lang_statement_union_type *) &ret->os,
-			 &ret->os.header.next);
+  ret = (struct out_section_hash_entry *) entry;
+  memset (&ret->s, 0, sizeof (ret->s));
+  ret->s.header.type = lang_output_section_statement_enum;
+  ret->s.output_section_statement.subsection_alignment = -1;
+  ret->s.output_section_statement.section_alignment = -1;
+  ret->s.output_section_statement.block_value = 1;
+  lang_list_init (&ret->s.output_section_statement.children);
+  lang_statement_append (stat_ptr, &ret->s, &ret->s.header.next);
 
   /* For every output section statement added to the list, except the
      first one, lang_output_section_statement.tail points to the "next"
      field of the last element of the list.  */
   if (lang_output_section_statement.head != NULL)
-    ret->os.prev = (lang_output_section_statement_type *)
-      ((char *) lang_output_section_statement.tail
-       - offsetof (lang_output_section_statement_type, next));
+    ret->s.output_section_statement.prev
+      = ((lang_output_section_statement_type *)
+	 ((char *) lang_output_section_statement.tail
+	  - offsetof (lang_output_section_statement_type, next)));
 
   /* GCC's strict aliasing rules prevent us from just casting the
      address, so we store the pointer in a variable and cast that
      instead.  */
-  nextp = &ret->os.next;
+  nextp = &ret->s.output_section_statement.next;
   lang_statement_append (&lang_output_section_statement,
-			 (lang_statement_union_type *) &ret->os,
+			 &ret->s,
 			 (lang_statement_union_type **) nextp);
   return &ret->root;
 }
 
 static void
-output_statement_table_init (void)
+output_section_statement_table_init (void)
 {
-  if (!bfd_hash_table_init_n (&output_statement_table,
-			      output_statement_newfunc,
-			      sizeof (struct output_statement_hash_entry),
+  if (!bfd_hash_table_init_n (&output_section_statement_table,
+			      output_section_statement_newfunc,
+			      sizeof (struct out_section_hash_entry),
 			      61))
     einfo (_("%P%F: can not create hash table: %E\n"));
 }
 
 static void
-output_statement_table_free (void)
+output_section_statement_table_free (void)
 {
-  bfd_hash_table_free (&output_statement_table);
+  bfd_hash_table_free (&output_section_statement_table);
 }
 
 /* Build enough state so that the parser can build its tree.  */
@@ -955,7 +954,7 @@ lang_init (void)
 
   stat_ptr = &statement_list;
 
-  output_statement_table_init ();
+  output_section_statement_table_init ();
 
   lang_list_init (stat_ptr);
 
@@ -985,7 +984,7 @@ lang_init (void)
 void
 lang_finish (void)
 {
-  output_statement_table_free ();
+  output_section_statement_table_free ();
 }
 
 /*----------------------------------------------------------------------
@@ -1072,24 +1071,25 @@ lang_memory_default (asection *section)
 lang_output_section_statement_type *
 lang_output_section_find (const char *const name)
 {
-  struct output_statement_hash_entry *entry;
+  struct out_section_hash_entry *entry;
   unsigned long hash;
 
-  entry = ((struct output_statement_hash_entry *)
-	   bfd_hash_lookup (&output_statement_table, name, FALSE, FALSE));
+  entry = ((struct out_section_hash_entry *)
+	   bfd_hash_lookup (&output_section_statement_table, name,
+			    FALSE, FALSE));
   if (entry == NULL)
     return NULL;
 
   hash = entry->root.hash;
   do
     {
-      if (entry->os.constraint != -1)
-	return &entry->os;
-      entry = (struct output_statement_hash_entry *) entry->root.next;
+      if (entry->s.output_section_statement.constraint != -1)
+	return &entry->s.output_section_statement;
+      entry = (struct out_section_hash_entry *) entry->root.next;
     }
   while (entry != NULL
 	 && entry->root.hash == hash
-	 && strcmp (name, entry->os.name) == 0);
+	 && strcmp (name, entry->s.output_section_statement.name) == 0);
 
   return NULL;
 }
@@ -1097,39 +1097,43 @@ lang_output_section_find (const char *co
 static lang_output_section_statement_type *
 lang_output_section_statement_lookup_1 (const char *const name, int constraint)
 {
-  struct output_statement_hash_entry *entry;
-  struct output_statement_hash_entry *last_ent;
+  struct out_section_hash_entry *entry;
+  struct out_section_hash_entry *last_ent;
   unsigned long hash;
 
-  entry = ((struct output_statement_hash_entry *)
-	   bfd_hash_lookup (&output_statement_table, name, TRUE, FALSE));
+  entry = ((struct out_section_hash_entry *)
+	   bfd_hash_lookup (&output_section_statement_table, name,
+			    TRUE, FALSE));
   if (entry == NULL)
     {
       einfo (_("%P%F: failed creating section `%s': %E\n"), name);
       return NULL;
     }
 
-  if (entry->os.name != NULL)
+  if (entry->s.output_section_statement.name != NULL)
     {
       /* We have a section of this name, but it might not have the correct
 	 constraint.  */
       hash = entry->root.hash;
       do
 	{
-	  if (entry->os.constraint != -1
+	  if (entry->s.output_section_statement.constraint != -1
 	      && (constraint == 0
-		  || (constraint == entry->os.constraint
+		  || (constraint == entry->s.output_section_statement.constraint
 		      && constraint != SPECIAL)))
-	    return &entry->os;
+	    return &entry->s.output_section_statement;
 	  last_ent = entry;
-	  entry = (struct output_statement_hash_entry *) entry->root.next;
+	  entry = (struct out_section_hash_entry *) entry->root.next;
 	}
       while (entry != NULL
 	     && entry->root.hash == hash
-	     && strcmp (name, entry->os.name) == 0);
+	     && strcmp (name, entry->s.output_section_statement.name) == 0);
 
-      entry = ((struct output_statement_hash_entry *)
-	       output_statement_newfunc (NULL, &output_statement_table, name));
+      entry
+	= ((struct out_section_hash_entry *)
+	   output_section_statement_newfunc (NULL,
+					     &output_section_statement_table,
+					     name));
       if (entry == NULL)
 	{
 	  einfo (_("%P%F: failed creating section `%s': %E\n"), name);
@@ -1139,9 +1143,9 @@ lang_output_section_statement_lookup_1 (
       last_ent->root.next = &entry->root;
     }
 
-  entry->os.name = name;
-  entry->os.constraint = constraint;
-  return &entry->os;
+  entry->s.output_section_statement.name = name;
+  entry->s.output_section_statement.constraint = constraint;
+  return &entry->s.output_section_statement;
 }
 
 lang_output_section_statement_type *
Index: ld/ldlang.h
===================================================================
RCS file: /cvs/src/src/ld/ldlang.h,v
retrieving revision 1.61
diff -u -p -r1.61 ldlang.h
--- ld/ldlang.h	7 Jun 2006 04:55:11 -0000	1.61
+++ ld/ldlang.h	9 Jun 2006 00:57:50 -0000
@@ -228,7 +228,6 @@ typedef struct lang_input_statement_stru
 
   bfd *the_bfd;
 
-  bfd_boolean closed;
   file_ptr passive_position;
 
   /* Symbol table of the file.  */
@@ -242,40 +241,42 @@ typedef struct lang_input_statement_stru
   /* Point to the next file, but skips archive contents.  */
   union lang_statement_union *next_real_file;
 
-  bfd_boolean is_archive;
+  const char *target;
+
+  unsigned int closed : 1;
+  unsigned int is_archive : 1;
 
   /* 1 means search a set of directories for this file.  */
-  bfd_boolean search_dirs_flag;
+  unsigned int search_dirs_flag : 1;
 
   /* 1 means this was found in a search directory marked as sysrooted,
      if search_dirs_flag is false, otherwise, that it should be
      searched in ld_sysroot before any other location, as long as it
      starts with a slash.  */
-  bfd_boolean sysrooted;
+  unsigned int sysrooted : 1;
 
   /* 1 means this is base file of incremental load.
      Do not load this file's text or data.
      Also default text_start to after this file's bss.  */
-  bfd_boolean just_syms_flag;
+  unsigned int just_syms_flag : 1;
 
   /* Whether to search for this entry as a dynamic archive.  */
-  bfd_boolean dynamic;
+  unsigned int dynamic : 1;
 
   /* Whether DT_NEEDED tags should be added for dynamic libraries in
      DT_NEEDED tags from this entry.  */
-  bfd_boolean add_needed;
+  unsigned int add_needed : 1;
 
   /* Whether this entry should cause a DT_NEEDED tag only when
      satisfying references from regular files, or always.  */
-  bfd_boolean as_needed;
+  unsigned int as_needed : 1;
 
   /* Whether to include the entire contents of an archive.  */
-  bfd_boolean whole_archive;
+  unsigned int whole_archive : 1;
 
-  bfd_boolean loaded;
+  unsigned int loaded : 1;
 
-  const char *target;
-  bfd_boolean real;
+  unsigned int real : 1;
 } lang_input_statement_type;
 
 typedef struct

-- 
Alan Modra
IBM OzLabs - Linux Technology Centre

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2006-06-09  1:16 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-06-08 23:54 [RFC] Pointer alignment issues in ld's map_input_to_output_sections Roger Sayle
2006-06-09  0:04 ` Alan Modra
2006-06-09  3:40   ` Alan Modra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).