* gold regression vs BFD ld: __ehdr_start @ 2014-03-30 1:15 Roland McGrath 2014-03-31 19:19 ` Ian Lance Taylor 0 siblings, 1 reply; 19+ messages in thread From: Roland McGrath @ 2014-03-30 1:15 UTC (permalink / raw) To: binutils; +Cc: Roland McGrath With trunk on x86_64-linux-gnu: $ cat foo.s .text .globl _start _start: lea __ehdr_start(%rip), %r11 $ ./gas/as-new -o foo.o foo.s $ ./ld/ld-new -shared -o foo.so foo.o $ readelf -rs foo.so There are no relocations in this file. Symbol table '.dynsym' contains 6 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 00000000000001c1 0 SECTION LOCAL DEFAULT 4 2: 00000000000001c1 0 NOTYPE GLOBAL DEFAULT 4 _start 3: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 __bss_start 4: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 _edata 5: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 _end Symbol table '.symtab' contains 14 entries: Num: Value Size Type Bind Vis Ndx Name 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 00000000000000e8 0 SECTION LOCAL DEFAULT 1 2: 0000000000000118 0 SECTION LOCAL DEFAULT 2 3: 00000000000001a8 0 SECTION LOCAL DEFAULT 3 4: 00000000000001c1 0 SECTION LOCAL DEFAULT 4 5: 00000000000001c8 0 SECTION LOCAL DEFAULT 5 6: 00000000002001c8 0 SECTION LOCAL DEFAULT 6 7: 0000000000000000 0 NOTYPE LOCAL DEFAULT 1 __ehdr_start 8: 00000000002001c8 0 OBJECT LOCAL DEFAULT 6 _DYNAMIC 9: 0000000000000000 0 OBJECT LOCAL DEFAULT 6 _GLOBAL_OFFSET_TABLE_ 10: 00000000000001c1 0 NOTYPE GLOBAL DEFAULT 4 _start 11: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 __bss_start 12: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 _edata 13: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 _end $ ./gold/ld-new -shared -o foo.so foo.o ./gold/ld-new: error: foo.o: requires dynamic R_X86_64_PC32 reloc against '__ehdr_start' which may overflow at runtime; recompile with -fPIC [Exit 1] $ I think when __ehdr_start was first added to BFD ld, it may have behaved this way too. But it works properly now. The key issue is that when the __ehdr_start symbol is in its indeterminate state, it still needs to be understood to be STV_HIDDEN so that it doesn't generate dynamic relocs. When it is eventually found that it can be defined and it is--a decision that can only be made after layout--then it will be defined as STV_HIDDEN, so it should always get static resolution. If at the end of the way it can't be defined (layout with no readable segment covering the file headers), then it is probably fine to get an error rather than generating the dynamic relocs that would be normal for any random undefined symbol--albeit, that is a regression relative to pre-2.23 linkers when users happen to use this once mundane, now magic symbol name. (I'm not sure any more if 2.24/trunk BFD ld has that regression against its own pre-2.23 behavior or not.) I think what has to happen is something analogous to what BFD ld does now: Add the symbol early on in an undefined or indeterminate state, and then define it (or don't) later on. It's not very clear to me how to do that correctly in gold. It would seem clearest if it can be done in define_standard_symbols. But I don't know what Layout::finalize (where it's created now) should do to find it and adjust it to point at the right segment. I also don't know how either case would interact with a user defining (in input or script) this symbol name himself, in which case (IIRC the BFD ld behavior) the user's symbol should be wholly unmolested (but it might be that BFD ld uses the user's value but marks it STV_HIDDEN anyway--I'm not sure). This is yet another time I've wished we had a common test suite for both linkers to verify matching behavior. Of course I understand why output doesn't match byte-for-byte and why the whole style of the existing ld testsuite doesn't mesh well with a more semantically-precise testing methodology. But still, harumph. Thanks, Roland ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-03-30 1:15 gold regression vs BFD ld: __ehdr_start Roland McGrath @ 2014-03-31 19:19 ` Ian Lance Taylor 2014-03-31 21:03 ` Cary Coutant 0 siblings, 1 reply; 19+ messages in thread From: Ian Lance Taylor @ 2014-03-31 19:19 UTC (permalink / raw) To: Roland McGrath; +Cc: Binutils, Roland McGrath On Sat, Mar 29, 2014 at 6:15 PM, Roland McGrath <roland@hack.frob.com> wrote: > With trunk on x86_64-linux-gnu: > > $ cat foo.s > .text > .globl _start > _start: > lea __ehdr_start(%rip), %r11 > $ ./gas/as-new -o foo.o foo.s > $ ./ld/ld-new -shared -o foo.so foo.o > $ readelf -rs foo.so > > There are no relocations in this file. > > Symbol table '.dynsym' contains 6 entries: > Num: Value Size Type Bind Vis Ndx Name > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > 1: 00000000000001c1 0 SECTION LOCAL DEFAULT 4 > 2: 00000000000001c1 0 NOTYPE GLOBAL DEFAULT 4 _start > 3: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 __bss_start > 4: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 _edata > 5: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 _end > > Symbol table '.symtab' contains 14 entries: > Num: Value Size Type Bind Vis Ndx Name > 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND > 1: 00000000000000e8 0 SECTION LOCAL DEFAULT 1 > 2: 0000000000000118 0 SECTION LOCAL DEFAULT 2 > 3: 00000000000001a8 0 SECTION LOCAL DEFAULT 3 > 4: 00000000000001c1 0 SECTION LOCAL DEFAULT 4 > 5: 00000000000001c8 0 SECTION LOCAL DEFAULT 5 > 6: 00000000002001c8 0 SECTION LOCAL DEFAULT 6 > 7: 0000000000000000 0 NOTYPE LOCAL DEFAULT 1 __ehdr_start > 8: 00000000002001c8 0 OBJECT LOCAL DEFAULT 6 _DYNAMIC > 9: 0000000000000000 0 OBJECT LOCAL DEFAULT 6 _GLOBAL_OFFSET_TABLE_ > 10: 00000000000001c1 0 NOTYPE GLOBAL DEFAULT 4 _start > 11: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 __bss_start > 12: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 _edata > 13: 0000000000200278 0 NOTYPE GLOBAL DEFAULT 6 _end > $ ./gold/ld-new -shared -o foo.so foo.o > ./gold/ld-new: error: foo.o: requires dynamic R_X86_64_PC32 reloc against '__ehdr_start' which may overflow at runtime; recompile with -fPIC > [Exit 1] > $ > > I think when __ehdr_start was first added to BFD ld, it may have behaved > this way too. But it works properly now. The key issue is that when the > __ehdr_start symbol is in its indeterminate state, it still needs to be > understood to be STV_HIDDEN so that it doesn't generate dynamic relocs. > When it is eventually found that it can be defined and it is--a decision > that can only be made after layout--then it will be defined as STV_HIDDEN, > so it should always get static resolution. If at the end of the way it > can't be defined (layout with no readable segment covering the file > headers), then it is probably fine to get an error rather than generating > the dynamic relocs that would be normal for any random undefined > symbol--albeit, that is a regression relative to pre-2.23 linkers when > users happen to use this once mundane, now magic symbol name. (I'm not > sure any more if 2.24/trunk BFD ld has that regression against its own > pre-2.23 behavior or not.) > > I think what has to happen is something analogous to what BFD ld does now: > Add the symbol early on in an undefined or indeterminate state, and then > define it (or don't) later on. It's not very clear to me how to do that > correctly in gold. It would seem clearest if it can be done in > define_standard_symbols. But I don't know what Layout::finalize (where > it's created now) should do to find it and adjust it to point at the right > segment. I also don't know how either case would interact with a user > defining (in input or script) this symbol name himself, in which case (IIRC > the BFD ld behavior) the user's symbol should be wholly unmolested (but it > might be that BFD ld uses the user's value but marks it STV_HIDDEN > anyway--I'm not sure). I don't know why __ehdr_start is defined in Layout::finalize. As you suggest, it should be defined by define_standard_symbols instead, setting the only_if_ref field to true. If necessary, Layout::finalize can adjust the symbol value using init_output_segment, but only if the symbol is marked as predefined. Layout::finalize can simply look up the symbol by name. Ian ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-03-31 19:19 ` Ian Lance Taylor @ 2014-03-31 21:03 ` Cary Coutant 2014-03-31 21:07 ` Roland McGrath 0 siblings, 1 reply; 19+ messages in thread From: Cary Coutant @ 2014-03-31 21:03 UTC (permalink / raw) To: Ian Lance Taylor; +Cc: Roland McGrath, Binutils, Roland McGrath [-- Attachment #1: Type: text/plain, Size: 1642 bytes --] >> I think what has to happen is something analogous to what BFD ld does now: >> Add the symbol early on in an undefined or indeterminate state, and then >> define it (or don't) later on. It's not very clear to me how to do that >> correctly in gold. It would seem clearest if it can be done in >> define_standard_symbols. But I don't know what Layout::finalize (where >> it's created now) should do to find it and adjust it to point at the right >> segment. I also don't know how either case would interact with a user >> defining (in input or script) this symbol name himself, in which case (IIRC >> the BFD ld behavior) the user's symbol should be wholly unmolested (but it >> might be that BFD ld uses the user's value but marks it STV_HIDDEN >> anyway--I'm not sure). > > I don't know why __ehdr_start is defined in Layout::finalize. As you > suggest, it should be defined by define_standard_symbols instead, > setting the only_if_ref field to true. If necessary, Layout::finalize > can adjust the symbol value using init_output_segment, but only if the > symbol is marked as predefined. Layout::finalize can simply look up > the symbol by name. The attached patch should fix this. Can you give it a try, and if it works, I'll add a testcase and check it in. -cary 2014-03-31 Cary Coutant <ccoutant@google.com> * gold/defstd.cc (in_segment): Define __ehdr_start here... * gold/layout.cc (Layout::finalize): ...Instead of here. Set the output segment when known. * gold/symtab.cc (Symbol::set_output_segment): New function. * gold/symtab.h (Symbol::set_output_segment): New function. [-- Attachment #2: ehdr-start-patch.txt --] [-- Type: text/plain, Size: 3409 bytes --] 2014-03-31 Cary Coutant <ccoutant@google.com> * gold/defstd.cc (in_segment): Define __ehdr_start here... * gold/layout.cc (Layout::finalize): ...Instead of here. Set the output segment when known. * gold/symtab.cc (Symbol::set_output_segment): New function. * gold/symtab.h (Symbol::set_output_segment): New function. diff --git a/gold/defstd.cc b/gold/defstd.cc index a50e75d..cee68a0 100644 --- a/gold/defstd.cc +++ b/gold/defstd.cc @@ -141,6 +141,20 @@ const Define_symbol_in_segment in_segment[] = true // only_if_ref }, { + "__ehdr_start", // name + elfcpp::PT_LOAD, // segment_type + elfcpp::PF(0), // segment_flags_set + elfcpp::PF(0), // segment_flags_clear + 0, // value + 0, // size + elfcpp::STT_NOTYPE, // type + elfcpp::STB_GLOBAL, // binding + elfcpp::STV_HIDDEN, // visibility + 0, // nonvis + Symbol::SEGMENT_START, // offset_from_base + true // only_if_ref + }, + { "etext", // name elfcpp::PT_LOAD, // segment_type elfcpp::PF_X, // segment_flags_set diff --git a/gold/layout.cc b/gold/layout.cc index c96516c..99460cc 100644 --- a/gold/layout.cc +++ b/gold/layout.cc @@ -2743,12 +2743,15 @@ Layout::finalize(const Input_objects* input_objects, Symbol_table* symtab, // If there is a load segment that contains the file and program headers, // provide a symbol __ehdr_start pointing there. // A program can use this to examine itself robustly. - if (load_seg != NULL) - symtab->define_in_output_segment("__ehdr_start", NULL, - Symbol_table::PREDEFINED, load_seg, 0, 0, - elfcpp::STT_NOTYPE, elfcpp::STB_GLOBAL, - elfcpp::STV_HIDDEN, 0, - Symbol::SEGMENT_START, true); + Symbol *ehdr_start = symtab->lookup("__ehdr_start"); + if (ehdr_start != NULL && ehdr_start->is_predefined()) + { + if (load_seg != NULL) + ehdr_start->set_output_segment(load_seg, Symbol::SEGMENT_START); + else + gold_error("program references \"__ehdr_start\"," + " but no load segment contains the ELF header"); + } // Set the file offsets of all the non-data sections we've seen so // far which don't have to wait for the input sections. We need diff --git a/gold/symtab.cc b/gold/symtab.cc index 1a69f5b..58db441 100644 --- a/gold/symtab.cc +++ b/gold/symtab.cc @@ -527,6 +527,17 @@ Symbol::set_output_section(Output_section* os) } } +// Set the symbol's output segment. This is used for pre-defined +// symbols whose segments aren't known until after layout is done +// (e.g., __ehdr_start). +void +Symbol::set_output_segment(Output_segment* os, Segment_offset_base base) +{ + gold_assert(this->source_ == IN_OUTPUT_SEGMENT); + this->u_.in_output_segment.output_segment = os; + this->u_.in_output_segment.offset_base = base; +} + // Class Symbol_table. Symbol_table::Symbol_table(unsigned int count, diff --git a/gold/symtab.h b/gold/symtab.h index b06c7b4..1ff5c22 100644 --- a/gold/symtab.h +++ b/gold/symtab.h @@ -782,6 +782,12 @@ class Symbol void set_output_section(Output_section*); + // Set the symbol's output segment. This is used for pre-defined + // symbols whose segments aren't known until after layout is done + // (e.g., __ehdr_start). + void + set_output_segment(Output_segment*, Segment_offset_base); + // Return whether there should be a warning for references to this // symbol. bool ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-03-31 21:03 ` Cary Coutant @ 2014-03-31 21:07 ` Roland McGrath 2014-03-31 21:18 ` Cary Coutant 0 siblings, 1 reply; 19+ messages in thread From: Roland McGrath @ 2014-03-31 21:07 UTC (permalink / raw) To: Cary Coutant; +Cc: Ian Lance Taylor, Roland McGrath, Binutils That's extremely similar to what I was just writing myself. I had a different (and possibly infeasible) idea for the behavior in the case when it can't be defined. That is, to morph it into an actually undefined symbol. What would yours do in case of weak references to __ehdr_start when it cannot be defined? On Mon, Mar 31, 2014 at 2:03 PM, Cary Coutant <ccoutant@google.com> wrote: >>> I think what has to happen is something analogous to what BFD ld does now: >>> Add the symbol early on in an undefined or indeterminate state, and then >>> define it (or don't) later on. It's not very clear to me how to do that >>> correctly in gold. It would seem clearest if it can be done in >>> define_standard_symbols. But I don't know what Layout::finalize (where >>> it's created now) should do to find it and adjust it to point at the right >>> segment. I also don't know how either case would interact with a user >>> defining (in input or script) this symbol name himself, in which case (IIRC >>> the BFD ld behavior) the user's symbol should be wholly unmolested (but it >>> might be that BFD ld uses the user's value but marks it STV_HIDDEN >>> anyway--I'm not sure). >> >> I don't know why __ehdr_start is defined in Layout::finalize. As you >> suggest, it should be defined by define_standard_symbols instead, >> setting the only_if_ref field to true. If necessary, Layout::finalize >> can adjust the symbol value using init_output_segment, but only if the >> symbol is marked as predefined. Layout::finalize can simply look up >> the symbol by name. > > The attached patch should fix this. Can you give it a try, and if it > works, I'll add a testcase and check it in. > > -cary > > 2014-03-31 Cary Coutant <ccoutant@google.com> > > * gold/defstd.cc (in_segment): Define __ehdr_start here... > * gold/layout.cc (Layout::finalize): ...Instead of here. Set the > output segment when known. > * gold/symtab.cc (Symbol::set_output_segment): New function. > * gold/symtab.h (Symbol::set_output_segment): New function. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-03-31 21:07 ` Roland McGrath @ 2014-03-31 21:18 ` Cary Coutant 2014-03-31 21:27 ` Roland McGrath 0 siblings, 1 reply; 19+ messages in thread From: Cary Coutant @ 2014-03-31 21:18 UTC (permalink / raw) To: Roland McGrath; +Cc: Ian Lance Taylor, Roland McGrath, Binutils > That's extremely similar to what I was just writing myself. > I had a different (and possibly infeasible) idea for the behavior in > the case when it can't be defined. > That is, to morph it into an actually undefined symbol. > > What would yours do in case of weak references to __ehdr_start when it > cannot be defined? You'll still get the error. I thought about transforming the symbol into an undef so that the error would come from relocate_section(), but you'd also have to modify issue_undefined_symbol_error() to check for hidden symbols. Given your assertion that an early error was reasonable, it didn't seem worth the churn. -cary ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-03-31 21:18 ` Cary Coutant @ 2014-03-31 21:27 ` Roland McGrath 2014-03-31 22:19 ` Cary Coutant 0 siblings, 1 reply; 19+ messages in thread From: Roland McGrath @ 2014-03-31 21:27 UTC (permalink / raw) To: Cary Coutant; +Cc: Ian Lance Taylor, Roland McGrath, Binutils On Mon, Mar 31, 2014 at 2:18 PM, Cary Coutant <ccoutant@google.com> wrote: > I thought about transforming the symbol into an undef so that the > error would come from relocate_section(), but you'd also have to > modify issue_undefined_symbol_error() to check for hidden symbols. > Given your assertion that an early error was reasonable, it didn't > seem worth the churn. I was unclear. In the case of a static link with a weak undefined reference to __ehdr_start, that must resolve to 0 normally and not generate an error. The case I was talking about was under -shared, when the treatment of an undefined symbol would normally be to generate a dynamic reloc. In that case, I think it's acceptable to generate an error instead of going back and resizing the dynamic sections and everything you'd have to do. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-03-31 21:27 ` Roland McGrath @ 2014-03-31 22:19 ` Cary Coutant 2014-04-01 16:34 ` Roland McGrath 0 siblings, 1 reply; 19+ messages in thread From: Cary Coutant @ 2014-03-31 22:19 UTC (permalink / raw) To: Roland McGrath; +Cc: Ian Lance Taylor, Roland McGrath, Binutils [-- Attachment #1: Type: text/plain, Size: 1045 bytes --] > I was unclear. In the case of a static link with a weak undefined > reference to __ehdr_start, that must resolve to 0 normally and not > generate an error. > The case I was talking about was under -shared, when the treatment of > an undefined symbol would normally be to generate a dynamic reloc. > In that case, I think it's acceptable to generate an error instead of > going back and resizing the dynamic sections and everything you'd have > to do. OK, here's a revised patch... -cary 2014-03-31 Cary Coutant <ccoutant@google.com> * gold/defstd.cc (in_segment): Define __ehdr_start here... * gold/layout.cc (Layout::finalize): ...Instead of here. Set the output segment when known. * gold/symtab.cc (Symbol::set_output_segment): New function. (Symbol::set_undefined): New function. * gold/symtab.h (Symbol::set_output_segment): New function. (Symbol::set_undefined): New function. * gold/target-reloc.h (issue_undefined_symbol_error): Check for hidden undefs. [-- Attachment #2: ehdr-start-patch-2.txt --] [-- Type: text/plain, Size: 4568 bytes --] 2014-03-31 Cary Coutant <ccoutant@google.com> * gold/defstd.cc (in_segment): Define __ehdr_start here... * gold/layout.cc (Layout::finalize): ...Instead of here. Set the output segment when known. * gold/symtab.cc (Symbol::set_output_segment): New function. (Symbol::set_undefined): New function. * gold/symtab.h (Symbol::set_output_segment): New function. (Symbol::set_undefined): New function. * gold/target-reloc.h (issue_undefined_symbol_error): Check for hidden undefs. diff --git a/gold/defstd.cc b/gold/defstd.cc index a50e75d..cee68a0 100644 --- a/gold/defstd.cc +++ b/gold/defstd.cc @@ -141,6 +141,20 @@ const Define_symbol_in_segment in_segment[] = true // only_if_ref }, { + "__ehdr_start", // name + elfcpp::PT_LOAD, // segment_type + elfcpp::PF(0), // segment_flags_set + elfcpp::PF(0), // segment_flags_clear + 0, // value + 0, // size + elfcpp::STT_NOTYPE, // type + elfcpp::STB_GLOBAL, // binding + elfcpp::STV_HIDDEN, // visibility + 0, // nonvis + Symbol::SEGMENT_START, // offset_from_base + true // only_if_ref + }, + { "etext", // name elfcpp::PT_LOAD, // segment_type elfcpp::PF_X, // segment_flags_set diff --git a/gold/layout.cc b/gold/layout.cc index c96516c..9bd19c5 100644 --- a/gold/layout.cc +++ b/gold/layout.cc @@ -2743,12 +2743,14 @@ Layout::finalize(const Input_objects* input_objects, Symbol_table* symtab, // If there is a load segment that contains the file and program headers, // provide a symbol __ehdr_start pointing there. // A program can use this to examine itself robustly. - if (load_seg != NULL) - symtab->define_in_output_segment("__ehdr_start", NULL, - Symbol_table::PREDEFINED, load_seg, 0, 0, - elfcpp::STT_NOTYPE, elfcpp::STB_GLOBAL, - elfcpp::STV_HIDDEN, 0, - Symbol::SEGMENT_START, true); + Symbol *ehdr_start = symtab->lookup("__ehdr_start"); + if (ehdr_start != NULL && ehdr_start->is_predefined()) + { + if (load_seg != NULL) + ehdr_start->set_output_segment(load_seg, Symbol::SEGMENT_START); + else + ehdr_start->set_undefined(); + } // Set the file offsets of all the non-data sections we've seen so // far which don't have to wait for the input sections. We need diff --git a/gold/symtab.cc b/gold/symtab.cc index 1a69f5b..6fa788a 100644 --- a/gold/symtab.cc +++ b/gold/symtab.cc @@ -527,6 +527,30 @@ Symbol::set_output_section(Output_section* os) } } +// Set the symbol's output segment. This is used for pre-defined +// symbols whose segments aren't known until after layout is done +// (e.g., __ehdr_start). + +void +Symbol::set_output_segment(Output_segment* os, Segment_offset_base base) +{ + gold_assert(this->source_ == IN_OUTPUT_SEGMENT && this->is_predefined_); + this->u_.in_output_segment.output_segment = os; + this->u_.in_output_segment.offset_base = base; +} + +// Set the symbol to undefined. This is used for pre-defined +// symbols whose segments aren't known until after layout is done +// (e.g., __ehdr_start). + +void +Symbol::set_undefined() +{ + gold_assert(this->source_ == IN_OUTPUT_SEGMENT && this->is_predefined_); + this->source_ = IS_UNDEFINED; + this->is_predefined_ = false; +} + // Class Symbol_table. Symbol_table::Symbol_table(unsigned int count, diff --git a/gold/symtab.h b/gold/symtab.h index b06c7b4..d5aa135 100644 --- a/gold/symtab.h +++ b/gold/symtab.h @@ -782,6 +782,18 @@ class Symbol void set_output_section(Output_section*); + // Set the symbol's output segment. This is used for pre-defined + // symbols whose segments aren't known until after layout is done + // (e.g., __ehdr_start). + void + set_output_segment(Output_segment*, Segment_offset_base); + + // Set the symbol to undefined. This is used for pre-defined + // symbols whose segments aren't known until after layout is done + // (e.g., __ehdr_start). + void + set_undefined(); + // Return whether there should be a warning for references to this // symbol. bool diff --git a/gold/target-reloc.h b/gold/target-reloc.h index f49020a..b47aea8 100644 --- a/gold/target-reloc.h +++ b/gold/target-reloc.h @@ -216,6 +216,10 @@ issue_undefined_symbol_error(const Symbol* sym) return false; } + // If the symbol is hidden, report it. + if (sym->visibility() == elfcpp::STV_HIDDEN) + return true; + // When creating a shared library, only report unresolved symbols if // -z defs was used. if (parameters->options().shared() && !parameters->options().defs()) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-03-31 22:19 ` Cary Coutant @ 2014-04-01 16:34 ` Roland McGrath 2014-04-01 21:50 ` Cary Coutant 0 siblings, 1 reply; 19+ messages in thread From: Roland McGrath @ 2014-04-01 16:34 UTC (permalink / raw) To: Cary Coutant; +Cc: Ian Lance Taylor, Binutils I tried it against all the ehdr_start test cases in ld/testsuite. It dies with an assertion failure on ld-elf/ehdr_start-missing.d and ld-elf/ehdr_start-weak.d. Thanks, Roland ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-04-01 16:34 ` Roland McGrath @ 2014-04-01 21:50 ` Cary Coutant 2014-04-01 22:28 ` Cary Coutant ` (2 more replies) 0 siblings, 3 replies; 19+ messages in thread From: Cary Coutant @ 2014-04-01 21:50 UTC (permalink / raw) To: Roland McGrath; +Cc: Ian Lance Taylor, Binutils > I tried it against all the ehdr_start test cases in ld/testsuite. > It dies with an assertion failure on ld-elf/ehdr_start-missing.d and > ld-elf/ehdr_start-weak.d. Those test cases use a linker script with a SECTIONS clause and no PHDRS clause. In this case, gold runs define_standard_symbols before any segments have been created at all. As a result, all the standard symbols that get added are added as constants, and that failed my assert in set_output_segment(). I can relax that assert, but I've got to wonder if we're doing the right thing here with any of the other standard symbols -- it seems to me they'll all get set incorrectly. In ld/testsuite, the ehdr_start_missing.t script attempts to construct a situation where the first load segment cannot contain the headers by setting the text section start address to 0x10000000. That doesn't work! (At least, not in gold.) Gold will happily place the headers at 0x10000000, and place the text section immediately after that. I'm not sure if this is a bug or a feature. I think this is a known difference between the two linkers that has been discussed before. The only way to keep the headers out of the first load segment is to set the text segment at an address that is not a multiple of the page size. For example, if I run with -Ttext=0x1000100, I'll get the expected undefined reference error. Unfortunately, even if the reference is weak, we always get an error. By defining the symbol early, it's overriding the symbol's binding with STB_GLOBAL, so by the time we try to apply the relocation, we've lost track of the weak binding. -cary ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-04-01 21:50 ` Cary Coutant @ 2014-04-01 22:28 ` Cary Coutant 2014-04-01 22:50 ` Roland McGrath 2014-04-01 22:59 ` Cary Coutant 2 siblings, 0 replies; 19+ messages in thread From: Cary Coutant @ 2014-04-01 22:28 UTC (permalink / raw) To: Roland McGrath; +Cc: Ian Lance Taylor, Binutils > Those test cases use a linker script with a SECTIONS clause and no > PHDRS clause. Sorry, whether or not there's a PHDRS clause is irrelevant. If there's a SECTIONS clause, we delay allocating sections to segments until Layout::finalize(). -cary ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-04-01 21:50 ` Cary Coutant 2014-04-01 22:28 ` Cary Coutant @ 2014-04-01 22:50 ` Roland McGrath 2014-04-01 22:59 ` Cary Coutant 2 siblings, 0 replies; 19+ messages in thread From: Roland McGrath @ 2014-04-01 22:50 UTC (permalink / raw) To: Cary Coutant; +Cc: Ian Lance Taylor, Binutils On Tue, Apr 1, 2014 at 2:50 PM, Cary Coutant <ccoutant@google.com> wrote: > Those test cases use a linker script with a SECTIONS clause and no > PHDRS clause. In this case, gold runs define_standard_symbols before > any segments have been created at all. As a result, all the standard > symbols that get added are added as constants, and that failed my > assert in set_output_segment(). I can relax that assert, but I've got > to wonder if we're doing the right thing here with any of the other > standard symbols -- it seems to me they'll all get set incorrectly. Certainly worth worrying about. I've definitely seen cases in the past where gold did not implement linker script semantics properly. (https://sourceware.org/bugzilla/show_bug.cgi?id=13163 comes to mind.) > In ld/testsuite, the ehdr_start_missing.t script attempts to construct > a situation where the first load segment cannot contain the headers by > setting the text section start address to 0x10000000. That doesn't > work! (At least, not in gold.) Gold will happily place the headers at > 0x10000000, and place the text section immediately after that. I'm not > sure if this is a bug or a feature. I think this is a known difference > between the two linkers that has been discussed before. It's a bug. The linker script said where the .text section should be, and you placed the .text section someplace else. I was not aware of this particular "difference" before. It feels similar to two issues I am aware of: * gold's -Ttext=x means -Ttext-segment=x, whereas -Ttext=x is supposed to mean --section-start=.text=x and that (like the linker script case above) does not give the linker license to actually place .text at x+SIZEOF_HEADERS. * https://sourceware.org/bugzilla/show_bug.cgi?id=13163c#5 points out out a case where assignment of . to a constant means something different in gold (I think it acts as relative to some section rather than absolute, but I no longer recall the exact details). > The only way to keep the headers out of the first load segment is to > set the text segment at an address that is not a multiple of the page > size. For example, if I run with -Ttext=0x1000100, I'll get the > expected undefined reference error. That's a fine thing to test, too. For gold it certainly makes sense to exercise cases without linker scripts, since the internal logic is so different for with vs without. But getting that case right is not really sufficient to declare __ehdr_start properly supported. > Unfortunately, even if the reference is weak, we always get an error. > By defining the symbol early, it's overriding the symbol's binding > with STB_GLOBAL, so by the time we try to apply the relocation, we've > lost track of the weak binding. That's certainly a bug. I don't know how to address it. I had hoped that the 'only_if_ref' flag would have the semantics that if there was another definition, none of the symbol details in the call would modify it. But I can imagine why it's not like that. The -shared case working right without trying to generate a dynamic reloc is sufficient for gold to pass glibc's configure check for __ehdr_start support and for __ehdr_start as used in ld.so to be correct. So perhaps what you've got now is enough better to be worth putting in. The other corner cases are of some interest to me, but I don't think anything I'm actually planning to do will wind up exercising them. Thanks, Roland ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-04-01 21:50 ` Cary Coutant 2014-04-01 22:28 ` Cary Coutant 2014-04-01 22:50 ` Roland McGrath @ 2014-04-01 22:59 ` Cary Coutant 2014-04-01 23:02 ` Cary Coutant 2 siblings, 1 reply; 19+ messages in thread From: Cary Coutant @ 2014-04-01 22:59 UTC (permalink / raw) To: Roland McGrath; +Cc: Ian Lance Taylor, Binutils > In ld/testsuite, the ehdr_start_missing.t script attempts to construct > a situation where the first load segment cannot contain the headers by > setting the text section start address to 0x10000000. That doesn't > work! (At least, not in gold.) Gold will happily place the headers at > 0x10000000, and place the text section immediately after that. I'm not > sure if this is a bug or a feature. I think this is a known difference > between the two linkers that has been discussed before. Oops, not quite right. When using -Ttext=0x10000000, gold puts the headers at 0x10000000 and starts the text immediately after. But with a SECTIONS clause, gold puts the .text section at 0x10000000, and if it can, it will create a load segment for the headers at 0x0ffff000. -cary ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-04-01 22:59 ` Cary Coutant @ 2014-04-01 23:02 ` Cary Coutant 2014-04-03 17:27 ` Roland McGrath 0 siblings, 1 reply; 19+ messages in thread From: Cary Coutant @ 2014-04-01 23:02 UTC (permalink / raw) To: Roland McGrath; +Cc: Ian Lance Taylor, Binutils [-- Attachment #1: Type: text/plain, Size: 1070 bytes --] Here's an updated patch that handles the weak reference properly and handles __ehdr_start correctly (at least) when a SECTIONS clause is present. -cary 2014-04-01 Cary Coutant <ccoutant@google.com> * gold/defstd.cc (in_segment): Define __ehdr_start here... * gold/layout.cc (Layout::finalize): ...Instead of here. Set the output segment when known. * gold/resolve.cc (Symbol::override_base_with_special): Remember the original binding. * gold/symtab.cc (Symbol::set_output_segment): New function. (Symbol::set_undefined): New function. * gold/symtab.h (Symbol::is_weak_undefined): Check original undef binding. (Symbol::is_strong_undefined): New function. (Symbol::set_output_segment): New function. (Symbol::set_undefined): New function. * gold/target-reloc.h (is_strong_undefined): Remove. (issue_undefined_symbol_error): Call Symbol::is_weak_undefined. Check for hidden undefs. (relocate_section): Call Symbol::is_strong_undefined. [-- Attachment #2: ehdr-start-patch-3.txt --] [-- Type: text/plain, Size: 7902 bytes --] 2014-04-01 Cary Coutant <ccoutant@google.com> * gold/defstd.cc (in_segment): Define __ehdr_start here... * gold/layout.cc (Layout::finalize): ...Instead of here. Set the output segment when known. * gold/resolve.cc (Symbol::override_base_with_special): Remember the original binding. * gold/symtab.cc (Symbol::set_output_segment): New function. (Symbol::set_undefined): New function. * gold/symtab.h (Symbol::is_weak_undefined): Check original undef binding. (Symbol::is_strong_undefined): New function. (Symbol::set_output_segment): New function. (Symbol::set_undefined): New function. * gold/target-reloc.h (is_strong_undefined): Remove. (issue_undefined_symbol_error): Call Symbol::is_weak_undefined. Check for hidden undefs. (relocate_section): Call Symbol::is_strong_undefined. diff --git a/gold/defstd.cc b/gold/defstd.cc index a50e75d..cee68a0 100644 --- a/gold/defstd.cc +++ b/gold/defstd.cc @@ -141,6 +141,20 @@ const Define_symbol_in_segment in_segment[] = true // only_if_ref }, { + "__ehdr_start", // name + elfcpp::PT_LOAD, // segment_type + elfcpp::PF(0), // segment_flags_set + elfcpp::PF(0), // segment_flags_clear + 0, // value + 0, // size + elfcpp::STT_NOTYPE, // type + elfcpp::STB_GLOBAL, // binding + elfcpp::STV_HIDDEN, // visibility + 0, // nonvis + Symbol::SEGMENT_START, // offset_from_base + true // only_if_ref + }, + { "etext", // name elfcpp::PT_LOAD, // segment_type elfcpp::PF_X, // segment_flags_set diff --git a/gold/layout.cc b/gold/layout.cc index c96516c..9bd19c5 100644 --- a/gold/layout.cc +++ b/gold/layout.cc @@ -2743,12 +2743,14 @@ Layout::finalize(const Input_objects* input_objects, Symbol_table* symtab, // If there is a load segment that contains the file and program headers, // provide a symbol __ehdr_start pointing there. // A program can use this to examine itself robustly. - if (load_seg != NULL) - symtab->define_in_output_segment("__ehdr_start", NULL, - Symbol_table::PREDEFINED, load_seg, 0, 0, - elfcpp::STT_NOTYPE, elfcpp::STB_GLOBAL, - elfcpp::STV_HIDDEN, 0, - Symbol::SEGMENT_START, true); + Symbol *ehdr_start = symtab->lookup("__ehdr_start"); + if (ehdr_start != NULL && ehdr_start->is_predefined()) + { + if (load_seg != NULL) + ehdr_start->set_output_segment(load_seg, Symbol::SEGMENT_START); + else + ehdr_start->set_undefined(); + } // Set the file offsets of all the non-data sections we've seen so // far which don't have to wait for the input sections. We need diff --git a/gold/resolve.cc b/gold/resolve.cc index 9b442e2..8cc637a 100644 --- a/gold/resolve.cc +++ b/gold/resolve.cc @@ -915,6 +915,10 @@ Symbol::override_base_with_special(const Symbol* from) bool same_name = this->name_ == from->name_; gold_assert(same_name || this->has_alias()); + // If we are overriding an undef, remember the original binding. + if (this->is_undefined()) + this->set_undef_binding(this->binding_); + this->source_ = from->source_; switch (from->source_) { diff --git a/gold/symtab.cc b/gold/symtab.cc index 1a69f5b..4e8afb1 100644 --- a/gold/symtab.cc +++ b/gold/symtab.cc @@ -527,6 +527,31 @@ Symbol::set_output_section(Output_section* os) } } +// Set the symbol's output segment. This is used for pre-defined +// symbols whose segments aren't known until after layout is done +// (e.g., __ehdr_start). + +void +Symbol::set_output_segment(Output_segment* os, Segment_offset_base base) +{ + gold_assert(this->is_predefined_); + this->source_ = IN_OUTPUT_SEGMENT; + this->u_.in_output_segment.output_segment = os; + this->u_.in_output_segment.offset_base = base; +} + +// Set the symbol to undefined. This is used for pre-defined +// symbols whose segments aren't known until after layout is done +// (e.g., __ehdr_start). + +void +Symbol::set_undefined() +{ + gold_assert(this->is_predefined_); + this->source_ = IS_UNDEFINED; + this->is_predefined_ = false; +} + // Class Symbol_table. Symbol_table::Symbol_table(unsigned int count, diff --git a/gold/symtab.h b/gold/symtab.h index b06c7b4..b6366d4 100644 --- a/gold/symtab.h +++ b/gold/symtab.h @@ -238,7 +238,7 @@ class Symbol override_visibility(elfcpp::STV); // Set whether the symbol was originally a weak undef or a regular undef - // when resolved by a dynamic def. + // when resolved by a dynamic def or by a special symbol. inline void set_undef_binding(elfcpp::STB bind) { @@ -249,7 +249,8 @@ class Symbol } } - // Return TRUE if a weak undef was resolved by a dynamic def. + // Return TRUE if a weak undef was resolved by a dynamic def or + // by a special symbol. inline bool is_undef_binding_weak() const { return this->undef_binding_weak_; } @@ -517,7 +518,20 @@ class Symbol // Return whether this is a weak undefined symbol. bool is_weak_undefined() const - { return this->is_undefined() && this->binding() == elfcpp::STB_WEAK; } + { + return (this->is_undefined() + && (this->binding() == elfcpp::STB_WEAK + || this->is_undef_binding_weak())); + } + + // Return whether this is a strong undefined symbol. + bool + is_strong_undefined() const + { + return (this->is_undefined() + && this->binding() != elfcpp::STB_WEAK + && !this->is_undef_binding_weak()); + } // Return whether this is an absolute symbol. bool @@ -782,6 +796,18 @@ class Symbol void set_output_section(Output_section*); + // Set the symbol's output segment. This is used for pre-defined + // symbols whose segments aren't known until after layout is done + // (e.g., __ehdr_start). + void + set_output_segment(Output_segment*, Segment_offset_base); + + // Set the symbol to undefined. This is used for pre-defined + // symbols whose segments aren't known until after layout is done + // (e.g., __ehdr_start). + void + set_undefined(); + // Return whether there should be a warning for references to this // symbol. bool @@ -1030,7 +1056,7 @@ class Symbol // True if UNDEF_BINDING_WEAK_ has been set (bit 32). bool undef_binding_set_ : 1; // True if this symbol was a weak undef resolved by a dynamic def - // (bit 33). + // or by a special symbol (bit 33). bool undef_binding_weak_ : 1; // True if this symbol is a predefined linker symbol (bit 34). bool is_predefined_ : 1; diff --git a/gold/target-reloc.h b/gold/target-reloc.h index f49020a..e44519b 100644 --- a/gold/target-reloc.h +++ b/gold/target-reloc.h @@ -143,12 +143,6 @@ class Default_comdat_behavior } }; -inline bool -is_strong_undefined(const Symbol* sym) -{ - return sym->is_undefined() && sym->binding() != elfcpp::STB_WEAK; -} - // Give an error for a symbol with non-default visibility which is not // defined locally. @@ -190,7 +184,7 @@ issue_undefined_symbol_error(const Symbol* sym) return false; // We don't report weak symbols. - if (sym->binding() == elfcpp::STB_WEAK) + if (sym->is_weak_undefined()) return false; // We don't report symbols defined in discarded sections. @@ -216,6 +210,10 @@ issue_undefined_symbol_error(const Symbol* sym) return false; } + // If the symbol is hidden, report it. + if (sym->visibility() == elfcpp::STV_HIDDEN) + return true; + // When creating a shared library, only report unresolved symbols if // -z defs was used. if (parameters->options().shared() && !parameters->options().defs()) @@ -419,7 +417,7 @@ relocate_section( gold_undefined_symbol_at_location(sym, relinfo, i, offset); else if (sym != NULL && sym->visibility() != elfcpp::STV_DEFAULT - && (is_strong_undefined(sym) || sym->is_from_dynobj())) + && (sym->is_strong_undefined() || sym->is_from_dynobj())) visibility_error(sym); if (sym != NULL && sym->has_warning()) ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-04-01 23:02 ` Cary Coutant @ 2014-04-03 17:27 ` Roland McGrath 2014-04-23 21:08 ` Roland McGrath 0 siblings, 1 reply; 19+ messages in thread From: Roland McGrath @ 2014-04-03 17:27 UTC (permalink / raw) To: Cary Coutant; +Cc: Ian Lance Taylor, Binutils That seems to cover the cases I know to try. Thanks, Roland ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-04-03 17:27 ` Roland McGrath @ 2014-04-23 21:08 ` Roland McGrath 2014-04-23 21:45 ` Cary Coutant 0 siblings, 1 reply; 19+ messages in thread From: Roland McGrath @ 2014-04-23 21:08 UTC (permalink / raw) To: Cary Coutant; +Cc: Ian Lance Taylor, Binutils I was just surprised to find that your change never went in. Is there some reason you didn't commit it? Ideally I think we'd like to have this fixed on the 2.24 branch too. Thanks, Roland ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-04-23 21:08 ` Roland McGrath @ 2014-04-23 21:45 ` Cary Coutant 2014-04-23 22:03 ` Roland McGrath 0 siblings, 1 reply; 19+ messages in thread From: Cary Coutant @ 2014-04-23 21:45 UTC (permalink / raw) To: Roland McGrath; +Cc: Ian Lance Taylor, Binutils > I was just surprised to find that your change never went in. > Is there some reason you didn't commit it? > Ideally I think we'd like to have this fixed on the 2.24 branch too. Sorry, I've been meaning to adapt the gnu ld testcases for the gold testsuite along with the fix. -cary ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-04-23 21:45 ` Cary Coutant @ 2014-04-23 22:03 ` Roland McGrath 2014-05-06 21:53 ` Cary Coutant 0 siblings, 1 reply; 19+ messages in thread From: Roland McGrath @ 2014-04-23 22:03 UTC (permalink / raw) To: Cary Coutant; +Cc: Ian Lance Taylor, Binutils On Wed, Apr 23, 2014 at 2:45 PM, Cary Coutant <ccoutant@google.com> wrote: > Sorry, I've been meaning to adapt the gnu ld testcases for the gold > testsuite along with the fix. That's certainly a good thing to do. Just so the fix doesn't get dropped on the floor. It also seems like a shame if it takes so long to get in that it stops making sense for a 2.24 backport, since nobody will be using 2.25 for many more months. Thanks, Roland ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-04-23 22:03 ` Roland McGrath @ 2014-05-06 21:53 ` Cary Coutant 2014-05-06 21:54 ` Roland McGrath 0 siblings, 1 reply; 19+ messages in thread From: Cary Coutant @ 2014-05-06 21:53 UTC (permalink / raw) To: Roland McGrath; +Cc: Ian Lance Taylor, Binutils >> Sorry, I've been meaning to adapt the gnu ld testcases for the gold >> testsuite along with the fix. > > That's certainly a good thing to do. Just so the fix doesn't get dropped > on the floor. It also seems like a shame if it takes so long to get in > that it stops making sense for a 2.24 backport, since nobody will be using > 2.25 for many more months. I've committed the fix and backported it to 2.24. -cary ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: gold regression vs BFD ld: __ehdr_start 2014-05-06 21:53 ` Cary Coutant @ 2014-05-06 21:54 ` Roland McGrath 0 siblings, 0 replies; 19+ messages in thread From: Roland McGrath @ 2014-05-06 21:54 UTC (permalink / raw) To: Cary Coutant; +Cc: Ian Lance Taylor, Binutils Thanks! I wonder if/when we'll ever get another 2.24.x release... ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2014-05-06 21:54 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-03-30 1:15 gold regression vs BFD ld: __ehdr_start Roland McGrath 2014-03-31 19:19 ` Ian Lance Taylor 2014-03-31 21:03 ` Cary Coutant 2014-03-31 21:07 ` Roland McGrath 2014-03-31 21:18 ` Cary Coutant 2014-03-31 21:27 ` Roland McGrath 2014-03-31 22:19 ` Cary Coutant 2014-04-01 16:34 ` Roland McGrath 2014-04-01 21:50 ` Cary Coutant 2014-04-01 22:28 ` Cary Coutant 2014-04-01 22:50 ` Roland McGrath 2014-04-01 22:59 ` Cary Coutant 2014-04-01 23:02 ` Cary Coutant 2014-04-03 17:27 ` Roland McGrath 2014-04-23 21:08 ` Roland McGrath 2014-04-23 21:45 ` Cary Coutant 2014-04-23 22:03 ` Roland McGrath 2014-05-06 21:53 ` Cary Coutant 2014-05-06 21:54 ` Roland McGrath
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).