public inbox for gcc-patches@gcc.gnu.org
* Problem with static const objects and LTO
@ 2020-09-16 16:51 Jeff Law
  2020-09-16 17:05 ` H.J. Lu
  2020-09-16 17:52 ` Joseph Myers
  0 siblings, 2 replies; 17+ messages in thread
From: Jeff Law @ 2020-09-16 16:51 UTC (permalink / raw)
  To: GCC Patches



Consider a TU with a file-scoped "static const" object, utf8_sb_map.  A
routine within the TU will stuff the address of utf8_sb_map into an
object, something like:

fu (...)
{
  if (cond)
    dfa->sb_char = utf8_sb_map;
  else
    dfa->sb_char = malloc (...);
}


There is another routine in the TU which looks like:

bar (...)
{
  if (dfa->sb_char != utf8_sb_map)
    free (dfa->sb_char);
}


Now imagine that the TU is compiled (with LTO) into a static library,
libgl.a, and that there's a DSO (libdso.so) which gets linked against
libgl.a and references the first routine (fu).  We get a copy of fu in
the DSO along with a copy of utf8_sb_map.


Then imagine there's a main executable that dynamically links against
libdso.so, then links statically against libgl.a.  Assume the main
executable does not directly reference fu(), but does call a routine in
libdso.so which eventually calls fu().  Also assume the main executable
directly calls bar().  Again, remember we're compiling with LTO, so we
don't suck in the entire TU, just the routines/data we need.


In this scenario, both libdso.so and the main executable are going to get
a copy of utf8_sb_map and they'll be at different addresses.  So the
main executable calls into libdso.so, which in turn calls libdso's copy
of fu(), which stuffs the address of utf8_sb_map from the DSO into
dfa->sb_char.  Later the main executable calls the bar() that's in the
main executable.  It does the comparison to see if dfa->sb_char is equal
to utf8_sb_map -- but it's using the main executable's copy of
utf8_sb_map and naturally free() blows up because it was passed a static
object, not a malloc'd object.
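
A minimal single-file sketch of the mismatch described above.  It only
simulates the situation: the two arrays stand in for the DSO's and the
executable's copies of utf8_sb_map, and the names fu_in_dso,
bar_in_exe_would_free and mismatch_reproduces are made up for
illustration.  The free() call is replaced by a returned flag so nothing
actually crashes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct dfa { const char *sb_char; };

/* Stand-ins for the two copies of utf8_sb_map: one as emitted into
   libdso.so, one as emitted into the main executable.  */
static const char utf8_sb_map_dso[4] = { 1, 2, 3, 4 };
static const char utf8_sb_map_exe[4] = { 1, 2, 3, 4 };

/* fu() as built into the DSO: on the shared path it stores the address
   of the DSO's copy.  */
void fu_in_dso(struct dfa *dfa, bool cond)
{
    if (cond)
        dfa->sb_char = utf8_sb_map_dso;
    else
        dfa->sb_char = calloc(4, 1);
}

/* bar() as built into the executable: it compares against the
   executable's copy.  Returns true when it would have called free().  */
bool bar_in_exe_would_free(const struct dfa *dfa)
{
    return dfa->sb_char != utf8_sb_map_exe;
}

/* Returns true when a pointer to a static object would reach free().  */
bool mismatch_reproduces(void)
{
    struct dfa d;

    fu_in_dso(&d, true);
    assert(d.sb_char == utf8_sb_map_dso);
    return bar_in_exe_would_free(&d);
}
```

Calling mismatch_reproduces() from a test program returns true: the
pointer stored by the DSO-side fu never compares equal to the
executable's copy, so the executable-side bar takes the free() path.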


ISTM this is a lot like the problem we have where we inline functions
with static data.  To fix those we use STB_GNU_UNIQUE.  But I don't see
any code in the C front-end which would utilize STB_GNU_UNIQUE.  Its
support seems limited to C++.


How is this supposed to work for C?


Jeff




* Re: Problem with static const objects and LTO
  2020-09-16 16:51 Problem with static const objects and LTO Jeff Law
@ 2020-09-16 17:05 ` H.J. Lu
  2020-09-16 17:10   ` Jeff Law
  2020-09-16 17:52 ` Joseph Myers
  1 sibling, 1 reply; 17+ messages in thread
From: H.J. Lu @ 2020-09-16 17:05 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches

On Wed, Sep 16, 2020 at 9:53 AM Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
> Consider a TU with file scoped "static const object utf8_sb_map".   A
> routine within the TU will stuff &utf8_sb_map into an object, something
> like:
>
> fu (...)
>
> {
>
>   if (cond)
>
>     dfa->sb_char = utf8_sb_map;
>
>   else
>
>     dfa->sb_char = malloc (...);
>
> }
>
>
> There is another routine in the TU which looks like
>
> bar (...)
>
> {
>
>   if (dfa->sb_char != utf8_sb_map)
>
>     free (dfa->sb_char);
>
> }
>
>
> Now imagine that the TU is compiled (with LTO) into a static library,
> libgl.a and there's a DSO (libdso.so) which gets linked against libgl.a
> and references the first routine (fu).  We get a copy of fu in the DSO
> along with a copy of utf8_sb_map.
>
>
> Then imagine there's a main executable that dynamicly links against
> libdso.so, then links statically against libgl.a.  Assume the  main
> executable does not directly reference fu(), but does call a routine in
> libdso.so which eventually calls fu().  Also assume the main executable
> directly calls bar().  Again, remember we're compiling with LTO, so we
> don't suck in the entire TU, just the routines/data we need.
>
>
> In this scenario, both libdso.so and the main executable are going to a
> copy of utf8_sb_map and they'll be at different addresses.  So when the
> main executable calls into libdso.so which in turn calls libdso's copy
> of fu() which stuffs the address of utf8_sb_map from the DSO into
> dfa->sb_char.  Later the main executable calls bar() that's in the main
> executable.  It does the comparison to see if dfa->sb_char is equal to
> utf8_sb_map -- but it's using the main executable's copy of utf8_sb_map
> and naturally free() blows us because it was passed a static object, not
> a malloc'd object.
>
>
> ISTM this is a lot like the problem we have where we inline functions
> with static data.   To fix those we use STB_GNU_UNIQUE.  But I don't see
> any code in the C front-end which would utilize STB_GNU_UNIQUE.  It's
> support seems limited to C++.
>
>
> How is this supposed to work for C?
>
>
> Jeff
>
>

Can you group utf8_sb_map, fu and bar together so that they are defined
together?

-- 
H.J.


* Re: Problem with static const objects and LTO
  2020-09-16 17:05 ` H.J. Lu
@ 2020-09-16 17:10   ` Jeff Law
  2020-09-16 17:13     ` H.J. Lu
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Law @ 2020-09-16 17:10 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GCC Patches



On 9/16/20 11:05 AM, H.J. Lu wrote:
> On Wed, Sep 16, 2020 at 9:53 AM Jeff Law via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> Consider a TU with file scoped "static const object utf8_sb_map".   A
>> routine within the TU will stuff &utf8_sb_map into an object, something
>> like:
>>
>> fu (...)
>>
>> {
>>
>>   if (cond)
>>
>>     dfa->sb_char = utf8_sb_map;
>>
>>   else
>>
>>     dfa->sb_char = malloc (...);
>>
>> }
>>
>>
>> There is another routine in the TU which looks like
>>
>> bar (...)
>>
>> {
>>
>>   if (dfa->sb_char != utf8_sb_map)
>>
>>     free (dfa->sb_char);
>>
>> }
>>
>>
>> Now imagine that the TU is compiled (with LTO) into a static library,
>> libgl.a and there's a DSO (libdso.so) which gets linked against libgl.a
>> and references the first routine (fu).  We get a copy of fu in the DSO
>> along with a copy of utf8_sb_map.
>>
>>
>> Then imagine there's a main executable that dynamicly links against
>> libdso.so, then links statically against libgl.a.  Assume the  main
>> executable does not directly reference fu(), but does call a routine in
>> libdso.so which eventually calls fu().  Also assume the main executable
>> directly calls bar().  Again, remember we're compiling with LTO, so we
>> don't suck in the entire TU, just the routines/data we need.
>>
>>
>> In this scenario, both libdso.so and the main executable are going to a
>> copy of utf8_sb_map and they'll be at different addresses.  So when the
>> main executable calls into libdso.so which in turn calls libdso's copy
>> of fu() which stuffs the address of utf8_sb_map from the DSO into
>> dfa->sb_char.  Later the main executable calls bar() that's in the main
>> executable.  It does the comparison to see if dfa->sb_char is equal to
>> utf8_sb_map -- but it's using the main executable's copy of utf8_sb_map
>> and naturally free() blows us because it was passed a static object, not
>> a malloc'd object.
>>
>>
>> ISTM this is a lot like the problem we have where we inline functions
>> with static data.   To fix those we use STB_GNU_UNIQUE.  But I don't see
>> any code in the C front-end which would utilize STB_GNU_UNIQUE.  It's
>> support seems limited to C++.
>>
>>
>> How is this supposed to work for C?
>>
>>
>> Jeff
>>
>>
> Can you group utf8_sb_map, fu and bar together so that they are defined
> together?

They're all defined within the same TU in gnulib.  It's the LTO
dead/unreachable code elimination that results in just parts of the TU
being copied into the DSO and a different set copied into the main
executable.  In many ways LTO makes this look a lot like the static data
member problems we've had to deal with in the C++ world.


jeff



* Re: Problem with static const objects and LTO
  2020-09-16 17:10   ` Jeff Law
@ 2020-09-16 17:13     ` H.J. Lu
  2020-09-16 17:24       ` Jeff Law
  0 siblings, 1 reply; 17+ messages in thread
From: H.J. Lu @ 2020-09-16 17:13 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches

On Wed, Sep 16, 2020 at 10:10 AM Jeff Law <law@redhat.com> wrote:
>
>
> On 9/16/20 11:05 AM, H.J. Lu wrote:
> > On Wed, Sep 16, 2020 at 9:53 AM Jeff Law via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> Consider a TU with file scoped "static const object utf8_sb_map".   A
> >> routine within the TU will stuff &utf8_sb_map into an object, something
> >> like:
> >>
> >> fu (...)
> >>
> >> {
> >>
> >>   if (cond)
> >>
> >>     dfa->sb_char = utf8_sb_map;
> >>
> >>   else
> >>
> >>     dfa->sb_char = malloc (...);
> >>
> >> }
> >>
> >>
> >> There is another routine in the TU which looks like
> >>
> >> bar (...)
> >>
> >> {
> >>
> >>   if (dfa->sb_char != utf8_sb_map)
> >>
> >>     free (dfa->sb_char);
> >>
> >> }
> >>
> >>
> >> Now imagine that the TU is compiled (with LTO) into a static library,
> >> libgl.a and there's a DSO (libdso.so) which gets linked against libgl.a
> >> and references the first routine (fu).  We get a copy of fu in the DSO
> >> along with a copy of utf8_sb_map.
> >>
> >>
> >> Then imagine there's a main executable that dynamicly links against
> >> libdso.so, then links statically against libgl.a.  Assume the  main
> >> executable does not directly reference fu(), but does call a routine in
> >> libdso.so which eventually calls fu().  Also assume the main executable
> >> directly calls bar().  Again, remember we're compiling with LTO, so we
> >> don't suck in the entire TU, just the routines/data we need.
> >>
> >>
> >> In this scenario, both libdso.so and the main executable are going to a
> >> copy of utf8_sb_map and they'll be at different addresses.  So when the
> >> main executable calls into libdso.so which in turn calls libdso's copy
> >> of fu() which stuffs the address of utf8_sb_map from the DSO into
> >> dfa->sb_char.  Later the main executable calls bar() that's in the main
> >> executable.  It does the comparison to see if dfa->sb_char is equal to
> >> utf8_sb_map -- but it's using the main executable's copy of utf8_sb_map
> >> and naturally free() blows us because it was passed a static object, not
> >> a malloc'd object.
> >>
> >>
> >> ISTM this is a lot like the problem we have where we inline functions
> >> with static data.   To fix those we use STB_GNU_UNIQUE.  But I don't see
> >> any code in the C front-end which would utilize STB_GNU_UNIQUE.  It's
> >> support seems limited to C++.
> >>
> >>
> >> How is this supposed to work for C?
> >>
> >>
> >> Jeff
> >>
> >>
> > Can you group utf8_sb_map, fu and bar together so that they are defined
> > together?
>
> They're all defined within the same TU in gnulib.  It's the LTO
> dead/unreachable code elimination that results in just parts of the TU
> being copied into the DSO and a different set copied into the main
> executable.  In many ways LTO makes this look a lot like the static data
> member problems we've had to deal with in the C++ world.

In this case, LTO should treat them as a single group.  Removing
one group member should remove the whole group.  Keeping one member
should keep the whole group.

-- 
H.J.


* Re: Problem with static const objects and LTO
  2020-09-16 17:13     ` H.J. Lu
@ 2020-09-16 17:24       ` Jeff Law
  2020-09-16 17:32         ` H.J. Lu
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Law @ 2020-09-16 17:24 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GCC Patches



On 9/16/20 11:13 AM, H.J. Lu wrote:
> On Wed, Sep 16, 2020 at 10:10 AM Jeff Law <law@redhat.com> wrote:
>>
>> On 9/16/20 11:05 AM, H.J. Lu wrote:
>>> On Wed, Sep 16, 2020 at 9:53 AM Jeff Law via Gcc-patches
>>> <gcc-patches@gcc.gnu.org> wrote:
>>>> Consider a TU with file scoped "static const object utf8_sb_map".   A
>>>> routine within the TU will stuff &utf8_sb_map into an object, something
>>>> like:
>>>>
>>>> fu (...)
>>>>
>>>> {
>>>>
>>>>   if (cond)
>>>>
>>>>     dfa->sb_char = utf8_sb_map;
>>>>
>>>>   else
>>>>
>>>>     dfa->sb_char = malloc (...);
>>>>
>>>> }
>>>>
>>>>
>>>> There is another routine in the TU which looks like
>>>>
>>>> bar (...)
>>>>
>>>> {
>>>>
>>>>   if (dfa->sb_char != utf8_sb_map)
>>>>
>>>>     free (dfa->sb_char);
>>>>
>>>> }
>>>>
>>>>
>>>> Now imagine that the TU is compiled (with LTO) into a static library,
>>>> libgl.a and there's a DSO (libdso.so) which gets linked against libgl.a
>>>> and references the first routine (fu).  We get a copy of fu in the DSO
>>>> along with a copy of utf8_sb_map.
>>>>
>>>>
>>>> Then imagine there's a main executable that dynamicly links against
>>>> libdso.so, then links statically against libgl.a.  Assume the  main
>>>> executable does not directly reference fu(), but does call a routine in
>>>> libdso.so which eventually calls fu().  Also assume the main executable
>>>> directly calls bar().  Again, remember we're compiling with LTO, so we
>>>> don't suck in the entire TU, just the routines/data we need.
>>>>
>>>>
>>>> In this scenario, both libdso.so and the main executable are going to a
>>>> copy of utf8_sb_map and they'll be at different addresses.  So when the
>>>> main executable calls into libdso.so which in turn calls libdso's copy
>>>> of fu() which stuffs the address of utf8_sb_map from the DSO into
>>>> dfa->sb_char.  Later the main executable calls bar() that's in the main
>>>> executable.  It does the comparison to see if dfa->sb_char is equal to
>>>> utf8_sb_map -- but it's using the main executable's copy of utf8_sb_map
>>>> and naturally free() blows us because it was passed a static object, not
>>>> a malloc'd object.
>>>>
>>>>
>>>> ISTM this is a lot like the problem we have where we inline functions
>>>> with static data.   To fix those we use STB_GNU_UNIQUE.  But I don't see
>>>> any code in the C front-end which would utilize STB_GNU_UNIQUE.  It's
>>>> support seems limited to C++.
>>>>
>>>>
>>>> How is this supposed to work for C?
>>>>
>>>>
>>>> Jeff
>>>>
>>>>
>>> Can you group utf8_sb_map, fu and bar together so that they are defined
>>> together?
>> They're all defined within the same TU in gnulib.  It's the LTO
>> dead/unreachable code elimination that results in just parts of the TU
>> being copied into the DSO and a different set copied into the main
>> executable.  In many ways LTO makes this look a lot like the static data
>> member problems we've had to deal with in the C++ world.
> In this case, LTO should treat them as in a single group.   Removing
> one group member should remove the whole group.  Keep one member
> should keep the whole group.

Do you mean ensure they're all in a partition together?  I think that
might work in the immediate term, but is probably brittle in the long
term.  I'd tend to lean towards forcing these static data objects to be
STB_GNU_UNIQUE -- that seems more robust to me.


jeff



* Re: Problem with static const objects and LTO
  2020-09-16 17:24       ` Jeff Law
@ 2020-09-16 17:32         ` H.J. Lu
  2020-09-16 17:41           ` Jeff Law
  0 siblings, 1 reply; 17+ messages in thread
From: H.J. Lu @ 2020-09-16 17:32 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches

On Wed, Sep 16, 2020 at 10:24 AM Jeff Law <law@redhat.com> wrote:
>
>
> On 9/16/20 11:13 AM, H.J. Lu wrote:
> > On Wed, Sep 16, 2020 at 10:10 AM Jeff Law <law@redhat.com> wrote:
> >>
> >> On 9/16/20 11:05 AM, H.J. Lu wrote:
> >>> On Wed, Sep 16, 2020 at 9:53 AM Jeff Law via Gcc-patches
> >>> <gcc-patches@gcc.gnu.org> wrote:
> >>>> Consider a TU with file scoped "static const object utf8_sb_map".   A
> >>>> routine within the TU will stuff &utf8_sb_map into an object, something
> >>>> like:
> >>>>
> >>>> fu (...)
> >>>>
> >>>> {
> >>>>
> >>>>   if (cond)
> >>>>
> >>>>     dfa->sb_char = utf8_sb_map;
> >>>>
> >>>>   else
> >>>>
> >>>>     dfa->sb_char = malloc (...);
> >>>>
> >>>> }
> >>>>
> >>>>
> >>>> There is another routine in the TU which looks like
> >>>>
> >>>> bar (...)
> >>>>
> >>>> {
> >>>>
> >>>>   if (dfa->sb_char != utf8_sb_map)
> >>>>
> >>>>     free (dfa->sb_char);
> >>>>
> >>>> }
> >>>>
> >>>>
> >>>> Now imagine that the TU is compiled (with LTO) into a static library,
> >>>> libgl.a and there's a DSO (libdso.so) which gets linked against libgl.a
> >>>> and references the first routine (fu).  We get a copy of fu in the DSO
> >>>> along with a copy of utf8_sb_map.
> >>>>
> >>>>
> >>>> Then imagine there's a main executable that dynamicly links against
> >>>> libdso.so, then links statically against libgl.a.  Assume the  main
> >>>> executable does not directly reference fu(), but does call a routine in
> >>>> libdso.so which eventually calls fu().  Also assume the main executable
> >>>> directly calls bar().  Again, remember we're compiling with LTO, so we
> >>>> don't suck in the entire TU, just the routines/data we need.
> >>>>
> >>>>
> >>>> In this scenario, both libdso.so and the main executable are going to a
> >>>> copy of utf8_sb_map and they'll be at different addresses.  So when the
> >>>> main executable calls into libdso.so which in turn calls libdso's copy
> >>>> of fu() which stuffs the address of utf8_sb_map from the DSO into
> >>>> dfa->sb_char.  Later the main executable calls bar() that's in the main
> >>>> executable.  It does the comparison to see if dfa->sb_char is equal to
> >>>> utf8_sb_map -- but it's using the main executable's copy of utf8_sb_map
> >>>> and naturally free() blows us because it was passed a static object, not
> >>>> a malloc'd object.
> >>>>
> >>>>
> >>>> ISTM this is a lot like the problem we have where we inline functions
> >>>> with static data.   To fix those we use STB_GNU_UNIQUE.  But I don't see
> >>>> any code in the C front-end which would utilize STB_GNU_UNIQUE.  It's
> >>>> support seems limited to C++.
> >>>>
> >>>>
> >>>> How is this supposed to work for C?
> >>>>
> >>>>
> >>>> Jeff
> >>>>
> >>>>
> >>> Can you group utf8_sb_map, fu and bar together so that they are defined
> >>> together?
> >> They're all defined within the same TU in gnulib.  It's the LTO
> >> dead/unreachable code elimination that results in just parts of the TU
> >> being copied into the DSO and a different set copied into the main
> >> executable.  In many ways LTO makes this look a lot like the static data
> >> member problems we've had to deal with in the C++ world.
> > In this case, LTO should treat them as in a single group.   Removing
> > one group member should remove the whole group.  Keep one member
> > should keep the whole group.
>
> Do you mean ensure they're all in a partition together?  I think that
> might work in the immediate term, but is probably brittle in the long
> term.  I'd tend to lean towards forcing these static data objects to be
> STB_GNU_UNIQUE -- that seems more robust to me.

Isn't STB_GNU_UNIQUE binding global?  How does it work with

static const int foo;

and

static const double foo;

in different files?

-- 
H.J.


* Re: Problem with static const objects and LTO
  2020-09-16 17:32         ` H.J. Lu
@ 2020-09-16 17:41           ` Jeff Law
  0 siblings, 0 replies; 17+ messages in thread
From: Jeff Law @ 2020-09-16 17:41 UTC (permalink / raw)
  To: H.J. Lu; +Cc: GCC Patches



On 9/16/20 11:32 AM, H.J. Lu wrote:
> On Wed, Sep 16, 2020 at 10:24 AM Jeff Law <law@redhat.com> wrote:
>>
>> On 9/16/20 11:13 AM, H.J. Lu wrote:
>>> On Wed, Sep 16, 2020 at 10:10 AM Jeff Law <law@redhat.com> wrote:
>>>> On 9/16/20 11:05 AM, H.J. Lu wrote:
>>>>> On Wed, Sep 16, 2020 at 9:53 AM Jeff Law via Gcc-patches
>>>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>>> Consider a TU with file scoped "static const object utf8_sb_map".   A
>>>>>> routine within the TU will stuff &utf8_sb_map into an object, something
>>>>>> like:
>>>>>>
>>>>>> fu (...)
>>>>>>
>>>>>> {
>>>>>>
>>>>>>   if (cond)
>>>>>>
>>>>>>     dfa->sb_char = utf8_sb_map;
>>>>>>
>>>>>>   else
>>>>>>
>>>>>>     dfa->sb_char = malloc (...);
>>>>>>
>>>>>> }
>>>>>>
>>>>>>
>>>>>> There is another routine in the TU which looks like
>>>>>>
>>>>>> bar (...)
>>>>>>
>>>>>> {
>>>>>>
>>>>>>   if (dfa->sb_char != utf8_sb_map)
>>>>>>
>>>>>>     free (dfa->sb_char);
>>>>>>
>>>>>> }
>>>>>>
>>>>>>
>>>>>> Now imagine that the TU is compiled (with LTO) into a static library,
>>>>>> libgl.a and there's a DSO (libdso.so) which gets linked against libgl.a
>>>>>> and references the first routine (fu).  We get a copy of fu in the DSO
>>>>>> along with a copy of utf8_sb_map.
>>>>>>
>>>>>>
>>>>>> Then imagine there's a main executable that dynamicly links against
>>>>>> libdso.so, then links statically against libgl.a.  Assume the  main
>>>>>> executable does not directly reference fu(), but does call a routine in
>>>>>> libdso.so which eventually calls fu().  Also assume the main executable
>>>>>> directly calls bar().  Again, remember we're compiling with LTO, so we
>>>>>> don't suck in the entire TU, just the routines/data we need.
>>>>>>
>>>>>>
>>>>>> In this scenario, both libdso.so and the main executable are going to a
>>>>>> copy of utf8_sb_map and they'll be at different addresses.  So when the
>>>>>> main executable calls into libdso.so which in turn calls libdso's copy
>>>>>> of fu() which stuffs the address of utf8_sb_map from the DSO into
>>>>>> dfa->sb_char.  Later the main executable calls bar() that's in the main
>>>>>> executable.  It does the comparison to see if dfa->sb_char is equal to
>>>>>> utf8_sb_map -- but it's using the main executable's copy of utf8_sb_map
>>>>>> and naturally free() blows us because it was passed a static object, not
>>>>>> a malloc'd object.
>>>>>>
>>>>>>
>>>>>> ISTM this is a lot like the problem we have where we inline functions
>>>>>> with static data.   To fix those we use STB_GNU_UNIQUE.  But I don't see
>>>>>> any code in the C front-end which would utilize STB_GNU_UNIQUE.  It's
>>>>>> support seems limited to C++.
>>>>>>
>>>>>>
>>>>>> How is this supposed to work for C?
>>>>>>
>>>>>>
>>>>>> Jeff
>>>>>>
>>>>>>
>>>>> Can you group utf8_sb_map, fu and bar together so that they are defined
>>>>> together?
>>>> They're all defined within the same TU in gnulib.  It's the LTO
>>>> dead/unreachable code elimination that results in just parts of the TU
>>>> being copied into the DSO and a different set copied into the main
>>>> executable.  In many ways LTO makes this look a lot like the static data
>>>> member problems we've had to deal with in the C++ world.
>>> In this case, LTO should treat them as in a single group.   Removing
>>> one group member should remove the whole group.  Keep one member
>>> should keep the whole group.
>> Do you mean ensure they're all in a partition together?  I think that
>> might work in the immediate term, but is probably brittle in the long
>> term.  I'd tend to lean towards forcing these static data objects to be
>> STB_GNU_UNIQUE -- that seems more robust to me.
> Isn't STB_GNU_UNIQUE binding global?  How does it work with
>
> static const int foo;
>
> and
>
> static const double foo;
>
> in different files?

It's global in the sense that it uniquifies across TU boundaries, which
is its purpose.  Consider a function-scoped static object in a template
or C++ inlined function.  We can end up with multiple copies in
different TUs and we have to collapse them to a single representative
object.  We'd need to do the same thing here.  I haven't thought a whole
lot about it, but ISTM that if LTO pulls in a function that references a
static object, then that static object would need to be STB_GNU_UNIQUE.


For something like you've shown above, I'd *hope* we give an ODR error :-)


Jeff



* Re: Problem with static const objects and LTO
  2020-09-16 16:51 Problem with static const objects and LTO Jeff Law
  2020-09-16 17:05 ` H.J. Lu
@ 2020-09-16 17:52 ` Joseph Myers
  2020-09-16 20:24   ` Jeff Law
  1 sibling, 1 reply; 17+ messages in thread
From: Joseph Myers @ 2020-09-16 17:52 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches

On Wed, 16 Sep 2020, Jeff Law via Gcc-patches wrote:

> ISTM this is a lot like the problem we have where we inline functions
> with static data.   To fix those we use STB_GNU_UNIQUE.  But I don't see
> any code in the C front-end which would utilize STB_GNU_UNIQUE.  It's
> support seems limited to C++.
> 
> 
> How is this supposed to work for C?

C inline functions don't try to address this.  The standard has a rule "An 
inline definition of a function with external linkage shall not contain a 
definition of a modifiable object with static or thread storage duration, 
and shall not contain a reference to an identifier with internal 
linkage.", which avoids some cases of this, but you can still get multiple 
copies of objects (with static storage duration, not modifiable, no 
linkage, i.e. static const defined inside an inline function) as noted in 
a footnote "Since an inline definition is distinct from the corresponding 
external definition and from any other corresponding inline definitions in 
other translation units, all corresponding objects with static storage 
duration are also distinct in each of the definitions.".
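
A small sketch of the case the rule still permits, assuming C99 or later
inline semantics.  The names small_primes and third_prime are made up
for illustration.  The object inside the inline function is static but
not modifiable, so the constraint does not reject it.  If the inline
definition lived in a header, each TU including it would get its own
"primes" object, possibly at a different address.

```c
#include <assert.h>

/* Permitted: the object has static storage duration but is const.  */
inline const int *small_primes(void)
{
    static const int primes[] = { 2, 3, 5, 7 };
    return primes;
}

/* Makes this file provide the external definition, so the example
   links on its own.  A TU that omits this line only has an inline
   definition, with its own distinct "primes" object.  */
extern inline const int *small_primes(void);

int third_prime(void)
{
    const int *p = small_primes();

    assert(p != 0);
    return p[2];
}
```

Comparing the pointer returned by small_primes() in one TU against the
one returned in another is therefore not guaranteed to compare equal.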

-- 
Joseph S. Myers
joseph@codesourcery.com


* Re: Problem with static const objects and LTO
  2020-09-16 17:52 ` Joseph Myers
@ 2020-09-16 20:24   ` Jeff Law
  2020-09-17  7:04     ` Richard Biener
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Law @ 2020-09-16 20:24 UTC (permalink / raw)
  To: Joseph Myers; +Cc: GCC Patches



On 9/16/20 11:52 AM, Joseph Myers wrote:
> On Wed, 16 Sep 2020, Jeff Law via Gcc-patches wrote:
>
>> ISTM this is a lot like the problem we have where we inline functions
>> with static data.   To fix those we use STB_GNU_UNIQUE.  But I don't see
>> any code in the C front-end which would utilize STB_GNU_UNIQUE.  It's
>> support seems limited to C++.
>>
>>
>> How is this supposed to work for C?
> C inline functions don't try to address this.  The standard has a rule "An 
> inline definition of a function with external linkage shall not contain a 
> definition of a modifiable object with static or thread storage duration, 
> and shall not contain a reference to an identifier with internal 
> linkage.", which avoids some cases of this, but you can still get multiple 
> copies of objects (with static storage duration, not modifiable, no 
> linkage, i.e. static const defined inside an inline function) as noted in 
> a footnote "Since an inline definition is distinct from the corresponding 
> external definition and from any other corresponding inline definitions in 
> other translation units, all corresponding objects with static storage 
> duration are also distinct in each of the definitions.".

I was kind of guessing C would end up with something like the above --
I'm just happy it's as well documented as it is.  Thanks!



The more I ponder this problem the more worried I get.   Before LTO we
could consider the TU an indivisible unit and if the main program
referenced something in the TU, then a static link of that TU would
bring the entire TU into the main program and that would satisfy all
references to all symbols defined by the TU, including those in any DSOs
used by the main program.  So it's an indivisible unit at runtime too. 
I could change internal data structures, rebuild the static library &
main program (leaving the DSOs alone) and it'd just work.


In an LTO world the TU isn't indivisible anymore.  LTO will happily
discard things which don't appear to be used.  So parts of the TU may
be in the main program, other parts may be in DSOs used by the main
program.  This can mean that objects are unexpectedly being passed
across DSO boundaries and the details of those objects have, in effect,
become part of the ABI.  If I were to change an internal data structure
and rebuild the static library and main program, we could get bad
behavior: the main executable is "incomplete", so we end up calling
copies of routines from the (not rebuilt) DSOs, and an instance of the
changed data structure could be passed to them.
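
A sketch of why that is dangerous, using two hypothetical layouts of the
same internal structure (neither is the real gnulib layout; dfa_old,
dfa_new and the helper names are made up for illustration).  Code in the
stale DSO would keep using the old offset of sb_char while the rebuilt
executable uses the new one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout the (not rebuilt) DSO was compiled against.  */
struct dfa_old {
    const char *sb_char;
    int nodes;
};

/* Hypothetical layout after the internal change, as seen by the
   rebuilt static library and main program.  */
struct dfa_new {
    int flags;
    const char *sb_char;
    int nodes;
};

size_t sb_char_offset_old(void)
{
    return offsetof(struct dfa_old, sb_char);
}

size_t sb_char_offset_new(void)
{
    return offsetof(struct dfa_new, sb_char);
}

/* Returns nonzero when the two builds disagree about where sb_char
   lives inside the object.  */
int layouts_disagree(void)
{
    assert(sb_char_offset_old() == 0);
    return sb_char_offset_old() != sb_char_offset_new();
}
```

Here layouts_disagree() returns nonzero, so a routine from the old DSO
handed a dfa_new object would read the wrong bytes as sb_char.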


Jeff

 






* Re: Problem with static const objects and LTO
  2020-09-16 20:24   ` Jeff Law
@ 2020-09-17  7:04     ` Richard Biener
  2020-09-17 18:18       ` Jeff Law
  0 siblings, 1 reply; 17+ messages in thread
From: Richard Biener @ 2020-09-17  7:04 UTC (permalink / raw)
  To: Jeff Law; +Cc: Joseph Myers, GCC Patches

On Wed, Sep 16, 2020 at 10:24 PM Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
>
> On 9/16/20 11:52 AM, Joseph Myers wrote:
> > On Wed, 16 Sep 2020, Jeff Law via Gcc-patches wrote:
> >
> >> ISTM this is a lot like the problem we have where we inline functions
> >> with static data.   To fix those we use STB_GNU_UNIQUE.  But I don't see
> >> any code in the C front-end which would utilize STB_GNU_UNIQUE.  It's
> >> support seems limited to C++.
> >>
> >>
> >> How is this supposed to work for C?
> > C inline functions don't try to address this.  The standard has a rule "An
> > inline definition of a function with external linkage shall not contain a
> > definition of a modifiable object with static or thread storage duration,
> > and shall not contain a reference to an identifier with internal
> > linkage.", which avoids some cases of this, but you can still get multiple
> > copies of objects (with static storage duration, not modifiable, no
> > linkage, i.e. static const defined inside an inline function) as noted in
> > a footnote "Since an inline definition is distinct from the corresponding
> > external definition and from any other corresponding inline definitions in
> > other translation units, all corresponding objects with static storage
> > duration are also distinct in each of the definitions.".
>
> I was kindof guessing C would end up with something like the above --
> I'm just happy its as well documented as it is.  Thanks!
>
>
>
> The more I ponder this problem the more worried I get.   Before LTO we
> could consider the TU an indivisible unit and if the main program
> referenced something in the TU, then a static link of that TU would
> bring the entire TU into the main program and that would satisfy all
> references to all symbols defined by the TU, including those in any DSOs
> used by the main program.  So it's an indivisible unit at runtime too.
> I could change internal data structures, rebuild the static library &
> main program (leaving the DSOs alone) and it'd just work.
>
>
> In an LTO world the TU isn't indivisible anymore.  LTO will happily
> discard things which don't appear to be used.   So parts of the TU may
> be in the main program, other parts may be in DSOs used by the main
> program.   This can mean that objects are unexpectedly being passed
> across DSO boundaries and the details of those objects has, in effect,
> become part of the ABI.  If I was to change an internal data structure,
> build the static library and main program we could get bad behavior
> because an instance of that data structure could be passed to a DSO
> because the main executable is "incomplete" and we end up calling copies
> of routines from the (not rebuilt) DSOs.

I think the situation is simpler - LTO can duplicate data objects for the
purpose of optimization within some constraints, and there might be a simple
error in it deciding that duplicating the static const object into two LTRANS
units is OK.  So - do you have a testcase?

That said, LTO doesn't produce both the main program and the DSO
at the same time - you produce both separately and the latent "hidden ABI"
issue would exist even w/o LTO.

Richard.

>
> Jeff
>
>
>
>
>
>

* Re: Problem with static const objects and LTO
  2020-09-17  7:04     ` Richard Biener
@ 2020-09-17 18:18       ` Jeff Law
  2020-09-17 19:03         ` Jakub Jelinek
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Law @ 2020-09-17 18:18 UTC (permalink / raw)
  To: Richard Biener; +Cc: Joseph Myers, GCC Patches


On 9/17/20 1:04 AM, Richard Biener wrote:
> On Wed, Sep 16, 2020 at 10:24 PM Jeff Law via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> On 9/16/20 11:52 AM, Joseph Myers wrote:
>>> On Wed, 16 Sep 2020, Jeff Law via Gcc-patches wrote:
>>>
>>>> ISTM this is a lot like the problem we have where we inline functions
>>>> with static data.   To fix those we use STB_GNU_UNIQUE.  But I don't see
>>>> any code in the C front-end which would utilize STB_GNU_UNIQUE.  It's
>>>> support seems limited to C++.
>>>>
>>>>
>>>> How is this supposed to work for C?
>>> C inline functions don't try to address this.  The standard has a rule "An
>>> inline definition of a function with external linkage shall not contain a
>>> definition of a modifiable object with static or thread storage duration,
>>> and shall not contain a reference to an identifier with internal
>>> linkage.", which avoids some cases of this, but you can still get multiple
>>> copies of objects (with static storage duration, not modifiable, no
>>> linkage, i.e. static const defined inside an inline function) as noted in
>>> a footnote "Since an inline definition is distinct from the corresponding
>>> external definition and from any other corresponding inline definitions in
>>> other translation units, all corresponding objects with static storage
>>> duration are also distinct in each of the definitions.".
>> I was kindof guessing C would end up with something like the above --
>> I'm just happy its as well documented as it is.  Thanks!
>>
>>
>>
>> The more I ponder this problem the more worried I get.   Before LTO we
>> could consider the TU an indivisible unit and if the main program
>> referenced something in the TU, then a static link of that TU would
>> bring the entire TU into the main program and that would satisfy all
>> references to all symbols defined by the TU, including those in any DSOs
>> used by the main program.  So it's an indivisible unit at runtime too.
>> I could change internal data structures, rebuild the static library &
>> main program (leaving the DSOs alone) and it'd just work.
>>
>>
>> In an LTO world the TU isn't indivisible anymore.  LTO will happily
>> discard things which don't appear to be used.   So parts of the TU may
>> be in the main program, other parts may be in DSOs used by the main
>> program.   This can mean that objects are unexpectedly being passed
>> across DSO boundaries and the details of those objects has, in effect,
>> become part of the ABI.  If I was to change an internal data structure,
>> build the static library and main program we could get bad behavior
>> because an instance of that data structure could be passed to a DSO
>> because the main executable is "incomplete" and we end up calling copies
>> of routines from the (not rebuilt) DSOs.
> I think the situation is simpler - LTO can duplicate data objects for the
> purpose of optimization within some constraints and there might be a simple
> error in it thinking duplicating of the static const object into two LTRANS
> units is OK.  So - do you have a testcase?

man-db trips over this.  The executable links against a static version
of gnulib as well as linking dynamically to DSOs which themselves were
linked against a static gnulib.  We're getting bits of regcomp.c in the
main executable and other bits are in a DSO.  We have two copies of
utf8_sb_map (one in the main executable, another in a DSO) -- and the
code within regcomp assumes there is only one copy of utf8_sb_map.

Prior to LTO, when the main executable linked against the static gnulib
and had to pull in anything from regcomp.c, it ended up pulling in *all*
of regcomp.c into the main executable and those definitions overrode
anything in the DSO and it "just worked", though it is a nightmare from
a composability standpoint.


>
> That said, LTO doesn't produce both the main program and the DSO
> at the same time - you produce both separately and the latent "hidden ABI"
> issue would exist even w/o LTO.

Absolutely.  This is a composability problem that's made worse by LTO.


jeff



* Re: Problem with static const objects and LTO
  2020-09-17 18:18       ` Jeff Law
@ 2020-09-17 19:03         ` Jakub Jelinek
  2020-10-07 22:08           ` Jeff Law
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Jelinek @ 2020-09-17 19:03 UTC (permalink / raw)
  To: Jeff Law; +Cc: Richard Biener, GCC Patches, Joseph Myers

On Thu, Sep 17, 2020 at 12:18:40PM -0600, Jeff Law via Gcc-patches wrote:
> >> In an LTO world the TU isn't indivisible anymore.  LTO will happily
> >> discard things which don't appear to be used.   So parts of the TU may
> >> be in the main program, other parts may be in DSOs used by the main
> >> program.   This can mean that objects are unexpectedly being passed
> >> across DSO boundaries and the details of those objects has, in effect,
> >> become part of the ABI.  If I was to change an internal data structure,
> >> build the static library and main program we could get bad behavior
> >> because an instance of that data structure could be passed to a DSO
> >> because the main executable is "incomplete" and we end up calling copies
> >> of routines from the (not rebuilt) DSOs.
> > I think the situation is simpler - LTO can duplicate data objects for the
> > purpose of optimization within some constraints and there might be a simple
> > error in it thinking duplicating of the static const object into two LTRANS
> > units is OK.  So - do you have a testcase?
> 
> man-db trips over this.  The executable links against a static version
> of gnulib as well as linking dynamically to DSOs which themselves were
> linked against a static gnulib.  We're getting bits of regcomp.c in the
> main executable and other bits are in a DSO.  We have two copies of
> utf8_sb_map (one in the main executable, another in a DSO) -- and the
> code within regcomp assumes there is only one copy of utf8_sb_map.
> 
> Prior to LTO, when the main executable linked against the static gnulib
> and had to pull in anything from regcomp.c, it ended up pulling in *all*
> of regcomp.c into the main executable and those definitions overrode
> anything in the DSO and it "just worked", though it is a nightmare from
> a composability standpoint.

That looks to me like a problem on the linker or linker plugin side.
If a TU has two global entry points, then when linking a *.a library
containing that TU, either the linker chooses to link it in, and then we
should ensure both symbols are exported, or it is not linked and nothing is
added.  Both of the symbols are part of the library ABI (unless, of course,
a version script is used, the symbols are hidden, etc.).
Can you check whether it always behaves that way for shared libraries?

On the executable side, I guess it is a slightly different case.  Unless e.g.
-Wl,-E / -Wl,--export-dynamic is used (in which case again everything not
hidden should be treated as exported, as in shared libraries), I guess we
might consider symbols that are not really used as not needed.  But I'd say
that even here we should follow what the linker would normally do, i.e. if it
would normally put a symbol into .dynsym (e.g. because some shared library
references that symbol and the binary is linked against it), we should still
ensure the symbol is exported.
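
As a rough illustration of the "hidden" versus "exported" distinction
mentioned above, here is a minimal sketch using GCC's visibility attribute.
The function names are hypothetical and are not taken from gnulib or from
any testcase in this thread; the point is only that a symbol with default
visibility is a candidate for .dynsym, while a hidden one is not.

```c
/* Default visibility: when this is linked into a shared library (or into
   an executable linked with -Wl,--export-dynamic), the symbol may be
   placed in the dynamic symbol table and bound to from other modules.
   The name exported_entry is a made-up example.  */
__attribute__ ((visibility ("default"))) int
exported_entry (void)
{
  return 1;
}

/* Hidden visibility: usable from anywhere inside the same linked module,
   but not exported through the dynamic symbol table.  The name
   hidden_helper is a made-up example.  */
__attribute__ ((visibility ("hidden"))) int
hidden_helper (void)
{
  return 2;
}
```

Both functions behave identically when called from within the module that
defines them; the difference only shows up at the module boundary.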

	Jakub


* Re: Problem with static const objects and LTO
  2020-09-17 19:03         ` Jakub Jelinek
@ 2020-10-07 22:08           ` Jeff Law
  2020-10-07 22:09             ` Jeff Law
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Law @ 2020-10-07 22:08 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, GCC Patches, Joseph Myers


On 9/17/20 1:03 PM, Jakub Jelinek wrote:
[ ... Big snip, starting over ... ]

I may not have explained things too well.  So I've put together a small
example that folks can play with to show the underlying issue.


There's a static library libfu.a.  In this static library we have a hunk
of local static data (utf8_sb_map) and two functions.  One function puts
the address of utf8_sb_map into a structure (rpl_regcomp), the other
verifies that the address stored in the structure is the same as the
address of utf8_sb_map (rpl_regfree).
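
For readers who want something concrete, the library source described above
could look roughly like the sketch below.  This is an illustration and not
the actual testcase: the struct name and layout, the table size and contents,
and the function signatures are assumptions.  Only the names utf8_sb_map,
rpl_regcomp and rpl_regfree come from the description.

```c
#include <stdlib.h>

/* File-scope table with internal linkage, standing in for the real
   utf8_sb_map.  The size and contents are placeholders.  */
static const unsigned int utf8_sb_map[4] = { ~0u, ~0u, ~0u, ~0u };

/* Hypothetical structure; the real testcase may lay this out differently.  */
struct re_dfa
{
  const unsigned int *sb_char;
};

/* Record the address of this copy of utf8_sb_map in the structure.  */
void
rpl_regcomp (struct re_dfa *dfa)
{
  dfa->sb_char = utf8_sb_map;
}

/* Verify that the stored address is the address of this copy of
   utf8_sb_map, and abort if it is not.  */
void
rpl_regfree (struct re_dfa *dfa)
{
  if (dfa->sb_char != utf8_sb_map)
    abort ();
  dfa->sb_char = NULL;
}
```

As long as both functions come from the same copy of this object file, the
check in rpl_regfree always passes.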


That static library is linked into DSO libdso.so.  The DSO sources
define a single function xregcomp which calls rpl_regcomp, but
references nothing else from the static library.  Since libfu.a was
linked into the library we actually get a *copy* of rpl_regcomp in the
DSO.  In fact, we get a copy of the entire .o file from libfu.a, which
matches traditional linkage models where the .o file is an atomic unit
for linking.


The main program calls xregcomp, which is defined in the DSO, and also
calls rpl_regfree.  The main program links against libdso.so and libfu.a.
Because it links libfu.a, it gets a copy of rpl_regfree, but *only*
rpl_regfree.  That copy of rpl_regfree references a new and distinct
copy of utf8_sb_map.  Naturally the address of utf8_sb_map in the main
program is different from the one in libdso.so and the test aborts.
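
The failure can be modelled inside a single process by giving each "module"
its own table.  This is only a model of the situation described above, not
the testcase itself: the dso_ and exe_ prefixes, the struct, and the helper
that returns a flag instead of aborting are assumptions made for
illustration.

```c
/* Hypothetical structure shared by both "modules".  */
struct re_dfa
{
  const unsigned int *sb_char;
};

/* The copy of the table that ended up in libdso.so.  */
static const unsigned int dso_utf8_sb_map[4] = { ~0u, ~0u, ~0u, ~0u };

/* The copy of the table that LTO placed in the main executable.  Same
   contents, but a distinct object at a distinct address.  */
static const unsigned int exe_utf8_sb_map[4] = { ~0u, ~0u, ~0u, ~0u };

/* The DSO's rpl_regcomp: stores the address of the DSO's copy.  */
static void
dso_rpl_regcomp (struct re_dfa *dfa)
{
  dfa->sb_char = dso_utf8_sb_map;
}

/* The DSO's xregcomp wrapper, which is what the main program calls.  */
void
xregcomp (struct re_dfa *dfa)
{
  dso_rpl_regcomp (dfa);
}

/* The executable's rpl_regfree check, compared against the executable's
   own copy.  Returns 1 where the testcase would abort, 0 otherwise.  */
int
exe_rpl_regfree_would_abort (const struct re_dfa *dfa)
{
  return dfa->sb_char != exe_utf8_sb_map;
}
```

Calling xregcomp and then exe_rpl_regfree_would_abort on the same structure
reports a mismatch, because the two tables are different objects even though
their contents are identical.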


Without LTO the main program would still reference rpl_regfree, but the
main program would not have its own copy.  rpl_regfree and rpl_regcomp
would both be satisfied by the DSO (which, remember, has a complete copy
of the .o file from libfu.a).  Thus there would be only one utf8_sb_map
as well and naturally the program will exit normally.


So I've got a bunch of thoughts here, but will defer sharing them
immediately so as not to unduly influence anyone.


I don't have a sense of how pervasive this issue is.   I know it affects
man-db, but there could well be others.




jeff


* Re: Problem with static const objects and LTO
  2020-10-07 22:08           ` Jeff Law
@ 2020-10-07 22:09             ` Jeff Law
  2020-10-07 23:12               ` H.J. Lu
  0 siblings, 1 reply; 17+ messages in thread
From: Jeff Law @ 2020-10-07 22:09 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Richard Biener, GCC Patches, Joseph Myers

Adding the testcase...

On 10/7/20 4:08 PM, Jeff Law wrote:
> On 9/17/20 1:03 PM, Jakub Jelinek wrote:
> [ ... Big snip, starting over ... ]
>
> I may have not explained things too well.  So I've put together a small
> example that folks can play with to show the underlying issue.
>
>
> There's a static library libfu.a.  In this static library we have a hunk
> of local static data (utf8_sb_map) and two functions.  One function puts
> the address utf8_sb_map into a structure (rpl_regcomp), the other
> verifies that the address stored in the structure is the same as the
> address of utf8_sb_map (rpl_regfree).
>
>
> That static library is linked into DSO libdso.so.  The DSO sources
> define a single function xregcomp which calls rpl_regcomp, but
> references  nothing else from the static library.  Since libfu.a was
> linked into the library we actually get a *copy* of rpl_regcomp in the
> DSO.  In fact, we get a copy of the entire .o file from libfu.a, which
> matches traditional linkage models where the .o file is an atomic unit
> for linking.
>
>
> The main program calls xregcomp which is defined in the DSO and calls
> rpl_regfree.  The main program links against libdso.so and libfu.a. 
> Because it links libfu.a, it gets  a copy of rpl_regfree, but *only*
> rpl_regfree.  That copy of rpl_regfree references a new and distinct
> copy of utf8_sb_map.  Naturally the address of utf8_sb_map in the main
> program is different from the one in libdso.so and the test aborts.
>
>
> Without LTO the main program would still reference rpl_regfree, but the
> main program would not have its own copy.  rpl_regfree and rpl_regcomp
> would both be satisfied by the DSO (which remember has a complete copy
> of the .o file from libfu.a).  Thus there would be only one utf8_sb_map
> as well and naturally the program will exit normally.
>
>
> So I've got a bunch of thoughts here, but will defer sharing them
> immediately so as not to unduly influence anyone.
>
>
> I don't have a sense of how pervasive this issue is.   I know it affects
> man-db, but there could well be others.
>
>
>
>
> jeff
>

[-- Attachment #2: t.tar.gz --]
[-- Type: application/gzip, Size: 859 bytes --]

* Re: Problem with static const objects and LTO
  2020-10-07 22:09             ` Jeff Law
@ 2020-10-07 23:12               ` H.J. Lu
  2020-10-07 23:16                 ` Jeff Law
  2020-10-09 18:36                 ` Jeff Law
  0 siblings, 2 replies; 17+ messages in thread
From: H.J. Lu @ 2020-10-07 23:12 UTC (permalink / raw)
  To: Jeff Law; +Cc: Jakub Jelinek, GCC Patches, Joseph Myers

On Wed, Oct 7, 2020 at 3:09 PM Jeff Law via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Adding the testcase...
>
> On 10/7/20 4:08 PM, Jeff Law wrote:
> > On 9/17/20 1:03 PM, Jakub Jelinek wrote:
> > [ ... Big snip, starting over ... ]
> >
> > I may have not explained things too well.  So I've put together a small
> > example that folks can play with to show the underlying issue.
> >
> >
> > There's a static library libfu.a.  In this static library we have a hunk
> > of local static data (utf8_sb_map) and two functions.  One function puts
> > the address utf8_sb_map into a structure (rpl_regcomp), the other
> > verifies that the address stored in the structure is the same as the
> > address of utf8_sb_map (rpl_regfree).
> >
> >
> > That static library is linked into DSO libdso.so.  The DSO sources
> > define a single function xregcomp which calls rpl_regcomp, but
> > references  nothing else from the static library.  Since libfu.a was
> > linked into the library we actually get a *copy* of rpl_regcomp in the
> > DSO.  In fact, we get a copy of the entire .o file from libfu.a, which
> > matches traditional linkage models where the .o file is an atomic unit
> > for linking.
> >
> >
> > The main program calls xregcomp which is defined in the DSO and calls
> > rpl_regfree.  The main program links against libdso.so and libfu.a.
> > Because it links libfu.a, it gets  a copy of rpl_regfree, but *only*
> > rpl_regfree.  That copy of rpl_regfree references a new and distinct
> > copy of utf8_sb_map.  Naturally the address of utf8_sb_map in the main
> > program is different from the one in libdso.so and the test aborts.
> >
> >
> > Without LTO the main program would still reference rpl_regfree, but the
> > main program would not have its own copy.  rpl_regfree and rpl_regcomp
> > would both be satisfied by the DSO (which remember has a complete copy
> > of the .o file from libfu.a).  Thus there would be only one utf8_sb_map
> > as well and naturally the program will exit normally.
> >
> >
> > So I've got a bunch of thoughts here, but will defer sharing them
> > immediately so as not to unduly influence anyone.
> >
> >
> > I don't have a sense of how pervasive this issue is.   I know it affects
> > man-db, but there could well be others.

This is:

https://sourceware.org/bugzilla/show_bug.cgi?id=26530
https://sourceware.org/bugzilla/show_bug.cgi?id=26314

-- 
H.J.

* Re: Problem with static const objects and LTO
  2020-10-07 23:12               ` H.J. Lu
@ 2020-10-07 23:16                 ` Jeff Law
  2020-10-09 18:36                 ` Jeff Law
  1 sibling, 0 replies; 17+ messages in thread
From: Jeff Law @ 2020-10-07 23:16 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jakub Jelinek, GCC Patches, Joseph Myers


On 10/7/20 5:12 PM, H.J. Lu wrote:
> On Wed, Oct 7, 2020 at 3:09 PM Jeff Law via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>> Adding the testcase...
>>
>> On 10/7/20 4:08 PM, Jeff Law wrote:
>>> On 9/17/20 1:03 PM, Jakub Jelinek wrote:
>>> [ ... Big snip, starting over ... ]
>>>
>>> I may have not explained things too well.  So I've put together a small
>>> example that folks can play with to show the underlying issue.
>>>
>>>
>>> There's a static library libfu.a.  In this static library we have a hunk
>>> of local static data (utf8_sb_map) and two functions.  One function puts
>>> the address utf8_sb_map into a structure (rpl_regcomp), the other
>>> verifies that the address stored in the structure is the same as the
>>> address of utf8_sb_map (rpl_regfree).
>>>
>>>
>>> That static library is linked into DSO libdso.so.  The DSO sources
>>> define a single function xregcomp which calls rpl_regcomp, but
>>> references  nothing else from the static library.  Since libfu.a was
>>> linked into the library we actually get a *copy* of rpl_regcomp in the
>>> DSO.  In fact, we get a copy of the entire .o file from libfu.a, which
>>> matches traditional linkage models where the .o file is an atomic unit
>>> for linking.
>>>
>>>
>>> The main program calls xregcomp which is defined in the DSO and calls
>>> rpl_regfree.  The main program links against libdso.so and libfu.a.
>>> Because it links libfu.a, it gets  a copy of rpl_regfree, but *only*
>>> rpl_regfree.  That copy of rpl_regfree references a new and distinct
>>> copy of utf8_sb_map.  Naturally the address of utf8_sb_map in the main
>>> program is different from the one in libdso.so and the test aborts.
>>>
>>>
>>> Without LTO the main program would still reference rpl_regfree, but the
>>> main program would not have its own copy.  rpl_regfree and rpl_regcomp
>>> would both be satisfied by the DSO (which remember has a complete copy
>>> of the .o file from libfu.a).  Thus there would be only one utf8_sb_map
>>> as well and naturally the program will exit normally.
>>>
>>>
>>> So I've got a bunch of thoughts here, but will defer sharing them
>>> immediately so as not to unduly influence anyone.
>>>
>>>
>>> I don't have a sense of how pervasive this issue is.   I know it affects
>>> man-db, but there could well be others.
> This is:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=26530
> https://sourceware.org/bugzilla/show_bug.cgi?id=26314

I'm highly confident the fix for 26314 is already in my trees -- in fact
I'm the one that originally reported that to Nick to begin with and much
of that text is mine :-)


I will check 26530.  Thanks for the pointers.

jeff


* Re: Problem with static const objects and LTO
  2020-10-07 23:12               ` H.J. Lu
  2020-10-07 23:16                 ` Jeff Law
@ 2020-10-09 18:36                 ` Jeff Law
  1 sibling, 0 replies; 17+ messages in thread
From: Jeff Law @ 2020-10-09 18:36 UTC (permalink / raw)
  To: H.J. Lu; +Cc: Jakub Jelinek, GCC Patches, Joseph Myers


On 10/7/20 5:12 PM, H.J. Lu wrote:
> On Wed, Oct 7, 2020 at 3:09 PM Jeff Law via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>> Adding the testcase...
>>
>> On 10/7/20 4:08 PM, Jeff Law wrote:
>>> On 9/17/20 1:03 PM, Jakub Jelinek wrote:
>>> [ ... Big snip, starting over ... ]
>>>
>>> I may have not explained things too well.  So I've put together a small
>>> example that folks can play with to show the underlying issue.
>>>
>>>
>>> There's a static library libfu.a.  In this static library we have a hunk
>>> of local static data (utf8_sb_map) and two functions.  One function puts
>>> the address utf8_sb_map into a structure (rpl_regcomp), the other
>>> verifies that the address stored in the structure is the same as the
>>> address of utf8_sb_map (rpl_regfree).
>>>
>>>
>>> That static library is linked into DSO libdso.so.  The DSO sources
>>> define a single function xregcomp which calls rpl_regcomp, but
>>> references  nothing else from the static library.  Since libfu.a was
>>> linked into the library we actually get a *copy* of rpl_regcomp in the
>>> DSO.  In fact, we get a copy of the entire .o file from libfu.a, which
>>> matches traditional linkage models where the .o file is an atomic unit
>>> for linking.
>>>
>>>
>>> The main program calls xregcomp which is defined in the DSO and calls
>>> rpl_regfree.  The main program links against libdso.so and libfu.a.
>>> Because it links libfu.a, it gets  a copy of rpl_regfree, but *only*
>>> rpl_regfree.  That copy of rpl_regfree references a new and distinct
>>> copy of utf8_sb_map.  Naturally the address of utf8_sb_map in the main
>>> program is different from the one in libdso.so and the test aborts.
>>>
>>>
>>> Without LTO the main program would still reference rpl_regfree, but the
>>> main program would not have its own copy.  rpl_regfree and rpl_regcomp
>>> would both be satisfied by the DSO (which remember has a complete copy
>>> of the .o file from libfu.a).  Thus there would be only one utf8_sb_map
>>> as well and naturally the program will exit normally.
>>>
>>>
>>> So I've got a bunch of thoughts here, but will defer sharing them
>>> immediately so as not to unduly influence anyone.
>>>
>>>
>>> I don't have a sense of how pervasive this issue is.   I know it affects
>>> man-db, but there could well be others.
> This is:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=26530
> https://sourceware.org/bugzilla/show_bug.cgi?id=26314

Just to close the loop here.  There was a followup patch from Alan
attached to 26314 that wasn't in Fedora.  Adding that to Fedora fixes
the man-db issues.


Thanks!


jeff


end of thread

Thread overview: 17+ messages
2020-09-16 16:51 Problem with static const objects and LTO Jeff Law
2020-09-16 17:05 ` H.J. Lu
2020-09-16 17:10   ` Jeff Law
2020-09-16 17:13     ` H.J. Lu
2020-09-16 17:24       ` Jeff Law
2020-09-16 17:32         ` H.J. Lu
2020-09-16 17:41           ` Jeff Law
2020-09-16 17:52 ` Joseph Myers
2020-09-16 20:24   ` Jeff Law
2020-09-17  7:04     ` Richard Biener
2020-09-17 18:18       ` Jeff Law
2020-09-17 19:03         ` Jakub Jelinek
2020-10-07 22:08           ` Jeff Law
2020-10-07 22:09             ` Jeff Law
2020-10-07 23:12               ` H.J. Lu
2020-10-07 23:16                 ` Jeff Law
2020-10-09 18:36                 ` Jeff Law
