Re: inlining change and abidiff noise

public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed

* Re: inlining change and abidiff noise
  2017-01-01  0:00 ` Dodji Seketeli
       [not found]   ` <78E49230-1B78-43CE-A643-63AC1039F694@redhat.com>
@ 2017-01-01  0:00   ` Ben Woodard
  1 sibling, 0 replies; 4+ messages in thread
From: Ben Woodard @ 2017-01-01  0:00 UTC (permalink / raw)
  To: libabigail


> On Jun 15, 2017, at 5:25 AM, Dodji Seketeli <dodji@seketeli.org> wrote:
> 
> Hello Ben,
> 
> Ben Woodard <woodard@redhat.com> a écrit:
> 
>> One of the challenges doing inter-compiler comparison between object
>> files created by different compilers is that the signal to noise ratio
>> is very high apparently due to different inlining decisions.
>> 
>> Would there be any negative consequences to having abidiff consider
>> changes which appear to be due to just differences in how the
>> compilers choose to inline functions.
>> 
>> As a case in point:
>> 
>> $ ./tools/abidiff ./tools/.libs/abidiff ../build-llvl/tools/.libs/abidiff
>> Functions changes summary: 4 Removed, 0 Changed, 0 Added functions
>> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>> Function symbols changes summary: 7 Removed, 223 Added function
>> symbols not referenced by debug info
>> Variable symbols changes summary: 0 Removed, 3 Added variable symbols
>> not referenced by debug info
>> 
>> when you look 3 out of 4 of the removed functions and all 233 of the
>> added functions you can quickly see that they are all weak symbols.
> 
> A weak symbol is not necessarily a symbol that results from function
> inlining.
> 
> A user can very well write in her code:
> 
>    int __attribute__((weak)) power2(int x);

in addition to a declaration like that, wouldn’t there also need to be a definition somewhere?

> 
> And the symbol of that function 'power2' is going to be weak.
> 
> 
> In this case, however, abidiff is saying that: 
> 
>> Function symbols changes summary: 7 Removed, 223 Added function symbols not referenced by debug info
> 
> In other words, it is saying that the symbols that got added/removed
> match *no* debug info.  That is to say that no function explicitely
> written by the user generated those ELF symbols.  It follows that those
> symbols result from functions that were generated artificially by the
> compiler.
> 
> If this change was happening between two binaries generated with the
> same compiler (and using the same compiler option), it would certainly
> be a hint of a problem.
> 
What if the compiler version changes and the decision about which functions change?
I would also argue that at least for most compiler option changes like changing -O level it shouldn’t create any ABI artifacts and nor should libabigail generate any false positives if you simply change the optimization level. I would argue that the inter-compiler test is just a more rigorous test of the changes that happen when the optimization level changes.
> 
> And the symbol of that function 'power2' is going to be weak.
> 
> 
> In this case, however, abidiff is saying that: 
> 
>> Function symbols changes summary: 7 Removed, 223 Added function symbols not referenced by debug info
> 
> In other words, it is saying that the symbols that got added/removed
> match *no* debug info.  That is to say that no function explicitely
> written by the user generated those ELF symbols.  It follows that those
> symbols result from functions that were generated artificially by the
> compiler.
> 
> If this change was happening between two binaries generated with the
> same compiler (and using the same compiler option), it would certainly
> be a hint of a problem.
> 
> In this case of binaries generated by different compilers, I guess it's
> different, especially after you've analyzed the changes as you did.
> 
> So in this case, you could have used the --no-unreferenced-symbols
> option of abidiff to tell it to avoid showing you changes on symbols for
> which there is no debug info.
> 

I think that is a bit of a big hammer. I think that in the example you give below…

>> The assumption that I'm making is that adding logic like:
>> if a symbol is added or removed and the only reference to it is with a
>> weak symbol then it is a result of a change in inlining and it is
>> therefore harmless.
> 
> I think that assumption is not correct, unfortunately.  A removed weak
> symbol can result from a using removing the declaration below from the
> code:
> 
>    int __attribute__((weak)) power2(int x);
> 
Would a declaration without a definition generate anything at all? 

Isn’t the unique characteristic in this case that we can filter on the fact that we have weak symbols with no DWARF. Is there another example you can contrive where a CU has DWARF (i.e. it isn’t just stripped ELF or the result of a compilation without -g) but the compiler generates weak symbols that do not have associated DWARF where there is in fact a legitimate ABI change?

> So doing this by default might lead us to miss some real problems.


You’re the expert but I’m not completely convinced yet.

> 
> Maybe we could add a --no-weak-symbols option to abidiff that would
> implement the logic you are talking about?  That way, people who know
> what they are doing, like yourself, would use it.
> 
> Then, we'd write a guide for people analysing ABI changes that are due
> to using different compilers.  That guide would introduce the subtleties
> we are talking about here and mention the abidiff options that need to
> be used, when and why.
> 
> What do you think?
> 
> -- 
> 		Dodji

^ permalink raw reply	[flat|nested] 4+ messages in thread

* inlining change and abidiff noise
@ 2017-01-01  0:00 Ben Woodard
  2017-01-01  0:00 ` Dodji Seketeli
  0 siblings, 1 reply; 4+ messages in thread
From: Ben Woodard @ 2017-01-01  0:00 UTC (permalink / raw)
  To: libabigail

One of the challenges doing inter-compiler comparison between object
files created by different compilers is that the signal to noise ratio
is very high apparently due to different inlining decisions.

Would there be any negative consequences to having abidiff consider
changes which appear to be due to just differences in how the
compilers choose to inline functions.

As a case in point:

$ ./tools/abidiff ./tools/.libs/abidiff ../build-llvl/tools/.libs/abidiff
Functions changes summary: 4 Removed, 0 Changed, 0 Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
Function symbols changes summary: 7 Removed, 223 Added function
symbols not referenced by debug info
Variable symbols changes summary: 0 Removed, 3 Added variable symbols
not referenced by debug info

when you look 3 out of 4 of the removed functions and all 233 of the
added functions you can quickly see that they are all weak symbols.

The assumption that I'm making is that adding logic like:
if a symbol is added or removed and the only reference to it is with a
weak symbol then it is a result of a change in inlining and it is
therefore harmless.

What I would like to know is if that assumption is in fact correct?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: inlining change and abidiff noise
  2017-01-01  0:00 inlining change and abidiff noise Ben Woodard
@ 2017-01-01  0:00 ` Dodji Seketeli
       [not found]   ` <78E49230-1B78-43CE-A643-63AC1039F694@redhat.com>
  2017-01-01  0:00   ` Ben Woodard
  0 siblings, 2 replies; 4+ messages in thread
From: Dodji Seketeli @ 2017-01-01  0:00 UTC (permalink / raw)
  To: Ben Woodard; +Cc: libabigail

Hello Ben,

Ben Woodard <woodard@redhat.com> a écrit:

> One of the challenges doing inter-compiler comparison between object
> files created by different compilers is that the signal to noise ratio
> is very high apparently due to different inlining decisions.
>
> Would there be any negative consequences to having abidiff consider
> changes which appear to be due to just differences in how the
> compilers choose to inline functions.
>
> As a case in point:
>
> $ ./tools/abidiff ./tools/.libs/abidiff ../build-llvl/tools/.libs/abidiff
> Functions changes summary: 4 Removed, 0 Changed, 0 Added functions
> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> Function symbols changes summary: 7 Removed, 223 Added function
> symbols not referenced by debug info
> Variable symbols changes summary: 0 Removed, 3 Added variable symbols
> not referenced by debug info
>
> when you look 3 out of 4 of the removed functions and all 233 of the
> added functions you can quickly see that they are all weak symbols.

A weak symbol is not necessarily a symbol that results from function
inlining.

A user can very well write in her code:

    int __attribute__((weak)) power2(int x);

And the symbol of that function 'power2' is going to be weak.

In this case, however, abidiff is saying that: 

    > Function symbols changes summary: 7 Removed, 223 Added function symbols not referenced by debug info

In other words, it is saying that the symbols that got added/removed
match *no* debug info.  That is to say that no function explicitely
written by the user generated those ELF symbols.  It follows that those
symbols result from functions that were generated artificially by the
compiler.

If this change was happening between two binaries generated with the
same compiler (and using the same compiler option), it would certainly
be a hint of a problem.

In this case of binaries generated by different compilers, I guess it's
different, especially after you've analyzed the changes as you did.

So in this case, you could have used the --no-unreferenced-symbols
option of abidiff to tell it to avoid showing you changes on symbols for
which there is no debug info.

> The assumption that I'm making is that adding logic like:
> if a symbol is added or removed and the only reference to it is with a
> weak symbol then it is a result of a change in inlining and it is
> therefore harmless.

I think that assumption is not correct, unfortunately.  A removed weak
symbol can result from a using removing the declaration below from the
code:

    int __attribute__((weak)) power2(int x);

So doing this by default might lead us to miss some real problems.

Maybe we could add a --no-weak-symbols option to abidiff that would
implement the logic you are talking about?  That way, people who know
what they are doing, like yourself, would use it.

Then, we'd write a guide for people analysing ABI changes that are due
to using different compilers.  That guide would introduce the subtleties
we are talking about here and mention the abidiff options that need to
be used, when and why.

What do you think?

-- 
		Dodji

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: inlining change and abidiff noise
       [not found]   ` <78E49230-1B78-43CE-A643-63AC1039F694@redhat.com>
@ 2017-01-01  0:00     ` Dodji Seketeli
  0 siblings, 0 replies; 4+ messages in thread
From: Dodji Seketeli @ 2017-01-01  0:00 UTC (permalink / raw)
  To: Ben Woodard; +Cc: libabigail

Ben Woodard <woodard@redhat.com> a écrit:

[...}

>> A user can very well write in her code:
>> 
>>    int __attribute__((weak)) power2(int x);
>
> in addition to a declaration like that, wouldn’t there also need to be
> a definition somewhere?

Yes.  You are right.

What I wanted to say is that only the declaration needs to have the weak
attribute.  But anyway, that's a detail.

My point was that a user can willingly decide that a function is going
to be generated as a weak symbol.  And that would have nothing to do
with inlining.

[...]

>> And the symbol of that function 'power2' is going to be weak.
>> 
>> 
>> In this case, however, abidiff is saying that: 
>> 
>>> Function symbols changes summary: 7 Removed, 223 Added function symbols not referenced by debug info
>> 
>> In other words, it is saying that the symbols that got added/removed
>> match *no* debug info.  That is to say that no function explicitely
>> written by the user generated those ELF symbols.  It follows that those
>> symbols result from functions that were generated artificially by the
>> compiler.
>> 
>> If this change was happening between two binaries generated with the
>> same compiler (and using the same compiler option), it would certainly
>> be a hint of a problem.
>> 
> What if the compiler version changes and the decision about which
> functions change?

Sorry, I don't understand what you mean by 'and the decision about which
functions change'.

> I would also argue that at least for most compiler option changes like
> changing -O level it shouldn’t create any ABI artifacts and nor should
> libabigail generate any false positives if you simply change the
> optimization level.

I agree.  The same source code compiled with two different versions of the same
compiler should yield the same ABI.

And *generally*, changing optimization options should not incur any ABI
change either.

> I would argue that the inter-compiler test is just a more rigorous
> test of the changes that happen when the optimization level changes.

In theory, maybe.  In theory, what you say would be a cool assertion
that would allow users to mix objects coming from different compilers.
But in practise, I disagree.

Right now, I think ABI changes between two binaries compiled with
different compilers (i.e, clang++ and g++) are not *necessarily* of the
same nature as ABI changes between two binaries compiled with the same
compiler but with different optimization levels.

You seem to think that "WEAK symbols are necessarily related to
inlining", roughly.  And inlining is impacted by optimization level.
And thus, it would follow that if two different compilers incur ABI
changes related to WEAK symbols, then these changes are of the same
nature as inlining changes and so as optimization level changes.

But the problem is that a WEAK symbols are not necessarily related to
inlining.

[...]

>> In this case of binaries generated by different compilers, I guess it's
>> different, especially after you've analyzed the changes as you did.
>> 
>> So in this case, you could have used the --no-unreferenced-symbols
>> option of abidiff to tell it to avoid showing you changes on symbols for
>> which there is no debug info.
>
> I think that is a bit of a big hammer. I think that in the example you
> give below...
>> 
>>> The assumption that I'm making is that adding logic like:
>>> if a symbol is added or removed and the only reference to it is with a
>>> weak symbol then it is a result of a change in inlining and it is
>>> therefore harmless.
>> 
>> I think that assumption is not correct, unfortunately.  A removed weak
>> symbol can result from a using removing the declaration below from the
>> code:
>> 
>>    int __attribute__((weak)) power2(int x);
>
> Would a declaration without a definition generate anything at all? 

No.

But a declaration, followed by definition (that doesn't have the weak
attribute) would generate it.  And in that case, just removing the weak
attribute from the declaration would generate the ABI change.

In other words, a *source code change* would very well generate an ABI
change in which a WEAK symbol is added or removed.

> Isn’t the unique characteristic in this case that we can filter on the
> fact that we have weak symbols with no DWARF.

This characteristic is narrower (and hence, I think, better) than the
one you expressed initially, which was (I quote from your initial
message):

    The assumption that I'm making is that adding logic like: if a
    symbol is added or removed and the only reference to it is with a
    weak symbol then it is a result of a change in inlining and it is
    therefore harmless.

In that initial assumption, I didn't understand that you meant there was
no DWARF describing the function the symbol originated from.

So yes, if there is no DWARF describing the function which the WEAK
symbol originates from, (which implies that there is no function
declaration accompanying the code related to the WEAK symbol), then the
probability is higher than the WEAK symbol was artificially generated by
the compiler.  Note, however, that the user could have written assembly
code, inline in his C or C++ source code as well.  And that would have
resulted into an ELF symbol not described by any DWARF.  And even if the
code related to the WEAK ELF symbol was indeed artificially generated by
the compiler, nothing assures us that it was related to inlining.

In other words, I still think that we should rather provide the user
with options to filter out, for instance, "unreferenced weak symbols",
rather than filtering them by default.  Because otherwise, we might hide
real problems, by default.

Cheers,

-- 
		Dodji

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-06-19  8:47 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-01  0:00 inlining change and abidiff noise Ben Woodard
2017-01-01  0:00 ` Dodji Seketeli
     [not found]   ` <78E49230-1B78-43CE-A643-63AC1039F694@redhat.com>
2017-01-01  0:00     ` Dodji Seketeli
2017-01-01  0:00   ` Ben Woodard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).