* inlining change and abidiff noise @ 2017-01-01 0:00 Ben Woodard 2017-01-01 0:00 ` Dodji Seketeli 0 siblings, 1 reply; 4+ messages in thread From: Ben Woodard @ 2017-01-01 0:00 UTC (permalink / raw) To: libabigail One of the challenges doing inter-compiler comparison between object files created by different compilers is that the signal to noise ratio is very high apparently due to different inlining decisions. Would there be any negative consequences to having abidiff consider changes which appear to be due to just differences in how the compilers choose to inline functions. As a case in point: $ ./tools/abidiff ./tools/.libs/abidiff ../build-llvl/tools/.libs/abidiff Functions changes summary: 4 Removed, 0 Changed, 0 Added functions Variables changes summary: 0 Removed, 0 Changed, 0 Added variable Function symbols changes summary: 7 Removed, 223 Added function symbols not referenced by debug info Variable symbols changes summary: 0 Removed, 3 Added variable symbols not referenced by debug info when you look 3 out of 4 of the removed functions and all 233 of the added functions you can quickly see that they are all weak symbols. The assumption that I'm making is that adding logic like: if a symbol is added or removed and the only reference to it is with a weak symbol then it is a result of a change in inlining and it is therefore harmless. What I would like to know is if that assumption is in fact correct? ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: inlining change and abidiff noise 2017-01-01 0:00 inlining change and abidiff noise Ben Woodard @ 2017-01-01 0:00 ` Dodji Seketeli 2017-01-01 0:00 ` Ben Woodard [not found] ` <78E49230-1B78-43CE-A643-63AC1039F694@redhat.com> 0 siblings, 2 replies; 4+ messages in thread From: Dodji Seketeli @ 2017-01-01 0:00 UTC (permalink / raw) To: Ben Woodard; +Cc: libabigail Hello Ben, Ben Woodard <woodard@redhat.com> a écrit: > One of the challenges doing inter-compiler comparison between object > files created by different compilers is that the signal to noise ratio > is very high apparently due to different inlining decisions. > > Would there be any negative consequences to having abidiff consider > changes which appear to be due to just differences in how the > compilers choose to inline functions. > > As a case in point: > > $ ./tools/abidiff ./tools/.libs/abidiff ../build-llvl/tools/.libs/abidiff > Functions changes summary: 4 Removed, 0 Changed, 0 Added functions > Variables changes summary: 0 Removed, 0 Changed, 0 Added variable > Function symbols changes summary: 7 Removed, 223 Added function > symbols not referenced by debug info > Variable symbols changes summary: 0 Removed, 3 Added variable symbols > not referenced by debug info > > when you look 3 out of 4 of the removed functions and all 233 of the > added functions you can quickly see that they are all weak symbols. A weak symbol is not necessarily a symbol that results from function inlining. A user can very well write in her code: int __attribute__((weak)) power2(int x); And the symbol of that function 'power2' is going to be weak. In this case, however, abidiff is saying that: > Function symbols changes summary: 7 Removed, 223 Added function symbols not referenced by debug info In other words, it is saying that the symbols that got added/removed match *no* debug info. That is to say that no function explicitely written by the user generated those ELF symbols. It follows that those symbols result from functions that were generated artificially by the compiler. If this change was happening between two binaries generated with the same compiler (and using the same compiler option), it would certainly be a hint of a problem. In this case of binaries generated by different compilers, I guess it's different, especially after you've analyzed the changes as you did. So in this case, you could have used the --no-unreferenced-symbols option of abidiff to tell it to avoid showing you changes on symbols for which there is no debug info. > The assumption that I'm making is that adding logic like: > if a symbol is added or removed and the only reference to it is with a > weak symbol then it is a result of a change in inlining and it is > therefore harmless. I think that assumption is not correct, unfortunately. A removed weak symbol can result from a using removing the declaration below from the code: int __attribute__((weak)) power2(int x); So doing this by default might lead us to miss some real problems. Maybe we could add a --no-weak-symbols option to abidiff that would implement the logic you are talking about? That way, people who know what they are doing, like yourself, would use it. Then, we'd write a guide for people analysing ABI changes that are due to using different compilers. That guide would introduce the subtleties we are talking about here and mention the abidiff options that need to be used, when and why. What do you think? -- Dodji ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: inlining change and abidiff noise 2017-01-01 0:00 ` Dodji Seketeli @ 2017-01-01 0:00 ` Ben Woodard [not found] ` <78E49230-1B78-43CE-A643-63AC1039F694@redhat.com> 1 sibling, 0 replies; 4+ messages in thread From: Ben Woodard @ 2017-01-01 0:00 UTC (permalink / raw) To: libabigail > On Jun 15, 2017, at 5:25 AM, Dodji Seketeli <dodji@seketeli.org> wrote: > > Hello Ben, > > Ben Woodard <woodard@redhat.com> a écrit: > >> One of the challenges doing inter-compiler comparison between object >> files created by different compilers is that the signal to noise ratio >> is very high apparently due to different inlining decisions. >> >> Would there be any negative consequences to having abidiff consider >> changes which appear to be due to just differences in how the >> compilers choose to inline functions. >> >> As a case in point: >> >> $ ./tools/abidiff ./tools/.libs/abidiff ../build-llvl/tools/.libs/abidiff >> Functions changes summary: 4 Removed, 0 Changed, 0 Added functions >> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable >> Function symbols changes summary: 7 Removed, 223 Added function >> symbols not referenced by debug info >> Variable symbols changes summary: 0 Removed, 3 Added variable symbols >> not referenced by debug info >> >> when you look 3 out of 4 of the removed functions and all 233 of the >> added functions you can quickly see that they are all weak symbols. > > A weak symbol is not necessarily a symbol that results from function > inlining. > > A user can very well write in her code: > > int __attribute__((weak)) power2(int x); in addition to a declaration like that, wouldn’t there also need to be a definition somewhere? > > And the symbol of that function 'power2' is going to be weak. > > > In this case, however, abidiff is saying that: > >> Function symbols changes summary: 7 Removed, 223 Added function symbols not referenced by debug info > > In other words, it is saying that the symbols that got added/removed > match *no* debug info. That is to say that no function explicitely > written by the user generated those ELF symbols. It follows that those > symbols result from functions that were generated artificially by the > compiler. > > If this change was happening between two binaries generated with the > same compiler (and using the same compiler option), it would certainly > be a hint of a problem. > What if the compiler version changes and the decision about which functions change? I would also argue that at least for most compiler option changes like changing -O level it shouldn’t create any ABI artifacts and nor should libabigail generate any false positives if you simply change the optimization level. I would argue that the inter-compiler test is just a more rigorous test of the changes that happen when the optimization level changes. > > And the symbol of that function 'power2' is going to be weak. > > > In this case, however, abidiff is saying that: > >> Function symbols changes summary: 7 Removed, 223 Added function symbols not referenced by debug info > > In other words, it is saying that the symbols that got added/removed > match *no* debug info. That is to say that no function explicitely > written by the user generated those ELF symbols. It follows that those > symbols result from functions that were generated artificially by the > compiler. > > If this change was happening between two binaries generated with the > same compiler (and using the same compiler option), it would certainly > be a hint of a problem. > > In this case of binaries generated by different compilers, I guess it's > different, especially after you've analyzed the changes as you did. > > So in this case, you could have used the --no-unreferenced-symbols > option of abidiff to tell it to avoid showing you changes on symbols for > which there is no debug info. > I think that is a bit of a big hammer. I think that in the example you give below… >> The assumption that I'm making is that adding logic like: >> if a symbol is added or removed and the only reference to it is with a >> weak symbol then it is a result of a change in inlining and it is >> therefore harmless. > > I think that assumption is not correct, unfortunately. A removed weak > symbol can result from a using removing the declaration below from the > code: > > int __attribute__((weak)) power2(int x); > Would a declaration without a definition generate anything at all? Isn’t the unique characteristic in this case that we can filter on the fact that we have weak symbols with no DWARF. Is there another example you can contrive where a CU has DWARF (i.e. it isn’t just stripped ELF or the result of a compilation without -g) but the compiler generates weak symbols that do not have associated DWARF where there is in fact a legitimate ABI change? > So doing this by default might lead us to miss some real problems. You’re the expert but I’m not completely convinced yet. > > Maybe we could add a --no-weak-symbols option to abidiff that would > implement the logic you are talking about? That way, people who know > what they are doing, like yourself, would use it. > > Then, we'd write a guide for people analysing ABI changes that are due > to using different compilers. That guide would introduce the subtleties > we are talking about here and mention the abidiff options that need to > be used, when and why. > > What do you think? > > -- > Dodji ^ permalink raw reply [flat|nested] 4+ messages in thread
[parent not found: <78E49230-1B78-43CE-A643-63AC1039F694@redhat.com>]
* Re: inlining change and abidiff noise [not found] ` <78E49230-1B78-43CE-A643-63AC1039F694@redhat.com> @ 2017-01-01 0:00 ` Dodji Seketeli 0 siblings, 0 replies; 4+ messages in thread From: Dodji Seketeli @ 2017-01-01 0:00 UTC (permalink / raw) To: Ben Woodard; +Cc: libabigail Ben Woodard <woodard@redhat.com> a écrit: [...} >> A user can very well write in her code: >> >> int __attribute__((weak)) power2(int x); > > in addition to a declaration like that, wouldn’t there also need to be > a definition somewhere? Yes. You are right. What I wanted to say is that only the declaration needs to have the weak attribute. But anyway, that's a detail. My point was that a user can willingly decide that a function is going to be generated as a weak symbol. And that would have nothing to do with inlining. [...] >> And the symbol of that function 'power2' is going to be weak. >> >> >> In this case, however, abidiff is saying that: >> >>> Function symbols changes summary: 7 Removed, 223 Added function symbols not referenced by debug info >> >> In other words, it is saying that the symbols that got added/removed >> match *no* debug info. That is to say that no function explicitely >> written by the user generated those ELF symbols. It follows that those >> symbols result from functions that were generated artificially by the >> compiler. >> >> If this change was happening between two binaries generated with the >> same compiler (and using the same compiler option), it would certainly >> be a hint of a problem. >> > What if the compiler version changes and the decision about which > functions change? Sorry, I don't understand what you mean by 'and the decision about which functions change'. > I would also argue that at least for most compiler option changes like > changing -O level it shouldn’t create any ABI artifacts and nor should > libabigail generate any false positives if you simply change the > optimization level. I agree. The same source code compiled with two different versions of the same compiler should yield the same ABI. And *generally*, changing optimization options should not incur any ABI change either. > I would argue that the inter-compiler test is just a more rigorous > test of the changes that happen when the optimization level changes. In theory, maybe. In theory, what you say would be a cool assertion that would allow users to mix objects coming from different compilers. But in practise, I disagree. Right now, I think ABI changes between two binaries compiled with different compilers (i.e, clang++ and g++) are not *necessarily* of the same nature as ABI changes between two binaries compiled with the same compiler but with different optimization levels. You seem to think that "WEAK symbols are necessarily related to inlining", roughly. And inlining is impacted by optimization level. And thus, it would follow that if two different compilers incur ABI changes related to WEAK symbols, then these changes are of the same nature as inlining changes and so as optimization level changes. But the problem is that a WEAK symbols are not necessarily related to inlining. [...] >> In this case of binaries generated by different compilers, I guess it's >> different, especially after you've analyzed the changes as you did. >> >> So in this case, you could have used the --no-unreferenced-symbols >> option of abidiff to tell it to avoid showing you changes on symbols for >> which there is no debug info. > > I think that is a bit of a big hammer. I think that in the example you > give below... >> >>> The assumption that I'm making is that adding logic like: >>> if a symbol is added or removed and the only reference to it is with a >>> weak symbol then it is a result of a change in inlining and it is >>> therefore harmless. >> >> I think that assumption is not correct, unfortunately. A removed weak >> symbol can result from a using removing the declaration below from the >> code: >> >> int __attribute__((weak)) power2(int x); > > Would a declaration without a definition generate anything at all? No. But a declaration, followed by definition (that doesn't have the weak attribute) would generate it. And in that case, just removing the weak attribute from the declaration would generate the ABI change. In other words, a *source code change* would very well generate an ABI change in which a WEAK symbol is added or removed. > Isn’t the unique characteristic in this case that we can filter on the > fact that we have weak symbols with no DWARF. This characteristic is narrower (and hence, I think, better) than the one you expressed initially, which was (I quote from your initial message): The assumption that I'm making is that adding logic like: if a symbol is added or removed and the only reference to it is with a weak symbol then it is a result of a change in inlining and it is therefore harmless. In that initial assumption, I didn't understand that you meant there was no DWARF describing the function the symbol originated from. So yes, if there is no DWARF describing the function which the WEAK symbol originates from, (which implies that there is no function declaration accompanying the code related to the WEAK symbol), then the probability is higher than the WEAK symbol was artificially generated by the compiler. Note, however, that the user could have written assembly code, inline in his C or C++ source code as well. And that would have resulted into an ELF symbol not described by any DWARF. And even if the code related to the WEAK ELF symbol was indeed artificially generated by the compiler, nothing assures us that it was related to inlining. In other words, I still think that we should rather provide the user with options to filter out, for instance, "unreferenced weak symbols", rather than filtering them by default. Because otherwise, we might hide real problems, by default. Cheers, -- Dodji ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-06-19 8:47 UTC | newest] Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-01-01 0:00 inlining change and abidiff noise Ben Woodard 2017-01-01 0:00 ` Dodji Seketeli 2017-01-01 0:00 ` Ben Woodard [not found] ` <78E49230-1B78-43CE-A643-63AC1039F694@redhat.com> 2017-01-01 0:00 ` Dodji Seketeli
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).