API changes that do not impact ABI

public inbox for libabigail@sourceware.org
 help / color / mirror / Atom feed

* API changes that do not impact ABI
@ 2021-10-06 23:35 Ben Woodard
  2021-10-07 11:01 ` Matthias Maennich
  0 siblings, 1 reply; 8+ messages in thread
From: Ben Woodard @ 2021-10-06 23:35 UTC (permalink / raw)
  To: libabigail

We’ve been discussing API changes which probably don’t make any ABI difference but reflect changing language idioms. Consider the two files below. 

---- a.c
char *chop( char *buf){
  return buf+2;
}

---- b.c
void *chop( void *buf){
  return buf+2;
}

$ abidiff a.o b.o 
Functions changes summary: 0 Removed, 1 Changed, 0 Added function
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

1 function with some indirect sub-type change:

  [C] 'function char* chop(char*)' at b.c:1:1 has some indirect sub-type changes:
    return type changed:
      in pointed to type 'char':
        type name changed from 'char' to 'void'
        type size changed from 8 to 0 (in bits)
    parameter 1 of type 'char*' changed:
      in pointed to type 'char':
        type name changed from 'char' to 'void'
        type size changed from 8 to 0 (in bits)

$ echo $?
4

This is totally an API change but Is this really an ABI change? They both compile to the same machine code.

        push    rbp
        mov     rbp, rsp
        mov     QWORD PTR [rbp-8], rdi
        mov     rax, QWORD PTR [rbp-8]
        add     rax, 2
        pop     rbp
        ret

I was thinking that we should probably add special logic that evaluate char* and void* changes to be harmless when the language is C. When the language is C++ their linkage name mangles differently (unless the functions are given C linkage) and pointer arithmetic on void pointers is not allowed by default so we don’t need to worry about it there. The error reported in that case makes more sense:

$ abidiff a.o2 b.o2 
Functions changes summary: 1 Removed, 0 Changed, 1 Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

1 Removed function:

  [D] 'function char* chop(char*)'    {_Z4chopPc}

1 Added function:

  [A] 'function void* chop(void*)'    {_Z4chopPv}

Can anyone see a problem with that?

-ben

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: API changes that do not impact ABI
  2021-10-06 23:35 API changes that do not impact ABI Ben Woodard
@ 2021-10-07 11:01 ` Matthias Maennich
  2021-10-07 16:55   ` Giuliano Procida
  2021-10-07 19:21   ` Ben Woodard
  0 siblings, 2 replies; 8+ messages in thread
From: Matthias Maennich @ 2021-10-07 11:01 UTC (permalink / raw)
  To: Ben Woodard; +Cc: libabigail

On Wed, Oct 06, 2021 at 04:35:25PM -0700, Ben Woodard via Libabigail wrote:
>We’ve been discussing API changes which probably don’t make any ABI difference but reflect changing language idioms. Consider the two files below.
>
>---- a.c
>char *chop( char *buf){
>  return buf+2;
>}
>
>---- b.c
>void *chop( void *buf){
>  return buf+2;
>}
>
>
>$ abidiff a.o b.o
>Functions changes summary: 0 Removed, 1 Changed, 0 Added function
>Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>
>1 function with some indirect sub-type change:
>
>  [C] 'function char* chop(char*)' at b.c:1:1 has some indirect sub-type changes:
>    return type changed:
>      in pointed to type 'char':
>        type name changed from 'char' to 'void'
>        type size changed from 8 to 0 (in bits)
>    parameter 1 of type 'char*' changed:
>      in pointed to type 'char':
>        type name changed from 'char' to 'void'
>        type size changed from 8 to 0 (in bits)
>
>$ echo $?
>4
>
>This is totally an API change but Is this really an ABI change? They both compile to the same machine code.
>
>        push    rbp
>        mov     rbp, rsp
>        mov     QWORD PTR [rbp-8], rdi
>        mov     rax, QWORD PTR [rbp-8]
>        add     rax, 2
>        pop     rbp
>        ret
>
>I was thinking that we should probably add special logic that evaluate char* and void* changes to be harmless when the language is C. When the language is C++ their linkage name mangles differently (unless the functions are given C linkage) and pointer arithmetic on void pointers is not allowed by default so we don’t need to worry about it there. The error reported in that case makes more sense:
>
>$ abidiff a.o2 b.o2
>Functions changes summary: 1 Removed, 0 Changed, 1 Added functions
>Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>
>1 Removed function:
>
>  [D] 'function char* chop(char*)'    {_Z4chopPc}
>
>1 Added function:
>
>  [A] 'function void* chop(void*)'    {_Z4chopPv}
>
>Can anyone see a problem with that?

The line between API and ABI changes is quite blurred. At least there
are changes that are exclusively ABI or API only changes. Further there
are changes that are not ABI breaking, but likely API breaking and
therefore caught by libabigail. Internally, libabigail has an
opinionated model of what is breaking and what is harmless and some
thoughts have discussed to make this model a bit more flexible. But as
of now, libabigail is rather conservative and will report an issue if
the runtime stability is possibly affected. Examples are opaque pointers
that are technically ABI compatible, but where the DWARF is providing
the underlying type information or enums where additional values can
indicate an API change, despite ABI stability (if the additional does
not trigger an underlying type change).

The two examples show that ABI stability is likely given, but since it
depends on how the values are used, this could still break at runtime.
Therefore it reports.

I would like to keep this behaviour as it is safer to ignore the
reporting than to miss an ABI/API breakage. In this particular case, the
change can indicate an underlying compatibility problem and reporting it
directs the attention of the developer to this subtle change (which
probably will not be noticed in other testing as well).

Lastly, if you care about ABI stability in your binary, why would you
allow a change like this to land? So, apparently harmless, not patching
that in the first place is arguably the better strategy?

Cheers,
Matthias

>
>-ben
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: API changes that do not impact ABI
  2021-10-07 11:01 ` Matthias Maennich
@ 2021-10-07 16:55   ` Giuliano Procida
  2021-10-07 19:48     ` Ben Woodard
  2021-10-07 19:21   ` Ben Woodard
  1 sibling, 1 reply; 8+ messages in thread
From: Giuliano Procida @ 2021-10-07 16:55 UTC (permalink / raw)
  To: Matthias Maennich; +Cc: Ben Woodard, libabigail

Hi.

On Thu, 7 Oct 2021 at 12:01, Matthias Maennich via Libabigail
<libabigail@sourceware.org> wrote:
>
> On Wed, Oct 06, 2021 at 04:35:25PM -0700, Ben Woodard via Libabigail wrote:
> >We’ve been discussing API changes which probably don’t make any ABI difference but reflect changing language idioms. Consider the two files below.
> >
> >---- a.c
> >char *chop( char *buf){
> >  return buf+2;
> >}
> >
> >---- b.c
> >void *chop( void *buf){
> >  return buf+2;
> >}
> >

void * is an interesting example.

Firstly, (pointer type) "void *" has nothing to do with (return type)
"void" from a type / semantics point of view, unlike other pointer
types. The keyword is being used for a different purpose. Casting to
void is a third oddity.

Secondly, consider more complicated things ... and you may get to the
point where ABI differences arise because strict aliasing rules may
apply in one case, but not the other.

> >
> >$ abidiff a.o b.o
> >Functions changes summary: 0 Removed, 1 Changed, 0 Added function
> >Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> >
> >1 function with some indirect sub-type change:
> >
> >  [C] 'function char* chop(char*)' at b.c:1:1 has some indirect sub-type changes:
> >    return type changed:
> >      in pointed to type 'char':
> >        type name changed from 'char' to 'void'
> >        type size changed from 8 to 0 (in bits)
> >    parameter 1 of type 'char*' changed:
> >      in pointed to type 'char':
> >        type name changed from 'char' to 'void'
> >        type size changed from 8 to 0 (in bits)
> >
> >$ echo $?
> >4
> >
> >This is totally an API change but Is this really an ABI change? They both compile to the same machine code.
> >
> >        push    rbp
> >        mov     rbp, rsp
> >        mov     QWORD PTR [rbp-8], rdi
> >        mov     rax, QWORD PTR [rbp-8]
> >        add     rax, 2
> >        pop     rbp
> >        ret
> >
> >I was thinking that we should probably add special logic that evaluate char* and void* changes to be harmless when the language is C. When the language is C++ their linkage name mangles differently (unless the functions are given C linkage) and pointer arithmetic on void pointers is not allowed by default so we don’t need to worry about it there. The error reported in that case makes more sense:
> >
> >$ abidiff a.o2 b.o2
> >Functions changes summary: 1 Removed, 0 Changed, 1 Added functions
> >Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> >
> >1 Removed function:
> >
> >  [D] 'function char* chop(char*)'    {_Z4chopPc}
> >
> >1 Added function:
> >
> >  [A] 'function void* chop(void*)'    {_Z4chopPv}
> >
> >Can anyone see a problem with that?
>
> The line between API and ABI changes is quite blurred. At least there
> are changes that are exclusively ABI or API only changes. Further there
> are changes that are not ABI breaking, but likely API breaking and
> therefore caught by libabigail. Internally, libabigail has an
> opinionated model of what is breaking and what is harmless and some
> thoughts have discussed to make this model a bit more flexible. But as

There are some things we might want to consider harmful that abidiff
currently considers harmless. One example is enumerator additions, for
two reasons: symmetry with respect to the ordering of the ABIs being
compared and the contravariant/covariant uses of the type.

Other things we might want the other way around. The main thing I can
think of here is when libabigail's internal naming of anonymous types
leaks into ABI descriptions and subsequently into diffs.

The situation in C++ is even worse than what may appear in symbol
mangling. Renaming a type affects RTTI which is accessible to code and
evil code may flip behaviour on that basis.

> of now, libabigail is rather conservative and will report an issue if
> the runtime stability is possibly affected. Examples are opaque pointers
> that are technically ABI compatible, but where the DWARF is providing
> the underlying type information or enums where additional values can
> indicate an API change, despite ABI stability (if the additional does
> not trigger an underlying type change).
>
> The two examples show that ABI stability is likely given, but since it
> depends on how the values are used, this could still break at runtime.
> Therefore it reports.
>
> I would like to keep this behaviour as it is safer to ignore the
> reporting than to miss an ABI/API breakage. In this particular case, the
> change can indicate an underlying compatibility problem and reporting it
> directs the attention of the developer to this subtle change (which
> probably will not be noticed in other testing as well).
>

I think the general position we should take is "be safe" and report
differences. libabigail actually has (from memory) 4 different kinds
of diffs that can be suppressed and a variety of ways of specifying
the suppressions and that logic could in theory be extended to cover
more of the cases that might arise.

The general problem of API compatibility is surely undecidable, even
given formal specifications of behaviour or the source code, neither
of which we have in DWARF.

Regards,
Giuliano.

> Lastly, if you care about ABI stability in your binary, why would you
> allow a change like this to land? So, apparently harmless, not patching
> that in the first place is arguably the better strategy?
>
> Cheers,
> Matthias
>
> >
> >-ben
> >

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: API changes that do not impact ABI
  2021-10-07 11:01 ` Matthias Maennich
  2021-10-07 16:55   ` Giuliano Procida
@ 2021-10-07 19:21   ` Ben Woodard
  1 sibling, 0 replies; 8+ messages in thread
From: Ben Woodard @ 2021-10-07 19:21 UTC (permalink / raw)
  To: Matthias Maennich; +Cc: libabigail



> On Oct 7, 2021, at 4:01 AM, Matthias Maennich <maennich@google.com> wrote:
> 
> The line between API and ABI changes is quite blurred.

I totally agree with that!

> At least there
> are changes that are exclusively ABI or API only changes. Further there
> are changes that are not ABI breaking, but likely API breaking and
> therefore caught by libabigail. Internally, libabigail has an
> opinionated model of what is breaking and what is harmless and some
> thoughts have discussed to make this model a bit more flexible. But as
> of now, libabigail is rather conservative and will report an issue if
> the runtime stability is possibly affected. Examples are opaque pointers
> that are technically ABI compatible, but where the DWARF is providing
> the underlying type information or enums where additional values can
> indicate an API change, despite ABI stability (if the additional does
> not trigger an underlying type change).

I have been thinking about how to handle these things and have sort of come up with a rough classification system:

- API derived changes - to be very honest I do not really care about these. API changes happen over the years. One thing that I have been considering is if we can detect reasonably reliably when a change is likely and API change rather than an ABI change and then flag that in the output. It seems like API changes could be handled differently by people evaluating the output. One thing that I would like to warned about which I currently am not is when the API changes and the soname version doesn’t or probably vice versa.

- Backend implementaiton change e.g. string - this of course is interesting to me and has deep implications for template derived clases. I really don’t know how common this is. I would like to know in practice. For libraries that I have looked at like libstdc++ and boost it seems like the developers have gone to great lengths to try to keep the backend implementations ABI compatible. With good tools to detect ABI breaks like libabigail, I wonder how important it is to go to those great lengths. 

- Compilation opts - this is a topic that I would like to investigate but do not have sufficient data to make conclusions. When are two compiles of  the same version of a library end up being incompatible due to the options that it was compiled with. I would like to further subdivide this category down to 3 sub-groups.
  - user supplied 
  - configuration supplied
  - optimization variation
  - microarchitecture

- Compiler - this has long been my pet project. Part of my reason for asking this question is right now libabigail has a lot of false negatives when trying to evaluate compatibility between a library built with one toolchain vs another. I’m beginning to believe that in addition to suppressions, we may need to introduce a new concept of “DWARF idioms” or “type synonyms” or something like that. I was specifically thinking about these type synonyms when I posted about “char *” vs. “void *"
- toolchain (i.e. llvm vs. gcc) - unfortunately there are enough differences in the idioms of DWARF generated by different compilers that idiomatic differences in the DWARF end up being reported by libabigail as ABI changes when in fact they are not actual ABI incompatibilities. This leads me to believe that we need something in libabigail that can resolve those differences. Suppressions as they are now specified do not work for this. Having looked at the current output, it is more algorithmically defined synonyms. 
  - I expect that one of the challenges with toolchains is going to be language runtime ABIs clang and gcc handle this pretty well but other vendor compilers really don’t.
- compiler version variation
- small ABI changes 
  - unrecognized ambiguity in spec
  - underspecification of spec
- psABI
- bugs in the implementation
- language ABI
- bugs in the implementation
- the fortran problem - no standard language ABI

> 
> The two examples show that ABI stability is likely given, but since it
> depends on how the values are used, this could still break at runtime.
> Therefore it reports.
> 

Part of what I would like to consider is how to resolve the cases where an API change such as this is meaningful from the ABI perspective and where it isn’t. One idea that we are tossing around is something closely related to liveness analysis for variables. https://developers.redhat.com/articles/2021/08/09/debugging-function-parameters-dyninst

Another related example is a function where a parameter changes to being const. There are two versions of this:
- the parameter could have been const all along but either the idiom of the time or the programmer failed to realize that fact
- the function has been refactored so that the parameter can now be const.

These API change questions are on the frontier of what is currently possible. In other words, I don’t know how to do this yet.

> I would like to keep this behaviour as it is safer to ignore the
> reporting than to miss an ABI/API breakage. In this particular case, the
> change can indicate an underlying compatibility problem and reporting it
> directs the attention of the developer to this subtle change (which
> probably will not be noticed in other testing as well).
> 
> Lastly, if you care about ABI stability in your binary, why would you
> allow a change like this to land? So, apparently harmless, not patching
> that in the first place is arguably the better strategy?

Our problem is different. We have a large body of software that uses a lot of libraries. Within this body of software there are various apps that have dependencies on specific versions of the libraries. This ends up resulting in us having to maintain _MANY_ versions of the same library. We want to be able to identify cases where we can safely move an application forward to newer version of the library so that we do not have to maintain as many builds. 

-ben


> 
> Cheers,
> Matthias
> 
>> 
>> -ben


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: API changes that do not impact ABI
  2021-10-07 16:55   ` Giuliano Procida
@ 2021-10-07 19:48     ` Ben Woodard
  2021-10-27 10:22       ` Dodji Seketeli
  0 siblings, 1 reply; 8+ messages in thread
From: Ben Woodard @ 2021-10-07 19:48 UTC (permalink / raw)
  To: Giuliano Procida; +Cc: Matthias Maennich, libabigail



> On Oct 7, 2021, at 9:55 AM, Giuliano Procida <gprocida@google.com> wrote:
> 
> Hi.
> 
> On Thu, 7 Oct 2021 at 12:01, Matthias Maennich via Libabigail
> <libabigail@sourceware.org <mailto:libabigail@sourceware.org>> wrote:
>> 
>> On Wed, Oct 06, 2021 at 04:35:25PM -0700, Ben Woodard via Libabigail wrote:
>>> We’ve been discussing API changes which probably don’t make any ABI difference but reflect changing language idioms. Consider the two files below.
>>> 
>>> ---- a.c
>>> char *chop( char *buf){
>>> return buf+2;
>>> }
>>> 
>>> ---- b.c
>>> void *chop( void *buf){
>>> return buf+2;
>>> }
>>> 
> 
> void * is an interesting example.
> 
> Firstly, (pointer type) "void *" has nothing to do with (return type)
> "void" from a type / semantics point of view, unlike other pointer
> types. The keyword is being used for a different purpose. Casting to
> void is a third oddity.

In C in particular “void” is weird and while you can have a void* you really can’t have a void type and so I kind of feel like messages like:

$ abidiff a.o2 b.o2 
Functions changes summary: 0 Removed, 1 Changed, 0 Added function
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

1 function with some indirect sub-type change:

  [C] 'function char* chop(char*)' at b.c:2:1 has some indirect sub-type changes:
    return type changed:
      in pointed to type 'char':
        type name changed from 'char' to 'void'
        type size changed from 8 to 0 (in bits)
    parameter 1 of type 'char*' changed:
      in pointed to type 'char':
        type name changed from 'char' to 'void'
        type size changed from 8 to 0 (in bits)

are spurious. I kind of feel like there may need to be some special logic that stops at the pointer level with voids and doesn’t try to drill down. i.e. 

  [C] 'function char* chop(char*)' at b.c:2:1 has some type changes:
    return type changed from 'char*' to ‘void*'
    parameter 1 of type 'char*’ changed to ‘void*'

> 
> Secondly, consider more complicated things ... and you may get to the
> point where ABI differences arise because strict aliasing rules may
> apply in one case, but not the other.
> 
>>> 
>>> $ abidiff a.o b.o
>>> Functions changes summary: 0 Removed, 1 Changed, 0 Added function
>>> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>>> 
>>> 1 function with some indirect sub-type change:
>>> 
>>> [C] 'function char* chop(char*)' at b.c:1:1 has some indirect sub-type changes:
>>>   return type changed:
>>>     in pointed to type 'char':
>>>       type name changed from 'char' to 'void'
>>>       type size changed from 8 to 0 (in bits)
>>>   parameter 1 of type 'char*' changed:
>>>     in pointed to type 'char':
>>>       type name changed from 'char' to 'void'
>>>       type size changed from 8 to 0 (in bits)
>>> 
>>> $ echo $?
>>> 4
>>> 
>>> This is totally an API change but Is this really an ABI change? They both compile to the same machine code.
>>> 
>>>       push    rbp
>>>       mov     rbp, rsp
>>>       mov     QWORD PTR [rbp-8], rdi
>>>       mov     rax, QWORD PTR [rbp-8]
>>>       add     rax, 2
>>>       pop     rbp
>>>       ret
>>> 
>>> I was thinking that we should probably add special logic that evaluate char* and void* changes to be harmless when the language is C. When the language is C++ their linkage name mangles differently (unless the functions are given C linkage) and pointer arithmetic on void pointers is not allowed by default so we don’t need to worry about it there. The error reported in that case makes more sense:
>>> 
>>> $ abidiff a.o2 b.o2
>>> Functions changes summary: 1 Removed, 0 Changed, 1 Added functions
>>> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>>> 
>>> 1 Removed function:
>>> 
>>> [D] 'function char* chop(char*)'    {_Z4chopPc}
>>> 
>>> 1 Added function:
>>> 
>>> [A] 'function void* chop(void*)'    {_Z4chopPv}
>>> 
>>> Can anyone see a problem with that?
>> 
>> The line between API and ABI changes is quite blurred. At least there
>> are changes that are exclusively ABI or API only changes. Further there
>> are changes that are not ABI breaking, but likely API breaking and
>> therefore caught by libabigail. Internally, libabigail has an
>> opinionated model of what is breaking and what is harmless and some
>> thoughts have discussed to make this model a bit more flexible. But as
> 
> There are some things we might want to consider harmful that abidiff
> currently considers harmless. One example is enumerator additions, for
> two reasons: symmetry with respect to the ordering of the ABIs being
> compared and the contravariant/covariant uses of the type.

Yeah that one is interesting I was playing with that one last night:

$ abidiff c.o d.o
Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added function
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

$ echo $?
0
$ abidiff d.o c.o
Functions changes summary: 0 Removed, 1 Changed, 0 Added function
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

1 function with some indirect sub-type change:

  [C] 'function char* chop(char*, codes)' at c.C:3:1 has some indirect sub-type changes:
    parameter 2 of type 'enum codes' has sub-type changes:
      type size hasn't changed
      1 enumerator deletion:
        'codes::ccode' value '2'

$ echo $?
4


> 
> Other things we might want the other way around. The main thing I can
> think of here is when libabigail's internal naming of anonymous types
> leaks into ABI descriptions and subsequently into diffs.

I think that I pointed out quite a large number of those to dodji and he fixed many problems associated with that in the 2.0 release. That is not to suggest that we caught all of those cases but it is better than it was. As dodji will attest, I’m really good at breaking things.

> 
> The situation in C++ is even worse than what may appear in symbol
> mangling. Renaming a type affects RTTI which is accessible to code and
> evil code may flip behaviour on that basis.

At least one version of that shows up in https://sourceware.org/bugzilla/show_bug.cgi?id=28025

> 
>> of now, libabigail is rather conservative and will report an issue if
>> the runtime stability is possibly affected. Examples are opaque pointers
>> that are technically ABI compatible, but where the DWARF is providing
>> the underlying type information or enums where additional values can
>> indicate an API change, despite ABI stability (if the additional does
>> not trigger an underlying type change).
>> 
>> The two examples show that ABI stability is likely given, but since it
>> depends on how the values are used, this could still break at runtime.
>> Therefore it reports.
>> 
>> I would like to keep this behaviour as it is safer to ignore the
>> reporting than to miss an ABI/API breakage. In this particular case, the
>> change can indicate an underlying compatibility problem and reporting it
>> directs the attention of the developer to this subtle change (which
>> probably will not be noticed in other testing as well).
>> 
> 
> I think the general position we should take is "be safe" and report
> differences. libabigail actually has (from memory) 4 different kinds
> of diffs that can be suppressed and a variety of ways of specifying
> the suppressions and that logic could in theory be extended to cover
> more of the cases that might arise.
> 
> The general problem of API compatibility is surely undecidable, even
> given formal specifications of behaviour or the source code, neither
> of which we have in DWARF.

I think that I’m slightly less pessimistic than you are. My hope is that ABI compatibility can be made decidable. I agree that currently with just the ELF and the DWARF as it now is, it really isn’t. the psABI and language ABI is written from the perspective of the needs of the compiler authors, and DWARF is written from the perspective of the needs of a debugger author. I hope that ELF, the psABIs, the language ABIs and DWARF could be enhanced to include the needs of a “library swapper” who needs to know that one DSO is compatible with another at least in the context of a particular application if not generally for all consumers of that library.

At the moment, I do not have enough information to push the problem further into the corner and define specifically what I need from the various standards to make this work. I’m working on that.

-ben

> 
> Regards,
> Giuliano.
> 
>> Lastly, if you care about ABI stability in your binary, why would you
>> allow a change like this to land? So, apparently harmless, not patching
>> that in the first place is arguably the better strategy?
>> 
>> Cheers,
>> Matthias
>> 
>>> 
>>> -ben


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: API changes that do not impact ABI
  2021-10-07 19:48     ` Ben Woodard
@ 2021-10-27 10:22       ` Dodji Seketeli
  2021-10-27 14:32         ` Giuliano Procida
  2021-10-27 21:52         ` Ben Woodard
  0 siblings, 2 replies; 8+ messages in thread
From: Dodji Seketeli @ 2021-10-27 10:22 UTC (permalink / raw)
  To: Ben Woodard via Libabigail
  Cc: Giuliano Procida, Ben Woodard, Matthias Maennich

Ben Woodard via Libabigail <libabigail@sourceware.org> a écrit:

[...]

> In C in particular “void” is weird and while you can have a void* you really can’t have a void type and so I kind of feel like messages like:
>
> $ abidiff a.o2 b.o2 
> Functions changes summary: 0 Removed, 1 Changed, 0 Added function
> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>
> 1 function with some indirect sub-type change:
>
>   [C] 'function char* chop(char*)' at b.c:2:1 has some indirect sub-type changes:
>     return type changed:
>       in pointed to type 'char':
>         type name changed from 'char' to 'void'
>         type size changed from 8 to 0 (in bits)
>     parameter 1 of type 'char*' changed:
>       in pointed to type 'char':
>         type name changed from 'char' to 'void'
>         type size changed from 8 to 0 (in bits)
>
> are spurious. I kind of feel like there may need to be some special logic that stops at the pointer level with voids and doesn’t try to drill down. i.e. 
>
>   [C] 'function char* chop(char*)' at b.c:2:1 has some type changes:
>     return type changed from 'char*' to ‘void*'
>     parameter 1 of type 'char*’ changed to ‘void*'

I think this is a reasonable feature request.  Could you please file a
bug for it?

>> Secondly, consider more complicated things ... and you may get to the
>> point where ABI differences arise because strict aliasing rules may
>> apply in one case, but not the other.
>> 
>>>> 
>>>> $ abidiff a.o b.o
>>>> Functions changes summary: 0 Removed, 1 Changed, 0 Added function
>>>> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>>>> 
>>>> 1 function with some indirect sub-type change:
>>>> 
>>>> [C] 'function char* chop(char*)' at b.c:1:1 has some indirect sub-type changes:
>>>>   return type changed:
>>>>     in pointed to type 'char':
>>>>       type name changed from 'char' to 'void'
>>>>       type size changed from 8 to 0 (in bits)
>>>>   parameter 1 of type 'char*' changed:
>>>>     in pointed to type 'char':
>>>>       type name changed from 'char' to 'void'
>>>>       type size changed from 8 to 0 (in bits)
>>>> 
>>>> $ echo $?
>>>> 4
>>>> 
>>>> This is totally an API change but Is this really an ABI change? They both compile to the same machine code.
>>>> 
>>>>       push    rbp
>>>>       mov     rbp, rsp
>>>>       mov     QWORD PTR [rbp-8], rdi
>>>>       mov     rax, QWORD PTR [rbp-8]
>>>>       add     rax, 2
>>>>       pop     rbp
>>>>       ret
>>>> 
>>>> I was thinking that we should probably add special logic that
>>>> evaluate char* and void* changes to be harmless when the language
>>>> is C.

This can indeed be done.  This is the usual "categorization" problem.
Over the years, we have created categories for change reports in
libabigail based on user request and consensus.  If you feel strongly
about it and there is consensu then we'll do it :-)

[...]

Matthias Maennich <maennich@google.com> a écrit:

>>> I would like to keep this behaviour as it is safer to ignore the
>>> reporting than to miss an ABI/API breakage. In this particular case, the
>>> change can indicate an underlying compatibility problem and reporting it
>>> directs the attention of the developer to this subtle change (which
>>> probably will not be noticed in other testing as well).

Hehe, see? :-)

It seems to me that we can have a consensus on what you proposed
earlier, which is:

> [...] stops at the pointer level with voids and doesn’t try to drill down. i.e. 
>
>   [C] 'function char* chop(char*)' at b.c:2:1 has some type changes:
>     return type changed from 'char*' to ‘void*'
>     parameter 1 of type 'char*’ changed to ‘void*'

Unless you guys'ngals think otherwise.

[...]

Giuliano Procida <gprocida@google.com> a écrit:

>> There are some things we might want to consider harmful that abidiff
>> currently considers harmless. One example is enumerator additions, for
>> two reasons: symmetry with respect to the ordering of the ABIs being
>> compared and the contravariant/covariant uses of the type.

I am not sure to understand why (in concrete terms) an enumerator
/addition/ is harmful from an ABI standpoint?  Could you elaborate with
an example, for instance?  Or maybe we can talk about that in a
different thread.

Ben Woodard <woodard@redhat.com> a écrit:

> Yeah that one is interesting I was playing with that one last night:
>
> $ abidiff c.o d.o
> Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0 Added function
> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>
> $ echo $?
> 0
> $ abidiff d.o c.o
> Functions changes summary: 0 Removed, 1 Changed, 0 Added function
> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
>
> 1 function with some indirect sub-type change:
>
>   [C] 'function char* chop(char*, codes)' at c.C:3:1 has some indirect sub-type changes:
>     parameter 2 of type 'enum codes' has sub-type changes:
>       type size hasn't changed
>       1 enumerator deletion:
>         'codes::ccode' value '2'
>
> $ echo $?
> 4

Right, this behaviour seems correct to me, unless I am missing something
obvious.  What am I missing?


Giuliano Procida <gprocida@google.com> a écrit:

>> Other things we might want the other way around. The main thing I can
>> think of here is when libabigail's internal naming of anonymous types
>> leaks into ABI descriptions and subsequently into diffs.

Ben Woodard <woodard@redhat.com> a écrit:

> I think that I pointed out quite a large number of those to dodji and
> he fixed many problems associated with that in the 2.0 release. That
> is not to suggest that we caught all of those cases but it is better
> than it was. As dodji will attest, I’m really good at breaking things.

Right.  We fixed a lot of these with 2.0.

I you find instances of that problem still remaining, please do not
hesitate to file them.  We ought to fix them and I believe we should be
able to fix them all, unless proven otherwise of course.

Giuliano Procida <gprocida@google.com> a écrit:

>> The situation in C++ is even worse than what may appear in symbol
>> mangling. Renaming a type affects RTTI which is accessible to code and
>> evil code may flip behaviour on that basis.

We should hopefully catch all instance of these, assuming they are
expressed in DWARF and are part of the ABI, as opposed as being a
runtime-only property.

[...]

Ben Woodard <woodard@redhat.com> a écrit:

> At least one version of that shows up in https://sourceware.org/bugzilla/show_bug.cgi?id=28025

This is typically an example of runtime-only property.  And we logically
don't detect it.  I should comment on the bug ...

[...]

Cheers,

-- 
		Dodji

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: API changes that do not impact ABI
  2021-10-27 10:22       ` Dodji Seketeli
@ 2021-10-27 14:32         ` Giuliano Procida
  2021-10-27 21:52         ` Ben Woodard
  1 sibling, 0 replies; 8+ messages in thread
From: Giuliano Procida @ 2021-10-27 14:32 UTC (permalink / raw)
  To: Dodji Seketeli; +Cc: Ben Woodard via Libabigail, Ben Woodard, Matthias Maennich

Hi.

On Wed, 27 Oct 2021 at 11:22, Dodji Seketeli <dodji@seketeli.org> wrote:

> Ben Woodard via Libabigail <libabigail@sourceware.org> a écrit:
>
> [...]
>
> > In C in particular “void” is weird and while you can have a void* you
> really can’t have a void type and so I kind of feel like messages like:
> >
> > $ abidiff a.o2 b.o2
> > Functions changes summary: 0 Removed, 1 Changed, 0 Added function
> > Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> >
> > 1 function with some indirect sub-type change:
> >
> >   [C] 'function char* chop(char*)' at b.c:2:1 has some indirect sub-type
> changes:
> >     return type changed:
> >       in pointed to type 'char':
> >         type name changed from 'char' to 'void'
> >         type size changed from 8 to 0 (in bits)
> >     parameter 1 of type 'char*' changed:
> >       in pointed to type 'char':
> >         type name changed from 'char' to 'void'
> >         type size changed from 8 to 0 (in bits)
> >
> > are spurious. I kind of feel like there may need to be some special
> logic that stops at the pointer level with voids and doesn’t try to drill
> down. i.e.
> >
> >   [C] 'function char* chop(char*)' at b.c:2:1 has some type changes:
> >     return type changed from 'char*' to ‘void*'
> >     parameter 1 of type 'char*’ changed to ‘void*'
>
> I think this is a reasonable feature request.  Could you please file a
> bug for it?
>
> >> Secondly, consider more complicated things ... and you may get to the
> >> point where ABI differences arise because strict aliasing rules may
> >> apply in one case, but not the other.
> >>
> >>>>
> >>>> $ abidiff a.o b.o
> >>>> Functions changes summary: 0 Removed, 1 Changed, 0 Added function
> >>>> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> >>>>
> >>>> 1 function with some indirect sub-type change:
> >>>>
> >>>> [C] 'function char* chop(char*)' at b.c:1:1 has some indirect
> sub-type changes:
> >>>>   return type changed:
> >>>>     in pointed to type 'char':
> >>>>       type name changed from 'char' to 'void'
> >>>>       type size changed from 8 to 0 (in bits)
> >>>>   parameter 1 of type 'char*' changed:
> >>>>     in pointed to type 'char':
> >>>>       type name changed from 'char' to 'void'
> >>>>       type size changed from 8 to 0 (in bits)
> >>>>
> >>>> $ echo $?
> >>>> 4
> >>>>
> >>>> This is totally an API change but Is this really an ABI change? They
> both compile to the same machine code.
> >>>>
> >>>>       push    rbp
> >>>>       mov     rbp, rsp
> >>>>       mov     QWORD PTR [rbp-8], rdi
> >>>>       mov     rax, QWORD PTR [rbp-8]
> >>>>       add     rax, 2
> >>>>       pop     rbp
> >>>>       ret
> >>>>
> >>>> I was thinking that we should probably add special logic that
> >>>> evaluate char* and void* changes to be harmless when the language
> >>>> is C.
>
> This can indeed be done.  This is the usual "categorization" problem.
> Over the years, we have created categories for change reports in
> libabigail based on user request and consensus.  If you feel strongly
> about it and there is consensu then we'll do it :-)
>
> [...]
>
> Matthias Maennich <maennich@google.com> a écrit:
>
> >>> I would like to keep this behaviour as it is safer to ignore the
> >>> reporting than to miss an ABI/API breakage. In this particular case,
> the
> >>> change can indicate an underlying compatibility problem and reporting
> it
> >>> directs the attention of the developer to this subtle change (which
> >>> probably will not be noticed in other testing as well).
>
> Hehe, see? :-)
>
> It seems to me that we can have a consensus on what you proposed
> earlier, which is:
>
> > [...] stops at the pointer level with voids and doesn’t try to drill
> down. i.e.
> >
> >   [C] 'function char* chop(char*)' at b.c:2:1 has some type changes:
> >     return type changed from 'char*' to ‘void*'
> >     parameter 1 of type 'char*’ changed to ‘void*'
>
> Unless you guys'ngals think otherwise.
>
> [...]
>
> Giuliano Procida <gprocida@google.com> a écrit:
>
> >> There are some things we might want to consider harmful that abidiff
> >> currently considers harmless. One example is enumerator additions, for
> >> two reasons: symmetry with respect to the ordering of the ABIs being
> >> compared and the contravariant/covariant uses of the type.
>
> I am not sure to understand why (in concrete terms) an enumerator
> /addition/ is harmful from an ABI standpoint?  Could you elaborate with
> an example, for instance?  Or maybe we can talk about that in a
> different thread.
>
>
I think I've mentioned this before...

Suppose we have a library and it has a function:

enum Outcome { Failure, Success };

enum Outcome DoTheWork();

and user code looks like (or it could be an if statement, but we're talking
about enums):

switch (DoTheWork()) {
  case Failure:
     blah;
     break;
  case default:
      LaunchMissiles();
      break;
}

Now the library is updated to:

enum Outcome { Failure, Success, PermissionDenied };

And the compiled user code is no longer safe. We will launch some missiles
when we should probably not.

What has made this unsafe, compared with a library function that takes such
an enum as a readonly input, is the position of the enum type w.r.t. the
function type. This is what I meant by co/contravariance. More elaborate
examples are possible, but this is about the simplest one.

See (somewhat related):
https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)
And:
https://stackoverflow.com/questions/2773004/what-is-the-origin-of-launch-the-missiles

Regards,
Giuliano.

Ben Woodard <woodard@redhat.com> a écrit:
>
> > Yeah that one is interesting I was playing with that one last night:
> >
> > $ abidiff c.o d.o
> > Functions changes summary: 0 Removed, 0 Changed (1 filtered out), 0
> Added function
> > Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> >
> > $ echo $?
> > 0
> > $ abidiff d.o c.o
> > Functions changes summary: 0 Removed, 1 Changed, 0 Added function
> > Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> >
> > 1 function with some indirect sub-type change:
> >
> >   [C] 'function char* chop(char*, codes)' at c.C:3:1 has some indirect
> sub-type changes:
> >     parameter 2 of type 'enum codes' has sub-type changes:
> >       type size hasn't changed
> >       1 enumerator deletion:
> >         'codes::ccode' value '2'
> >
> > $ echo $?
> > 4
>
> Right, this behaviour seems correct to me, unless I am missing something
> obvious.  What am I missing?
>
>
> Giuliano Procida <gprocida@google.com> a écrit:
>
> >> Other things we might want the other way around. The main thing I can
> >> think of here is when libabigail's internal naming of anonymous types
> >> leaks into ABI descriptions and subsequently into diffs.
>
> Ben Woodard <woodard@redhat.com> a écrit:
>
> > I think that I pointed out quite a large number of those to dodji and
> > he fixed many problems associated with that in the 2.0 release. That
> > is not to suggest that we caught all of those cases but it is better
> > than it was. As dodji will attest, I’m really good at breaking things.
>
> Right.  We fixed a lot of these with 2.0.
>
> I you find instances of that problem still remaining, please do not
> hesitate to file them.  We ought to fix them and I believe we should be
> able to fix them all, unless proven otherwise of course.
>
> Giuliano Procida <gprocida@google.com> a écrit:
>
> >> The situation in C++ is even worse than what may appear in symbol
> >> mangling. Renaming a type affects RTTI which is accessible to code and
> >> evil code may flip behaviour on that basis.
>
> We should hopefully catch all instance of these, assuming they are
> expressed in DWARF and are part of the ABI, as opposed as being a
> runtime-only property.
>
> [...]
>
> Ben Woodard <woodard@redhat.com> a écrit:
>
> > At least one version of that shows up in
> https://sourceware.org/bugzilla/show_bug.cgi?id=28025
>
> This is typically an example of runtime-only property.  And we logically
> don't detect it.  I should comment on the bug ...
>
> [...]
>
> Cheers,
>
> --
>                 Dodji
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: API changes that do not impact ABI
  2021-10-27 10:22       ` Dodji Seketeli
  2021-10-27 14:32         ` Giuliano Procida
@ 2021-10-27 21:52         ` Ben Woodard
  1 sibling, 0 replies; 8+ messages in thread
From: Ben Woodard @ 2021-10-27 21:52 UTC (permalink / raw)
  To: Dodji Seketeli
  Cc: Ben Woodard via Libabigail, Giuliano Procida, Matthias Maennich

> On Oct 27, 2021, at 3:22 AM, Dodji Seketeli <dodji@seketeli.org> wrote:
> 
> Ben Woodard <woodard@redhat.com <mailto:woodard@redhat.com>> a écrit:
> 
>> At least one version of that shows up in https://sourceware.org/bugzilla/show_bug.cgi?id=28025 <https://sourceware.org/bugzilla/show_bug.cgi?id=28025>
> 
> This is typically an example of runtime-only property.  And we logically
> don't detect it.  I should comment on the bug ...

I don’t think you are correct about that.
The layout of the type is included and then compiled into one version of the binary but then a change of the type is not picked up as being different in the other version of the library. That is basically the definition of an ABI break.

I thnk that the fix is relatively simple. The types which are involved in RTTI should be included in the types which are evaluated for compatiblity in the same way that types which are used as extern variables or parameters are considered. To enumerate these types you just need to look in the LSDA the typenames you must consider are there.

-ben

> 
> [...]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-10-27 21:52 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-06 23:35 API changes that do not impact ABI Ben Woodard
2021-10-07 11:01 ` Matthias Maennich
2021-10-07 16:55   ` Giuliano Procida
2021-10-07 19:48     ` Ben Woodard
2021-10-27 10:22       ` Dodji Seketeli
2021-10-27 14:32         ` Giuliano Procida
2021-10-27 21:52         ` Ben Woodard
2021-10-07 19:21   ` Ben Woodard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).