* Argument passing and ABI: GCC vs clang
@ 2026-02-06 11:20 Jose E. Marchesi
2026-02-09 5:11 ` Vineet Gupta
0 siblings, 1 reply; 17+ messages in thread
From: Jose E. Marchesi @ 2026-02-06 11:20 UTC (permalink / raw)
To: bpf
Hello people!
Yesterday evening Vineet pointed out in IRC some differences between the
way GCC handles argument passing compared to clang/LLVM.
Fact is, there are currently no conventions (documented or not) on how
to pass parameters to BPF functions. When I wrote the backend
originally I just did what clang seemed to be doing.
This is not a problem in practice right now, because external (with ld)
linking is not a thing in the BPF world. However, internal (with
libbpf) linking is something that will probably become the norm,
especially considering the kernel people want to do things like parsing
ELF files from BPF programs, which will require support for some sort of
libraries.
Let's address this.
I have created a new wiki page at:
https://gcc.gnu.org/wiki/BPFBackEnd/ABI
The idea is to precisely document the ABI that GCC (and the rest of the
GNU toolchain) implements. Note how the sectioning follows the typical
psABI document. In each section, we can list the known differences with
what clang/LLVM does. I have started by adding the passing of small
structs in registers.
Please add additional differences as you find them. Vineet I think
observed some difference in how 32-bit arguments/return values are
expanded.
The BPFBackEnd/ABI page is now also linked from the main BPFBackEnd
page.
PS: if someone doesn't have a wiki account or is not in EditorGroup
please speak up.
* Re: Argument passing and ABI: GCC vs clang
2026-02-06 11:20 Argument passing and ABI: GCC vs clang Jose E. Marchesi
@ 2026-02-09 5:11 ` Vineet Gupta
2026-02-11 3:14 ` Alexei Starovoitov
2026-02-19 23:11 ` Function Return ABI (was Re: Argument passing and ABI: GCC vs clang) Vineet Gupta
0 siblings, 2 replies; 17+ messages in thread
From: Vineet Gupta @ 2026-02-09 5:11 UTC (permalink / raw)
To: Jose E. Marchesi, bpf; +Cc: ast, yonghong.song
On 2/6/26 03:20, Jose E. Marchesi wrote:
> Hello people!
>
> Yesterday evening Vineet pointed out in IRC some differences between the
> way GCC handles argument passing compared to clang/LLVM.
For illustration, here's what a simple test looks like with clang and gcc,
for BPF, x86 and aarch64:
https://godbolt.org/z/oerG3eT6f
typedef struct {
int a;
short b;
signed char c;
} my_t;
void foo(signed char, short, int);
void args_setup(my_t *s)
{
foo(s->c, s->b, s->a);
}
int args_consume (signed char a, short b, int c)
{
int x = a;
int y = b;
int z = c;
return x + y + z;
}
clang: -O2 -mcpu=v4 | gcc: -O2 -mcpu=v4
|
args_setup: | args_setup:
r2 = *(s16 *)(r1 + 4) | r3 = *(u32 *) (r1+0)
w3 = *(u32 *)(r1 + 0) | r2 = *(u16 *) (r1+4)
r1 = *(s8 *)(r1 + 6) | r1 = *(u8 *) (r1+6)
call foo | call foo
exit | exit
|
args_consume: | args_consume:
w0 = w2 | r2 = (s16) r2
w0 += w1 | r0 = r3
w0 += w3 | r1 = (s8) r1
exit | w1 += w2
| w0 += w1
| exit
So clang is sign-extending the narrow args at the call site (in the caller)
while gcc is doing this in the callee.
This is not consistent and needs to be fixed in one of the compilers.
Doing this in the callee seems like a better/safer approach as it doesn't
assume the caller always does the right thing, especially when mixing bpf
user code with kernel etc.
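To make the mismatch concrete, here is a minimal C model of it. This is
only a sketch, not either compiler's actual code generation: it assumes a
64-bit argument register represented by a hypothetical reg_t, and all
helper names are made up for illustration. The point is that a
zero-extending load in the caller (like the u8 load in the gcc listing)
combined with a callee that skips the extension (like the clang listing)
produces the wrong value for a negative signed char.

```c
#include <stdint.h>

/* Hypothetical model: one 64-bit BPF argument register. */
typedef uint64_t reg_t;

/* Caller loads the byte zero-extended, like `r1 = *(u8 *)(r1+6)`. */
static reg_t pass_zero_extended(signed char v)
{
    return (reg_t)(uint8_t)v;
}

/* Caller loads the byte sign-extended, like `r1 = *(s8 *)(r1 + 6)`. */
static reg_t pass_sign_extended(signed char v)
{
    return (reg_t)(int64_t)v;
}

/* Callee trusts the caller and uses the low 32 bits as they are.
 * (Conversion to a signed type is assumed to be two's complement.) */
static int32_t callee_trusting(reg_t r)
{
    return (int32_t)(uint32_t)r;
}

/* Callee re-extends the argument itself, like `r1 = (s8) r1`. */
static int32_t callee_extending(reg_t r)
{
    return (int32_t)(int8_t)(uint8_t)r;
}
```

For a = -1, callee_trusting(pass_zero_extended(a)) evaluates to 255,
while the other three caller/callee pairings evaluate to -1. Only the
"zero-extending caller, trusting callee" combination is broken, which is
exactly the combination that arises when the two conventions are mixed.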
BTW where are the w<N> assembler notation and its semantics (32-bit, zero
extension etc) documented? I don't see anything relevant in [1].
Presumably it is just a compiler notation so [1] might not be the ideal place.
[1] https://docs.kernel.org/bpf/standardization/instruction-set.html
> Fact is, there are currently no conventions (documented or not) on how
> to pass parameters to BPF functions. When I wrote the backend
> originally I just did what clang seemed to be doing.
>
> This is not a problem in practice right now, because external (with ld)
> linking is not a thing in the BPF world. However, internal (with
> libbpf) linking is something that will probably become the norm,
> especially considering the kernel people want to do things like parsing
> ELF files from BPF programs, which will require support for some sort of
> libraries.
>
> Let's address this.
>
> I have created a new wiki page at:
>
> https://gcc.gnu.org/wiki/BPFBackEnd/ABI
>
> The idea is to precisely document the ABI that GCC (and the rest of the
> GNU toolchain) implements. Note how the sectioning follows the typical
> psABI document. In each section, we can list the known differences with
> what clang/LLVM does. I have started by adding the passing of small
> structs in registers.
Thx for getting this going.
> Please add additional differences as you find them. Vineet I think
> observed some difference in how 32-bit arguments/return values are
> expanded.
Right, once this thread resolves, we can document this aspect as well.
Thx,
-Vineet
> The BPFBackenEnd/ABI page is now also linked from the main BPFBackEnd
> page.
>
> PS: if someone doesn't have a wiki account or is not in EditorGroup
> please speak up.
>
* Re: Argument passing and ABI: GCC vs clang
2026-02-09 5:11 ` Vineet Gupta
@ 2026-02-11 3:14 ` Alexei Starovoitov
2026-02-11 3:27 ` Andrew Pinski
2026-02-19 23:11 ` Function Return ABI (was Re: Argument passing and ABI: GCC vs clang) Vineet Gupta
1 sibling, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2026-02-11 3:14 UTC (permalink / raw)
To: Vineet Gupta; +Cc: Jose E. Marchesi, bpf, Alexei Starovoitov, Yonghong Song
On Sun, Feb 8, 2026 at 9:11 PM Vineet Gupta <vineet.gupta@linux.dev> wrote:
>
> On 2/6/26 03:20, Jose E. Marchesi wrote:
> > Hello people!
> >
> > Yesterday evening Vineet pointed out in IRC some differences between the
> > way GCC handles argument passing compared to clang/LLVM.
>
> For illustration, here's what a simple test looks like across clang, gcc and x86
> and aarch64
> https://godbolt.org/z/oerG3eT6f
I played with godbolt a bit and looks like GCC is wrong
for both x86 and BPF.
There is no need to do sign extension in args_consume().
It's unnecessary double work.
clang is not doing it in callee for x86 and BPF
and that is a sensible behavior.
> typedef struct {
> int a;
> short b;
> signed char c;
> } my_t;
>
> void foo(signed char, short, int);
>
> char args_setup(my_t *s)
> {
> foo(s->c, s->b, s->a);
> }
>
> int args_consume (signed char a, short b, int c)
> {
> int x = a;
> int y = b;
> int z = c;
>
> return x + y + z;
> }
>
> clang: -O2 -mcpu=v4 | gcc: -O2 -mcpu=v4
> |
> args_setup: | args_setup:
> r2 = *(s16 *)(r1 + 4) | r3 = *(u32 *) (r1+0)
> w3 = *(u32 *)(r1 + 0) | r2 = *(u16 *) (r1+4)
> r1 = *(s8 *)(r1 + 6) | r1 = *(u8 *) (r1+6)
> call foo | call foo
> exit | exit
> |
> args_consume: | args_consume:
> w0 = w2 | r2 = (s16) r2
> w0 += w1 | r0 = r3
> w0 += w3 | r1 = (s8) r1
> exit | w1 += w2
> | w0 += w1
> | exit
>
>
> So clang is sign-extending the narrow args at the call site (in the caller)
> while gcc is doing this in the callee.
> This is not consistent and needs to be fixed in one of the compilers.
gcc should be fixed, obviously.
> Doing this in callee seems like a better/safer approach as doesn't assume caller
> to always be doing the right thing, specially when mixing bpf user code with
> kernel etc.
nope. clang bpf matches clang x86 and that's what we have to preserve.
> BTW where is the w<N> assembler notation and semantics (32-bit, zero extension
> etc) documented. I don't anything relevant in [1]
> Presumably it is just a compiler notation so [1] might not be the ideal place.
>
> [1] https://docs.kernel.org/bpf/standardization/instruction-set.html
The standard says:
"
If execution would result in modulo by zero, for ``ALU64`` the value of
the destination register is unchanged whereas for ``ALU`` the upper
32 bits of the destination register are zeroed.
"
'w' register means that it is alu32 operation.
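As a rough illustration of what that means, the ALU64 and ALU32 forms
can be modelled in C as below. This is a sketch under stated
assumptions: reg_t and the function names are hypothetical, and only
the "upper 32 bits of the destination register are zeroed" rule for
32-bit operations is taken from the instruction-set document.

```c
#include <stdint.h>

/* Hypothetical model: one 64-bit BPF register. */
typedef uint64_t reg_t;

/* ALU64 add, written `rD += rS`: full 64-bit arithmetic. */
static reg_t alu64_add(reg_t dst, reg_t src)
{
    return dst + src;
}

/* ALU32 add, written `wD += wS`: computed on the low 32 bits only,
 * and the upper 32 bits of the destination end up zero. */
static reg_t alu32_add(reg_t dst, reg_t src)
{
    return (reg_t)(uint32_t)((uint32_t)dst + (uint32_t)src);
}

/* ALU32 move, written `wD = wS`: copies the low 32 bits, zeroes the rest. */
static reg_t alu32_mov(reg_t src)
{
    return (reg_t)(uint32_t)src;
}
```

So alu64_add(0xffffffff, 1) gives 0x100000000, whereas
alu32_add(0xffffffff, 1) wraps to 0. A sign-extended -1 in a 64-bit
register still adds correctly as an int: alu32_add(0xffffffffffffffff, 2)
gives 1, which is why the clang version of args_consume can work purely
on w registers once the caller has extended the arguments.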
* Re: Argument passing and ABI: GCC vs clang
2026-02-11 3:14 ` Alexei Starovoitov
@ 2026-02-11 3:27 ` Andrew Pinski
2026-02-11 17:28 ` Alexei Starovoitov
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Pinski @ 2026-02-11 3:27 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Vineet Gupta, Jose E. Marchesi, bpf, Alexei Starovoitov, Yonghong Song
On Tue, Feb 10, 2026 at 7:14 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sun, Feb 8, 2026 at 9:11 PM Vineet Gupta <vineet.gupta@linux.dev> wrote:
> >
> > On 2/6/26 03:20, Jose E. Marchesi wrote:
> > > Hello people!
> > >
> > > Yesterday evening Vineet pointed out in IRC some differences between the
> > > way GCC handles argument passing compared to clang/LLVM.
> >
> > For illustration, here's what a simple test looks like across clang, gcc and x86
> > and aarch64
> > https://godbolt.org/z/oerG3eT6f
>
> I played with godbolt a bit and looks like GCC is wrong
> for both x86 and BPF.
> There is no need to do sign extension in args_consume().
> It's unnecessary double work.
> clang is not doing it in callee for x86 and BPF
> and that is a sensible behavior.
No, clang/LLVM is known to be broken for the x86 ABI. See
https://github.com/llvm/llvm-project/issues/12579 and
https://gcc.gnu.org/legacy-ml/gcc/2013-01/msg00447.html .
And the ABI issue was filed as
https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/17 which clarified
it such that clang is broken.
>
> > typedef struct {
> > int a;
> > short b;
> > signed char c;
> > } my_t;
> >
> > void foo(signed char, short, int);
> >
> > char args_setup(my_t *s)
> > {
> > foo(s->c, s->b, s->a);
> > }
> >
> > int args_consume (signed char a, short b, int c)
> > {
> > int x = a;
> > int y = b;
> > int z = c;
> >
> > return x + y + z;
> > }
> >
> > clang: -O2 -mcpu=v4 | gcc: -O2 -mcpu=v4
> > |
> > args_setup: | args_setup:
> > r2 = *(s16 *)(r1 + 4) | r3 = *(u32 *) (r1+0)
> > w3 = *(u32 *)(r1 + 0) | r2 = *(u16 *) (r1+4)
> > r1 = *(s8 *)(r1 + 6) | r1 = *(u8 *) (r1+6)
> > call foo | call foo
> > exit | exit
> > |
> > args_consume: | args_consume:
> > w0 = w2 | r2 = (s16) r2
> > w0 += w1 | r0 = r3
> > w0 += w3 | r1 = (s8) r1
> > exit | w1 += w2
> > | w0 += w1
> > | exit
> >
> >
> > So clang is sign-extending the narrow args at the call site (in the caller)
> > while gcc is doing this in the callee.
> > This is not consistent and needs to be fixed in one of the compilers.
>
> gcc should be fixed, obviously.
>
> > Doing this in callee seems like a better/safer approach as doesn't assume caller
> > to always be doing the right thing, specially when mixing bpf user code with
> > kernel etc.
>
> nope. clang bpf matches clang x86 and that's what we have to preserve.
Except clang x86 does NOT match the x86 ABI so ...
Thanks,
Andrew
>
> > BTW where is the w<N> assembler notation and semantics (32-bit, zero extension
> > etc) documented. I don't anything relevant in [1]
> > Presumably it is just a compiler notation so [1] might not be the ideal place.
> >
> > [1] https://docs.kernel.org/bpf/standardization/instruction-set.html
>
> The standard says:
> "
> If execution would result in modulo by zero, for ``ALU64`` the value of
> the destination register is unchanged whereas for ``ALU`` the upper
> 32 bits of the destination register are zeroed.
> "
>
> 'w' register means that it is alu32 operation.
* Re: Argument passing and ABI: GCC vs clang
2026-02-11 3:27 ` Andrew Pinski
@ 2026-02-11 17:28 ` Alexei Starovoitov
2026-02-11 17:32 ` Andrew Pinski
2026-02-11 20:34 ` Jose E. Marchesi
0 siblings, 2 replies; 17+ messages in thread
From: Alexei Starovoitov @ 2026-02-11 17:28 UTC (permalink / raw)
To: Andrew Pinski
Cc: Vineet Gupta, Jose E. Marchesi, bpf, Alexei Starovoitov, Yonghong Song
On Tue, Feb 10, 2026 at 7:27 PM Andrew Pinski
<andrew.pinski@oss.qualcomm.com> wrote:
>
> On Tue, Feb 10, 2026 at 7:14 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Sun, Feb 8, 2026 at 9:11 PM Vineet Gupta <vineet.gupta@linux.dev> wrote:
> > >
> > > On 2/6/26 03:20, Jose E. Marchesi wrote:
> > > > Hello people!
> > > >
> > > > Yesterday evening Vineet pointed out in IRC some differences between the
> > > > way GCC handles argument passing compared to clang/LLVM.
> > >
> > > For illustration, here's what a simple test looks like across clang, gcc and x86
> > > and aarch64
> > > https://godbolt.org/z/oerG3eT6f
> >
> > I played with godbolt a bit and looks like GCC is wrong
> > for both x86 and BPF.
> > There is no need to do sign extension in args_consume().
> > It's unnecessary double work.
> > clang is not doing it in callee for x86 and BPF
> > and that is a sensible behavior.
>
> No clang/LLVM is known to be broken for x86 ABI. See
> https://github.com/llvm/llvm-project/issues/12579 and
> https://gcc.gnu.org/legacy-ml/gcc/2013-01/msg00447.html .
> And the ABI issue was filed as
> https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/17 which clarified
> it such that clang is broken.
I'm reading this differently.
14 year old "issue" is still not fixed and it won't be.
It's a bug in x86 psABI now. clang is being practical and
not being religious about "standards" that don't make much sense.
Like this case. There is zero reason to waste cycles in the callee.
If gcc wants to waste cycles because "standard" that's gcc choice.
* Re: Argument passing and ABI: GCC vs clang
2026-02-11 17:28 ` Alexei Starovoitov
@ 2026-02-11 17:32 ` Andrew Pinski
2026-02-11 20:34 ` Jose E. Marchesi
1 sibling, 0 replies; 17+ messages in thread
From: Andrew Pinski @ 2026-02-11 17:32 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Vineet Gupta, Jose E. Marchesi, bpf, Alexei Starovoitov, Yonghong Song
On Wed, Feb 11, 2026 at 9:28 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Feb 10, 2026 at 7:27 PM Andrew Pinski
> <andrew.pinski@oss.qualcomm.com> wrote:
> >
> > On Tue, Feb 10, 2026 at 7:14 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Sun, Feb 8, 2026 at 9:11 PM Vineet Gupta <vineet.gupta@linux.dev> wrote:
> > > >
> > > > On 2/6/26 03:20, Jose E. Marchesi wrote:
> > > > > Hello people!
> > > > >
> > > > > Yesterday evening Vineet pointed out in IRC some differences between the
> > > > > way GCC handles argument passing compared to clang/LLVM.
> > > >
> > > > For illustration, here's what a simple test looks like across clang, gcc and x86
> > > > and aarch64
> > > > https://godbolt.org/z/oerG3eT6f
> > >
> > > I played with godbolt a bit and looks like GCC is wrong
> > > for both x86 and BPF.
> > > There is no need to do sign extension in args_consume().
> > > It's unnecessary double work.
> > > clang is not doing it in callee for x86 and BPF
> > > and that is a sensible behavior.
> >
> > No clang/LLVM is known to be broken for x86 ABI. See
> > https://github.com/llvm/llvm-project/issues/12579 and
> > https://gcc.gnu.org/legacy-ml/gcc/2013-01/msg00447.html .
> > And the ABI issue was filed as
> > https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/17 which clarified
> > it such that clang is broken.
>
> I'm reading this differently.
> 14 year old "issue" is still not fixed and it won't be.
> It's a bug in x86 psABI now. clang is being practical and
> not being religious about "standards" that don't make much sense.
Again the psABI was just clarified last year to say LLVM is out of spec.
> Like this case. There is zero reason to waste cycles in the callee.
> If gcc wants to waste cycles because "standard" that's gcc choice.
NO, LLVM is still broken and needs to get fixed.
* Re: Argument passing and ABI: GCC vs clang
2026-02-11 17:28 ` Alexei Starovoitov
2026-02-11 17:32 ` Andrew Pinski
@ 2026-02-11 20:34 ` Jose E. Marchesi
2026-02-12 1:33 ` Alexei Starovoitov
1 sibling, 1 reply; 17+ messages in thread
From: Jose E. Marchesi @ 2026-02-11 20:34 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Andrew Pinski, Vineet Gupta, bpf, Alexei Starovoitov, Yonghong Song
> On Tue, Feb 10, 2026 at 7:27 PM Andrew Pinski
> <andrew.pinski@oss.qualcomm.com> wrote:
>>
>> On Tue, Feb 10, 2026 at 7:14 PM Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>> >
>> > On Sun, Feb 8, 2026 at 9:11 PM Vineet Gupta <vineet.gupta@linux.dev> wrote:
>> > >
>> > > On 2/6/26 03:20, Jose E. Marchesi wrote:
>> > > > Hello people!
>> > > >
>> > > > Yesterday evening Vineet pointed out in IRC some differences between the
>> > > > way GCC handles argument passing compared to clang/LLVM.
>> > >
>> > > For illustration, here's what a simple test looks like across clang, gcc and x86
>> > > and aarch64
>> > > https://godbolt.org/z/oerG3eT6f
>> >
>> > I played with godbolt a bit and looks like GCC is wrong
>> > for both x86 and BPF.
>> > There is no need to do sign extension in args_consume().
>> > It's unnecessary double work.
>> > clang is not doing it in callee for x86 and BPF
>> > and that is a sensible behavior.
>>
>> No clang/LLVM is known to be broken for x86 ABI. See
>> https://github.com/llvm/llvm-project/issues/12579 and
>> https://gcc.gnu.org/legacy-ml/gcc/2013-01/msg00447.html .
>> And the ABI issue was filed as
>> https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/17 which clarified
>> it such that clang is broken.
>
> I'm reading this differently.
> 14 year old "issue" is still not fixed and it won't be.
> It's a bug in x86 psABI now. clang is being practical and
> not being religious about "standards" that don't make much sense.
> Like this case. There is zero reason to waste cycles in the callee.
> If gcc wants to waste cycles because "standard" that's gcc choice.
I'm down with the flu after FOSDEM (thought I would escape this year
unscathed, but no..) and not very operative right now, but it seems to
me that it matters little how buggy clang is in x86 or how carefully they
follow standards.
What we need is to decide what to do in BPF, agree to an ABI, and make
sure both compilers stick to it.
I agree it makes sense to avoid superfluous instructions whenever
possible. So what makes most sense for BPF, callee or caller?
As far as I understand it, the concern expressed in [1] regarding
partial register stalls would be moot if the caller is required (unlike
in x86) to always set all the bits of registers used in argument
passing..
[1] https://gcc.gnu.org/legacy-ml/gcc/2013-01/msg00447.html
* Re: Argument passing and ABI: GCC vs clang
2026-02-11 20:34 ` Jose E. Marchesi
@ 2026-02-12 1:33 ` Alexei Starovoitov
2026-02-12 2:46 ` Andrew Pinski
0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2026-02-12 1:33 UTC (permalink / raw)
To: Jose E. Marchesi
Cc: Andrew Pinski, Vineet Gupta, bpf, Alexei Starovoitov, Yonghong Song
On Wed, Feb 11, 2026 at 12:34 PM Jose E. Marchesi <jemarch@gnu.org> wrote:
>
>
> > On Tue, Feb 10, 2026 at 7:27 PM Andrew Pinski
> > <andrew.pinski@oss.qualcomm.com> wrote:
> >>
> >> On Tue, Feb 10, 2026 at 7:14 PM Alexei Starovoitov
> >> <alexei.starovoitov@gmail.com> wrote:
> >> >
> >> > On Sun, Feb 8, 2026 at 9:11 PM Vineet Gupta <vineet.gupta@linux.dev> wrote:
> >> > >
> >> > > On 2/6/26 03:20, Jose E. Marchesi wrote:
> >> > > > Hello people!
> >> > > >
> >> > > > Yesterday evening Vineet pointed out in IRC some differences between the
> >> > > > way GCC handles argument passing compared to clang/LLVM.
> >> > >
> >> > > For illustration, here's what a simple test looks like across clang, gcc and x86
> >> > > and aarch64
> >> > > https://godbolt.org/z/oerG3eT6f
> >> >
> >> > I played with godbolt a bit and looks like GCC is wrong
> >> > for both x86 and BPF.
> >> > There is no need to do sign extension in args_consume().
> >> > It's unnecessary double work.
> >> > clang is not doing it in callee for x86 and BPF
> >> > and that is a sensible behavior.
> >>
> >> No clang/LLVM is known to be broken for x86 ABI. See
> >> https://github.com/llvm/llvm-project/issues/12579 and
> >> https://gcc.gnu.org/legacy-ml/gcc/2013-01/msg00447.html .
> >> And the ABI issue was filed as
> >> https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/17 which clarified
> >> it such that clang is broken.
> >
> > I'm reading this differently.
> > 14 year old "issue" is still not fixed and it won't be.
> > It's a bug in x86 psABI now. clang is being practical and
> > not being religious about "standards" that don't make much sense.
> > Like this case. There is zero reason to waste cycles in the callee.
> > If gcc wants to waste cycles because "standard" that's gcc choice.
>
> I'm down with the flu after FOSDEM (thought I would escape this year
> unscathered, but no..) and not very operative right now, but it seems to
> me that it matters little how buggy clang is in x86 or how careful they
> are following standards.
>
> What we need is to decide what to do in BPF, agree to an ABI, and make
> sure both compilers stick to it.
>
> I agree it makes sense to avoid superfluous instructions whenever
> possible. So what makes most sense for BPF, callee or caller?
It doesn't matter what is more logical or convenient.
The kernels are compiled with clang and clang extends on the caller
side. For passing the args and accepting returns.
BPF codegen has to match that behavior.
Whether it's a bug or feature, whether it conforms to the
standard or not, it doesn't matter.
This behavior was there for years and BPF has to be compatible.
So clang-bpf will keep extending in the caller and not in callee.
If gcc wants to do it in both, that's fine, but as a minimum it has to do
it in the caller. Currently it doesn't, which means that gcc-compiled
bpf progs will not run properly in the kernel because they won't be
ABI-compatible with x86 kernel code compiled by clang.
Also I really doubt that clang will change its x86 behavior
just because psABI was clarified. At this point it's a feature, not a bug.
Even if it does change in some future version the kernels are
compiled with current clang, so clang-bpf and gcc-bpf have no choice.
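The "doing it in both is fine" part can be checked with a small C model
of the extension itself. This is a sketch only: reg_t, sext8 and sext16
are hypothetical names standing in for the BPF `r = (s8) r` and
`r = (s16) r` moves, and nothing here is taken from either compiler.
Sign extension is idempotent, so a callee that extends again after the
caller already did cannot change the value; it only costs instructions.

```c
#include <stdint.h>

/* Hypothetical model: one 64-bit BPF register. */
typedef uint64_t reg_t;

/* Sign-extend the low 8 bits to 64 bits, like `r = (s8) r`.
 * Flipping the sign bit and subtracting it back is a portable way
 * to do this with unsigned arithmetic only. */
static reg_t sext8(reg_t r)
{
    reg_t low = r & 0xff;
    return (low ^ 0x80) - 0x80;
}

/* Sign-extend the low 16 bits to 64 bits, like `r = (s16) r`. */
static reg_t sext16(reg_t r)
{
    reg_t low = r & 0xffff;
    return (low ^ 0x8000) - 0x8000;
}
```

For any register value x, sext8(sext8(x)) equals sext8(x), and the same
holds for sext16. What differs between the two conventions is therefore
not the result when both sides extend, but whether a callee may rely on
the caller having done it.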
* Re: Argument passing and ABI: GCC vs clang
2026-02-12 1:33 ` Alexei Starovoitov
@ 2026-02-12 2:46 ` Andrew Pinski
2026-02-12 3:01 ` Alexei Starovoitov
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Pinski @ 2026-02-12 2:46 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Jose E. Marchesi, Vineet Gupta, bpf, Alexei Starovoitov, Yonghong Song
On Wed, Feb 11, 2026 at 5:33 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Feb 11, 2026 at 12:34 PM Jose E. Marchesi <jemarch@gnu.org> wrote:
> >
> >
> > > On Tue, Feb 10, 2026 at 7:27 PM Andrew Pinski
> > > <andrew.pinski@oss.qualcomm.com> wrote:
> > >>
> > >> On Tue, Feb 10, 2026 at 7:14 PM Alexei Starovoitov
> > >> <alexei.starovoitov@gmail.com> wrote:
> > >> >
> > >> > On Sun, Feb 8, 2026 at 9:11 PM Vineet Gupta <vineet.gupta@linux.dev> wrote:
> > >> > >
> > >> > > On 2/6/26 03:20, Jose E. Marchesi wrote:
> > >> > > > Hello people!
> > >> > > >
> > >> > > > Yesterday evening Vineet pointed out in IRC some differences between the
> > >> > > > way GCC handles argument passing compared to clang/LLVM.
> > >> > >
> > >> > > For illustration, here's what a simple test looks like across clang, gcc and x86
> > >> > > and aarch64
> > >> > > https://godbolt.org/z/oerG3eT6f
> > >> >
> > >> > I played with godbolt a bit and looks like GCC is wrong
> > >> > for both x86 and BPF.
> > >> > There is no need to do sign extension in args_consume().
> > >> > It's unnecessary double work.
> > >> > clang is not doing it in callee for x86 and BPF
> > >> > and that is a sensible behavior.
> > >>
> > >> No clang/LLVM is known to be broken for x86 ABI. See
> > >> https://github.com/llvm/llvm-project/issues/12579 and
> > >> https://gcc.gnu.org/legacy-ml/gcc/2013-01/msg00447.html .
> > >> And the ABI issue was filed as
> > >> https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/17 which clarified
> > >> it such that clang is broken.
> > >
> > > I'm reading this differently.
> > > 14 year old "issue" is still not fixed and it won't be.
> > > It's a bug in x86 psABI now. clang is being practical and
> > > not being religious about "standards" that don't make much sense.
> > > Like this case. There is zero reason to waste cycles in the callee.
> > > If gcc wants to waste cycles because "standard" that's gcc choice.
> >
> > I'm down with the flu after FOSDEM (thought I would escape this year
> > unscathered, but no..) and not very operative right now, but it seems to
> > me that it matters little how buggy clang is in x86 or how careful they
> > are following standards.
> >
> > What we need is to decide what to do in BPF, agree to an ABI, and make
> > sure both compilers stick to it.
> >
> > I agree it makes sense to avoid superfluous instructions whenever
> > possible. So what makes most sense for BPF, callee or caller?
>
> It doesn't matter what is more logical or convenient.
> The kernels are compiled with clang and clang extends on the caller
> side. For passing the args and accepting returns.
> BPF codegen has to match that behavior.
> Whether it's a bug or feature, whether it conforms to the
> standard or not, it doesn't matter.
> This behavior was there for years and BPF has to be compatible.
> So clang-bpf will keep extending in the caller and not in callee.
> If gcc wants to do it in both, that's fine, but as a minimum it has to do
> in the caller. Currently it doesn't, which means that gcc compiled
> bpf progs will not run properly in the kernel because they won't be
> compatible ABI-wise with x86 kernel code compiled by clang.
Except GCC has never done it on both sides. That is my point.
Things are already broken.
Also aarch64 ABI only does it on the callee side rather than the
caller side (clang does it the same as GCC here except for darwin
which has a different ABI).
So how do you propose to handle that?
>
> Also I really doubt that clang will change its x86 behavior
> just because psABI was clarified. At this point it's a feature, not a bug.
> Even if it does change in some future version the kernels are
> compiled with current clang, so clang-bpf and gcc-bpf have no choice.
* Re: Argument passing and ABI: GCC vs clang
2026-02-12 2:46 ` Andrew Pinski
@ 2026-02-12 3:01 ` Alexei Starovoitov
2026-02-12 3:42 ` Andrew Pinski
0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2026-02-12 3:01 UTC (permalink / raw)
To: Andrew Pinski
Cc: Jose E. Marchesi, Vineet Gupta, bpf, Alexei Starovoitov, Yonghong Song
On Wed, Feb 11, 2026 at 6:46 PM Andrew Pinski
<andrew.pinski@oss.qualcomm.com> wrote:
>
> On Wed, Feb 11, 2026 at 5:33 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Feb 11, 2026 at 12:34 PM Jose E. Marchesi <jemarch@gnu.org> wrote:
> > >
> > >
> > > > On Tue, Feb 10, 2026 at 7:27 PM Andrew Pinski
> > > > <andrew.pinski@oss.qualcomm.com> wrote:
> > > >>
> > > >> On Tue, Feb 10, 2026 at 7:14 PM Alexei Starovoitov
> > > >> <alexei.starovoitov@gmail.com> wrote:
> > > >> >
> > > >> > On Sun, Feb 8, 2026 at 9:11 PM Vineet Gupta <vineet.gupta@linux.dev> wrote:
> > > >> > >
> > > >> > > On 2/6/26 03:20, Jose E. Marchesi wrote:
> > > >> > > > Hello people!
> > > >> > > >
> > > >> > > > Yesterday evening Vineet pointed out in IRC some differences between the
> > > >> > > > way GCC handles argument passing compared to clang/LLVM.
> > > >> > >
> > > >> > > For illustration, here's what a simple test looks like across clang, gcc and x86
> > > >> > > and aarch64
> > > >> > > https://godbolt.org/z/oerG3eT6f
> > > >> >
> > > >> > I played with godbolt a bit and looks like GCC is wrong
> > > >> > for both x86 and BPF.
> > > >> > There is no need to do sign extension in args_consume().
> > > >> > It's unnecessary double work.
> > > >> > clang is not doing it in callee for x86 and BPF
> > > >> > and that is a sensible behavior.
> > > >>
> > > >> No clang/LLVM is known to be broken for x86 ABI. See
> > > >> https://github.com/llvm/llvm-project/issues/12579 and
> > > >> https://gcc.gnu.org/legacy-ml/gcc/2013-01/msg00447.html .
> > > >> And the ABI issue was filed as
> > > >> https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/17 which clarified
> > > >> it such that clang is broken.
> > > >
> > > > I'm reading this differently.
> > > > 14 year old "issue" is still not fixed and it won't be.
> > > > It's a bug in x86 psABI now. clang is being practical and
> > > > not being religious about "standards" that don't make much sense.
> > > > Like this case. There is zero reason to waste cycles in the callee.
> > > > If gcc wants to waste cycles because "standard" that's gcc choice.
> > >
> > > I'm down with the flu after FOSDEM (thought I would escape this year
> > > unscathered, but no..) and not very operative right now, but it seems to
> > > me that it matters little how buggy clang is in x86 or how careful they
> > > are following standards.
> > >
> > > What we need is to decide what to do in BPF, agree to an ABI, and make
> > > sure both compilers stick to it.
> > >
> > > I agree it makes sense to avoid superfluous instructions whenever
> > > possible. So what makes most sense for BPF, callee or caller?
> >
> > It doesn't matter what is more logical or convenient.
> > The kernels are compiled with clang and clang extends on the caller
> > side. For passing the args and accepting returns.
> > BPF codegen has to match that behavior.
> > Whether it's a bug or feature, whether it conforms to the
> > standard or not, it doesn't matter.
> > This behavior was there for years and BPF has to be compatible.
> > So clang-bpf will keep extending in the caller and not in callee.
> > If gcc wants to do it in both, that's fine, but as a minimum it has to do
> > in the caller. Currently it doesn't, which means that gcc compiled
> > bpf progs will not run properly in the kernel because they won't be
> > compatible ABI-wise with x86 kernel code compiled by clang.
>
> Except GCC has never done it on both sides. That is my point.
What do you mean?
The first link in this discussion:
https://godbolt.org/z/oerG3eT6f
shows that gcc sign extends in the caller and callee on x86.
> Things are already broken.
> Also aarch64 ABI only does it on the callee side rather than the
> caller side (clang does it the same as GCC here except for darwin
> which has a different ABI).
> So how do you propose to handle that?
JITs add necessary extensions when necessary. Like on riscv and loongarch
it's unavoidable.
Looks like we missed arm64 case when bpf prog is a callee and caller
is arm64 kernel. Not a difficult fix, but we must avoid the overhead on x86.
* Re: Argument passing and ABI: GCC vs clang
2026-02-12 3:01 ` Alexei Starovoitov
@ 2026-02-12 3:42 ` Andrew Pinski
2026-02-12 3:46 ` Andrew Pinski
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Pinski @ 2026-02-12 3:42 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Jose E. Marchesi, Vineet Gupta, bpf, Alexei Starovoitov, Yonghong Song
On Wed, Feb 11, 2026 at 7:01 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Feb 11, 2026 at 6:46 PM Andrew Pinski
> <andrew.pinski@oss.qualcomm.com> wrote:
> >
> > On Wed, Feb 11, 2026 at 5:33 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Wed, Feb 11, 2026 at 12:34 PM Jose E. Marchesi <jemarch@gnu.org> wrote:
> > > >
> > > >
> > > > > On Tue, Feb 10, 2026 at 7:27 PM Andrew Pinski
> > > > > <andrew.pinski@oss.qualcomm.com> wrote:
> > > > >>
> > > > >> On Tue, Feb 10, 2026 at 7:14 PM Alexei Starovoitov
> > > > >> <alexei.starovoitov@gmail.com> wrote:
> > > > >> >
> > > > >> > On Sun, Feb 8, 2026 at 9:11 PM Vineet Gupta <vineet.gupta@linux.dev> wrote:
> > > > >> > >
> > > > >> > > On 2/6/26 03:20, Jose E. Marchesi wrote:
> > > > >> > > > Hello people!
> > > > >> > > >
> > > > >> > > > Yesterday evening Vineet pointed out in IRC some differences between the
> > > > >> > > > way GCC handles argument passing compared to clang/LLVM.
> > > > >> > >
> > > > >> > > For illustration, here's what a simple test looks like across clang, gcc and x86
> > > > >> > > and aarch64
> > > > >> > > https://godbolt.org/z/oerG3eT6f
> > > > >> >
> > > > >> > I played with godbolt a bit and looks like GCC is wrong
> > > > >> > for both x86 and BPF.
> > > > >> > There is no need to do sign extension in args_consume().
> > > > >> > It's unnecessary double work.
> > > > >> > clang is not doing it in callee for x86 and BPF
> > > > >> > and that is a sensible behavior.
> > > > >>
> > > > > >> No, clang/LLVM is known to be broken for the x86 ABI. See
> > > > >> https://github.com/llvm/llvm-project/issues/12579 and
> > > > >> https://gcc.gnu.org/legacy-ml/gcc/2013-01/msg00447.html .
> > > > >> And the ABI issue was filed as
> > > > >> https://gitlab.com/x86-psABIs/x86-64-ABI/-/issues/17 which clarified
> > > > >> it such that clang is broken.
> > > > >
> > > > > I'm reading this differently.
> > > > > 14 year old "issue" is still not fixed and it won't be.
> > > > > It's a bug in x86 psABI now. clang is being practical and
> > > > > not being religious about "standards" that don't make much sense.
> > > > > Like this case. There is zero reason to waste cycles in the callee.
> > > > > If gcc wants to waste cycles because "standard" that's gcc choice.
> > > >
> > > > I'm down with the flu after FOSDEM (thought I would escape this year
> > > > > unscathed, but no..) and not very operative right now, but it seems to
> > > > me that it matters little how buggy clang is in x86 or how careful they
> > > > are following standards.
> > > >
> > > > What we need is to decide what to do in BPF, agree to an ABI, and make
> > > > sure both compilers stick to it.
> > > >
> > > > I agree it makes sense to avoid superfluous instructions whenever
> > > > possible. So what makes most sense for BPF, callee or caller?
> > >
> > > It doesn't matter what is more logical or convenient.
> > > The kernels are compiled with clang and clang extends on the caller
> > > side. For passing the args and accepting returns.
> > > BPF codegen has to match that behavior.
> > > Whether it's a bug or feature, whether it conforms to the
> > > standard or not, it doesn't matter.
> > > This behavior was there for years and BPF has to be compatible.
> > > So clang-bpf will keep extending in the caller and not in callee.
> > > If gcc wants to do it in both, that's fine, but as a minimum it has to do
> > > in the caller. Currently it doesn't, which means that gcc compiled
> > > bpf progs will not run properly in the kernel because they won't be
> > > compatible ABI-wise with x86 kernel code compiled by clang.
> >
> > Except GCC has never done it on both sides. That is my point.
>
> What do you mean?
> The first link in this discussion:
> https://godbolt.org/z/oerG3eT6f
> shows that gcc sign extends in the caller and callee on x86.
Except it always sign extends; it does NOT pick sign or zero extension based on the argument type :).
>
> > Things are already broken.
> > Also aarch64 ABI only does it on the callee side rather than the
> > caller side (clang does it the same as GCC here except for darwin
> > which has a different ABI).
> > So how do you propose to handle that?
>
> JITs add necessary extensions when necessary. Like on riscv and loongarch
> it's unavoidable.
> Looks like we missed arm64 case when bpf prog is a callee and caller
> is arm64 kernel. Not a difficult fix, but we must avoid the overhead on x86.
Except you can't, because you didn't test zero vs sign extension; you only
assumed that it sign extends.
Also, it does NOT sign extend to 64 bits in your test either; 32-bit
operations zero extend.
So again, the upper bits are still undefined in the x86 ABI.
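A minimal C sketch of the three register states being argued about, for a byte whose top bit is set. The function names are hypothetical labels borrowed from the x86 AT&T mnemonics, and plain integer casts stand in for the instructions; the only fact relied on is that a write to a 32-bit register zeroes the upper 32 bits on x86-64.

```c
#include <stdint.h>

/* Sign extend 8 -> 32 bits; the 32-bit write zeroes bits 63..32.
 * (Converting 0x80 to int8_t yields -128 on GCC and clang.) */
uint64_t movsbl_model(uint8_t b)
{
    return (uint32_t)(int32_t)(int8_t)b;
}

/* Sign extend 8 -> 64 bits: all upper bits copy the sign bit. */
uint64_t movsbq_model(uint8_t b)
{
    return (uint64_t)(int64_t)(int8_t)b;
}

/* Zero extend 8 -> 32 bits; the upper 32 bits are zeroed as well. */
uint64_t movzbl_model(uint8_t b)
{
    return b;
}
```

For b = 0x80 the three models give 0x00000000FFFFFF80, 0xFFFFFFFFFFFFFF80 and 0x0000000000000080 respectively, so a callee that skips its own extension sees a different value depending on which one the caller picked.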
* Re: Argument passing and ABI: GCC vs clang
2026-02-12 3:42 ` Andrew Pinski
@ 2026-02-12 3:46 ` Andrew Pinski
2026-02-12 4:37 ` Andrew Pinski
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Pinski @ 2026-02-12 3:46 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Jose E. Marchesi, Vineet Gupta, bpf, Alexei Starovoitov, Yonghong Song
On Wed, Feb 11, 2026 at 7:42 PM Andrew Pinski
<andrew.pinski@oss.qualcomm.com> wrote:
> Except you can't because you didn't test zero vs sign extend; only
> thinking it sign extends.
> Also it is does NOT sign extend to 64bit either in your test; 32bit
> zero extends.
> So again the upper bits for the ABI for x86 is still undefined.
If you want to see the difference try:
```
typedef struct {
int a;
short b;
signed char c;
} my_t;
void foou(unsigned char, unsigned short, unsigned int);
void foo(signed char, signed short, signed int);
char args_setup(my_t s)
{
foo(s.c, s.b, s.a);
}
char args_setup_u(my_t s)
{
foou(s.c, s.b, s.a);
}
```
You can see that GCC generates the same code for these 2 functions, so
the upper bits are not defined.
Thanks,
Andrew
* Re: Argument passing and ABI: GCC vs clang
2026-02-12 3:46 ` Andrew Pinski
@ 2026-02-12 4:37 ` Andrew Pinski
2026-02-12 19:47 ` Alexei Starovoitov
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Pinski @ 2026-02-12 4:37 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Jose E. Marchesi, Vineet Gupta, bpf, Alexei Starovoitov, Yonghong Song
On Wed, Feb 11, 2026 at 7:46 PM Andrew Pinski
<andrew.pinski@oss.qualcomm.com> wrote:
> If you want to see the difference try:
> ```
> typedef struct {
> int a;
> short b;
> signed char c;
> } my_t;
>
> void foou(unsigned char, unsigned short, unsigned int);
> void foo(signed char, signed short, signed int);
> char args_setup(my_t s)
> {
> foo(s.c, s.b, s.a);
> }
> char args_setup_u(my_t s)
> {
> foou(s.c, s.b, s.a);
> }
> ```
>
> You can see GCC does NOT change the code between these 2 functions and
> see the upper bits are not defined.
Here is another example:
```
void foo(signed char, signed short, signed int);
void foou(unsigned char, unsigned short, unsigned int);
void args_setup(long c, long b, long a)
{
foo(c, b, a);
}
void args_setupu(long c, long b, long a)
{
foou(c, b, a);
}
```
GCC sign extends in both args_setupu and args_setup, while clang
zero extends in args_setupu and sign extends in args_setup.
Oh look, the upper 32 bits of the int case are undefined in both cases.
In theory GCC does not need to sign extend in args_setup/args_setupu,
but it does because of reasons.
>
> Thanks,
> Andrew
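A minimal C sketch of why mixing the two conventions breaks for unsigned char, assuming the behaviour reported in this thread (a caller that always sign extends, and a callee that does no extension of its own). The function names are hypothetical and plain integer casts stand in for the generated instructions.

```c
#include <stdint.h>

/* Caller in the style reported for gcc-x86: always sign extend to 32 bits.
 * The 32-bit write leaves the upper half of the 64-bit register zeroed.
 * (Converting 0x80 to int8_t yields -128 on GCC and clang.) */
uint64_t caller_always_sext(uint8_t arg)
{
    return (uint32_t)(int32_t)(int8_t)arg;
}

/* Caller in the style reported for clang: zero extend an unsigned char. */
uint64_t caller_zext(uint8_t arg)
{
    return arg;
}

/* Callee in the style reported for clang: no extension of its own, it
 * widens the unsigned char parameter by reading the low 32 bits as-is. */
uint32_t callee_trusts_caller(uint64_t reg)
{
    return (uint32_t)reg;
}

/* Callee that re-extends the parameter itself before using it. */
uint32_t callee_reextends(uint64_t reg)
{
    return (uint8_t)reg;
}
```

With arg = 0x80, callee_trusts_caller(caller_zext(arg)) yields 0x80, while callee_trusts_caller(caller_always_sext(arg)) yields 0xFFFFFF80. Only callee_reextends() is immune to the caller's choice, at the cost of the extra instruction discussed earlier in the thread.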
* Re: Argument passing and ABI: GCC vs clang
2026-02-12 4:37 ` Andrew Pinski
@ 2026-02-12 19:47 ` Alexei Starovoitov
0 siblings, 0 replies; 17+ messages in thread
From: Alexei Starovoitov @ 2026-02-12 19:47 UTC (permalink / raw)
To: Andrew Pinski
Cc: Jose E. Marchesi, Vineet Gupta, bpf, Alexei Starovoitov, Yonghong Song
On Wed, Feb 11, 2026 at 8:37 PM Andrew Pinski
<andrew.pinski@oss.qualcomm.com> wrote:
>
>
> Here is another example:
> ```
> void foo(signed char, signed short, signed int);
> void foou(unsigned char, unsigned short, unsigned int);
> void args_setup(long c, long b, long a)
> {
> foo(c, b, a);
> }
> void args_setupu(long c, long b, long a)
> {
> foou(c, b, a);
> }
> ```
>
> GCC uses sign extends for both args_setupu and args_setup while clang
> uses zero extends for args_setupu and sign extends for args_setup.
url for above:
https://godbolt.org/z/jW79W9bGq
yeah. it's a mess. Looks like gcc compiled code is incompatible
with clang on x86.
clang does nothing in args_consume(), since it assumes that
the caller did appropriate sign or zero extension.
This is the behavior for both x86 and bpf.
So compiling args_setupu() with gcc-x86 and args_consume() with clang-x86
is broken.
> Oh look, the upper 32bits of the int case are undefined in both cases.
well, the upper 32 bits are defined to be zeroed, as any write to a 32-bit
sub-register clears them on x86, arm64 and bpf.
> In theory GCC does not need to a sign extend in args_setup/args_setupu
gcc-x86 should do zero or sign extend just like clang does to be compatible.
Anyhow, back to gcc-bpf. I hope now it's clear that it has to do what
clang-bpf does.
For gcc compiled kernels the x86 JIT would need to add zero extension
in a trampoline when it transitions from gcc-x86 code to bpf for
unsigned char/short arguments. Oh well.
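A rough C sketch of the fix-up such a trampoline would have to perform, assuming only what is stated above: unsigned char/short arguments may arrive sign extended and must be zero extended before the BPF program sees them. zext_narrow_arg() is a hypothetical helper for illustration, not an existing kernel function.

```c
#include <stdint.h>

/* Hypothetical helper: clear every bit above an unsigned char (1 byte)
 * or unsigned short (2 bytes) argument held in a 64-bit register. */
uint64_t zext_narrow_arg(uint64_t raw, unsigned size_bytes)
{
    switch (size_bytes) {
    case 1:
        return raw & 0xffu;
    case 2:
        return raw & 0xffffu;
    default:
        /* Wider arguments are left untouched in this sketch. */
        return raw;
    }
}
```

For example, an unsigned char 0x80 that was sign extended to 0xFFFFFF80 by the caller comes out as 0x80 after zext_narrow_arg(reg, 1).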
* Function Return ABI (was Re: Argument passing and ABI: GCC vs clang)
2026-02-09 5:11 ` Vineet Gupta
2026-02-11 3:14 ` Alexei Starovoitov
@ 2026-02-19 23:11 ` Vineet Gupta
2026-02-20 2:41 ` Alexei Starovoitov
1 sibling, 1 reply; 17+ messages in thread
From: Vineet Gupta @ 2026-02-19 23:11 UTC (permalink / raw)
To: Jose E. Marchesi, bpf; +Cc: ast, yonghong.song, Andrew Pinski
On 2/8/26 9:11 PM, Vineet Gupta wrote:
> On 2/6/26 03:20, Jose E. Marchesi wrote:
>> Hello people!
>>
>> Yesterday evening Vineet pointed out in IRC some differences between the
>> way GCC handles argument passing compared to clang/LLVM.
> For illustration, here's what a simple test looks like across clang, gcc and x86
> and aarch64
> https://godbolt.org/z/oerG3eT6f
>
> typedef struct {
> int a;
> short b;
> signed char c;
> } my_t;
>
> void foo(signed char, short, int);
>
> char args_setup(my_t *s)
> {
> foo(s->c, s->b, s->a);
> }
>
> int args_consume (signed char a, short b, int c)
> {
> int x = a;
> int y = b;
> int z = c;
>
> return x + y + z;
> }
>
> clang: -O2 -mcpu=v4 | gcc: -O2 -mcpu=v4
> |
> args_setup: | args_setup:
> r2 = *(s16 *)(r1 + 4) | r3 = *(u32 *) (r1+0)
> w3 = *(u32 *)(r1 + 0) | r2 = *(u16 *) (r1+4)
> r1 = *(s8 *)(r1 + 6) | r1 = *(u8 *) (r1+6)
> call foo | call foo
> exit | exit
> |
> args_consume: | args_consume:
> w0 = w2 | r2 = (s16) r2
> w0 += w1 | r0 = r3
> w0 += w3 | r1 = (s8) r1
> exit | w1 += w2
> | w0 += w1
> | exit
>
>
> So clang is narrowing the args at the caller site while gcc is doing this in the callee.
> This is not consistent and needs to be fixed in one of the compilers.
> Doing this in the callee seems like a better/safer approach, as it doesn't assume the caller
> always does the right thing, especially when mixing bpf user code with
> kernel etc.
>
> BTW where is the w<N> assembler notation and semantics (32-bit, zero extension
> etc) documented? I don't see anything relevant in [1].
> Presumably it is just a compiler notation so [1] might not be the ideal place.
>
> [1] https://docs.kernel.org/bpf/standardization/instruction-set.html
So the function arguments issue is captured as PR/124171 and I have a
tentative fix for that, but...
There's a second half of the problem, which is promotion of narrow function
return values.
It seems the existing clang ABI is not symmetrical. The function
arguments promotion is handled in callee, but function return is handled
in caller [1] which is a bit atypical IMO and would need to be handled
in gcc as well.
Since we are on the topic of ABI change, I wanted to surface that as
well before we go off and tackle that.
Simple example
_Bool bar_bool(void);
int ret_caller(void) {
if (bar_bool() != 1) return 0; else return 1;
}
char ret_callee(my_t *s)
{
return s->c; // c is a char in a struct
}
ret_caller:
call bar_bool
r0 &= 0xff <-- ret promotion in caller
exit
ret_callee:
r0 = *(u8 *) (r1+6) <-- no ret promotion in callee
exit
Thx,
-Vineet
[1] https://reviews.llvm.org/D131598
* Re: Function Return ABI (was Re: Argument passing and ABI: GCC vs clang)
2026-02-19 23:11 ` Function Return ABI (was Re: Argument passing and ABI: GCC vs clang) Vineet Gupta
@ 2026-02-20 2:41 ` Alexei Starovoitov
2026-02-20 3:38 ` Vineet Gupta
0 siblings, 1 reply; 17+ messages in thread
From: Alexei Starovoitov @ 2026-02-20 2:41 UTC (permalink / raw)
To: Vineet Gupta, bpf
Cc: Jose E. Marchesi, bpf, Alexei Starovoitov, Yonghong Song, Andrew Pinski
On Thu, Feb 19, 2026 at 3:11 PM Vineet Gupta <vineet.gupta@linux.dev> wrote:
>
> So the function arguments issue is captured as PR/124171 and I have a
> tentative fix for that, but...
> There's second half of the problem which is promotion of narrow function
> return values.
>
> It seems the existing clang ABI is not symmetrical. The function
> arguments promotion is handled in callee, but function return is handled
> in caller [1] which is a bit atypical IMO and would need to be handled
> in gcc as well.
hmm. I thought we already concluded that for clang-x86 and clang-bpf
argument extension is handled in *caller*.
The same thing for returns. It's a caller responsibility.
gcc-bpf doing args in the callee only is broken and has to be fixed.
> Since we are on the topic of ABI change, I wanted to surface that as
> well before we go off and tackle that.
>
> Simple example
>
> _Bool bar_bool(void);
>
> int ret_caller(void) {
> if (bar_bool() != 1) return 0; else return 1;
> }
>
> char ret_callee(my_t *s)
> {
> return s->c; // c is a char in a struct
> }
>
> ret_caller:
> call bar_bool
> r0 &= 0xff <-- ret promotion in caller
> exit
>
> ret_callee:
> r0 = *(u8 *) (r1+6) <-- no ret promotion in callee
> exit
Exactly as it should be because x86 will populate 8-bit sub
registers in the callee and bpf side has to do r0 &= 0xff
in the caller.
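A small C sketch of that return-value contract, assuming the behaviour described above: the callee writes only the low 8 bits of the return register and the caller masks before using the value. Both function names are hypothetical, and the "stale" parameter models whatever was left in the upper bits of the register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Callee model: only the low 8 bits of the return register are written;
 * bits 63..8 keep whatever stale contents they had. */
uint64_t callee_returns_bool(uint64_t stale, bool value)
{
    return (stale & ~UINT64_C(0xff)) | (value ? 1u : 0u);
}

/* Caller model: mask first (the "r0 &= 0xff" step), then test. */
bool caller_reads_bool(uint64_t r0)
{
    r0 &= 0xff;
    return r0 != 0;
}
```

Without the mask, a false return value sitting under stale upper bits (for instance 0xDEADBEEF00000000) would compare as non-zero in a full 64-bit test.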
* Re: Function Return ABI (was Re: Argument passing and ABI: GCC vs clang)
2026-02-20 2:41 ` Alexei Starovoitov
@ 2026-02-20 3:38 ` Vineet Gupta
0 siblings, 0 replies; 17+ messages in thread
From: Vineet Gupta @ 2026-02-20 3:38 UTC (permalink / raw)
To: Alexei Starovoitov, bpf
Cc: Jose E. Marchesi, bpf, Alexei Starovoitov, Yonghong Song, Andrew Pinski
On 2/19/26 6:41 PM, Alexei Starovoitov wrote:
>> It seems the existing clang ABI is not symmetrical. The function
>> arguments promotion is handled in callee, but function return is handled
>> in caller [1] which is a bit atypical IMO and would need to be handled
>> in gcc as well.
Ignore this, as it is clearly wrong (my jetlag and waking up at 3 am are
to blame :-)
With PR/124171, gcc-bpf will promote args in the caller, same as llvm (not
in the callee).
> hmm. I thought we already concluded that for clang-x86 and clang-bpf
> argument extension is handled in *caller*.
> The same thing for returns. It's a caller responsibility.
> gcc-bpf doing args in the callee only is broken and has to be fixed.
Correct, agreed.
end of thread, other threads:[~2026-02-20 3:38 UTC | newest]
Thread overview: 17+ messages
2026-02-06 11:20 Argument passing and ABI: GCC vs clang Jose E. Marchesi
2026-02-09 5:11 ` Vineet Gupta
2026-02-11 3:14 ` Alexei Starovoitov
2026-02-11 3:27 ` Andrew Pinski
2026-02-11 17:28 ` Alexei Starovoitov
2026-02-11 17:32 ` Andrew Pinski
2026-02-11 20:34 ` Jose E. Marchesi
2026-02-12 1:33 ` Alexei Starovoitov
2026-02-12 2:46 ` Andrew Pinski
2026-02-12 3:01 ` Alexei Starovoitov
2026-02-12 3:42 ` Andrew Pinski
2026-02-12 3:46 ` Andrew Pinski
2026-02-12 4:37 ` Andrew Pinski
2026-02-12 19:47 ` Alexei Starovoitov
2026-02-19 23:11 ` Function Return ABI (was Re: Argument passing and ABI: GCC vs clang) Vineet Gupta
2026-02-20 2:41 ` Alexei Starovoitov
2026-02-20 3:38 ` Vineet Gupta