public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible
@ 2023-04-04 10:12 Wilco Dijkstra
  0 siblings, 0 replies; 11+ messages in thread
From: Wilco Dijkstra @ 2023-04-04 10:12 UTC (permalink / raw)
  To: caiyinyu; +Cc: 'GNU C Library'

Hi,

> So I believe that the assembly implementation of syscalls is still 
> necessary, especially for users who are using GCC<=13.

As far as I can tell, all internal syscalls are inlined so they never use
syscall() and thus don't have this issue. Does that not happen on
LoongArch? If not, you might need something to enable the inlining.

It's not clear to me how often syscall() is used by user code instead of
direct calls to read() etc. So is it really necessary to consider optimizing
syscall()? Are there microbenchmarks that use it?

Also does sysdeps/unix/sysv/linux/riscv/syscall.c work for you?

So I think we can avoid unnecessary assembly code.

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible
  2023-04-05 13:19 Wilco Dijkstra
  2023-04-05 14:04 ` Adhemerval Zanella Netto
@ 2023-04-07  7:31 ` caiyinyu
  1 sibling, 0 replies; 11+ messages in thread
From: caiyinyu @ 2023-04-07  7:31 UTC (permalink / raw)
  To: Wilco Dijkstra, Xi Ruoyao; +Cc: 'GNU C Library'


在 2023/4/5 下午9:19, Wilco Dijkstra 写道:
> Hi,
>
>> Maybe we can use a custom C implementation (like RISC-V) as well.  But
>> strictly speaking the RISC-V syscall.c is invoking undefined behavior
>> (like my proposal) so I agree with assembly here.
> How does assembly code not invoke the exact same undefined behaviour?
> In both cases you rely on variable arguments being passed in the same
> way as integer args - this is true in many ABIs. The goal is to avoid introducing
> unnecessary assembly and use generic implementations where possible.
>
> So if that C version works as well as assembler on LoongArch and assuming
> that bare syscall() performance is essential, then why not use it?

Thanks.

The patch is dropped and we will go with generic code.




>
> Cheers,
> Wilco


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible
  2023-04-05 13:19 Wilco Dijkstra
@ 2023-04-05 14:04 ` Adhemerval Zanella Netto
  2023-04-07  7:31 ` caiyinyu
  1 sibling, 0 replies; 11+ messages in thread
From: Adhemerval Zanella Netto @ 2023-04-05 14:04 UTC (permalink / raw)
  To: libc-alpha



On 05/04/23 10:19, Wilco Dijkstra via Libc-alpha wrote:
> Hi,
> 
>> Maybe we can use a custom C implementation (like RISC-V) as well.  But
>> strictly speaking the RISC-V syscall.c is invoking undefined behavior
>> (like my proposal) so I agree with assembly here.
> 
> How does assembly code not invoke the exact same undefined behaviour?
> In both cases you rely on variable arguments being passed in the same
> way as integer args - this is true in many ABIs. The goal is to avoid introducing
> unnecessary assembly and use generic implementations where possible.
> 
> So if that C version works as well as assembler on LoongArch and assuming
> that bare syscall() performance is essential, then why not use it?

After this discussion, it seems that RISCV implementation is essentially
broken and it would be better to either provide a arch-specific assembly
implementation and/or use the generic C one and fix the code generation
issue.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible
@ 2023-04-05 13:19 Wilco Dijkstra
  2023-04-05 14:04 ` Adhemerval Zanella Netto
  2023-04-07  7:31 ` caiyinyu
  0 siblings, 2 replies; 11+ messages in thread
From: Wilco Dijkstra @ 2023-04-05 13:19 UTC (permalink / raw)
  To: Xi Ruoyao; +Cc: 'GNU C Library', caiyinyu

Hi,

> Maybe we can use a custom C implementation (like RISC-V) as well.  But
> strictly speaking the RISC-V syscall.c is invoking undefined behavior
> (like my proposal) so I agree with assembly here.

How does assembly code not invoke the exact same undefined behaviour?
In both cases you rely on variable arguments being passed in the same
way as integer args - this is true in many ABIs. The goal is to avoid introducing
unnecessary assembly and use generic implementations where possible.

So if that C version works as well as assembler on LoongArch and assuming
that bare syscall() performance is essential, then why not use it?

Cheers,
Wilco

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible
  2023-04-04  1:25     ` caiyinyu
@ 2023-04-04 12:12       ` Xi Ruoyao
  0 siblings, 0 replies; 11+ messages in thread
From: Xi Ruoyao @ 2023-04-04 12:12 UTC (permalink / raw)
  To: caiyinyu, Carlos O'Donell, libc-alpha
  Cc: Wang Xuerui, Adhemerval Zanella Netto, Andreas Schwab, Florian Weimer

On Tue, 2023-04-04 at 09:25 +0800, caiyinyu wrote:
> 
> 在 2023/3/27 下午10:45, Xi Ruoyao 写道:
> > On Mon, 2023-03-27 at 10:04 -0400, Carlos O'Donell wrote:
> > 
> > > In summary, I think this is a compiler problem
> > Definitely true.
> > 
> > > and that working around this in glibc
> > > is going to result in:
> > > 
> > > - Odd corner case ABI issues between public declarations of variadic
> > > functions and
> > >    internal non-variadic definitions.
> > > 
> > > - Poorer testing of #else code that uses variadic arguments, as the
> > > public interface
> > >    requires.
> > > 
> > > I don't support going in this direction.
> > Valid reasons.  Abandon this series then.
> > 
> > But I hope these could be raised earlier (in the discussion about
> > LoongArch syscall.S) so I wouldn't write all the code :).
> > 
> > > Is there an alternative that could generate better code that doesn't
> > > go this way?
> > For LoongArch I can improve GCC to save only the GARs containing the
> > arguments really used in va_arg (i.e. one GAR for things like open() or
> > fcntl() instead of all 8 GARs), but I guess the patch will be delayed
> > into GCC 14.
> > 
> > Generally I've not got an idea about how to make GCC avoid saving GARs
> > unnecessarily with va_arg.
> 
> So I believe that the assembly implementation of syscalls is still 
> necessary, especially for users who are using GCC<=13.

Maybe we can use a custom C implementation (like RISC-V) as well.  But
strictly speaking the RISC-V syscall.c is invoking undefined behavior
(like my proposal) so I agree with assembly here.

> patch:
> 
> https://sourceware.org/pipermail/libc-alpha/2023-March/146588.html
> 

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible
  2023-03-27 14:45   ` Xi Ruoyao
  2023-03-27 14:51     ` Xi Ruoyao
@ 2023-04-04  1:25     ` caiyinyu
  2023-04-04 12:12       ` Xi Ruoyao
  1 sibling, 1 reply; 11+ messages in thread
From: caiyinyu @ 2023-04-04  1:25 UTC (permalink / raw)
  To: Xi Ruoyao, Carlos O'Donell, libc-alpha
  Cc: Wang Xuerui, Adhemerval Zanella Netto, Andreas Schwab, Florian Weimer


在 2023/3/27 下午10:45, Xi Ruoyao 写道:
> On Mon, 2023-03-27 at 10:04 -0400, Carlos O'Donell wrote:
>
>> In summary, I think this is a compiler problem
> Definitely true.
>
>> and that working around this in glibc
>> is going to result in:
>>
>> - Odd corner case ABI issues between public declarations of variadic
>> functions and
>>    internal non-variadic definitions.
>>
>> - Poorer testing of #else code that uses variadic arguments, as the
>> public interface
>>    requires.
>>
>> I don't support going in this direction.
> Valid reasons.  Abandon this series then.
>
> But I hope these could be raised earlier (in the discussion about
> LoongArch syscall.S) so I wouldn't write all the code :).
>
>> Is there an alternative that could generate better code that doesn't
>> go this way?
> For LoongArch I can improve GCC to save only the GARs containing the
> arguments really used in va_arg (i.e. one GAR for things like open() or
> fcntl() instead of all 8 GARs), but I guess the patch will be delayed
> into GCC 14.
>
> Generally I've not got an idea about how to make GCC avoid saving GARs
> unnecessarily with va_arg.

So I believe that the assembly implementation of syscalls is still 
necessary, especially for users who are using GCC<=13.

patch:

https://sourceware.org/pipermail/libc-alpha/2023-March/146588.html

>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible
  2023-03-27 14:45   ` Xi Ruoyao
@ 2023-03-27 14:51     ` Xi Ruoyao
  2023-04-04  1:25     ` caiyinyu
  1 sibling, 0 replies; 11+ messages in thread
From: Xi Ruoyao @ 2023-03-27 14:51 UTC (permalink / raw)
  To: Carlos O'Donell, libc-alpha
  Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Andreas Schwab,
	Florian Weimer

Sorry, mail duplicated because of some network issue.

On Mon, 2023-03-27 at 22:45 +0800, Xi Ruoyao wrote:
> On Mon, 2023-03-27 at 10:04 -0400, Carlos O'Donell wrote:
> 
> > In summary, I think this is a compiler problem
> 
> Definitely true.

/* snip */

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible
  2023-03-27 14:04 ` Carlos O'Donell
  2023-03-27 14:44   ` Xi Ruoyao
@ 2023-03-27 14:45   ` Xi Ruoyao
  2023-03-27 14:51     ` Xi Ruoyao
  2023-04-04  1:25     ` caiyinyu
  1 sibling, 2 replies; 11+ messages in thread
From: Xi Ruoyao @ 2023-03-27 14:45 UTC (permalink / raw)
  To: Carlos O'Donell, libc-alpha
  Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Andreas Schwab,
	Florian Weimer

On Mon, 2023-03-27 at 10:04 -0400, Carlos O'Donell wrote:

> In summary, I think this is a compiler problem

Definitely true.

> and that working around this in glibc
> is going to result in:
> 
> - Odd corner case ABI issues between public declarations of variadic
> functions and
>   internal non-variadic definitions.
> 
> - Poorer testing of #else code that uses variadic arguments, as the
> public interface
>   requires.
> 
> I don't support going in this direction.

Valid reasons.  Abandon this series then.

But I hope these could be raised earlier (in the discussion about
LoongArch syscall.S) so I wouldn't write all the code :).

> Is there an alternative that could generate better code that doesn't
> go this way?

For LoongArch I can improve GCC to save only the GARs containing the
arguments really used in va_arg (i.e. one GAR for things like open() or
fcntl() instead of all 8 GARs), but I guess the patch will be delayed
into GCC 14.

Generally I've not got an idea about how to make GCC avoid saving GARs
unnecessarily with va_arg.

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible
  2023-03-27 14:04 ` Carlos O'Donell
@ 2023-03-27 14:44   ` Xi Ruoyao
  2023-03-27 14:45   ` Xi Ruoyao
  1 sibling, 0 replies; 11+ messages in thread
From: Xi Ruoyao @ 2023-03-27 14:44 UTC (permalink / raw)
  To: Carlos O'Donell, libc-alpha
  Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Andreas Schwab,
	Florian Weimer

On Mon, 2023-03-27 at 10:04 -0400, Carlos O'Donell wrote:

> In summary, I think this is a compiler problem

Definitely true.

> and that working around this in glibc
> is going to result in:
> 
> - Odd corner case ABI issues between public declarations of variadic functions and
>   internal non-variadic definitions.
> 
> - Poorer testing of #else code that uses variadic arguments, as the public interface
>   requires.
> 
> I don't support going in this direction.

Valid reasons.  Abandon this series then.

But I hope these could be raised earlier (in the discussion about
LoongArch syscall.S) so I wouldn't write all the code :).

> Is there an alternative that could generate better code that doesn't go this way?

For LoongArch I can improve GCC to save only the GARs containing the
arguments really used in va_arg (i.e. one GAR for things like open() or
fcntl() instead of all 8 GARs), but I guess the patch will be delayed
into GCC 14.

Generally I've not got an idea about how to make GCC avoid saving GARs
unnecessarily with va_arg.

-- 
Xi Ruoyao <xry111@xry111.site>
School of Aerospace Science and Technology, Xidian University

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible
  2023-03-25 14:08 Xi Ruoyao
@ 2023-03-27 14:04 ` Carlos O'Donell
  2023-03-27 14:44   ` Xi Ruoyao
  2023-03-27 14:45   ` Xi Ruoyao
  0 siblings, 2 replies; 11+ messages in thread
From: Carlos O'Donell @ 2023-03-27 14:04 UTC (permalink / raw)
  To: Xi Ruoyao, libc-alpha
  Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Andreas Schwab,
	Florian Weimer

On 3/25/23 10:08, Xi Ruoyao via Libc-alpha wrote:
> Currently GCC generates highly sub-optimal code on architectures where
> the calling convention prefers registers for arugment passing.  This is
> GCC PR100955.  While it's technically a missed-optimization in GCC, it
> seems not trivial to fix (I've not seen any compiler which can optimize
> this properly yet).

I'm glad to see we have a gcc PR open for this.

We should be working to improve the compiler instead of working around the issue in
glibc.

> As the generic Linux syscall wrappers often uses a fixed number of
> arguments, and "overreading" the variable arguments is usually safe
> (fcntl etc. is already doing this), we can pretend these wrappers were
> declared with named arguments and make the compiler do right thing.



> Add a macro __ASSUME_SYSCALL_NAMED_WORKS to control this, which should
> be defined if we can safely replace "..." with several named arguments.
> Use an internal function prototype with named arguments if it's defined,
> for fcntl64, fcntl_nocancel, ioctl, mremap, open64, open64_nocancel,
> openat64, openat64_nocancel, prctl, ptrace, and generic syscall()
> wrapper.  I've not changed open* without "64" because I don't have a
> test platform.  shm_ctl is also not changed because it contains
> aggregate variable arugment which is more tricky than integers or
> pointers.


 
> Define this macro for LoongArch, x86-64, and AArch64.  This should be
> also suitable for some other architectures (I think it will be fine on
> RISC-V) but again I don't have a test platform.
> 
> This is the first time I make such a large change in Glibc so it's
> likely I've done something wrong.  Please correct me :).

There are few things that I see are wrong, and I'll list them out here at the top-level:

(1) Error prone macro usage.

#ifdef foo
/* Some Stuff.  */
#else
/* Other stuff. */
#endif

(a) Always define foo, either 0 or 1.
(b) Always use #if foo / #else

(2) Public declarations must match internal declarations.

The biggest problem I see here is that the public declaration of the functions
you are changing are all variadic, and so my strong opinion is that we should not
change the internal definition of these functions to be non-variadic.

I expect there are going to be knock-on effects with developer tooling that
expects the implementation not to over-read e.g. valgrind looking at reading
of registers that have undefined values.

---

In summary, I think this is a compiler problem, and that working around this in glibc
is going to result in:

- Odd corner case ABI issues between public declarations of variadic functions and
  internal non-variadic definitions.

- Poorer testing of #else code that uses variadic arguments, as the public interface
  requires.

I don't support going in this direction.

Is there an alternative that could generate better code that doesn't go this way?

> Xi Ruoyao (5):
>   linux: Add __ASSUME_SYSCALL_NAMED_WORKS to allow avoiding va_list for
>     generic syscall
>   linux: [__ASSUME_SYSCALL_NAMED_WORKS] Avoid using va_list for various
>     syscall wrappers
>   LoongArch: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux
>   x86_64: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux
>   aarch64: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux
> 
>  include/fcntl.h                               | 32 ++++++++-----
>  .../unix/sysv/linux/aarch64/kernel-features.h |  9 ++++
>  sysdeps/unix/sysv/linux/fcntl64.c             | 40 +++++++++++++----
>  sysdeps/unix/sysv/linux/fcntl_nocancel.c      | 30 ++++++++++---
>  sysdeps/unix/sysv/linux/ioctl.c               | 38 ++++++++++++----
>  .../sysv/linux/loongarch/kernel-features.h    | 29 ++++++++++++
>  sysdeps/unix/sysv/linux/mremap.c              | 36 +++++++++++++--
>  sysdeps/unix/sysv/linux/not-cancel.h          |  8 ++--
>  sysdeps/unix/sysv/linux/open64.c              | 35 ++++++++++++---
>  sysdeps/unix/sysv/linux/open64_nocancel.c     | 29 ++++++++++--
>  sysdeps/unix/sysv/linux/openat64.c            | 30 +++++++++++--
>  sysdeps/unix/sysv/linux/openat64_nocancel.c   | 29 ++++++++++--
>  sysdeps/unix/sysv/linux/prctl.c               | 23 +++++++++-
>  sysdeps/unix/sysv/linux/ptrace.c              | 45 ++++++++++++++-----
>  sysdeps/unix/sysv/linux/syscall.c             | 35 +++++++++++----
>  .../unix/sysv/linux/x86_64/kernel-features.h  |  9 ++++
>  16 files changed, 382 insertions(+), 75 deletions(-)
>  create mode 100644 sysdeps/unix/sysv/linux/loongarch/kernel-features.h
> 

-- 
Cheers,
Carlos.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible
@ 2023-03-25 14:08 Xi Ruoyao
  2023-03-27 14:04 ` Carlos O'Donell
  0 siblings, 1 reply; 11+ messages in thread
From: Xi Ruoyao @ 2023-03-25 14:08 UTC (permalink / raw)
  To: libc-alpha
  Cc: caiyinyu, Wang Xuerui, Adhemerval Zanella Netto, Andreas Schwab,
	Florian Weimer, Xi Ruoyao

Currently GCC generates highly sub-optimal code on architectures where
the calling convention prefers registers for arugment passing.  This is
GCC PR100955.  While it's technically a missed-optimization in GCC, it
seems not trivial to fix (I've not seen any compiler which can optimize
this properly yet).

As the generic Linux syscall wrappers often uses a fixed number of
arguments, and "overreading" the variable arguments is usually safe
(fcntl etc. is already doing this), we can pretend these wrappers were
declared with named arguments and make the compiler do right thing.

Add a macro __ASSUME_SYSCALL_NAMED_WORKS to control this, which should
be defined if we can safely replace "..." with several named arguments.
Use an internal function prototype with named arguments if it's defined,
for fcntl64, fcntl_nocancel, ioctl, mremap, open64, open64_nocancel,
openat64, openat64_nocancel, prctl, ptrace, and generic syscall()
wrapper.  I've not changed open* without "64" because I don't have a
test platform.  shm_ctl is also not changed because it contains
aggregate variable arugment which is more tricky than integers or
pointers.

Define this macro for LoongArch, x86-64, and AArch64.  This should be
also suitable for some other architectures (I think it will be fine on
RISC-V) but again I don't have a test platform.

This is the first time I make such a large change in Glibc so it's
likely I've done something wrong.  Please correct me :).

Xi Ruoyao (5):
  linux: Add __ASSUME_SYSCALL_NAMED_WORKS to allow avoiding va_list for
    generic syscall
  linux: [__ASSUME_SYSCALL_NAMED_WORKS] Avoid using va_list for various
    syscall wrappers
  LoongArch: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux
  x86_64: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux
  aarch64: Define __ASSUME_SYSCALL_NAMED_WORKS for Linux

 include/fcntl.h                               | 32 ++++++++-----
 .../unix/sysv/linux/aarch64/kernel-features.h |  9 ++++
 sysdeps/unix/sysv/linux/fcntl64.c             | 40 +++++++++++++----
 sysdeps/unix/sysv/linux/fcntl_nocancel.c      | 30 ++++++++++---
 sysdeps/unix/sysv/linux/ioctl.c               | 38 ++++++++++++----
 .../sysv/linux/loongarch/kernel-features.h    | 29 ++++++++++++
 sysdeps/unix/sysv/linux/mremap.c              | 36 +++++++++++++--
 sysdeps/unix/sysv/linux/not-cancel.h          |  8 ++--
 sysdeps/unix/sysv/linux/open64.c              | 35 ++++++++++++---
 sysdeps/unix/sysv/linux/open64_nocancel.c     | 29 ++++++++++--
 sysdeps/unix/sysv/linux/openat64.c            | 30 +++++++++++--
 sysdeps/unix/sysv/linux/openat64_nocancel.c   | 29 ++++++++++--
 sysdeps/unix/sysv/linux/prctl.c               | 23 +++++++++-
 sysdeps/unix/sysv/linux/ptrace.c              | 45 ++++++++++++++-----
 sysdeps/unix/sysv/linux/syscall.c             | 35 +++++++++++----
 .../unix/sysv/linux/x86_64/kernel-features.h  |  9 ++++
 16 files changed, 382 insertions(+), 75 deletions(-)
 create mode 100644 sysdeps/unix/sysv/linux/loongarch/kernel-features.h

-- 
2.40.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-04-07  7:31 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-04 10:12 [PATCH v2 0/5] linux: Avoid va_list for generic syscall wrappers if possible Wilco Dijkstra
  -- strict thread matches above, loose matches on Subject: below --
2023-04-05 13:19 Wilco Dijkstra
2023-04-05 14:04 ` Adhemerval Zanella Netto
2023-04-07  7:31 ` caiyinyu
2023-03-25 14:08 Xi Ruoyao
2023-03-27 14:04 ` Carlos O'Donell
2023-03-27 14:44   ` Xi Ruoyao
2023-03-27 14:45   ` Xi Ruoyao
2023-03-27 14:51     ` Xi Ruoyao
2023-04-04  1:25     ` caiyinyu
2023-04-04 12:12       ` Xi Ruoyao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).