public inbox for gcc@gcc.gnu.org
From: Matheus Afonso Martins Moreira <matheus.a.m.moreira@gmail.com>
To: gcc@gcc.gnu.org
Subject: [RFC] Linux system call builtins
Date: Mon, 08 Apr 2024 06:19:14 -0300
Message-ID: <2d2f1e405361d2b36dd513e3fabd1fe0@gmail.com>

Hello! I'm a beginner when it comes to GCC development.
I want to learn how it works and start contributing.
I decided to start by implementing something relatively simple
but which would still be very useful for me: Linux builtins.
I sought help in the OFTC IRC channel and it was suggested
that I discuss it here first and obtain consensus before
spending more time on it since it might not be acceptable.

I'd like to add GCC builtins for generating Linux system call
code for all architectures supported by Linux.

They would look like this:

    __builtin_linux_system_call(long n, ...)
    __builtin_linux_system_call_1(long n, long _1)
    __builtin_linux_system_call_2(long n, long _1, long _2)
    /* More definitions, all the way up to 6 arguments */

Calling these builtins will make GCC place all the parameters
in the correct registers for the system call, emit the appropriate
instruction for the target architecture and return the result.
In other words, they would implement the calling convention[1] of
the Linux system calls.
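
To make the idea concrete, here is a rough sketch of what a three
argument builtin might expand to on x86-64, written by hand with
inline assembly. The names linux_system_call_3 and write_bytes are
hypothetical stand-ins of mine, not part of any existing API. The
register assignment follows the syscall(2) man page[1]. Other
architectures would need different registers and instructions.

```c
#include <sys/syscall.h>   /* SYS_* system call numbers */

#if !defined(__linux__) || !defined(__x86_64__)
#error "this sketch only covers x86-64 Linux"
#endif

/* Hypothetical stand-in for the proposed three argument builtin. */
static long linux_system_call_3(long n, long _1, long _2, long _3)
{
    long result;

    __asm__ volatile ("syscall"
                      : "=a" (result)   /* rax: return value */
                      : "a" (n),        /* rax: system call number */
                        "D" (_1),       /* rdi: first argument */
                        "S" (_2),       /* rsi: second argument */
                        "d" (_3)        /* rdx: third argument */
                      : "rcx", "r11",   /* clobbered by the syscall instruction */
                        "memory");      /* the kernel may read or write memory */
    return result;
}

/* Usage: write a buffer to a file descriptor without going through libc. */
static long write_bytes(int fd, const char *buffer, long count)
{
    return linux_system_call_3(SYS_write, fd, (long) buffer, count);
}
```

Note that the raw interface reports failure as a negative errno value
in the result instead of setting errno like the libc wrappers do.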

I'm often asked why anyone should care about this system call stuff,
and I've been asked why I want this added to GCC in particular.
My rationale is as follows:

  + It's stable

        This is one of the things which makes Linux unique
        in the operating system landscape: applications
        can target the kernel directly. Unlike in virtually
        every other operating system out there, the Linux kernel
        to user space binary interface is documented[2] as stable.
        Breaking it is considered a regression in the kernel.
        Therefore it makes sense for a compiler to target it.
        The same is not true for any other operating system.

  + It's a calling convention

        GCC already supports many calling conventions
        via function attributes. On x86 alone[3] there's
        cdecl, fastcall, thiscall, stdcall, ms_abi, sysv_abi
        and Win32 specific hot patching hooks. So I believe this
        would not at all be a strange addition to the compiler.

  + It's becoming common

        Despite being specific to the Linux kernel,
        support for it is showing up in other systems.
        FreeBSD implements limited support[4] for Linux ABIs.
        Windows Subsystem for Linux started out[5] similarly,
        as an implementation of this system call ABI.
        Apparently it's becoming something of a lingua franca.
        Maybe one day Linux programs will actually become
        portable by virtue of this stable binary interface.

  + It doesn't make sense for libraries to support it

        There are libraries out there that provide
        system call functionality. The various libcs do.
        However they usually don't support the full set
        of Linux system calls. Using certain system calls
        could invalidate global state in these libraries
        which leads to them not being supported. The clone
        system call is the quintessential example. So I think libraries
        are not the proper place for this functionality.

  + It allows freestanding software to easily target Linux

        Freestanding code usually refers to bare metal
        targets but Linux is also a viable target.
        This will make it much easier for developers
        to create freestanding, nolibc, dependency-free
        software targeting Linux without having to
        write any assembly code at all, making GCC
        ever more useful.

  + It centralizes functionality in the compiler

        Currently every programmer who wants to use
        these system calls must rely on libraries
        with incomplete support or recreate the
        system call machinery via inline assembly.
        Even the Linux kernel ended up doing it[6].
        It would be so much nicer if the compiler
        simply had support for it. I'm a huge fan
        of builtins like __builtin_frame_address,
        they make it very easy to solve difficult
        problems which would otherwise require tons
        of target specific assembly code. Getting
        the compiler to do that for Linux system
        calls is what this proposal is for.

  + It allows other languages to easily target Linux

        GCC is a compiler collection and has support
        for numerous languages. These builtins should
        allow all of them to target Linux directly
        in one fell swoop.

  + Compilers seem like the proper place for it

        The compiler knows everything about registers
        and instructions and calling conventions.
        It just seems like the right place for it.
        A just in time compiler could also generate
        this code instead of calling native functions.
        I really have no idea why they don't do that.
        Maybe this will prove that it's viable.

Implementation-wise, I have managed to define the above builtins
in my GCC branch and compile it successfully. I have not yet
figured out how or even where to implement the code generation.
I was hoping to show up here with patches ready for review
but it really is a complex project. That's why I would like
to see what the community thinks before proceeding.

A related proposal: hard register operand constraints[7]
for inline assembly code. Essentially, allowing the programmer
to specify the exact registers that must be used in the inline
assembly expression itself. This gets rid of numerous temporary
variables whose only purpose is to get GCC to put them in the
correct registers, as many as 7 local variables for system calls.

I've been told that implementing it would make this proposal
redundant. There is no doubt that this would make code much
simpler, easier to write and understand. It would be a valuable
enhancement to the compiler and I would certainly use it.
However, even with better inline assembly, I still believe
there's value in a simple system call builtin function.
The API is much nicer if nothing else.

Thanks for your attention,
  Matheus

[1]: https://www.man7.org/linux/man-pages/man2/syscall.2.html
[2]: https://www.kernel.org/doc/html/latest/admin-guide/abi-stable.html#the-kernel-syscall-interface
[3]: https://gcc.gnu.org/onlinedocs/gcc/x86-Function-Attributes.html
[4]: https://man.freebsd.org/cgi/man.cgi?linux
[5]: https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux
[6]: https://lwn.net/Articles/920158/
[7]: https://gcc.gnu.org/pipermail/gcc/2021-June/236269.html
