From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mateusz Guzik <mjguzik@gmail.com>,
Andreas Schwab <schwab@suse.de>, Rich Felker <dalias@libc.org>,
Mateusz Guzik via Libc-alpha <libc-alpha@sourceware.org>
Subject: Re: fstat(2) penalized by using newfstatat(6, "", buf, AT_EMPTY_PATH)
Date: Tue, 5 Sep 2023 16:16:58 -0300 [thread overview]
Message-ID: <b22fcbd5-eb69-7755-c76a-01706006b3cd@linaro.org> (raw)
In-Reply-To: <387426d5-17c8-cf3e-83a1-88b55c9d22a1@linaro.org>
On 05/09/23 15:22, Adhemerval Zanella Netto wrote:
>
>
> On 05/09/23 14:45, Linus Torvalds wrote:
>> On Tue, 5 Sept 2023 at 10:28, Adhemerval Zanella Netto
>> <adhemerval.zanella@linaro.org> wrote:
>>>
>>> I think we can still make it in a generic way, stat family would use more
>>> syscall (which some filters might complain) but it should be ok:
>>
>> Ugh. That patch of yours certainly seems to avoid the kernel overhead
>> (where that "fetch a byte from user space just to check it is '\0'" is
>> much more expensive than it is in user space), but why does glibc do
>> that whole "turn fstat() into fstatat(), and then turn it back again"?
>>
>> It seems just stupid. If the user asked for 'fstat()', just give the
>> user 'fstat()'.
>>
>> Your patch literally adds *complexity*, rather than take it away.
>>
>> Is this all just because glibc did its own 'struct stat'
>> implementation way back when, and you want to have just one place
>> where the stat conversion is done?
>>
>> Can't you just special case *that* case instead? It's long been
>> irrelevant. Afaik, on 32-bit x86, you just want to use the 'stat64'
>> system call family, and on 64-bit x86 you just use the regular 'stat'
>> family (yes, called 'newfstatat' for that one system call for internal
>> historial reasons), and there is no conversion needed.
>>
>> IOW, all those wrappers just for 'struct stat' conversion are pure
>> overhead and silly garbage. Why do they exist? And if there is still
>> some other reason for them to exist, can't you make that be a
>> 'sysdeps' file of its own, and not penalize all the sane cases with
>> 'copy stat structures around'?
>
> The old glibc stat code was a complete mess that was added prior symbol
> versioning, with multiple implementations, along with a static linkage
> that defines the current version (there is no need to describe it
> further). The consolidation also provide a much simpler way to support
> y2038.
>
> On x86_64 and some other 64 bits ABIs where glibc exported struct stat
> is the same as the kernel, there is no conversion whatsoever. It is
> only required on some ABI where due historical mistakes the exported
> user struct does not match kernel (sparc64, mips64) and on 32 bits
> where the no-LFS support is just routed to LFS interface to simplify
> things.
>
> So the fstat to fstatat approach is a way to simplify the code and put
> all the required syscall logic only at fstatat implementation. By adding
> the 'special' case back on fstat.c it would require to replicate all this
> logic plus when to use statx instead (FSTATAT_USE_STATX), which is far
> from ideal. It is doable, but not really simpler...
Below it is the stat logic implemented directly on stat64.c code. It avoids
the "flag == AT_EMPTY_PATH ..." check, but it only handles LFS calls
where statx is not required (so 32 bit architectures will still continue to
use statx, but I don't think this would be a problem). It not really simpler
than add the logic on fstatat64.c, but it should be ok.
diff --git a/sysdeps/unix/sysv/linux/fstat64.c b/sysdeps/unix/sysv/linux/fstat64.c
index 124384e57f..a7f4935a01 100644
--- a/sysdeps/unix/sysv/linux/fstat64.c
+++ b/sysdeps/unix/sysv/linux/fstat64.c
@@ -19,20 +19,53 @@
#define __fstat __redirect___fstat
#define fstat __redirect_fstat
#include <sys/stat.h>
+#undef __fstat
+#undef fstat
#include <fcntl.h>
-#include <kernel_stat.h>
-#include <stat_t64_cp.h>
+#include <internal-stat.h>
#include <errno.h>
int
__fstat64_time64 (int fd, struct __stat64_t64 *buf)
{
+#if !FSTATAT_USE_STATX
+# if XSTAT_IS_XSTAT64
+# ifdef __NR_newfstatat
+ /* 64-bit kABI, e.g. aarch64, ia64, powerpc64*, s390x, riscv64, and
+ x86_64. */
+ return INLINE_SYSCALL_CALL (fstat, fd, buf);
+# elif defined __NR_fstatat64
+# if STAT64_IS_KERNEL_STAT64
+ /* 64-bit kABI outlier, e.g. alpha */
+ return INLINE_SYSCALL_CALL (fstat64, fd, buf);
+# else
+ /* 64-bit kABI outlier, e.g. sparc64. */
+ struct kernel_stat64 kst64;
+ int r = INLINE_SYSCALL_CALL (fstat64, fd, &kst64);
+ if (r == 0)
+ __cp_stat64_kstat64 (buf, &kst64);
+ return r;
+# endif /* STAT64_IS_KERNEL_STAT64 */
+# endif
+# else /* XSTAT_IS_XSTAT64 */
+ /* 64-bit kabi outlier, e.g. mips64 and mips64-n32. */
+ struct kernel_stat kst;
+ int r = INLINE_SYSCALL_CALL (fstat, fd, &kst);
+ if (r == 0)
+ __cp_kstat_stat64_t64 (&kst, buf);
+ return r;
+# endif
+#else /* !FSTATAT_USE_STATX */
+ /* All kABIs with non-LFS support and with old 32-bit time_t support
+ e.g. arm, csky, i386, hppa, m68k, microblaze, nios2, sh, powerpc32,
+ and sparc32. */
if (fd < 0)
{
__set_errno (EBADF);
return -1;
}
return __fstatat64_time64 (fd, "", buf, AT_EMPTY_PATH);
+#endif
}
#if __TIMESIZE != 64
hidden_def (__fstat64_time64)
diff --git a/sysdeps/unix/sysv/linux/fstatat64.c b/sysdeps/unix/sysv/linux/fstatat64.c
index 3509d3ca6d..127c6ff601 100644
--- a/sysdeps/unix/sysv/linux/fstatat64.c
+++ b/sysdeps/unix/sysv/linux/fstatat64.c
@@ -21,12 +21,10 @@
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
-#include <kernel_stat.h>
#include <sysdep.h>
#include <time.h>
-#include <kstat_cp.h>
-#include <stat_t64_cp.h>
#include <sys/sysmacros.h>
+#include <internal-stat.h>
#if __TIMESIZE == 64 \
&& (__WORDSIZE == 32 \
@@ -40,11 +38,7 @@ _Static_assert (sizeof (__blkcnt_t) == sizeof (__blkcnt64_t),
"__blkcnt_t and __blkcnt64_t must match");
#endif
-#if (__WORDSIZE == 32 \
- && (!defined __SYSCALL_WORDSIZE || __SYSCALL_WORDSIZE == 32)) \
- || defined STAT_HAS_TIME32 \
- || (!defined __NR_newfstatat && !defined __NR_fstatat64)
-# define FSTATAT_USE_STATX 1
+#if FSTATAT_USE_STATX
static inline int
fstatat64_time64_statx (int fd, const char *file, struct __stat64_t64 *buf,
@@ -79,8 +73,6 @@ fstatat64_time64_statx (int fd, const char *file, struct __stat64_t64 *buf,
return r;
}
-#else
-# define FSTATAT_USE_STATX 0
#endif
/* Only statx supports 64-bit timestamps for 32-bit architectures with
diff --git a/sysdeps/unix/sysv/linux/internal-stat.h b/sysdeps/unix/sysv/linux/internal-stat.h
new file mode 100644
index 0000000000..e3b0569853
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/internal-stat.h
@@ -0,0 +1,31 @@
+/* Internal stat definitions.
+ Copyright (C) 2023 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <sysdep.h>
+#include <stat_t64_cp.h>
+#include <kernel_stat.h>
+#include <kstat_cp.h>
+
+#if (__WORDSIZE == 32 \
+ && (!defined __SYSCALL_WORDSIZE || __SYSCALL_WORDSIZE == 32)) \
+ || defined STAT_HAS_TIME32 \
+ || (!defined __NR_newfstatat && !defined __NR_fstatat64)
+# define FSTATAT_USE_STATX 1
+#else
+# define FSTATAT_USE_STATX 0
+#endif
next prev parent reply other threads:[~2023-09-05 19:17 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-09-04 9:55 Mateusz Guzik
2023-09-04 10:08 ` Andreas Schwab
2023-09-04 10:11 ` Mateusz Guzik
2023-09-05 13:01 ` Adhemerval Zanella Netto
2023-09-05 13:14 ` Mateusz Guzik
2023-09-05 17:28 ` Adhemerval Zanella Netto
2023-09-05 17:45 ` Linus Torvalds
2023-09-05 18:22 ` Adhemerval Zanella Netto
2023-09-05 19:16 ` Adhemerval Zanella Netto [this message]
2023-09-05 19:21 ` Linus Torvalds
2023-09-05 21:42 ` Rich Felker
2023-09-05 21:46 ` Mateusz Guzik
2023-09-05 17:29 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b22fcbd5-eb69-7755-c76a-01706006b3cd@linaro.org \
--to=adhemerval.zanella@linaro.org \
--cc=dalias@libc.org \
--cc=libc-alpha@sourceware.org \
--cc=mjguzik@gmail.com \
--cc=schwab@suse.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).