public inbox for libc-ports@sourceware.org
 help / color / mirror / Atom feed
* [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW?
@ 2013-02-26 19:55 Carlos O'Donell
  2013-02-26 20:37 ` Paul Eggert
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Carlos O'Donell @ 2013-02-26 19:55 UTC (permalink / raw)
  To: GNU C Library, libc-ports

Community,

There are several filesystems which are going to be switching over
or already have switched over to unconditional 64-bit inodes and 
file sizes. This means that the current crop of 32-bit applications
may not be able to run reliably on these filesystems since stat 
may return EOVERFLOW for large values.

The correct solution is to switch to the 64-bit function variants
and avoid the overflow when converting from the kernel stat 
structure to the userspace version.

The correct solution is going to take time to implement and I am 
being asked if the glibc community can help lessen the burden of
the transition. I know that this has been coming now for over
a decade, but we should not ignore the request for assistance.

For example in the scenario where the application does not use 
any field which would overflow the user could ignore EOVERFLOW
from stat, mark the offending code as needing an audit and move
on.

The guarantee from glibc would be that on EOVERFLOW and -1 returned
that as much of the user stat structure will be filled as possible.
In the current glibc code we bail out at any point when we detect
an overflow during field conversion. The following patch shows how
the new behaviour would be implemented. I saw only two places 
where we quit early without copying other fields.

I liked the way the Hurd code in sysdeps/mach/hurd/xstatconv.c
does the overflow check last for all fields that might overflow.

The only argument against such a change is that glibc would like
to have a fail fast policy, which we don't today. The current
library policy is that failures are considered slow paths and
do not need to fail fast.

Comments?

Cheers,
Carlos.

ports/ChangeLog.mips

2012-02-26  Carlos O'Donell  <carlos@redhat.com>

	* sysdeps/unix/sysv/linux/mips/xstatconv.c (__xstat_conv):
	Convert all fields first. Set errno if any field had an overflow.

ChangeLog

2013-02-26  Carlos O'Donell  <carlos@redhat.com>

	* sysdeps/unix/sysv/linux/xstatconv.c (__xstat32_conv):
	Convert all fields first. Set errno if any field had an overflow.


diff --git a/ports/sysdeps/unix/sysv/linux/mips/xstatconv.c b/ports/sysdeps/unix/sysv/linux/mips/xstatconv.c
index 3fe65b1..891a98e 100644
--- a/ports/sysdeps/unix/sysv/linux/mips/xstatconv.c
+++ b/ports/sysdeps/unix/sysv/linux/mips/xstatconv.c
@@ -37,6 +37,7 @@ __xstat_conv (int vers, struct kernel_stat *kbuf, void *ubuf)
     case _STAT_VER_LINUX:
       {
 	struct stat *buf = ubuf;
+	int err = 0;
 
 	/* Convert to current kernel version of `struct stat'.  */
 	buf->st_dev = kbuf->st_dev;
@@ -44,10 +45,7 @@ __xstat_conv (int vers, struct kernel_stat *kbuf, void *ubuf)
 	buf->st_ino = kbuf->st_ino;
 	/* Check for overflow.  */
 	if (buf->st_ino != kbuf->st_ino)
-	  {
-	    __set_errno (EOVERFLOW);
-	    return -1;
-	  }
+	  err = EOVERFLOW;
 	buf->st_mode = kbuf->st_mode;
 	buf->st_nlink = kbuf->st_nlink;
 	buf->st_uid = kbuf->st_uid;
@@ -57,10 +55,7 @@ __xstat_conv (int vers, struct kernel_stat *kbuf, void *ubuf)
 	buf->st_size = kbuf->st_size;
 	/* Check for overflow.  */
 	if (buf->st_size != kbuf->st_size)
-	  {
-	    __set_errno (EOVERFLOW);
-	    return -1;
-	  }
+	  err = EOVERFLOW;
 	buf->st_pad3 = 0;
 	buf->st_atim.tv_sec = kbuf->st_atime_sec;
 	buf->st_atim.tv_nsec = kbuf->st_atime_nsec;
@@ -72,11 +67,13 @@ __xstat_conv (int vers, struct kernel_stat *kbuf, void *ubuf)
 	buf->st_blocks = kbuf->st_blocks;
 	/* Check for overflow.  */
 	if (buf->st_blocks != kbuf->st_blocks)
+	  err = EOVERFLOW;
+	memset (&buf->st_pad5, 0, sizeof (buf->st_pad5));
+	if (err)
 	  {
-	    __set_errno (EOVERFLOW);
+	    __set_errno (err);
 	    return -1;
 	  }
-	memset (&buf->st_pad5, 0, sizeof (buf->st_pad5));
       }
       break;
 
diff --git a/sysdeps/unix/sysv/linux/xstatconv.c b/sysdeps/unix/sysv/linux/xstatconv.c
index 858b911..68995df 100644
--- a/sysdeps/unix/sysv/linux/xstatconv.c
+++ b/sysdeps/unix/sysv/linux/xstatconv.c
@@ -185,6 +185,7 @@ __xstat32_conv (int vers, struct stat64 *kbuf, struct stat *buf)
     {
     case _STAT_VER_LINUX:
       {
+	int err = 0;
 	/* Convert current kernel version of `struct stat64' to
            `struct stat'.  */
 	buf->st_dev = kbuf->st_dev;
@@ -201,19 +202,13 @@ __xstat32_conv (int vers, struct stat64 *kbuf, struct stat *buf)
 	    buf->st_ino = kbuf->st_ino;
 	    if (sizeof (buf->st_ino) != sizeof (kbuf->st_ino)
 		&& buf->st_ino != kbuf->st_ino)
-	      {
-		__set_errno (EOVERFLOW);
-		return -1;
-	      }
+	      err = EOVERFLOW;
 	  }
 #else
 	buf->st_ino = kbuf->st_ino;
 	if (sizeof (buf->st_ino) != sizeof (kbuf->st_ino)
 	    && buf->st_ino != kbuf->st_ino)
-	  {
-	    __set_errno (EOVERFLOW);
-	    return -1;
-	  }
+	  err = EOVERFLOW;
 #endif
 	buf->st_mode = kbuf->st_mode;
 	buf->st_nlink = kbuf->st_nlink;
@@ -227,19 +222,13 @@ __xstat32_conv (int vers, struct stat64 *kbuf, struct stat *buf)
 	/* Check for overflow.  */
 	if (sizeof (buf->st_size) != sizeof (kbuf->st_size)
 	    && buf->st_size != kbuf->st_size)
-	  {
-	    __set_errno (EOVERFLOW);
-	    return -1;
-	  }
+	  err = EOVERFLOW;
 	buf->st_blksize = kbuf->st_blksize;
 	buf->st_blocks = kbuf->st_blocks;
 	/* Check for overflow.  */
 	if (sizeof (buf->st_blocks) != sizeof (kbuf->st_blocks)
 	    && buf->st_blocks != kbuf->st_blocks)
-	  {
-	    __set_errno (EOVERFLOW);
-	    return -1;
-	  }
+	  err = EOVERFLOW;
 #ifdef _HAVE_STAT_NSEC
 	buf->st_atim.tv_sec = kbuf->st_atim.tv_sec;
 	buf->st_atim.tv_nsec = kbuf->st_atim.tv_nsec;
@@ -268,6 +257,12 @@ __xstat32_conv (int vers, struct stat64 *kbuf, struct stat *buf)
 #ifdef _HAVE_STAT___UNUSED5
 	buf->__unused5 = 0;
 #endif
+
+	if (err)
+	  {
+	    __set_errno (err);
+	    return -1;
+	  }
       }
       break;
---

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW?
  2013-02-26 19:55 [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW? Carlos O'Donell
@ 2013-02-26 20:37 ` Paul Eggert
  2013-02-26 22:59   ` Joseph S. Myers
  2013-02-26 21:00 ` Roland McGrath
  2013-02-26 23:49 ` Rich Felker
  2 siblings, 1 reply; 8+ messages in thread
From: Paul Eggert @ 2013-02-26 20:37 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: GNU C Library, libc-ports

On 02/26/13 11:55, Carlos O'Donell wrote:> Community,

> The correct solution is going to take time to implement

Don't the proposed patches to xstatconv.c etc. slow down glibc
slightly for the normal (successful) case?  That's a minus.

Here's a different idea: enable wide ino_t, time_t, etc. if
_FILE_OFFSET_BITS is 64.  This would require zero changes to
applications that are portable and largefile aware, as they
already define _FILE_OFFSET_BITS.  And even for nonportable
applications, compiling with -D_FILE_OFFSET_BITS=64 (and fixing any
resulting porting glitches) should take less time and should be safer
than modifying the applications to make the unportable assumption that a
struct stat is filled in if errno == EOVERFLOW.

A more-conservative variant of this idea would establish a new symbol
that means "use wide-enough values for all file-related types".
-D_WIDE_FILE_TYPES, say, would use wide types for off_t, time_t,
ino_t, etc.  Portable application developers could modify their code
to set this symbol as well as setting _FILE_OFFSET_BITS, which is a
bit of a hassle, but I've seen worse (e.g., Mac OS X uses
_DARWIN_USE_64_BIT_INODE to select 64-bit inodes, but that's all it
does -- yuck).

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW?
  2013-02-26 19:55 [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW? Carlos O'Donell
  2013-02-26 20:37 ` Paul Eggert
@ 2013-02-26 21:00 ` Roland McGrath
  2013-02-26 23:05   ` Joseph S. Myers
  2013-02-27 19:26   ` Carlos O'Donell
  2013-02-26 23:49 ` Rich Felker
  2 siblings, 2 replies; 8+ messages in thread
From: Roland McGrath @ 2013-02-26 21:00 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: GNU C Library, libc-ports

The argument in favor of this API change seems quite thin.  An old
program will have to be modified to accept EOVERFLOW failures, so why
not modify it to use *64 interfaces or -D_FILE_OFFSET_BITS=64 instead?
It may seem at first blush that the change would be simpler in complex
programs.  But, in fact, to be robust when using older libcs a program
would have to do something very special to distinguish a library call
(new-style) that delivered some truncated values from one (old-style)
that delivered some or all uninitialized fields.

Performance concerns aside, the real cost of this is making the API more
arcane.  It's a straightforward rule today that calls that fail leave
any result parameters in an undefined state and programs must not assume
anything about those bits of memory after a failing call.  There are
some exceptions such as nanosleep's EINTR case, but those are very few
and are well-specified in POSIX.

Old programs wanting something like you suggest can easily get it from a
wrapper function something_stat supplied in some separate library
(either an actual library or an informal one like gnulib) that calls
stat64 and converts to the old format with the EOVERFLOW semantics you
describe.

I don't see any defensible rationale for putting such a change into
libc.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW?
  2013-02-26 20:37 ` Paul Eggert
@ 2013-02-26 22:59   ` Joseph S. Myers
  0 siblings, 0 replies; 8+ messages in thread
From: Joseph S. Myers @ 2013-02-26 22:59 UTC (permalink / raw)
  To: Paul Eggert; +Cc: Carlos O'Donell, GNU C Library, libc-ports

On Tue, 26 Feb 2013, Paul Eggert wrote:

> Here's a different idea: enable wide ino_t, time_t, etc. if
> _FILE_OFFSET_BITS is 64.  This would require zero changes to

ino_t is already __ino64_t when _FILE_OFFSET_BITS is 64 (ignoring the lack 
of 64-bit ino_t in the kernel stat64 interface on SH).  64-bit time_t 
interfaces on 32-bit systems will require major new work in both glibc and 
the Linux kernel (including new versions of various syscalls, though 
presumably only the relevant syscalls that are in the generic ABI and not 
the legacy ones).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW?
  2013-02-26 21:00 ` Roland McGrath
@ 2013-02-26 23:05   ` Joseph S. Myers
  2013-02-27 19:26   ` Carlos O'Donell
  1 sibling, 0 replies; 8+ messages in thread
From: Joseph S. Myers @ 2013-02-26 23:05 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Carlos O'Donell, GNU C Library, libc-ports

On Tue, 26 Feb 2013, Roland McGrath wrote:

> The argument in favor of this API change seems quite thin.  An old
> program will have to be modified to accept EOVERFLOW failures, so why
> not modify it to use *64 interfaces or -D_FILE_OFFSET_BITS=64 instead?

Agreed.  _FILE_OFFSET_BITS changes can be fiddly when a library's 
interface involves off_t or other affected types, but I'd expect changing 
each program to handle EOVERFLOW specially to be much more fiddly and 
fragile than a library ABI transition for libraries whose interfaces 
involve affected types - even though changing _FILE_OFFSET_BITS may also 
involve reviewing applications for cases where types such as "int" or 
"long" or associated printf formats are used to store values that may no 
longer fit in those types.  (Where a library's interface depends on 
_FILE_OFFSET_BITS, I'd suggest a conditional #error in the library's 
headers to make sure that applications built with the library are using 
the same setting of _FILE_OFFSET_BITS.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW?
  2013-02-26 19:55 [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW? Carlos O'Donell
  2013-02-26 20:37 ` Paul Eggert
  2013-02-26 21:00 ` Roland McGrath
@ 2013-02-26 23:49 ` Rich Felker
  2013-02-26 23:58   ` Roland McGrath
  2 siblings, 1 reply; 8+ messages in thread
From: Rich Felker @ 2013-02-26 23:49 UTC (permalink / raw)
  To: Carlos O'Donell; +Cc: GNU C Library, libc-ports

On Tue, Feb 26, 2013 at 02:55:33PM -0500, Carlos O'Donell wrote:
> Community,
> 
> There are several filesystems which are going to be switching over
> or already have switched over to unconditional 64-bit inodes and 
> file sizes. This means that the current crop of 32-bit applications
> may not be able to run reliably on these filesystems since stat 
> may return EOVERFLOW for large values.

Can we PLEASE reconsider the decision to keep supporting 32-bit off_t?
If we're to the point where stat() *randomly* (from the application's
and user's perspective) fails when compiled with 32-bit off_t, then I
think such programs are just fundamentally broken now, and need to be
deprecated. Despite the move to compile everythin with
-D_FILE_OFFSET_BITS=64, last time I checked, Debian still has a number
of packages which are compiled without it, and thus which might
experience random breakage on newer filesystems.

In my opinion, the migration path should be something like this:

Stage 1: Change the default for _FILE_OFFSET_BITS to 64 on all
targets, and make 32-bit off_t accessible only by explicitly
requesting -D_FILE_OFFSET_BITS=32. This should help transition all the
stragglers who are just straggling because they don't know better; the
few who have fundamentally buggy code that's incompatible with 64-bit
off_t can use the workaround (-D_FILE_OFFSET_BITS=32) until they get
their acts together.

Stage 2: Remove all 32-bit off_t infrastructure from the headers and
leave the legacy 32-bit off_t symbols around only for use by
already-linked applications. At this point, New symbol versions can be
introduced for all of the functions (open, etc.) so that hacks
remapping them to the nonconforming namespace names (open->open64,
etc.) can be removed; legacy applications can get the legacy behavior
based on the symbol version rather than the symbol name. This will
also resolve the issue that glibc does not allow you to declare some
of these functions yourself without including the associated header.

I would be perfectly happy with stage 1 being made in the next release
cycle and stafe 2 in the following release cycle, but if it would be
more agreeable to wait 2 or more release cycles before stage 2, that's
still much better than living with the current breakage indefinitely.

Opinions?

Rich

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW?
  2013-02-26 23:49 ` Rich Felker
@ 2013-02-26 23:58   ` Roland McGrath
  0 siblings, 0 replies; 8+ messages in thread
From: Roland McGrath @ 2013-02-26 23:58 UTC (permalink / raw)
  To: Rich Felker; +Cc: Carlos O'Donell, GNU C Library, libc-ports

> Stage 1: Change the default for _FILE_OFFSET_BITS to 64 on all
> targets, and make 32-bit off_t accessible only by explicitly
> requesting -D_FILE_OFFSET_BITS=32. This should help transition all the
> stragglers who are just straggling because they don't know better; the
> few who have fundamentally buggy code that's incompatible with 64-bit
> off_t can use the workaround (-D_FILE_OFFSET_BITS=32) until they get
> their acts together.

This makes library ABIs silently become incompatible with past builds.
That's a non-starter.

I'm somewhat sympathetic to the overall goal, but I'm not at all sure what
a viable migration path might be.

Regardless, don't hijack this thread for that subject.
Start your own thread.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW?
  2013-02-26 21:00 ` Roland McGrath
  2013-02-26 23:05   ` Joseph S. Myers
@ 2013-02-27 19:26   ` Carlos O'Donell
  1 sibling, 0 replies; 8+ messages in thread
From: Carlos O'Donell @ 2013-02-27 19:26 UTC (permalink / raw)
  To: Roland McGrath; +Cc: GNU C Library, libc-ports

On 02/26/2013 04:00 PM, Roland McGrath wrote:
> The argument in favor of this API change seems quite thin.  An old
> program will have to be modified to accept EOVERFLOW failures, so why
> not modify it to use *64 interfaces or -D_FILE_OFFSET_BITS=64 instead?
> It may seem at first blush that the change would be simpler in complex
> programs.  But, in fact, to be robust when using older libcs a program
> would have to do something very special to distinguish a library call
> (new-style) that delivered some truncated values from one (old-style)
> that delivered some or all uninitialized fields.

A given users needs are far more focused. In practice they want to move
to a newer distribution or filesystem and keep down the cost of the 
upgrade while incrementally fixing applications.

> I don't see any defensible rationale for putting such a change into
> libc.

I agree. I didn't want to colour the conversation with my initial opinion,
but it seems like everyone is pretty well agreed that the change would
complicate the API without enough gain.

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2013-02-27 19:26 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-02-26 19:55 [RFC] Copy as much as you can during a 32-bit stat before returning EOVERFLOW? Carlos O'Donell
2013-02-26 20:37 ` Paul Eggert
2013-02-26 22:59   ` Joseph S. Myers
2013-02-26 21:00 ` Roland McGrath
2013-02-26 23:05   ` Joseph S. Myers
2013-02-27 19:26   ` Carlos O'Donell
2013-02-26 23:49 ` Rich Felker
2013-02-26 23:58   ` Roland McGrath

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).