From: Paul Eggert <eggert@cs.ucla.edu>
To: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Cc: bug-gnulib@gnu.org, libc-alpha@sourceware.org
Subject: Re: [PATCH v3 4/6] stdlib: Sync canonicalize with gnulib [BZ #10635] [BZ #26592] [BZ #26341] [BZ #24970]
Date: Fri, 1 Jan 2021 16:04:02 -0800 [thread overview]
Message-ID: <275283e0-70ee-5ea4-e63d-d0f1d1393667@cs.ucla.edu> (raw)
In-Reply-To: <502b6d2d-1139-ca9d-14cf-00082adc915e@linaro.org>
[-- Attachment #1: Type: text/plain, Size: 2898 bytes --]
On 12/30/20 5:10 AM, Adhemerval Zanella wrote:
>> it is just really
>> a small optimization that adds code complexity on a somewhat convoluted
>> code.
The code is indeed simpler without the NARROW_ADDRESSES optimization, so
I removed that optimization by installing the attached patch into Gnulib.
>> For ENAMETOOLONG, I think this is the right error code: it enforces
>> that we do not support internal objects longer that PTRDIFF_MAX.
This sounds backwards, as the code returns ENOMEM every other place it
tries to create an internal object longer than PTRDIFF_MAX - these
ENOMEM checks are in the malloc calls invoked by scratch_buffer_grow and
scratch_buffer_grow_preserve. It would be odd for canonicalize_file_name
to return ENAMETOOLONG for this one particular way of creating a
too-large object, while at the same time it returns ENOMEM for all the
other ways.
Besides, ENAMETOOLONG is the POSIX error code for exceeding NAME_MAX or
PATH_MAX, which is not what is happening here.
In Gnulib and other GNU apps we've long used the tradition that ENOMEM
means you've run out of memory, regardless of whether it's because your
heap or your address space is too small. This is a good tradition and
it'd be good to use it here too.
>> I think it should be a fair assumption to make it on internal code, such
>> as realpath
Yes, staying less than PTRDIFF_MAX is a vital assumption on internal
objects. I'd go even further and say it's important for user-supplied
objects, too, as so much code relies on pointer subtraction and we can't
realistically prohibit that within glibc.
> (this is another reason why I think NARROW_ADDRESSES is not
> necessary).
Unfortunately, if we merely assume every object has at most PTRDIFF_MAX
bytes, we still must check for overflow when adding the sizes of two
objects. The NARROW_ADDRESSES optimization would have let us avoid that
unnecessary check on 64-bit machines.
> And your fix (from 93e0186d4) does not really solve the issue, since
> now that len is a size_t the overflow check won't catch the potentially
> allocation larger than PTRDIFF_MAX (the realpath will still fail with
> ENOMEM though).
Sure, which means the code is doing the right thing: it's failing with
ENOMEM because it ran out of memory. There is no need for an extra
PTRDIFF_MAX check in canonicalize.c if malloc (via scratch_buffer_grow)
already does the check.
> Wouldn't the below be simpler?
>
> size_t len = strlen (end);
> if (len > IDX_MAX || INT_ADD_OVERFLOW ((idx_t) len, n))
> {
> __set_errno (ENAMETOOLONG);
> goto error_nomem;
> }
It's not simpler than the attached Gnulib patch, because it contains an
unnecessary comparison to IDX_MAX and an unnecessary cast to idx_t.
[-- Attachment #2: 0001-canonicalize-remove-NARROW_ADDRESSES-optimization.patch --]
[-- Type: text/x-patch, Size: 3184 bytes --]
From 8f6b9b66be6672bed1045c27e606dd9fcedcf022 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Fri, 1 Jan 2021 15:54:43 -0800
Subject: [PATCH] canonicalize: remove NARROW_ADDRESSES optimization
* lib/canonicalize-lgpl.c, lib/canonicalize.c (NARROW_ADDRESSES):
Remove, and remove all uses, as the optimization is arguably not
worth the extra complexity. Suggested by Adhemerval Zanella in:
https://sourceware.org/pipermail/libc-alpha/2020-December/121203.html
---
ChangeLog | 8 ++++++++
lib/canonicalize-lgpl.c | 6 +-----
lib/canonicalize.c | 6 +-----
3 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 2d498a5e9..fc45e1176 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2021-01-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ canonicalize: remove NARROW_ADDRESSES optimization
+ * lib/canonicalize-lgpl.c, lib/canonicalize.c (NARROW_ADDRESSES):
+ Remove, and remove all uses, as the optimization is arguably not
+ worth the extra complexity. Suggested by Adhemerval Zanella in:
+ https://sourceware.org/pipermail/libc-alpha/2020-December/121203.html
+
2021-01-01 Bruno Haible <bruno@clisp.org>
stddef: Try harder to get max_align_t defined on OpenBSD.
diff --git a/lib/canonicalize-lgpl.c b/lib/canonicalize-lgpl.c
index 560e24288..698f9ede2 100644
--- a/lib/canonicalize-lgpl.c
+++ b/lib/canonicalize-lgpl.c
@@ -85,10 +85,6 @@
# define IF_LINT(Code) /* empty */
#endif
-/* True if adding two valid object sizes might overflow idx_t.
- As a practical matter, this cannot happen on 64-bit machines. */
-enum { NARROW_ADDRESSES = IDX_MAX >> 31 >> 31 == 0 };
-
#ifndef DOUBLE_SLASH_IS_DISTINCT_ROOT
# define DOUBLE_SLASH_IS_DISTINCT_ROOT false
#endif
@@ -343,7 +339,7 @@ realpath_stk (const char *name, char *resolved,
if (end_in_extra_buffer)
end_idx = end - extra_buf;
size_t len = strlen (end);
- if (NARROW_ADDRESSES && INT_ADD_OVERFLOW (len, n))
+ if (INT_ADD_OVERFLOW (len, n))
{
__set_errno (ENOMEM);
goto error_nomem;
diff --git a/lib/canonicalize.c b/lib/canonicalize.c
index cc32260a8..3a1c8098b 100644
--- a/lib/canonicalize.c
+++ b/lib/canonicalize.c
@@ -42,10 +42,6 @@
# define IF_LINT(Code) /* empty */
#endif
-/* True if adding two valid object sizes might overflow idx_t.
- As a practical matter, this cannot happen on 64-bit machines. */
-enum { NARROW_ADDRESSES = IDX_MAX >> 31 >> 31 == 0 };
-
#ifndef DOUBLE_SLASH_IS_DISTINCT_ROOT
# define DOUBLE_SLASH_IS_DISTINCT_ROOT false
#endif
@@ -393,7 +389,7 @@ canonicalize_filename_mode_stk (const char *name, canonicalize_mode_t can_mode,
if (end_in_extra_buffer)
end_idx = end - extra_buf;
size_t len = strlen (end);
- if (NARROW_ADDRESSES && INT_ADD_OVERFLOW (len, n))
+ if (INT_ADD_OVERFLOW (len, n))
xalloc_die ();
while (extra_buffer.length <= len + n)
{
--
2.27.0
next prev parent reply other threads:[~2021-01-02 0:04 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-29 19:34 [PATCH v3 0/6] Multiple fixes to realpath Adhemerval Zanella
2020-12-29 19:34 ` [PATCH v3 1/6] Import idx.h from gnulib Adhemerval Zanella
2020-12-29 19:34 ` [PATCH v3 2/6] Import filename.h " Adhemerval Zanella
2020-12-29 19:34 ` [PATCH v3 3/6] malloc: Add scratch_buffer_dupfree Adhemerval Zanella
2020-12-29 19:34 ` [PATCH v3 4/6] stdlib: Sync canonicalize with gnulib [BZ #10635] [BZ #26592] [BZ #26341] [BZ #24970] Adhemerval Zanella
2020-12-30 1:21 ` Paul Eggert
2020-12-30 3:39 ` Paul Eggert
2020-12-30 6:19 ` Paul Eggert
2020-12-30 12:34 ` Adhemerval Zanella
2020-12-30 13:10 ` Adhemerval Zanella
2021-01-02 0:04 ` Paul Eggert [this message]
2021-01-04 12:52 ` Adhemerval Zanella
2021-01-09 1:24 ` Paul Eggert
2020-12-29 19:34 ` [PATCH v3 5/6] support: Add support_small_thread_stack_size Adhemerval Zanella
2020-12-30 8:54 ` Florian Weimer
2020-12-29 19:34 ` [PATCH v3 6/6] stdlib: Add testcase fro BZ #26241 Adhemerval Zanella
2021-01-20 4:33 ` DJ Delorie
2021-01-20 14:13 ` Adhemerval Zanella
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=275283e0-70ee-5ea4-e63d-d0f1d1393667@cs.ucla.edu \
--to=eggert@cs.ucla.edu \
--cc=adhemerval.zanella@linaro.org \
--cc=bug-gnulib@gnu.org \
--cc=libc-alpha@sourceware.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).