public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: bug-gnulib@gnu.org, libc-alpha@sourceware.org
Subject: Re: [PATCH v3 4/6] stdlib: Sync canonicalize with gnulib [BZ #10635] [BZ #26592] [BZ #26341] [BZ #24970]
Date: Mon, 4 Jan 2021 09:52:10 -0300	[thread overview]
Message-ID: <35be46d6-1b5a-b1f9-c3db-0956448c55ae@linaro.org> (raw)
In-Reply-To: <275283e0-70ee-5ea4-e63d-d0f1d1393667@cs.ucla.edu>



On 01/01/2021 21:04, Paul Eggert wrote:
> On 12/30/20 5:10 AM, Adhemerval Zanella wrote:
> 
>>> it is just really
>>> a small optimization that adds code complexity on a somewhat convoluted
>>> code.
> 
> The code is indeed simpler without the NARROW_ADDRESSES optimization, so I removed that optimization by installing the attached patch into Gnulib.
> 
>>> For ENAMETOOLONG, I think this is the right error code: it enforces
>>> that we do not support internal objects longer that PTRDIFF_MAX.
> 
> This sounds backwards, as the code returns ENOMEM every other place it tries to create an internal object longer than PTRDIFF_MAX - these ENOMEM checks are in the malloc calls invoked by scratch_buffer_grow and scratch_buffer_grow_preserve. It would be odd for canonicalize_file_name to return ENAMETOOLONG for this one particular way of creating a too-large object, while at the same time it returns ENOMEM for all the other ways.
> 
> Besides, ENAMETOOLONG is the POSIX error code for exceeding NAME_MAX or PATH_MAX, which is not what is happening here.> 
> In Gnulib and other GNU apps we've long used the tradition that ENOMEM means you've run out of memory, regardless of whether it's because your heap or your address space is too small. This is a good tradition and it'd be good to use it here too.

Right, I think we can now assume that that since the  implementation does 
not really imposes any NAME_MAX or PATH_MAX limitations, returning memory
allocation errors instead of ENAMETOOLONG is ok.  I will adjust the 
stdlib/test-bz22786.c, it will require a slight large runtime and memory
requirements (which should be ok).
> 
>>> I think it should be a fair assumption to make it on internal code, such
>>> as realpath
> 
> Yes, staying less than PTRDIFF_MAX is a vital assumption on internal objects. I'd go even further and say it's important for user-supplied objects, too, as so much code relies on pointer subtraction and we can't realistically prohibit that within glibc.

We do enforce it through memory allocations, however users can still craft
such objects using mmap (some libc does imposes the same PTRDIFF_MAX limit
on mmap as well).

> 
>> (this is another reason why I think NARROW_ADDRESSES is not necessary).
> 
> Unfortunately, if we merely assume every object has at most PTRDIFF_MAX bytes, we still must check for overflow when adding the sizes of two objects. The NARROW_ADDRESSES optimization would have let us avoid that unnecessary check on 64-bit machines.
> 
>> And your fix (from 93e0186d4) does not really solve the issue, since
>> now that len is a size_t the overflow check won't catch the potentially
>> allocation larger than PTRDIFF_MAX (the realpath will still fail with
>> ENOMEM though).
> 
> Sure, which means the code is doing the right thing: it's failing with ENOMEM because it ran out of memory. There is no need for an extra PTRDIFF_MAX check in canonicalize.c if malloc (via scratch_buffer_grow) already does the check.
>> Wouldn't the below be simpler?
>>
>>                size_t len = strlen (end);
>>                if (len > IDX_MAX || INT_ADD_OVERFLOW ((idx_t) len, n))
>>                  {
>>                    __set_errno (ENAMETOOLONG);
>>                    goto error_nomem;
>>                  }
> 
> It's not simpler than the attached Gnulib patch, because it contains an unnecessary comparison to IDX_MAX and an unnecessary cast to idx_t.

The extra comparison might avoid the scratch_buffer resize that will
fail (since malloc will fail to try allocate PTRDIFF_MAX object), but
it will be used only when such objects are provided (which depending
of the system should not happen).

Thanks, I synced with the most recent gnulib version. 

  reply	other threads:[~2021-01-04 12:52 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-29 19:34 [PATCH v3 0/6] Multiple fixes to realpath Adhemerval Zanella
2020-12-29 19:34 ` [PATCH v3 1/6] Import idx.h from gnulib Adhemerval Zanella
2020-12-29 19:34 ` [PATCH v3 2/6] Import filename.h " Adhemerval Zanella
2020-12-29 19:34 ` [PATCH v3 3/6] malloc: Add scratch_buffer_dupfree Adhemerval Zanella
2020-12-29 19:34 ` [PATCH v3 4/6] stdlib: Sync canonicalize with gnulib [BZ #10635] [BZ #26592] [BZ #26341] [BZ #24970] Adhemerval Zanella
2020-12-30  1:21   ` Paul Eggert
2020-12-30  3:39     ` Paul Eggert
2020-12-30  6:19       ` Paul Eggert
2020-12-30 12:34     ` Adhemerval Zanella
2020-12-30 13:10       ` Adhemerval Zanella
2021-01-02  0:04         ` Paul Eggert
2021-01-04 12:52           ` Adhemerval Zanella [this message]
2021-01-09  1:24             ` Paul Eggert
2020-12-29 19:34 ` [PATCH v3 5/6] support: Add support_small_thread_stack_size Adhemerval Zanella
2020-12-30  8:54   ` Florian Weimer
2020-12-29 19:34 ` [PATCH v3 6/6] stdlib: Add testcase fro BZ #26241 Adhemerval Zanella
2021-01-20  4:33   ` DJ Delorie
2021-01-20 14:13     ` Adhemerval Zanella

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=35be46d6-1b5a-b1f9-c3db-0956448c55ae@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=bug-gnulib@gnu.org \
    --cc=eggert@cs.ucla.edu \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).