public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: bug-gnulib@gnu.org, libc-alpha@sourceware.org
Subject: Re: [PATCH 5/5] stdlib: Remove lstat usage from realpath [BZ #24970]
Date: Mon, 28 Dec 2020 08:42:52 -0300	[thread overview]
Message-ID: <414fce85-5ed9-6438-d5a5-c92b308a4052@linaro.org> (raw)
In-Reply-To: <2bbe633f-97b7-9416-8d6a-e4c934cf23fd@cs.ucla.edu>



On 24/12/2020 21:27, Paul Eggert wrote:
> This email finishes the review of this proposed glibc patchset. (I didn't look at patch 4/5 for test cases.)
> 
> On 12/24/20 7:17 AM, Adhemerval Zanella wrote:
> 
>> +/* Check if BUFFER is using the internal buffer.  */
>> +static __always_inline bool
>> +scratch_buffer_using_internal (struct scratch_buffer *buffer)
>> +{
>> +  return buffer->data == buffer->__space.__c;
>> +}
>> +
>> +/* Return the internal buffer from BUFFER if it is dynamic allocated,
>> +   otherwise returns NULL.  Initializes the BUFFER if the internal
>> +   dynamic buffer is returned.  */
>> +static __always_inline void *
>> +scratch_buffer_take_buffer (struct scratch_buffer *buffer)
>> +{
>> +  if (scratch_buffer_using_internal (buffer))
>> +    return NULL;
>> +
>> +  void *r = buffer->data;
>> +  scratch_buffer_init (buffer);
>> +  return r;
>> +}
> 
> This combination of functions is a little tricky. Instead, how about a single function that duplicates the scratch buffer on the heap, and frees the scratch buffer? Please see the attached proposed patch for Gnulib, which implements this idea. (I have not installed this into Gnulib.)

Indeed this seems a better alternative.

> 
> Also, shouldn't we merge the already-existing Gnulib scratch_buffer changes into glibc, along with this new change?

Which changes are you referring? Checking against bbaba6ce5 I see all
glibc files for scratch_buffer are similar to the gnulib ones.

> 
>>   static idx_t
>> +readlink_scratch_buffer (const char *path, struct scratch_buffer *buf)
>> +{
>> +  ssize_t r;
>> +  while (true)
>> +    {
>> +      ptrdiff_t bufsize = buf->length;
>> +      r = __readlink (path, buf->data, bufsize - 1);
>> +      if (r < bufsize - 1)
>> +    break;
>> +      if (!scratch_buffer_grow (buf))
>> +    return -1;
>> +    }
>> +  return r;
>> +}
> 
> This function seems to exist because the proposed code calls readlink in two places. Current gnulib has been changed to call it in just one place, so there's less need to split out the function (the splitting complicates out-of-memory checking).

Yes, this trades a stat call by an extra readlink call.  However it seems
that your strategy to use faccessat should be better, assuming it is lighter
syscall.

> 
>> -  scratch_buffer_init (rname_buf);
>> -  char *rname_on_stack = rname_buf->data;
>> ...
>> +  scratch_buffer_init (&rname_buffer);
>> +  char *rname = rname_buffer.data;
> 
> Doesn't this sort of thing have the potential to run into GCC bug 93644? That bug tends to be flaky; change platforms, or a few lines of code, and the problem recurs. Although it's just a bogus warning it cannot be turned off Gnulib has a GCC_LINT fix for this, which glibc could use simply with "#define GCC_LINT 1" in the "#ifdef _GLIBC" code.

I wasn't aware on the GCC issue in fact (it seems to affect GCC 10/11
and I am using GCC 9.2.1).

> 
>> @@ -206,6 +219,20 @@ __realpath (const char *name, char *resolved)
>>           /* nothing */;
>>         else if (startlen == 2 && start[0] == '.' && start[1] == '.')
>>           {
>> +          if (!ISSLASH (dest[-1]))
>> +            *dest++ = '/';
>> +          *dest = '\0';
>> +
>> +      ssize_t n = readlink_scratch_buffer (rname, &link_buffer);
>> +          if (n < 0)
>> +            {
>> +              if (errno == ENOTDIR && dest[-1] == '/')
>> +                dest[-1] = '\0';
>> +          if (errno == ENOMEM)
>> +        goto error_nomem;
>> +              if (errno != EINVAL)
>> +                goto error;
>> +            }
> 
> This can call readlink twice, once with trailing slash and once without. Better to call it just once.

Right.

> 
>> +          char *buf = (char*) link_buffer.data;
>>                   buf[n] = '\0';
>>   @@ -279,7 +296,7 @@ __realpath (const char *name, char *resolved)
>>                   /* Careful here, end may be a pointer into extra_buf... */
>>                 memmove (&extra_buf[n], end, len + 1);
>> -              name = end = memcpy (extra_buf, buf, n);
>> +              name = end = memcpy (extra_buf, link_buffer.data, n);
> 
> If buf already equals link_buffer.data, no need for the patch to change buf to link_buffer.data.
> 

Indeed, this is a left over from development.

>> -          else if (! (startlen == 0
>> -                      ? stat (rname, &st) == 0 || errno == EOVERFLOW
>> -                      : errno == EINVAL))
>> -            goto error;
> 
> I think current Gnulib addresses this issue in a different way, that doesn't involve the extra readlink calls.
>>     if (DOUBLE_SLASH_IS_DISTINCT_ROOT && dest == rname + 1 && !prefix_len
>>         && ISSLASH (*dest) && !ISSLASH (dest[1]))

Yes, I noted now on gnulib master that you handled it with an faccessat 
on dir_check. 

>>       dest++;
>> +  *dest = '\0';
>>     failed = false;
>>     error:
>> -  *dest++ = '\0';
> 
> This looks dubious, as the error case also needs *dest to be '\0' and to increment dest (for when returning NULL when resolved != NULL).
> 

Yes, I think this is also a leftover from development.

> Proposed patch to Gnulib attached. I hope this patch (along with what's already in Gnulib) addresses all the issues raised in your glibc patches, in the sense that the relevant files can be identical in Glibc and in Gnulib. I haven't installed this into Gnulib master on savannah.

Changes seems ok, I will send a v3 with the proposed changes to sync
the gnulib headers, along with the scratch_buffer change, the gnulib
sync that fix all the issues, and the extra test.

Thanks for working on this.

  reply	other threads:[~2020-12-28 11:43 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-24 15:16 [PATCH 0/5] Fix multiple realpath issues Adhemerval Zanella
2020-12-24 15:16 ` [PATCH 1/5] stdlib: Sync canonicalize with gnulib [BZ #10635] [BZ #26592] [BZ #26241] Adhemerval Zanella
2020-12-24 22:45   ` Paul Eggert
2020-12-25  0:44     ` [PATCH 1/5] warnings in canonicalize.c Bruno Haible
2020-12-25  5:56       ` Paul Eggert
2020-12-28 12:53     ` [PATCH 1/5] stdlib: Sync canonicalize with gnulib [BZ #10635] [BZ #26592] [BZ #26241] Adhemerval Zanella
2020-12-24 15:16 ` [PATCH 2/5] Import idx.h from gnulib Adhemerval Zanella
2020-12-24 22:53   ` Paul Eggert
2020-12-25 20:34   ` Florian Weimer
2020-12-25 21:38     ` Bruno Haible
2020-12-26  1:29     ` Paul Eggert
2020-12-31 23:12   ` Joseph Myers
2021-01-01  3:24     ` Paul Eggert
2020-12-24 15:16 ` [PATCH 3/5] Import filename.h " Adhemerval Zanella
2020-12-31 23:13   ` Joseph Myers
2021-01-01  3:31     ` Paul Eggert
2020-12-24 15:17 ` [PATCH 4/5] stdlib: Add testcase fro BZ #26241 Adhemerval Zanella
2020-12-24 15:17 ` [PATCH 5/5] stdlib: Remove lstat usage from realpath [BZ #24970] Adhemerval Zanella
2020-12-25  0:27   ` Paul Eggert
2020-12-28 11:42     ` Adhemerval Zanella [this message]
2020-12-28 13:32       ` Adhemerval Zanella
2020-12-28 21:04       ` Paul Eggert

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=414fce85-5ed9-6438-d5a5-c92b308a4052@linaro.org \
    --to=adhemerval.zanella@linaro.org \
    --cc=bug-gnulib@gnu.org \
    --cc=eggert@cs.ucla.edu \
    --cc=libc-alpha@sourceware.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).