From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <libc-alpha-return-104927-listarch-libc-alpha=sources.redhat.com@sourceware.org>
Received: (qmail 810 invoked by alias); 30 Aug 2019 12:53:01 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Received: (qmail 799 invoked by uid 89); 30 Aug 2019 12:53:00 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-11.2 required=5.0 tests=AWL,BAYES_00,RCVD_IN_DNSWL_NONE,SPF_PASS autolearn=ham version=3.3.1 spammy=
X-HELO: mail-qt1-f194.google.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=linaro.org; s=google;
        h=to:cc:references:from:openpgp:autocrypt:subject:message-id:date
         :user-agent:mime-version:in-reply-to:content-language
         :content-transfer-encoding;
        bh=AmHUDuVoh+XADUK/9i96P3Rc1KdFzViHfczpYx+x6EA=;
        b=PxA/0NgVL4EFuhG1rI4nAmbGA3t/0nbAbYKRK4Z2MU9ptgVjb1rMxF/brQ/x4BcCn/
         6VdLsKNzEfxQP0Q+tLmOim+aIjR0ycZLBdx6qf3Xc4/o11i7j9EqfSt3BpwDg0AezDBf
         ynSVqjfrHqOeAE8hUnkK8nYSn3u1z3vbWLFajtwDwjfWkjKqBdxZ1y+94LJtq+A0GOZX
         k01SIFIZ70Fw6gL9aBReGs0pM0xt94vA6aCPm9mP/PsGP25XTvpA1ikjW6e9T+8UL6n+
         HMjBIs0zkrboICHeLTZYgA9Kj4wXZP7KsJEZdMfR/ziwSBpj94/0mfempDePr0N7Bj22
         2O6A==
Return-Path: <adhemerval.zanella@linaro.org>
To: Florian Weimer <fweimer@redhat.com>
Cc: libc-alpha@sourceware.org
References: <20190731183136.21545-1-adhemerval.zanella@linaro.org>
 <87k1av6kuh.fsf@oldenburg2.str.redhat.com>
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Openpgp: preference=signencrypt
Subject: Re: [PATCH v2 1/5] mips: Do not malloc on getdents64 fallback
Message-ID: <49dc9440-d9a5-112f-9ec6-b62f4c0bcdf2@linaro.org>
Date: Fri, 30 Aug 2019 12:53:00 -0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.8.0
MIME-Version: 1.0
In-Reply-To: <87k1av6kuh.fsf@oldenburg2.str.redhat.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-SW-Source: 2019-08/txt/msg00841.txt.bz2


On 30/08/2019 06:52, Florian Weimer wrote:
> Sorry, I missed that despite all the 64s in the patch, this code is also
> used on 32-bit architectures.  The review below should take this new (to
> me) piece of information into account.
> 
> * Adhemerval Zanella:
> 
>> +  static bool getdents64_supportted = true;
>> +  if (atomic_load_relaxed (&getdents64_supportted))
> 
> Our atomics do not support bool, only 4 bytes and 8 bytes (if
> __HAVE_64B_ATOMICS) is defined.  See __atomic_check_size.
> 
> Probably it will work if it compiles, but I haven't checked that.

MIPS in fact does not define USE_ATOMIC_COMPILER_BUILTINS and thus it uses
the __atomic_check_size_ls macro.  In any case, I changed to a int for this 
case.

> 
>>    /* Unfortunately getdents64 was only wire-up for MIPS n64 on Linux 3.10.
>> +     If the syscall is not available it need to fallback to the non-LFS one.
>> +     Also to avoid an unbounded allocation through VLA/alloca or malloc (which
>> +     would make the syscall non async-signal-safe) it uses a limited buffer.
>> +     This is sub-optimal for large NBYTES, however this is a fallback
>> +     mechanism to emulate a syscall that kernel should provide.   */
>>  
>> +  enum { KBUF_SIZE = 1024 };
> 
> The choice of size needs a comment.  I think the largest possible
> practical length of the d_name member are 255 Unicode characters in the
> BMP, in UTF-8 encoding, so d_name is 766 bytes long, plus 10 bytes from
> the header, for 776 bytes total.  (NAME_MAX is not a constant on Linux
> in reality.)

I picked the buffer as an arbitrary value, what about:

  /* The largest possible practical length of the d_name member are 255
     Unicode characters in UTF-8 encoding, so d_name is 766 bytes long, plus
     18 (mips64) / 10 (mips64n32) bytes from header, for total of 784 (mips64)
     / 776 (mips64n32) bytes total.  Ensure that the minimum size hold at
     least one entry.  */
  enum { KBUF_SIZE = 1024 };


> 
>>    struct kernel_dirent
>> +  {
>> +    unsigned long d_ino;
>> +    unsigned long d_off;
>> +    unsigned short int d_reclen;
>> +    char d_name[1];
>> +  } kbuf[KBUF_SIZE / sizeof (struct kernel_dirent)];
>> +  size_t kbuf_size = nbytes < KBUF_SIZE ? nbytes : KBUF_SIZE;
> 
> I would define kbuf as a char array, and perhaps leave out the d_name
> member in struct kernel_dirent.  You can copy out the struct
> kernel_dirent using memcpy, which GCC should optimize away.

I defined the buffer as 'struct kernel_dirent' to make it easier to align
for the required fields.  It allows simplify the access on the loop to
avoid memcpy calls.

> 
> Ideally, we would perform the conversion in-line, with a forward scan to
> make the d_reclen members point backwards, followed by a backwards scan
> to move everything in place.  This would reduce stack usage quite
> significantly and avoid a hard restriction on d_name length.

I take this fallback code as a best effort due the restrictions we have
(make it async-signal-safe with bounded stack allocation).  Even with a
clever buffer managements we can't remove the d_name length with the
aforementioned restrictions.  The proper solution is indeed using the
syscall.

> 
>>    struct dirent64 *dp = (struct dirent64 *) buf;
>> +
>> +  size_t nb = 0;
>> +  off64_t last_offset = -1;
>> +
>> +  ssize_t r;
>> +  while ((r = INLINE_SYSCALL_CALL (getdents, fd, kbuf, kbuf_size)) > 0)
>>      {
> 
> Sorry, I don't see how the outer loop is exited.  I think we should
> remove it because it does not seem necessary.

We still need to handle the cases where NBYTES are larger than the temporary
buffer, because it might require multiple getdents calls.


> 
>> +      struct kernel_dirent *skdp, *kdp;
>> +      skdp = kdp = kbuf;
>> +
>> +      while ((char *) kdp < (char *) skdp + r)
>> +	{
>> +	  const size_t alignment = _Alignof (struct dirent64);
>> +	  size_t new_reclen = ((kdp->d_reclen + alignment - 1)
>> +			      & ~(alignment - 1));
> 
> I think this is the roundup macro.  If you use that, I think you don't
> need the alignment variable.

I changed to use ALIGN_UP from libc-pointer-arith.h.

> 
> Is the length really correct, though?  I'd expect it to grow by the
> additional size of the d_ino and d_off members.  I think it would be
> best recompute it from scratch, using the actual length of d_name.

It it because you are referencing to an older patch version, checking on
mips64-n32 I adjusted to:

  const size_t size_diff = (offsetof (struct dirent64, d_name)
                           - offsetof (struct kernel_dirent, d_name));
  [...]
               size_t new_reclen = ALIGN_UP (kdp->d_reclen + size_diff,
                                        _Alignof (struct dirent64));
  [...]

> 
>> +	  if (nb + new_reclen > nbytes)
>>  	    {
>> +		/* The new entry will overflow the input buffer, rewind to
>> +		   last obtained entry and return.  */
>> +	       __lseek64 (fd, last_offset, SEEK_SET);
> 
> I don't think last_offset is guaranteed to have been set with a proper
> offset at this point.  Given that d_name is essentially of unbounded
> length, even expanding the first entry can cause failure.
> 
> Maybe it's possible to avoid this corner case by limiting the amount of
> data being read so that we know that the application-supplied buffer is
> always large enough for any possible expansion.  I think the worse-case
> growth is for lengths 5 to 8, from 20 bytes to 32 bytes.  So perhaps we
> should divide the buffer size by 1.6 and use that?

For this case I really think we just need to return an error to user:

            if (nb + new_reclen > nbytes)
            {   
		/* Entry is too large for the static buffer.  */
		if (last_offset == -1)
		  {
		    __set_errno (EINVAL);
		    return -1;
		  }
                /* The new entry will overflow the input buffer, rewind to
                   last obtained entry and return.  */
               __lseek64 (fd, last_offset, SEEK_SET);
               goto out;
            }

Again I see this fallback code as a best-effort since we are emulating the
syscall with additional restraints.  Most of time glibc tries to play smart
emulating a syscall ended in a lot of headaches...


> 
>> +	       goto out;
>>  	    }
>> +	  nb += new_reclen;
>>  
>> +	  dp->d_ino = kdp->d_ino;
>> +	  dp->d_off = last_offset = kdp->d_off;
>> +	  dp->d_reclen = new_reclen;
>> +	  dp->d_type = *((char *) kdp + kdp->d_reclen - 1);
> 
> I think instead of reading through kdp, you should use char *s and
> memcpy, to avoid the aliasing violation, as discussed above.  Likewise
> for writing to dp.

I think if we proper setting the buffer alignment there is no need to do it.
Also the problem of using memcpy here is for mips64n32 the size is *not* 
equal for dp and kdp, each would require an extra step as:

   {
     typeof (kdp->d_ino) kino;
     memcpy (&kino, &kdp_d->ino, sizeof (kino));
     typeof (dp->d_ino) dino = kino;
     memcpy (&dp->d_ino, &kino, sizeof (dino));
   }

> 
>> +	  memcpy (dp->d_name, kdp->d_name,
>> +		  kdp->d_reclen - offsetof (struct kernel_dirent, d_name));
> 
> See above, I have concerns about the length.
> 
> Thanks,
> Florian
>